| Type: | Package | 
| Title: | Generate and Modify Synthetic Datasets | 
| Version: | 1.2.0 | 
| Date: | 2022-05-09 | 
| Author: | Francis Huang <flh3@hotmail.com> | 
| Maintainer: | Francis Huang <flh3@hotmail.com> | 
| Description: | Set of functions to create datasets using a correlation matrix. | 
| License: | GPL-3 | 
| NeedsCompilation: | no | 
| Packaged: | 2022-05-09 21:40:21 UTC; flh3 | 
| Repository: | CRAN | 
| Date/Publication: | 2022-05-09 21:50:02 UTC | 
Generate Synthetic Datasets
Description
Create synthetic datasets based on a correlation table. Additional functions can be used to rescale, transform, and reverse code variables.
Details
| Package: | gendata | 
| Type: | Package | 
| Version: | 1.2.0 | 
| Date: | 2022-05-09 | 
| License: | GPL-3 | 
genmvnorm creates the dataset; the remaining functions modify it.
genmvnorm : generates a multivariate normal dataset
recalib : rescales a variable to a new minimum and maximum
dtrans  : gives variables a new mean and standard deviation
revcode : reverse codes a variable
Author(s)
Francis Huang
Maintainer: Francis Huang <flh3@hotmail.com>
References
Fan, X., Felsovalyi, A., Sivo, S., & Keenan, S. (2002). SAS for Monte Carlo studies: A guide for quantitative researchers. SAS Institute.
See Also
genmvnorm, revcode, dtrans, recalib
Data Transform
Description
Transforms the variables in a dataset to have the specified means and standard deviations.
Usage
dtrans(data, m, sd, rnd = FALSE)
Arguments
| data | The dataset to transform. | 
| m | A vector of desired means, one per variable. | 
| sd | A vector of desired standard deviations, one per variable. | 
| rnd | Logical; if TRUE, the values are rounded to whole numbers (no decimals). Defaults to FALSE. | 
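The transformation described above can be pictured as a per-column linear rescaling. The following is a minimal base R sketch under stated assumptions, not the dtrans implementation: rescale_columns is a hypothetical helper, and it standardizes each column before applying the target mean and standard deviation, which dtrans may or may not do.

```r
# Hypothetical helper, not part of gendata: gives each column a target
# mean and standard deviation via z * sd + m.
rescale_columns <- function(data, m, sd, rnd = FALSE) {
  data <- as.data.frame(data)
  stopifnot(length(m) == ncol(data), length(sd) == ncol(data))
  for (j in seq_along(data)) {
    # Standardize the column (assumption: input need not be z-scores already)
    z <- (data[[j]] - mean(data[[j]])) / stats::sd(data[[j]])
    # Scale to the target SD, then shift to the target mean
    data[[j]] <- z * sd[j] + m[j]
  }
  if (rnd) data <- round(data)  # drop decimals when requested
  data
}

set.seed(1)
raw <- data.frame(X1 = rnorm(200), X2 = rnorm(200), X3 = rnorm(200))
out <- rescale_columns(raw, m = c(0, 100, 50), sd = c(1, 15, 10))
colMeans(out)
sapply(out, sd)
```

Because the transformation is linear within each column, the correlations among the variables are unchanged as long as rnd is FALSE; rounding perturbs them slightly.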
Author(s)
Francis Huang
Examples
sdata <- genmvnorm(cor = c(.7, .2, .3), k = 3, n = 500, seed = 12345)
cor(sdata)
summary(sdata)
# Note: the data are z-scores
s2 <- dtrans(sdata, c(0, 100, 50), c(1, 15, 10), rnd = FALSE)
summary(s2)
sd(s2[,2])
sd(s2[,3])
# Note: variables X2 and X3 now have the requested means and standard deviations.
head(s2)
# At times you may want a dataset without decimals; use rnd = TRUE.
s2 <- dtrans(sdata, c(0, 100, 50), c(1, 15, 10), rnd = TRUE)
head(s2)
Genmvnorm
Description
Generates a multivariate normal dataset based on a specified correlation matrix.
Usage
genmvnorm(cor, k, n, seed = FALSE)
Arguments
| cor | Either a full correlation matrix, e.g., data<-cor(xyz), or the lower half of a correlation matrix given as a vector, e.g., data<-c(.7,.3,.2) for a 3-variable dataset. The vector form is useful for creating datasets without having to specify both halves of the correlation matrix. | 
| k | The number of variables in the dataset. | 
| n | The number of observations in the new synthetic dataset. | 
| seed | A seed number, set for reproducibility of results. Defaults to FALSE. | 
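To illustrate the vector form of cor, the sketch below expands a lower-half vector into a full symmetric correlation matrix in base R. lower_to_cor is a hypothetical helper, not a gendata function, and it assumes the values fill the lower triangle column by column; for k = 3 the column-wise and row-wise orders coincide, so the ordering genmvnorm expects for larger k is not confirmed here.

```r
# Hypothetical helper, not part of gendata.
# Assumption: r fills the lower triangle column by column.
lower_to_cor <- function(r, k) {
  stopifnot(length(r) == k * (k - 1) / 2)
  cmat <- diag(k)                                     # ones on the diagonal
  cmat[lower.tri(cmat)] <- r                          # fill the lower half
  cmat[upper.tri(cmat)] <- t(cmat)[upper.tri(cmat)]   # mirror into the upper half
  cmat
}

cmat <- lower_to_cor(c(.7, .2, .3), k = 3)
cmat
isSymmetric(cmat)
```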
Details
Used for creating synthetic datasets. The approach is based on the SAS chapter by Fan et al. (2002).
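One common way to generate multivariate normal data with a target correlation structure is to multiply independent standard normal draws by a Cholesky factor of the correlation matrix. The base R sketch below shows that technique only; sim_mvnorm is a hypothetical function, and this is not necessarily the method genmvnorm or Fan et al. (2002) use.

```r
# Hypothetical sketch, not the gendata implementation.
sim_mvnorm <- function(cmat, n, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  k <- ncol(cmat)
  # n x k matrix of independent standard normal draws
  z <- matrix(rnorm(n * k), nrow = n, ncol = k)
  # chol() returns the upper factor R with t(R) %*% R equal to cmat,
  # so the columns of z %*% R have population correlation matrix cmat
  x <- z %*% chol(cmat)
  colnames(x) <- paste0("X", seq_len(k))
  as.data.frame(x)
}

cmat <- matrix(c( 1, .7, .2,
                 .7,  1, .3,
                 .2, .3,  1), nrow = 3, byrow = TRUE)
sim <- sim_mvnorm(cmat, n = 5000, seed = 12345)
round(cor(sim), 2)
```

The sample correlations only approximate the target matrix; the gap shrinks as n grows.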
Author(s)
Francis Huang
References
Based on:
Fan, X., Felsovalyi, A., Sivo, S., & Keenan, S. (2002). SAS for Monte Carlo studies: A guide for quantitative researchers. SAS Institute.
Examples
sdata <- genmvnorm(cor = c(.7, .2, .3), k = 3, n = 500, seed = 12345)
cor(sdata)
# The dataset above uses the lower half of this correlation table:
#     1  .7  .2
#     .7  1  .3
#     .2 .3   1
# A full correlation matrix can also be used
data(iris)
dat <- cor(iris[, 1:3])
dat
sdata <- genmvnorm(cor = dat, k = 3, n = 100, seed = 123)
cor(sdata)
# The example above uses the iris dataset.
Recalibrate (rescale) Variables
Description
Rescale variables (one at a time) to have a new minimum and maximum value.
Usage
recalib(data, var, low, high)
Arguments
| data | The dataset to use. | 
| var | The variable number (or variable name) to rescale. | 
| low | The new minimum value. | 
| high | The new maximum value. | 
Details
Specify the rescaling of variables one at a time.
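Rescaling to a new minimum and maximum can be written as a min-max transformation. The base R sketch below uses a hypothetical helper, rescale_range, which is not part of gendata and may differ from how recalib is implemented. It also checks that a correlation is unaffected, since the transformation is linear.

```r
# Hypothetical helper, not part of gendata: maps min(x) to low and max(x) to high.
rescale_range <- function(x, low, high) {
  low + (x - min(x)) / (max(x) - min(x)) * (high - low)
}

set.seed(42)
d <- data.frame(X1 = rnorm(100))
d$X2 <- 0.5 * d$X1 + rnorm(100)        # a second, correlated variable

before <- cor(d$X1, d$X2)
d$X1 <- rescale_range(d$X1, 10, 50)    # rescale one variable at a time

range(d$X1)                            # new minimum and maximum
all.equal(before, cor(d$X1, d$X2))     # correlation is preserved
```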
Author(s)
Francis Huang
Examples
sdata <- genmvnorm(cor = c(.7, .2, .3), k = 3, n = 500, seed = 12345)
cor(sdata)
summary(sdata[,1])
#note the min and max of variable X1
#changes variable one to have a minimum of 10 and a maximum of 50
#correlations remain the same
s2 <- recalib(sdata, 1, 10, 50)
cor(s2)
summary(s2[,1])
#note revised values of variable X1
Reverse Coding Variables
Description
Reverse codes variables.
Usage
revcode(data, vars)
Arguments
| data | The dataset to use. | 
| vars | The variable number or name to reverse code. | 
Author(s)
Francis Huang
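Reverse coding is commonly done by subtracting each value from the sum of the scale's minimum and maximum. The base R sketch below shows that idea with a hypothetical helper, reverse_code, which is not part of gendata. It assumes the observed minimum and maximum are the scale endpoints; whether revcode uses observed or theoretical bounds is not stated here.

```r
# Hypothetical helper, not part of gendata.
# Assumption: the observed min and max define the scale endpoints.
reverse_code <- function(x) (max(x) + min(x)) - x

item <- c(1, 2, 2, 4, 5)      # responses on a 1-to-5 scale
reversed <- reverse_code(item)
reversed                      # 5 4 4 2 1
```

If a scale endpoint never appears in the data, the observed range is narrower than the scale, and the theoretical minimum and maximum would be needed instead.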