I'm trying to make a dataset with a set number of values and a specific mean and standard deviation. Is there any way I could do that, either in Excel or RStudio?
I've tried =norminv(rand() in Excel; however, it doesn't give the exact values (mean and SD) I want.
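DESIRED_M <- 10     # target mean
DESIRED_SD <- 2     # target (sample) standard deviation
x <- runif(10, 1, 100)          # ten random numbers between 1 and 100
y <- x * DESIRED_SD / sd(x)     # rescale so that sd(y) equals DESIRED_SD
y <- y - mean(y) + DESIRED_M    # shift so that mean(y) equals DESIRED_M
cat("mean =", mean(y), "sd =", sd(y), "\n")   # check the result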
Thank you so much!
Could you please explain what each line does? (especially set.seed())
So, the second group (the third and fourth groups) is the dataset.
Also, is this the standard deviation of the population? Is there any way I could get a dataset with a specific population standard deviation instead of a sample standard deviation?
set.seed() initializes the random number generator so that you get the same random numbers every time you run the code.
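To see what that means in practice, here is a small sketch. The seed value 42 and the variable names are arbitrary choices made for illustration only.

```r
# The seed value 42 is an arbitrary assumption; any integer works.
set.seed(42)
first_draw <- runif(3)

set.seed(42)              # reset the generator to the same state
second_draw <- runif(3)   # reproduces first_draw exactly

third_draw <- runif(3)    # no reset here, so the generator moves on

print(identical(first_draw, second_draw))  # TRUE
print(identical(first_draw, third_draw))   # FALSE
```

Without a call to set.seed(), each run of the script would start from a different state and produce a different x, although the rescaled y would still have the requested mean and sd.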
I am converting ten random numbers x into ten numbers y with a pre-determined mean and sd.
sd(x) is the sample SD. If you prefer the population SD, then use
sqrt((n-1)/n) * sd(x), where n = 10 here.
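Putting that correction into the rescaling step could look like the sketch below. The helper pop_sd() is a hypothetical name introduced here (it is not a built-in R function), and the seed value is arbitrary.

```r
DESIRED_M  <- 10
DESIRED_SD <- 2

# Hypothetical helper: population SD (divides by n instead of n - 1).
pop_sd <- function(v) {
  n <- length(v)
  sqrt((n - 1) / n) * sd(v)
}

set.seed(1)                        # arbitrary seed, only for reproducibility
x <- runif(10, 1, 100)
y <- x * DESIRED_SD / pop_sd(x)    # rescale so the population SD is DESIRED_SD
y <- y - mean(y) + DESIRED_M       # shift so the mean is DESIRED_M

cat("mean =", mean(y), "population sd =", pop_sd(y), "sample sd =", sd(y), "\n")
```

With this version the population SD of y is 2, and the sample SD reported by sd(y) is slightly larger: 2 * sqrt(10/9), about 2.11.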
You do not specify the desired distribution. For a normal distribution, use rnorm(). You will need to specify values for the number of observations, the mean and the standard deviation. For 50 values from a normal distribution with a mean of 100 and a standard deviation of 15 it would be:
rnorm(50, mean = 100, sd = 15)
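Note that rnorm() on its own gives a sample whose mean and SD are only close to the requested values, which is the same issue as with the Excel attempt. One possible way to get them exact is to standardize the draws and rescale them, sketched below. The seed and the variable names z and z_exact are assumptions for illustration.

```r
set.seed(1)                          # arbitrary seed
z <- rnorm(50, mean = 100, sd = 15)  # sample mean and SD only approximate 100 and 15

# scale() centres the values and divides by the sample SD (mean 0, SD 1);
# as.vector() drops the matrix attributes that scale() adds.
z_exact <- as.vector(scale(z)) * 15 + 100

cat("raw:      mean =", mean(z), "sd =", sd(z), "\n")
cat("rescaled: mean =", mean(z_exact), "sd =", sd(z_exact), "\n")
```

The rescaled values keep the bell shape of the original draws, but they are no longer independent draws from the normal distribution, since they are forced to hit the mean and SD exactly.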