Simulation of data

Hello Everyone,

I have a data set of 5 observations of 26 variables. Each variable holds a different kind of value: some are categorical, some are character strings, and some are integers.

I have to simulate 10000 observations of the same 26 variables, with the values of each variable generated randomly. Please help me with this.


Random integers

integers <- sample(1:10000, replace = TRUE)

Random characters


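library(dplyr)
library(stringr)
library(tidytext)
library(janeaustenr)

# Annotate each line of the Austen novels with its line number and chapter
original_books <- austen_books() %>%
  group_by(book) %>%
  mutate(linenumber = row_number(),
         chapter = cumsum(str_detect(text, regex("^chapter [\\divxlc]",
                                                 ignore_case = TRUE)))) %>%
  ungroup()

# Split the text into one word per row, then draw 10000 words at random
tidy_books <- original_books %>% unnest_tokens(word, text)
words <- tidy_books$word
characters <- sample(words, 10000)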

Binary categorical variables

binaries <- rbinom(n=10000, size=1, prob=1/2)

Multivalued categorical variables

categories <- sample(iris$Species, 10000, replace = TRUE)
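Vectors like the ones above can be bound into a single data frame with one row per simulated observation. The following is only a minimal sketch: the column names are hypothetical, and the character column draws from a small made-up vocabulary instead of the Austen text so that the block runs in base R without extra packages.

```r
set.seed(1)  # only so the sketch is reproducible
n <- 10000

# Hypothetical column names; each column mirrors one of the generators above
simulated <- data.frame(
  int_var    = sample(1:10000, n, replace = TRUE),
  binary_var = rbinom(n = n, size = 1, prob = 1/2),
  cat_var    = sample(iris$Species, n, replace = TRUE),
  # Stand-in for the text column: a small made-up vocabulary
  char_var   = sample(c("alpha", "beta", "gamma", "delta"), n, replace = TRUE),
  stringsAsFactors = FALSE
)

str(simulated)  # structure: 10000 observations of 4 variables
```

With 26 variables the same pattern applies, with one argument to data.frame() per variable.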

Dear Team,
Thanks for the immediate response.

Here I am attaching the sheet for your reference.

I need to extend the data.

For each row, I need to generate the same type of values, within the same range, for 10000 rows. If I do this individually for each row, I then need to combine all of the generated data into 10000 observations of 26 variables.

For example, in the a row I have yes and no, which is categorical. For 10000 rows I need yes and no at random.

I hope this is clear.
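One way the request above might be sketched in base R is shown below. Since the attached sheet is not available, the 5-row data frame `small`, its column names, and its values are made up, and `simulate_column` is a hypothetical helper. It assumes integer variables should be drawn uniformly from their observed range and all other variables from their observed distinct values.

```r
set.seed(42)  # only so the sketch is reproducible

# Made-up stand-in for the 5-observation sheet
small <- data.frame(
  a = c("yes", "no", "yes", "yes", "no"),
  b = c(12L, 45L, 7L, 30L, 22L),
  c = c("red", "green", "blue", "green", "red"),
  stringsAsFactors = FALSE
)

n <- 10000

# Hypothetical helper: simulate n values that look like the column x
simulate_column <- function(x, n) {
  if (is.numeric(x)) {
    # Integers: every whole number between the observed min and max
    values <- seq(min(x), max(x))
  } else {
    # Categorical / character: the observed distinct values
    values <- unique(x)
  }
  # Index by position so a single candidate value is handled correctly
  values[sample.int(length(values), n, replace = TRUE)]
}

# Apply the helper to every column and combine the results
simulated <- as.data.frame(lapply(small, simulate_column, n = n),
                           stringsAsFactors = FALSE)

table(simulated$a)  # counts of "yes" and "no" among the 10000 rows
```

Because lapply() works column by column, the columns do not have to be generated and combined by hand; the result keeps the original column names.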




(Attachment sd.xlsx is missing)

You're going to need to show what you are trying and what is happening, and then ask a specific question about one aspect of your workflow.

We are not free consultants who are waiting to do your work for you. Your work is to ask a clearly defined question and show some initiative in trying what you are learning.

