Simulation of data

Hello Everyone,

I have a dataset of 5 observations of 26 variables. Each variable holds a different kind of value: some are categorical, some are character, and some are integer.

I have to simulate 10000 observations of the same 26 variables, with the values of each variable generated randomly. Please help me with this.

Thanks,
Chinna

Random integers

integers <- sample(1:10000, size = 10000, replace = TRUE)

Random characters

library(janeaustenr)
library(dplyr)
library(stringr)
library(tidytext)

original_books <- austen_books() %>%
  group_by(book) %>%
  mutate(linenumber = row_number(),
         chapter = cumsum(str_detect(text, regex("^chapter [\\divxlc]",
                                                 ignore_case = TRUE)))) %>%
  ungroup()

tidy_books <- original_books %>%
  unnest_tokens(word, text)
words <- tidy_books$word
characters <- sample(words, 10000)

Binary categorical variables

binaries <- rbinom(n=10000, size=1, prob=1/2)
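If the binary variable should carry labels such as yes and no rather than 0 and 1, one possible sketch uses `sample()` on the labels directly. The equal probabilities and the name `yes_no` are assumptions for illustration only.

```r
set.seed(42)  # fixed seed so the draw is reproducible

n <- 10000

# Draw "yes"/"no" at random; the 50/50 split is an assumption and can be changed.
yes_no <- sample(c("yes", "no"), size = n, replace = TRUE, prob = c(0.5, 0.5))

# Store the result as a factor so it behaves as a categorical variable.
yes_no <- factor(yes_no, levels = c("no", "yes"))

table(yes_no)
```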

Multivalued categorical variables

data(iris)
categories <- sample(iris$Species, 10000, replace = TRUE)
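The vectors above can be bound into a single data set with `data.frame()`. The following is a minimal sketch using base R only: the helper `random_string()` is a hypothetical stand-in for the word sample, used here to avoid the package dependencies, and the column names are assumptions.

```r
set.seed(1)  # fixed seed so the result is reproducible
n <- 10000

# Hypothetical helper: builds one random lowercase string of the given length.
random_string <- function(len = 5) {
  paste(sample(letters, len, replace = TRUE), collapse = "")
}

# One column per variable type, each of length n.
simulated <- data.frame(
  integers   = sample(1:10000, size = n, replace = TRUE),
  characters = replicate(n, random_string()),
  binaries   = rbinom(n = n, size = 1, prob = 1/2),
  categories = sample(iris$Species, size = n, replace = TRUE),
  stringsAsFactors = FALSE
)

str(simulated)
```

The same pattern extends to 26 columns by adding one argument per variable.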

Dear Team,
Thanks for the immediate response,

Here I am attaching the sheet for your reference.

I need to extend the data.

For each row, I need to generate the same type of values, within the same range, for 10000 rows. If I do this individually for each row, I then need to combine the results to make 10000 observations of 26 variables.

For example, in one row I have yes and no, which is categorical. For 10000 rows I need yes and no at random.

I hope this is clear.

Thanks,

Chinna


(Attachment sd.xlsx is missing)
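Since the sheet is not available, the request above can only be sketched under assumptions. In the example below, `seed_data` is a hypothetical stand-in for the 5-row data, and `simulate_column()` is a hypothetical helper: integer variables are drawn uniformly from the observed range, and every other variable is resampled from its observed values. The columns are then combined into one data frame of 10000 rows.

```r
set.seed(123)  # fixed seed so the result is reproducible

# Hypothetical stand-in for the 5-row sheet (the real attachment is missing).
seed_data <- data.frame(
  answer = c("yes", "no", "yes", "no", "no"),
  count  = c(23L, 35L, 41L, 29L, 52L),
  label  = c("a", "b", "a", "c", "b"),
  stringsAsFactors = FALSE
)

# Simulate one variable: integers come from the observed min..max range,
# anything else is drawn from the distinct observed values.
simulate_column <- function(x, n) {
  values <- if (is.integer(x)) seq(min(x), max(x)) else unique(x)
  # Index with sample.int() so a single distinct value is handled correctly.
  values[sample.int(length(values), size = n, replace = TRUE)]
}

n <- 10000
simulated <- as.data.frame(
  lapply(seed_data, simulate_column, n = n),
  stringsAsFactors = FALSE
)

dim(simulated)
```

Uniform sampling is only one choice; if the real variables follow a known distribution, the body of `simulate_column()` would need to reflect that.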

You're going to need to show what you are trying and what is happening, and then ask a specific question about one aspect of your workflow.

We are not free consultants who are waiting to do your work for you. Your work is to ask a clearly defined question and show some initiative in trying what you are learning.
