I am working with the R programming language. Suppose I have the following data:
    # create data
    var_1 = rnorm(1000, 10, 10)

    # two-level categorical predictor, sampled from its own levels
    var_2 <- c("1", "0")
    var_2 <- sample(var_2, 1000, replace = TRUE, prob = c(0.3, 0.7))

    # three-level response
    response <- c("2", "1", "0")
    response <- sample(response, 1000, replace = TRUE, prob = c(0.3, 0.4, 0.3))

    my_data = data.frame(var_1, var_2, response)
    my_data$var_2 = as.factor(my_data$var_2)
    my_data$response = as.factor(my_data$response)
I wrote the following code that makes a histogram for the "factor" variable and a density plot for the "numerical" variable:
    # load libraries
    library(ggplot2)
    library(gridExtra)

    # first plot: counts of the factor variable, filled by response
    # (geom_histogram needs a continuous x, so a factor uses geom_bar)
    p1 = ggplot(my_data) +
      geom_bar(aes(x = var_2, fill = response),
               colour = "grey50", alpha = 0.5, position = "identity") +
      ggtitle("var_2 vs response")

    # second plot: density of the numerical variable, filled by response
    p2 = ggplot(my_data, aes(x = var_1, fill = response)) +
      geom_density(alpha = 0.5) +
      ggtitle("var_1 vs response")

    grid.arrange(p1, p2, ncol = 2)
My question: Suppose I had a dataset that had many "factor" variables and "numerical" variables. Are there any functions in R that can automatically detect whether the variable is "factor" or "numerical", and then draw the corresponding graph (filled using the color of the "response variable")?
Would it have been possible to produce these graphs automatically, without manually instructing R to make the correct type of graph for each variable "type"? (e.g. suppose there were 10 variables in a dataset, would it be possible to make 10 of these graphs?)
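To make the request concrete, here is a minimal sketch of the behaviour I have in mind. I am not aware of a built-in that does exactly this, so `plot_by_type` is a hypothetical helper of my own, not a function from ggplot2 or gridExtra. It assumes the response column is named `response`, that numeric columns should get a density plot, and that factor or character columns should get a bar chart. The small data frame is rebuilt inside the block only so that it runs on its own.

```r
library(ggplot2)
library(gridExtra)

# Small stand-in for my_data, so the sketch is self-contained
set.seed(1)
my_data <- data.frame(
  var_1    = rnorm(200, 10, 10),
  var_2    = factor(sample(c("1", "0"), 200, replace = TRUE, prob = c(0.3, 0.7))),
  response = factor(sample(c("2", "1", "0"), 200, replace = TRUE,
                           prob = c(0.3, 0.4, 0.3)))
)

# Hypothetical helper: choose the geom from the column's type
plot_by_type <- function(data, var, fill_var = "response") {
  p <- ggplot(data, aes(x = .data[[var]], fill = .data[[fill_var]])) +
    labs(x = var, fill = fill_var) +
    ggtitle(paste(var, "vs", fill_var))

  if (is.numeric(data[[var]])) {
    # numerical variable: overlaid density curves
    p + geom_density(alpha = 0.5)
  } else if (is.factor(data[[var]]) || is.character(data[[var]])) {
    # categorical variable: overlaid bars of counts
    p + geom_bar(colour = "grey50", alpha = 0.5, position = "identity")
  } else {
    stop("Unsupported column type for '", var, "': ", class(data[[var]])[1])
  }
}

# One plot per predictor, i.e. every column except the response
predictors <- setdiff(names(my_data), "response")
plots <- lapply(predictors, plot_by_type, data = my_data)

# Arrange however many plots there are in a two-column grid
grid.arrange(grobs = plots, ncol = 2)
```

With 10 predictors, the same `lapply` call would return 10 plots and `grid.arrange(grobs = plots, ncol = 2)` would lay them out in five rows. Is there an existing function or package that already does this, or is a loop like the one above the usual approach?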