How to plot categorical data in R?


#1

I have told R the names of my categorical data and tried to plot them; however, I encountered this error:

plot(dframe1$Household.equiptment, dframe1$Schisto.Visual, ylab = "BMI", xlab = "age")
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf

The variable has 4 categories:
Charcoal   Electric and Gas   None   Stove
1376       6                  475    3
I'm not sure why R won't plot this data for me. Can anyone help?


#2

Dean

Provide an example using reprex() and people might help.

It's really easy:

install.packages("reprex")
library(reprex)

Then select your code and copy to the clipboard.
Don't paste anywhere, just go to the Console and type reprex()
Then go to RStudio Community and paste into a comment.

More details here: FAQ: What's a reproducible example (`reprex`) and how do I do one?


#3

Okay, I have downloaded the package and copied my code to the clipboard but where do I post the code?


#4

Great, just go to the RStudio Console...


#5

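This is because the plot() function can't make scatter plots with discrete variables: it tries to coerce the category names to numbers, which is where the "NAs introduced by coercion" warning comes from. It has no method for column plots either (you can't make a bar plot since you only have one value per category).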
Consider using ggplot2 instead of base R for plotting.
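As a rough sketch of what that could look like, here is a count plot of two categorical variables. The `toy` data frame and its values are made up for illustration, and treating Schisto.Visual as a Positive/Negative category is an assumption, since the real data was not posted.

```r
library(ggplot2)

# Hypothetical stand-in for dframe1; the values are invented for illustration.
toy <- data.frame(
  Household.equiptment = c("Charcoal", "Charcoal", "None", "Stove", "None", "Charcoal"),
  Schisto.Visual = c("Positive", "Negative", "Negative", "Positive", "Positive", "Negative"),
  stringsAsFactors = TRUE
)

# geom_bar() counts the rows in each category, so no y aesthetic is needed;
# fill splits each bar by the second categorical variable.
p <- ggplot(toy, aes(x = Household.equiptment, fill = Schisto.Visual)) +
  geom_bar(position = "dodge")
p
```

Note that the columns are referred to by bare name inside aes() rather than as toy$column, because ggplot() already knows which data frame to look in.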


#6
ggplot(data = dframe1) + (mapping = aes(x = dframe1$Household.equiptment, 
                           y = dframe1$Schisto.Visual, color = "blue"))
#> Error in ggplot(data = dframe1): could not find function "ggplot"

Created on 2019-01-10 by the reprex package (v0.2.1)

What about this, from R for Data Science? It doesn't seem to work with my data for some reason.


#7
library(ggplot2)
ggplot(data = dframe1) + (mapping = aes(x = dframe1$Household.equiptment, 
                                        y = dframe1$Schisto.Visual, color = "blue"))
#> Error in ggplot(data = dframe1): object 'dframe1' not found

Created on 2019-01-10 by the reprex package (v0.2.1)

You need to call the libraries in your reprex. It's like a sandbox environment, so that it can be reproduced elsewhere. The first error you got was could not find function "ggplot", so you need to call library(ggplot2).

The next one you'll get is object 'dframe1' not found, so you need to construct the data in your reprex too. @andresrcs also gave what is certainly the actual answer to your question, but a well-done reprex does wonders.
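For instance, a small data frame can be built directly in the reprex. This is only a sketch: the rows below are invented, and the real dframe1 will have different values and more columns.

```r
# Hypothetical miniature of dframe1; these values are invented for illustration.
dframe1 <- data.frame(
  Household.equiptment = c("Charcoal", "None", "Stove", "Charcoal"),
  Schisto.Visual = c("Positive", "Negative", "Positive", "Negative")
)

# dput() prints code that recreates an object exactly; running it on a few
# rows of the real data gives text that can be pasted into the reprex.
dput(head(dframe1, 3))

# The data can now be found inside the reprex sandbox.
table(dframe1$Household.equiptment)
```

Running dput(head(your_data)) on the real data frame and pasting its output after `dframe1 <-` is usually the quickest way to share a sample.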


#8

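This is an example that you can modify to work with your own data:

library(ggplot2)
library(dplyr)
# Summarise to one value per category, which is what geom_col() expects
example_data <- iris %>% 
    group_by(Species) %>% 
    summarise(mean_sepal_length = mean(Sepal.Length))

# Draw one column per species, with its height set by the mean
example_data %>% 
    ggplot(aes(x = Species, y = mean_sepal_length)) +
    geom_col()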

Created on 2019-01-10 by the reprex package (v0.2.1)

Also, consider changing your topic title to something like "How to plot categorical data in R?", since R was not actually preventing you from doing so.


#9

Thank you, finally got it to work.

In this case, then, how would I plot to see if there is a correlation between the same two categories? The ggplot only gives me frequencies of the results for schistosomiasis.

Or how would I make the same plot containing only the positive values for schisto ELISA? The threshold is 0.2. I was thinking of something like the below:

library(ggplot2)
ggplot(data = dframe1) + (aes(x = dframe1$Household.equiptment, y = dframe1$Schisto.elisa[>0.2])) +
  geom_col()
#> Error: <text>:2:91: unexpected '>'
#> 1: library(ggplot2)
#> 2: ggplot(data = dframe1) + (aes(x = dframe1$Household.equiptment, y = dframe1$Schisto.elisa[>
#>                                                                                              ^

Created on 2019-01-10 by the reprex package (v0.2.1)


#10

It's hard to help you any further if you don't provide a reproducible example (including data). If you don't know how to include some data in your reprex, here is a blog post by Mara that explains how to do it.
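
That said, here is a rough sketch of the threshold idea using made-up data. The values below are invented, and only the column names and the 0.2 cutoff come from the posts above. The syntax error in the earlier attempt arises because a comparison needs the vector on both sides of the bracket, x[x > 0.2] rather than x[> 0.2].

```r
library(ggplot2)

# Hypothetical data; the real dframe1 was not posted, so these values are invented.
dframe1 <- data.frame(
  Household.equiptment = c("Charcoal", "Charcoal", "None", "Stove", "None", "Charcoal"),
  Schisto.elisa = c(0.35, 0.10, 0.25, 0.05, 0.50, 0.21)
)

# Keep whole rows above the threshold so both columns stay aligned.
positive <- dframe1[dframe1$Schisto.elisa > 0.2, ]

# Count the remaining (positive) rows per category.
p <- ggplot(positive, aes(x = Household.equiptment)) +
  geom_bar()
p
```

Filtering the data frame first, instead of subsetting one column inside aes(), keeps x and y the same length. If the real column contains NA values, wrapping the condition in which() drops those rows instead of returning rows of NA.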