Question about data types when importing from Excel

Hi, I am a beginner who just started learning RStudio.

I am trying to plot my GPS market share data set using the command plot(gps$MarketShare, gps$Company), and I get the error message shown in my screenshot:

However, the same kind of command works with the built-in iris data: plot(iris$Petal.Length, iris$Species)

I suspect this is because the data are of different types: typeof(gps$Company) outputs character, while typeof(iris$Species) outputs integer.

May I know how to plot the data from my Excel sheet, and why the imported data is so different from the iris data?

Apologies if this is the wrong place to post this sort of question. I would really appreciate any assistance!

Hi!

To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:

Hi, thank you for the reply!

I tried my best to follow the instructions, is this helpful?

gps <- tibble::tribble(
           ~Company, ~MarketShare,
           "Garmin",          47L,
           "TomTom",          19L,
         "Magellan",          17L,
             "Milo",           7L,
            "Other",          10L
         )
head(gps)
#> # A tibble: 5 x 2
#>   Company  MarketShare
#>   <chr>          <int>
#> 1 Garmin            47
#> 2 TomTom            19
#> 3 Magellan          17
#> 4 Milo               7
#> 5 Other             10

The command I have issues with is plot(gps$Company)

Thank you for the reply!

Indeed, plot() makes scatterplots by default, which won't work with character data. It needs to be told explicitly that the variable is categorical rather than arbitrary text. You do that by making Company a factor:

library(dplyr)  # provides %>% and mutate()
gps2 <- gps %>% mutate(Company = factor(Company))
plot(gps2$Company, gps2$MarketShare)
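If you would rather not load dplyr, the same conversion can be sketched in base R. The snippet also shows why iris behaved differently: Species is already stored as a factor, whereas text columns read from a spreadsheet typically arrive as plain character. This is only a minimal sketch; it rebuilds the reprex data with data.frame() instead of a tibble (my assumption, to keep it dependency-free).

```r
# Rebuild the example data as a plain data frame
# (assumption: same values as in the reprex above)
gps <- data.frame(
  Company     = c("Garmin", "TomTom", "Magellan", "Milo", "Other"),
  MarketShare = c(47L, 19L, 17L, 7L, 10L),
  stringsAsFactors = FALSE
)

class(gps$Company)    # "character": plain text, no notion of categories
class(iris$Species)   # "factor": categories stored as integer codes
typeof(iris$Species)  # "integer", which is why typeof() looked different

# Convert the column in place, then plot as before
gps$Company <- factor(gps$Company)
plot(gps$Company, gps$MarketShare)
```

class() is usually more informative than typeof() here, since typeof() only reports how the values are stored internally.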

But there are also 2 other possibilities for this kind of data, that will automatically guess that Company is supposed to be a factor:

2nd method: use barplot

Barplots are commonly used for this kind of data, and make more sense if you have a single point per condition. The barplot() function automatically guesses that the x axis is supposed to be a factor, since it's the standard use case:

barplot(MarketShare ~ Company, data = gps)
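As far as I know, the formula interface to barplot() was only added in R 4.0.0. On an older installation, a roughly equivalent call passes the bar heights and the labels separately. A minimal sketch, again rebuilding the data with data.frame() (my assumption) and using an axis label of my own choosing:

```r
# Same example data as in the reprex, as a plain data frame (assumption)
gps <- data.frame(
  Company     = c("Garmin", "TomTom", "Magellan", "Milo", "Other"),
  MarketShare = c(47L, 19L, 17L, 7L, 10L),
  stringsAsFactors = FALSE
)

# Heights come from MarketShare, bar labels from Company.
# barplot() returns the x positions of the bar midpoints,
# which is handy if you want to add text on top later.
mids <- barplot(gps$MarketShare,
                names.arg = gps$Company,
                ylab = "Market share")
```

Here Company can stay a character vector, because names.arg only uses it for labelling.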

3rd method: use ggplot2

The two previous functions are part of base R, but the ggplot2 package offers many additional possibilities with a consistent syntax, and might be worth learning about. It's part of the tidyverse, so it is automatically available if you have already run library(tidyverse). It will also guess automatically that Company is a categorical variable.

library(ggplot2)  # or library(tidyverse)
# display as points
ggplot(gps) +
   geom_point(aes(x=Company, y=MarketShare))

# display as bars
ggplot(gps) +
   geom_col(aes(x=Company, y=MarketShare))
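One thing you may notice is that ggplot2 orders the companies alphabetically. If you would prefer the bars sorted by market share, one option is to wrap Company in reorder(). This is just a sketch: it assumes ggplot2 is installed, rebuilds the data with data.frame(), and the axis labels are my own choice.

```r
library(ggplot2)

# Same example data as in the reprex, as a plain data frame (assumption)
gps <- data.frame(
  Company     = c("Garmin", "TomTom", "Magellan", "Milo", "Other"),
  MarketShare = c(47L, 19L, 17L, 7L, 10L),
  stringsAsFactors = FALSE
)

# reorder() turns Company into a factor whose levels are sorted by
# -MarketShare, so the largest share is drawn first
p <- ggplot(gps) +
  geom_col(aes(x = reorder(Company, -MarketShare), y = MarketShare)) +
  labs(x = "Company", y = "Market share")

print(p)
```

Dropping the minus sign sorts the bars from smallest to largest instead.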

Thank you for the really comprehensive reply! I finally understand how to solve my issue!
Really appreciate the help!
