Plot with ggplot

Hi, I'm trying to replicate the apple mobility plot https://www.apple.com/covid19/mobility/ with ggplot and having trouble with it. My data frame has the countries as rows and the dates a as column names.
Any help would be greatly appreciated. I tried to upload a screenshot and it didn't work

If you have the dates as columns, then you will need to reshape the data first:
https://tidyr.tidyverse.org/articles/tidy-data.html

Thanks, so I make the dates as the rows and the countries as the columns?

tidy data would have columns for country, date, value(i.e. number_of_cases), the rows would have the information that is appropriate under such headings.

No, you reshape the data to a 'long' format via e.g. tidyr::pivot_longer() such that your data looks like:

"USA", "2020-04-01", 100
"USA", "2020-04-02", 101
"Italy", "2020-04-01", 200 
"Italy", "2020-04-02", 201

maps1 %>%
pivot_longer(-Group.1, names_to = "Dates", values_to = "frequency")

Is still returning the same data frame as there original, am I doing something wrong?

We don't really have enough info to help you out. Could you ask this with a minimal REPR oducible EX ample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.

If you've never heard of a reprex before, you might want to start by reading this FAQ:

Martin thanks for you advice. I have now got the data frame reshaped. I tried to add my code using datapasta package but it was giving me grief to. I've managed to plot the data using ggplot and have loads the geom_line from multiple data sets. Its all working but how do I create a legend representing each country?

If it's from multiple datasets you are making it more difficult than otherwise.
One dataset where the the country variable can be the colour mapping of the geom would be easier

How do I write that? Below is how I separated into county data frames and then plotted.
But you are suggesting that I can plot it from the original data frame?

AustraliaDataMapMobilityTransit <- subset(transitData, region =="Australia")
BelgiumDataMapMobilityTransit <- subset(transitData, region == "Belgium")
BrazilDataMapMobilityTransit <- subset(transitData, region == "Brazil")
CanadaDataMapMobilityTransit <- subset(transitData, region == "Canada")
CzechRepublicDataMapMobilityTransit <- subset(transitData, region == "Czech Republic")

ggplot()+
geom_line(data = AustraliaDataMapMobilityTransit, aes(x = Dates, y= frequency), colour = "red")+
geom_line(data=BelgiumDataMapMobilityTransit, aes(x= Dates, y = frequency), colour = "blue")+
geom_line(data=BrazilDataMapMobilityTransit, aes(x= Dates, y = frequency), colour = "yellow")+
geom_line(data=CanadaDataMapMobilityTransit, aes(x= Dates, y = frequency), colour = "black")+
geom_line(data=CzechRepublicDataMapMobilityTransit, aes(x= Dates, y = frequency), colour = "orange")


ggplot(data = transitData)+
geom_line( mapping = aes(x = Dates, y= frequency, colour = region))
1 Like

Thanks you are a legend. How do I only plot top 5 and bottom 5.

Top 5 countries and bottom 5 countries, each country has approx 100 rows and there is approx 90 odd countries in the list

how do you calculate a ranking of the counties ? i.e. you have a summarisation over their dates in mind ?

The countries are listed in order, but there is 100 entries for each country before it moves to the next Screen Shot 2020-04-22 at 10.05.50 pm

Or is there a way to specify what countries I want to include

Say I want to graph Australia, Canada,America and Russia

sorry friend, you are a little all over the place...
you can certainly filter your data on the way into your graph.
dplyr::filter() function is for that purpose.

filtered_data <- transitData %>% filter(region %in% c("Australia", "Canada"))

ggplot(data = filtered_data )+
geom_line( mapping = aes(x = Dates, y= frequency, colour = region))

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.