Problems creating a table from a tibble, or plotting columns from different tibbles in one graph

Hey guys,

I have the following data frame with 45 million observations:

year month variable
 1992    1    0
 1992    1    1
 1992    1    1
 1992    2    0
 1992    2    1
 1992    2    0  

My goal is to count the frequency of the variable for each month of a year.
I was already able to generate these sums with cps_data as my dataframe and SKILL_1 as my variable.

cps_data %>%
  group_by(YEAR, MONTH) %>%
  summarise_at(vars(SKILL_1),
               list(name = sum))

Logically, I obtained 349 rows as a tibble (shown below). Now I am struggling to create a new table with these values. The new table should look similar to my tibble. How can I do that? Is there even a way? I have already tried reading in an Excel file with a date range from 01/1992 to 01/2021, in order to get exactly 349 rows, and then merging it with the rows of the tibble, but that did not work.

# A tibble: 349 x 3
# Groups:   YEAR [30]
    YEAR          MONTH  name
   <dbl>      <int+lbl> <dbl>
 1  1992  1 [January]     499
 2  1992  2 [February]    482
 3  1992  3 [March]       485
 4  1992  4 [April]       457
 5  1992  5 [May]         434
 6  1992  6 [June]        470
 7  1992  7 [July]        450
 8  1992  8 [August]      438
 9  1992  9 [September]   442
10  1992 10 [October]     427
# ... with 339 more rows

Many thanks in advance!

You obtained it but struggle to have it? That seems paradoxical, unless you are missing that a calculated result must be assigned to a name before you can reuse it in later steps. In R, the <- assignment operator is most often used for this.

a <- 1 + 2
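
Applied to the summary from the question, this might look like the sketch below. The six rows are the ones shown in the question; `monthly_sums` is just a hypothetical name, and `summarise()` is used here as a stand-in for whatever summarising call you actually ran.

```r
library(dplyr)

# Toy stand-in for cps_data, using the six rows shown in the question
cps_data <- tibble(
  YEAR    = c(1992, 1992, 1992, 1992, 1992, 1992),
  MONTH   = c(1, 1, 1, 2, 2, 2),
  SKILL_1 = c(0, 1, 1, 0, 1, 0)
)

# Assign the summarised result to a name so it can be reused later
monthly_sums <- cps_data %>%
  group_by(YEAR, MONTH) %>%
  summarise(name = sum(SKILL_1), .groups = "drop")

monthly_sums                 # a tibble, which is already a table
as.data.frame(monthly_sums)  # a plain data.frame, if some function insists on one
```

Once the result has a name, it can be passed to joins, plots, or export functions like any other data frame.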


I find it odd that the result would be 349 rows. There are 29 years between 1992 and 2021, so 29 years x 12 months = 348 rows.

If it is a year off either way, it will be 348 +/- 12 rows, not 349. In other words, the data would have to cover 29 years plus 1 month for there to be 349 rows.
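
The count can be checked quickly in base R. This sketch assumes the data cover every month from January 1992 through January 2021, both ends included, which is the range mentioned in the question.

```r
# Month starts from January 1992 to January 2021, both ends included
months <- seq(as.Date("1992-01-01"), as.Date("2021-01-01"), by = "month")

# 29 full years (348 months) plus the extra January
length(months)
```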

Thanks for your answer!
The reason I would like to turn it into a real table is that, in the end, I want to plot variables from multiple tibbles in one graph.
For instance, I create two tibbles, named tibble1 and tibble2, and I would like to have their sums, SKILLSUM_1 and SKILLSUM_2, in one table so that I can plot them.

 tibble1 <- cps_data %>%
   group_by(YEAR, MONTH) %>%
   summarise_at(vars(SKILL_1),
                list(SKILLSUM_1 = sum))

 tibble2 <- cps_data %>%
   group_by(YEAR, MONTH) %>%
   # SKILL_2 is an assumed column name; the original one was lost
   summarise_at(vars(SKILL_2),
                list(SKILLSUM_2 = sum))

If I then try to create a plot with columns from two different tibbles, it won't work. That is the actual reason why I would like to transform my tibble into a table, which would be one way to solve my problem. Maybe I should have mentioned that...

The other way would be to plot variables from multiple tibbles in one graph. I already tried that in the example below, but unfortunately it did not work. Is there even a way?

ggplot(tibble1, tibble2, aes(x=YEAR)) + 
  geom_line(aes(y = SKILLSUM_1), color = "darkred") + 
  geom_line(aes(y = SKILLSUM_2), color = "steelblue", linetype="twodash")

Many thanks in advance

Hey thanks for your answer!
Sorry for being imprecise. This is exactly the case! The dates range from 01/1992 to 01/2021.

I think we might just be having communication issues. It's not clear what a "real table" is; tables are tables. Tibbles are just a form of data.frame, which is a tabular data structure, i.e. a table.

If you want to combine two tibbles, the usual operations are known as joining or merging. dplyr has left_join() and related functions.
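
A minimal sketch of such a join, assuming both tibbles share the YEAR and MONTH columns. The SKILLSUM_1 values are the first three from the output in the question; the SKILLSUM_2 values are made up for illustration, and `combined` is a hypothetical name.

```r
library(dplyr)

# Small stand-ins for the two summarised tibbles
tibble1 <- tibble(YEAR = c(1992, 1992, 1992), MONTH = 1:3,
                  SKILLSUM_1 = c(499, 482, 485))
tibble2 <- tibble(YEAR = c(1992, 1992, 1992), MONTH = 1:3,
                  SKILLSUM_2 = c(210, 198, 205))  # made-up values

# Match rows on year and month; both sums end up side by side in one table
combined <- left_join(tibble1, tibble2, by = c("YEAR", "MONTH"))
combined
```

Because both tibbles are summarised from the same cps_data, the key columns should line up, but it is worth checking that the joined table has the expected number of rows.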

I've never seen two tables passed to the ggplot() function before. What I have done in the past is use the $ operator to call a column from another table. This is one of the beautiful things you can do in R that you couldn't do in SAS.

I also often just join the tables together, using left_join or full_join for a full join. But you can do something like this as long as the two columns are the same length:

ggplot(tibble1, aes(x = .data$YEAR)) + 
  geom_line(aes(y = .data$SKILLSUM_1), color = "darkred") + 
  geom_line(aes(y = tibble2$SKILLSUM_2), color = "steelblue", linetype="twodash")
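
For the join route, one possible sketch is to reshape the joined table to long format and let ggplot2 draw one line per series. The data here are made up, and `combined`, `plot_data`, `DATE`, `series`, and `total` are hypothetical names. Building a DATE column from YEAR and MONTH is my own assumption about what the x-axis should be, since plotting against YEAR alone would stack all twelve months of a year on the same x value.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Made-up joined table with both sums side by side
combined <- tibble(
  YEAR       = c(1992, 1992, 1992, 1993, 1993, 1993),
  MONTH      = c(1, 2, 3, 1, 2, 3),
  SKILLSUM_1 = c(499, 482, 485, 470, 455, 460),
  SKILLSUM_2 = c(210, 198, 205, 220, 215, 230)
)

plot_data <- combined %>%
  # Build a proper date so that months are ordered along the x-axis
  mutate(DATE = as.Date(sprintf("%d-%02d-01", as.integer(YEAR), as.integer(MONTH)))) %>%
  # One row per date and series instead of one column per series
  pivot_longer(c(SKILLSUM_1, SKILLSUM_2), names_to = "series", values_to = "total")

p <- ggplot(plot_data, aes(x = DATE, y = total, color = series)) +
  geom_line()
p
```

With the long format, the legend comes for free and adding a third series only means adding one more column to the pivot_longer() call.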
