How to mutate a ratio for two populations by year

My dataframe is below.

   overseas_domestic_indicator ref_year count
   <chr>                          <dbl> <dbl>
 1 Domestic                        2014 17854
 2 Domestic                        2015 18371
 3 Domestic                        2016 18975
 4 Domestic                        2017 19455
 5 Domestic                        2018 19819
 6 Overseas                        2014  6491
 7 Overseas                        2015  7393
 8 Overseas                        2016  8594
 9 Overseas                        2017  9539
10 Overseas                        2018 10455

I want something like:

ref_year Domestic/Overseas
2014    2.75
2015    ...
...     ...

I am really new to tidyverse. I don't know how to do this using tidyverse. I tried to use mutate but I don't know how to clarify the count for Domestic and Overseas.

Thanks in advance,

Danny

Reshape your data to a wider format to make calculations easier

library(tidyverse)

# Sample data on a copy/paste friendly format
sample_df <- data.frame(
             stringsAsFactors = FALSE,
  overseas_domestic_indicator = c("Domestic","Domestic","Domestic","Domestic",
                                  "Domestic","Overseas","Overseas","Overseas",
                                  "Overseas","Overseas"),
                     ref_year = c(2014,
                                  2015,2016,2017,2018,2014,2015,2016,2017,
                                  2018),
                        count = c(17854,
                                  18371,18975,19455,19819,6491,7393,8594,
                                  9539,10455)
)

sample_df %>% 
    spread(overseas_domestic_indicator, count) %>% 
    mutate(rate = Domestic/Overseas)
#>   ref_year Domestic Overseas     rate
#> 1     2014    17854     6491 2.750578
#> 2     2015    18371     7393 2.484918
#> 3     2016    18975     8594 2.207936
#> 4     2017    19455     9539 2.039522
#> 5     2018    19819    10455 1.895648

Created on 2020-05-21 by the reprex package (v0.3.0)

Note: For future posts, please provide a proper REPRoducible EXample (reprex) illustrating your issue.

1 Like

You could also use pivot_wider (names_from = column to be pivoted, values_from = values)

sample_df <- sample_df %>%
pivot_wider(names_from = overseas_domestic_indicator,
values_from = count) %>%
mutate (Ratio = Domestic/Overseas)

According to this vignette, Pivoting in tidyverse, pivot_wider is an improvement on spread(), However, Hadley Wickham's tweet, advises that spread() and gather() "will stay around forever"

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.