Hello everyone, and thank you in advance. I have these two data frames. The first one is a:
A tibble: 214 x 4
Groups: provincia_deteccion [16]
fecha provincia_deteccion n fa
<dttm> <chr> <int> <int>
1 2020-03-11 Sancti Spíritus 3 3
2 2020-03-13 Villa Clara 1 1
3 2020-03-16 La Habana 1 1
4 2020-03-17 Camagüey 1 1
5 2020-03-17 La Habana 1 2
6 2020-03-18 Holguín 1 1
7 2020-03-18 Sancti Spíritus 1 4
8 2020-03-18 Villa Clara 2 3
9 2020-03-19 Granma 1 1
10 2020-03-19 00:00:00 Holguín 1 2
...with 204 more rows
where n is the number of reported cases and fa is the cumulative frequency,
and this other one:
b
A tibble: 24 x 4
Groups: provincia_deteccion [7]
fecha provincia_deteccion fallecidos faa_fallecidos
1 2020-03-12 00:00:00 Sancti Spíritus 1 1
2 2020-03-26 00:00:00 La Habana 1 1
3 2020-03-28 00:00:00 Villa Clara 1 1
4 2020-03-29 00:00:00 Ciego de Ávila 1 1
5 2020-03-29 00:00:00 La Habana 1 2
6 2020-03-30 00:00:00 La Habana 1 3
7 2020-04-04 00:00:00 Isla de la Juventud 1 1
8 2020-04-04 00:00:00 Sancti Spíritus 1 2
9 2020-04-05 00:00:00 La Habana 1 4
10 2020-04-06 00:00:00 La Habana 1 5
where faa_fallecidos is the cumulative frequency.
I need to add to data frame (b) a column holding the (fa) of data frame (a), matched by date and province. I tried to do it with the left_join function, but it didn't work for me: when a date in (b) has no row with exactly the same date and province in (a), the joined columns come back as NA instead of the cumulative value reached up to that date. For example:
prueba <- b %>% left_join(a)
Result:
fecha provincia_deteccion fallecidos faa_fallecidos n fa
1 2020-03-18 00:00:00 Sancti Spíritus 1 1 1 4
2 2020-03-26 00:00:00 La Habana 1 1 2 25
3 2020-03-28 00:00:00 Villa Clara 1 1 NA NA
4 2020-03-29 00:00:00 Ciego de Ávila 1 1 4 15
5 2020-03-29 00:00:00 La Habana 1 2 8 42
6 2020-03-30 00:00:00 La Habana 1 3 8 50
7 2020-04-04 00:00:00 Isla de la Juventud 1 1 1 1
8 2020-04-04 00:00:00 Sancti Spíritus 1 2 1 18
9 2020-04-05 00:00:00 La Habana 1 4 15 109
10 2020-04-06 00:00:00 La Habana 1 5 6 115
By that time "Villa Clara" and "Isla de la Juventud" already had many accumulated cases.
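To make the goal concrete, here is a minimal sketch of one approach that might give what I am after: carry the most recent fa of each province forward to the dates in (b). It assumes dplyr and tidyr are installed, uses only a small subset of the rows shown above (so the fa values are illustrative, not the real totals), and assumes fallecidos is never NA in (b), since that column is used to pick the rows of (b) back out.

```r
library(dplyr)
library(tidyr)

# Small subset of (a) and (b) from the post, for illustration only
a <- tibble(
  fecha = as.POSIXct(c("2020-03-13", "2020-03-18", "2020-03-17"), tz = "UTC"),
  provincia_deteccion = c("Villa Clara", "Villa Clara", "La Habana"),
  n  = c(1L, 2L, 1L),
  fa = c(1L, 3L, 2L)
)
b <- tibble(
  fecha = as.POSIXct(c("2020-03-26", "2020-03-28"), tz = "UTC"),
  provincia_deteccion = c("La Habana", "Villa Clara"),
  fallecidos     = c(1L, 1L),
  faa_fallecidos = c(1L, 1L)
)

prueba <- b %>%
  ungroup() %>%
  # keep every date from both tables so earlier fa values are available
  full_join(
    a %>% ungroup() %>% select(fecha, provincia_deteccion, fa),
    by = c("fecha", "provincia_deteccion")
  ) %>%
  arrange(provincia_deteccion, fecha) %>%
  group_by(provincia_deteccion) %>%
  # carry the last known cumulative count forward within each province
  fill(fa, .direction = "down") %>%
  ungroup() %>%
  # drop the rows that came only from (a); assumes fallecidos is never NA in (b)
  filter(!is.na(fallecidos)) %>%
  arrange(fecha, provincia_deteccion)

print(prueba)
```

In this subset, the Villa Clara row of 2020-03-28 picks up fa = 3 from 2020-03-18 instead of NA. A province with no earlier row in (a) would still get NA. With dplyr 1.1.0 or later, a rolling join through join_by() with closest() might be another option, but I have not tried it on this data.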
Thank you for allowing us to express ourselves in this forum.