JeffF
January 10, 2023, 8:59pm
1
I have some data for false crawls (FA) on certain dates. On some days there were no false crawls such 22-05-08.
activity
activity_date
FA
2022-05-06
FA
2022-05-07
FA
2022-05-09
FA
2022-05-15
FA
2022-05-16
FA
2022-05-16
FA
2022-05-16
FA
2022-05-16
FA
2022-05-16
FA
2022-05-17
FA
2022-05-17
If a count is performed such as
data_count <-data |> count(activity_date, name = ("crawls"))
the expected results are:
activity_date
crawls
2022-05-06
1
2022-05-07
1
2022-05-09
1
2022-05-15
1
2022-05-16
5
2022-05-17
2
I want to be able to show that there were zero false crawls on the missing days such that a table such as this results:
activity_date
crawls
2022-05-06
1
2022-05-07
1
2022-05-08
0
2022-05-09
1
2022-05-10
0
2022-05-11
0
2022-05-12
0
2022-05-13
0
2022-05-14
0
2022-05-15
1
2022-05-16
5
2022-05-17
2
How can I do this, add in those missing days for which there were no false crawls?
Thanks,
Jeff
Below is one way to achieve the desired output.
library(tidyverse)
data = tribble(
~activity, ~activity_date,
'FA', '2022-05-06',
'FA', '2022-05-07',
'FA', '2022-05-09',
'FA', '2022-05-15',
'FA', '2022-05-16',
'FA', '2022-05-16',
'FA', '2022-05-16',
'FA', '2022-05-16',
'FA', '2022-05-16',
'FA', '2022-05-17',
'FA', '2022-05-17'
) |>
mutate(activity_date = as.Date(activity_date))
out = data.frame(activity_date = seq(min(data$activity_date),
max(data$activity_date),
'days')) %>%
left_join(
count(data, activity_date, name = 'crawls')
) %>%
mutate(crawls = replace_na(crawls, 0))
#> Joining, by = "activity_date"
out
#> activity_date crawls
#> 1 2022-05-06 1
#> 2 2022-05-07 1
#> 3 2022-05-08 0
#> 4 2022-05-09 1
#> 5 2022-05-10 0
#> 6 2022-05-11 0
#> 7 2022-05-12 0
#> 8 2022-05-13 0
#> 9 2022-05-14 0
#> 10 2022-05-15 1
#> 11 2022-05-16 5
#> 12 2022-05-17 2
Created on 2023-01-10 with reprex v2.0.2.9000
1 Like
JeffF
January 11, 2023, 12:46am
3
Yes, that worked perfectly. I split out the left_join to understand what was going on. I used this approach to my full data set (115 observations) and applied cumsum of the counts to get a third column of what I wanted to plot.
Thanks much,
Jeff
1 Like
system
Closed
January 18, 2023, 12:47am
4
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed. If you have a query related to it or one of the replies, start a new topic and refer back with a link.