create time series in grouped data frame

I have a data frame with page impressions for different URLs over the last year.

[1] "path" "date" "impressions"
I want to group the data frame by path and add a time series to each group. I tried this:

path_metrics = gsm %>% group_by(path, date)  %>% mutate(date=as.Date(date)) %>% mutate(ts =  zoo(date, clicks))

I got this error message:

Problem with mutate() input ..1.
x “value” : attempt to assign invalid coredata to zoo object
ℹ Input ..1 is zoo(date, impressions).
ℹ The error occurred in group 9: path = "/foo/", date = 2021-05-08

Here is the error trace:

Problem with `mutate()` input `..1`.
x indexes overlap
ℹ Input `..1` is `zoo(date, clicks)`.
ℹ The error occurred in group 600: date = "2021-05-25".
  1. ├─`%>%`(...)
  2. ├─dplyr::mutate(., zoo(date, clicks))
  3. ├─dplyr:::mutate.data.frame(., zoo(date, clicks))
  4. │ └─dplyr:::mutate_cols(.data, ...)
  5. │   ├─base::withCallingHandlers(...)
  6. │   ├─base::withCallingHandlers(...)
  7. │   └─vctrs::vec_unchop(chunks <- vec_cast_common(!!!chunks), rows)
  8. ├─vctrs:::base_c_invoke(...)
  9. │ └─vctrs:::base_c(xs)
 10. │   └─rlang::exec("c", !!!xs, .env = ns_env("base"))
 11. ├─base::c(...)
 12. ├─zoo:::c.zoo(...)
 13. │ └─zoo::rbind.zoo(...)
 14. │   └─base::stop("indexes overlap")
 15. └─base::.handleSimpleError(...)
 16.   └─dplyr:::h(simpleError(msg, call))
indexes overlap

How can I get a grouped data frame that contains a time series for each group?
In the next step I want to compare the impressions of each URL to find those that have the same seasonality.

I've never tried to do this! One alternative possibility is to coerce the time series to numeric with as.numeric() and store the start and frequency time series parameters in new data frame columns. That way, after grouping, you can recreate the time series easily with a mutate.
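
To illustrate the idea above, here is a minimal sketch, not tested against your data. The example data frame, the column names `ts_start` and `ts_freq`, and the monthly frequency are all made up for illustration. It also uses `summarise()` with a list column rather than `mutate()`, so that each path ends up with exactly one rebuilt `ts` object.

```r
library(dplyr)

# Hypothetical example data: two URLs, 24 monthly values each,
# stored as plain numerics instead of time series objects
gsm <- data.frame(
  path = rep(c("/foo/", "/bar/"), each = 24),
  impressions = c(1:24, 24:1)
)

# Keep the time series parameters in ordinary columns (assumed values)
gsm$ts_start <- 2020  # series starts in January 2020
gsm$ts_freq <- 12     # monthly observations

# After grouping, rebuild one ts object per path and keep it in a list column
rebuilt <- gsm %>%
  group_by(path) %>%
  summarise(
    series = list(ts(impressions, start = ts_start[1], frequency = ts_freq[1])),
    .groups = "drop"
  )

# Each element of rebuilt$series is an ordinary ts object again
rebuilt$series[[1]]
```

Wrapping the result in `list()` is what lets a whole time series sit in a single data frame cell.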

A reprex would be very helpful. First, why do you group by both path and date? Doesn't this give you groups with just one observation each? Second, I thought that zoo requires vectors, not a data frame, but I have not used zoo in a long time. Third, your use of mutate() appears to be trying to add ts to the gsm data frame as a new variable. Fourth, it is confusing when you change from impressions to clicks. Are they different?
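
In case it helps, here is a rough sketch of what grouping by path only might look like. The data below is a made-up stand-in for `gsm`, and I am assuming each path has at most one row per date. Note that `zoo()` takes the values first and the index via `order.by`, and that the result is wrapped in `list()` so each path gets one zoo object in a list column.

```r
library(dplyr)
library(zoo)

# Hypothetical stand-in for gsm: two paths, five daily values each
gsm <- data.frame(
  path = rep(c("/foo/", "/bar/"), each = 5),
  date = rep(as.character(seq(as.Date("2021-05-01"), by = "day", length.out = 5)), times = 2),
  impressions = c(10, 12, 9, 15, 11, 3, 4, 6, 5, 7)
)

path_metrics <- gsm %>%
  mutate(date = as.Date(date)) %>%
  group_by(path) %>%  # group by path only, not by date
  summarise(
    # values first, dates as the index; list() keeps the zoo object intact
    ts = list(zoo(impressions, order.by = date)),
    .groups = "drop"
  )

# One row per path; each ts entry is a zoo series indexed by Date
path_metrics$ts[[1]]
```

If a path does have duplicate dates, the impressions would probably need to be aggregated per date first, since zoo expects unique index values.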
