Balance an unbalanced data frame (duplicated column names error)

Hi, I have the following data frame:

df <- tibble(
  Date = c("01-01-2005", "01-02-2005", "01-04-2005", "01-04-2005"),
  item_value = c(5,6,8,1),
  item_name = c("a", "b", "b", "a")
)

I would like to create a balanced data frame that looks like this:

Date                  item_value            item_name
01-01-2005                5                        a
01-01-2005                0                        b
01-02-2005                0                        a
01-02-2005                6                        b
01-03-2005                0                        a
.
.
.

I tried this:

df %>% complete(df, nesting(Date, item_name), fill = list(item_value = 0))

but this error occurs:

Error: Column names Date, item_name must not be duplicated.
Use .name_repair to specify repair.

How can I fix it please? Thanks a lot!

Since you are dealing with implicit missing values in a time variable, I think tsibble would be more appropriate.

library(dplyr)
library(tsibble)
library(lubridate)

df <- tibble(
    Date = c("01-01-2005", "01-02-2005", "01-04-2005", "01-04-2005"),
    item_value = c(5,6,8,1),
    item_name = c("a", "b", "b", "a")
)

df %>%
    mutate(Date = mdy(Date)) %>% 
    as_tsibble(key = item_name, index = Date) %>%
    fill_gaps(item_value = 0, .full = TRUE) %>%
    arrange(Date, item_name)
#> # A tsibble: 8 x 3 [1D]
#> # Key:       item_name [2]
#>   Date       item_value item_name
#>   <date>          <dbl> <chr>    
#> 1 2005-01-01          5 a        
#> 2 2005-01-01          0 b        
#> 3 2005-01-02          0 a        
#> 4 2005-01-02          6 b        
#> 5 2005-01-03          0 a        
#> 6 2005-01-03          0 b        
#> 7 2005-01-04          1 a        
#> 8 2005-01-04          8 b
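
If you would rather stay with tidyr, here is a minimal sketch of how the original complete() call could be adjusted. My reading of the error (an assumption, I have not traced it in the tidyr source) is that `df %>% complete(df, ...)` passes df twice: once through the pipe as the data, and once more as an expansion spec, so Date and item_name show up a second time when nesting() adds them. Note also that nesting() only keeps combinations already present in the data, so it would not add the missing rows. The sketch assumes the dates are month-day-year, as in the tsibble code above, and the name `balanced` is just for illustration.

```r
library(dplyr)
library(tidyr)
library(lubridate)

df <- tibble(
  Date = c("01-01-2005", "01-02-2005", "01-04-2005", "01-04-2005"),
  item_value = c(5, 6, 8, 1),
  item_name = c("a", "b", "b", "a")
)

balanced <- df %>%
  # Assumption: dates are month-day-year, as in the tsibble answer
  mutate(Date = mdy(Date)) %>%
  # The pipe already supplies df as the data, so it is not repeated here
  complete(
    # Every calendar day between the first and last observed date
    Date = seq(min(Date), max(Date), by = "day"),
    # Crossed with every item name seen in the data
    item_name,
    # Rows created by the expansion get 0 instead of NA
    fill = list(item_value = 0)
  )
```

Passing Date and item_name as separate arguments makes complete() cross them, which is what produces the rows that are missing from the original data.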