"Data contains implicit gaps in time" error with the tsibble library

Hello! I am new to time series and I am having trouble with a code snippet for tsibble. I am trying to draw a seasonal plot using the gg_season function. My dataframe has two columns: the complete date and the daily temperature. However, for some reason, I get this message when I run the code below:

Error in check_gaps():
! data contains implicit gaps in time. You should check your data and convert implicit gaps into explicit missing values using tsibble::fill_gaps() if required.
Run rlang::last_trace() to see where the error occurred.

I have checked and I don't have any gaps. As expected, I have 365 rows with no missing values.

What am I missing or doing wrong? I wasn't able to find an answer myself.

# Loading libraries
library(tidyverse)
library(tsibble)
library(ggplot2)
library(feasts)
library(tsibbledata)
library(dplyr)

# Altering the dataframe
temp <- temp %>%
mutate(Daily = ymd(Date)) %>%
select(-Date) %>%
as_tsibble(key = c(Temperatura), index = Daily)
temp <- temp[order(temp$Daily),]
temp

temp %>%
  gg_season(Temperature)

It would be helpful to see a sample of your data. Run dput(head(temp, 31)) and paste the result in a post. That should have the first 31 days for both Date and Temperature (or Temperatura?).

If these are the only two variables, then there is no need to have a key in as_tsibble(). It is also unclear why you need to change the order based on the Daily variable. Is temp not in chronological order? If you simply convert Date to ymd format without changing the name, that leaves:

temp <- temp %>%
mutate(Date = ymd(Date)) %>%
as_tsibble(index = Date)

temp
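A small sketch of why the key matters here, using made-up data (the data frame `toy` and its values are hypothetical, not the original `temp`). It compares `has_gaps()` for the same six days with and without the temperature column as the key:

```r
library(tsibble)

# Hypothetical toy data: six consecutive days, some temperatures repeat
toy <- data.frame(
  Date = as.Date("2023-01-01") + 0:5,
  Temperature = c(5, 7, 5, 7, 6, 6)
)

# No key: a single daily series in which every day is present
no_key <- as_tsibble(toy, index = Date)
has_gaps(no_key)

# Temperature as key: each distinct temperature becomes its own series,
# containing only the days on which that value was recorded
with_key <- as_tsibble(toy, key = Temperature, index = Date)
n_keys(with_key)
has_gaps(with_key)
```

With the key set, the series for a given temperature skips every day that had a different value, which is likely what check_gaps() is complaining about. A tsibble is also arranged by key first and then by index, which would explain rows not appearing in date order.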

As @EconProf notes, data is needed to make the code example complete. Here is an example:

library(feasts)
#> Loading required package: fabletools
library(tsibble)
#> 
#> Attaching package: 'tsibble'
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, union
# 365 daily maximum temperatures
d <- data.frame(
  dates = seq(lubridate::ymd("2010-01-01"), length.out = 365, by = 1),
  maxtemps = c(
  6,1,7,7,10,16,16,8,6,15,22,22,28,31,28,35,34,34,23,
  28,30,33,36,35,34,16,11,7,9,17,16,16,21,21,29,33,28,24,
  25,26,24,25,25,23,24,30,34,35,37,31,35,31,32,23,20,26,32,
  38,38,38,42,41,39,42,41,47,37,39,42,47,43,45,64,60,50,58,
  64,45,39,50,55,61,55,42,51,52,52,59,75,76,81,72,61,67,60,
  52,56,52,67,63,69,68,69,74,71,61,65,67,69,71,61,66,69,60,
  61,63,60,69,73,69,64,64,57,79,55,54,48,52,58,58,46,51,53,
  71,74,74,76,75,79,82,65,82,88,95,89,80,81,85,91,81,78,82,
  73,78,83,70,78,75,70,73,66,76,62,67,69,76,80,89,86,76,82,
  79,91,80,82,86,84,84,75,74,79,86,88,92,85,85,88,87,85,87,
  86,83,79,85,93,85,90,92,84,80,86,88,77,86,83,83,86,94,83,
  82,74,86,87,86,95,90,82,85,89,96,95,90,93,92,88,86,77,79,
  78,78,80,86,85,88,88,78,73,79,88,89,94,92,86,77,73,61,69,
  73,71,63,69,65,71,73,79,74,68,66,56,64,60,64,80,77,65,74,
  61,61,65,75,66,74,72,70,56,59,65,74,71,74,87,83,81,80,72,
  66,69,61,66,59,56,60,68,53,68,65,61,61,63,42,42,53,52,49,
  52,56,56,46,45,57,61,63,69,68,52,43,36,35,39,35,37,31,39,
  29,34,31,24,33,27,26,28,45,41,36,20,20,20,25,18,21,17,18,
  34,28,22,8,6,10,17,20,15,17,15,26,31,32,27,26,20,24,21,
  34,37,42,24))

# to tsibble

dts <- as_tsibble(d)
#> Using `dates` as index variable.

# plot

dts |> gg_season()
#> Plot variable not specified, automatically selected `y = maxtemps`


dts
#> # A tsibble: 365 x 2 [1D]
#>    dates      maxtemps
#>    <date>        <dbl>
#>  1 2010-01-01        6
#>  2 2010-01-02        1
#>  3 2010-01-03        7
#>  4 2010-01-04        7
#>  5 2010-01-05       10
#>  6 2010-01-06       16
#>  7 2010-01-07       16
#>  8 2010-01-08        8
#>  9 2010-01-09        6
#> 10 2010-01-10       15
#> # ℹ 355 more rows

Created on 2023-04-23 with reprex v2.0.2

If your data has a datetime column, rather than a date column, that could be the source of the error.
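A minimal sketch of that situation, assuming one reading per day taken at slightly different times (the data frame `raw` and the column names `stamp`, `day` and `temp` are hypothetical):

```r
library(tsibble)

# Hypothetical readings: one per day, but not at the same time of day
raw <- data.frame(
  stamp = as.POSIXct(
    c("2023-01-01 08:00:00", "2023-01-02 09:30:00", "2023-01-03 08:00:00"),
    tz = "UTC"
  ),
  temp = c(5, 7, 6)
)

# With a date-time index the interval is inferred from the spacing in
# seconds, so it comes out shorter than one day
as_dttm <- as_tsibble(raw, index = stamp)
interval(as_dttm)
has_gaps(as_dttm)

# Dropping the time of day leaves one observation per calendar day
raw$day <- as.Date(raw$stamp, tz = "UTC")
as_day <- as_tsibble(raw[, c("day", "temp")], index = day)
interval(as_day)
has_gaps(as_day)
```

Whether this applies depends on how the date column was read in; `class(temp$Date)` on the original data would show it.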

Thanks to both of you. I removed the key, as I only had two columns, and with that the plot is working correctly. I am not sure why, but I'll leave it as it is when I have only two variables.

I had the ordering step because, for some reason, the data wasn't ordered after applying as_tsibble. This is working now too after removing the key.

Thanks a lot! I'm new to R and it is SO different from Python!

The key to my understanding of R was to start thinking of it as school algebra: f(x) = y

where

x is an object containing information to be transformed somehow
y is the object to be derived from x
f is the function object to transform x to y; it may be, and often is, composite, like f(g(x)) = y

Given the origins of R, this makes sense. It began as S, a statistics workbench in an interactive environment. An advantage is keeping the focus on what to do rather than how to do it.
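As a tiny illustration of the f(g(x)) = y idea (the choice of `sqrt` and `sum` here is arbitrary, just for the sketch):

```r
x <- c(1, 4, 9)   # the object to transform

g <- sqrt         # inner function
f <- sum          # outer function

y <- f(g(x))      # composite: take square roots, then add them up
y

# The same composition written with the native pipe (R >= 4.1)
x |> g() |> f()
```

The pipe form reads left to right, which is the style used in the tsibble examples earlier in the thread.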

Suppose your temperature data was for two different cities, so the variables are Date, Temperature and City. Each date would appear in two different rows, one for each city. The combination of Date (as the index) and City (as the key) would define a unique observation. For your data, each date is a unique observation without needing a key variable.
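A rough sketch of the two-city case, with made-up values (the data frame `two_cities` and the city names "A" and "B" are hypothetical):

```r
library(tsibble)

# Hypothetical data: three days of temperatures for two made-up cities
two_cities <- data.frame(
  Date = rep(as.Date("2023-01-01") + 0:2, times = 2),
  City = rep(c("A", "B"), each = 3),
  Temperature = c(5, 7, 6, 12, 14, 13)
)

# Each date appears twice, so Date alone cannot identify a row;
# building a tsibble without a key is expected to fail here
no_key <- try(as_tsibble(two_cities, index = Date), silent = TRUE)
inherits(no_key, "try-error")

# Date (index) plus City (key) identifies every row uniquely
ts_cities <- as_tsibble(two_cities, key = City, index = Date)
n_keys(ts_cities)
has_gaps(ts_cities)
```

Here the key names the series (the city), not the measured value, so each city forms a complete daily series of its own.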

Thanks for this! It's clear now! Thanks a lot, I truly appreciate it.
