Split the time series dataset

Hi, is there a way to label the rows of a time series dataset as training and test data, split 80 and 20 percent? For example:

df <- structure(list(Date = structure(c(3L, 4L, 5L, 6L, 1L, 2L), .Label = c("1/9/2019", 
"2/9/2019", "27/8/2019", "28/8/2019", "29/8/2019", "30/8/2019"
), class = "factor"), Val = c(1, 2, 3, 4, 5, 6)), class = "data.frame", row.names = c(NA, 
-6L))

df
Date         Val     Split
27/8/2019     2    Training
28/8/2019     2    Training
29/8/2019     4    Training
30/8/2019     4    Training
28/9/2019     8       Test
29/9/2019     9       Test

One possible way:

dataset <- data.frame(Date = c("27/8/2019", "28/8/2019", "29/8/2019", "30/8/2019", "1/9/2019", "2/9/2019"),
                      Val = c(1, 2, 3, 4, 5, 6))

dataset$Split <- rep(x = c("Training", "Test"),
                     times = c(floor(x = 0.8 * nrow(x = dataset)), ceiling(x = 0.2 * nrow(x = dataset))))

dataset
#>        Date Val    Split
#> 1 27/8/2019   1 Training
#> 2 28/8/2019   2 Training
#> 3 29/8/2019   3 Training
#> 4 30/8/2019   4 Training
#> 5  1/9/2019   5     Test
#> 6  2/9/2019   6     Test

Created on 2019-10-15 by the reprex package (v0.3.0)
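A slightly more defensive variant of the same idea, as a minimal sketch: compute the training count once and take the test count as the remainder, so the two counts always add up to the number of rows regardless of rounding. The names n and n_train are just illustrative helpers, not part of the code above.

```r
dataset <- data.frame(
  Date = c("27/8/2019", "28/8/2019", "29/8/2019", "30/8/2019", "1/9/2019", "2/9/2019"),
  Val = c(1, 2, 3, 4, 5, 6)
)

n <- nrow(dataset)
n_train <- floor(0.8 * n)  # training rows: 80% of the data, rounded down

# The test count is whatever is left, so both counts always sum to n
dataset$Split <- rep(c("Training", "Test"), times = c(n_train, n - n_train))
```

This assumes the rows are already in chronological order, as they are in the example data.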

Then, if you want to split it, you can do this (among many other possibilities):

zeallot::`%<-%`(x = c(test, train),
                value = split(x = dataset,
                              f = dataset$Split))
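Note that the call above depends on split() returning the groups in alphabetical order ("Test" before "Training"), which is why test is listed first. If you would rather not rely on that ordering, or do not want the extra package, here is a base R sketch that picks the pieces out by name. The object name parts is only illustrative.

```r
dataset <- data.frame(
  Date = c("27/8/2019", "28/8/2019", "29/8/2019", "30/8/2019", "1/9/2019", "2/9/2019"),
  Val = c(1, 2, 3, 4, 5, 6),
  Split = c("Training", "Training", "Training", "Training", "Test", "Test")
)

# split() returns a named list with one data frame per value of Split
parts <- split(dataset, dataset$Split)

# Select by name, so the order of the list elements does not matter
train <- parts$Training
test <- parts$Test
```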

However, if splitting is your objective, and not creating a new column, then this is probably better:

train_indices <- seq_len(length.out = floor(x = 0.8 * nrow(x = dataset)))
train <- dataset[train_indices,]
test <- dataset[-train_indices,]
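The index-based split only makes sense for a time series if the rows are in chronological order. In case yours are not, here is a rough sketch that parses the dates first and sorts before splitting. It assumes the dates are day/month/year strings (or factors), as in the example, and the shuffled data below is made up for illustration.

```r
# Illustrative data: same values as above, but deliberately out of order
df <- data.frame(
  Date = c("1/9/2019", "27/8/2019", "2/9/2019", "28/8/2019", "30/8/2019", "29/8/2019"),
  Val = c(5, 1, 6, 2, 4, 3)
)

# Convert to Date; as.character() also covers the case where Date is a factor
df$Date <- as.Date(as.character(df$Date), format = "%d/%m/%Y")

# Sort chronologically so the training set holds the earliest observations
df <- df[order(df$Date), ]

n_train <- floor(0.8 * nrow(df))

# head()/tail() also behave sensibly when n_train is 0, whereas negative
# indexing with an empty index vector would return no rows at all
train <- head(df, n_train)
test <- tail(df, nrow(df) - n_train)
```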

Hope this helps.

Short feedback on your code
  1. If you provide code to create your data and then show the output you want rather than the data you have, it's confusing.
  2. For small datasets, especially data frames, dput output is more than you need. Creating the data frame directly, as I did in my code above, is probably more readable.

Perfect, thanks a lot.
