How to split data according to the time values?

Hi, I am a R newbie and even though I've already googled a lot, I couldn't find any solution to my problem. I have a group of (physiological) data in which every value has its own specific time. I need to split them according to the timestamp (such as "video started") stored in the another file (they are both dataframes). Shortly, I need to split the file into 9 "chunks" during which something specific happened (4 videos and 5 resting periods) so I might have a dataframe video1 and see only data that fall into this time period. I am really not sure how to do that as the timestamps are single values. Could someone give me an advice, please?

Hi @Anysha, welcome to R and to the RStudio Community.

It's easier for people to help you if you can show in text something you've already tried to do, and to give a sample of the data for people to access as well. It's harder to work with screenshots! The standard way to present your code for help is by using a reprex.

In this situation, it looks like it should not be too hard to give you a hint as to how to achieve what you need. I'd use dplyr's case_when.

library(dplyr)

results <- HR_raw %>%
  mutate(period = case_when(
    sessionTime > timestamps$timestamp[9] ~ "rest5",
    sessionTime > timestamps$timestamp[8] ~ "video4",
    sessionTime > timestamps$timestamp[7] ~ "rest4",

  / etc etc /

    sessionTime > 0 ~ "rest1"
)

Then you can dplyr::filter() results according to period, if you need to create separate dataframes.
Your timestamp columns need to be numeric, by the way, obviously!

(I haven't tested the above code for errors - just wrote it by hand :thinking:)

1 Like
library(dplyr)

event_time <- sort(round(sample(50000, 8)/1000, 3))
timestamps <- c(0, sort(round(sample(50000, 2)/1000, 1)), 50)

results <- as_tibble(event_time) %>% 
  mutate(period = case_when(
    value > timestamps[3] ~ "period3",
    value > timestamps[2] ~ "period2",
    value > timestamps[1] ~ "period1"
  ))

results %>% 
  pull(period) %>% 
  unique() %>% 
  purrr::map(~ filter(results, period == .))
#> [[1]]
#> # A tibble: 4 x 2
#>   value period 
#>   <dbl> <chr>  
#> 1  2.26 period1
#> 2  9.21 period1
#> 3 13.5  period1
#> 4 21.0  period1
#> 
#> [[2]]
#> # A tibble: 1 x 2
#>   value period 
#>   <dbl> <chr>  
#> 1  24.7 period2
#> 
#> [[3]]
#> # A tibble: 3 x 2
#>   value period 
#>   <dbl> <chr>  
#> 1  31.2 period3
#> 2  32.5 period3
#> 3  35.5 period3

Created on 2020-09-25 by the reprex package (v0.3.0)

Dear francis,

you helped me tremendously, thank you! I used the first code as its understanding is closer to my level of skills. :slight_smile: I have just one question - is there any way how to use the information from message column to name for the "period" column? It works perfectly for the rest periods, but as the videos are presented in randomized order, I could not use the same script for different data (for example, in which video4 is played first). I tried something like timestamps$message, but obviously that not work. Thank you very much again! (PS: I'll learn how to use reprex next time!)

1 Like

It's a good question.

The simplest way to do it is to look up the label for the period from your message column in timestamps, using the same syntax with $ for the column name and [ ] to specify the item number in the column. My example would be:

library(dplyr)

# random sample data for videos
timestamps <- tibble(
  timestamp = c(0, sort(round(sample(50000, 2)/1000, 1)), 50),
  message = c(paste0("period", 1:(length(timestamp)-1)), "end")
)

# random timings for the events to be classified
session_time <- sort(round(sample(50000, 8)/1000, 3))

# use dplyr::case_when to categorise and label times
tibble::as_tibble_col(session_time, column_name = "session_time") %>% 
  mutate(period = case_when(
    session_time > timestamps$timestamp[3] ~ timestamps$message[3],
    session_time > timestamps$timestamp[2] ~ timestamps$message[2],
    session_time > timestamps$timestamp[1] ~ timestamps$message[1]
    )
  )
#> # A tibble: 8 x 2
#>   session_time period 
#>          <dbl> <chr>  
#> 1         2.14 period1
#> 2         6.28 period1
#> 3         8.84 period1
#> 4        21.8  period2
#> 5        22.5  period2
#> 6        25.4  period2
#> 7        32.2  period2
#> 8        48.6  period3

Created on 2020-09-27 by the reprex package (v0.3.0)

I haven't been able to work out a way of getting the case-when statements (session_time > ...) to construct themselves out automatically! I feel like there should be. But for now I have just written them in by hand. If you had many different videos, it could become a very long case_when statement!