Hi Mirthe,
In the first data set there are 2 samples per second, and in the second data set there is 1 sample every 2 seconds (0.5 samples per second). The first data set is therefore sampled 4 times as fast as the second.
One approach is to take every 4th row of the first data set, which would then give you 1 observation every 2 seconds, the same as the second data set.
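As a quick check on that factor, the row step is just the ratio of the two sampling rates. A minimal sketch (the variable names `rate1`, `rate2` and `step` are made up for illustration):

```r
rate1 <- 2      # samples per second in the first data set
rate2 <- 1 / 2  # samples per second in the second data set (1 every 2 seconds)

# Number of rows of the faster data set per row of the slower one
step <- rate1 / rate2
step
#> [1] 4
```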
library("tibble")
set.seed(123)
df1 <- tibble(second = rep(1:50, each = 2),
              sample_id = 1:100, x = rnorm(100))
head(df1)
#> # A tibble: 6 x 3
#>   second sample_id       x
#>    <int>     <int>   <dbl>
#> 1      1         1 -0.560
#> 2      1         2 -0.230
#> 3      2         3  1.56
#> 4      2         4  0.0705
#> 5      3         5  0.129
#> 6      3         6  1.72
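# Simulated data for the second sampler (1 sample every 2 seconds)
df2 <- tibble(second = seq(1, 50, by = 2),
              sample_id = 1:25, x = rnorm(25))
# Keep every 4th row of df1 (rows 1, 5, 9, ...) to match df2's rate
df1_sub <- df1[seq(1, nrow(df1), by = 4), ]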
head(df1_sub)
#> # A tibble: 6 x 3
#>   second sample_id      x
#>    <int>     <int>  <dbl>
#> 1      1         1 -0.560
#> 2      3         5  0.129
#> 3      5         9 -0.687
#> 4      7        13  0.401
#> 5      9        17  0.498
#> 6     11        21 -1.07
head(df2)
#> # A tibble: 6 x 3
#>   second sample_id       x
#>    <dbl>     <int>   <dbl>
#> 1      1         1 -0.710
#> 2      3         2  0.257
#> 3      5         3 -0.247
#> 4      7         4 -0.348
#> 5      9         5 -0.952
#> 6     11         6 -0.0450
cor(df1_sub[["x"]], df2[["x"]])
#> [1] 0.3125283
Created on 2020-07-31 by the reprex package (v0.3.0)
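Taking every 4th row relies on both samplers starting together and never dropping a reading. If that might not hold for your real data, one possible alternative is to match rows on the time column instead of on row position. This is only a sketch: it assumes `second` is a shared clock in both data sets, it uses plain data frames with base R's `merge()` so it has no package dependency, and the names `df1_first` and `both` are made up for illustration.

```r
set.seed(123)
# Same simulated data as above, built as base data frames
df1 <- data.frame(second = rep(1:50, each = 2),
                  sample_id = 1:100, x = rnorm(100))
df2 <- data.frame(second = seq(1, 50, by = 2),
                  sample_id = 1:25, x = rnorm(25))

# Keep only the first reading within each second of the first data set
df1_first <- df1[!duplicated(df1[["second"]]), ]

# Inner join on the time column: only seconds present in both are kept
both <- merge(df1_first, df2, by = "second", suffixes = c("_1", "_2"))

nrow(both)
#> [1] 25
cor(both[["x_1"]], both[["x_2"]])
```

With this simulated data the join keeps the same 25 rows as the every-4th-row subset, so the correlation is the same as before. The difference only shows up when the time stamps do not line up perfectly.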