How to assign a value based off closest date/time record

First I'd just like to apologise if this is very obvious step or I've asked the question incorrectly. Very new to the R Studio code world.

I have two .csv's

CSV1 includes includes 2 columns

  • column1 (DateTime yyyy-hh-dd hh:mm:ss) in 10 minute increments over a number of years
  • column2 a value associated with that timestamp either 'High' or 'Low'

CSV2 includes a number of columns but all I'm really interested in for this exercise is

  • column1 (DateTime yyyy-hh-dd hh:mm:ss) full of random times depending on when the GPS signal was received

What I want to achieve is something like the following, create a new column in CSV2

CSV2$HighorLow <- and then I want it to lookup from CSV1 the record that has the closest DateTime to what was specified in CSV2 and in the new 'HighorLow' column, put in the value that was in CSV1$column2

Hope this makes sense and many thanks for any advice. Cheers

Can you please share a small part of the data set in a copy-paste friendly format?

In case you don't know how to do it, there are many options, which include:

  1. If you have stored the data set in some R object, dput function is very handy.

  2. In case the data set is in a spreadsheet, check out the datapasta package. Take a look at this link.

Thanks andresrcs,

Here are small portions of the data

The below is CSV1 that I talked about in the first post.

tidedata <- tibble::tribble(
                            ~tideDT, ~reading,  ~Tide,
              "2015-09-01 00:00:00",     3.75, "High",
              "2015-09-01 00:10:00",    3.836, "High",
              "2015-09-01 00:20:00",     3.91, "High",
              "2015-09-01 00:30:00",     3.97, "High",
              "2015-09-01 00:40:00",    4.012, "High",
              "2015-09-01 00:50:00",    4.039, "High",
              "2015-09-01 01:00:00",    4.045, "High",
              "2015-09-01 01:10:00",    4.029, "High",
              "2015-09-01 01:20:00",    3.998, "High",
              "2015-09-01 01:30:00",     3.95, "High",
              "2015-09-01 01:40:00",    3.885, "High",
              "2015-09-01 01:50:00",    3.812, "High",
              "2015-09-01 02:00:00",    3.729, "High",
              "2015-09-01 02:10:00",    3.636, "High",
              "2015-09-01 02:20:00",    3.525, "High",
              "2015-09-01 02:30:00",    3.405, "High",
              "2015-09-01 02:40:00",    3.285, "High",
              "2015-09-01 02:50:00",    3.143, "High",
              "2015-09-01 03:00:00",    3.009, "High",
              "2015-09-01 03:10:00",    2.865, "High",
              "2015-09-01 03:20:00",    2.718, "High",
              "2015-09-01 03:30:00",    2.568, "High",
              "2015-09-01 03:40:00",    2.424, "High",
              "2015-09-01 03:50:00",    2.287, "High",
              "2015-09-01 04:00:00",    2.139, "High",
              "2015-09-01 04:10:00",    1.995, "High",
              "2015-09-01 04:20:00",    1.862, "High",
              "2015-09-01 04:30:00",    1.737, "High",
              "2015-09-01 04:40:00",    1.605, "High",
              "2015-09-01 04:50:00",    1.475,  "Low",
              "2015-09-01 05:00:00",     1.35,  "Low",
              "2015-09-01 05:10:00",    1.229,  "Low",
              "2015-09-01 05:20:00",    1.116,  "Low",
              "2015-09-01 05:30:00",    1.005,  "Low",
              "2015-09-01 05:40:00",    0.899,  "Low",
              "2015-09-01 05:50:00",    0.802,  "Low",
              "2015-09-01 06:00:00",    0.708,  "Low",
              "2015-09-01 06:10:00",    0.618,  "Low",
              "2015-09-01 06:20:00",     0.54,  "Low",
              "2015-09-01 06:30:00",    0.465,  "Low",
              "2015-09-01 06:40:00",     0.41,  "Low",
              "2015-09-01 06:50:00",    0.359,  "Low",
              "2015-09-01 07:00:00",    0.335,  "Low",
              "2015-09-01 07:10:00",     0.33,  "Low",
              "2015-09-01 07:20:00",     0.35,  "Low",
              "2015-09-01 07:30:00",    0.401,  "Low"
              )

tidedata$tideDT <- as.POSIXct(tidedata$tideDT , format = "%Y-%m-%d %H:%M:%S")

head(tidedata)
#> # A tibble: 6 x 3
#>   tideDT              reading Tide 
#>   <dttm>                <dbl> <chr>
#> 1 2015-09-01 00:00:00    3.75 High 
#> 2 2015-09-01 00:10:00    3.84 High 
#> 3 2015-09-01 00:20:00    3.91 High 
#> 4 2015-09-01 00:30:00    3.97 High 
#> 5 2015-09-01 00:40:00    4.01 High 
#> 6 2015-09-01 00:50:00    4.04 High

and the following is CSV2

telemetrydata <- tibble::tribble(
                         ~id,             ~DateTime,     ~lat,     ~lon, ~qi, ~sex, ~ageclass,   ~depth.exp,    ~depth.HT,
                   "QA58295", "2015-10-01 00:02:00", -23.7676, 151.2965,  6L,  "F",      "SA",  -3.15134787,  -3.23284787,
                   "QA43123", "2015-10-01 00:10:00", -23.7733, 151.2915,  6L,  "F",       "A", -3.006766321, -3.124266321,
                   "QA58206", "2015-10-01 00:40:00", -23.6856, 151.3048,  4L,  "F",      "SA", -18.72804075, -19.05964075,
                   "QA58284", "2015-10-01 02:15:00", -23.7637, 151.1582,  8L,  "M",       "A", -3.187366148, -3.316566148,
                   "QA58284", "2015-10-01 03:08:00", -23.7639, 151.1579,  8L,  "M",       "A", -2.772221329, -3.289721329,
                   "QA58295", "2015-10-01 03:45:00", -23.7693, 151.2944,  5L,  "F",      "SA", -1.472480488, -3.271480488,
                   "QA58284", "2015-10-01 04:15:00",  -23.764, 151.1579,  8L,  "M",       "A",  -1.84678562,  -3.30158562,
                    "K28651", "2015-10-01 06:55:00", -23.7553, 151.3169,  5L,  "M",       "A", -3.795818538, -6.773818538,
                   "QA58291", "2015-10-01 07:27:00", -24.5293, 152.0331,  6L,  "M",       "A",   1.11056039,  -1.76943961,
                   "QA58295", "2015-10-01 07:45:00", -23.7768, 151.3097,  5L,  "F",      "SA", -0.857955725, -3.567955725,
                   "QA58291", "2015-10-01 08:01:00",  -24.531, 152.0311,  4L,  "M",       "A",  0.228332558, -2.224067442,
                    "K28651", "2015-10-01 10:51:00", -23.7556, 151.3167,  6L,  "M",       "A", -5.854780024, -6.496980024,
                    "K28651", "2015-10-01 11:30:00", -23.7557, 151.3169,  6L,  "M",       "A", -6.165598638, -6.633598638,
                   "QA58295", "2015-10-01 11:49:00",  -23.778, 151.3076,  6L,  "F",      "SA", -3.701196658, -4.026996658,
                    "K28651", "2015-10-01 13:03:00", -23.7557, 151.3167,  7L,  "M",       "A", -6.487376042, -6.493576042,
                    "K28651", "2015-10-01 13:34:00", -23.7551,  151.317,  8L,  "M",       "A",  -6.83297897,  -6.93147897,
                   "QA58284", "2015-10-01 13:45:00", -23.7656, 151.1546,  6L,  "M",       "A", -2.520284467, -2.527484467,
                   "QA58295", "2015-10-01 13:55:00",  -23.778, 151.3074,  6L,  "F",      "SA",  -3.62481876,  -3.83581876,
                    "K28651", "2015-10-01 14:16:00",  -23.754, 151.3155,  4L,  "M",       "A", -6.554817435, -6.915717435,
                   "QA58206", "2015-10-01 14:38:00", -23.7435, 151.3051,  4L,  "F",      "SA", -7.002649565, -7.531949565,
                    "K28651", "2015-10-01 14:56:00", -23.7541, 151.3152,  9L,  "M",       "A", -5.983471077, -6.757471077,
                   "QA58284", "2015-10-01 14:58:00", -23.7655, 151.1554,  8L,  "M",       "A", -2.360926318, -2.579626318,
                   "QA58206", "2015-10-01 15:29:00", -23.7433, 151.3051,  5L,  "F",      "SA", -6.276333921, -7.385933921,
                   "QA58295", "2015-10-01 16:20:00", -23.7773, 151.3075,  5L,  "F",      "SA", -1.696819658, -3.390319658
                   )

telemetrydata$DateTime <- as.POSIXct(telemetrydata$DateTime , format = "%Y-%m-%d %H:%M:%S")

head(telemetrydata)
#> # A tibble: 6 x 9
#>   id     DateTime              lat   lon    qi sex   ageclass depth.exp depth.HT
#>   <chr>  <dttm>              <dbl> <dbl> <int> <chr> <chr>        <dbl>    <dbl>
#> 1 QA582~ 2015-10-01 00:02:00 -23.8  151.     6 F     SA           -3.15    -3.23
#> 2 QA431~ 2015-10-01 00:10:00 -23.8  151.     6 F     A            -3.01    -3.12
#> 3 QA582~ 2015-10-01 00:40:00 -23.7  151.     4 F     SA          -18.7    -19.1 
#> 4 QA582~ 2015-10-01 02:15:00 -23.8  151.     8 M     A            -3.19    -3.32
#> 5 QA582~ 2015-10-01 03:08:00 -23.8  151.     8 M     A            -2.77    -3.29
#> 6 QA582~ 2015-10-01 03:45:00 -23.8  151.     5 F     SA           -1.47    -3.27

I hope I've done that correctly. I should note, that while these two snapshots of the data don't have overlapping dates, that the entire datasets do in fact overlap.

Is this what you need?

library(tidyverse)
library(lubridate)

# Sample data on a copy/paste friendly format, replace this with your actual data set
tidedata <- data.frame(
  stringsAsFactors = FALSE,
            tideDT = as.POSIXct(c("2015-09-01 00:00:00",
                       "2015-09-01 00:10:00","2015-09-01 00:20:00",
                       "2015-09-01 00:30:00","2015-09-01 00:40:00","2015-09-01 00:50:00",
                       "2015-09-01 01:00:00","2015-09-01 01:10:00",
                       "2015-09-01 01:20:00","2015-09-01 01:30:00","2015-09-01 01:40:00",
                       "2015-09-01 01:50:00","2015-09-01 02:00:00",
                       "2015-09-01 02:10:00","2015-09-01 02:20:00","2015-09-01 02:30:00",
                       "2015-09-01 02:40:00","2015-09-01 02:50:00",
                       "2015-09-01 03:00:00","2015-09-01 03:10:00","2015-09-01 03:20:00",
                       "2015-09-01 03:30:00","2015-09-01 03:40:00",
                       "2015-09-01 03:50:00","2015-09-01 04:00:00","2015-09-01 04:10:00",
                       "2015-09-01 04:20:00","2015-09-01 04:30:00",
                       "2015-09-01 04:40:00","2015-09-01 04:50:00","2015-09-01 05:00:00",
                       "2015-09-01 05:10:00","2015-09-01 05:20:00",
                       "2015-09-01 05:30:00","2015-09-01 05:40:00","2015-09-01 05:50:00",
                       "2015-09-01 06:00:00","2015-09-01 06:10:00",
                       "2015-09-01 06:20:00","2015-09-01 06:30:00","2015-09-01 06:40:00",
                       "2015-09-01 06:50:00","2015-09-01 07:00:00",
                       "2015-09-01 07:10:00","2015-09-01 07:20:00","2015-09-01 07:30:00")),
           reading = c(3.75,3.836,3.91,3.97,4.012,
                       4.039,4.045,4.029,3.998,3.95,3.885,3.812,3.729,
                       3.636,3.525,3.405,3.285,3.143,3.009,2.865,2.718,
                       2.568,2.424,2.287,2.139,1.995,1.862,1.737,1.605,
                       1.475,1.35,1.229,1.116,1.005,0.899,0.802,0.708,0.618,
                       0.54,0.465,0.41,0.359,0.335,0.33,0.35,0.401),
              Tide = c("High","High","High","High",
                       "High","High","High","High","High","High","High",
                       "High","High","High","High","High","High","High",
                       "High","High","High","High","High","High","High",
                       "High","High","High","High","Low","Low","Low","Low",
                       "Low","Low","Low","Low","Low","Low","Low","Low",
                       "Low","Low","Low","Low","Low")
)


telemetrydata <- data.frame(
  stringsAsFactors = FALSE,
                id = c("QA58295","QA43123","QA58206",
                       "QA58284","QA58284","QA58295","QA58284","K28651",
                       "QA58291","QA58295","QA58291","K28651","K28651",
                       "QA58295","K28651","K28651","QA58284","QA58295","K28651",
                       "QA58206","K28651","QA58284","QA58206","QA58295"),
          DateTime = as.POSIXct(c("2015-09-01 00:02:00",
                       "2015-09-01 00:10:00","2015-09-01 00:40:00",
                       "2015-09-01 02:15:00","2015-09-01 03:08:00","2015-09-01 03:45:00",
                       "2015-09-01 04:15:00","2015-09-01 06:55:00",
                       "2015-09-01 07:27:00","2015-09-01 07:45:00","2015-09-01 08:01:00",
                       "2015-09-01 10:51:00","2015-09-01 11:30:00",
                       "2015-09-01 11:49:00","2015-09-01 13:03:00","2015-09-01 13:34:00",
                       "2015-09-01 13:45:00","2015-09-01 13:55:00",
                       "2015-09-01 14:16:00","2015-09-01 14:38:00","2015-09-01 14:56:00",
                       "2015-09-01 14:58:00","2015-09-01 15:29:00",
                       "2015-09-01 16:20:00")),
               lat = c(-23.7676,-23.7733,-23.6856,
                       -23.7637,-23.7639,-23.7693,-23.764,-23.7553,-24.5293,
                       -23.7768,-24.531,-23.7556,-23.7557,-23.778,-23.7557,
                       -23.7551,-23.7656,-23.778,-23.754,-23.7435,
                       -23.7541,-23.7655,-23.7433,-23.7773),
               lon = c(151.2965,151.2915,151.3048,
                       151.1582,151.1579,151.2944,151.1579,151.3169,152.0331,
                       151.3097,152.0311,151.3167,151.3169,151.3076,
                       151.3167,151.317,151.1546,151.3074,151.3155,151.3051,
                       151.3152,151.1554,151.3051,151.3075),
                qi = c(6L,6L,4L,8L,8L,5L,8L,5L,
                       6L,5L,4L,6L,6L,6L,7L,8L,6L,6L,4L,4L,9L,8L,
                       5L,5L),
               sex = c("F","F","F","M","M","F",
                       "M","M","M","F","M","M","M","F","M","M","M","F",
                       "M","F","M","M","F","F"),
          ageclass = c("SA","A","SA","A","A","SA",
                       "A","A","A","SA","A","A","A","SA","A","A","A",
                       "SA","A","SA","A","A","SA","SA"),
         depth.exp = c(-3.15134787,-3.006766321,
                       -18.72804075,-3.187366148,-2.772221329,-1.472480488,
                       -1.84678562,-3.795818538,1.11056039,-0.857955725,
                       0.228332558,-5.854780024,-6.165598638,-3.701196658,
                       -6.487376042,-6.83297897,-2.520284467,-3.62481876,-6.554817435,
                       -7.002649565,-5.983471077,-2.360926318,-6.276333921,
                       -1.696819658),
          depth.HT = c(-3.23284787,-3.124266321,
                       -19.05964075,-3.316566148,-3.289721329,-3.271480488,
                       -3.30158562,-6.773818538,-1.76943961,-3.567955725,
                       -2.224067442,-6.496980024,-6.633598638,-4.026996658,
                       -6.493576042,-6.93147897,-2.527484467,-3.83581876,-6.915717435,
                       -7.531949565,-6.757471077,-2.579626318,-7.385933921,
                       -3.390319658)
)

# Relevant code
telemetrydata %>% 
    mutate(tideDT = round_date(DateTime, unit = '10 min')) %>% 
    left_join(tidedata)
#> Joining, by = "tideDT"
#>         id            DateTime      lat      lon qi sex ageclass   depth.exp
#> 1  QA58295 2015-09-01 00:02:00 -23.7676 151.2965  6   F       SA  -3.1513479
#> 2  QA43123 2015-09-01 00:10:00 -23.7733 151.2915  6   F        A  -3.0067663
#> 3  QA58206 2015-09-01 00:40:00 -23.6856 151.3048  4   F       SA -18.7280408
#> 4  QA58284 2015-09-01 02:15:00 -23.7637 151.1582  8   M        A  -3.1873661
#> 5  QA58284 2015-09-01 03:08:00 -23.7639 151.1579  8   M        A  -2.7722213
#> 6  QA58295 2015-09-01 03:45:00 -23.7693 151.2944  5   F       SA  -1.4724805
#> 7  QA58284 2015-09-01 04:15:00 -23.7640 151.1579  8   M        A  -1.8467856
#> 8   K28651 2015-09-01 06:55:00 -23.7553 151.3169  5   M        A  -3.7958185
#> 9  QA58291 2015-09-01 07:27:00 -24.5293 152.0331  6   M        A   1.1105604
#> 10 QA58295 2015-09-01 07:45:00 -23.7768 151.3097  5   F       SA  -0.8579557
#> 11 QA58291 2015-09-01 08:01:00 -24.5310 152.0311  4   M        A   0.2283326
#> 12  K28651 2015-09-01 10:51:00 -23.7556 151.3167  6   M        A  -5.8547800
#> 13  K28651 2015-09-01 11:30:00 -23.7557 151.3169  6   M        A  -6.1655986
#> 14 QA58295 2015-09-01 11:49:00 -23.7780 151.3076  6   F       SA  -3.7011967
#> 15  K28651 2015-09-01 13:03:00 -23.7557 151.3167  7   M        A  -6.4873760
#> 16  K28651 2015-09-01 13:34:00 -23.7551 151.3170  8   M        A  -6.8329790
#> 17 QA58284 2015-09-01 13:45:00 -23.7656 151.1546  6   M        A  -2.5202845
#> 18 QA58295 2015-09-01 13:55:00 -23.7780 151.3074  6   F       SA  -3.6248188
#> 19  K28651 2015-09-01 14:16:00 -23.7540 151.3155  4   M        A  -6.5548174
#> 20 QA58206 2015-09-01 14:38:00 -23.7435 151.3051  4   F       SA  -7.0026496
#> 21  K28651 2015-09-01 14:56:00 -23.7541 151.3152  9   M        A  -5.9834711
#> 22 QA58284 2015-09-01 14:58:00 -23.7655 151.1554  8   M        A  -2.3609263
#> 23 QA58206 2015-09-01 15:29:00 -23.7433 151.3051  5   F       SA  -6.2763339
#> 24 QA58295 2015-09-01 16:20:00 -23.7773 151.3075  5   F       SA  -1.6968197
#>      depth.HT              tideDT reading Tide
#> 1   -3.232848 2015-09-01 00:00:00   3.750 High
#> 2   -3.124266 2015-09-01 00:10:00   3.836 High
#> 3  -19.059641 2015-09-01 00:40:00   4.012 High
#> 4   -3.316566 2015-09-01 02:20:00   3.525 High
#> 5   -3.289721 2015-09-01 03:10:00   2.865 High
#> 6   -3.271480 2015-09-01 03:50:00   2.287 High
#> 7   -3.301586 2015-09-01 04:20:00   1.862 High
#> 8   -6.773819 2015-09-01 07:00:00   0.335  Low
#> 9   -1.769440 2015-09-01 07:30:00   0.401  Low
#> 10  -3.567956 2015-09-01 07:50:00      NA <NA>
#> 11  -2.224067 2015-09-01 08:00:00      NA <NA>
#> 12  -6.496980 2015-09-01 10:50:00      NA <NA>
#> 13  -6.633599 2015-09-01 11:30:00      NA <NA>
#> 14  -4.026997 2015-09-01 11:50:00      NA <NA>
#> 15  -6.493576 2015-09-01 13:00:00      NA <NA>
#> 16  -6.931479 2015-09-01 13:30:00      NA <NA>
#> 17  -2.527484 2015-09-01 13:50:00      NA <NA>
#> 18  -3.835819 2015-09-01 14:00:00      NA <NA>
#> 19  -6.915717 2015-09-01 14:20:00      NA <NA>
#> 20  -7.531950 2015-09-01 14:40:00      NA <NA>
#> 21  -6.757471 2015-09-01 15:00:00      NA <NA>
#> 22  -2.579626 2015-09-01 15:00:00      NA <NA>
#> 23  -7.385934 2015-09-01 15:30:00      NA <NA>
#> 24  -3.390320 2015-09-01 16:20:00      NA <NA>

Created on 2021-04-05 by the reprex package (v2.0.0)

1 Like

I'll need to look at the output more closely to make sure it's correct but I believe this has done it!

Thank you very much for all your help.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.