Creating trip number column based on consecutive days grouped by year

I am working with a long term sub-tidal ecological data set from California's Channel Islands.

I would like to generate a column in my data frame that shows the unique trip number for the survey year. There are usually around 10 survey trips per year and each trip is typically 5 days (Monday-Friday). There are some exceptions including day trips and weekend trips. I am mostly concerned with overnight trips (ok to exclude daytrips and weekend dates). Trips are usually conducted on a biweekly basis during the summer months so counting consecutive days would be a good way to establish what constitutes a trip. This will help me filter my data frame by year/trip number in order to build an itinerary for each trip.

My eventual goal is to make an interactive map in my shiny app that shows the route these trips took and to include tables with the buoy data for each day on a given trip. This will then be linked to forecast data to look for past trips with similar oceanographic and meteorological conditions to help with trip planning. To get an idea of what this app looks like visit my current version here: http://35.185.209.53/

The data I am using is stored as a gist here: https://gist.github.com/cullen-molitor/a80235fed396780245fe12f84093135f

Here is a bit of code to get started but I am at a loss for what to do next. I tried looking at some other examples but nothing seems to quite fit what I am trying to do and I am not clever enough to alter them.

library(dplyr)

Visit_Dates <- read_csv("Visit_Dates_Filtered.csv") %>%
group_by(SurveyYear) %>%
mutate(Trip_Number = NEED_HELP_HERE)

Some kind of relevant links that I couldn't make work are:

r-counting-consecutive-days

count-consecutive-days-by-group

Desired output:

SiteNumber	 SiteName	        IslandName	      SurveyYear	Date	  Day_of_Week	Month	Trip_Number
12	         Cathedral Cove	    Anacapa Island	  1982	        8/16/1982 Monday	    August	1
13	         Landing Cove	    Anacapa Island	  1982	        8/17/1982 Tuesday	    August	1
9	         Scorpion Anchorage	Santa Cruz Island 1982	        8/18/1982 Wednesday	    August	1
6	         Rodes Reef	        Santa Rosa Island 1982	        8/19/1982 Thursday	    August	1
1	         Wyckoff Ledge	    San Miguel Island 1982	        8/20/1982 Friday	    August	1
11	         Admirals Reef	    Anacapa Island	  1982	        8/30/1982 Monday	    August	2

Any help would be greatly appreciated.

The essential grouping logic is captured in the second link you provided. Here it is applied to your data.

library(dplyr, warn.conflicts = FALSE)

data <- readr::read_csv("Visit_Dates_Filtered.csv")
Parsed with column specification:
cols(
  SiteNumber = col_double(),
  SiteName = col_character(),
  IslandName = col_character(),
  SurveyYear = col_double(),
  Date = col_date(format = ""),
  Day_of_Week = col_character(),
  Month = col_character()
)

data <- data %>%
  arrange(Date) %>% 
  group_by(Trip_Number = cumsum(c(TRUE, diff(Date) > 1))) %>% 
  ungroup()

head(data, n = 10)
# A tibble: 10 x 8
   SiteNumber SiteName            IslandName           SurveyYear Date       Day_of_Week Month  Trip_Number
        <dbl> <chr>               <chr>                     <dbl> <date>     <chr>       <chr>        <int>
 1          3 Johnson's Lee North Santa Rosa Island          1982 1982-05-03 Monday      May              1
 2          3 Johnson's Lee North Santa Rosa Island          1982 1982-05-04 Tuesday     May              1
 3          8 Pelican Bay         Santa Cruz Island          1982 1982-07-19 Monday      July             2
 4          7 Fry's Harbor        Santa Cruz Island          1982 1982-07-20 Tuesday     July             2
 5          9 Scorpion Anchorage  Santa Cruz Island          1982 1982-07-22 Thursday    July             3
 6         15 Arch Point          Santa Barbara Island       1982 1982-08-02 Monday      August           4
 7         13 Landing Cove        Anacapa Island             1982 1982-08-04 Wednesday   August           5
 8         11 Admiral's Reef      Anacapa Island             1982 1982-08-05 Thursday    August           5
 9         11 Admiral's Reef      Anacapa Island             1982 1982-08-06 Friday      August           5
10          1 Wyckoff Ledge       San Miguel Island          1982 1982-08-18 Wednesday   August           6

Created on 2020-05-11 by the reprex package (v0.3.0)

2 Likes

Thank you! I didn’t realize you could create a new column directly from group_by(). This is such a simple solution and I really appreciate the help.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.