Creating a new column based on conditions of 2 other rows

Hi all,

I am struggling with creating a new column based on conditions in two other rows.
My goal is to assign a number in a new column if the SiteCode is "a" and the date range falls in between a specific time frame.
I am not even sure where to start. Any help would be great.

My dummy data looks like this:

Date	SiteCode
1/1/2020	a
1/2/2020	a
1/3/2020	a
1/4/2020	a
1/5/2020	a
1/6/2020	a
1/7/2020	a
1/8/2020	a
1/9/2020	a
1/10/2020	a
1/1/2020	b
1/2/2020	b
1/3/2020	b
1/4/2020	b
1/5/2020	b
1/6/2020	b
1/7/2020	b
1/8/2020	b
1/9/2020	b
1/10/2020	b

This is what I'm trying to do:
If the date range is between 1/1/2020 - 1/3/2020 and the SiteCode is a, then newcol = 1.
If the date range is between 1/4/2020 - 1/7/2020 and the SiteCode is a, then newcol=2.
If the date range is between 1/8/2020 - 1/10/2020 and the SiteCode is a, then newcol=3.
If the date range is between 1/1/2020 - 1/6/2020 and the SiteCode is b, then newcol = 1.
If the date range is between 1/7/2020 - 1/8/2020 and the SiteCode is b, then newcol=2.
If the date range is between 1/9/2020 - 1/10/2020 and the SiteCode is b, then newcol=3.

I simplified this data to a single year, but my data spans multiple years so any recommendation would be immensely helpful.
I was thinking of writing ifelse statements, but I do not know how to string together two conditionals with date ranges.

Thank you for your help in advance. I'm learning my way around here.

I find dplyr::case_when() ideal for this sort of thing because it makes the conditions much easier to decipher than multiple if...else statements.

library(dplyr, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)

data <- tribble(~ Date, ~ SiteCode,
                "1/1/2020", "a",
                "1/2/2020", "a",
                "1/3/2020", "a",
                "1/4/2020", "a",
                "1/5/2020", "a",
                "1/6/2020", "a",
                "1/7/2020", "a",
                "1/8/2020", "a",
                "1/9/2020", "a",
                "1/10/2020", "a",
                "1/1/2020", "b",
                "1/2/2020", "b",
                "1/3/2020", "b",
                "1/4/2020", "b",
                "1/5/2020", "b",
                "1/6/2020", "b",
                "1/7/2020", "b",
                "1/8/2020", "b",
                "1/9/2020", "b",
                "1/10/2020", "b")

data$Date <- dmy(data$Date)

data <- data %>% 
  mutate(newcol = case_when(
    between(Date, dmy("1/1/2020"), dmy("1/3/2020")) & SiteCode == "a" ~ 1L,
    between(Date, dmy("1/4/2020"), dmy("1/7/2020")) & SiteCode == "a" ~ 2L,
    between(Date, dmy("1/8/2020"), dmy("1/10/2020")) & SiteCode == "a" ~ 3L,
    between(Date, dmy("1/1/2020"), dmy("1/6/2020")) & SiteCode == "b" ~ 1L,
    between(Date, dmy("1/7/2020"), dmy("1/8/2020")) & SiteCode == "b" ~ 2L,
    between(Date, dmy("1/9/2020"), dmy("1/10/2020")) & SiteCode == "b" ~ 3L)
    )

print(data)
#> # A tibble: 20 x 3
#>    Date       SiteCode newcol
#>    <date>     <chr>     <int>
#>  1 2020-01-01 a             1
#>  2 2020-02-01 a             1
#>  3 2020-03-01 a             1
#>  4 2020-04-01 a             2
#>  5 2020-05-01 a             2
#>  6 2020-06-01 a             2
#>  7 2020-07-01 a             2
#>  8 2020-08-01 a             3
#>  9 2020-09-01 a             3
#> 10 2020-10-01 a             3
#> 11 2020-01-01 b             1
#> 12 2020-02-01 b             1
#> 13 2020-03-01 b             1
#> 14 2020-04-01 b             1
#> 15 2020-05-01 b             1
#> 16 2020-06-01 b             1
#> 17 2020-07-01 b             2
#> 18 2020-08-01 b             2
#> 19 2020-09-01 b             3
#> 20 2020-10-01 b             3

Created on 2020-05-27 by the reprex package (v0.3.0)

It worked! Thank you so much.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.