1. Create a Home/Away column (Home = 1, Away = 0) derived from the "Opponent" column. If there is an "@" symbol in front of the team letter, the game is away and a value of 0 is assigned; if there is no "@" symbol, the team in question is at home and a value of 1 is assigned.
2. Create a data column calculating how many days there are between games (BetweenTEAM). This has already been achieved successfully.
3. Create a new data column called "Travel" (yes/no; yes = 1, no = 0), which is determined by two conditions, both of which must be true: a) the number of days between games (BetweenTEAM) must be >= 2, and b) the most recent game must be "away" (Home/Away = 0).
In this way, more columns are added to the data frame so that data is shown for each of the given dates.
I hope this makes sense. Any help on goals 1 and 3 would be greatly appreciated!
I look forward to chatting with you. Please let me know if I can improve my question at all.
# don't use lower case date or df; those are names of built-ins
daf <- data.frame(Date, Team, Opponent)
daf$Park <- ifelse(grepl("@",daf$Opponent),"Away","Home")
daf
#>         Date Team Opponent Park
#> 1 2021-07-01    A        B Home
#> 2 2021-07-04    A       @C Away
#> 3 2021-07-06    A       @D Away
#> 4 2021-07-08    A        B Home
#> 5 2021-07-09    A        E Home
#> 6 2021-07-14    A       @E Away
#> 7 2021-07-16    A        C Home
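Since the question says the "@" sits in front of the team letter, a slightly stricter variant would test only the first character rather than matching "@" anywhere in the value. This is only a sketch using base R's `startsWith()` and `sub()`; the name `OpponentClean` is made up for illustration and does not appear in the question.

```r
# Sketch: anchor the "@" test to the start of the string (base R only).
Opponent <- c("B", "@C", "@D", "B", "E", "@E", "C")

# startsWith() is TRUE only when the value begins with "@"
Park <- ifelse(startsWith(Opponent, "@"), "Away", "Home")

# OpponentClean is a hypothetical helper: the opponent without the leading "@"
OpponentClean <- sub("^@", "", Opponent)

data.frame(Opponent, Park, OpponentClean)
```

Keeping a clean opponent name alongside the flag may be handy if the opponent is later joined against other tables.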
suppressPackageStartupMessages(library(tidyverse))
Team <- c("A", "A", "A", "A", "A", "A", "A")
Opponent <- c("B", "@C", "@D", "B", "E", "@E", "C")
Date <- c("2021-07-01", "2021-07-04", "2021-07-06", "2021-07-08",
"2021-07-09", "2021-07-14", "2021-07-16")
daf <- data.frame(Date, Team, Opponent)
# Park: Home=1, Away=0
daf$Park <- ifelse(grepl("@",daf$Opponent), 0, 1)
daf
#>         Date Team Opponent Park
#> 1 2021-07-01    A        B    1
#> 2 2021-07-04    A       @C    0
#> 3 2021-07-06    A       @D    0
#> 4 2021-07-08    A        B    1
#> 5 2021-07-09    A        E    1
#> 6 2021-07-14    A       @E    0
#> 7 2021-07-16    A        C    1
# Days between consecutive games
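daf %>%
  # convert to Date and fix the units so the result is always in whole days
  mutate(BETWEEN0 = as.numeric(difftime(as.Date(Date), lag(as.Date(Date), 1),
                                        units = "days")),
         BetweenTEAM = ifelse(is.na(BETWEEN0), 0, BETWEEN0)) %>%
  select(-BETWEEN0) -> daf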
# Create a new data column called "Travel" (yes/no; yes = 1, no = 0), which is
# determined by two conditions, both of which must be true: a) the number of
# days between games (BetweenTEAM) must be >= 2, and b) the most recent game
# must be "away" (Home/Away = 0)
daf %>%
  # MostRecentAway is 1 when the previous game was away (Park == 0)
  mutate(MostRecentAway = ifelse(lag(Park, 1) == 0, 1, 0),
         Travel = ifelse((BetweenTEAM >= 2) & (MostRecentAway == 1), 1, 0)) -> daf
daf
#>         Date Team Opponent Park BetweenTEAM MostRecentAway Travel
#> 1 2021-07-01    A        B    1           0             NA      0
#> 2 2021-07-04    A       @C    0           3              0      0
#> 3 2021-07-06    A       @D    0           2              1      1
#> 4 2021-07-08    A        B    1           2              1      1
#> 5 2021-07-09    A        E    1           1              0      0
#> 6 2021-07-14    A       @E    0           5              0      0
#> 7 2021-07-16    A        C    1           2              1      1
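If the data frame ever holds more than one team, the lagged values would presumably need to be computed within each team; otherwise a team's first game gets compared with another team's last game. Below is a minimal base-R sketch under that assumption. Team "B" and its dates are invented purely for illustration, and `prev_within()` is a hypothetical helper, not something from the question. Travel is set to 1 only when at least 2 days have passed and the team's previous game was away (`MostRecentAway == 1`), following the rule as stated in the question.

```r
# Sketch: per-team version of the three columns, base R only.
# Team "B" and its dates are made-up illustration data.
daf <- data.frame(
  Date     = as.Date(c("2021-07-01", "2021-07-04", "2021-07-06",
                       "2021-07-02", "2021-07-03", "2021-07-07")),
  Team     = c("A", "A", "A", "B", "B", "B"),
  Opponent = c("B", "@C", "@D", "@A", "@C", "D")
)

# Park: Home = 1, Away = 0
daf$Park <- ifelse(startsWith(daf$Opponent, "@"), 0, 1)

# Hypothetical helper: previous value of x within each group g
# (assumes rows are already sorted by date within each team)
prev_within <- function(x, g) {
  ave(x, g, FUN = function(v) c(NA, head(v, -1)))
}

# Days since the team's previous game; 0 for a team's first game
prev_date <- prev_within(as.numeric(daf$Date), daf$Team)
daf$BetweenTEAM <- ifelse(is.na(prev_date), 0, as.numeric(daf$Date) - prev_date)

# 1 when the team's previous game was away, NA for a team's first game
daf$MostRecentAway <- ifelse(prev_within(daf$Park, daf$Team) == 0, 1, 0)

# Travel: at least 2 days since the last game AND that game was away
daf$Travel <- ifelse(daf$BetweenTEAM >= 2 & daf$MostRecentAway == 1, 1, 0)
daf
```

With dplyr, the same idea would be a `group_by(Team)` placed before the `mutate()` calls.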