Hello everyone, thank you in advance for assisting me with this issue. I have the following example data frame:
```
Team <- c("A", "A", "A", "A", "A", "A", "A")
Opponent <- c("B", "@C", "@D", "B", "E", "@E", "C")
# c() was missing here; as.Date() stores the values as dates rather than strings
date <- as.Date(c("2021-07-01", "2021-07-04", "2021-07-06", "2021-07-08", "2021-07-09", "2021-07-14", "2021-07-16"))
df <- data.frame(date, Team, Opponent)
```
The goals are the following:
- Create a column (Home/Away; Home = 1, Away = 0) derived from the "Opponent" column. If there is an "@" symbol in front of the team letter, the game is away and a value of 0 is assigned; if there is no "@" symbol, the team in question is at home and a value of 1 is assigned.
- Create a column with the number of days between games. This has been successfully achieved with the following code:
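```
library(dplyr)
# Days since the previous game; the first game has no predecessor, so it gets 0
df <- df %>%
  mutate(BETWEEN0 = as.numeric(difftime(date, lag(date, 1), units = "days")),
         BetweenTEAM = ifelse(is.na(BETWEEN0), 0, BETWEEN0)) %>%
  select(-BETWEEN0)
```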
- Create a new column called "Travel" (yes = 1, no = 0), which is determined by two conditions, both of which must be true: a) the number of days between games (BetweenTEAM) must be >= 2, and b) the most recent game must be "away" (Home/Away = 0).
In this way, more columns are added to the data frame, so that data is shown for each of the given dates.
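To make the three goals concrete, here is a minimal sketch of how they might fit together in one dplyr pipeline. It rests on a few assumptions that are not settled above: the column name `HomeAway` and the object name `result` are hypothetical, "most recent game" is read as the previous row, and the first game is treated as having no away game before it.

```r
library(dplyr)

# Example data from the question, with dates stored as Date objects
df <- data.frame(
  date = as.Date(c("2021-07-01", "2021-07-04", "2021-07-06", "2021-07-08",
                   "2021-07-09", "2021-07-14", "2021-07-16")),
  Team = "A",
  Opponent = c("B", "@C", "@D", "B", "E", "@E", "C")
)

result <- df %>%
  mutate(
    # Goal 1: 0 when Opponent starts with "@" (away), 1 otherwise (home)
    HomeAway = ifelse(grepl("^@", Opponent), 0, 1),
    # Goal 2: days since the previous game, 0 for the first game
    BetweenTEAM = as.numeric(difftime(date, lag(date), units = "days")),
    BetweenTEAM = ifelse(is.na(BetweenTEAM), 0, BetweenTEAM),
    # Goal 3 (assumption: "most recent game" means the previous row):
    # 1 when the gap is at least 2 days and the previous game was away.
    # default = 1 treats the first row as if no away game preceded it.
    Travel = ifelse(BetweenTEAM >= 2 & lag(HomeAway, default = 1) == 0, 1, 0)
  )

print(result)
```

With the example data, this should give HomeAway = 1, 0, 0, 1, 1, 0, 1 and Travel = 0, 0, 1, 1, 0, 0, 1. If "most recent game" is instead meant to be the current row, `lag(HomeAway, default = 1)` would be replaced by `HomeAway`.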
I hope this makes sense. Any help on goals 1 and 3 would be greatly appreciated!
I look forward to chatting with you. Please let me know if I can improve my question at all.