I simulated this dataset which contains 1000 coin flips - then I calculated the number of "2 Flip Sequences":
Coin <- c('H', 'T')
Results = sample(Coin,1000, replace = TRUE)
My_Data = data.frame(id = 1:1000, Results)
Pairs = data.frame(first = head(My_Data$Results, -1), second = tail(My_Data$Results, -1))
Final = as.data.frame(table(Pairs))
first second Freq
1 H H 255
2 T H 245
3 H T 246
4 T T 253
I am curious - is it possible to extend the above code for "3 Flip Sequences"?
For example - I tried modifying parts of the code to see how the results change (and hoped to stumble across the correct way to write this code):
# First Attempt
Pairs = data.frame(first = head(My_Data$Results, -1), second = head(My_Data$Results, -1) , third = tail(My_Data$Results, -1))
Final = as.data.frame(table(Pairs))
first second third Freq
1 H H H 255
2 T H H 245
3 H T H 0
4 T T H 0
5 H H T 0
6 T H T 0
7 H T T 246
8 T T T 253
# Second Attempt
Pairs = data.frame(first = head(My_Data$Results, -1), second = tail(My_Data$Results, -1) , third = tail(My_Data$Results, -1))
Final = as.data.frame(table(Pairs))
first second third Freq
1 H H H 255
2 T H H 0
3 H T H 0
4 T T H 245
5 H H T 246
6 T H T 0
7 H T T 0
8 T T T 253
I am not sure which of these options are correct?
- In general, I am looking to understand the logic as to how I can adapt the above code for an "arbitrary number of coin flips" (e.g. "4 flip sequences", "5 flip sequences", etc.)
- Also, this might not be the most efficient way to calculate these frequencies - I would also be interested in learning about other ways that might be more efficient ( e.g. as the overall size of the data increases).
Thanks!