Creating a column based on information from another column: every 2 shifts make one day, with day increasing as 1, 2, 3, 4, 5, ..., 150

df_ <- data.frame(shift = rep(1:2, each = 8, times = 4))

Based on the shift column, I would like to create another column named day. Every 2 shifts make up one day, and the day number should increase consecutively up to 150, because the original data has 300 shifts.
shift
1 1
2 1
3 1
4 1
5 1
6 1
7 1
8 1
9 2
10 2
11 2
12 2
13 2
14 2
15 2
16 2
17 1
18 1
19 1
20 1
21 1
22 1
23 1
24 1
25 2
26 2
27 2
28 2
29 2
30 2
31 2

Please show an example of the final result you want to have.

This is the result I would like to have:

   X1 data
1   1    1
2   1    1
3   2    1
4   2    1
5   1    2
6   1    2
7   2    2
8   2    2
9   1    3
10  1    3
11  2    3
12  2    3

I have been trying this:

x1 <- c(1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2)

library(tidyverse)

x <- data.frame(X1 = x1) %>%
  mutate(data = case_when(
    row_number() == 1 ~ 0,
    X1 == lag(X1) ~ 1,
    TRUE ~ 1
  )) %>%
  mutate(data = cumsum(data)) %>%
  fill(data) %>%
  mutate(data = data %% 2 + 1)
x

It seems you want to number the rows in groups of 4. I would do that with integer division.

DF <- data.frame(X1 = rep(c(1,1,2,2),4))
DF
#>    X1
#> 1   1
#> 2   1
#> 3   2
#> 4   2
#> 5   1
#> 6   1
#> 7   2
#> 8   2
#> 9   1
#> 10  1
#> 11  2
#> 12  2
#> 13  1
#> 14  1
#> 15  2
#> 16  2
library(dplyr)
DF <- DF |> mutate(Day = (row_number()-1) %/% 4 + 1)
DF
#>    X1 Day
#> 1   1   1
#> 2   1   1
#> 3   2   1
#> 4   2   1
#> 5   1   2
#> 6   1   2
#> 7   2   2
#> 8   2   2
#> 9   1   3
#> 10  1   3
#> 11  2   3
#> 12  2   3
#> 13  1   4
#> 14  1   4
#> 15  2   4
#> 16  2   4

Created on 2023-03-25 with reprex v2.0.2
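
The same integer-division idea can be applied to the data frame from the original question, where each shift spans 8 rows, so one day (shift 1 followed by shift 2) spans 16 rows. The following is only a base R sketch, assuming every shift really has exactly 8 rows; the name rows_per_day is illustrative and not part of the original code.

```r
# Data from the original question: 8 rows per shift, 2 shifts per day, 4 days
df_ <- data.frame(shift = rep(1:2, each = 8, times = 4))

# Assumption: every day has exactly 16 rows (2 shifts x 8 rows)
rows_per_day <- 16

# Rows 1-16 become day 1, rows 17-32 become day 2, and so on
df_$day <- (seq_len(nrow(df_)) - 1) %/% rows_per_day + 1

# Number of rows assigned to each day
table(df_$day)
```

If the number of rows per shift varies, this fixed block size no longer applies and the day has to be derived from the shift values themselves.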

What should I do if the numbers of ones and twos are not always the same in each sequence? Say day 1 has 5 ones and 3 twos, but day 2 has 3 ones and 4 twos. How would I handle that case?

#>    X1 Day
#> 1   1   1
#> 2   1   1
#> 3   1   1
#> 4   2   1
#> 5   1   2
#> 6   1   2
#> 7   2   2
#> 8   2   2
#> 9   2   2
#> 10  1   3
#> 11  2   3
#> 12  2   3
#> 13  1   4

Here is one way to handle that.

DF <- data.frame(X1 = c(1,1,1,2,1,1,2,2,2,1,2,2,1))
X1vec <- DF$X1
DayVec <- vector("numeric", length = length(X1vec))
DayIdx <- 1
for(i in seq_along(X1vec)) {
  # Assign the current day to this row
  DayVec[i] <- DayIdx
  # A 2 followed by a 1 marks the end of a day, so move to the next day
  if(X1vec[i] == 2 && i < length(X1vec) && X1vec[i+1] == 1) DayIdx <- DayIdx + 1
}
DF$Day <- DayVec
DF
#>    X1 Day
#> 1   1   1
#> 2   1   1
#> 3   1   1
#> 4   2   1
#> 5   1   2
#> 6   1   2
#> 7   2   2
#> 8   2   2
#> 9   2   2
#> 10  1   3
#> 11  2   3
#> 12  2   3
#> 13  1   4

Created on 2023-03-25 with reprex v2.0.2
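
A loop-free variant of the same idea, offered only as a sketch: a new day starts wherever the value drops from 2 back to 1, so the day number is a running count of those drops. It assumes X1 contains only the values 1 and 2 and has no NA; the name new_day is illustrative.

```r
DF <- data.frame(X1 = c(1,1,1,2,1,1,2,2,2,1,2,2,1))

# diff() equals -1 exactly where a 2 is followed by a 1, i.e. where a new day starts.
# The leading TRUE makes the first row start day 1.
new_day <- c(TRUE, diff(DF$X1) == -1)

# The running count of day starts is the day number
DF$Day <- cumsum(new_day)
DF
```

For this example vector it produces the same Day column as the loop above.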

Thank you for your answer. I tried that, but it gives me an error. What can I do now?

shift2 <- DF$shift2
DayVec <- vector("numeric", length = length(shift2))
DayIdx <- 1
for(i in seq_along(shift2)) {
  DayVec[i] <- DayIdx
  if(shift2[i] == 2 && i < length(shift2) && shift2[i+1] == 1) DayIdx <- DayIdx + 1
}

but it gives me this error:

Error in if (shift2[i] == 2 && i < length(shift2) && shift2[i + 1] == :
  missing value where TRUE/FALSE needed

If I write it like this instead, using nrow:

shift <- DF$shift2
DayVec <- vector("numeric", length = length(shift))
DayIdx <- 1
for(i in nrow(shift)) {
  DayVec[i] <- DayIdx
  if(shift[i] == 2 && i < length(shift) && shift[i+1] == 1) DayIdx <- DayIdx + 1
}
DF$Day <- DayVec
DF

the Day column is created, but it is filled with zeros.

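Since shift is a plain vector here, nrow(shift) returns NULL, so the loop controlled by for(i in nrow(shift)) never runs and DayVec keeps its initial zeros. Even on a data frame, nrow() returns a single number, so the loop body would run only once. Use seq_along(shift), as in the original code.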
Please post the output of

shift2 <- DF$shift2
dput(shift2)
1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, NA)

Is it because of the NA? There is one NA.

Thank you very much!!
It was the NA. It works now: I replaced it with 2, since there should not be an NA there.
Nobody has ever helped me as much as you guys here in this forum. I really appreciate it.
Thank you!!

Yes, the NA is the problem. There are 91 elements in shift2. When i = 90, the test X1vec[i+1] == 1 returns NA. You need to decide what the correct response is to an NA and then modify the tests within the if() to act accordingly. Can NA appear anywhere in the vector or just at the end?
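
One possible way to make the test tolerate an NA, shown only as a sketch: wrap the comparison in isTRUE(), so that an NA result counts as "no day change" instead of raising an error. This assumes an NA row should simply stay in the current day, which may or may not be right for the real data. The short vector below is made up for illustration.

```r
# Made-up example vector with an NA at the end
shift2 <- c(1, 1, 2, 2, 1, 2, NA)

DayVec <- numeric(length(shift2))
DayIdx <- 1
for (i in seq_along(shift2)) {
  DayVec[i] <- DayIdx
  # isTRUE() turns an NA result into FALSE, so if() never sees a missing value
  if (i < length(shift2) && isTRUE(shift2[i] == 2 && shift2[i + 1] == 1)) {
    DayIdx <- DayIdx + 1
  }
}
DayVec
```

If an NA should instead be treated as an error or as the start of a new day, the condition inside isTRUE() would need to change accordingly.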
