How can I use calculated values by formula in a new column for other rows in new column in R?

I have one column, A, that I need to use it to make a new column:

DT<- data.table(A = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3), B=0)

In a new column, B, I want to use a formula to separate every 1...9. So, the output will be like this:

A  B
1  1
2  1
3  1
4  1
5  1
6  1 
7  1
8  1
9  1
1  2
2  2
3  2
4  2 
5  2
6  2
7  2 
8  2 
9  2
1  3 
2  3
3  3 

I did in Excel with the below formula: = IF(A2=1,IF(A1>1,B1+1,B1),B1

I tried this formula in R with different methods, shift(), lag(), ...:

DT$B = ifelse(DT$A == 1, ifelse(shift(DT$A, 1L, type="lag")>1, 
              DT$B<- 1+shift(DT$B, 1L, type="lag"), 
              DT$B<-shift(DT$B, 1L, type="lag")),
              DT$B<-shift(DT$B, 1L, type="lag"))

The results for the above formula was:

   A  B
 1: 1 NA
 2: 2 NA
 3: 3 NA
 4: 4  1
 5: 5  1
 6: 6  1
 7: 7  1
 8: 8  1
 9: 9  1
10: 1  1
11: 2  1
12: 3  1
13: 4  1
14: 5  1
15: 6  1
16: 7  1
17: 8  1
18: 9  1
19: 1  1
20: 2  1
21: 3  1
    A  B

I would appreciate if you could help me. Thanks, Milad

Hello there,

The quickest way to achieve the desired output would simply be to make use of rep along with sorting it. Let me know if this solves your problem :slight_smile:

library(data.table)

DT<- data.table(A = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3), B= sort(rep(1:3,9)))
#> Warning in as.data.table.list(x, keep.rownames = keep.rownames, check.names
#> = check.names, : Item 1 has 21 rows but longest item has 27; recycled with
#> remainder.

DT
#>     A B
#>  1: 1 1
#>  2: 2 1
#>  3: 3 1
#>  4: 4 1
#>  5: 5 1
#>  6: 6 1
#>  7: 7 1
#>  8: 8 1
#>  9: 9 1
#> 10: 1 2
#> 11: 2 2
#> 12: 3 2
#> 13: 4 2
#> 14: 5 2
#> 15: 6 2
#> 16: 7 2
#> 17: 8 2
#> 18: 9 2
#> 19: 1 3
#> 20: 2 3
#> 21: 3 3
#> 22: 1 3
#> 23: 2 3
#> 24: 3 3
#> 25: 4 3
#> 26: 5 3
#> 27: 6 3
#>     A B

Created on 2020-09-26 by the reprex package (v0.3.0)

2 Likes

That works. Thanks, Man

1 Like

Thanks, That's great!
Could you please give me some advice if I have more than one replicate for each number in A? for example: [B is my expect output]
DT<- data.table(A=c(1,1,1,1,2,3,4,4,5,6,7,8,8,9,1,2,3,4,5,5,5,6,6,6,7,7,7,8,9,1,2,2,2,3))
The number of each value might be different.
The expected output [B] is:

A 1,1,1,1,2,3,4,4,5,6,7,8,8,9,1,2,3,4,5,5,5,6,6,6,7,7,7,8,9,1,2,2,2,3
B 1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3

Thanks again,
Milad

After shifting the column(col1), calculate the difference in the columns. This is A- col1.
Negative values mark the beginning of new series. Mark those as 1 (col2 here) and take cumulative sum.

library(data.table)
DT<- data.table(A=c(1,1,1,1,2,3,4,4,5,6,7,8,8,9,1,2,3,4,5,5,5,6,6,6,7,7,7,8,9,1,2,2,2,3))
DT[,col1:=shift(A,1,fill = 1000)][,col2:=ifelse(A-col1<0,1,0)][,B:=cumsum(col2)]
DT
#>     A col1 col2 B
#>  1: 1 1000    1 1
#>  2: 1    1    0 1
#>  3: 1    1    0 1
#>  4: 1    1    0 1
#>  5: 2    1    0 1
#>  6: 3    2    0 1
#>  7: 4    3    0 1
#>  8: 4    4    0 1
#>  9: 5    4    0 1
#> 10: 6    5    0 1
#> 11: 7    6    0 1
#> 12: 8    7    0 1
#> 13: 8    8    0 1
#> 14: 9    8    0 1
#> 15: 1    9    1 2
#> 16: 2    1    0 2
#> 17: 3    2    0 2
#> 18: 4    3    0 2
#> 19: 5    4    0 2
#> 20: 5    5    0 2
#> 21: 5    5    0 2
#> 22: 6    5    0 2
#> 23: 6    6    0 2
#> 24: 6    6    0 2
#> 25: 7    6    0 2
#> 26: 7    7    0 2
#> 27: 7    7    0 2
#> 28: 8    7    0 2
#> 29: 9    8    0 2
#> 30: 1    9    1 3
#> 31: 2    1    0 3
#> 32: 2    2    0 3
#> 33: 2    2    0 3
#> 34: 3    2    0 3
#>     A col1 col2 B

Created on 2020-09-26 by the reprex package (v0.3.0)

1 Like

Thank you so much, @srbhgaur

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.