Create a new variable with a condition

I have this data set:

structure(list(ANO_REF = c(2016L, 2016L, 2016L, 2016L, 2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L), MES_REF = c(12L, 11L,
10L, 9L, 8L, 7L, 6L, 5L, 4L, 3L, 2L, 1L, 6L, 12L, 11L, 11L, 10L,
9L, 9L, 9L), FLUXO = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), NPC = c(103920617L,
103920617L, 103920617L, 103920617L, 103920617L, 103920617L, 103920617L,
103920617L, 103920617L, 103920617L, 103920617L, 103920617L, 106320750L,
106320750L, 106320750L, 106320750L, 106320750L, 106320750L, 106320750L,
106320750L), PAIS_COD = c("ES", "ES", "ES", "ES", "ES", "ES",
"ES", "ES", "ES", "ES", "ES", "ES", "AT", "BE", "BE", "BE", "BE",
"BE", "BE", "BE"), NC8 = c(7051900L, 7051900L, 7051900L, 7051900L,
7051900L, 7051900L, 7051900L, 7051900L, 7051900L, 7051900L, 7051900L,
7051900L, 22042189L, 22042169L, 22042169L, 22042189L, 22042169L,
22042138L, 22042169L, 22042189L), VF = c(14297, 17326, 33461,
37192, 39201, 35816, 25126, 28974, 16166, 14058, 11517, 5381,
2526, 11664, 6401, 5135, 10, 540, 1188, 2676), ML = c(16000,
23000, 31750, 32000, 34500, 36750, 22250, 31500, 23160, 13250,
11000, 7250, 166.5, 1458, 315, 253.5, 7.5, 108, 162, 222), US = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 167, 1458, 315, 256, 8, 108,
162, 222), NPCPAISCOD = c("103920617ES", "103920617ES", "103920617ES",
"103920617ES", "103920617ES", "103920617ES", "103920617ES", "103920617ES",
"103920617ES", "103920617ES", "103920617ES", "103920617ES", "106320750AT",
"106320750BE", "106320750BE", "106320750BE", "106320750BE", "106320750BE",
"106320750BE", "106320750BE")), row.names = c("1508849", "1442666",
"1378683", "1314381", "1259396", "1195766", "309403", "246636",
"183714", "119203", "57345", "124", "1121677", "2312830", "2241638",
"2241639", "2169806", "2103631", "2103632", "2103633"), class = "data.frame")

and I want to create a new variable that numbers the records within each NPCPAISCOD value.

This means rows 1 to 12, which have 103920617ES, are numbered 1, 2, ..., 12; row 13, which has 106320750AT, is 1 (it is a single record); and rows 14 to 20, which have 106320750BE, are numbered 1, 2, ..., 7.

In other words, the new variable restarts its count whenever NPCPAISCOD changes and keeps counting until the next change.

Thank you.

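I think this will do it:

```
library(tidyverse)

# dat1 stands for the data frame posted in the question
dat1 %>%
  group_by(NPCPAISCOD) %>%
  mutate(recs = 1:n())   # running record number within each NPCPAISCOD
```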
   
Solution shamelessly copied from one by Woodward.
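
For anyone who would rather not load the tidyverse, here is a rough base R sketch of the same grouped counter. The toy data frame and its values are made up for illustration; only the column name NPCPAISCOD comes from the question.

```r
# Toy stand-in for the poster's data (illustrative values, not the real data).
dat1 <- data.frame(
  NPCPAISCOD = c("A", "A", "A", "B", "C", "C"),
  stringsAsFactors = FALSE
)

# ave() splits the row positions by NPCPAISCOD and applies seq_along()
# within each group, so the counter restarts at 1 for every distinct key.
dat1$recs <- ave(seq_along(dat1$NPCPAISCOD), dat1$NPCPAISCOD, FUN = seq_along)

print(dat1)
```

Like the group_by() version, this counts within each distinct key, wherever its rows happen to sit in the data frame.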

@jrkrideau

I've used this line of code successfully:

F2_2016_2018$num <- sequence(rle(F2_2016_2018$NPCPAISCOD)$lengths)

Thank you very much for your help.

Very nice, I had forgotten about rle()
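
For later readers, a small sketch of how the rle() approach differs from the grouped one. The vector `key` is a made-up example, not the poster's data. In the data posted above each NPCPAISCOD value sits in one contiguous block, so both approaches should give the same numbering there; they only diverge when a key reappears after a gap.

```r
# Hypothetical key column in which "A" reappears after a "B".
key <- c("A", "A", "B", "A", "A", "A")

# rle() finds runs of consecutive equal values; sequence() then counts
# 1..length inside each run, so the counter restarts at every change.
runs <- rle(key)
run_counter <- sequence(runs$lengths)

# Grouped counter for comparison: continues counting when "A" comes back.
group_counter <- ave(seq_along(key), key, FUN = seq_along)

print(data.frame(key, run_counter, group_counter))
```

Which one is wanted depends on whether "a new count at every change" or "a count per distinct value" is meant; the question as worded asks for the former.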
