Hi!
I hope someone understands my question (and my English).
How can I number different episodes?
When the columns "personID" and "time" have the same values in different rows, I want those rows to get the same number in a new variable "episodenumber". The data is large, with thousands of rows.
Example:
personID time episodenumber
wx 15.6.2020 1
wx 15.6.2020 1
aaa 1.7.2019 2
aaa 20.8.2021 3
aaa 20.8.2021 3
oiy 19.12.2020 4
Thank you!
Saija
FJCC
August 16, 2021, 12:49pm
Here is one method that numbers the episodes in alphabetical order. It uses the fact that factors are stored as integers.
# The dates must be wrapped in c(); otherwise the second string is taken as the format
DF <- data.frame(personID = c("wx", "wx", "aaa", "aaa", "aaa", "oiy"),
                 time = as.Date(c("2020-06-15", "2020-06-15", "2019-07-01",
                                  "2021-08-20", "2021-08-20", "2020-12-19")))
DF
  personID       time
1       wx 2020-06-15
2       wx 2020-06-15
3      aaa 2019-07-01
4      aaa 2021-08-20
5      aaa 2021-08-20
6      oiy 2020-12-19
library(dplyr)  # for mutate()
library(tidyr)  # for unite()
# Combine personID and time into a single key column
DF <- DF %>% unite(col = Episode, personID:time, remove = FALSE)
DF
         Episode personID       time
1  wx_2020-06-15       wx 2020-06-15
2  wx_2020-06-15       wx 2020-06-15
3 aaa_2019-07-01      aaa 2019-07-01
4 aaa_2021-08-20      aaa 2021-08-20
5 aaa_2021-08-20      aaa 2021-08-20
6 oiy_2020-12-19      oiy 2020-12-19
# Convert the key to a factor, then to its underlying integer codes
DF <- DF %>% mutate(Episode = factor(Episode),
                    Episode = as.numeric(Episode))
DF
  Episode personID       time
1       4       wx 2020-06-15
2       4       wx 2020-06-15
3       1      aaa 2019-07-01
4       2      aaa 2021-08-20
5       2      aaa 2021-08-20
6       3      oiy 2020-12-19
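For readers who want to see why the factor approach numbers episodes alphabetically, here is a minimal base R sketch of the same idea. It assumes the dates are passed to as.Date() as one vector via c(); the names key and f are only illustrative and do not come from the post above.

```r
# Toy data mirroring the example in the question
DF <- data.frame(
  personID = c("wx", "wx", "aaa", "aaa", "aaa", "oiy"),
  time = as.Date(c("2020-06-15", "2020-06-15", "2019-07-01",
                   "2021-08-20", "2021-08-20", "2020-12-19"))
)

# One text key per row (illustrative name), e.g. "wx_2020-06-15"
key <- paste(DF$personID, DF$time, sep = "_")

# factor() sorts the unique keys into levels; each row stores the
# integer position of its key among those levels
f <- factor(key)
levels(f)

# The integer codes behind the factor become the episode number
DF$Episode <- as.integer(f)
DF
```

Because the levels are sorted, the numbers follow the sort order of the combined key rather than the row order of the data.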
And here is another method:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
DF <- data.frame(
  personID = c("wx", "wx", "aaa", "aaa", "aaa", "oiy"),
  time = as.Date(lubridate::parse_date_time(
    c("2020-06-15", "2020-06-15", "2019-07-01",
      "2021-08-20", "2021-08-20", "2020-12-19"), "%Y-%m-%d"))
)
DF
#> personID time
#> 1 wx 2020-06-15
#> 2 wx 2020-06-15
#> 3 aaa 2019-07-01
#> 4 aaa 2021-08-20
#> 5 aaa 2021-08-20
#> 6 oiy 2020-12-19
DF %>%
  group_by(personID, time) %>%
  mutate(episodenumber = cur_group_id()) %>%
  ungroup()
#> # A tibble: 6 x 3
#> personID time episodenumber
#> <chr> <date> <int>
#> 1 wx 2020-06-15 4
#> 2 wx 2020-06-15 4
#> 3 aaa 2019-07-01 1
#> 4 aaa 2021-08-20 2
#> 5 aaa 2021-08-20 2
#> 6 oiy 2020-12-19 3
Created on 2021-08-16 by the reprex package (v2.0.0)
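Both methods number the episodes in sorted order of personID and time, whereas the example in the question numbers them in the order in which they first appear (wx = 1, then aaa, then oiy). If that ordering matters, one possible approach is to match each row's key against the unique keys. This is only a sketch in base R, not something proposed in the replies above; the variable name key is an assumption made for illustration.

```r
# Toy data mirroring the example in the question
DF <- data.frame(
  personID = c("wx", "wx", "aaa", "aaa", "aaa", "oiy"),
  time = as.Date(c("2020-06-15", "2020-06-15", "2019-07-01",
                   "2021-08-20", "2021-08-20", "2020-12-19"))
)

# One text key per row (illustrative name)
key <- paste(DF$personID, DF$time, sep = "_")

# unique() keeps keys in order of first appearance, and match()
# returns each row's position in that list
DF$episodenumber <- match(key, unique(key))
DF$episodenumber
#> [1] 1 1 2 3 3 4
```

This relies on the row order of the data, so sorting the data frame first would change which episode gets which number.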