How to make a loop column in R?

I want to create a new column, loop, as shown below.
The first column is the index of the household. The second column is the index of the person within that household. The third column is the index of each trip that person makes during a day. ZoneOfHome is the zone of that household's home. start_zone is the zone where a trip starts, and end_zone is the zone where it ends. The last column is the purpose of each trip.
A loop is a sequence of trips that starts from home and ends at home.
I want a new column 'loop' that gives the loop number of each trip for each household member.

tibble::tribble(
  ~Household, ~person, ~trip, ~ZoneOfHome, ~start_zone, ~end_zone,   ~purpose,
          1L,      1L,    1L,         22L,         22L,       13L,     "work",
          1L,      1L,    2L,         22L,         13L,       22L,     "home",
          1L,      1L,    3L,         22L,         22L,       34L, "shopping",
          1L,      1L,    4L,         22L,         34L,       22L,     "home",
          1L,      2L,    1L,         22L,         22L,       13L,     "work",
          1L,      2L,    2L,         22L,         13L,       22L,     "home",
          2L,      1L,    1L,         15L,         15L,       15L,     "work",
          2L,      1L,    2L,         15L,         15L,       15L,     "home",
          2L,      1L,    3L,         15L,         15L,       45L, "shopping",
          2L,      1L,    4L,         15L,         45L,       15L,     "home",
          3L,      1L,    1L,         17L,          6L,       17L,     "home",
          3L,      1L,    2L,         17L,         17L,       10L,     "work",
          3L,      1L,    3L,         17L,         10L,       17L,     "home"
  )

For each person, a loop starts when start_zone = ZoneOfHome and runs until end_zone = ZoneOfHome and purpose = home:

Household   person    trip    ZoneOfHome    start_zone   end_zone       loop
   1           1        1      22              22           13            1
   1           1        2      22              13           22            1
   1           1        3      22              22           34            2
   1           1        4      22              34           22            2
   1           2        1      22              22           13            1
   1           2        2      22              13           22            1
   2           1        1      15              15           15            1
   2           1        2      15              15           15            1
   2           1        3      15              15           45            2
   2           1        4      15              45           15            2
   3           1        1      17              6            17            -
   3           1        2      17              17           10            1
   3           1        3      17              10           17            1

The way you divide the data for the first person in household 2 into two loops seems arbitrary. Elsewhere, once a person has started from the home zone, you count the next return to the home zone as the end of a loop. In this case you break that rule and assume that the movements from 15 to 15 all belong to one loop rather than several. Without data from another column to explain this choice, I believe it's impossible to program your intended outcome, because the data is insufficient.
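
To illustrate the point, here is a minimal sketch of a zone-only rule, assuming that every trip starting in the home zone opens a new loop. The names hh2 and zone_only_loop are made up for this illustration. For household 2, person 1 this rule does not reproduce the 1, 1, 2, 2 in your table.

```r
library(dplyr)

# Household 2, person 1, copied from the example in the question
hh2 <- tibble::tribble(
  ~trip, ~ZoneOfHome, ~start_zone, ~end_zone,
     1L,         15L,         15L,       15L,
     2L,         15L,         15L,       15L,
     3L,         15L,         15L,       45L,
     4L,         15L,         45L,       15L
)

# Zone-only rule (an assumption for illustration):
# each trip that starts in the home zone begins a new loop
hh2 <- hh2 %>%
  mutate(zone_only_loop = cumsum(start_zone == ZoneOfHome))

hh2$zone_only_loop
# 1 2 3 3
```

Trips 1 and 2 both start and end in zone 15, so the zones alone cannot tell whether they belong to one loop or two.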

I have one more column, which shows when people go back home.

Can you edit your example above to include that?

Could you look at the edit? I just made it.

Thanks.
To prepare your example for others, I pasted what you provided into Excel and used Text to Columns with fixed-width splitting to turn it into a table. I then copied and pasted that into R with the datapasta package, using the tribble option.

tibble::tribble(
  ~Household, ~person, ~trip, ~ZoneOfHome, ~start_zone, ~end_zone,   ~purpose,
          1L,      1L,    1L,         22L,         22L,       13L,     "work",
          1L,      1L,    2L,         22L,         13L,       22L,     "home",
          1L,      1L,    3L,         22L,         22L,       34L, "shopping",
          1L,      1L,    4L,         22L,         34L,       22L,     "home",
          1L,      2L,    1L,         22L,         22L,       13L,     "work",
          1L,      2L,    2L,         22L,         13L,       22L,     "home",
          2L,      1L,    1L,         15L,         15L,       15L,     "work",
          2L,      1L,    2L,         15L,         15L,       15L,     "home",
          2L,      1L,    3L,         15L,         15L,       45L, "shopping",
          2L,      1L,    4L,         15L,         45L,       15L,     "home",
          3L,      1L,    1L,         17L,          6L,       17L,     "home",
          3L,      1L,    2L,         17L,         17L,       10L,     "work",
          3L,      1L,    3L,         17L,         10L,       17L,     "home"
  )
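
As an alternative to the Excel round trip, base R's read.table can usually parse a pasted table like this directly. This is only a minimal sketch, assuming the columns are separated by whitespace and that "-" marks a missing loop value. Only the household 3 rows are shown, and the names txt and trips are made up for this illustration.

```r
# Pasted text of the table (household 3 only, to keep the sketch short)
txt <- "Household person trip ZoneOfHome start_zone end_zone loop
3 1 1 17 6  17 -
3 1 2 17 17 10 1
3 1 3 17 10 17 1"

# header = TRUE takes the column names from the first line;
# na.strings = "-" turns the dash into NA so loop stays an integer column
trips <- read.table(text = txt, header = TRUE, na.strings = "-")

str(trips)
```

The result is a plain data.frame; tibble::as_tibble(trips) converts it if a tibble is preferred.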

Thanks, do you have a solution for that?

I'm afraid I'm busy right now. I intend to check back on Sunday if I have time.

Here is how I can calculate loop so that it takes the same values as your target_loop example.
I tried to apply your own rule logic where possible.

library(tidyverse)
df_1<-tibble::tribble(
  ~Household, ~person, ~trip, ~ZoneOfHome, ~start_zone, ~end_zone,   ~purpose,~target_loop,
  1L,      1L,    1L,         22L,         22L,       13L,     "work",  1,
  1L,      1L,    2L,         22L,         13L,       22L,     "home", 1,
  1L,      1L,    3L,         22L,         22L,       34L, "shopping", 2,
  1L,      1L,    4L,         22L,         34L,       22L,     "home", 2,
  1L,      2L,    1L,         22L,         22L,       13L,     "work", 1,
  1L,      2L,    2L,         22L,         13L,       22L,     "home", 1,
  2L,      1L,    1L,         15L,         15L,       15L,     "work", 1,
  2L,      1L,    2L,         15L,         15L,       15L,     "home", 1,
  2L,      1L,    3L,         15L,         15L,       45L, "shopping", 2,
  2L,      1L,    4L,         15L,         45L,       15L,     "home", 2,
  3L,      1L,    1L,         17L,          6L,       17L,     "home", NA_real_,
  3L,      1L,    2L,         17L,         17L,       10L,     "work", 1,
  3L,      1L,    3L,         17L,         10L,       17L,     "home",1
)

df_2<-df_1 %>% mutate(loop_end = (end_zone==ZoneOfHome & purpose=="home"),
                      loop_start = (start_zone == ZoneOfHome),
                      loop = NA_integer_) %>%
  group_by(Household,person)

# Number the loops within one household/person sub-table, row by row
loops_per_householdperson <- function(df){
  for (i in seq_len(nrow(df))){
    if(df$loop_start[[i]]){
      # a trip that starts in the home zone opens a new loop
      if(i>1)
        df$loop[[i]] <- coalesce(df$loop[[i-1]],0L) + 1L
      else
        df$loop[[i]] <- 1L
    } else {
      # otherwise the trip continues the previous loop (NA if none has started yet)
      if(i>1)
        df$loop[[i]] <- df$loop[[i-1]]
    }
    # special rule for the third sub-table (household 2, person 1), where the normal
    # loop_start behaviour needs suppressing because the trip is also a loop end with
    # purpose of home, which seems to be the relevant difference from sub-table 1
    if(i>1 && df$loop_start[[i]] && df$loop_end[[i]] && df$purpose[[i]]=='home')
      df$loop[[i]] <- df$loop[[i-1]]
  }
  return (df)
}
  

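# apply the function to each household/person sub-table;
# .keep = TRUE keeps the grouping columns (older dplyr versions spelled this argument keep)
df3 <- group_map(df_2,
          ~loops_per_householdperson(.x),
          .keep = TRUE)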

df4 <- bind_rows(df3)
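
For comparison, the same rules can probably be written without an explicit for loop. This is a minimal sketch, not a tested replacement: it assumes the rules are exactly the ones in the function above (a trip starting in the home zone opens a new loop, except when it is not the person's first trip and it is also a loop end with purpose home). The names trips, new_loop and trips_with_loop are made up for this illustration.

```r
library(dplyr)

trips <- tibble::tribble(
  ~Household, ~person, ~trip, ~ZoneOfHome, ~start_zone, ~end_zone, ~purpose,
  1L, 1L, 1L, 22L, 22L, 13L, "work",
  1L, 1L, 2L, 22L, 13L, 22L, "home",
  1L, 1L, 3L, 22L, 22L, 34L, "shopping",
  1L, 1L, 4L, 22L, 34L, 22L, "home",
  1L, 2L, 1L, 22L, 22L, 13L, "work",
  1L, 2L, 2L, 22L, 13L, 22L, "home",
  2L, 1L, 1L, 15L, 15L, 15L, "work",
  2L, 1L, 2L, 15L, 15L, 15L, "home",
  2L, 1L, 3L, 15L, 15L, 45L, "shopping",
  2L, 1L, 4L, 15L, 45L, 15L, "home",
  3L, 1L, 1L, 17L,  6L, 17L, "home",
  3L, 1L, 2L, 17L, 17L, 10L, "work",
  3L, 1L, 3L, 17L, 10L, 17L, "home"
)

trips_with_loop <- trips %>%
  group_by(Household, person) %>%
  mutate(
    loop_start = start_zone == ZoneOfHome,
    loop_end   = end_zone == ZoneOfHome & purpose == "home",
    # a new loop opens at a loop start, unless the special rule applies
    new_loop   = loop_start & !(row_number() > 1 & loop_end),
    # running count of opened loops; 0 means no loop has started yet
    loop       = na_if(cumsum(new_loop), 0L)
  ) %>%
  ungroup()

trips_with_loop$loop
# 1 1 2 2 1 1 1 1 2 2 NA 1 1
```

The cumsum call counts how many loops have opened so far within each person, which is what the for loop tracks by looking at the previous row.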
