Understanding Loop behaviour

I am trying to understand how loops in R operate. I have a simple dataframe xx, which is as follows:

COMPANY_NUMBER NUMBER_OF_YEARS
#0070837 3
#0070837 3
#0070837 3
1000403 4
1000403 4
1000403 4
1000403 4
10029943 3
10029943 3
10029943 3
10037980 4
10037980 4
10037980 4
10037980 4
10057418 3
10057418 3
10057418 3
1009550 4
1009550 4
1009550 4
1009550 4

The code I have written is

while (i <= nrow(xx1)) {
  for (j in 1:xx1$NUMBER_OF_YEARS[i]) {
    xx1$I[i] <- i
    xx1$J[j] <- j
    xx1$NUMBER_OF_YEARS_j[j] <- xx1$NUMBER_OF_YEARS[j]
  }
  i = i + (xx1$NUMBER_OF_YEARS[i])
}

After running the code I want my dataframe to look like

| COMPANY_NUMBER | NUMBER_OF_YEARS | I | J |
|---|---|---|---|
| #0070837 | 3 | 1 | 1 |
| #0070837 | 3 | 1 | 2 |
| #0070837 | 3 | 3 | 3 |
| 1000403 | 4 | 1 | 1 |
| 1000403 | 4 | 1 | 2 |
| 1000403 | 4 | 1 | 3 |
| 1000403 | 4 | 4 | 4 |
| 10029943 | 3 | 1 | 1 |
| 10029943 | 3 | 1 | 2 |
| 10029943 | 3 | 3 | 3 |
| 10037980 | 4 | 1 | 1 |
| 10037980 | 4 | 1 | 2 |
| 10037980 | 4 | 1 | 3 |
| 10037980 | 4 | 4 | 4 |
| 10057418 | 3 | 1 | 1 |
| 10057418 | 3 | 1 | 1 |
| 10057418 | 3 | 1 | 1 |
| 1009550 | 4 | 1 | 1 |
| 1009550 | 4 | 1 | 2 |
| 1009550 | 4 | 1 | 3 |
| 1009550 | 4 | 4 | 4 |

I get the correct value of I, but in the wrong row. The value of J is correct in the first iteration, and then it goes back to 1.

Any help will be greatly appreciated.

You don't really need a loop to do this. You can use vectorised operations. Check out the code below:

library(dplyr)

xx %>%
  group_by(COMPANY_NUMBER, NUMBER_OF_YEARS) %>%
  mutate(I = row_number(),
         J = row_number(),
         I = if_else(I < max(I), 1, I)) %>%
  ungroup()

I agree, but I get an error:

suppressPackageStartupMessages({
  library(dplyr)
})

xx <- data.frame(
   COMPANY_NUMBER = c(1000403L,1000403L,1000403L,
                      1000403L,10029943L,10029943L,10029943L,10037980L,
                      10037980L,10037980L,10037980L,10057418L,10057418L,10057418L,
                      1009550L,1009550L,1009550L,1009550L),
  NUMBER_OF_YEARS = c(4L,4L,4L,4L,3L,3L,3L,4L,
                      4L,4L,4L,3L,3L,3L,4L,4L,4L,4L)
)

xx  %>% 
  group_by(COMPANY_NUMBER, NUMBER_OF_YEARS)  %>% 
  mutate (I = row_number(),
          J = row_number(),
          I = if_else(I < max(I), 1, I))  %>% 
  ungroup()
#> Error: Problem with `mutate()` input `I`.
#> x `false` must be a double vector, not an integer vector.
#> ℹ Input `I` is `if_else(I < max(I), 1, I)`.
#> ℹ The error occurred in group 1: COMPANY_NUMBER = 1000403, NUMBER_OF_YEARS = 4.

Anyone fresh to R from one of the procedural languages would expect this to work, but R loops differ in a few ways. A for loop resets its loop variable from the sequence at the start of each trip, so changes made to that variable inside the body do not carry over to the next trip. There is no i++ operator either: a while loop counter has to be initialised before the loop and advanced explicitly in the body.
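For readers who still want the loop version, here is a minimal base R sketch of how the loop from the question could be repaired. It is not the original poster's code: the two-company data frame is a stand-in, the helper names `n` and `row` are made up for illustration, and it assumes that each company's rows are contiguous and that NUMBER_OF_YEARS equals the number of rows in that company's block.

```r
# Small stand-in for the data frame in the question (two companies only).
xx1 <- data.frame(
  COMPANY_NUMBER  = c("#0070837", "#0070837", "#0070837",
                      "1000403", "1000403", "1000403", "1000403"),
  NUMBER_OF_YEARS = c(3L, 3L, 3L, 4L, 4L, 4L, 4L)
)

# Create the output columns up front so they have the right length.
xx1$I <- NA_integer_
xx1$J <- NA_integer_

i <- 1L                               # the counter must exist before the loop
while (i <= nrow(xx1)) {
  n <- xx1$NUMBER_OF_YEARS[i]         # rows in the current company's block
  for (j in seq_len(n)) {
    row <- i + j - 1L                 # absolute row; j alone restarts at 1
    xx1$J[row] <- j
    xx1$I[row] <- if (j < n) 1L else n
  }
  i <- i + n                          # advance the counter explicitly
}

print(xx1)
```

The key change is indexing with `i + j - 1` rather than `j`, so each company writes to its own rows instead of overwriting the first few rows of the data frame.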

Under the hood, R is implemented either functionally, chaining other functions, or through its limited procedural facility. Mostly, R is exposed to the user as functions.

Procedural language approaches beyond the simplest are candidates for Rcpp or reticulate, which allow objects and functions from C++ and Python, respectively. A package for Go has been rumored. Another approach is to make a system call to another interpreted or compiled language.

It's worth the pain to embrace the vectorized and other functional tools.
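As an illustration of the vectorized style, the same I and J columns can be sketched in base R without any explicit loop. This is only one possible approach, shown on a two-company stand-in for the data; the helper column name `last` is made up here, and the sketch mirrors the grouped `row_number()` and `max()` logic of the dplyr pipeline earlier in the thread.

```r
# Small stand-in for the data frame in the question (two companies only).
xx <- data.frame(
  COMPANY_NUMBER  = c("#0070837", "#0070837", "#0070837",
                      "1000403", "1000403", "1000403", "1000403"),
  NUMBER_OF_YEARS = c(3L, 3L, 3L, 4L, 4L, 4L, 4L)
)

# J: position of each row within its company; ave() applies FUN per group.
xx$J <- ave(seq_len(nrow(xx)), xx$COMPANY_NUMBER, FUN = seq_along)

# Largest J in each company, repeated on every row of that company.
last <- ave(xx$J, xx$COMPANY_NUMBER, FUN = max)

# I: 1 on every row except the last row of each company, which keeps its J.
xx$I <- ifelse(xx$J < last, 1L, xx$J)

print(xx)
```

Unlike the loop, this version does not depend on a company's rows being next to each other, because `ave()` groups by value rather than by position.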

Change the last if_else as follows:

if_else(I < max(I), 1L, I)

Change the 1 to 1L to force it to be integer. That should work.
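The difference between 1 and 1L can be checked in base R. The sketch below uses base `ifelse()`, which quietly coerces mixed types, only to show what type each variant produces; the error message in this thread indicates that the `if_else()` in the dplyr version used here refuses to mix a double with an integer instead. The vector `I` is a made-up stand-in for the integer row counter.

```r
typeof(1)            # "double": a bare numeric literal
typeof(1L)           # "integer": the L suffix requests an integer
typeof(seq_len(3))   # "integer": same type as a row counter

# Stand-in for the integer row counter within one group.
I <- seq_len(3)

# Base ifelse() coerces silently, so the result type follows the literal used.
typeof(ifelse(I < max(I), 1, I))    # "double"
typeof(ifelse(I < max(I), 1L, I))   # "integer"
```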

Vishal, thanks for proposing the solution, which looks very elegant. I ran it but it didn't work. It gives the following error:

Error: Problem with mutate() input I.
x false must be a double vector, not an integer vector.
i Input I is if_else(I < max(I), 1, I).
i The error occurred in group 1: COMPANY_NUMBER = "#0070837", NUMBER_OF_YEARS = 8.
Run rlang::last_error() to see where the error occurred.

Vishal answered this in the post above.
