 # Variable names within a loop

Hi there,
I am switching from Stata, so please be kind to me.
I have a dataset of ratings in multiple categories and want to analyse them. My loop worked well, but now I want to know whether the inter-rater agreement is better when we exclude the first two observations of each rater.
Whatever I do, simple code works fine on its own but fails as soon as I put it inside a loop. In Stata I could use a foreach loop, and wherever I write `i' it would substitute the current variable. This seems to be different in R. Can you give me a hint on how the code should look so that every "i" is "v1" in the first round, "v2" in the second, and so on?

Thank you

``````library(tidyverse)

df.raw <- data.frame(
  raterid = c(rep(1,10),rep(2,10),rep(3,10),rep(4,10),rep(5,10),rep(6,10)),
  videoid = c(1,2,3,4,5,6,7,8,9,10),
  num = sample(1:10),
  v1 = sample(1:2, 3, replace=TRUE),
  v2 = sample(1:2, 3, replace=TRUE),
  v3 = sample(1:2, 3, replace=TRUE),
  v4 = sample(1:2, 3, replace=TRUE),
  v5 = sample(1:2, 3, replace=TRUE),
  v6 = sample(1:2, 3, replace=TRUE)
)

# works
df <- df.raw %>%
  select("raterid", "videoid", v1) %>%
  spread(raterid, v1)

# doesn't work
n <- c("v1") # Later, I want to insert all dimensions here.
for (i in n) {
  df <- df.raw %>%
    mutate(i = replace(i, num < 3, NA)) %>%
    select("raterid", "videoid", i) # %>% works after problem is solved
}
``````

Edit: Kind of Reprex & minor changes

Hi!

To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:

Hi @nirgrahamuk,

I produced an example. In my real data "num" is not the same sequence for every block of ten rows, but each block of ten rows does contain the values 1 to 10. In any case, my problem is the difference between the single statement that works and the loop that does not.

Hi @Rapha,

I tried some tidyverse "magic" with purrr, nesting, gathering and spreading

If I understand you correctly, this is the desired outcome

``````desired_outcome_for_v1 <- df.raw %>%
  select("raterid", "videoid", v1) %>%
  spread(raterid, v1)
``````

Gathering to get the names v1 to v6 into rows:

``````> df.raw %>%
+     gather(key = "v_name", value = "rating", v1:v6)
# A tibble: 360 x 5
raterid videoid   num v_name rating
<dbl>   <dbl> <int> <chr>   <int>
1       1       1     2 v1          2
2       1       2     9 v1          2
3       1       3     5 v1          1
4       1       4     3 v1          2
5       1       5     7 v1          2
6       1       6    10 v1          1
7       1       7     1 v1          2
8       1       8     4 v1          2
9       1       9     6 v1          1
10       1      10     8 v1          2
# ... with 350 more rows
``````
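A side note for anyone on a current tidyr: `gather()` still works but is marked as superseded, and `pivot_longer()` is the suggested replacement. Below is a minimal sketch of the same reshaping, assuming tidyr 1.0 or later. `df.small` is a made-up four-row stand-in for `df.raw`, not the data from this thread.

```r
library(tidyr)

# Hypothetical miniature of df.raw with only two rating columns
df.small <- data.frame(
  raterid = c(1, 1, 2, 2),
  videoid = c(1, 2, 1, 2),
  num     = c(2, 1, 1, 2),
  v1      = c(1, 2, 2, 1),
  v2      = c(2, 2, 1, 1)
)

# Same idea as gather(key = "v_name", value = "rating", v1:v2)
long <- pivot_longer(
  df.small,
  cols      = v1:v2,
  names_to  = "v_name",
  values_to = "rating"
)
long
```

The rows come out ordered by original row rather than by variable, so the order differs from the `gather()` output above, but the content should be the same.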

Then grouping and nesting:

``````> df.raw %>%
+     gather(key = "v_name", value = "rating", v1:v6) %>%
+     group_by(v_name) %>%
+     nest()
# A tibble: 6 x 2
v_name data
<chr>  <list>
1 v1     <tibble [60 x 4]>
2 v2     <tibble [60 x 4]>
3 v3     <tibble [60 x 4]>
4 v4     <tibble [60 x 4]>
5 v5     <tibble [60 x 4]>
6 v6     <tibble [60 x 4]>
``````

``````nested_outcome <- df.raw %>%
  gather(key = "v_name", value = "rating", v1:v6) %>%
  group_by(v_name) %>%
  nest() %>%
  mutate(DESIRED_TABLE = map(data, ~ .x %>%
    select(raterid, videoid, rating) %>%
    spread(raterid, rating)))

nested_outcome

# A tibble: 6 x 3
v_name data              DESIRED_TABLE
<chr>  <list>            <list>
1 v1     <tibble [60 x 4]> <tibble [10 x 7]>
2 v2     <tibble [60 x 4]> <tibble [10 x 7]>
3 v3     <tibble [60 x 4]> <tibble [10 x 7]>
4 v4     <tibble [60 x 4]> <tibble [10 x 7]>
5 v5     <tibble [60 x 4]> <tibble [10 x 7]>
6 v6     <tibble [60 x 4]> <tibble [10 x 7]>
``````

And checking if correct (hopefully)

``````# check for desired outcome
all(nested_outcome$DESIRED_TABLE[[1]] == desired_outcome_for_v1)
``````

Or in a wider form:

``````nested_outcome_wide <- nested_outcome %>%
  ungroup() %>%
  select(-data) %>%
  spread(v_name, DESIRED_TABLE)

# A tibble: 1 x 6
v1                v2                v3                v4                v5                v6
<list>            <list>            <list>            <list>            <list>            <list>
1 <tibble [10 x 7]> <tibble [10 x 7]> <tibble [10 x 7]> <tibble [10 x 7]> <tibble [10 x 7]> <tibble [10 x 7]>
``````

Is that the result you are looking for? Because I was not quite sure what your desired outcome was.

Hi @smichal,

thank you, that's impressive and well explained. I got similar results with my for() loop, but it only saved the results temporarily. This will cut down a bunch of code later on.
However, my problem with the code came later. I want to exclude all observations from the first two videos (num <= 2) of every rater in every dimension. I tried:

``````for (i in n) {
  df <- df.raw %>%
    mutate(i = replace(i, num < 3, NA)) %>%
    select("raterid", "videoid", i)
}
``````

Here "n" holds the variables v1 to v6.
My loop generated tables in which the whole column v1 was NA, not just the observations from the first two videos. When I used the actual variable name "v1" instead of "i" it worked, but that defeats the purpose of a loop. So my actual question was how to use "i" properly as the identifier of a variable rather than as a character string.
I would still like to know that, but I liked your solution. Hence a new question: how do I add a function that replaces the rating value with NA if the video is one of the first two (num <= 2)?

``````nested_outcome <- df.raw %>%
  gather(key = "v_name", value = "rating", v1:v6) %>%
  group_by(v_name) %>%
  nest() %>%
  mutate(DESIRED_TABLE = map(data, ~ .x %>%
    select(raterid, videoid, rating) %>%
    spread(raterid, rating)))
``````

And if you or anybody else is bored: I used a for() loop to replace all "-98" values with NA in my actual dataset. It worked, but it looks overly complicated. Is there a better way?

``````n <- c(1,2,3,4,5,6)
for(i in n){
nested_outcome[][[i]][nested_outcome[][[i]] == -98] <- NA
}
``````

Thank you
Rapha
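On the original question of using `i` as a column name inside the loop: within `mutate()`, the bare `i` on the left of `=` creates a new column literally called `i`, and the `i` on the right is just the string "v1" rather than the column, which would explain the mostly-NA result. One possible fix, sketched under the assumption of a reasonably recent dplyr (the `.data` pronoun, `:=`, and `all_of()`), is shown below. The shortened `df.raw` with two rating columns and the `results` list are illustrative choices, not part of the original code.

```r
library(dplyr)

set.seed(1)
# Shortened stand-in for df.raw with two rating columns only
df.raw <- data.frame(
  raterid = rep(1:6, each = 10),
  videoid = rep(1:10, times = 6),
  num     = rep(sample(1:10), times = 6),
  v1      = sample(1:2, 60, replace = TRUE),
  v2      = sample(1:2, 60, replace = TRUE)
)

n <- c("v1", "v2")
results <- list()  # hypothetical container so each round is kept

for (i in n) {
  results[[i]] <- df.raw %>%
    # !!i := ...   names the target column after the string in i
    # .data[[i]]   reads the column whose name is stored in i
    mutate(!!i := replace(.data[[i]], num < 3, NA)) %>%
    select(raterid, videoid, all_of(i))
}

head(results$v1)
```

Storing each round in a list also avoids the problem of `df` being overwritten on every pass through the loop.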

I think you would modify smichal's solution like so:

``````(df <- df.raw %>%
  gather(key = "v_name", value = "rating", v1:v6) %>%
  mutate(rating = ifelse(num <= 2, NA, rating)) %>%
  group_by(v_name) %>%
  nest() %>%
  mutate(DESIRED_TABLE = map(data, ~ .x %>%
    select(raterid, videoid, rating) %>%
    spread(raterid, rating))))
``````

inserting a mutate between gather and grouping
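The same spot in the pipeline (before any nesting) is probably also the easiest place for the -98 recode asked about above, since the ratings are still plain columns there. Here is a minimal sketch, assuming dplyr 1.0 or later for `across()`. The `ratings` data frame is invented for illustration and only mimics the layout of the real data.

```r
library(dplyr)

# Made-up ratings where -98 marks a missing answer
ratings <- data.frame(
  raterid = c(1, 1, 2, 2),
  videoid = c(1, 2, 1, 2),
  v1      = c(1, -98, 2, 1),
  v2      = c(-98, 2, 2, -98)
)

# na_if() turns every value equal to -98 into NA;
# across() applies it to all rating columns at once
cleaned <- ratings %>%
  mutate(across(v1:v2, ~ na_if(.x, -98)))

cleaned
```

This replaces the index-based for() loop with a single `mutate()` call and leaves `raterid` and `videoid` untouched.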

Hi @Rapha,

I'm getting a bit puzzled about what you actually want to achieve.

Is it something like this?

``````# explicit
df.raw %>%
  mutate(v1 = ifelse(df.raw$num < 3, NA, v1),
         # ... and so on
         v6 = ifelse(df.raw$num < 3, NA, v6))
``````

If yes, you could use mutate_at in the first place without my lengthy nested solution:

df.raw %>%
  # Apply a function to the variables v1 to v6
  # The "." in the function refers back to df.raw at the beginning of the pipe
  mutate_at(vars(v1:v6), function(.x){ifelse(.$num < 3, NA, .x)})

# A tibble: 60 x 9
raterid videoid   num    v1    v2    v3    v4    v5    v6
<dbl>   <dbl> <int> <int> <int> <int> <int> <int> <int>
1       1       1     5     2     1     2     1     2     2
2       1       2     1    NA    NA    NA    NA    NA    NA
3       1       3    10     1     2     1     2     1     1
4       1       4     4     2     1     2     1     2     2
5       1       5     6     2     2     1     1     1     2
6       1       6     3     1     2     1     2     1     1
7       1       7     9     2     1     2     1     2     2
8       1       8     2    NA    NA    NA    NA    NA    NA
9       1       9     8     1     2     1     2     1     1
10       1      10     7     2     1     2     1     2     2
# ... with 50 more rows
``````
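For readers on dplyr 1.0 or later, `mutate_at()` is marked as superseded in favour of `across()`. A rough equivalent of the call above might look like the sketch below. `df.small` is a hypothetical six-row stand-in for `df.raw`, and `replace()` is used instead of `ifelse()` purely as a matter of taste.

```r
library(dplyr)

# Hypothetical miniature of df.raw
df.small <- data.frame(
  raterid = c(1, 1, 1, 2, 2, 2),
  videoid = c(1, 2, 3, 1, 2, 3),
  num     = c(2, 3, 1, 3, 1, 2),
  v1      = c(1, 2, 2, 1, 2, 1),
  v2      = c(2, 2, 1, 1, 1, 2)
)

# Inside across(), num refers to the num column of the same data frame,
# so no "." back-reference to the piped data is needed
masked <- df.small %>%
  mutate(across(v1:v2, ~ replace(.x, num < 3, NA)))

masked
```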