Write functions to iterate over a vector

I think my question is best asked by showing an example.

This simple function works well with a single input, and it also works when given a vector: it simply returns a vector of answers.

simplefun <- function(x,y) {
  n <- x+y
  return(n)
}

Single input for x:

> simplefun(2,3)
[1] 5

Vector input for x:

> simplex <- 1:10
> simplefun(simplex,3)
 [1]  4  5  6  7  8  9 10 11 12 13

So far so good.

Here is a dataframe and function that works fine with a single input, but fails with an input vector:

library(tidyverse)

df <- tibble(name = c("bob", "bob", "iga", "iga", "gilbert" ,"gilbert"),
             result = c(1,2,1,2,1,2),
             val = c(3,5,6,7,8,9))

lookup <- function(n, r) {
  v <- df %>%
    filter(name == n) %>% 
    filter(result == r) %>% 
    pull(val)

  return(v)
}

Single input:

> lookup("bob", 2)
[1] 5

Vector input:

> names <- c("bob", "iga")
> lookup(names, 2)
[1] 7

I expected to see an output of [1] 5 7, but that's not what happens.

In my mind I thought R basically treated inputs like this with something like a for loop: feeding element 1 into the function, then element 2, and so on. Playing around with this, I gather that's not what's happening.

I was actually able to get my actual function to work with lapply/sapply, but it was very slow.

So my question is: What's the best / correct way to write the function like the lookup function above so it can take a vector as an input (and therefore be used in a mutate() function operating on a dataframe)?

What is R actually doing in my two above functions that results in one returning a vector and the other not?

Thanks very much,

Luke

R doesn't "automagically" perform a loop when you pass it vectors. It performs vector operations when the functions or operators you apply to vector objects are vectorized, that is, when they have methods that work on whole vectors.

In your first function you are adding a single number to a vector. R handles this by recycling the single value to the length of the longer vector, so it effectively does this:

c(1, 2, 3, 4, 5) + c(3, 3, 3, 3, 3)
#> [1] 4 5 6 7 8
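
To make the recycling a bit more concrete, here is a small base R sketch. It is only an illustration with made-up values, not anything from your actual data, but it shows that the same rule applies to a length-1 value, to a shorter vector, and to comparison operators such as ==.

```r
# A length-1 operand is recycled to match the longer vector
1:5 + 3
#> [1] 4 5 6 7 8

# A shorter vector is recycled too, element by element, in order
1:6 + c(10, 20)
#> [1] 11 22 13 24 15 26

# Comparison operators follow the same recycling rule
c("bob", "bob", "iga", "iga") == c("bob", "iga")
#> [1]  TRUE FALSE FALSE  TRUE
```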

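In your second function, the == operator compares element by element, and when the two vectors differ in length the shorter one is recycled. So name == n does not ask "is this name one of n"; it compares row 1 with "bob", row 2 with "iga", row 3 with "bob" again, and so on. Only rows 1 and 4 match, and of those only row 4 has result == 2, which is why you get 7. One way to solve this is to use the %in% operator, which checks each element on the left side against all the elements of the vector on the right side.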

library(tidyverse)

df <- tibble(name = c("bob", "bob", "iga", "iga", "gilbert" ,"gilbert"),
             result = c(1,2,1,2,1,2),
             val = c(3,5,6,7,8,9))

lookup <- function(n, r) {
    v <- df %>%
        filter(name %in% n) %>% 
        filter(result == r) %>% 
        pull(val)
    
    return(v)
}

lookup("bob", 2)
#> [1] 5

names <- c("bob", "iga")

lookup(names, 2)
#> [1] 5 7

Created on 2022-06-05 by the reprex package (v2.0.1)
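
If it helps to see the difference directly, this base R sketch prints the logical vectors that filter() receives in each case. The name and n vectors are just copies of the values used in this thread.

```r
name <- c("bob", "bob", "iga", "iga", "gilbert", "gilbert")
n <- c("bob", "iga")

# == pairs the rows with a recycled copy of n: bob, iga, bob, iga, bob, iga
name == n
#> [1]  TRUE FALSE FALSE  TRUE FALSE FALSE

# %in% asks, for every row, whether its name appears anywhere in n
name %in% n
#> [1]  TRUE  TRUE  TRUE  TRUE FALSE FALSE
```

One thing to keep in mind: with %in% the results come back in the row order of df, not in the order of the input vector, so lookup(c("iga", "bob"), 2) would still give 5 7.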

In general, you can use purrr iteration to apply a function to each element of some input (without having to write an explicit for loop yourself).
Here is an example that uses the original lookup, which only handles a single exact match (not the version adjusted by andresrcs to match many).


library(tidyverse)


df <- tibble(name = c("bob", "bob", "iga", "iga", "gilbert" ,"gilbert"),
             result = c(1,2,1,2,1,2),
             val = c(3,5,6,7,8,9))

lookup <- function(n, r) {
  v <- df %>%
    filter(name == n) %>% 
    filter(result == r) %>% 
    pull(val)
  
  return(v)
}

lookup("bob", 2)
#> [1] 5

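names <- c("bob", "iga")
lookup(names, 2)  # == recycles names over the rows, so this returns 7 rather than 5 7

# purrr::map() calls lookup() once per name; purrr::set_names() labels the results
map(names, ~ lookup(.x, 2)) %>% set_names(names)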

https://r4ds.had.co.nz/iteration.html
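
Since the original question was about using this inside mutate(), here is a rough sketch of how that could look. The queries tibble and its who and r columns are made up for illustration, as are the with_map and with_join names. map2_dbl() calls the unchanged lookup once per row and expects exactly one number back each time, so it will error if a name/result pair has no match or more than one. The left_join() version is an alternative that does the whole lookup in one vectorized step, which is likely to be faster on large inputs, though I have not timed it on your data.

```r
library(tidyverse)

df <- tibble(name = c("bob", "bob", "iga", "iga", "gilbert", "gilbert"),
             result = c(1, 2, 1, 2, 1, 2),
             val = c(3, 5, 6, 7, 8, 9))

# Original single-value lookup, unchanged in behaviour
lookup <- function(n, r) {
  df %>%
    filter(name == n, result == r) %>%
    pull(val)
}

# Hypothetical table of things to look up, one pair per row
queries <- tibble(who = c("iga", "bob"), r = c(2, 1))

# Option 1: iterate row by row inside mutate()
with_map <- queries %>%
  mutate(val = map2_dbl(who, r, lookup))

# Option 2: let a join do the matching in one step
with_join <- queries %>%
  left_join(df, by = c("who" = "name", "r" = "result"))

with_map$val
#> [1] 7 3
with_join$val
#> [1] 7 3
```

Both versions keep the results in the order of queries, which the %in% approach does not.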
