 # Trying to understand pmap()

Hi,
I am still trying to wrap my head around the purrr package.
I stumbled upon this example:

``````l <- list(rnorm(10),
rnorm(100),
rnorm(1000))
pmap_dbl(list(l, 20, TRUE), ~ mean(..1, ..2, ..3))
``````

I want to understand what 20 and TRUE are here for:

``````pmap_dbl(list(l, 20, TRUE)
``````

and what do the `..1`, `..2` and `..3` arguments stand for?

Trying to figure it out I found that:

``````mean(c(20, TRUE))

mean(c(20, FALSE))

``````

give different results: 10.5 and 10 respectively.

Could somebody guide me, please?


This does not seem like a helpful example of using `pmap`. It is iterating over the three elements of `l` with the `mean` function, using each element of `l` as the first argument, 20 as the second argument and TRUE as the third.

Your example of `mean(c(20, TRUE))` is completely different. It calculates the mean of 20 and TRUE, where TRUE = 1.

``````library(purrr)
l <- list(rnorm(10),
rnorm(100),
rnorm(1000))
pmap_dbl(list(l, 20, TRUE), ~ mean(..1, ..2, ..3))
#> [1]  0.08393619  0.15101796 -0.03647648
# mean takes three arguments, x, trim and na.rm, see the help for mean
mean(l[[1]], 20, TRUE)
#> [1] 0.08393619
mean(l[[2]], 20, TRUE)
#> [1] 0.151018
mean(l[[3]], 20, TRUE)
#> [1] -0.03647648

# passing 20 to the trim argument of mean is equivalent to passing 0.5, see ?mean
pmap_dbl(list(l, 0.5, TRUE), ~ mean(..1, ..2, ..3))
#> [1]  0.08393619  0.15101796 -0.03647648
``````

Created on 2020-04-05 by the reprex package (v0.3.0)
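
To make the difference concrete, here is a small sketch in base R. The vector `x` is made up for illustration and is not part of the original example. It contrasts pooling `20` and `TRUE` into the data with passing them by position as `trim` and `na.rm`:

```r
# Illustrative data, not from the original post
x <- c(1, 2, 3, 100)

# Pooled into one vector: TRUE is coerced to 1, so this averages 20 and 1
pooled <- mean(c(20, TRUE))          # 10.5

# Passed by position: 20 becomes trim, TRUE becomes na.rm
trimmed <- mean(x, 20, TRUE)         # 2.5

# A trim of 0.5 or more gives the median
same_as_half <- mean(x, 0.5, TRUE)   # 2.5
med <- median(x)                     # 2.5
```

So with `trim = 20` the call is effectively computing a median, not an ordinary mean.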


That example seems unnecessarily complicated and confusing. `pmap` (parallel map) allows you to map over any number of arguments. So, for example, the code below runs `rnorm` three times, first with n=3, mean=2, and sd=1, then with n=6, mean=4, sd=3, and so on:

``````l = list(n=c(3,6,9),
m=c(2,4,6),
s=c(1,3,5))

pmap(l, rnorm)
``````
``````[[1]]
[1] 2.900625 2.851770 2.727715

[[2]]
[1] 6.209506 2.943611 6.116547 7.901074 4.114756 1.062149

[[3]]
[1]  9.968806  9.932534  4.447684 14.494424  2.027031  7.742189
[7] -5.327005  5.188974 11.654325
``````

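`pmap` automatically passed the elements of `l` to `rnorm` as its arguments. Because the list is named, they are matched by name (`m` and `s` partially match `mean` and `sd`), which here is also their positional order. If I wanted to change the order, I would use `..1` to refer to the first element of `l`, `..2` to refer to the second element, and so on: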

``````pmap(l, ~rnorm(..3, ..1, ..2))
``````
``````[[1]]
[1] 2.088908

[[2]]
[1] 2.403335 8.907356 2.762236

[[3]]
[1] 10.6025107 -1.4235823  0.5314492  6.2786926  2.7870523
``````
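
Since `rnorm` output is random, the same idea may be easier to follow with a deterministic function. The sketch below uses a hypothetical helper, `describe()`, that only reports the arguments it received; the list `args` is also made up for illustration:

```r
library(purrr)

# Hypothetical helper: reports which value landed in which argument
describe <- function(n, mean, sd) sprintf("n=%g mean=%g sd=%g", n, mean, sd)

args <- list(n = c(3, 6), mean = c(2, 4), sd = c(1, 3))

# Named list: elements are matched to arguments by name
by_name <- pmap_chr(args, describe)
# "n=3 mean=2 sd=1" "n=6 mean=4 sd=3"

# Reordering a named list does not change the result
shuffled <- pmap_chr(args[c("sd", "n", "mean")], describe)

# With a formula, ..1, ..2, ..3 refer to list elements by position
reordered <- pmap_chr(unname(args), ~ describe(..3, ..1, ..2))
# "n=1 mean=3 sd=2" "n=3 mean=6 sd=4"
```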

You can get the same result as in your example with `map` as follows:

``````map_dbl(l, mean, 20, TRUE)
``````

`mean` has three arguments: `x`, which is the vector of values for which you want the mean; `trim`, which trims a fraction of the values from each end; and `na.rm`, which strips missing values before calculating the mean. In the `map_dbl` example I gave above, `map_dbl` iterates over each element of `l`. But for each iteration, we want `trim=20` and `na.rm=TRUE`. To make that happen, we put those arguments after the `mean` function and they get entered by position into `mean`. If we don't want to have to worry about entering arguments in the right order, we can name them and they will get passed correctly:

``````map_dbl(l, mean, na.rm=TRUE, trim=20)
``````

In the example you gave, `list(l, 20, TRUE)` is a three-element list. The first element is `l`, which is itself a three-element list and the second and third elements are `20` and `TRUE`. `pmap` operates in parallel and recycles `20` and `TRUE` for each of the three calls it makes to `mean`. Not only can this be done more simply with `map_dbl`, using `20` for trim is confusing, because `trim` can never be greater than 0.5. When a value greater than 0.5 is entered, `mean` silently reduces it to 0.5.
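
As a quick check of the claims above, the following sketch compares the three approaches on simulated data. The seed value is arbitrary and only there to make the run repeatable:

```r
library(purrr)

set.seed(42)  # arbitrary seed, for repeatability only
l <- list(rnorm(10), rnorm(100), rnorm(1000))

# The original pmap_dbl call: 20 and TRUE are recycled across the 3 calls
via_pmap <- pmap_dbl(list(l, 20, TRUE), ~ mean(..1, ..2, ..3))

# The simpler map_dbl call with named extra arguments
via_map <- map_dbl(l, mean, trim = 20, na.rm = TRUE)

# A trim of 0.5 or more amounts to taking the median
via_median <- map_dbl(l, median)

all.equal(via_pmap, via_map)     # TRUE
all.equal(via_pmap, via_median)  # TRUE
```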

Understanding arguments to functions is something that I had a lot of trouble with (and shouldn't have in retrospect).

One of the hard things to get used to in `R` is the concept that everything is an `object` that has properties. Some objects have properties that allow them to operate on other objects to produce new objects. Those are `functions`.

Think of `R` as school algebra writ large: f(x) = y, where the objects are f, a function; x, an object (and there may be several) termed the `argument`; and y, an object termed a `value`, which can be as simple as a single number (a length-one `atomic vector`) or a very packed object with a multitude of data and labels.

And, because functions are also objects, they can be arguments to other functions, like the old g(f(x)) = y. (Trivia, this is called being a first class object.)
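
A minimal sketch of that idea, using two hypothetical functions named after the `f` and `g` in the text:

```r
# Hypothetical functions for illustration
f <- function(x) x^2
g <- function(fun, x) fun(x) + 1

# f is passed to g like any other object: 3^2 + 1
y <- g(f, 3)               # 10

# Base R functions accept functions as arguments in the same way
squares <- sapply(1:3, f)  # 1 4 9
```

This is the same mechanism `pmap` relies on when it receives `mean` or `rnorm` as its `.f` argument.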

Although there are function objects in `R` that operate like control statements in imperative/procedural languages, they are best used "under the hood." As it presents to users interactively, `R` is a functional programming language. Instead of saying

take this, take that, do this, then do that, then if the result is this one thing, do this other thing, but if not do something else and give me the answer

in the style of most common programming languages, `R` allows the user to say

use this function to take this argument and turn it into the value I want for a result
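
The contrast can be sketched in base R as follows. The data are made up, and both versions compute the same per-element means:

```r
# Illustrative data
x <- list(c(1, 2, 3), c(10, 20))

# Imperative style: allocate a result, loop, fill it in step by step
out_loop <- numeric(length(x))
for (i in seq_along(x)) {
  out_loop[i] <- mean(x[[i]])
}

# Functional style: say which function to apply to which argument
out_fun <- vapply(x, mean, numeric(1))

# Both give 2 and 15
```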

Every function has a `signature`

``````pmap_dbl(.l, .f, ...)
``````

The first two arguments, `.l` and `.f`, are documented as follows:

`.l` A list of vectors, such as a data frame. The length of `.l` determines the number of arguments that `.f` will be called with. List names will be used if present.

`.f` A function, formula, or vector (not necessarily atomic). ... If a formula, e.g. `~ .x + 2`, it is converted to a function ... for more arguments, use `..1`, `..2`, `..3` etc.
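
To see the formula-to-function conversion on its own, one option is purrr's `as_mapper()`, which performs the same conversion outside of a `map` or `pmap` call. This is a sketch with made-up input values:

```r
library(purrr)

# Convert the formula into an ordinary function
f <- as_mapper(~ mean(..1, ..2, ..3))

is.function(f)  # TRUE

# ..1, ..2 and ..3 are simply the first, second and third arguments
res <- f(c(1, 2, 3, 100), 0.5, TRUE)  # 2.5
```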

Breaking down

``````pmap_dbl(list(l, 20, TRUE), ~ mean(..1, ..2, ..3))
``````

Here `list(l, 20, TRUE)` is `.l` and `~ mean(..1, ..2, ..3)` is `.f`. What does it do?

Let's take the .l argument first.

``````str(list(l, 20, TRUE))
List of 3
 $ :List of 3
  ..$ : num [1:10] -0.638 -0.287 1.057 -2.519 -1.835 ...
  ..$ : num [1:100] -0.456 1.446 0.448 0.855 -0.106 ...
  ..$ : num [1:1000] 0.159 0.573 0.587 -0.267 0.352 ...
 $ : num 20
 $ : logi TRUE
``````

It's a list containing a `list` of three elements, 20 and TRUE. The list within the list is a list of three numeric vectors.

``````str(l)
List of 3
 $ : num [1:10] -0.638 -0.287 1.057 -2.519 -1.835 ...
 $ : num [1:100] -0.456 1.446 0.448 0.855 -0.106 ...
 $ : num [1:1000] 0.159 0.573 0.587 -0.267 0.352 ...
``````

Subsetting with single brackets still returns a list, here a list of one element:

``````str(l[1])
List of 1
 $ : num [1:10] -0.638 -0.287 1.057 -2.519 -1.835 ...
``````

We're now almost down to something we can take a `mean` of.

``````mean(l[[1]])
[1] -0.6441985
``````

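Note that `pmap_dbl` does not pool the three values into a single vector, which would give

``````mean(c(l[[1]],20,TRUE))
[1] 1.213168
``````

Instead, on the first iteration `..1` stands for `l[[1]]`, `..2` for `20` (which goes to `trim`) and `..3` for `TRUE` (which goes to `na.rm`).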

This is a long-winded way of saying that objects in `R` can easily go very deep, and it pays to take a pause to do this kind of analysis.
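
The difference between single and double brackets, which the walk-through above depends on, can be sketched with a small made-up list:

```r
# Illustrative list, not the one from the thread
l_demo <- list(c(1, 2, 3), c(10, 20))

# Single brackets keep the list wrapper
class(l_demo[1])     # "list"

# Double brackets extract the element itself
class(l_demo[[1]])   # "numeric"

# mean() needs the numeric vector, not the one-element list
m <- mean(l_demo[[1]])                     # 2
m_list <- suppressWarnings(mean(l_demo[1]))  # NA, with a warning if not suppressed
```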

Thank you @FJCC, @joels, @technocrat for your detailed explanations.
This is much better to read:

``````map_dbl(l, mean, na.rm=TRUE, trim=20)
``````

than

``````pmap_dbl(list(l, 20, TRUE)
``````

where arguments are not explicitly written down.
I didn't know that a bare TRUE could stand for na.rm = TRUE and that 20 means the same as trim=20.

Thank you again. I will now study what you wrote in order to comprehend it.
kind regards,
Andrzej

When you divided this into separate one by one calls:

I tried this:

``````pmap_dbl(list(l, 20, TRUE), ~ mean(..1=l[[1]], ..2=l[[2]], ..3=l[[3]]))
``````

but it returned an error:

What mistake did I make here?

Thank you @joels,
So if I understand it right, this function:

calculates mean() with options trim = 20 and na.rm = TRUE for each element of this list:

Is that correct?
If so, I think it would be nicer to create that list like this:

``````l <- list(a = rnorm(10),
b= rnorm(100),
c= rnorm(1000))
``````

and I checked that it worked, but for my learning purposes it is more convenient to divide it into smaller pieces.

One more question: sometimes I see code where users put { } curly braces inside pmap().
Would that be an option in my example as well, or is it only for using tidyverse pipes and purrr together?


``````pmap_dbl(list(l, 20, TRUE), ~ mean(x=..1, trim=..2, na.rm=..3))
``````

`..1` refers to the first element of `list(l, 20, TRUE)`, which is `l`.
`..2` refers to the second element of `list(l, 20, TRUE)`, which is `20`.
`..3` refers to the third element of `list(l, 20, TRUE)`, which is `TRUE`.

Note also that the `mean` function's arguments are `x`, `trim`, and `na.rm`, rather than `..1`, `..2`, and `..3`.
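
Putting this together, here is a minimal sketch on made-up data showing the named-argument form, an equivalent call that names the list elements instead, and a formula whose body is wrapped in curly braces. Braces in a formula simply group several statements, and the last value is returned:

```r
library(purrr)

# Illustrative data, not from the thread
x <- list(c(1, 2, 3, 100), c(NA, 5, 7))

# Name mean()'s own arguments and fill them from ..1, ..2, ..3
by_formula <- pmap_dbl(list(x, 0.5, TRUE),
                       ~ mean(x = ..1, trim = ..2, na.rm = ..3))

# Alternatively, name the list elements and pass mean directly
by_name <- pmap_dbl(list(x = x, trim = 0.5, na.rm = TRUE), mean)

# Curly braces allow several statements inside the formula
with_braces <- pmap_dbl(list(x, 0.5, TRUE), ~ {
  m <- mean(..1, trim = ..2, na.rm = ..3)
  m * 10
})

# by_formula and by_name are both 2.5 and 6; with_braces is 25 and 60
```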