MyFunc <- function(DF) {
  DF %>% group_by(B) %>%
    mutate(NewCol = 1:n()) %>%
    arrange(B)
}
The above part of my code defines a new function that takes one argument named DF. It passes DF through the steps within the braces: grouping by B, adding NewCol with mutate(), and sorting by B. Because an R function returns the value of its last expression, the result of that pipeline is what MyFunc returns. It would have been clearer if I had written
MyFunc <- function(DF) {
  tmp <- DF %>% group_by(B) %>%
    mutate(NewCol = 1:n()) %>%
    arrange(B)
  return(tmp)
}
After running that code, I can pass any data frame that has a column named B into MyFunc and get back a data frame with the additional column NewCol. Below is an example of MyFunc acting on the first element of the LIST I defined in my previous post.
library(dplyr)
LIST <- list(A = data.frame(D = 1:6, B = rep(LETTERS[1:3], 2)),
             C = data.frame(E = 2:7, B = rep(LETTERS[1:3], 2)))
LIST
#> $A
#> D B
#> 1 1 A
#> 2 2 B
#> 3 3 C
#> 4 4 A
#> 5 5 B
#> 6 6 C
#>
#> $C
#> E B
#> 1 2 A
#> 2 3 B
#> 3 4 C
#> 4 5 A
#> 5 6 B
#> 6 7 C
MyFunc <- function(DF) {
  DF %>% group_by(B) %>%
    mutate(NewCol = 1:n()) %>%
    arrange(B)
}
subList <- MyFunc(LIST[[1]])
subList
#> # A tibble: 6 x 3
#> # Groups: B [3]
#> D B NewCol
#> <int> <fct> <int>
#> 1 1 A 1
#> 2 4 A 2
#> 3 2 B 1
#> 4 5 B 2
#> 5 3 C 1
#> 6 6 C 2
Created on 2019-09-16 by the reprex package (v0.2.1)
MyFunc works just like a standard R function such as mean(), which returns the average of whatever is passed to it. The only difference is that MyFunc is very simple, with no error handling or flexibility.
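As a rough sketch of what basic error handling could look like, here is a variant that checks its input before doing any work. The name MyFuncChecked and the wording of the error messages are my own inventions for illustration, not part of the code from my previous post.

```r
library(dplyr)

# Hypothetical variant of MyFunc: same pipeline, plus two input checks.
MyFuncChecked <- function(DF) {
  # Stop early with a readable message if the input is not usable
  if (!is.data.frame(DF)) {
    stop("DF must be a data frame")
  }
  if (!"B" %in% names(DF)) {
    stop("DF must contain a column named B")
  }
  # Same steps as MyFunc: number the rows within each level of B, then sort by B
  DF %>% group_by(B) %>%
    mutate(NewCol = 1:n()) %>%
    arrange(B)
}

# A data frame without a column B now fails with a clear message;
# tryCatch() captures that message instead of halting the script.
msg <- tryCatch(MyFuncChecked(data.frame(X = 1:3)),
                error = function(e) conditionMessage(e))
msg
```

With a valid data frame, MyFuncChecked should behave the same as MyFunc.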
I coupled MyFunc with map(), which comes from the purrr package. map() takes a list as its first argument and a function as its second, calls the function on each element of the list, and returns a list of the results.
The call
map(LIST, MyFunc)
simply applies MyFunc to each element of LIST in turn and returns a list of the resulting data frames.
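To make that concrete, here is a self-contained sketch that runs the call and checks the shape of what comes back. It assumes the dplyr and purrr packages are installed. The lapply() line is only my suggestion of a base R equivalent for comparison, and the names result and viaLapply are made up for this example.

```r
library(dplyr)
library(purrr)

LIST <- list(A = data.frame(D = 1:6, B = rep(LETTERS[1:3], 2)),
             C = data.frame(E = 2:7, B = rep(LETTERS[1:3], 2)))

MyFunc <- function(DF) {
  DF %>% group_by(B) %>%
    mutate(NewCol = 1:n()) %>%
    arrange(B)
}

# map() calls MyFunc once per element and keeps the names of LIST
result <- map(LIST, MyFunc)
names(result)
#> [1] "A" "C"

# Each element is the processed version of the matching element of LIST
result$C

# Base R's lapply() does the same job for a plain function like this one
viaLapply <- lapply(LIST, MyFunc)
```

So result[[1]] holds the same data frame as subList above, and result[[2]] holds the second data frame processed the same way.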