How to transform a dataframe in a map_if call?

dplyr
purrr

#1

Consider this simple example

mydata <- data_frame(group = c('a', 'a', 'a', 'b', 'b', 'b'),
           x = c(1,2,3,5,6,7),
           y = c(3,5,6,4,3,2))

# A tibble: 6 x 3
  group     x     y
  <chr> <dbl> <dbl>
1 a         1     3
2 a         2     5
3 a         3     6
4 b         5     4
5 b         6     3
6 b         7     2

Here, I want to nest by group, and multiply the corresponding list-column dataframe by a given constant ONLY IF the group is a.

Something like:

mydata %>% group_by(group) %>% 
  nest() %>% 
  mutate(flipped_df = map_if(data, group %in% c('a'), ~.x*-1))
# A tibble: 2 x 3
  group data             flipped_df          
  <chr> <list>           <list>              
1 a     <tibble [3 x 2]> <data.frame [3 x 2]>
2 b     <tibble [3 x 2]> <tibble [3 x 2]>

I don't understand why the flipped element for group a is a plain data.frame and no longer a tibble. Any ideas?
Thanks!


#2

I was able to come up with a partial solution, but it still does not work (because of the date?!)



library(tidyverse)
library(lubridate)  # for ymd()

mydata <- data_frame(group = c('a', 'a', 'a', 'b', 'b', 'b'),
           data = c(ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01')),
           y = c(3,5,6,4,3,2))

flip <- function(df) {
  mutate_if(df, is.double, funs(-.))}

mydata %>% group_by(group) %>% 
  nest() %>% 
  mutate(flipped_df = map_if(data, group %in% c('a'), flip))
Error in mutate_impl(.data, dots) : 
  Evaluation error: Evaluation error: unary - is not defined for "Date" objects..

#3

I'm not sure I understand your desired output here. Do you want to keep only group a? If not, you haven't defined a behaviour for group b.

From the purrr docs:

map() , map_if() and map_at() always return a list. See the modify() family for versions that return an object of the same type as the input.

Currently, here's what you're generating at each step.

library(tidyverse)
mydata <- data_frame(group = c('a', 'a', 'a', 'b', 'b', 'b'),
                     x = c(1,2,3,5,6,7),
                     y = c(3,5,6,4,3,2))
mydata %>%
  group_by(group) 
#> # A tibble: 6 x 3
#> # Groups:   group [2]
#>   group     x     y
#>   <chr> <dbl> <dbl>
#> 1 a         1     3
#> 2 a         2     5
#> 3 a         3     6
#> 4 b         5     4
#> 5 b         6     3
#> 6 b         7     2


mydata %>%
  group_by(group) %>%
  nest()
#> # A tibble: 2 x 2
#>   group data            
#>   <chr> <list>          
#> 1 a     <tibble [3 × 2]>
#> 2 b     <tibble [3 × 2]>

mydata %>%
  group_by(group) %>%
  nest() %>% 
  mutate(flipped_df = map_if(data, group %in% c('a'), ~.x*-1))
#> # A tibble: 2 x 3
#>   group data             flipped_df          
#>   <chr> <list>           <list>              
#> 1 a     <tibble [3 × 2]> <data.frame [3 × 2]>
#> 2 b     <tibble [3 × 2]> <tibble [3 × 2]>

Created on 2018-09-11 by the reprex package (v0.2.0.9000).
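
In the output above, the element for group a comes back as a plain data.frame after `.x * -1`. A minimal sketch of one way to keep each element a tibble is to do the arithmetic with a dplyr verb instead of multiplying the whole data frame. This assumes dplyr >= 1.0 (for `across()` and `where()`) and tidyr >= 1.0 (for `nest(data = -group)`); `negate_numeric` is a hypothetical helper name, not something from the original posts.

```r
library(dplyr)
library(tidyr)
library(purrr)

mydata <- tibble(group = c("a", "a", "a", "b", "b", "b"),
                 x = c(1, 2, 3, 5, 6, 7),
                 y = c(3, 5, 6, 4, 3, 2))

# Hypothetical helper: negate every numeric column with mutate(),
# which returns a tibble when given a tibble.
negate_numeric <- function(df) {
  mutate(df, across(where(is.numeric), ~ -.x))
}

res <- mydata %>%
  nest(data = -group) %>%
  # map_if() accepts a logical vector as the predicate, one entry per list element
  mutate(flipped_df = map_if(data, group %in% c("a"), negate_numeric))

class(res$flipped_df[[1]])
```

Group b is passed through unchanged, since `map_if()` only applies the function where the predicate is TRUE.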


#4

Hi Mara! Thanks for helping.

What I want is simple: I want to multiply the dataframe contained in the list-column data by minus one, but only if the corresponding group is in a given list (say, it's a).

Of course, that only makes sense for numeric columns, which is why I use mutate_if in my flip function. However, this does not seem to work, as R is complaining about the date variable.

My second example above should make this clearer. Please let me know.


#5

That's because a negative date is not a thing, I believe.

library(lubridate)

mydate <- ymd("2018-01-01")
-mydate
#> Error in `-.Date`(mydate): unary - is not defined for "Date" objects
mydate * -1
#> Error in Ops.Date(mydate, -1): * not defined for "Date" objects

Created on 2018-09-11 by the reprex package (v0.2.0.9000).


#6

Thanks! But isn't my flip function filtering the dates out?


#7

No. I'm actually unclear on what you're trying to do with your conditional map, since you pass data (which is all dates), and group (where you have a).

library(tidyverse)
library(lubridate)

mydata <- data_frame(group = c('a', 'a', 'a', 'b', 'b', 'b'),
                     data = c(ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01')),
                     y = c(3,5,6,4,3,2))

mydata %>%
  filter(group %in% c('a'))
#> # A tibble: 3 x 3
#>   group data           y
#>   <chr> <date>     <dbl>
#> 1 a     2018-01-01     3
#> 2 a     2018-01-01     5
#> 3 a     2018-01-01     6

Created on 2018-09-11 by the reprex package (v0.2.0.9000).


#8

It doesn't, since dates are stored as doubles:

is.double(Sys.Date())
#> [1] TRUE

Created on 2018-09-11 by the reprex package (v0.2.0).
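
To make the point above a bit more concrete, here is a small base-R sketch of how a Date looks to the usual type predicates. The helper name `is_plain_double` is made up for illustration; `inherits(x, "Date")` is used in place of lubridate's `is.Date()` only to keep the example free of package dependencies.

```r
d <- as.Date("2018-01-01")

typeof(d)            # "double": the underlying storage type
is.double(d)         # TRUE, so a predicate of is.double also selects Date columns
inherits(d, "Date")  # TRUE: the class attribute is what marks it as a date
is.numeric(d)        # FALSE: base R defines is.numeric() to return FALSE for Dates
unclass(d)           # 17532, the number of days since 1970-01-01

# Hypothetical predicate: TRUE for ordinary doubles, FALSE for Dates
is_plain_double <- function(x) is.double(x) && !inherits(x, "Date")

is_plain_double(c(3, 5, 6))  # TRUE
is_plain_double(d)           # FALSE
```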


#9

Ha! That is the issue. Thanks! But how can I modify the doubles but not the dates? Using is.numeric instead filters out the dates, but also removes the doubles....

So is there a way to say: mutate the doubles, but not the dates?


#10

using

flip <- function(df) {
  mutate_if(df, (is.double) && !(is.Date), funs(-.))}

does not work


#11

You can define a small helper function to test for Dateness:

library(tidyverse)
library(lubridate)
mydata <- data_frame(group = c('a', 'a', 'a', 'b', 'b', 'b'),
                     data = c(ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01')),
                     y = c(3,5,6,4,3,2))

not_date <- function(x){
  is.double(x) && !is.Date(x)
}

flip <- function(df) {
  mutate_if(df, not_date, funs(-.))}

res <- mydata %>% group_by(group) %>% 
  nest() %>% 
  mutate(flipped_df = map_if(data, group %in% c('a'), flip))

Created on 2018-09-11 by the reprex package (v0.2.0).


#12

Cool, thanks! At the same time, it's a bit weird that I have to use a helper function here?!


#13

You don't have to, it's just pulling out a piece of logic. Since you want only items that are doubles, and dates are doubles, but you don't want dates, you're specifying those "rules" regardless of where you do so.


#14

As @mara said, you don't have to pull this function out:

library(tidyverse)
library(lubridate)
mydata <- data_frame(group = c('a', 'a', 'a', 'b', 'b', 'b'),
                     data = c(ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01')),
                     y = c(3,5,6,4,3,2))

flip <- function(df) {
  mutate_if(df, funs(is.double(.) && !is.Date(.)), funs(-.))}

res <- mydata %>% group_by(group) %>% 
  nest() %>% 
  mutate(flipped_df = map_if(data, group %in% c('a'), flip))

Created on 2018-09-11 by the reprex package (v0.2.0).
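
For readers on newer package versions, where `data_frame()`, `funs()` and `mutate_if()` are deprecated or superseded, the same idea can be sketched with `across()` and `where()`. This is only a hedged translation of the approach above, assuming dplyr >= 1.0 and tidyr >= 1.0; the date column is renamed to `date` here (the posts call it `data`) to avoid clashing with the nested list-column, and `inherits(.x, "Date")` stands in for lubridate's `is.Date()`.

```r
library(dplyr)
library(tidyr)
library(purrr)

mydata <- tibble(group = c("a", "a", "a", "b", "b", "b"),
                 date = rep(as.Date("2018-01-01"), 6),  # renamed from `data` (assumption)
                 y = c(3, 5, 6, 4, 3, 2))

# Negate columns that are stored as doubles but are not Dates
flip <- function(df) {
  mutate(df, across(where(~ is.double(.x) && !inherits(.x, "Date")), ~ -.x))
}

res <- mydata %>%
  nest(data = -group) %>%
  mutate(flipped_df = map_if(data, group %in% c("a"), flip))

res$flipped_df[[1]]
```

Only `y` in group a should change sign; the `date` column and all of group b are left as they were.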