How to transform a dataframe in a map_if call?

Consider this simple example:

mydata <- data_frame(group = c('a', 'a', 'a', 'b', 'b', 'b'),
           x = c(1,2,3,5,6,7),
           y = c(3,5,6,4,3,2))

# A tibble: 6 x 3
  group     x     y
  <chr> <dbl> <dbl>
1 a         1     3
2 a         2     5
3 a         3     6
4 b         5     4
5 b         6     3
6 b         7     2

Here, I want to nest by group, and multiply the corresponding list-column dataframe by a given constant ONLY IF the group is a.

Something like:

> mydata %>% group_by(group) %>% 
>   nest() %>% 
>   mutate(flipped_df = map_if(data, group %in% c('a'), ~.x*-1))
> # A tibble: 2 x 3
>   group data             flipped_df          
>   <chr> <list>           <list>              
> 1 a     <tibble [3 x 2]> <data.frame [3 x 2]>
> 2 b     <tibble [3 x 2]> <tibble [3 x 2]>

I don't understand why I don't have a tibble anymore here. Any ideas?
Thanks!

I was able to come up with a partial solution, but it still does not work (because of the date?!)



mydata <- data_frame(group = c('a', 'a', 'a', 'b', 'b', 'b'),
           data = c(ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01')),
           y = c(3,5,6,4,3,2))

flip <- function(df) {
  mutate_if(df, is.double, funs(-.))}

mydata %>% group_by(group) %>% 
  nest() %>% 
  mutate(flipped_df = map_if(data, group %in% c('a'), flip))
Error in mutate_impl(.data, dots) : 
  Evaluation error: Evaluation error: unary - is not defined for "Date" objects..

I'm not sure I understand your desired output here. Do you want to keep only group a? If not, you haven't defined a behaviour for group b.

From the purrr docs:

map(), map_if() and map_at() always return a list. See the modify() family for versions that return an object of the same type as the input.
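To make that distinction concrete, here is a small sketch that is not from the thread. The data frame `df` and its columns are made up for illustration. It applies the same conditional negation once with map_if() and once with modify_if(), and only the second keeps the data frame type.

```r
library(purrr)

# Hypothetical toy input: one numeric column, one character column
df <- data.frame(x = c(1, 2, 3),
                 label = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# map_if() negates the numeric column but hands back a plain list
mapped <- map_if(df, is.numeric, ~ -.x)

# modify_if() applies the same change and keeps the input's type
modified <- modify_if(df, is.numeric, ~ -.x)

class(mapped)    # a list, no longer a data frame
class(modified)  # still a data frame
```

This only illustrates the return types. In the nested example from the question, map_if() is applied to the list-column itself, so getting a list back is what mutate() needs.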

Currently, here's what you're generating at each step.

library(tidyverse)
mydata <- data_frame(group = c('a', 'a', 'a', 'b', 'b', 'b'),
                     x = c(1,2,3,5,6,7),
                     y = c(3,5,6,4,3,2))
mydata %>%
  group_by(group) 
#> # A tibble: 6 x 3
#> # Groups:   group [2]
#>   group     x     y
#>   <chr> <dbl> <dbl>
#> 1 a         1     3
#> 2 a         2     5
#> 3 a         3     6
#> 4 b         5     4
#> 5 b         6     3
#> 6 b         7     2


mydata %>%
  group_by(group) %>%
  nest()
#> # A tibble: 2 x 2
#>   group data            
#>   <chr> <list>          
#> 1 a     <tibble [3 × 2]>
#> 2 b     <tibble [3 × 2]>

mydata %>%
  group_by(group) %>%
  nest() %>% 
  mutate(flipped_df = map_if(data, group %in% c('a'), ~.x*-1))
#> # A tibble: 2 x 3
#>   group data             flipped_df          
#>   <chr> <list>           <list>              
#> 1 a     <tibble [3 × 2]> <data.frame [3 × 2]>
#> 2 b     <tibble [3 × 2]> <tibble [3 × 2]>

Created on 2018-09-11 by the reprex package (v0.2.0.9000).
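As for why group a ends up as a data.frame: as far as I can tell, arithmetic such as `.x * -1` on a tibble goes through base R's data frame arithmetic, which returns a plain data.frame. A minimal sketch, assuming that is the cause, is to convert the result back with as_tibble() inside the lambda. I use tibble() here in place of data_frame(), which newer tibble releases deprecate.

```r
library(tidyverse)

mydata <- tibble(group = c('a', 'a', 'a', 'b', 'b', 'b'),
                 x = c(1, 2, 3, 5, 6, 7),
                 y = c(3, 5, 6, 4, 3, 2))

res <- mydata %>%
  group_by(group) %>%
  nest() %>%
  # convert the arithmetic result back to a tibble for the flipped group
  mutate(flipped_df = map_if(data, group %in% c('a'), ~ as_tibble(.x * -1)))

# every element of the new list-column should now be a tibble
map_lgl(res$flipped_df, is_tibble)
```

Group b is untouched by map_if(), so its element stays the original tibble.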


Hi Mara! Thanks for helping.

What I want is simple: I want to multiply the dataframe contained in the list-column data by minus one only if the corresponding group is in a given list (say it's a).

Of course, that makes sense only for columns that are numeric, which is why I use a map_if here. However, it seems this is not working, as R is complaining about the date variable.

My example above should be clearer. Please let me know.

That's because a negative date is not a thing, I believe.

library(lubridate)

mydate <- ymd("2018-01-01")
-mydate
#> Error in `-.Date`(mydate): unary - is not defined for "Date" objects
mydate * -1
#> Error in Ops.Date(mydate, -1): * not defined for "Date" objects

Created on 2018-09-11 by the reprex package (v0.2.0.9000).

Thanks! But isn't my flip function filtering the dates out?

No. I'm actually unclear on what you're trying to do with your conditional map, since you pass data (which is all dates), and group (where you have a).

library(tidyverse)
library(lubridate)

mydata <- data_frame(group = c('a', 'a', 'a', 'b', 'b', 'b'),
                     data = c(ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01')),
                     y = c(3,5,6,4,3,2))

mydata %>%
  filter(group %in% c('a'))
#> # A tibble: 3 x 3
#>   group data           y
#>   <chr> <date>     <dbl>
#> 1 a     2018-01-01     3
#> 2 a     2018-01-01     5
#> 3 a     2018-01-01     6

Created on 2018-09-11 by the reprex package (v0.2.0.9000).


It doesn't, since dates are stored as doubles:

is.double(Sys.Date())
#> [1] TRUE

Created on 2018-09-11 by the reprex package (v0.2.0).
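For comparison, here is a base R sketch of how the usual type predicates treat a Date next to a plain double vector. The names `d` and `v` are only illustrative.

```r
d <- as.Date("2018-01-01")
v <- c(3, 5, 6)

typeof(d)            # "double": the underlying storage of a Date
is.double(d)         # TRUE, so is.double() alone cannot exclude dates
is.numeric(d)        # FALSE: base R treats Date as non-numeric here
inherits(d, "Date")  # TRUE: a direct class check

is.double(v)         # TRUE
is.numeric(v)        # TRUE
```

Going by these results, is.numeric() may already be a workable predicate for "doubles but not dates", though I have not checked that against the exact session used in this thread.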


Ha! That is the issue. Thanks! But how can I modify the doubles but not the dates? Using is.numeric instead filters out the dates, but also removes the doubles...

So is there a way to say: mutate the doubles, but not the dates?

using

flip <- function(df) {
  mutate_if(df, (is.double) && !(is.Date), funs(-.))}

does not work

You can define a small helper function to test for Dateness:

library(tidyverse)
library(lubridate)
mydata <- data_frame(group = c('a', 'a', 'a', 'b', 'b', 'b'),
                     data = c(ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01')),
                     y = c(3,5,6,4,3,2))

not_date <- function(x){
  is.double(x) && !is.Date(x)
}

flip <- function(df) {
  mutate_if(df, not_date, funs(-.))}

res <- mydata %>% group_by(group) %>% 
  nest() %>% 
  mutate(flipped_df = map_if(data, group %in% c('a'), flip))

Created on 2018-09-11 by the reprex package (v0.2.0).


Cool! Thanks! At the same time, isn't it a bit weird that I have to use a helper function here?!

You don't have to, it's just pulling out a piece of logic. Since you want only items that are doubles, and dates are doubles, but you don't want dates, you're specifying those "rules" regardless of where you do so.


As @mara said, you don't have to pull this function out:

library(tidyverse)
library(lubridate)
mydata <- data_frame(group = c('a', 'a', 'a', 'b', 'b', 'b'),
                     data = c(ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01'),ymd('2018-01-01')),
                     y = c(3,5,6,4,3,2))

flip <- function(df) {
  mutate_if(df, funs(is.double(.) && !is.Date(.)), funs(-.))}

res <- mydata %>% group_by(group) %>% 
  nest() %>% 
  mutate(flipped_df = map_if(data, group %in% c('a'), flip))

Created on 2018-09-11 by the reprex package (v0.2.0).
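If you are reading this with a newer dplyr (1.0.0 or later, where funs() is deprecated), the same idea can probably be written with across() and where(). This is a sketch under that assumption, not something posted in the thread. It uses tibble() in place of data_frame() and rep() to shorten the repeated date.

```r
library(tidyverse)
library(lubridate)

mydata <- tibble(group = c('a', 'a', 'a', 'b', 'b', 'b'),
                 data = rep(ymd('2018-01-01'), 6),
                 y = c(3, 5, 6, 4, 3, 2))

# Negate only the columns that are doubles but not Dates
flip <- function(df) {
  mutate(df, across(where(~ is.double(.x) && !is.Date(.x)), ~ -.x))
}

res <- mydata %>%
  group_by(group) %>%
  nest() %>%
  mutate(flipped_df = map_if(data, group %in% c('a'), flip))

res$flipped_df[[1]]  # group a: y negated, dates untouched
res$flipped_df[[2]]  # group b: unchanged
```

The predicate is the same rule as in not_date(), written inline as a formula.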
