Dplyr: mutate_all & rowwise

Enuma · November 11, 2019, 6:14pm

Suppose you have an EEG dataset where every column is an electrode, E1 ... E129.
First, you want to compute the spatial standard deviation, GFP. How would you rewrite rowSD using dplyr rowwise? Next, you want to normalize each electrode by dividing it by GFP. What goes into the mutate_all? Thanks!

library(dplyr)
library(RFunctionsSN)
tibble(E1 = rnorm(10), E2 = rnorm(10), E3 = rnorm(10), E4 = rnorm(10)) %>% 
  mutate(GFP = rowSD(.)) %>% 
  mutate_all()

technocrat · November 11, 2019, 6:55pm

Is this the rowSD() function from https://github.com/snandi/RFunctionsSN/blob/master/R/rowSD.R?

A reproducible example, called a reprex, is really helpful.

And is GFP global field power, or what?

mattwarkentin · November 11, 2019, 6:56pm

Hi @Enuma,

I'm not actually that familiar with rowwise, but here is one way to do it: pivot_longer(), do your calculations, and then pivot_wider() to get back to the original shape.

library(dplyr)
library(tidyr)  # pivot_longer(), pivot_wider()

tibble(E1 = rnorm(10), E2 = rnorm(10), E3 = rnorm(10), E4 = rnorm(10)) %>% 
  mutate(id = 1:n()) %>%             # row identifier to group on
  pivot_longer(E1:E4) %>%            # one row per (id, electrode)
  group_by(id) %>% 
  mutate(row_sd = sd(value),         # SD across electrodes within each original row
         normalized = value / row_sd) %>% 
  pivot_wider(names_from = name, values_from = normalized, id_cols = id)

Enuma · November 11, 2019, 7:35pm

Sorry, I thought it was exhaustive enough. Yes, rowSD is from RFunctionsSN, that's why I load it at the beginning. GFP is global field power, yes.

Enuma · November 11, 2019, 7:39pm

Thanks, this is also good. I wanted to learn rowwise on this particular example, because I struggled with other online examples.

mattwarkentin · November 11, 2019, 8:01pm

Okay, here is another more row-wise approach. Let me know if this is more in line with what you hoped, @Enuma.

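library(dplyr)
library(purrr)  # pmap_dbl(), lift_vd()

tibble(E1 = rnorm(10), E2 = rnorm(10), E3 = rnorm(10), E4 = rnorm(10)) %>% 
  mutate(row_sd = pmap_dbl(., lift_vd(sd))) %>%        # sd() over the values of each row
  mutate_at(vars(E1:E4), list(norm = ~ . / row_sd))    # adds E1_norm ... E4_norm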

Or if you want to remove the original E1:E4 variables:

tibble(E1 = rnorm(10), E2 = rnorm(10), E3 = rnorm(10), E4 = rnorm(10)) %>% 
  mutate(row_sd = pmap_dbl(., lift_vd(sd))) %>% 
  transmute_at(vars(E1:E4), list(norm = ~ . / row_sd))
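
For readers on newer package versions: dplyr 1.0.0, released after this thread, added c_across() and across(), and purrr 1.0.0 deprecated the lift_*() helpers. Under those assumptions, a minimal sketch of the rowwise version the question asked about might look like this. The names eeg and normalized and the seed are illustrative only, and sd() stands in for rowSD().

```r
library(dplyr)  # assumes dplyr >= 1.0.0 for c_across() and across()

set.seed(1)  # illustrative seed, not from the thread
eeg <- tibble(E1 = rnorm(10), E2 = rnorm(10), E3 = rnorm(10), E4 = rnorm(10))

normalized <- eeg %>%
  rowwise() %>%                            # treat each row as its own group
  mutate(GFP = sd(c_across(E1:E4))) %>%    # SD across electrodes within the row
  ungroup() %>%                            # back to ordinary column-wise operations
  mutate(across(E1:E4, ~ .x / GFP))        # divide every electrode by its row's GFP
```

Calling ungroup() before the division lets the last step run as a plain vectorized division rather than once per row.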

technocrat · November 11, 2019, 8:08pm

The reason I asked is that RFunctionsSN isn't a package, as such, just a collection of functions, which makes them hard to track down.

For an N x 19 data frame, rowwise operations become extremely inefficient as N increases. After converting to a matrix object, applying a function to rows is much faster, and so is dividing all the terms. Converting back to a tibble is possible with as_tibble(), although you may need to restore the original column names.
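
The matrix route described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names eeg, m, gfp and eeg_norm and the seed are made up for the example, and base sd() is used in place of rowSD().

```r
library(tibble)

set.seed(1)  # illustrative seed, not from the thread
eeg <- tibble(E1 = rnorm(10), E2 = rnorm(10), E3 = rnorm(10), E4 = rnorm(10))

m   <- as.matrix(eeg)     # numeric matrix, column names preserved
gfp <- apply(m, 1, sd)    # one standard deviation per row
m_norm <- m / gfp         # gfp is recycled down each column, so row i is divided by gfp[i]

eeg_norm <- as_tibble(m_norm)  # column names carry over from the matrix
```

The division relies on R storing matrices column by column: a vector whose length equals the number of rows lines up with the rows of every column.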

Enuma · November 11, 2019, 9:20pm

Thanks, that is useful info.

Enuma · November 11, 2019, 9:21pm

Nice, seems quite elegant! Thank you.
