Okay, so I'm looking for a tidy solution to the following:
Given a tibble d:
library('tidyverse')
set.seed(733744)
n = 10
d = tibble(x1 = rnorm(n), x2 = rnorm(n),
y1 = rnorm(n), y2 = rnorm(n),
z1 = rnorm(n), z2 = rnorm(n))
which prints as:
> d
# A tibble: 10 x 6
x1 x2 y1 y2 z1 z2
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1.40 -1.59 0.0458 -0.558 0.484 0.794
2 1.24 0.124 -0.0210 -1.57 -0.234 -2.30
3 -0.234 -1.93 0.804 0.845 -1.90 0.00116
4 0.549 1.12 -0.221 -0.421 0.169 1.11
5 0.633 -0.140 0.00652 -0.200 0.202 1.12
6 -0.257 0.963 -1.86 -0.208 0.237 0.544
7 0.283 -0.152 1.47 0.423 0.747 0.518
8 2.31 -1.46 -0.908 0.603 -0.506 0.850
9 -0.616 0.165 0.651 -0.0481 -1.05 -0.619
10 0.559 -1.18 0.878 -1.19 -1.91 1.02
I want to calculate the row-wise sd() across the x and y columns (x1, x2, y1, y2), but NOT the z columns. The following works, but is not tidy IMHO:
# Equivalent to rowMeans, but for sd
rowSds = function(x){ return(apply(x, 1, sd)) }
# Calculate sd across columns whose names contain x or y
d %>% mutate(sd_xy = d %>% select(matches("x|y")) %>% rowSds)
So I am looking for something like this, but without having to write out the variable names:
d %>% rowwise %>% mutate(sd_xy = c(x1, x2, y1, y2) %>% sd) %>% ungroup
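Is there some NSE (non-standard evaluation) way to do this elegantly, selecting the columns with a helper such as matches() instead of naming them one by one?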
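To make the goal concrete, here is a rough sketch of the shape I have in mind. It assumes dplyr 1.0.0 or later, where c_across() accepts tidyselect helpers inside a rowwise() data frame. I have not confirmed that this is the recommended approach, so treat it as an illustration rather than a definitive answer.

```r
library(dplyr)

set.seed(733744)
n <- 10
d <- tibble(x1 = rnorm(n), x2 = rnorm(n),
            y1 = rnorm(n), y2 = rnorm(n),
            z1 = rnorm(n), z2 = rnorm(n))

# Row-wise sd over every column whose name contains x or y.
# Assumption: c_across() is available (dplyr >= 1.0.0).
res <- d %>%
  rowwise() %>%
  mutate(sd_xy = sd(c_across(matches("x|y")))) %>%
  ungroup()

res
```

One caveat with this sketch: matches("x|y") would also pick up sd_xy itself if the same pipeline were run again on res, so a stricter pattern such as "^[xy]" may be safer.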