I have a data frame with a variable named "Marker" that has two values for each sample tested.
An example of the data frame is as follows:
   Sample.File Sample.Name Marker value
1            a         a_1    xxx    16
2            a         a_1    xxx    18
3            a         a_1    yyy    16
4            a         a_1    yyy    20
5            a         a_1    zzz     9
6            a         a_1    zzz    13
7            b         b_1    xxx    10
8            b         b_1    xxx    10
9            b         b_1    yyy     6
10           b         b_1    yyy    12
11           b         b_1    zzz    14
12           b         b_1    zzz    14
It can be reproduced with the following code:
data <- data.frame(
  Sample.File = as.factor(c("a", "a", "a", "a", "a", "a", "b", "b", "b", "b",
                            "b", "b")),
  Sample.Name = as.factor(c("a_1", "a_1", "a_1", "a_1", "a_1", "a_1", "b_1",
                            "b_1", "b_1", "b_1", "b_1", "b_1")),
  Marker = as.factor(c("xxx", "xxx", "yyy", "yyy", "zzz", "zzz", "xxx",
                       "xxx", "yyy", "yyy", "zzz", "zzz")),
  value = c(16L, 18L, 16L, 20L, 9L, 13L, 10L, 10L, 6L, 12L, 14L, 14L)
)
I'd like to reshape the data frame from long to wide format, keeping the columns Sample.File and Sample.Name for every sample and spreading the entries of the "value" column into new variables named after each marker (e.g. xxx and xxx.1, yyy and yyy.1, zzz and zzz.1).
The table I'd like to obtain looks like this:
  Sample.File Sample.Name xxx xxx.1 yyy yyy.1 zzz zzz.1
1           a         a_1  16    18  16    20   9    13
2           b         b_1  10    10   6    12  14    14
I'd like the code to work without my having to type the labels found in the "Marker" column, since there could be up to 100 different labels.
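For illustration, here is a minimal base R sketch of one way this reshaping could be done without typing any marker label. It assumes that the second occurrence of a marker within a sample should get the suffix ".1" (a third would get ".2", and so on); the helper names `rep_idx`, `marker` and `col` are made up for this example.

```r
# Example data, equivalent to the data frame shown above
data <- data.frame(
  Sample.File = factor(rep(c("a", "b"), each = 6)),
  Sample.Name = factor(rep(c("a_1", "b_1"), each = 6)),
  Marker      = factor(rep(rep(c("xxx", "yyy", "zzz"), each = 2), times = 2)),
  value       = c(16L, 18L, 16L, 20L, 9L, 13L, 10L, 10L, 6L, 12L, 14L, 14L)
)

# Count the occurrences of each marker within a sample: 0 = first, 1 = second
rep_idx <- ave(seq_along(data$value),
               data$Sample.File, data$Sample.Name, data$Marker,
               FUN = seq_along) - 1L

# Build the target column names: "xxx" for the first value, "xxx.1" for the second
marker   <- as.character(data$Marker)
data$col <- ifelse(rep_idx == 0L, marker, paste(marker, rep_idx, sep = "."))

# Long to wide: one row per sample, one column per generated name
wide <- reshape(
  data[, c("Sample.File", "Sample.Name", "col", "value")],
  idvar     = c("Sample.File", "Sample.Name"),
  timevar   = "col",
  direction = "wide"
)

# reshape() prefixes the new columns with "value."; strip that prefix
names(wide)    <- sub("^value\\.", "", names(wide))
rownames(wide) <- NULL
print(wide)
```

Because the column names are derived from the data, the same code should work unchanged for any number of distinct markers.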
I tried the following code, but it did not give the result I wanted:
library(dplyr)
library(tidyr)
data %>%
  gather(Sample.File, Sample.Name) %>%
  spread(value)
Instead, I got the following error and warning:
Error: `var` must evaluate to a single number or a column name, not a double vector
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
attributes are not identical across measure variables;
they will be dropped
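A likely reading of these messages: in `gather()`, the first two arguments after the data are the names of the new key and value columns, not columns to keep, so the call above stacks all four columns into one (hence the warning about attributes), and `spread()` needs both a key and a value column but receives only one argument. The sketch below shows one possible alternative using `pivot_wider()`, assuming tidyr 1.0.0 or later is installed; the helper column `dup` is a made-up name that numbers repeated markers within each sample.

```r
library(dplyr)
library(tidyr)

# Example data, equivalent to the data frame shown above
data <- data.frame(
  Sample.File = factor(rep(c("a", "b"), each = 6)),
  Sample.Name = factor(rep(c("a_1", "b_1"), each = 6)),
  Marker      = factor(rep(rep(c("xxx", "yyy", "zzz"), each = 2), times = 2)),
  value       = c(16L, 18L, 16L, 20L, 9L, 13L, 10L, 10L, 6L, 12L, 14L, 14L)
)

wide <- data %>%
  # Number the repeats of each marker within a sample: 0 = first, 1 = second
  group_by(Sample.File, Sample.Name, Marker) %>%
  mutate(dup = row_number() - 1L) %>%
  ungroup() %>%
  # Make the marker labels unique: "xxx" stays, the repeat becomes "xxx.1"
  mutate(Marker = if_else(dup == 0L,
                          as.character(Marker),
                          paste(Marker, dup, sep = "."))) %>%
  select(-dup) %>%
  # One column per unique label, filled from the value column
  pivot_wider(names_from = Marker, values_from = value)

print(wide)
```

With an older tidyr, replacing the last step with `spread(Marker, value)` should be the equivalent call, since the marker labels are unique within each sample by that point.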
I'd be very grateful for any help with this!