Mean of columns and insertion of results as a row

Hi, I recently started using RStudio and I am slowly making progress.
I mostly use it for quick data analysis and for generating plots from scientific results (biology).

I am currently analysing a data set and I would like to show my results as a side-by-side bar chart.
I measured 2 samples from 6 proteins, and I would like to compute the mean of each column, insert it as a new row and, later, work out how to generate the side-by-side bar chart.

I tried different approaches, but it is much more complicated than I thought.

I used the colMeans command, leaving out the first column, but I do not know how to add the results as a new row afterwards.

My real data also has more than 20 columns. Is there an efficient way to handle a large number of columns, rather than a small data set like my reprex?

Thank you in advance.

data.frame(
  stringsAsFactors = FALSE,
              ...1 = c("Val1", "Val2"),
             Prot1 = c(0.525, 0.534),
             Prot2 = c(0.469, 0.476),
             Prot3 = c(0.464, 0.457),
             Prot4 = c(0.525, 0.509),
             Prot5 = c(0.509, 0.511)
)
#>   ...1 Prot1 Prot2 Prot3 Prot4 Prot5
#> 1 Val1 0.525 0.469 0.464 0.525 0.509
#> 2 Val2 0.534 0.476 0.457 0.509 0.511

Your data layout is not very convenient to work with. I would reshape it as shown below and then use ggplot to make a bar plot. Is that the kind of plot you plan to use?

DF <- data.frame(
  stringsAsFactors = FALSE,
  ...1 = c("Val1", "Val2"),
  Prot1 = c(0.525, 0.534),
  Prot2 = c(0.469, 0.476),
  Prot3 = c(0.464, 0.457),
  Prot4 = c(0.525, 0.509),
  Prot5 = c(0.509, 0.511)
)
library(ggplot2)
library(tidyr)
library(dplyr)
# Reshape to long format: one row per sample/protein pair
DFlong <- pivot_longer(DF, Prot1:Prot5, names_to = "Protein", values_to = "Value")
DFlong
#> # A tibble: 10 x 3
#>    ...1  Protein Value
#>    <chr> <chr>   <dbl>
#>  1 Val1  Prot1   0.525
#>  2 Val1  Prot2   0.469
#>  3 Val1  Prot3   0.464
#>  4 Val1  Prot4   0.525
#>  5 Val1  Prot5   0.509
#>  6 Val2  Prot1   0.534
#>  7 Val2  Prot2   0.476
#>  8 Val2  Prot3   0.457
#>  9 Val2  Prot4   0.509
#> 10 Val2  Prot5   0.511
# Mean of the two samples for each protein
Stats <- DFlong %>% group_by(Protein) %>%
  summarize(Mean = mean(Value))
Stats
#> # A tibble: 5 x 2
#>   Protein  Mean
#>   <chr>   <dbl>
#> 1 Prot1   0.530
#> 2 Prot2   0.472
#> 3 Prot3   0.460
#> 4 Prot4   0.517
#> 5 Prot5   0.51
ggplot(Stats, aes(Protein, Mean)) + geom_col(fill = "skyblue")

Created on 2020-05-11 by the reprex package (v0.3.0)
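
If you do want to keep the wide layout and append the means as a new row, as asked in the original question, here is a minimal base R sketch. It is only one possible approach, not the method used in the reply above. The column name `Sample` (replacing `...1`) and the helper names `num_cols`, `col_means`, `mean_row`, `DF_with_mean` and `bar_heights` are assumptions introduced for illustration. The numeric columns are detected automatically, so the same code should work with 20 or more columns. The grouped plot at the end uses base `barplot()` rather than ggplot.

```r
# Same values as the reprex; the first column is renamed to "Sample"
# for readability (an assumption, not a name from the original data).
DF <- data.frame(
  Sample = c("Val1", "Val2"),
  Prot1 = c(0.525, 0.534),
  Prot2 = c(0.469, 0.476),
  Prot3 = c(0.464, 0.457),
  Prot4 = c(0.525, 0.509),
  Prot5 = c(0.509, 0.511),
  stringsAsFactors = FALSE
)

# Select every numeric column automatically instead of naming each one.
num_cols <- vapply(DF, is.numeric, logical(1))

# colMeans() returns a named numeric vector with one mean per column.
col_means <- colMeans(DF[, num_cols, drop = FALSE])

# Turn the means into a one-row data frame with matching column names,
# then append it below the original rows.
mean_row <- data.frame(Sample = "Mean", t(col_means), stringsAsFactors = FALSE)
DF_with_mean <- rbind(DF, mean_row)
DF_with_mean

# Side-by-side bars for the two samples, one group per protein.
bar_heights <- as.matrix(DF[, num_cols, drop = FALSE])
rownames(bar_heights) <- DF$Sample
barplot(bar_heights, beside = TRUE, legend.text = TRUE, ylab = "Value")
```

With the long-format DFlong from the reply above, mapping the sample column to `fill` and using `geom_col(position = "dodge")` in ggplot2 should give a comparable grouped layout. Note that a column named `...1` has to be wrapped in backticks inside `aes()`.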
