Subtract the value in the first row from the other rows of a column

fgaascht · May 22, 2020, 9:44pm

Hi,

I have the following dataframe and I am wondering how to subtract the value in the first row from the other values in the same column.
I would also be interested to know how to subtract any column or value from other rows.

data.frame(
  stringsAsFactors = FALSE,
            Sample = c("Blank", "ProtA", "ProtB", "ProtC"),
          Activity = c(0.033, 0.245, 0.31, 0.105)
)
#>   Sample Activity
#> 1  Blank    0.033
#> 2  ProtA    0.245
#> 3  ProtB    0.310
#> 4  ProtC    0.105

Thank you in advance.

FJCC · May 22, 2020, 9:54pm

Here are two methods. In the first, the values are overwritten in the original column; in the second, which I prefer, a new column is made with the adjusted values.

DF <- data.frame(
  stringsAsFactors = FALSE,
  Sample = c("Blank", "ProtA", "ProtB", "ProtC"),
  Activity = c(0.033, 0.245, 0.31, 0.105)
)
DF
#>   Sample Activity
#> 1  Blank    0.033
#> 2  ProtA    0.245
#> 3  ProtB    0.310
#> 4  ProtC    0.105
DF$Activity <- DF$Activity - DF[1,2]
DF
#>   Sample Activity
#> 1  Blank    0.000
#> 2  ProtA    0.212
#> 3  ProtB    0.277
#> 4  ProtC    0.072


#Make a new column
DF <- data.frame(
  stringsAsFactors = FALSE,
  Sample = c("Blank", "ProtA", "ProtB", "ProtC"),
  Activity = c(0.033, 0.245, 0.31, 0.105)
)
DF
#>   Sample Activity
#> 1  Blank    0.033
#> 2  ProtA    0.245
#> 3  ProtB    0.310
#> 4  ProtC    0.105
DF$ActivityAdj <- DF$Activity - DF[1,2]
DF
#>   Sample Activity ActivityAdj
#> 1  Blank    0.033       0.000
#> 2  ProtA    0.245       0.212
#> 3  ProtB    0.310       0.277
#> 4  ProtC    0.105       0.072

Created on 2020-05-22 by the reprex package (v0.3.0)
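The same pattern may also cover the second part of the question (subtracting an arbitrary value or another column). This is only a sketch in base R: the Background column and its values are hypothetical and exist just to show a column-from-column subtraction.

```r
DF <- data.frame(
  stringsAsFactors = FALSE,
  Sample = c("Blank", "ProtA", "ProtB", "ProtC"),
  Activity = c(0.033, 0.245, 0.31, 0.105)
)

# Look the reference value up by name instead of by position,
# so it still works if the Blank row is not the first one.
ref <- DF$Activity[DF$Sample == "Blank"]
DF$ActivityAdj <- DF$Activity - ref

# Subtracting one column from another works element by element.
# Background is a made-up column used only for illustration.
DF$Background <- c(0.01, 0.02, 0.03, 0.01)
DF$Net <- DF$Activity - DF$Background
DF
```

Looking the value up by name assumes exactly one row is labelled "Blank"; with several matching rows the subtraction would recycle them, which is probably not what you want.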

fgaascht · May 22, 2020, 10:10pm

Thank you for your fast response @FJCC.

Is there a way to also delete the Blank row (now zero) directly, or is it something that I can only do afterwards, for example by selecting rows 2 to 4 to create a new dataframe?

Thanks again!

FJCC · May 22, 2020, 10:17pm

The easiest method is to select rows 2 to 4, as you suggest.
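A minimal sketch of that row selection, reusing the example data from above. The name DF_noblank is just an illustrative choice, and the name-based filter at the end is an alternative I am adding, not something required.

```r
DF <- data.frame(
  stringsAsFactors = FALSE,
  Sample = c("Blank", "ProtA", "ProtB", "ProtC"),
  Activity = c(0.033, 0.245, 0.31, 0.105)
)
DF$ActivityAdj <- DF$Activity - DF$Activity[1]

# Keep rows 2 to 4; DF[-1, ] drops the first row and gives the same result.
DF_noblank <- DF[2:4, ]

# Alternative: filter by name, which does not depend on the row position.
DF_byname <- DF[DF$Sample != "Blank", ]
DF_noblank
```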

fgaascht · May 23, 2020, 12:56am

Thank you very much @FJCC!

fgaascht · May 23, 2020, 4:02am

I tried a more complex example that is more representative of my data, but I do not really understand why it does not work. The result is always equal to zero.

Is it because I group my samples in order to determine the mean for each group?

> Rstudio_Examples <- read_excel("Desktop/Rstudio-Examples.xlsx")
> View(Rstudio_Examples)                                                                     
> Enzyme <- Rstudio_Examples
> head(Enzyme, 8)[, c('Sample', '24H')]
# A tibble: 8 x 2
  Sample `24H`
  <chr>  <dbl>
1 Blank  0.033
2 Blank  0.035
3 ProtA  0.201
4 ProtA  0.188
5 ProtB  0.345
6 ProtB  0.321
7 ProtC  0.245
8 ProtC  0.222
> datapasta::df_paste(head(Enzyme, 8)[, c('Sample', '24H')])
> data.frame(
+   stringsAsFactors = FALSE,
+        check.names = FALSE,
+             Sample = c("Blank","Blank","ProtA",
+                        "ProtA","ProtB","ProtB","ProtC","ProtC"),
+              `24H` = c(0.033, 0.035, 0.201, 0.188, 0.345, 0.321, 0.245, 0.222)
+ )
  Sample   24H
1  Blank 0.033
2  Blank 0.035
3  ProtA 0.201
4  ProtA 0.188
5  ProtB 0.345
6  ProtB 0.321
7  ProtC 0.245
8  ProtC 0.222
> Enzymes_24H <- Enzyme %>% group_by(Sample) %>% summarize(Mean = mean(`24H`))
> View(Enzymes_24H)
> Enzymes_24H <- Enzymes_24H %>% rename("24H" = "Mean")
> View(Enzymes_24H)
> Enzymes_24H$ActivityAdj <- Enzymes_24H$"24H" - Enzymes_24H[1,2]
> View(Enzymes_24H)
> head(Enzymes_24H, 4)[, c('Sample', '24H', 'ActivityAdj')]
# A tibble: 4 x 3
  Sample `24H` ActivityAdj$`24H`
  <chr>  <dbl>             <dbl>
1 Blank  0.034                 0
2 ProtA  0.194                 0
3 ProtB  0.333                 0
4 ProtC  0.233                 0
>

Thank you in advance.

Yarnabrina · May 23, 2020, 11:43am

Does this help?

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

Enzyme <- tibble(Sample = c("Blank", "Blank", "ProtA", "ProtA", "ProtB", "ProtB", "ProtC", "ProtC"),
                 `24H` = c(0.033, 0.035, 0.201, 0.188, 0.345, 0.321, 0.245, 0.222))

Enzyme %>%
    group_by(Sample) %>%
    summarize(Mean = mean(`24H`)) %>%
    rename("24H" = "Mean") %>%
    mutate(ActivityAdj = `24H` - first(x = `24H`))
#> # A tibble: 4 x 3
#>   Sample `24H` ActivityAdj
#>   <chr>  <dbl>       <dbl>
#> 1 Blank  0.034       0    
#> 2 ProtA  0.194       0.160
#> 3 ProtB  0.333       0.299
#> 4 ProtC  0.233       0.199

As for why your attempt failed: when you extract a single element from a data.frame, you get a vector. But if you extract it from a tibble, you get another tibble, with one variable and one observation. Subtracting a tibble of such shape from a vector returns the result for the first element of the vector only, and that single result is then replicated to match the length of the column. That's why you are getting all zeros.

I arrived at this explanation only by checking a few combinations. The behaviour, especially that extracting one element still gives a tibble, seems odd to me, but there may be documentation somewhere that I missed.
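The difference can be checked with a small sketch like the one below. It assumes the tibble package is installed; the object names means, df, bad and blank are made up for illustration, and the values are the group means from the example.

```r
library(tibble)

means <- tibble(Sample = c("Blank", "ProtA", "ProtB", "ProtC"),
                `24H` = c(0.034, 0.1945, 0.333, 0.2335))
df <- as.data.frame(means)

# data.frame: [1, 2] drops down to a bare number
is.numeric(df[1, 2])

# tibble: [1, 2] stays a 1 x 1 tibble
is.data.frame(means[1, 2])

# vector minus 1 x 1 tibble: only the first element is used,
# so the result has a single row instead of four values
bad <- means[["24H"]] - means[1, 2]
dim(bad)

# [[ ]] pulls the column out as a plain vector, so the subtraction
# is carried out for every row
blank <- means[["24H"]][1]
means$ActivityAdj <- means[["24H"]] - blank
means
```

With this change the base-style line from the question should also work, for example Enzymes_24H$ActivityAdj <- Enzymes_24H[["24H"]] - Enzymes_24H[["24H"]][1].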

Hope this helps.

fgaascht · May 29, 2020, 3:39am

Hi @Yarnabrina,

It is exactly what I wanted. Thank you very much for your help and explanation!

system · June 5, 2020, 3:39am

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.