group_by then divide values of the column by a specific "row_name"

NeFer · March 22, 2021, 11:07am

Hi all, I would like to normalize my data based on the value in one specific row of each group.
Example data:

dfex <- data.frame(group = c("1","1","1","2","2","2","3","3","3"),
                   treat = rep(c("A","B", "C"), 3),
                   hight = rnorm(9),
                   weight = rnorm(9),
                   BMI = rnorm(9))

Now I would like to perform some operations and comparisons. Each group has an internal control for the experiment, which is my treatment "A". So far, the nicest way I have found to scale the data is:

dfex %>%
  mutate(val1 = hight/weight) %>%
  group_by(group) %>%
  mutate(val2 = scale(val1))

However, what I need is to divide within each group, i.e. val2 = (val1 / val1(from treat="A")), because a hand calculation shows me that it makes a difference in the subsequent statistical analysis.

I am following different R tutorials, but I cannot manage to reference that value properly.
If someone could help me with that, and point me to a tutorial or similar so I can understand the concept behind it (I am not sure which concept I need for the answer), I would really appreciate it.

gtmbini · March 22, 2021, 1:18pm

It is not entirely clear to me what you mean by val2 = (val1 / val1(from treat="A")). Do you want to divide val1 by the actual val1 value of treatment A within each group (1, 2, 3)? I am also not sure why you used the scale() function in the code above.
Here is something I tried; please see below.

library(tidyverse)
dfex <- data.frame(group = c("1","1","1","2","2","2","3","3","3"),
                   treat = rep(c("A","B", "C"), 3),
                   hight = rnorm(9),
                   weight = rnorm(9),
                   BMI = rnorm(9))

dfex<- dfex %>%
  mutate(val1 = hight/weight) %>%
  group_by(group) %>%
  mutate(val2 = val1/val1[treat == 'A'])
dfex
#> # A tibble: 9 x 7
#> # Groups:   group [3]
#>   group treat   hight weight     BMI    val1   val2
#>   <fct> <fct>   <dbl>  <dbl>   <dbl>   <dbl>  <dbl>
#> 1 1     A     -0.0484  0.990  0.227  -0.0489  1    
#> 2 1     B     -0.233   0.228 -0.0333 -1.02   20.9  
#> 3 1     C     -1.91    1.44   0.816  -1.32   27.0  
#> 4 2     A     -0.683  -0.847  0.0646  0.806   1    
#> 5 2     B     -1.13    0.902  0.482  -1.25   -1.55 
#> 6 2     C     -0.135  -0.902  0.940   0.150   0.186
#> 7 3     A     -0.546   0.941 -0.885  -0.580   1    
#> 8 3     B     -0.310   0.936 -0.292  -0.331   0.572
#> 9 3     C      0.198   0.910 -0.198   0.218  -0.375
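
As for the concept behind it: inside a grouped mutate(), treat == 'A' is a logical vector for the current group, and val1[treat == 'A'] picks out the control value, which R then recycles across the rows of that group. Here is a minimal sketch with fixed, made-up numbers (the toy data frame and its values are only for illustration, not your data) so the result is reproducible. It assumes there is exactly one "A" row per group; with none or several, the expression would not behave as intended.

```r
library(dplyr)

# Hypothetical toy data with fixed values so the output is reproducible
toy <- data.frame(
  group = c("1", "1", "1", "2", "2", "2"),
  treat = rep(c("A", "B", "C"), 2),
  val1  = c(2, 4, 6, 5, 10, 20)
)

res <- toy %>%
  group_by(group) %>%
  # treat == "A" is TRUE only on the control row of the current group,
  # so val1[treat == "A"] is a single number recycled over the group
  mutate(val2 = val1 / val1[treat == "A"]) %>%
  ungroup()

res$val2
#> [1] 1 2 3 1 2 4
```

The control row of each group ends up at exactly 1, and every other row is expressed as a multiple of its own group's control.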

NeFer · March 22, 2021, 1:45pm

Thank you! I think I had a typo when I tried your approach.

To answer your question:
In my code above I was trying to standardize the data so that I could compare it between groups, which is why I used scale() (which I won't use for the later analysis). However, I was losing the "control" sample as the reference, so I needed to correct that with exactly what you showed.
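
To make the difference concrete, here is a small sketch on made-up numbers (the toy values are hypothetical, chosen so the arithmetic is easy to check by hand). scale() centers and rescales within the group, so the control is no longer the reference, while dividing by the "A" row keeps the control at 1.

```r
library(dplyr)

# Hypothetical single group with fixed values
toy <- data.frame(
  group = c("1", "1", "1"),
  treat = c("A", "B", "C"),
  val1  = c(2, 4, 6)
)

cmp <- toy %>%
  group_by(group) %>%
  mutate(
    # scale() returns a one-column matrix; as.numeric() flattens it.
    # Result is (val1 - mean(val1)) / sd(val1) within the group.
    z     = as.numeric(scale(val1)),
    # Ratio relative to the control row of the same group
    ratio = val1 / val1[treat == "A"]
  ) %>%
  ungroup()

cmp$z
#> [1] -1  0  1
cmp$ratio
#> [1] 1 2 3
```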
