Hi there!
I have a dataset with three factors (line, dop and conc) and one measured variable, prol. Within each line there are four rows where both dop and conc are "control". Here is the data, as semicolon-separated text with decimal commas:
line;dop;conc;prol
a;undop;100;0,1540
a;undop;100;0,2770
a;undop;100;0,2460
a;0,0175;100;0,2030
a;0,0175;100;0,1630
a;0,0175;100;0,2300
a;0,015;100;0,2960
a;0,015;100;0,1070
a;0,015;100;0,2450
a;0,013;100;0,1890
a;0,013;100;0,2910
a;0,013;100;0,2490
a;0,02;100;0,1250
a;0,02;100;0,2910
a;0,02;100;0,2650
a;0,01;100;0,2040
a;0,01;100;0,1030
a;0,01;100;0,1100
a;0,005;100;0,1770
a;0,005;100;0,2890
a;0,005;100;0,1920
a;0,001;100;0,2820
a;0,001;100;0,2480
a;0,001;100;0,1320
a;control;control;0,1640
a;undop;10;0,2920
a;undop;10;0,2580
a;undop;10;0,1900
a;0,0175;10;0,2060
a;0,0175;10;0,2860
a;0,0175;10;0,1010
a;0,015;10;0,2720
a;0,015;10;0,1300
a;0,015;10;0,2720
a;0,013;10;0,2760
a;0,013;10;0,2910
a;0,013;10;0,2630
a;0,02;10;0,1900
a;0,02;10;0,2710
a;0,02;10;0,1770
a;0,01;10;0,2980
a;0,01;10;0,2580
a;0,01;10;0,1500
a;0,005;10;0,3000
a;0,005;10;0,2510
a;0,005;10;0,1990
a;0,001;10;0,1270
a;0,001;10;0,2040
a;0,001;10;0,2860
a;control;control;0,1300
a;undop;1;0,2780
a;undop;1;0,1250
a;undop;1;0,2710
a;0,0175;1;0,1000
a;0,0175;1;0,2920
a;0,0175;1;0,2340
a;0,015;1;0,1620
a;0,015;1;0,1230
a;0,015;1;0,2770
a;0,013;1;0,1330
a;0,013;1;0,1880
a;0,013;1;0,2530
a;0,02;1;0,1410
a;0,02;1;0,1720
a;0,02;1;0,1780
a;0,01;1;0,2190
a;0,01;1;0,1650
a;0,01;1;0,1260
a;0,005;1;0,1210
a;0,005;1;0,1200
a;0,005;1;0,1160
a;0,001;1;0,1720
a;0,001;1;0,1320
a;0,001;1;0,2410
a;control;control;0,2590
a;undop;0,1;0,1880
a;undop;0,1;0,2340
a;undop;0,1;0,1950
a;0,0175;0,1;0,1630
a;0,0175;0,1;0,1190
a;0,0175;0,1;0,2250
a;0,015;0,1;0,2520
a;0,015;0,1;0,2890
a;0,015;0,1;0,2150
a;0,013;0,1;0,2850
a;0,013;0,1;0,1350
a;0,013;0,1;0,2550
a;0,02;0,1;0,2810
a;0,02;0,1;0,1810
a;0,02;0,1;0,2000
a;0,01;0,1;0,1320
a;0,01;0,1;0,2730
a;0,01;0,1;0,2570
a;0,005;0,1;0,1740
a;0,005;0,1;0,1830
a;0,005;0,1;0,2910
a;0,001;0,1;0,2580
a;0,001;0,1;0,1500
a;0,001;0,1;0,1480
a;control;control;0,2870
b;undop;100;0,2530
b;undop;100;0,1860
b;undop;100;0,1820
b;0,0175;100;0,2850
b;0,0175;100;0,1620
b;0,0175;100;0,2130
b;0,015;100;0,2900
b;0,015;100;0,2610
b;0,015;100;0,1900
b;0,013;100;0,1030
b;0,013;100;0,2650
b;0,013;100;0,2640
b;0,02;100;0,1580
b;0,02;100;0,2470
b;0,02;100;0,2730
b;0,01;100;0,2280
b;0,01;100;0,1850
b;0,01;100;0,2340
b;0,005;100;0,1170
b;0,005;100;0,2370
b;0,005;100;0,1160
b;0,001;100;0,2830
b;0,001;100;0,1560
b;0,001;100;0,1330
b;control;control;0,1410
b;undop;10;0,3000
b;undop;10;0,1430
b;undop;10;0,2910
b;0,0175;10;0,2350
b;0,0175;10;0,2500
b;0,0175;10;0,2100
b;0,015;10;0,1210
b;0,015;10;0,2220
b;0,015;10;0,1360
b;0,013;10;0,2070
b;0,013;10;0,2650
b;0,013;10;0,1450
b;0,02;10;0,2090
b;0,02;10;0,1060
b;0,02;10;0,2520
b;0,01;10;0,1700
b;0,01;10;0,2550
b;0,01;10;0,1570
b;0,005;10;0,1430
b;0,005;10;0,1060
b;0,005;10;0,1740
b;0,001;10;0,1980
b;0,001;10;0,1090
b;0,001;10;0,2330
b;control;control;0,2650
b;undop;1;0,2320
b;undop;1;0,2470
b;undop;1;0,2070
b;0,0175;1;0,2610
b;0,0175;1;0,2090
b;0,0175;1;0,1250
b;0,015;1;0,2780
b;0,015;1;0,2190
b;0,015;1;0,2720
b;0,013;1;0,1500
b;0,013;1;0,2400
b;0,013;1;0,2000
b;0,02;1;0,1780
b;0,02;1;0,1320
b;0,02;1;0,1680
b;0,01;1;0,1430
b;0,01;1;0,1660
b;0,01;1;0,2370
b;0,005;1;0,2040
b;0,005;1;0,2870
b;0,005;1;0,2710
b;0,001;1;0,1460
b;0,001;1;0,1150
b;0,001;1;0,2070
b;control;control;0,2200
b;undop;0,1;0,2680
b;undop;0,1;0,2620
b;undop;0,1;0,2510
b;0,0175;0,1;0,2100
b;0,0175;0,1;0,2980
b;0,0175;0,1;0,1740
b;0,015;0,1;0,2320
b;0,015;0,1;0,1230
b;0,015;0,1;0,2800
b;0,013;0,1;0,1830
b;0,013;0,1;0,1940
b;0,013;0,1;0,2580
b;0,02;0,1;0,2120
b;0,02;0,1;0,2820
b;0,02;0,1;0,1780
b;0,01;0,1;0,2470
b;0,01;0,1;0,2500
b;0,01;0,1;0,2760
b;0,005;0,1;0,1780
b;0,005;0,1;0,1880
b;0,005;0,1;0,1350
b;0,001;0,1;0,1260
b;0,001;0,1;0,2580
b;0,001;0,1;0,2840
b;control;control;0,1880
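In case it helps anyone reproduce this, here is a minimal sketch of how text in this format could be loaded into R. It assumes base R's read.csv2(), which defaults to ";" as the separator and "," as the decimal mark, and it uses only a handful of the rows above. The object name dummy matches the code further down; csv_text is just an illustrative name.

```r
# A few rows of the data above, pasted as text (csv_text is an illustrative name)
csv_text <- "line;dop;conc;prol
a;undop;100;0,1540
a;0,0175;100;0,2030
a;control;control;0,1640
b;undop;0,1;0,2680
b;control;control;0,1880"

# read.csv2() expects ';' separators and ',' decimals by default
dummy <- read.csv2(text = csv_text, stringsAsFactors = FALSE)

# dop and conc stay character because they mix numbers with labels;
# prol is parsed as numeric
str(dummy)
```

With the whole dataset saved to a file, the same call with a file path instead of text = should work the same way.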
What I want is to normalize every prol value, across all dop and conc rows, against the mean of the four control values of the same line.
In other words, every prol value of line a should be divided by the mean prol of the line a controls and multiplied by 100. For example:
The controls belonging to line a are:
  line  dop     conc     prol
  <chr> <chr>   <chr>   <dbl>
1 a     control control 0.164
2 a     control control 0.13
3 a     control control 0.259
4 a     control control 0.287
and their mean is (0.1640 + 0.1300 + 0.2590 + 0.2870) / 4 = 0.21
Now every prol value of line a should be divided by this number and multiplied by 100:
  line  dop   conc   prol
  <chr> <chr> <chr> <dbl>
1 a     undop 100   0.154
2 a     undop 100   0.277
0.1540 / 0.21 * 100 = 73.33
0.2770 / 0.21 * 100 = 131.90
and so on.
The same should be done to line b.
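To double-check the arithmetic above, here is the same calculation written out in plain R for line a. The variable names (ctrl_a, prol_a and so on) are only illustrative and the numbers are typed in by hand from the tables above.

```r
# The four control prol values of line a
ctrl_a <- c(0.1640, 0.1300, 0.2590, 0.2870)

# Mean of the controls: the reference value for line a
ctrl_mean_a <- mean(ctrl_a)

# The first two non-control prol values of line a
prol_a <- c(0.1540, 0.2770)

# Express each value as a percentage of the control mean
norm_a <- prol_a / ctrl_mean_a * 100

round(ctrl_mean_a, 2)  # 0.21
round(norm_a, 2)       # about 73.33 and 131.90
```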
The following code gets me partway there, but because of the filter() step it only keeps and normalizes the control rows and drops all the useful rows for the other dop and conc levels:
library(dplyr)

dummy %>%
  group_by(line) %>%
  filter(dop == "control") %>%           # keeps only the control rows
  mutate(ctrl = prol / mean(prol) * 100)
# A tibble: 8 x 5
# Groups:   line [2]
  line  dop     conc     prol  ctrl
  <chr> <chr>   <chr>   <dbl> <dbl>
1 a     control control 0.164  78.1
2 a     control control 0.13   61.9
3 a     control control 0.259 123.
4 a     control control 0.287 137.
5 b     control control 0.141  69.3
6 b     control control 0.265 130.
7 b     control control 0.22  108.
8 b     control control 0.188  92.4
As you can see, the ctrl column holds correctly calculated values, but only for the control rows; all the other rows are gone.
How can I extend that mutate() to all the rows and not only the control ones? I've tried using cur_data(), which seems to be a new feature in dplyr, but haven't managed to make it work. Something tells me it could be done with rowwise(), but I can't figure out how it works...
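For what it's worth, one possible direction is to keep every row and restrict only the mean() call to the controls, by subsetting prol with a logical condition inside mutate(). This is a sketch and not a definitive solution: it uses a hand-typed subset of the data (two undop rows plus the four controls per line) and assumes every line has at least one control row. The name normalized is illustrative.

```r
library(dplyr)

# Hand-typed subset of the data: two 'undop' rows and the four controls per line
dummy <- data.frame(
  line = rep(c("a", "b"), each = 6),
  dop  = rep(c("undop", "undop", rep("control", 4)), times = 2),
  conc = rep(c("100", "100", rep("control", 4)), times = 2),
  prol = c(0.154, 0.277, 0.164, 0.130, 0.259, 0.287,
           0.253, 0.186, 0.141, 0.265, 0.220, 0.188),
  stringsAsFactors = FALSE
)

normalized <- dummy %>%
  group_by(line) %>%
  # No filter(): all rows are kept. Within each line, only the control
  # rows enter the mean, and every prol value is divided by that mean.
  mutate(ctrl = prol / mean(prol[dop == "control"]) * 100) %>%
  ungroup()

normalized
```

On this subset the first row of line a comes out at about 73.3, which matches the manual calculation, and the control rows get the same values as in the filtered output.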
Thanks a lot in advance!
JP.