Using log2 transformation to compute M-value

I have a data set containing experimental data for 16 samples at ~27k DNA markers. The dimensions are 27578 by 201. I am trying to compute a log2 transformation from two columns in my data. I tried the code below and got an error. What am I missing?

>  SampleData_trim %>% 
+   mutate(
+     M = log2((SampleData_trim$454.Signal_B)+1)/((SampleData_trim$454.Signal_A)+1)
Error: unexpected numeric constant in:
"  mutate(
    M = log2((SampleData_trim$454."
>   )

Try this:

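SampleData_trim %>% 
  mutate(M = log2((`454.Signal_B` + 1) / (`454.Signal_A` + 1)))

Column names that start with a digit are not valid R names on their own, so you need to wrap them in backticks.

Also, because the data frame is piped into mutate() with %>%, you can refer to the columns directly and do not need SampleData_trim$.

Make sure the parentheses are balanced. Assuming the M-value you want is the log2 of the ratio (Signal_B + 1) / (Signal_A + 1), the whole ratio has to sit inside log2().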
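
To see both points on something small, here is a minimal sketch with a made-up three-row data frame. The values and the name toy are invented for illustration; only the column names mirror the ones in the question. It assumes the M-value is the log2 of the ratio (B + 1) / (A + 1), and it also shows that backticks make the original `$` approach work in base R.

```r
library(dplyr)

# Toy data: values are invented, column names mirror the question.
# check.names = FALSE keeps the names that start with a digit unchanged.
toy <- data.frame(
  SYMBOL = c("g1", "g2", "g3"),
  `454.Signal_A` = c(1, 3, 7),
  `454.Signal_B` = c(7, 3, 1),
  check.names = FALSE
)

# Base R: with backticks, `$` accepts a name that starts with a digit
m_base <- log2((toy$`454.Signal_B` + 1) / (toy$`454.Signal_A` + 1))

# dplyr: inside mutate() the columns are referred to by name, no `toy$` needed
toy_m <- toy %>%
  mutate(M = log2((`454.Signal_B` + 1) / (`454.Signal_A` + 1)))

toy_m$M  # 2 0 -2
```

Both approaches give the same numbers; the dplyr version just keeps the result next to the other columns.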

It said "could not find function mutate", so I loaded library(dplyr). Now it prints the entire tibble. How do I see just the result of the function I used?

> SampleData_trim %>% 
+   mutate(M = log2((`454.Signal_B`)+1)/((`454.Signal_A`)+1)
+ SampleData_trim %>% 
Error: unexpected symbol in:
"  mutate(M = log2((`454.Signal_B`)+1)/((`454.Signal_A`)+1)
SampleData_trim"

Here is the data I have:

# A tibble: 27,578 x 202
   Index SYMBOL `454.AVG_Beta` `454.Avg_NBEADS… `454.Avg_NBEADS… `454.BEAD_STDER…
   <int> <chr>           <dbl>            <int>            <int>            <int>
 1     1 ATP2A1         0.755                16               13               36
 2     2 SLMAP          0.722                12               18               30
 3     3 MEOX2          0.0975               20               25              111
 4     4 HOXD3          0.146                20               19              122
 5     5 ZNF398         0.102                16               20              181
 6     6 PANX1          0.0626               12               13              543
 7     7 COX8C          0.964                19               15               17
 8     8 IMPA2          0.0240               15               25              494
 9     9 TTC8           0.0109               23               20              562
10    10 FLJ35…         0.657                21               21              470
# ... with 27,568 more rows, and 196 more variables: `454.BEAD_STDERR_B` <int>,
#   `454.Signal_A` <int>, `454.Signal_B` <int>, `454.Detection Pval` <dbl>,
#   `454.Intensity` <int>, `3.AVG_Beta` <dbl>, `3.Avg_NBEADS_A` <int>,
#   `3.Avg_NBEADS_B` <int>, `3.BEAD_STDERR_A` <int>, `3.BEAD_STDERR_B` <int>,
#   `3.Signal_A` <int>, `3.Signal_B` <int>, `3.Detection Pval` <dbl>,
#   `3.Intensity` <int>, `531.AVG_Beta` <dbl>, `531.Avg_NBEADS_A` <int>,
#   `531.Avg_NBEADS_B` <int>, `531.BEAD_STDERR_A` <int>,
#   `531.BEAD_STDERR_B` <int>, `531.Signal_A` <int>, `531.Signal_B` <int>,
#   `531.Detection Pval` <dbl>, `531.Intensity` <int>, `18.AVG_Beta` <dbl>,
#   `18.Avg_NBEADS_A` <int>, `18.Avg_NBEADS_B` <int>, `18.BEAD_STDERR_A` <int>,
#   `18.BEAD_STDERR_B` <int>, `18.Signal_A` <int>, `18.Signal_B` <int>,
#   `18.Detection Pval` <dbl>, `18.Intensity` <int>, `554.AVG_Beta` <dbl>,
#   `554.Avg_NBEADS_A` <int>, `554.Avg_NBEADS_B` <int>,
#   `554.BEAD_STDERR_A` <int>, `554.BEAD_STDERR_B` <int>, `554.Signal_A` <int>,
#   `554.Signal_B` <int>, `554.Detection Pval` <dbl>, `554.Intensity` <int>,
#   `202.AVG_Beta` <dbl>, `202.Avg_NBEADS_A` <int>, `202.Avg_NBEADS_B` <int>,
#   `202.BEAD_STDERR_A` <int>, `202.BEAD_STDERR_B` <int>, `202.Signal_A` <int>,
#   `202.Signal_B` <int>, `202.Detection Pval` <dbl>, `202.Intensity` <int>,
#   `559.AVG_Beta` <dbl>, `559.Avg_NBEADS_A` <int>, `559.Avg_NBEADS_B` <int>,
#   `559.BEAD_STDERR_A` <int>, `559.BEAD_STDERR_B` <int>, `559.Signal_A` <int>,
#   `559.Signal_B` <int>, `559.Detection Pval` <dbl>, `559.Intensity` <int>,
#   `203.AVG_Beta` <dbl>, `203.Avg_NBEADS_A` <int>, `203.Avg_NBEADS_B` <int>,
#   `203.BEAD_STDERR_A` <int>, `203.BEAD_STDERR_B` <int>, `203.Signal_A` <int>,
#   `203.Signal_B` <int>, `203.Detection Pval` <dbl>, `203.Intensity` <int>,
#   `681.AVG_Beta` <dbl>, `681.Avg_NBEADS_A` <int>, `681.Avg_NBEADS_B` <int>,
#   `681.BEAD_STDERR_A` <int>, `681.BEAD_STDERR_B` <int>, `681.Signal_A` <int>,
#   `681.Signal_B` <int>, `681.Detection Pval` <dbl>, `681.Intensity` <int>,
#   `423.AVG_Beta` <dbl>, `423.Avg_NBEADS_A` <int>, `423.Avg_NBEADS_B` <int>,
#   `423.BEAD_STDERR_A` <int>, `423.BEAD_STDERR_B` <int>, `423.Signal_A` <int>,
#   `423.Signal_B` <int>, `423.Detection Pval` <dbl>, `423.Intensity` <int>,
#   `710.AVG_Beta` <dbl>, `710.Avg_NBEADS_A` <int>, `710.Avg_NBEADS_B` <int>,
#   `710.BEAD_STDERR_A` <int>, `710.BEAD_STDERR_B` <int>, `710.Signal_A` <int>,
#   `710.Signal_B` <int>, `710.Detection Pval` <dbl>, `710.Intensity` <int>,
#   `768.AVG_Beta` <dbl>, `768.Avg_NBEADS_A` <int>, `768.Avg_NBEADS_B` <int>,
#   `768.BEAD_STDERR_A` <int>, `768.BEAD_STDERR_B` <int>, …

If you mean you just want a small selection of columns returned, then try this:

SampleData_trim %>% 
  mutate(M = log2((`454.Signal_B` + 1) / (`454.Signal_A` + 1))) %>% 
  select(Index, SYMBOL, M)

or any other list of columns you wish.
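
As a sketch of the same idea on toy data (values invented; the names toy, m_only and m_vec are hypothetical), the following stores the result instead of printing it, keeps only the identifier columns plus M, and then pulls M out as a plain vector. It again assumes M is the log2 of the ratio (B + 1) / (A + 1).

```r
library(dplyr)

# Toy data: values are invented, column names mirror the question
toy <- data.frame(
  Index = 1:3,
  SYMBOL = c("g1", "g2", "g3"),
  `454.Signal_A` = c(1, 3, 7),
  `454.Signal_B` = c(7, 3, 1),
  check.names = FALSE
)

# Assign the result to a name so it is stored rather than printed,
# and keep only the three columns of interest
m_only <- toy %>%
  mutate(M = log2((`454.Signal_B` + 1) / (`454.Signal_A` + 1))) %>%
  select(Index, SYMBOL, M)

# Just the M values as a numeric vector
m_vec <- pull(m_only, M)

# Print only the first two rows
head(m_only, 2)
```

On the real data, assigning to a name and then using head() avoids printing all 27,578 rows each time.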

It worked. Thank you so much.
