RStudio doesn't appear to be recognizing a numerical variable

In RStudio I have installed the tidyverse, readxl and xlsx packages. I then load my data from an Excel file, assigning column types as follows:

full_data <- read_excel("./Blast sequence assignments full.xlsx",
                        col_types = c("Identified Proteins (3404)" = "text", "Accession Number" = "text",
                                      "Blast protein ID" = "text", "Blast Accession" = "text",
                                      "Blast taxonomy" = "text", "Reference A" = "numeric", "Reference B" = "numeric",
                                      "Anaerobic_24h A" = "numeric", "Anaerobic_24h B" = "numeric",
                                      "Anaerobic_28d A" = "numeric", "Anaerobic_28d B" = "numeric",
                                      "Citric_24h A" = "numeric", "Citric_24h B" = "numeric",
                                      "Citric_28d A" = "numeric", "Citric_28d B" = "numeric",
                                      "Taxonomy" = "text")
)

When I check the column types with str(), I get the following, and all of the assignments look correct:

str(full_data)

$ Identified Proteins (3404): chr [1:3342] "Casein kinase II subunit beta OS=Tetradesmus obliquus OX=3088 GN=BQ4739_LOCUS9 PE=3 SV=1" "Uncharacterized protein OS=Tetradesmus obliquus OX=3088 GN=BQ4739_LOCUS7 PE=4 SV=1" "DNA helicase OS=Tetradesmus obliquus OX=3088 GN=BQ4739_LOCUS38 PE=3 SV=1" "Uncharacterized protein OS=Tetradesmus obliquus OX=3088 GN=BQ4739_LOCUS55 PE=3 SV=1" ...
 $ Accession Number          : chr [1:3342] "A0A383V1G7" "A0A383V1H7" "A0A383V1J2" "A0A383V1M7" ...
 $ Blast protein ID          : chr [1:3342] "Casein kinase II subunit beta" "Uncharacterized" "DNA helicase" "Methanethiol oxidase" ...
 $ Blast Accession           : chr [1:3342] "A0A383V1G7" "A0A383V1H7" "A0A383V1J2" "Q8VIF7" ...
 $ Blast taxonomy            : chr [1:3342] "Tetradesmus obliquus" NA "Tetradesmus obliquus" "Rattus norvegicus" ...
 $ Reference A               : num [1:3342] 0.0807 0.0152 -0.1225 0.0443 -0.0241 ...
 $ Reference B               : num [1:3342] -0.1262 -0.0034 0.113 -0.052 0.0498 ...
 $ Anaerobic_24h A           : num [1:3342] 0.48 -0.247 0.3 0.244 -0.977 ...
 $ Anaerobic_24h B           : num [1:3342] 0.201 -0.404 0.306 0.249 -1.018 ...
 $ Anaerobic_28d A           : num [1:3342] -0.396 -1.05 0.943 -0.17 0.463 ...
 $ Anaerobic_28d B           : num [1:3342] 0.0993 -0.5398 0.8354 -0.1983 0.3515 ...
 $ Citric_24h A              : num [1:3342] 0.174 0.225 0.136 -0.158 -0.969 ...
 $ Citric_24h B              : num [1:3342] 0.3209 0.2025 -0.1242 -0.0307 -0.7478 ...
 $ Citric_28d A              : num [1:3342] 0.6117 -0.3526 0.964 0.0942 -0.4023 ...
 $ Citric_28d B              : num [1:3342] 0.4925 -0.3011 0.5552 0.0344 -0.3841 ...
 $ Taxonomy                  : chr [1:3342

But when I try to average a couple of the columns as follows, it tells me that an argument isn't numeric, so I'm not sure where the confusion is coming from:

Code:

full_data<-full_data %>% 
  mutate(reference_avg=mean("Reference A","Reference B"))

Error message:

Problem with `mutate()` column `reference_avg`.
i `reference_avg = mean("Reference A", "Reference B")`.
i argument is not numeric or logical: returning NA 

Every value in the new column then comes out as NA when it should be the average. I am completely new to R and to coding in general. Any help would be great.

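Here you are asking R to average the text string "Reference A", not the column with that name. mean() takes a single vector as its first argument, so "Reference B" is not even treated as a second value to average. To reference non-syntactic variable names (such as names that contain spaces) you have to enclose them in backticks, not double quotes. Also, inside mutate() the mean() function works "by column", not "by row": given a column, it returns one number for the whole column. If you want to calculate the mean of values that come from more than one column within each row, one way is to use rowwise() and c_across() to reference the columns.
The code should look something like this: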

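library(dplyr)

full_data <- full_data %>%
    rowwise() %>%   # evaluate the following mutate() one row at a time
    mutate(reference_avg = mean(c_across(c(`Reference A`, `Reference B`)))) %>%
    ungroup()       # drop the row-wise grouping so later steps behave normally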
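To see the difference between a quoted string and a column, here is a small base R sketch. The data frame toy and its values are made up for illustration; only the column names mirror the question. It also shows rowMeans() as a possible alternative for row-wise averages, which is not necessarily better for your data, just another option.

```r
# Toy data with hypothetical values; only the column names mirror the question.
toy <- data.frame(
  `Reference A` = c(0.2, 0.4, NA),
  `Reference B` = c(0.4, 0.8, 1.0),
  check.names = FALSE  # keep the spaces in the column names
)

# A quoted name is just a character string, so mean() has nothing numeric
# to average and returns NA (with the warning seen in the question).
# "Reference B" is not a second value; it is matched to mean()'s next argument.
bad <- suppressWarnings(mean("Reference A", "Reference B"))
is.na(bad)

# Backticks refer to the column itself; this is a single column-wise mean.
col_mean <- mean(toy$`Reference A`, na.rm = TRUE)

# rowMeans() averages across the selected columns within each row.
toy$reference_avg <- rowMeans(toy[, c("Reference A", "Reference B")], na.rm = TRUE)
toy
```

Note that na.rm = TRUE is a choice made for this sketch: a row with one missing value gets the value of the other column rather than NA. Leave it out if you would rather keep NA in that case.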

If this doesn't solve your problem, please provide a proper REPRoducible EXample (reprex) illustrating your issue.

Thank you! This was very helpful
