@FJCC Thanks! Your diagnosis was spot on. My next task was trying to figure out how to remove the errant values in R, but using a text editor was definitely faster and easier.

Now I'm on to figuring out how to create and save the average asq_*a values as a new vector.

Thanks again.

Cheers,
Jason

#rstatsnewbie

The read.csv() function has a handy argument called na.strings that will likely work well in this scenario (and it makes it so you don't have to change your dataset at all 🙂).

You'll see in the documentation that this argument is

"a character vector of strings which are to be interpreted as NA values."

By default, blanks or "NA" are read as NA for numeric variables.

In your case, you could add na.strings = c("Unknown", "#DIV/0!") in read.csv() as you read the dataset in to treat these values as NA and get your columns to read correctly as numbers instead of characters.
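As a rough sketch of that suggestion, the snippet below uses a small throwaway file instead of the real dataset; the file contents and the column names id and score are made up for illustration. Note that na.strings replaces the default rather than adding to it, so "NA" is repeated in the vector here to keep the usual behaviour.

```r
# Write a tiny CSV that mimics the problem: a numeric column polluted
# with the strings "Unknown" and "#DIV/0!" (made-up data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,score",
             "1,4.5",
             "2,Unknown",
             "3,#DIV/0!",
             "4,2"),
           csv_path)

# Without na.strings the column is read as text (a factor in R < 4.0.0).
raw <- read.csv(csv_path)
class(raw$score)

# Treat the stray strings as missing so the column parses as numeric.
dat <- read.csv(csv_path, na.strings = c("NA", "Unknown", "#DIV/0!"))
class(dat$score)
mean(dat$score, na.rm = TRUE)
```

If other placeholder strings turn up in the real file, they would need to be added to the same vector.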

@aosmith Thanks! That is a great thing to know and will be very handy in the future!!!

Cheers,
Jason

How about using read_csv() to create a tibble? It basically behaves as if stringsAsFactors = FALSE were set by default, and it is faster.

Sounds good. Would the code look like the following?

dat <- read.csv(as_tibble("pilotData.csv", header = TRUE))

Thanks for the suggestion and any other help.

Cheers,
Jason

#rstatsnewbie

No. It'll be like this, provided you've loaded readr previously:

dat <- read_csv("pilotData.csv")

The analogue in read_csv of the header argument in read.csv is col_names. Both are TRUE by default, so you don't need to set them explicitly (but if you do, that's fine too).

Also, the analogue of na.strings here is just na.

For more details, go through the documentation here and here.
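A minimal sketch of the readr version, assuming the same kind of stray strings as earlier in the thread; the temporary file and its id and score columns are invented for the example. Since na replaces readr's default of c("", "NA"), those two are repeated alongside the extra strings.

```r
library(readr)

# Made-up file with placeholder strings and one empty field in a numeric column.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,score",
             "1,4.5",
             "2,Unknown",
             "3,#DIV/0!",
             "4,"),
           csv_path)

# col_names = TRUE is the default, so it is left out here.
dat <- read_csv(csv_path, na = c("", "NA", "Unknown", "#DIV/0!"))

# The column should now be parsed as a number, with NA for the placeholders.
class(dat$score)
sum(is.na(dat$score))
```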

@Yarnabrina Thanks for the information.
Would this then be correct:

library(readr)
dat <- read_csv("pilotData.csv", col_names = TRUE, na = "NA")

Does this command exist?

 skip_empty_cols = TRUE

Cheers,
Jason

Greetings @Yarnabrina and everyone
Does read_csv do something different to the data than read.csv?
I ask because when I switched to the former, some parts of my code, creating a new variable, quit working. Below are the commands and error messages. I can upload a reprex if that would help.

Commands

#	CREATING THE ASQ-LIGHT VARIABLE 
data3 <- 
	mutate(data2,
		x = pmap_dbl(list(asq_1a, asq_2a, asq_3a, asq_4a, asq_5a,
						  asq_6a, asq_7a, asq_8a, asq_9a), function(...){
			row_values <- unlist(list(...))
			number_of_NAs <- sum(is.na(row_values))
			map_dbl(number_of_NAs, ~ case_when(
				.x == 0 ~ mean(row_values),
				.x >= 1 ~ mean(row_values, na.rm = TRUE) #,
				# .x == 1 ~ mean(row_values, na.rm = TRUE),
				# .x == 2 ~ mean(row_values, na.rm = TRUE),
				# .x == 3 ~ mean(row_values, na.rm = TRUE)
			))
		})
	) %>% 
	rename(asq_light = x )

Error messages

argument is not numeric or logical: returning NA
argument is not numeric or logical: returning NA
argument is not numeric or logical: returning NA
argument is not numeric or logical: returning NA
argument is not numeric or logical: returning NA
argument is not numeric or logical: [... truncated]

I cut the error message short, but it seemingly repeats for every subject.
I will keep looking for the difference between the two commands that can explain the problem but any help is greatly appreciated.

Here is a reprex of the problem.

###############################
#       CREATED BY:     JASON CRAGGS
#       CREATED ON:     2019-04-18
#       USAGE:          REPREX TO READ CSV FILES
###############################
#
library(tidyverse)

#       LOAD DATA (SAME FILE BOTH TIMES)
data1 <- read.csv(url('https://raw.githubusercontent.com/BrainStormCenter/ASQ_pilot/master/ASQ_pain_pilot_2019_04_18.csv'), header = TRUE)
data2 <- read_csv(url('https://raw.githubusercontent.com/BrainStormCenter/ASQ_pilot/master/ASQ_pain_pilot_2019_04_18.csv'),
                  col_names = TRUE,
                  col_types = NULL,
                  quoted_na = FALSE)
#> Parsed with column specification:
#> cols(
#>   .default = col_double(),
#>   ID = col_character(),
#>   redcap_event = col_character(),
#>   count_asqPain = col_character(),
#>   `Good-bad` = col_character(),
#>   demo_dob = col_date(format = ""),
#>   cohabitation = col_character(),
#>   prior_pain_about = col_logical(),
#>   painduration = col_character(),
#>   Groups = col_character(),
#>   asq_9a = col_character(),
#>   asq_11a = col_character(),
#>   asq_14a = col_character(),
#>   asq_15a = col_character(),
#>   typical_alc_use1 = col_character(),
#>   typical_alc_use2 = col_character(),
#>   calibrationvisit = col_character()
#> )
#> See spec(...) for full column specifications.


#       ADD ASQ-LIGHT VARIABLE
#               THIS VERSION DOES WORK
data1.1 <-
    mutate(data1,
           x = pmap_dbl(list(asq_1a, asq_2a, asq_3a, asq_4a, asq_5a,
                          asq_6a, asq_7a, asq_8a, asq_9a), function(...){
                            row_values <- unlist(list(...))
                            number_of_NAs <- sum(is.na(row_values))
                            map_dbl(number_of_NAs, ~ case_when(
                                .x == 0 ~ mean(row_values),
                                .x >= 1 ~ mean(row_values, na.rm = TRUE)
                            ))
                          })
    ) %>%
    rename(asq_light = x )

#               THIS VERSION DOES NOT WORK
data2.1 <-
    mutate(data2,
           x = pmap_dbl(list(asq_1a, asq_2a, asq_3a, asq_4a, asq_5a,
                          asq_6a, asq_7a, asq_8a, asq_9a), function(...){
                            row_values <- unlist(list(...))
                            number_of_NAs <- sum(is.na(row_values))
                            map_dbl(number_of_NAs, ~ case_when(
                                .x == 0 ~ mean(row_values),
                                .x >= 1 ~ mean(row_values, na.rm = TRUE)
                            ))
                          })
    ) %>%
    rename(asq_light = x )
#> Warning in mean.default(row_values): argument is not numeric or logical:
#> returning NA
#> Warning in mean.default(row_values, na.rm = TRUE): argument is not numeric
#> or logical: returning NA
#....[truncated by user]

Created on 2019-04-18 by the reprex package (v0.2.1)

One main difference is that read.csv converts character vectors to factors by default (in R versions before 4.0.0), which can be changed using the stringsAsFactors argument. The strings treated as missing values also differ: read_csv treats (quite correctly) "" as NA by default, along with "NA".

I haven't checked your reprex, but I noted that adding stringsAsFactors = FALSE for data1 generates the same warnings. I don't know purrr (I'm still learning), and hence can't understand what you're trying to do. I hope others will answer your question.
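One way to narrow this down is to check how each column was read in. The sketch below uses a made-up two-column data frame as a stand-in for the imported data (the values are invented); it shows that mean() on a character column gives exactly the warning from the reprex, and that converting the column first avoids it.

```r
# Made-up stand-in for the imported data: one column arrived as text
# because of a stray "Unknown" entry.
dat <- data.frame(asq_1a = c(1, 2, NA),
                  asq_9a = c("3", "Unknown", "5"),
                  stringsAsFactors = FALSE)

# List the class of every column; character columns are the suspects.
col_classes <- sapply(dat, class)
col_classes

# mean() on a character vector warns ("argument is not numeric or
# logical: returning NA") and returns NA.
bad <- suppressWarnings(mean(dat$asq_9a))

# Converting first works; strings that are not numbers become NA.
asq_9a_num <- suppressWarnings(as.numeric(dat$asq_9a))
mean(asq_9a_num, na.rm = TRUE)
```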


On a separate note, please familiarise yourself with this post:

For some reason read_csv is reading some of those numeric columns (asq_9a, for example) as character, and that is why mean() gives a warning and returns NA. I don't know why this is happening, but you can work around the problem by converting them to numeric afterwards. (Also, for this case, you can use rowwise operations instead of the complicated purrr syntax.)

library(tidyverse)
data2 <- read_csv(url('https://raw.githubusercontent.com/BrainStormCenter/ASQ_pilot/master/ASQ_pain_pilot_2019_04_18.csv'),
                  col_names = TRUE,
                  col_types = NULL,
                  quoted_na = FALSE)

data2 %>%
    mutate_at(vars(starts_with("asq_")), as.numeric) %>%
    rowwise() %>% 
    mutate(asq_light = mean(c(asq_1a, asq_2a, asq_3a, asq_4a, asq_5a,
                              asq_6a, asq_7a, asq_8a, asq_9a), na.rm = TRUE)) %>%
    ungroup() %>% 
    select(asq_light, starts_with("asq_")) %>% 
    head(10)
#> Warning: NAs introduced by coercion

#> Warning: NAs introduced by coercion

#> Warning: NAs introduced by coercion

#> Warning: NAs introduced by coercion
#> # A tibble: 10 x 31
#>    asq_light asq_1 asq_1a asq_2 asq_2a asq_3 asq_3a asq_4 asq_4a asq_5
#>        <dbl> <dbl>  <dbl> <dbl>  <dbl> <dbl>  <dbl> <dbl>  <dbl> <dbl>
#>  1    NaN       NA     NA    NA     NA    NA     NA    NA     NA    NA
#>  2    NaN        0     NA     0     NA     0     NA     0     NA     0
#>  3      4.71     1      5     1      6     1      4     0     NA     0
#>  4      1.89     1      1     1      2     1      2     1      2     1
#>  5      2.78     1      2     1      3     1      3     1      2     1
#>  6    NaN       NA     NA    NA     NA    NA     NA    NA     NA    NA
#>  7    NaN        0     NA     0     NA     0     NA     0     NA     0
#>  8      1.67     1      1     1      2     1      2     1      1     1
#>  9      3.25     1      2     1      4     1      3     1      2     1
#> 10      2.83     1      3     1      3     1      3     1      1     0
#> # … with 21 more variables: asq_5a <dbl>, asq_6 <dbl>, asq_6a <dbl>,
#> #   asq_7 <dbl>, asq_7a <dbl>, asq_8 <dbl>, asq_8a <dbl>, asq_9 <dbl>,
#> #   asq_9a <dbl>, asq_10 <dbl>, asq_10a <dbl>, asq_11 <dbl>,
#> #   asq_11a <dbl>, asq_12 <dbl>, asq_12a <dbl>, asq_13 <dbl>,
#> #   asq_13a <dbl>, asq_14 <dbl>, asq_14a <dbl>, asq_15 <dbl>,
#> #   asq_15a <dbl>

Created on 2019-04-19 by the reprex package (v0.2.1.9000)

Thanks for the information and additional link.
As for my goal with creating the variable, I should have mentioned that! I am sure there is another/better way to accomplish my goals. This is especially true given that I don't understand what all the commands are doing. I just adapted something I found on here that seemed appropriate.
Goals

  1. Create a new variable called "asq_light" that is an average of each person's available asq_1a - asq_9a scores.
  2. Create a new variable called "asq_heavy" that is an average of each person's available asq_10a - asq_15a scores.

Once created, I will use these values in the correlation analyses I am trying to accomplish.

Thanks for all the help thus far.
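For the two goals above, one possible base R approach (not the method used elsewhere in this thread) is rowMeans() with na.rm = TRUE, which averages whatever scores each person has. This is only a sketch: the data frame is made up, only two columns per scale are shown, and it assumes the asq_*a columns have already been read in as numeric.

```r
# Made-up scores standing in for the real asq_*a columns.
dat <- data.frame(asq_1a  = c(1, NA, 3),
                  asq_2a  = c(2, NA, NA),
                  asq_10a = c(4, 5, NA),
                  asq_11a = c(6, NA, NA))

# With the full data these would be paste0("asq_", 1:9, "a") and
# paste0("asq_", 10:15, "a").
light_cols <- c("asq_1a", "asq_2a")
heavy_cols <- c("asq_10a", "asq_11a")

# Average the available scores per row; a row with no scores at all gives NaN.
dat$asq_light <- rowMeans(dat[, light_cols], na.rm = TRUE)
dat$asq_heavy <- rowMeans(dat[, heavy_cols], na.rm = TRUE)
dat
```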

Thanks for the response and information about the error.
I am still learning how to convert columns to different data types (e.g., character to numeric, factor to numeric etc.).

Also, thanks for letting me know about the rowwise operations. I didn’t know they existed or even when to look for them, yet.

Regarding not knowing things, what is the ungroup() command doing? I didn’t see where you used something to group anything.

Cheers,
Jason

After using rowwise(), the data stays grouped by row, so it's good practice to ungroup it once you are done with the rowwise operations, to avoid grouping-related problems later.
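A small sketch of the difference, using a made-up two-row tibble (the columns a and b are invented for illustration):

```r
library(dplyr)

df <- tibble(a = c(1, 2), b = c(3, 6))

# Without rowwise(), mean() sees both whole columns and returns one number,
# which is then repeated on every row.
whole <- df %>% mutate(m = mean(c(a, b)))

# With rowwise(), mean() is evaluated once per row.
by_row <- df %>%
    rowwise() %>%
    mutate(m = mean(c(a, b)))
class(by_row)   # includes "rowwise_df": the result is still grouped by row

# ungroup() drops the row-wise grouping so later verbs see whole columns again.
plain <- ungroup(by_row)
class(plain)
```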

That is very good to know. Also, why didn’t your code put the new asq_light variable at the end of the dataset, like my previous attempts?
Jason

I put the new variable at the beginning and trimmed the output to 10 rows just for illustration purposes, so it can be shown properly in a post.
You can omit those lines and get the same result as with your code.

Thanks. I’ve used your code to create two variables, asq_light and asq_heavy; however, the length of the dataset didn’t change. Are these "virtual variables"?

Are you assigning the output to a variable?
Remember that dplyr doesn't perform in-place modifications.

new_data <- data2 %>%
    mutate_at(vars(starts_with("asq_")), as.numeric) %>%
    rowwise() %>% 
    mutate(asq_light = mean(c(asq_1a, asq_2a, asq_3a, asq_4a, asq_5a,
                              asq_6a, asq_7a, asq_8a, asq_9a), na.rm = TRUE)) %>%
    ungroup()

Ah, ok. As a complete newbie, I didn’t know that about dplyr.
I asked about saving the variable because I don't know how to use it otherwise. I tried accessing it from the console but got an error message.

> asq_light
Error: object 'asq_light' not found
> data2$asq_light
NULL
Warning message:
Unknown or uninitialised column: 'asq_light'. 

Thanks for helping me. I really appreciate your time and assistance.
Cheers,
Jason
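The console output above is consistent with the earlier point about assignment: the new column lives in whatever object the pipeline result was assigned to, not in data2 and not as a standalone object. A minimal sketch with a made-up two-row data2 (values invented for illustration):

```r
library(dplyr)

# Made-up stand-in for the real data2.
data2 <- tibble(asq_1a = c(1, 4), asq_2a = c(3, NA))

# mutate() returns a modified copy; data2 itself is left untouched.
new_data <- data2 %>%
    rowwise() %>%
    mutate(asq_light = mean(c(asq_1a, asq_2a), na.rm = TRUE)) %>%
    ungroup()

# The column exists only in new_data, so that is where to look for it.
"asq_light" %in% names(data2)
"asq_light" %in% names(new_data)
new_data$asq_light
```

Assigning back to the same name (data2 <- data2 %>% ...) would be another option if a separate object is not wanted.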
