Finding the Mean

Hello. I am new to R. To build R proficiency, I am having a go at a small project taken from the Data Analytics course I did. The goal is to use smart device data to gain insight into how people use their smart devices. I was given a lot of datasets to work with, and I started with dailyActivity_merged.csv.

As I was looking at the data, I saw that it did not look complete. There were some inconsistencies: for example, the total time in minutes did not add up to 24 hours for some rows, and total steps was 0 for some, which is odd. So I filtered those out to create a new table with the IDs and days that had a full 24 hours of data collection. Now, with this new table, I am trying to find the mean number of minutes spent very active, fairly active, lightly active and sedentary per ID, with the anomalies filtered out. This is the code I used:

ID_A <- true_full_day %>%
filter(Id == 1503960366) %>%
filter(VeryActiveMinutes) %>%
mean(VeryActiveMinutes)

This is the error message I got:

Error in filter():
! Problem while computing ..1 = VeryActiveMinutes.
✖ Input ..1 must be a logical vector, not a integer.
Run rlang::last_error() to see where the error occurred.

ID_A <- true_full_day %>%
+ filter(Id == 1503960366) %>%
+ filter(VeryActiveMinutes) %>%
+ mean(VeryActiveMinutes)

This is what stood out.

My question remains: can the mean only be computed when the data type is a vector, or what?

I see two problems. The line

filter(VeryActiveMinutes) 

does not make sense. Are you trying to select a column named VeryActiveMinutes, or is there a column in which you want to find rows with the value "VeryActiveMinutes"? In any case, filter() needs a condition that evaluates to TRUE or FALSE for each row, as in the previous line where you wrote filter(Id == 1503960366). Passing the integer column by itself is what triggers the "must be a logical vector" error. If you want to select a column named VeryActiveMinutes, use select(VeryActiveMinutes), though there is no need to do that.
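
To illustrate the difference, here is a small sketch on a made-up data frame. The toy values are hypothetical and only stand in for true_full_day; they are not taken from dailyActivity_merged.csv.

```r
library(dplyr)

# Hypothetical toy data standing in for true_full_day
toy <- data.frame(
  Id = c(1503960366, 1503960366, 1624580081),
  VeryActiveMinutes = c(25L, 21L, 0L)
)

# filter() keeps rows; it needs a condition that is TRUE or FALSE per row
rows_for_id <- toy %>% filter(Id == 1503960366)

# A comparison on VeryActiveMinutes is also a valid condition
active_rows <- toy %>% filter(VeryActiveMinutes > 0)

# select() keeps columns by name; no comparison is involved
one_column <- toy %>% select(VeryActiveMinutes)
```
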
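The second problem is that mean() does not fit at the end of this pipeline. The %>% operator passes the whole data frame as the first argument to mean(), which expects a numeric vector, not a data frame. To compute the mean of a column named VeryActiveMinutes, you can combine mean() with the summarize function: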

true_full_day %>%
filter(Id == 1503960366) %>%
summarize(Avg = mean(VeryActiveMinutes))
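
Since the stated goal is the mean of each activity category per Id, one possible extension is to group by Id instead of filtering one Id at a time. This is only a sketch: it assumes the other columns are named FairlyActiveMinutes, LightlyActiveMinutes and SedentaryMinutes, and it uses a hypothetical toy data frame in place of true_full_day.

```r
library(dplyr)

# Hypothetical stand-in for true_full_day; the column names other than
# Id and VeryActiveMinutes are assumptions, and the values are made up
toy <- data.frame(
  Id = c(1503960366, 1503960366, 1624580081, 1624580081),
  VeryActiveMinutes = c(20, 30, 0, 10),
  FairlyActiveMinutes = c(10, 20, 0, 4),
  LightlyActiveMinutes = c(300, 200, 100, 150),
  SedentaryMinutes = c(1110, 1190, 1340, 1276)
)

# One row per Id, with the mean of every column whose name ends in "Minutes"
means_by_id <- toy %>%
  group_by(Id) %>%
  summarize(across(ends_with("Minutes"), mean))

# mean() works on a numeric vector, so pull() the column out of the
# data frame first if a single number is wanted instead of a table
avg_very_active <- toy %>%
  filter(Id == 1503960366) %>%
  pull(VeryActiveMinutes) %>%
  mean()
```

The pull() version also speaks to the original question: mean() can sit in a pipeline as long as what reaches it is a numeric vector rather than a whole data frame.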

I am trying to select the rows of the "VeryActiveMinutes" column for ID number 1503960366.

Sweet, that worked: combining the mean and summarize functions.
