Hello guys!
I need to know how to simply count, row-wise, the number of values across several columns. It's not about counting certain strings or numbers, just about the number of values that are not NA in specific columns.
I don't know if this would be the right way to get the desired outcome. If yes, what would I have to insert into the mutate() brackets?
df <- df %>%
rowwise() %>%
mutate(abc01 = sum())
This is the kind of result I need:
You can try something like this to count the missing values and then subtract, or write it slightly differently to get the number of values per row:
df %>%
rowwise %>%
summarise(NA_per_row = sum(is.na(.)))
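One caveat worth checking, sketched here under assumptions: inside a magrittr pipe, `.` refers to the whole piped data frame, so `sum(is.na(.))` would return the same grand total for every row. Assuming dplyr 1.0.0 or later is available, `c_across()` restricts the count to the current row. The toy data and the column names `n`, `s`, `b` are made up for illustration.

```r
library(dplyr)

# Toy data, invented for this sketch
df <- tibble(
  n = c(1, NA, NA, 9, NA),
  s = c(2, 4, 8, NA, NA),
  b = c(NA, NA, 7, 11, NA)
)

# `.` is the whole piped data frame, so each row gets the same grand total
whole_frame <- df %>%
  rowwise() %>%
  summarise(NA_per_row = sum(is.na(.)))

# c_across() collects only the values of the current row
per_row <- df %>%
  rowwise() %>%
  mutate(
    NA_per_row    = sum(is.na(c_across(c(n, s, b)))),
    valid_per_row = sum(!is.na(c_across(c(n, s, b))))
  ) %>%
  ungroup()  # drop the rowwise grouping again and keep the result
```

Listing the columns explicitly inside `c_across()` also covers the "specific columns" part of the question.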
Unfortunately, for some reason it doesn't work properly. When I assign this operation to a data frame, this is the result:
I have already tried this one, which works in a simple toy data frame:
df <- mutate(df, NA_per_row=rowSums(is.na(df)))
In the data frame I actually want to use this on, an error appears for some reason, even though the code is the same:
> ABC01 <- mutate(ABC01, NA_per_row=rowSums(is.na(ABC01)))
Error: Column `NA_per_row` must be length 1 (the group size), not 309
I really don't know what difference between the two data frames causes this error to appear in the first place.
You probably have groups set on the data frame. Try ungroup() on it before mutating it.
The error remains.
> ungroup(ABC01)
> ABC01 <- mutate(ABC01, NA_per_row=rowSums(is.na(ABC01)))
Error: Column `NA_per_row` must be length 1 (the group size), not 309
This data frame is the result of a select() on a data frame that a group_by() operation had been applied to. Does this make a difference compared to groups()?
smichal
September 16, 2020, 3:29pm
Row-wise operations are always a little tricky in R, with atoms being vectors.
How about this combined purrr trick?
library(tidyverse)
df <- tribble(~n, ~s, ~b, 1, 2, NA, NA, 4, NA, NA, 8, 7, 9, NA, 11, NA, NA, NA)
df %>%
mutate(valid_in_row = map(., ~(!is.na(.x))) %>% pmap_int(sum))
# A tibble: 5 x 4
      n     s     b valid_in_row
  <dbl> <dbl> <dbl>        <int>
1     1     2    NA            2
2    NA     4    NA            1
3    NA     8     7            2
4     9    NA    11            2
5    NA    NA    NA            0
The intermediate step is:
> map(df, ~(!is.na(.x)))
$n
[1] TRUE FALSE FALSE TRUE FALSE
$s
[1] TRUE TRUE TRUE FALSE FALSE
$b
[1] FALSE FALSE TRUE TRUE FALSE
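The second step can be sketched on its own, as a minimal illustration with made-up flags: `pmap_int()` walks the logical vectors in parallel and calls `sum()` on the i-th element of each, which gives one count per row. The list `flags` is hypothetical and only mimics the shape of the intermediate result.

```r
library(purrr)

# Hypothetical intermediate result: one logical vector per column
flags <- list(
  n = c(TRUE, FALSE),
  s = c(TRUE, TRUE),
  b = c(FALSE, FALSE)
)

# Position 1: sum(TRUE, TRUE, FALSE); position 2: sum(FALSE, TRUE, FALSE)
valid_in_row <- pmap_int(flags, sum)
valid_in_row
# [1] 2 1
```

Because `pmap_int()` passes the elements as named arguments, a column that happened to be called `na.rm` would be interpreted as an argument of `sum()`, so this sketch assumes ordinary column names.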
The grouped data error remains even with this operation, which works just fine with the toy data frame.
But this seems to be a separate question that doesn't belong in this thread, I suppose?
Your post is nonetheless the solution to the question I asked above.
As is typical for R and the tidyverse, calling ungroup() on a data frame without assigning the result anywhere means the result is ephemeral, so ABC01 remains grouped.
Best practice for assignment is to use <-, i.e. ABC01 <- ungroup(ABC01)
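A small sketch of the difference, using a made-up stand-in for ABC01 (the columns `g`, `x`, `y` are hypothetical). It also illustrates that select() keeps the grouping of its input, which would explain why the selected data frame still behaved as grouped.

```r
library(dplyr)

# Hypothetical stand-in for ABC01: grouped, then reduced with select()
ABC01 <- tibble(
  g = c("a", "a", "b"),
  x = c(1, NA, 3),
  y = c(NA, NA, 6)
) %>%
  group_by(g) %>%
  select(g, x, y)  # select() preserves the grouping

ungroup(ABC01)                         # result is printed, then discarded
still_grouped <- is_grouped_df(ABC01)  # TRUE: ABC01 itself is unchanged

ABC01 <- ungroup(ABC01)                # assign the ungrouped result back
ABC01 <- mutate(ABC01, NA_per_row = rowSums(is.na(ABC01)))
```

After the assignment, mutate() sees one ungrouped data frame, so the full-length vector from rowSums() fits.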
Dear nirgrahamuk,
you just solved my grouped data problem. I am thankful to you just as I am to smichal. Unfortunately I can't choose two solutions here - I really would love to.
I have also changed the title of the thread because with the grouped data issue another question has come in at the midway point.
Scoco
September 23, 2020, 7:46am
This is a simple one-liner in base R for the OQ, counting the non-NA values per row:
df$SUM <- apply(df, 1, function(rr) sum(!is.na(rr)))
The apply() call works for data.frames with different data types as well as for matrices.
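A vectorized base R variant of the same idea, restricted to specific columns as the original question asks, might look like this. It is only a sketch: the toy data, the vector `cols`, and the column name `valid_in_row` are assumptions for illustration.

```r
# Toy data with mixed column types, invented for this sketch
df <- data.frame(
  n = c(1, NA, NA, 9, NA),
  s = c(2, 4, 8, NA, NA),
  b = c(NA, NA, 7, 11, NA),
  label = c("a", NA, "c", "d", NA)
)

# Only these columns are counted; `label` is ignored
cols <- c("n", "s", "b")

# !is.na() yields a logical matrix; rowSums() counts the TRUEs per row
df$valid_in_row <- rowSums(!is.na(df[cols]))
df$valid_in_row
# [1] 2 1 2 2 0
```

Since rowSums() works on the logical matrix directly, no per-row function call is needed, which tends to matter on larger data frames.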