Summarise the frequency count of values in a data frame

I am trying to summarise how often the value 1 occurs in each column of a data frame, keeping only the columns that contain at least one 1.

df <- data.frame(A1 = c(1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0),
                 A2 = c(1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0),
                 A3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
                 A4 = c(1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1))

The output should look like this, listing only the columns whose count is greater than 0:

column count
A1 3
A2 5
A4 4

library(dplyr)
library(tidyr)

summarise(df, across(.fns = sum)) %>%
  pivot_longer(cols = everything()) %>%
  filter(value > 0)
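
For comparison, here is a minimal base R sketch of the same idea, assuming every column is numeric 0/1 with no missing values, as in the example data. The names `counts` and `result` are only illustrative.

```r
# Example data from the question
df <- data.frame(A1 = c(1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0),
                 A2 = c(1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0),
                 A3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
                 A4 = c(1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1))

# df == 1 is a logical matrix; colSums() counts the TRUE cells per column
counts <- colSums(df == 1)

# Keep only the columns that contain at least one 1
counts <- counts[counts > 0]

result <- data.frame(column = names(counts), count = unname(counts))
result
```

Comparing against 1 rather than summing also keeps the count correct if a column ever holds values other than 0 and 1.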

Also, if there are any NA values in a column, how can I remove or filter them?

If you change your example by making the first value of A1 NA and run the code, what happens?
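
A small sketch of that experiment, assuming the example data with only the first value of A1 replaced by NA. The names `df_na`, `plain` and `na_safe` are made up for illustration. It contrasts plain `sum()` with `sum(..., na.rm = TRUE)`.

```r
library(dplyr)
library(tidyr)

# Example data, with the first value of A1 set to NA
df_na <- data.frame(A1 = c(NA, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0),
                    A2 = c(1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0),
                    A3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
                    A4 = c(1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1))

# sum() propagates NA, so A1 sums to NA and filter() drops that row
plain <- df_na %>%
  summarise(across(everything(), sum)) %>%
  pivot_longer(cols = everything()) %>%
  filter(value > 0)

# na.rm = TRUE skips the missing values, so A1 is still counted
na_safe <- df_na %>%
  summarise(across(everything(), ~ sum(.x, na.rm = TRUE))) %>%
  pivot_longer(cols = everything()) %>%
  filter(value > 0)
```

With plain `sum()` the column containing NA disappears from the result without any warning, which is easy to miss.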

Some of the columns in my original data also contain NA.

Error: Problem with summarise() input ..1.
x ‘sum’ not meaningful for factors
i Input ..1 is across(.fns = sum).

It would be best if your example provided representative data.
You previously implied your data was simple integers that could be summed.
Yet if you have factors, do they logically map to summable concepts, or must they be excluded?

To provide an example of your data, you would do something like:

dput(head(mydata,n=10))
structure(list(check_16 = structure(c(1L, 1L, 1L, 2L, 1L, 1L, 
1L, 1L, 1L, 2L), .Label = c("0", "1"), class = "factor"), check_7 = c(0, 
0, 0, 0, 1, 0, 0, 0, 0, 0), check_8 = c(1, 1, 0, 0, 0, 0, 0, 
0, 0, 0), check_9_1 = c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0), check_9_2 = c(0, 
0, 0, 1, 0, 1, 1, 1, 1, 1), check_12 = c(0, 0, 1, 0, 0, 0, 0, 
0, 0, 0), check_10 = c(1, 0, 0, 0, 1, 1, 1, 1, 1, 0), check_11 = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0), check_6_OR_14 = c(1L, 0L, 1L, 1L, 
1L, NA, 1L, 1L, 1L, 1L), Check_1 = c(0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L), Check_3 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L), Check_54 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), 
    check_2 = c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0), check_56 = c(0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0), check_51 = c(0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0)), row.names = c(NA, 10L), class = "data.frame")

Assuming the data is in principle integer and so can be summed, as your most recent example suggests, converting all variables to integer beforehand closes the gap:

mutate_all(example_df, as.integer) %>%
  summarise(across(.fns = sum)) %>%
  pivot_longer(cols = everything()) %>%
  filter(value > 0)

# A tibble: 8 x 2
  name      value
  <chr>     <int>
1 check_16     12
2 check_7       1
3 check_8       2
4 check_9_1     1
5 check_9_2     6
6 check_12      1
7 check_10      6
8 check_2       1
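
One detail worth checking in that output: `as.integer()` on a factor returns the internal level codes (1, 2, ...), not the labels. That would explain why check_16 sums to 12 over ten rows. check_6_OR_14 is also missing, because its sum is NA. Below is a hedged sketch, assuming factor labels are always "0" or "1", using three columns copied from the `dput()` sample. The name `counts` is illustrative only.

```r
library(dplyr)
library(tidyr)

# Three columns taken from the dput() sample
example_df <- data.frame(
  check_16 = factor(c("0", "0", "0", "1", "0", "0", "0", "0", "0", "1"),
                    levels = c("0", "1")),
  check_7 = c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0),
  check_6_OR_14 = c(1L, 0L, 1L, 1L, 1L, NA, 1L, 1L, 1L, 1L)
)

# Level codes versus labels for the factor column
sum(as.integer(example_df$check_16))                # sums the codes 1 and 2
sum(as.integer(as.character(example_df$check_16)))  # sums the labels 0 and 1

counts <- example_df %>%
  # Convert factors through character so "0"/"1" become 0/1
  mutate(across(where(is.factor), ~ as.integer(as.character(.x)))) %>%
  # Ignore missing values when summing
  summarise(across(everything(), ~ sum(.x, na.rm = TRUE))) %>%
  pivot_longer(cols = everything()) %>%
  filter(value > 0)
```

Whether NA should be ignored or treated as a data problem depends on what the checks mean, so `na.rm = TRUE` is a choice rather than a default to apply blindly.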