Error in dplyr summarise by groups

Hi,

When I try to summarise median values by a categorical variable (here, cognitive_status) using the dplyr package, I get an error for some variables (for most of the variables it works fine!).

I'd appreciate some help,

Thanks!

dat %>%
group_by(cognitive_status) %>%
summarize(medn = median(dt_alph_gs, na.rm = T))

Error: Problem with summarise() input medn.
x Input medn must return compatible vectors across groups
ℹ Result type for group 1 (cognitive_status = "No cognitive impairment"): .
ℹ Result type for group 2 (cognitive_status = "MCI"): .
ℹ Input medn is median(dt_alph_gs, na.rm = T).

To help us understand your error, could you share a small sample of the data in question?
Some code like this should work:

dat %>% 
select(cognitive_status, dt_alph_gs) %>% 
sample_n(size=10,  replace = FALSE) %>% 
dput()
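
In case it helps to see why dput() is useful here: it prints R code that rebuilds the object, so anyone can paste it into their own session. A minimal sketch in base R, where `toy`, `txt` and `rebuilt` are made-up names standing in for your real data:

```r
# Hypothetical toy data frame standing in for `dat`.
toy <- data.frame(group = c("a", "b", "a"), value = c(1.5, NA, 3))

# dput() prints R code that recreates the object; capture it as text here
# instead of copying it from the console.
txt <- paste(capture.output(dput(toy)), collapse = "\n")

# Evaluating that text rebuilds the data frame, which is what a reader
# does when they paste your dput() output into their own session.
rebuilt <- eval(parse(text = txt))
all.equal(toy, rebuilt)
```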

Hi,

Thanks so much for your reply.

This is the output that I get:

dat %>%
  select(cognitive_status, dt_alph_gs) %>%
  sample_n(size = 10, replace = FALSE) %>%
  dput()

structure(list(cognitive_status = structure(c(2L, 2L, 1L, 2L,
1L, 1L, 2L, 1L, 2L, 2L), .Label = c("No cognitive impairment",
"MCI"), label = "Cognitive status group", class = c("labelled",
"factor")), dt_alph_gs = structure(c(0.516795865633075, 0.225903614457831,
0.575815738963532, 0.409976084728391, 0.471883602044829, 0.430416068866571,
NA, 0.556070435588508, 0.614439324116744, 0.316122233930453), label = "DT alphabet time mean", class = c("labelled",
"numeric"))), row.names = c(NA, -10L), class = "data.frame")

Not sure if this is helpful (sorry, I'm new to R...)

Thanks,
Cristina

That's a good step. You can improve how you share code by formatting it for the forum: use three backticks to open and close a code block.

```
like this
```

I ran the following, based on what you provided:

dat <- structure(list(cognitive_status = structure(c(
  2L, 2L, 1L, 2L,
  1L, 1L, 2L, 1L, 2L, 2L
), .Label = c(
  "No cognitive impairment",
  "MCI"
), label = "Cognitive status group", class = c(
  "labelled",
  "factor"
)), dt_alph_gs = structure(c(
  0.516795865633075, 0.225903614457831,
  0.575815738963532, 0.409976084728391, 0.471883602044829, 0.430416068866571,
  NA, 0.556070435588508, 0.614439324116744, 0.316122233930453
), label = "DT alphabet time mean", class = c(
  "labelled",
  "numeric"
))), row.names = c(NA, -10L), class = "data.frame")

library(tidyverse)

dat %>%
  group_by(cognitive_status) %>%
  summarize(medn = median(dt_alph_gs, na.rm = T))

This did not give an error, but a result:

# A tibble: 2 x 2
  cognitive_status         medn
  <fct>                   <dbl>
1 No cognitive impairment 0.514
2 MCI                     0.410

When you run what I ran, do you get this same result, or an error?

Yes, I get the same error as before.

Thanks.
Can you tell me what result you get when you run

packageVersion("dplyr")

(I'm thinking it might not hurt to upgrade your dplyr version if it's somewhat old.)
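
Separately, your dput() output shows that dt_alph_gs has the class c("labelled", "numeric") rather than plain numeric. I believe that kind of class is usually added by a labelling package such as Hmisc. I can't confirm that this is what triggers the error, but if median() hands back a "labelled" value for some groups and a plain number for others, summarise() would see incompatible types. Under that assumption, stripping the class with as.numeric() inside summarise() may be worth a try. A minimal sketch with made-up data (`toy` and its values are hypothetical, only the column names match yours):

```r
library(dplyr)

# Made-up data that mimics the structure in the dput() output:
# a numeric column that also carries a "labelled" class and a label.
toy <- data.frame(
  cognitive_status = factor(c("MCI", "MCI", "No cognitive impairment",
                              "MCI", "No cognitive impairment")),
  dt_alph_gs = c(0.52, 0.23, 0.58, NA, 0.47)
)
attr(toy$dt_alph_gs, "label") <- "DT alphabet time mean"
class(toy$dt_alph_gs) <- c("labelled", "numeric")

# Inspect the column class before summarising.
class(toy$dt_alph_gs)

# as.numeric() drops the extra class and attributes, so median()
# works on a plain double vector in every group.
result <- toy %>%
  group_by(cognitive_status) %>%
  summarise(medn = median(as.numeric(dt_alph_gs), na.rm = TRUE))

result
```

If this runs cleanly on your real data, that would point at the "labelled" class rather than at the values themselves.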
