tbl_summary reporting numeric as categorical

Hi,

This might be a silly question. I am trying to use tbl_summary to create a summary table for my data, but the numeric variables are being summarized as categorical, and I don't know why. Below is the code I used.

library(dplyr)
library(gtsummary)

test <- data.frame("Gender" = c("Female", "Male", "Female", "Female"),
                   "vegetable_score" = c(1.214138, 5.552633, 2.757568, 1.209925))
test %>%
  tbl_summary(
    by = Gender,
    type = all_continuous() ~ "continuous",
    statistic = list(all_continuous() ~ "{mean} ({sd})")) %>%
  modify_header(label ~ "Variable") %>%
  modify_spanning_header(c("stat_1", "stat_2") ~ "Gender")

[image: screenshot of the resulting table]

Can anyone shed some light? Thanks.

Polly

Statistics presented: n (%)

sd(5.552633) evaluates to NA, which leaves the statistic shown in the output as the only one available.
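
The NA itself is easy to confirm in base R. The sketch below rebuilds the four-row data frame from the question and computes the standard deviation per group; the name `sd_by_gender` is only illustrative. Whether this NA is what drives the categorical display is a separate question, which later replies in the thread address.

```r
# Four-row data frame from the original question
test <- data.frame(
  Gender = c("Female", "Male", "Female", "Female"),
  vegetable_score = c(1.214138, 5.552633, 2.757568, 1.209925)
)

# Standard deviation of vegetable_score within each Gender group
sd_by_gender <- tapply(test$vegetable_score, test$Gender, sd)
print(sd_by_gender)

# The Male group has a single observation, so its sd is NA
is.na(sd_by_gender[["Male"]])
```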

The reprex below runs the same code on the built-in trial data set to show that it works when each group has an adequate number of observations of the continuous variable.

suppressPackageStartupMessages({
  library(dplyr)
  library(gtsummary)})

retrial <- trial %>% select(trt,marker)

retrial %>%
  tbl_summary(
    by = trt,
    type = all_continuous() ~ "continuous",
    statistic = list(all_continuous() ~ "{mean} ({sd})")) %>%
  modify_header(label ~ "Variable") %>%
  modify_spanning_header(c("stat_1", "stat_2") ~ "Gender")

Created on 2020-10-06 by the reprex package (v0.3.0.9001)

(output suppressed)

Hi technocrat,

Thanks for the reply.

I tried increasing the number of observations, but that doesn't seem to solve the problem. What's the minimum number of observations required for each group? Or am I misunderstanding the problem?

test <- data.frame("Gender" = c("Female", "Male", "Female", "Male", "Male", "Male", "Female", "Female", "Female"),
                   "vegetable_score" = c(1.2, 5.5, 2.7, 1.2, 2.5, 3.6, 4.7, 5.8, 1.3))
test %>%
  tbl_summary(
    by = Gender,
    statistic = list(all_continuous() ~ "{mean} ({sd})")) %>%
  modify_header(label ~ "Variable") %>%
  modify_spanning_header(c("stat_1", "stat_2") ~ "Gender") %>%
  add_overall() %>%
  add_n()

Start from this, which names the variable and sets its summary type explicitly:

test %>% tbl_summary(type = c(vegetable_score = "continuous"))
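
Combined with the grouping and statistic from the question, the same idea might look like the sketch below. It uses the nine-row data frame from the follow-up post and the formula form of `type`, which is an alternative spelling rather than the exact call above; it assumes a reasonably recent gtsummary, and the object name `tbl` is only illustrative.

```r
library(gtsummary)

# Nine-row data frame from the follow-up post
test <- data.frame(
  Gender = c("Female", "Male", "Female", "Male", "Male",
             "Male", "Female", "Female", "Female"),
  vegetable_score = c(1.2, 5.5, 2.7, 1.2, 2.5, 3.6, 4.7, 5.8, 1.3)
)

tbl <- tbl_summary(
  test,
  by = Gender,
  # Name the variable directly; all_continuous() would not select it,
  # because it has not been classified as continuous yet
  type = vegetable_score ~ "continuous",
  # Now that the variable is continuous, this statistic applies to it
  statistic = all_continuous() ~ "{mean} ({sd})"
)

tbl
```

The point of naming `vegetable_score` in `type` is that `all_continuous()` only matches variables already treated as continuous, so it cannot be used to change a variable that was guessed to be categorical.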

I added a few more observations. Now it works. Thank you!

The tbl_summary function makes its best guess at how to summarize each variable. When a variable has few unique levels, it guesses categorical.
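
To see which side of that guess a variable is likely to fall on, you can count its distinct values. The sketch below does this in base R for both data frames posted in this thread. The helper `n_unique` is hypothetical, and the cutoff of 10 unique values is an assumption based on my reading of the gtsummary documentation, so check it against your installed version.

```r
# vegetable_score values from the two data frames posted in this thread
small  <- c(1.214138, 5.552633, 2.757568, 1.209925)
larger <- c(1.2, 5.5, 2.7, 1.2, 2.5, 3.6, 4.7, 5.8, 1.3)

# Hypothetical helper: number of distinct non-missing values
n_unique <- function(x) length(unique(na.omit(x)))

n_unique(small)   # 4 distinct values
n_unique(larger)  # 8 distinct values (1.2 appears twice)

# Assumed default cutoff; not taken from this thread
assumed_cutoff <- 10
n_unique(small) < assumed_cutoff
n_unique(larger) < assumed_cutoff
```

Under that assumption both vectors would be guessed as categorical, which matches what the first two attempts in the thread showed.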

@technocrat's suggestion to use the type= argument to change the default summary type is perfect 👍
