Summarizing mean, SD and CI of many variable

I have a data set with 15 numerical variabel (SP_1, SP_2, SP_3, SP_4.....SP_15)
Each variable has 20 observations each.

I would like to create a table with; variable, mean, SD and confidence interval.

I think you want dplyr::group_by and dplyr::summarize, along with tidyr::pivot_longer. If you need to adjust the confidence level, you can supply the conf.level argument to t.test (by default it is set to 95%).

library(dplyr)
library(tidyr)

# Creating a dataset like yours with 15 variables and 20 observations
df <- data.frame(
    replicate(
        15,
        runif(20)
    )
)

df %>% 
    pivot_longer(
        cols = everything()
    ) %>% 
    group_by(name) %>% 
    summarize(
        'mean' = mean(value),
        'sd' = sd(value),
        'lower_ci' = t.test(value, mu = mean(value))$conf.int[1],
        'upper_ci' = t.test(value, mu = mean(value))$conf.int[2]
    )


2 Likes

I get the error below when I run this

Error in summarize(., mean = mean(value), sd = sd(value), lower_ci = t.test(value, :
argument "by" is missing, with no default

Please have a look on crosstable package: Introduction to Crosstable

The posted code works for me. Make sure that you have a new R session running--googling that error seems to imply that the error is because you have other packages loaded that also have a function called summarise.

The gtsummary package might also serve your needs (Tutorial: tbl_summary • gtsummary).

1 Like

The code returns values for everything while I would want to get mean, sd, se and CI or each variable separately.

e.g
mean SD se CI
Var1 x
Var 2 x
Var 3 x

Not sure what you mean by:

each variable separately

You could then filter the name column to get at the variable you want. But the above code does in fact calculate statistics only with respect to the individual variables, not on the whole dataset. You can confirm it by comparing the data.frame and list results below:

library(dplyr)
library(tidyr)
    # Creating a dataset like yours with 15 variables and 20 observations
    df <- data.frame(
        replicate(
            15,
            runif(20)
        )
    )
    
    df %>% 
        pivot_longer(
            cols = everything()
        ) %>% 
        group_by(name) %>% 
        summarize(
            'mean' = mean(value),
            'sd' = sd(value),
            'lower_ci' = t.test(value, mu = mean(value))$conf.int[1],
            'upper_ci' = t.test(value, mu = mean(value))$conf.int[2]
        )
#> # A tibble: 15 x 5
#>    name   mean    sd lower_ci upper_ci
#>  * <chr> <dbl> <dbl>    <dbl>    <dbl>
#>  1 X1    0.559 0.360    0.390    0.727
#>  2 X10   0.464 0.336    0.307    0.622
#>  3 X11   0.492 0.283    0.360    0.625
#>  4 X12   0.370 0.275    0.242    0.499
#>  5 X13   0.485 0.289    0.349    0.620
#>  6 X14   0.482 0.285    0.349    0.616
#>  7 X15   0.571 0.250    0.454    0.688
#>  8 X2    0.564 0.270    0.437    0.690
#>  9 X3    0.525 0.285    0.392    0.658
#> 10 X4    0.488 0.292    0.351    0.625
#> 11 X5    0.515 0.278    0.385    0.645
#> 12 X6    0.476 0.293    0.339    0.614
#> 13 X7    0.514 0.341    0.354    0.674
#> 14 X8    0.509 0.286    0.375    0.643
#> 15 X9    0.598 0.303    0.457    0.740
    
    # And now confirm that the variables pivoted into the 'name' column have only the values calculated according to their group
    
    nms <- paste0('X', 1:15)
    
    lapply(
        nms,
        function(x) {
            tmp <- df[[x]]
            list(
                'mean' = mean(tmp),
                'sd' = sd(tmp),
                'lower_ci' = t.test(tmp, mu = mean(tmp))$conf.int[1],
                'upper_ci' = t.test(tmp, mu = mean(tmp))$conf.int[2]
            )
        }
    )
#> [[1]]
#> [[1]]$mean
#> [1] 0.5586078
#> 
#> [[1]]$sd
#> [1] 0.3600182
#> 
#> [[1]]$lower_ci
#> [1] 0.3901142
#> 
#> [[1]]$upper_ci
#> [1] 0.7271015
#> 
#> 
#> [[2]]
#> [[2]]$mean
#> [1] 0.5635041
#> 
#> [[2]]$sd
#> [1] 0.2698614
#> 
#> [[2]]$lower_ci
#> [1] 0.437205
#> 
#> [[2]]$upper_ci
#> [1] 0.6898031
#> 
#> 
#> [[3]]
#> [[3]]$mean
#> [1] 0.5252184
#> 
#> [[3]]$sd
#> [1] 0.2845411
#> 
#> [[3]]$lower_ci
#> [1] 0.392049
#> 
#> [[3]]$upper_ci
#> [1] 0.6583877
#> 
#> 
#> [[4]]
#> [[4]]$mean
#> [1] 0.4880604
#> 
#> [[4]]$sd
#> [1] 0.2922919
#> 
#> [[4]]$lower_ci
#> [1] 0.3512636
#> 
#> [[4]]$upper_ci
#> [1] 0.6248572
#> 
#> 
#> [[5]]
#> [[5]]$mean
#> [1] 0.5150481
#> 
#> [[5]]$sd
#> [1] 0.2784126
#> 
#> [[5]]$lower_ci
#> [1] 0.384747
#> 
#> [[5]]$upper_ci
#> [1] 0.6453492
#> 
#> 
#> [[6]]
#> [[6]]$mean
#> [1] 0.4764728
#> 
#> [[6]]$sd
#> [1] 0.2928978
#> 
#> [[6]]$lower_ci
#> [1] 0.3393924
#> 
#> [[6]]$upper_ci
#> [1] 0.6135532
#> 
#> 
#> [[7]]
#> [[7]]$mean
#> [1] 0.5142307
#> 
#> [[7]]$sd
#> [1] 0.3413002
#> 
#> [[7]]$lower_ci
#> [1] 0.3544972
#> 
#> [[7]]$upper_ci
#> [1] 0.6739641
#> 
#> 
#> [[8]]
#> [[8]]$mean
#> [1] 0.5089112
#> 
#> [[8]]$sd
#> [1] 0.2859729
#> 
#> [[8]]$lower_ci
#> [1] 0.3750718
#> 
#> [[8]]$upper_ci
#> [1] 0.6427507
#> 
#> 
#> [[9]]
#> [[9]]$mean
#> [1] 0.59822
#> 
#> [[9]]$sd
#> [1] 0.3025874
#> 
#> [[9]]$lower_ci
#> [1] 0.4566047
#> 
#> [[9]]$upper_ci
#> [1] 0.7398352
#> 
#> 
#> [[10]]
#> [[10]]$mean
#> [1] 0.4644698
#> 
#> [[10]]$sd
#> [1] 0.3363694
#> 
#> [[10]]$lower_ci
#> [1] 0.307044
#> 
#> [[10]]$upper_ci
#> [1] 0.6218955
#> 
#> 
#> [[11]]
#> [[11]]$mean
#> [1] 0.4924063
#> 
#> [[11]]$sd
#> [1] 0.2828683
#> 
#> [[11]]$lower_ci
#> [1] 0.3600199
#> 
#> [[11]]$upper_ci
#> [1] 0.6247928
#> 
#> 
#> [[12]]
#> [[12]]$mean
#> [1] 0.370358
#> 
#> [[12]]$sd
#> [1] 0.2747296
#> 
#> [[12]]$lower_ci
#> [1] 0.2417806
#> 
#> [[12]]$upper_ci
#> [1] 0.4989354
#> 
#> 
#> [[13]]
#> [[13]]$mean
#> [1] 0.4845037
#> 
#> [[13]]$sd
#> [1] 0.2890672
#> 
#> [[13]]$lower_ci
#> [1] 0.3492161
#> 
#> [[13]]$upper_ci
#> [1] 0.6197913
#> 
#> 
#> [[14]]
#> [[14]]$mean
#> [1] 0.4822205
#> 
#> [[14]]$sd
#> [1] 0.2852961
#> 
#> [[14]]$lower_ci
#> [1] 0.3486978
#> 
#> [[14]]$upper_ci
#> [1] 0.6157431
#> 
#> 
#> [[15]]
#> [[15]]$mean
#> [1] 0.570778
#> 
#> [[15]]$sd
#> [1] 0.2499468
#> 
#> [[15]]$lower_ci
#> [1] 0.4537992
#> 
#> [[15]]$upper_ci
#> [1] 0.6877567

Created on 2022-03-01 by the reprex package (v1.0.0)

1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.