cdata to summarise confidence intervals?

Quite new to RStudio. Is it possible to use the cdata package to summarise 95% confidence intervals and then plot them as error bars? Or do I have to use summarySE?
Or, more generally, what is the recommended way to create a summary that includes 95% CIs and then plot it with geom_line/geom_point? I used cdata to do this with SEM error bars, but I'm unsure how it translates to CIs.

Hi, and welcome!

See @aosmith's answer. Here's a reproducible example (a "reprex") of that code:

library(ggplot2)
library(ggeffects)
data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)
mydf <- ggpredict(fit, terms = "c12hour")
ggplot(mydf, aes(x, predicted)) +
  geom_line() +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = .1)

Created on 2019-10-31 by the reprex package (v0.3.0)

Although ggpredict was used there to get confidence limits for the predicted values, confidence intervals for the model coefficients are also available with

confint(fit)
              2.5 %   97.5 %
(Intercept) 89.5005 109.2778
c12hour     -0.2874  -0.2192
neg_c_7     -2.7392  -1.8553
c161sex     -2.8541   4.9369
c172code    -1.9238   3.3605
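
Note that confint() gives intervals for the coefficients, not for the fitted line. As a rough sketch of how a band like the one in the ribbon plot could be built with base R, predict() can return confidence limits for the predicted values. The grid of 50 points and the choice to hold the other predictors at their means are my assumptions for illustration; ggpredict's defaults may differ, so the numbers need not match exactly.

```r
library(ggplot2)
library(ggeffects)  # used here only for the efc example data
data(efc)

fit <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)

# Rows actually used in the fit (lm drops rows with missing values)
mf <- model.frame(fit)

# Vary c12hour over its observed range; hold the other predictors
# at their means (an assumption made for this sketch)
newdata <- data.frame(
  c12hour  = seq(min(mf$c12hour), max(mf$c12hour), length.out = 50),
  neg_c_7  = mean(mf$neg_c_7),
  c161sex  = mean(mf$c161sex),
  c172code = mean(mf$c172code)
)

# predict() returns a matrix with columns fit, lwr and upr
pred <- predict(fit, newdata = newdata, interval = "confidence", level = 0.95)
band <- cbind(newdata, pred)

p_band <- ggplot(band, aes(c12hour, fit)) +
  geom_line() +
  geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = .1)
```

Printing p_band draws the plot; the lwr and upr columns of band are the 95% limits at each value of c12hour.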

For adding geom_point(), what are you looking to plot on the x-axis?

Thank you for your help technocrat!
I am using some example data to produce a graph where x=dose and y=mean toothgrowth
So far I have been able to add error bars showing SEM using this:

ggplot(cdata, aes(x = dose, y = mean, colour = supp)) +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se), colour = "black", width = .1) +
  geom_line() +
  geom_point()

Before this I created a summary table for N, mean, sd, se, min and max but I didn't know if there was a way to include confidence intervals in the summary so I could then plot these as error bars? To create the summary I used this within the plyr package:

cdata <- ddply(ToothGrowth, c("supp", "dose"), summarise,
               N    = length(len),
               mean = mean(len),
               sd   = sd(len),
               se   = sd / sqrt(N),
               min  = min(len),
               max  = max(len))

So, I think I am OK with adding the geom_point() (unless you disagree; I am merely an undergraduate student struggling with this, hence how bad my explanations of my problem are). My problem is adding the 95% CI bars.

Thank you so so much for your help!


There are no "mere" undergraduates when it comes to R! Every learner starts out and improves with experience. You're doing pretty well so far.

I should refer you to the homework guidelines, which set out this site's convention of giving guidance on assigned problems rather than solutions ready to hand in.

Here's a simpler ggplot recipe for error bars, such as what you already have:

# Adapted from Long & Teetor, R Cookbook, 2nd ed. § 10.11 https://rce2.com

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(ggplot2)
library(forcats)
data(airquality)


aq_data <- airquality %>%
  arrange(Month) %>%
  mutate(month_abb = fct_inorder(month.abb[Month]))

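ggplot(aq_data, aes(month_abb, Temp)) +
  # `fun` replaces `fun.y`, which was deprecated in ggplot2 3.3.0
  geom_point(stat = "summary",
             fun = "mean",
             fill = "cornflowerblue") +
  # mean_se() gives the mean plus and minus one standard error
  stat_summary(fun.data = mean_se, geom = "errorbar") +
  labs(title = "Mean Temp by Month",
       x = "",
       y = "Temp (deg. F)")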



Created on 2019-11-01 by the reprex package (v0.3.0)
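
As a hedged variation on that recipe, the error bars can show a 95% confidence interval instead of one standard error by passing a different summary function to stat_summary(). The helper mean_ci() below is a hypothetical name of my own, not part of ggplot2; it computes a t-based interval, which assumes the group means are roughly normally distributed. ggplot2 also ships mean_cl_normal(), which does much the same thing but needs the Hmisc package installed. The sketch uses the `fun` argument, so it assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Hypothetical helper: mean with a t-based confidence interval, returned
# in the y / ymin / ymax layout that stat_summary(fun.data = ...) expects
mean_ci <- function(x, level = 0.95) {
  x <- x[!is.na(x)]
  n <- length(x)
  m <- mean(x)
  half <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  data.frame(y = m, ymin = m - half, ymax = m + half)
}

# Same idea as aq_data above, rebuilt with base R so this runs on its own
aq_ci <- transform(airquality,
                   month_abb = factor(month.abb[Month], levels = month.abb[5:9]))

p_ci <- ggplot(aq_ci, aes(month_abb, Temp)) +
  stat_summary(fun = mean, geom = "point") +
  stat_summary(fun.data = mean_ci, geom = "errorbar", width = 0.2) +
  labs(title = "Mean Temp by Month (95% CI)",
       x = "",
       y = "Temp (deg. F)")
```

Printing p_ci draws the plot. Because the t multiplier is larger than 1, these bars are wider than the mean_se ones.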

Nearby, they also discuss plotting confidence intervals. I can't tell from your plot whether it shows observed values; if so, how are you deriving the confidence intervals? (Wouldn't they necessarily lie inside the error bars?) Or is it a model, such as

fit <- lm(mean ~ dose, data = cdata)  # cdata: the summary table built with ddply above
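
For the observed-values case, here is a minimal sketch of one common approach, not the only one: add a t-based 95% CI half-width to the summary table and use it in geom_errorbar() in place of se. The names tg_summary and ci are my own choices for illustration, and the `.groups` argument assumes dplyr 1.0.0 or later. The interval is mean ± qt(0.975, N - 1) × se, so with N = 10 per group the half-width is about 2.26 standard errors and the bars come out wider than SEM bars.

```r
library(dplyr)
library(ggplot2)

# Summary per supp/dose group, with a t-based 95% CI half-width (ci)
tg_summary <- ToothGrowth %>%
  group_by(supp, dose) %>%
  summarise(
    N    = n(),
    mean = mean(len),
    sd   = sd(len),
    se   = sd / sqrt(N),
    ci   = qt(0.975, df = N - 1) * se,
    .groups = "drop"
  )

# Same plot as with SEM bars, but using ci for the bar limits
p_tg <- ggplot(tg_summary, aes(x = dose, y = mean, colour = supp)) +
  geom_errorbar(aes(ymin = mean - ci, ymax = mean + ci), width = .1) +
  geom_line() +
  geom_point()
```

Printing p_tg draws the plot. Please treat this as guidance on the technique rather than a hand-in answer, and check that a t-based interval suits your data.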