Not enough data to fit regression

Greetings,

I would like to fit a regression line to the grouped data in the graph provided, to see the trend with increasing concentration/dose. Some equations are displayed, but no regression line is actually drawn on the graph. I used the code below and get the warning messages shown after it. I would like to display the equation for each variety along with its R-squared.

library(ggplot2)

library(ggpmisc)
formula1 <- avg_AB ~ Dose
ggplot(data = MeanSE_AB, aes(x = Dose, y = avg_AB)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE, formula = formula1) +
stat_poly_eq(aes(label = paste0("atop(", ..eq.label.., ",", ..rr.label.., ")")),
formula = formula1,
parse = TRUE) +
theme_bw(base_size = 16)

Warning messages:
1: Not enough data to perform fit for group 1; computing mean instead.
2: Not enough data to perform fit for group 2; computing mean instead.
3: Not enough data to perform fit for group 3; computing mean instead.

Perhaps you could show us the data in group 1 of MeanSE_AB?

I did lm() on the first four observations, which worked fine. Check if "group 1" is in fact the first four observations maybe?

Yes, the first four observations are group 1, and so on.
What do you mean by "I did lm()"? Was this done for all groups?

foo
# A tibble: 4 x 4
  Variety  Dose avg_AB    se
  <chr>   <dbl>  <dbl> <dbl>
1 Akwa       25   13.2  5.07
2 Akwa       50   30.8  6.98
3 Akwa       75   24.8  4.9 
4 Akwa      100   43.2  5.93

lm(avg_AB ~ Dose,data = foo)

Call:
lm(formula = avg_AB ~ Dose, data = foo)

Coefficients:
(Intercept)         Dose  
      7.000        0.336  

So it looks to me that you DO have enough data, so the bug is not what it appears to be.
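To check every group rather than just the first, the same lm() call can be repeated per variety. The following is only a sketch in base R: the data frame is rebuilt from the rounded values in the structure() output posted later in this thread, and the names `fits` and `fit_summary` are made up for illustration.

```r
# Data rebuilt from the structure() output posted later in the thread
# (rounded values; Dose is stored as numeric here, which is an assumption
# about how the data should look, not how the asker's data actually was)
MeanSE_AB <- data.frame(
  Variety = rep(c("Akwa", "Anel", "Kwarts", "Selie"), each = 4),
  Dose    = rep(c(25, 50, 75, 100), times = 4),
  avg_AB  = c(13.2, 30.8, 24.8, 43.2, 5, 13.5, 19.2, 33,
              24.8, 30.2, 42, 50, 8.25, 13.8, 20, 33.5)
)

# Fit avg_AB ~ Dose separately for each Variety
fits <- lapply(split(MeanSE_AB, MeanSE_AB$Variety),
               function(d) lm(avg_AB ~ Dose, data = d))

# Collect intercept, slope and R-squared into one table
fit_summary <- do.call(rbind, lapply(names(fits), function(v) {
  data.frame(Variety   = v,
             intercept = unname(coef(fits[[v]])[1]),
             slope     = unname(coef(fits[[v]])[2]),
             r_squared = summary(fits[[v]])$r.squared)
}))
print(fit_summary)
```

If all four fits succeed here, that supports the point that the number of observations per variety is not the problem.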

What I am actually looking for is the lm() equation of each group shown in the graph, together with its R² value, as in the attached example.

Your code looks correct to me, which I understand is not very helpful.

I don't know how to adjust the positions of the regression equations in the first version so I added a faceted version also.

library(ggplot2)
library(ggpmisc)
MeanSE_AB <- read.csv("~/R/Play/Dummy.csv",sep = " ")
formula1 <- avg_AB ~ Dose
ggplot(data = MeanSE_AB, aes(x = Dose, y = avg_AB,color=Variety)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = y~x) +
  stat_poly_eq(aes(label = paste0("atop(", ..eq.label.., ",", ..rr.label.., ")")),
               formula = y~x,
               parse = TRUE) +
  theme_bw(base_size = 16)


ggplot(data = MeanSE_AB, aes(x = Dose, y = avg_AB)) +
  geom_point() +
  facet_wrap(~Variety)+
  geom_smooth(method = "lm", se = FALSE, formula = y~x) +
  stat_poly_eq(aes(label = paste0("atop(", ..eq.label.., ",", ..rr.label.., ")")),
               formula = y~x,
               parse = TRUE) +
  theme_bw(base_size = 16)

Created on 2021-06-18 by the reprex package (v0.3.0)

Hi @FJCC,
Could you please share the data?

@FJCC

Which program did you use to run it? When I run it in RStudio, the regression lines do not show up; I only see the dots.

The data:

structure(list(Variety = c("Akwa", "Akwa", "Akwa", "Akwa", "Anel", 
"Anel", "Anel", "Anel", "Kwarts", "Kwarts", "Kwarts", "Kwarts", 
"Selie", "Selie", "Selie", "Selie"), Dose = c(25L, 50L, 75L, 
100L, 25L, 50L, 75L, 100L, 25L, 50L, 75L, 100L, 25L, 50L, 75L, 
100L), avg_AB = c(13.2, 30.8, 24.8, 43.2, 5, 13.5, 19.2, 33, 
24.8, 30.2, 42, 50, 8.25, 13.8, 20, 33.5), se = c(5.07, 6.98, 
4.9, 5.93, 3.61, 4.19, 2.72, 2.7, 4.9, 2.76, 4.33, 3.25, 3.67, 
2.52, 2.24, 1.84)), class = "data.frame", row.names = c(NA, -16L
))

The code I posted above was run in RStudio. You can just substitute the structure() call that I just posted for the read.csv() call in my code.

Edit: Notice that I changed the formula within geom_smooth and stat_poly_eq.
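On that formula change: as far as I understand it, the formula argument of geom_smooth() is written in terms of the aesthetics x and y, not the column names, because the smoother works on data that has already been mapped through aes(). A minimal sketch using only ggplot2 (no ggpmisc), with the data rebuilt from the structure() call above and `p` as an arbitrary object name:

```r
library(ggplot2)

# Data rebuilt from the structure() output above (rounded values)
MeanSE_AB <- data.frame(
  Variety = rep(c("Akwa", "Anel", "Kwarts", "Selie"), each = 4),
  Dose    = rep(c(25, 50, 75, 100), times = 4),
  avg_AB  = c(13.2, 30.8, 24.8, 43.2, 5, 13.5, 19.2, 33,
              24.8, 30.2, 42, 50, 8.25, 13.8, 20, 33.5)
)

# Mapping colour to Variety makes each variety its own group, so
# geom_smooth() fits one straight line per variety.
# The formula refers to the aesthetics (y ~ x), not to avg_AB ~ Dose.
p <- ggplot(MeanSE_AB, aes(x = Dose, y = avg_AB, colour = Variety)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x) +
  theme_bw(base_size = 16)

print(p)
```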

@FJCC

The problem I am now experiencing is that the graph shows the plotted points but no regression lines. I only get the regression lines when I change Dose to integer or numeric. However, changing Dose to integer or numeric changes the values to 1, 2, 3 and 4, which is not what the graph should show; the values required are the actual doses (25, 75, and so on), as in the graph you constructed.

Please post the output of

dput(MeanSE_AB)

Put a line with three back ticks before and after your pasted output, like this
```
Your output
```

structure(list(Variety = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L), .Label = c("Akwa", "Anel", 
"Kwarts", "Selie"), class = "factor"), Dose = structure(c(1L, 
2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L), .Label = c("25", 
"50", "75", "100"), class = "factor"), avg_AB = c(13.25, 30.75, 
24.75, 43.25, 5, 13.5, 19.25, 33, 24.75, 30.25, 42, 50, 8.25, 
13.75, 20, 33.5), se = c(5.07004508292157, 6.97892234824669, 
4.89806813229169, 5.9273880058107, 3.60555127546399, 4.18756663059996, 
2.72390213795262, 2.69920623252731, 4.89806813229169, 2.76295649104878, 
4.32600112277906, 3.25137333621173, 3.67301938853737, 2.51956628920818, 
2.23606797749979, 1.84197099403252)), row.names = c(NA, -16L), groups = structure(list(
    Variety = structure(1:4, .Label = c("Akwa", "Anel", "Kwarts", 
    "Selie"), class = "factor"), .rows = structure(list(1:4, 
        5:8, 9:12, 13:16), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, -4L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

The Dose column in your data is a factor. You can see this with

class(MeanSE_AB$Dose)

Try running the following code before graphing.

MeanSE_AB$Dose <- as.numeric(as.character(MeanSE_AB$Dose))
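The reason the extra as.character() step matters is that a factor stores integer level codes internally, and as.numeric() applied directly to a factor returns those codes rather than the labels. A small illustration with a made-up vector named `dose` (not the asker's actual column):

```r
# A factor with the same levels as the Dose column in the dput() output
dose <- factor(c("25", "50", "75", "100"),
               levels = c("25", "50", "75", "100"))

# Direct conversion returns the internal level codes
codes <- as.numeric(dose)
print(codes)    # 1 2 3 4

# Converting to character first recovers the labels as numbers
values <- as.numeric(as.character(dose))
print(values)   # 25 50 75 100

# An equivalent idiom: convert the levels once, then index by the factor
values2 <- as.numeric(levels(dose))[dose]
print(values2)  # 25 50 75 100
```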

Thanks a lot :pray: :pray: It's working now.
I had converted Dose with as.numeric(), not as.numeric(as.character()).
Much appreciated.

Hi,

Do you know how I can get the fitted lines to start at, or pass through, the zero point? I would like to reference the fits to the origin.

Thanks

You can force the fits to go through zero by adding + 0 to the fit formulas.

ggplot(data = MeanSE_AB, aes(x = Dose, y = avg_AB)) +
  geom_point() +
  facet_wrap(~Variety)+
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x + 0) +
  stat_poly_eq(aes(label = paste0("atop(", ..eq.label.., ",", ..rr.label.., ")")),
               formula = y ~ x + 0,
               parse = TRUE) +
  theme_bw(base_size = 16)
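To see what + 0 does outside of ggplot, the two formulas can be compared directly with lm(). This sketch uses the four Akwa rows shown earlier in the thread; `fit_free` and `fit_zero` are names chosen here for illustration.

```r
# The four Akwa observations shown earlier in the thread
akwa <- data.frame(Dose   = c(25, 50, 75, 100),
                   avg_AB = c(13.2, 30.8, 24.8, 43.2))

# Ordinary fit: intercept and slope are both estimated
fit_free <- lm(avg_AB ~ Dose, data = akwa)

# "+ 0" drops the intercept, so the line is forced through the origin
fit_zero <- lm(avg_AB ~ Dose + 0, data = akwa)

print(coef(fit_free))   # two coefficients: (Intercept) and Dose
print(coef(fit_zero))   # one coefficient: Dose only
```

One caveat worth keeping in mind: for a model without an intercept, summary() computes R-squared relative to zero rather than to the mean of the response, so the R-squared of a through-origin fit is not directly comparable with that of the ordinary fit.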