Problems using ggplot2

Hello everyone, I'm very new to R coding and I've been doing exercises to help myself learn the language.

I'm trying to use ggplot2 to help plot models of some data and I'm having some issues.
This is my data, x1 is the input and y1 the output:
y1=c(10.05,1.5,-1.234,0.02,8.03)
x1=c(-3,-1,0,1,3)
When I model the data with a 1st degree polynomial, everything works fine until I plot it with ggplot, where I get a huge zigzag. I'll paste my code below; if someone could help me out, that would be much appreciated!

data1<-data.frame(y1,x1)
Fit1<-lm(y1~x1,data = data1)
yhat1<-predict(Fit1,data1)
Then when I try to plot the fit using ggplot:
ggplot(data1, aes(x=x1, y=y1)) +geom_point(size=3, shape=19,color="red") + geom_smooth(method="lm",formula = y1~x1, data=data1,se=FALSE,linetype="solid", color = "blue", size = 3)

I get this warning:
Warning message:
'newdata' had 80 rows but variables found have 5 rows

Along with the zigzag.
Thank you

The only issue with your code is in the call to geom_smooth(). The formula argument should be y ~ x instead of y1 ~ x1, because inside geom_smooth() the formula refers to the x and y aesthetics rather than to your column names. Here is the full reproducible code:

library(ggplot2)

y1 <- c(10.05, 1.5, -1.234, 0.02, 8.03)
x1 <- c(-3, -1, 0, 1, 3)

data1 <- data.frame(y1, x1)

Fit1 <- lm(y1 ~ x1, data = data1)

yhat1 <- predict(Fit1, data1)

ggplot(data = data1, aes(x = x1, y = y1)) +
	geom_point(size = 3, shape = 19, color = "red") +
	geom_smooth(method = "lm", formula = y ~ x, data = data1, se = FALSE, linetype = "solid", color = "blue", size = 3)

EDIT: As mentioned by @Z3tt, the geom_smooth() function does the fitting for you.
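
For anyone curious where the warning comes from, here is a rough base R sketch that reproduces the same message without ggplot2. It assumes, as a simplification, that geom_smooth() fits the model on a data frame whose columns are named after the aesthetics (x and y) and then predicts on an 80-point grid; the names layer_df and grid are made up for illustration.

```r
y1 <- c(10.05, 1.5, -1.234, 0.02, 8.03)
x1 <- c(-3, -1, 0, 1, 3)

# Stand-in for the layer's data: columns are named x and y, not x1 and y1
layer_df <- data.frame(x = x1, y = y1)

# y1 and x1 are not columns of layer_df, so lm() falls back to the
# global vectors of length 5
bad_fit <- lm(y1 ~ x1, data = layer_df)

# 80 evenly spaced x values (80 matches the row count in the warning)
grid <- data.frame(x = seq(min(x1), max(x1), length.out = 80))

# grid has no x1 column, so predict() reuses the 5 global values
# and warns that 'newdata' had 80 rows
bad_pred <- predict(bad_fit, newdata = grid)
length(bad_pred)

# With the aesthetic names in the formula, one prediction per grid point
good_fit <- lm(y ~ x, data = layer_df)
good_pred <- predict(good_fit, newdata = grid)
length(good_pred)
```

The mismatch between the few predictions and the 80 grid positions would be consistent with the zigzag line in the original plot.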

Hi, ggplot's geom_smooth() does the fitting for you, so in your case there is no need to fit a linear model first or to specify a formula explicitly. Alternatively, you can add the line manually, using geom_line() instead of geom_smooth().

library(tidyverse)

y1=c(10.05,1.5,-1.234,0.02,8.03)
x1=c(-3,-1,0,1,3)
data1<-data.frame(y1,x1)
ggplot(data1, aes(x = x1, y = y1)) +
  geom_point(size = 3, shape = 19, color = "red") +   
  geom_smooth(method = "lm", se = FALSE, linetype = "solid", color = "blue", size = 3)
#> `geom_smooth()` using formula 'y ~ x'

Fit1<-lm(y1~x1,data = data1)
yhat1<-predict(Fit1,data1)
yhat2 <- tibble(x1 = x1, yhat1 = yhat1)
ggplot(data1, aes(x = x1, y = y1)) + 
  geom_point(size = 3, shape = 19, color = "red") + 
  geom_line(data = yhat2, aes(x = x1, y = yhat1), color = "blue", size = 3)

Created on 2020-07-16 by the reprex package (v0.3.0)
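
If you want to convince yourself that both approaches draw the same line, a quick check along these lines might help. It is only a sketch: it pulls the values computed by geom_smooth() out of the plot with layer_data() and compares them with predictions from an explicit lm() fit at the same x positions. The names smooth_line and manual are made up for this example.

```r
library(ggplot2)

y1 <- c(10.05, 1.5, -1.234, 0.02, 8.03)
x1 <- c(-3, -1, 0, 1, 3)
data1 <- data.frame(y1, x1)

# A plot with only the smoothing layer, so it is layer 1
p <- ggplot(data1, aes(x = x1, y = y1)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# The x/y values geom_smooth() computed for its line
smooth_line <- layer_data(p, 1)

# The same line from an explicit fit, evaluated at the same x positions
Fit1 <- lm(y1 ~ x1, data = data1)
manual <- predict(Fit1, newdata = data.frame(x1 = smooth_line$x))

# Compare the two sets of fitted values
all.equal(smooth_line$y, unname(manual))
```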

Thank you and @Z3tt so much for the help! So my problem was that in y ~ x, the y and x don't refer to my own variable names (y1 and x1 in my example) but to whatever is on the y axis and x axis? Am I understanding this correctly?

Yes, that is correct. You should just use y and x in the formula since they represent the variables that you mapped to them. And in all honesty, @Z3tt's answer is more thorough than mine. So you should mark his as the solution as it is likely to help future readers more.
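
The same rule applies if you later move on to higher-degree polynomials. As a sketch, assuming a quadratic fit is the next thing you want to try, the formula still uses x and y and only the right-hand side changes:

```r
library(ggplot2)

y1 <- c(10.05, 1.5, -1.234, 0.02, 8.03)
x1 <- c(-3, -1, 0, 1, 3)
data1 <- data.frame(y1, x1)

# x and y in the formula stand for the aesthetics mapped in aes(),
# so the column names x1 and y1 never appear in it
p <- ggplot(data1, aes(x = x1, y = y1)) +
  geom_point(size = 3, shape = 19, color = "red") +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE, color = "blue")

p
```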

Haha, I didn't even realize that we posted at almost the same moment. Both work, so I'm fine with yours being the solution as well. I'm going to keep my answer so people can look at both. Cheers and happy coding!
