`geom_smooth()` using formula 'y ~ x' Error: Continuous value supplied to discrete scale

I'm stuck and need help.
When I run this code,

PIA1a<-ggplot(QNR, aes(x=N2Age, y=sources_Metric, fill=N3Sex, colour=N3Sex))+
  labs(title = "sources of water by sex", x=" ", y="sources (%)")+
  scale_color_discrete(name="Sex") + theme(legend.title = element_text(size = 7))+
  geom_point(aes(size=hygieneSanitation_Metric)) + geom_smooth(method=lm) + theme_bw()+
  theme(legend.position="bottom", plot.title = element_text(hjust = 0.5))+
  guides(fill = "none", color= "none", size = "none") + scale_size(range = c(1,3))

and then plot it with

plot_grid(PIA1a)

it returns this error:
geom_smooth() using formula 'y ~ x'
Error: Continuous value supplied to discrete scale

Can you please post the result of

str(QNR)

str(QNR)
'data.frame': 200 obs. of 61 variables:
$ N2Age : int 27 45 19 17 32 35 27 23 20 30 ...
$ N3Sex : int 1 2 1 1 2 1 1 1 1 2 ...
$ N4Education : int 2 3 2 3 3 3 2 3 2 3 ...
$ occupation : int 2 4 1 1 2 2 3 3 1 2 ...
$ N5MaritalStatus : int 1 1 2 2 1 2 1 1 2 1 ...
$ N6Religion : int 1 1 1 1 1 1 1 1 1 1 ...
$ N7Ethnicity : int 1 1 1 1 1 1 1 1 2 1 ...
$ N8Site : int 8 1 2 3 3 4 5 8 8 6 ...
$ N9QS1 : int 1 3 1 2 2 3 6 6 1 3 ...
$ N10QS2 : int 3 3 3 3 5 3 3 5 5 3 ...
$ N11QS3 : int 2 2 2 2 2 1 1 1 1 2 ...
$ N12QS4 : int NA NA NA NA NA 1 1 1 1 NA ...
$ N13QS5 : int 2 2 2 2 2 2 2 5 2 2 ...
$ N14QHS10 : int 2 2 2 2 2 1 1 1 1 2 ...
$ N15QHS11 : int 2 2 2 2 2 1 1 1 1 2 ...
$ N16QHS12 : int 2 2 2 2 2 1 1 1 1 2 ...
$ N17QHS13 : int 2 2 1 1 2 1 1 1 1 2 ...
$ N18QHS14 : int 2 2 2 2 2 1 1 2 1 2 ...
$ N19QHS15 : int 2 2 3 3 1 2 3 2 2 2 ...
$ N20QA22 : int 2 2 2 2 2 2 2 2 1 2 ...
$ N21QA23 : int 2 2 1 2 2 1 1 1 1 2 ...
$ N22QA24 : int 1 2 1 2 2 1 1 2 1 2 ...
$ N23QA25 : int 1 2 1 2 2 1 1 1 1 2 ...
$ N24QEA16 : int 2 2 2 2 2 1 2 2 1 2 ...
$ N25QEA17 : int 2 2 1 2 1 1 2 1 1 2 ...
$ N26QEA18 : int 5 5 5 5 4 1 5 5 5 5 ...
$ N27QEA19 : int 4 3 5 5 2 1 5 1 5 3 ...
$ N28QEA20 : int 2 2 2 2 2 2 2 2 1 2 ...
$ N29QEA21 : int 2 2 2 2 2 2 2 2 2 2 ...
$ N30QScw1 : int 2 1 2 2 2 1 2 2 2 1 ...
$ N31QScw2 : int 1 1 1 1 2 1 1 2 2 1 ...
$ N32QScw3 : int 2 2 2 2 2 1 1 1 1 2 ...
$ N33QScw4 : int 2 2 2 2 2 2 2 2 2 2 ...
$ N34QScw5 : int 1 1 1 1 1 1 1 2 1 1 ...
$ N35QHScw10 : int 2 2 2 2 2 1 1 1 1 2 ...
$ N36QHScw11 : int 2 2 2 2 2 1 1 1 1 2 ...
$ N37QHScw12 : int 2 2 2 2 2 1 1 1 1 2 ...
$ N38QHScw13 : int 2 2 1 1 2 1 1 1 1 2 ...
$ N39QHScw14 : int 2 2 2 2 2 1 1 2 1 2 ...
$ N40QHScw15 : int 2 2 2 2 1 2 2 2 2 2 ...
$ N41QAcw22 : int 1 1 1 1 1 1 1 1 2 1 ...
$ N42QAcw23 : int 1 1 2 1 1 2 2 2 2 1 ...
$ N43QAcw24 : int 2 1 2 1 1 2 2 1 2 1 ...
$ N44QAcw25 : int 2 1 2 1 1 2 2 2 2 1 ...
$ N45QEA16 : int 2 2 2 2 2 1 2 2 1 2 ...
$ N46QEA17 : int 2 2 1 2 1 1 2 1 1 2 ...
$ N47QEA18 : int 5 5 5 5 4 1 5 5 5 5 ...
$ N48QEA19 : int 4 3 5 5 2 1 5 1 5 3 ...
$ N49QEA20 : int 2 2 2 2 2 2 2 2 1 2 ...
$ N50QEA21 : int 2 2 2 2 2 2 2 2 2 2 ...
$ N51Qwbd6 : int 1 2 2 1 2 1 1 1 2 3 ...
$ N52Qwbd7 : int 1 2 2 2 2 2 2 2 2 2 ...
$ N53QI8 : int 1 1 1 1 2 2 2 2 1 2 ...
$ N54QI9 : int 2 3 3 3 3 NA NA NA 3 NA ...
$ N55QP26 : int 2 2 2 2 2 2 2 2 2 2 ...
$ N56QP27 : int 2 2 2 2 2 2 2 2 1 2 ...
$ N57QP28 : int 2 2 2 2 2 2 2 2 1 2 ...
$ sources_Metric : num 0 0 0 0 0 0 0 0 0 0 ...
$ hygieneSanitation_Metric : num 0 0 0 0 0 0 0 0 0 0 ...
$ attitudeWASH_Metric : num 0 0 0 0 0 0 0 0 0 0 ...
$ sources_binary : num 0 0 0 0 0 0 0 0 0 0 ...

There are more columns than I expected, so the output of str() was not as useful as I hoped. Let's try a simplified example. Please make a new version of your data with just the columns needed for the plot and just enough rows to make a simple plot using your ggplot command. Ten or twenty rows should be plenty; please do not include more than that. Call that simplified data frame QNR2. Please post the output of

dput(QNR2)

On the line just before and just after the output, place three back ticks, ```, so that it is formatted as code.

OK, thank you.
This is part of my research work. I hope reducing the rows will not affect the outcome.
Also, could you share your email address? I would like to have some private discussions.
Thank you.

My request for a reduced data set is just to debug the problem. Once the plot works for a small data set it will work for the larger set and you should definitely use the whole set for your final result.

I do not give out my email address. You can send a private message on the forum, though I prefer to have discussions in the forum so that others can learn from the exchange.

OK. I am very new to R. I have been trying to make a small data set of 20 rows from QNR with the columns N2Age, N3Sex, sources_Metric, hygieneSanitation_Metric, and attitudeWASH_Metric, but I have failed. It gives me
Error in [.data.frame(QNR, , c(1, 2, 69, 70, 71), ) :
undefined columns selected

Try

QNR2 <- QNR[1:20, c("N2Age", "N3Sex", "sources_Metric", "hygieneSanitation_Metric", "attitudeWASH_Metric")]
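
A note on why the earlier attempt failed: str(QNR) reports 61 variables, so column positions 69 to 71 do not exist, and that is what "undefined columns selected" means. Selecting by name avoids having to count columns. Here is a minimal sketch using a made-up data frame (toy and its columns a, b, c are hypothetical stand-ins, not part of QNR):

```r
# Toy data frame standing in for QNR (hypothetical columns)
toy <- data.frame(a = 1:5, b = 6:10, c = 11:15)

ncol(toy)  # valid column positions run from 1 to ncol(toy)

# Asking for a position past the last column fails;
# tryCatch() keeps the error message as a string instead of stopping
res <- tryCatch(toy[1:3, c(1, 4)], error = function(e) conditionMessage(e))
res

# Selecting by name does not depend on positions
small <- toy[1:3, c("a", "c")]
small

# match() shows where the named columns actually sit
match(c("a", "c"), names(toy))
```

If you do want to use positions, match() against names(QNR) is a safer way to find them than counting by eye.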

Could you kindly send me a private message here? I don't know how to do it. Thank you.

Thank you so much.
Here is the new data set.
'data.frame': 20 obs. of 5 variables:
$ N2Age : int 27 45 19 17 32 35 27 23 20 30 ...
$ N3Sex : int 1 2 1 1 2 1 1 1 1 2 ...
$ sources_Metric : num 0 0 0 0 0 0 0 0 0 0 ...
$ hygieneSanitation_Metric : num 0 0 0 0 0 0 0 0 0 0 ...
$ attitudeWASH_Metric : num 0 0 0 0 0 0 0 0 0 0 ...

I want to plot
PIA1a<-ggplot(QNR, aes(x=N2Age, y=sources_Metric, fill=N3Sex, colour=N3Sex))+
labs(title = "sources of water by sex", x=" ", y="sources (%)")+
scale_color_discrete(name="Sex") + theme(legend.title = element_text(size = 7))+
geom_point(aes(size=hygieneSanitation_Metric)) + geom_smooth(method=lm) + theme_bw()+
theme(legend.position="bottom", plot.title = element_text(hjust = 0.5))+
guides(fill = "none", color= "none", size = "none") + scale_size(range = c(1,3))

PIA1b<-ggplot(QNR, aes(x=N2Age, y=sources_Metric, fill=N3Sex, colour=N3Sex)) +
labs(title = "", x="Age in years", y="sources (%)") +
scale_fill_discrete(name="Sex") + theme(legend.title = element_text(size = 6)) +
geom_point(aes(size=attitudeWASH_Metric)) + geom_smooth(method=lm) + theme_bw()+
theme(legend.position="bottom",plot.title = element_text(hjust = 0.5)) +
guides(color = "none", size = "none",text.font=2) + scale_size(range = c(1,3)) +
theme(legend.title = element_text(size = 9)) + theme(legend.text = element_text(size = 9))+
theme(legend.key.size = unit(0.5, "cm"))

Please post the output of

dput(QNR2)

and remember to put three back ticks, ```, on the line before and after the text that you paste into your message.

I have to leave the forum for a few hours but I will check back.

Sorry, I didn't understand your last message about back ticks.

Copy the output from dput(QNR2) then start your new message. Type a line containing only three back ticks and paste in the dput() output below that. Then type another line containing only three back ticks after the pasted text. All of the text between the two lines with back ticks will be formatted as code and be much easier to read.

'data.frame': 20 obs. of 5 variables:
''' $ N2Age : int 27 45 19 17 32 35 27 23 20 30 ...

''' $ N3Sex : int 1 2 1 1 2 1 1 1 1 2 ...

''' $ sources_Metric : num 0 0 0 0 0 0 0 0 0 0 ...

''' $ hygieneSanitation_Metric: num 0 0 0 0 0 0 0 0 0 0 ...

''' $ attitudeWASH_Metric : num 0 0 0 0 0 0 0 0 0 0 ...

Is this what you mean, or can you give an example, please?

In the following example I make a toy data frame named DF and run it through dput.

DF <- data.frame(Country = c("Austria", "Cambodia", "Congo", "Mexico", "Nepal"),
                 Value = 1:5)

dput(DF)
structure(list(Country = structure(1:5, .Label = c("Austria", 
"Cambodia", "Congo", "Mexico", "Nepal"), class = "factor"), Value = 1:5), class = "data.frame", row.names = c(NA, 
 -5L))

Created on 2020-06-15 by the reprex package (v0.3.0)
I would then copy the dput output and paste it into a reply on the forum putting three back ticks before and after it like this.
```
structure(list(Country = structure(1:5, .Label = c("Austria",
"Cambodia", "Congo", "Mexico", "Nepal"), class = "factor"), Value = 1:5), class = "data.frame", row.names = c(NA,
-5L))
```
In the above example, I have prevented the back ticks from having their usual effect. If I do not do that, the result in the posted reply looks like this

structure(list(Country = structure(1:5, .Label = c("Austria", 
"Cambodia", "Congo", "Mexico", "Nepal"), class = "factor"), Value = 1:5), class = "data.frame", row.names = c(NA, 
-5L))

In that tiny example, it does not make a lot of difference. In a larger block of code, it makes it much easier to copy and paste the result into R.
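
The reason dput() output is so useful is that it is itself R code: running it rebuilds the object. A small sketch of that round trip, reusing the toy DF from the example above (capture.output(), parse() and eval() are base R; the name DF_copy is just for illustration):

```r
# Toy data frame (same as the DF example above)
DF <- data.frame(Country = c("Austria", "Cambodia", "Congo", "Mexico", "Nepal"),
                 Value = 1:5)

# capture.output() collects what dput() would print to the console
txt <- capture.output(dput(DF))

# Evaluating that text rebuilds the data frame
DF_copy <- eval(parse(text = txt))

# Compare the rebuilt copy with the original
all.equal(DF, DF_copy)
```

In practice nobody needs to call eval(parse()) by hand: a reader simply pastes the dput() output after `DF <- ` in their own session.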

OK. What should I do now?

You should make QNR2 like this

QNR2 <- QNR[1:20, c("N2Age", "N3Sex", "sources_Metric", "hygieneSanitation_Metric", "attitudeWASH_Metric")]

then run

dput(QNR2)

and paste the output into a message here.

Here is the output of
dput(QRE2)
structure(list(Age = c(27L, 45L, 19L, 17L, 32L, 35L, 27L, 23L,
20L, 30L, 51L, 31L, 31L, 64L, 48L, 22L, 16L, 20L, 19L, 35L),
SexCD = c(1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L), sources_Metric = c(0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), hygieneSanitation_Metric = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
attitudeWASH_Metric = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0)), row.names = c("KI1", "KI2",
"KI3", "KI4", "KI5", "KI6", "KI7", "KI8", "KI9", "KI10", "KI11",
"KI12", "KI13", "KI14", "KI15", "KI16", "KI17", "KI18", "KI19",
"KI20"), class = "data.frame")

Thank you! Using that data I got your original ggplot code to run with only a few changes. Here is the original code

    ggplot(QNR, aes(x=N2Age, y=sources_Metric, fill=N3Sex, colour=N3Sex))+
    labs(title = "sources of water by sex", x=" ", y="sources (%)")+
    scale_color_discrete(name="Sex") + theme(legend.title = element_text(size = 7))+
    geom_point(aes(size=hygieneSanitation_Metric)) + geom_smooth(method=lm) + theme_bw()+
    theme(legend.position="bottom", plot.title = element_text(hjust = 0.5))+
    guides(fill = "none", color= "none", size = "none") + scale_size(range = c(1,3))

It plots N2Age on the x axis and uses N3Sex for fill and colour. The data you posted does not have those columns, but it does have Age and SexCD, so I substituted those. The ggplot code then throws "Error: Continuous value supplied to discrete scale". This happens because scale_color_discrete() needs discrete values, but SexCD is stored as integers, which ggplot2 treats as continuous. I fixed that by converting SexCD to a factor. The example below shows both the failed version and the corrected one.

The line geom_smooth() using formula 'y ~ x' still appears, but it is an informational message, not an error. You can silence it by stating the formula explicitly in the call to geom_smooth:
geom_smooth(formula = y ~ x, method=lm)
The graph looks strange, but remember that it only uses the first 20 rows of your data.

library(ggplot2)
QNR2 <-  structure(list(Age = c(27L, 45L, 19L, 17L, 32L, 35L, 27L, 23L,
                              20L, 30L, 51L, 31L, 31L, 64L, 48L, 22L, 
                              16L, 20L, 19L, 35L),
                      SexCD = c(1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 
                                2L, 2L,1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L), 
                      sources_Metric = c(0, 0,0, 0, 0, 0, 0, 0, 0, 0, 0, 
                                         0, 0, 0, 0, 0, 0, 0, 0, 0), 
                      hygieneSanitation_Metric = c(0,0, 0, 0, 0, 0, 0, 0, 0, 
                                                   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
                      attitudeWASH_Metric = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                                              0, 0, 0, 0, 0, 0, 0, 0, 0)), 
                   row.names = c("KI1", "KI2","KI3", "KI4", "KI5", "KI6", "KI7", 
                                 "KI8", "KI9","KI10", "KI11","KI12", "KI13", 
                                 "KI14", "KI15", "KI16", "KI17", "KI18", "KI19","KI20"), 
                   class = "data.frame")

ggplot(QNR2, aes(x=Age, y=sources_Metric, fill=SexCD, colour=SexCD))+
  labs(title = "sources of water by sex", x=" ", y="sources (%)")+
  scale_color_discrete(name="Sex") + 
  theme(legend.title = element_text(size = 7))+
  geom_point(aes(size=hygieneSanitation_Metric)) + 
  geom_smooth(method=lm) + 
  theme_bw()+
  theme(legend.position="bottom", plot.title = element_text(hjust = 0.5))+
  guides(fill = "none", color= "none", size = "none") + scale_size(range = c(1,3))
#> `geom_smooth()` using formula 'y ~ x'
#> Error: Continuous value supplied to discrete scale

QNR2$SexCD <- factor(QNR2$SexCD)
ggplot(QNR2, aes(x=Age, y=sources_Metric, fill=SexCD, colour=SexCD))+
  labs(title = "sources of water by sex", x=" ", y="sources (%)")+
  scale_color_discrete(name="Sex") + 
  theme(legend.title = element_text(size = 7))+
  geom_point(aes(size=hygieneSanitation_Metric)) + 
  geom_smooth(method=lm) + 
  theme_bw()+
  theme(legend.position="bottom", plot.title = element_text(hjust = 0.5))+
  guides(fill = "none", color= "none", size = "none") + scale_size(range = c(1,3))
#> `geom_smooth()` using formula 'y ~ x'

Created on 2020-06-15 by the reprex package (v0.3.0)
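
To make the fix easier to see in isolation, here is a smaller sketch of the same idea. The data frame d and its values are made up for illustration (they are not from QNR), and the plot is only drawn if ggplot2 is installed:

```r
# Small stand-in for QNR2 (values are made up for illustration)
d <- data.frame(Age = c(27L, 45L, 19L, 32L, 51L, 64L),
                SexCD = c(1L, 2L, 1L, 2L, 2L, 1L),
                sources_Metric = c(10, 20, 15, 30, 25, 40))

is.numeric(d$SexCD)           # integer codes count as numeric, i.e. continuous
d$SexCD <- factor(d$SexCD)    # convert once, before plotting
is.factor(d$SexCD)
levels(d$SexCD)

# Draw the plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(d, aes(x = Age, y = sources_Metric, colour = SexCD)) +
    geom_point() +
    # an explicit formula avoids the "using formula 'y ~ x'" message
    geom_smooth(formula = y ~ x, method = "lm", se = FALSE) +
    scale_color_discrete(name = "Sex") +
    theme_bw()
  print(p)
}
```

Checking a column with is.numeric() or str() before plotting is a quick way to tell whether ggplot2 will treat it as continuous or discrete.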


Thank you so much.
I now see the strange plot from ggplot. I don't know where the problem really is.
I wanted to plot a graph like this one