Two Regression Lines in single plot, shaded by year, and points indicating a different location

#1

When using ggplot2, is there a way to plot two regression lines from the same data set? The two regressions correspond to a category in the data set (parasitized, P, vs. non-parasitized, NP). I also want to indicate location by point shape (a different shape for each location) and year by shading (a different color shade for each year).

I can easily plot two regressions indicating P and NP, with colors showing each location. But I am not sure how I would show year.

#2

We don't really have enough info to help you out. Could you ask this with a minimal REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.

#3
Year Lake Total Wt (g) Total Length (mm) K Parasites
2015 A6 0.0915 29 0.375169134 0
2015 A6 0.166 34 0.42234887 0
2015 A6 0.201 36 0.430812757 0
2015 A6 0.1416 33 0.394022873 0
2015 A6 0.1913 35 0.446180758 0
2015 A8 0.365 41 0.529591852 0
2015 A8 0.534 48 0.482855903 0
2015 D7 0.182 35 0.424489796 0
2015 D7 0.1951 34 0.496387136 0
2015 D7 0.2137 34 0.543710564 0
2015 D7 0.2729 39 0.460054957 0
2015 D7 0.3845 44 0.451375845 0
2016 A6 0.0534 24 0.386284722 0
2016 A6 0.7558 48 0.683412905 0
2016 A6 0.6106 48 0.552119502 0
2016 A8 0.1219 29 0.499815491 0
2016 A8 0.1249 31 0.419254137 0
2016 A8 0.2519 37 0.497305194 0
2016 A8 0.4617 45 0.506666667 0
2016 A8 0.415 42 0.560144693 0
2016 A8 0.9104 51 0.686312203 0
2016 B2 0.0643 24 0.465133102 0
2016 B2 0.0807 27 0.409998476 0
2016 B2 0.1435 30 0.531481481 0
2016 B2 0.0292 19 0.425718035 0
2016 B2 0.0689 23 0.566285855 0
2016 B2 0.923 52 0.656434911 0
2016 D7 0.706 48 0.638382523 0
2016 D7 0.761 54 0.483285068 0
2016 D7 0.506 53 0.339877886 0
2016 D7 0.579 57 0.312646806 0
2016 D7 0.689 47 0.663629446 0
2016 D7 0.562 55 0.337791134 0
2016 D7 0.6049 51 0.456008624 0
2017 B2 0.264 35 0.61574344 0
2017 B2 0.3025 37 0.597200561 0
2017 B2 0.2666 36 0.571416324 0
2017 B2 0.3032 37.5 0.574957037 0
2017 B2 0.3573 41 0.51841964 0
2017 D7 0.6741 53 0.452789887 0
2017 D7 0.7112 49 0.604510026 0
2017 D7 0.7173 51 0.540742248 0
2017 D7 0.7104 50 0.56832 0
2017 D7 0.9858 59 0.479990651 0
2015 A6 0.5854 47 0.563844235 1
2015 A8 0.5298 48 0.47905816 1
2015 A8 0.8723 51 0.657590218 1
2015 D7 0.5171 46 0.531252568 1
2015 D7 0.55 47 0.529747744 1
2015 D7 0.659 46 0.677036246 1
2015 D7 0.5147 46 0.528786883 1
2016 A6 1.3292 61 0.585599676 1
2016 A6 0.3911 45 0.429190672 1
2016 A6 0.696 47 0.67037169 1
2016 A6 0.9489 55 0.570338092 1
2016 A6 0.6278 44 0.736992863 1
2016 A8 0.1397 34 0.35543456 1
2016 A8 0.2709 35 0.631836735 1
2016 A8 0.72 51 0.542777665 1
2016 A8 0.6535 49 0.555465835 1
2016 A8 0.8723 53 0.58591992 1
2016 A8 0.9278 57 0.500990858 1
2016 A8 1.1383 57 0.614656062 1
2016 B2 0.13 36 0.278635117 1
2016 B2 1.1902 57 0.642680879 1
2016 D7 0.2864 42 0.386567325 1
2016 D7 0.4656 41 0.675556071 1
2016 D7 0.3965 45 0.435116598 1
2016 D7 1.1652 58 0.597195457 1
2017 A6 1.7541 68 0.557863067 1
2017 B2 0.8721 54 0.553840878 1
2017 D7 0.5807 55 0.349030804 1

Ok so here is my reprex data set.

The code I have tried is:

Plot1 <- ggplot(YearsPooled, aes(x=LogLength, y=LogWeight, colour=Year, linetype=Parasites)) +
    geom_point(size=2) +
    stat_smooth(method = "lm", col = "black", size=0.25, se=FALSE, fullrange=TRUE) +
    theme_bw() +
    theme(plot.background = element_blank(), panel.grid.minor = element_blank(), panel.border = element_blank()) +
    theme(axis.line = element_line(color = 'black')) +
    guides(shape=guide_legend(title=NULL), linetype=guide_legend(title=NULL)) +
    scale_color_grey("Year") +
    xlab("Length ln(mm)") +
    ylab("Weight ln(g)") +
    scale_shape_manual("Lake", values=c(15,16, 17, 18))

And I am getting this figure:

What I am trying to do is make each point's shape represent the lake it came from, while keeping the two regressions (parasitized vs. non-parasitized fish) and the color shading by year. Hopefully this makes more sense now. Thank you for replying so quickly.

#4

Your example is not actually reproducible because you are not providing the data within the reprex itself, and you are not including library calls.
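
One common way to put the data inside the reprex is base R's dput(), which prints R code that recreates an object. The sketch below uses a small hypothetical data frame named fish (not the poster's full data) only to illustrate the idea:

```r
# Hypothetical stand-in for the real data; the name `fish` is an assumption
fish <- data.frame(
  Year = c(2015, 2016, 2017),
  Lake = c("A6", "B2", "D7"),
  stringsAsFactors = FALSE
)

# dput() prints R code that rebuilds the object; paste that output into the post
dput(fish)

# Round trip: evaluating the deparsed text gives back an equivalent data frame
fish_copy <- eval(parse(text = paste(deparse(fish), collapse = "\n")))
all.equal(fish, fish_copy)
```

Pasting the dput() output together with the library() calls at the top of the script lets others run the example as-is.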

I think this is what you are trying to do:

library(tidyverse, quietly = TRUE)
YearsPooled <- data.frame(stringsAsFactors=FALSE,
                          Year = as.factor(c(2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,
                                   2015, 2015, 2015, 2016, 2016, 2016, 2016, 2016, 2016,
                                   2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016,
                                   2016, 2016, 2016, 2016, 2016, 2016, 2016, 2017, 2017,
                                   2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2015, 2015,
                                   2015, 2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016,
                                   2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016,
                                   2016, 2016, 2016, 2016, 2016, 2017, 2017, 2017)),
                          Lake = c("A6", "A6", "A6", "A6", "A6", "A8", "A8", "D7", "D7",
                                   "D7", "D7", "D7", "A6", "A6", "A6", "A8", "A8", "A8",
                                   "A8", "A8", "A8", "B2", "B2", "B2", "B2", "B2", "B2",
                                   "D7", "D7", "D7", "D7", "D7", "D7", "D7", "B2", "B2",
                                   "B2", "B2", "B2", "D7", "D7", "D7", "D7", "D7", "A6", "A8",
                                   "A8", "D7", "D7", "D7", "D7", "A6", "A6", "A6", "A6",
                                   "A6", "A8", "A8", "A8", "A8", "A8", "A8", "A8", "B2",
                                   "B2", "D7", "D7", "D7", "D7", "A6", "B2", "D7"),
                          LogWeight = log(c(0.0915, 0.166, 0.201, 0.1416, 0.1913, 0.365, 0.534,
                                           0.182, 0.1951, 0.2137, 0.2729, 0.3845, 0.0534, 0.7558,
                                           0.6106, 0.1219, 0.1249, 0.2519, 0.4617, 0.415, 0.9104,
                                           0.0643, 0.0807, 0.1435, 0.0292, 0.0689, 0.923, 0.706,
                                           0.761, 0.506, 0.579, 0.689, 0.562, 0.6049, 0.264, 0.3025,
                                           0.2666, 0.3032, 0.3573, 0.6741, 0.7112, 0.7173, 0.7104,
                                           0.9858, 0.5854, 0.5298, 0.8723, 0.5171, 0.55, 0.659,
                                           0.5147, 1.3292, 0.3911, 0.696, 0.9489, 0.6278, 0.1397,
                                           0.2709, 0.72, 0.6535, 0.8723, 0.9278, 1.1383, 0.13, 1.1902,
                                           0.2864, 0.4656, 0.3965, 1.1652, 1.7541, 0.8721, 0.5807)),
                          LogLength = log(c(29, 34, 36, 33, 35, 41, 48, 35, 34, 34, 39, 44, 24,
                                                48, 48, 29, 31, 37, 45, 42, 51, 24, 27, 30, 19, 23,
                                                52, 48, 54, 53, 57, 47, 55, 51, 35, 37, 36, 37.5, 41, 53,
                                                49, 51, 50, 59, 47, 48, 51, 46, 47, 46, 46, 61, 45, 47,
                                                55, 44, 34, 35, 51, 49, 53, 57, 57, 36, 57, 42, 41, 45,
                                                58, 68, 54, 55)),
                          K = c(0.375169134, 0.42234887, 0.430812757, 0.394022873,
                                0.446180758, 0.529591852, 0.482855903, 0.424489796,
                                0.496387136, 0.543710564, 0.460054957, 0.451375845,
                                0.386284722, 0.683412905, 0.552119502, 0.499815491,
                                0.419254137, 0.497305194, 0.506666667, 0.560144693, 0.686312203,
                                0.465133102, 0.409998476, 0.531481481, 0.425718035,
                                0.566285855, 0.656434911, 0.638382523, 0.483285068,
                                0.339877886, 0.312646806, 0.663629446, 0.337791134, 0.456008624,
                                0.61574344, 0.597200561, 0.571416324, 0.574957037,
                                0.51841964, 0.452789887, 0.604510026, 0.540742248, 0.56832,
                                0.479990651, 0.563844235, 0.47905816, 0.657590218,
                                0.531252568, 0.529747744, 0.677036246, 0.528786883,
                                0.585599676, 0.429190672, 0.67037169, 0.570338092, 0.736992863,
                                0.35543456, 0.631836735, 0.542777665, 0.555465835,
                                0.58591992, 0.500990858, 0.614656062, 0.278635117, 0.642680879,
                                0.386567325, 0.675556071, 0.435116598, 0.597195457,
                                0.557863067, 0.553840878, 0.349030804),
                          Parasites = as.factor(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                                                   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                                                   0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                                                   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)) 
                          )
ggplot(YearsPooled, aes(x=LogLength, y=LogWeight, colour=Year, linetype=Parasites)) + 
    geom_point(size=2, aes(shape = Lake)) +
    stat_smooth(method = "lm", col = "black", size=0.25, se=FALSE, fullrange=TRUE) +
    theme_bw() +
    theme(plot.background = element_blank(), panel.grid.minor = element_blank(), panel.border = element_blank()) + 
    theme(axis.line = element_line(color = 'black')) +
    guides(shape=guide_legend(title=NULL), linetype=guide_legend(title=NULL)) +
    scale_color_grey("Year") +
    xlab("Length ln(mm)") +
    ylab("Weight ln(g)") +
    scale_shape_manual("Lake", values=c(15,16, 17, 18)) +
    NULL

Created on 2019-01-28 by the reprex package (v0.2.1)

#5

Hi, wow, that was exactly what I wanted to do. May I ask why your code worked and mine did not? I did try changing a number of statements so that lake would be coded by shape, but then lake was also used to group the regressions, so I got two regressions for each lake. Thanks!

#6

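This is the line that does the trick: geom_point(size=2, aes(shape = Lake)). You just have to map your variable to an aesthetic at the geom level, so that only the points use it and the regression layer is not split by lake.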
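
To illustrate the difference between mapping at the plot level and at the geom level, here is a minimal sketch on hypothetical toy data (the data frame toy and the helper n_lines are made up for illustration and are not from the thread). It counts how many separate lines the smoother layer fits in each case:

```r
library(ggplot2)

# Hypothetical toy data: two parasite groups, two lakes
set.seed(1)
toy <- data.frame(
  x = rep(1:10, 4),
  Parasites = factor(rep(c(0, 1), each = 20)),
  Lake = rep(rep(c("A6", "D7"), each = 10), 2)
)
toy$y <- toy$x + as.integer(toy$Parasites) + rnorm(40)

# shape mapped in ggplot(): every layer inherits it, so the smoother
# is grouped by Lake as well as by Parasites
p_global <- ggplot(toy, aes(x, y, linetype = Parasites, shape = Lake)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "black")

# shape mapped inside geom_point(): only the points see Lake
p_layer <- ggplot(toy, aes(x, y, linetype = Parasites)) +
  geom_point(aes(shape = Lake)) +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "black")

# Hypothetical helper: number of distinct groups (fitted lines) in layer 2
n_lines <- function(p) length(unique(layer_data(p, 2)$group))
n_lines(p_global)
n_lines(p_layer)
```

With these data the first plot should fit one line per Parasites and Lake combination, while the second fits one line per Parasites level only, which is the behavior described in the question.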

#7

I appreciate the help! Thanks!

#8

If your question's been answered (even by you!), would you mind choosing a solution? It helps other people see which questions still need help, or find solutions if they have similar problems.
