Loop to create plots based on specific stations

Hello,

Not quite sure how to describe this but I am going to try. I have a very large dataset (13000 observations) that I want to plot based on the column Station. Out of this dataset, I have a list of stations that I want to be plotted. Is there a way to compare the dataset to the list, and if the Station name in the dataset matches the Station name on the list, have R plot the resulting Station?

I have the following code to plot the data by station:

dt %>%
  split(list(.$Station)) %>%
  purrr::map2(.y = names(.),
              ~ ggplot(.) +
                geom_line(aes(x=Fluorescence, y=Depth), color="purple") +
                geom_line(aes(x=Temp, y=Depth), color="blue") +
                scale_y_reverse() +
                labs(title=.y))

Here is a subset of the data that I want to plot (using slice() to show the first 3 entries so as not to overwhelm):

structure(list(Station = c("MAN1", "MAN1", "MAN17", "MAN17", 
"MAN17", "MAN3", "MAN3", "MAN3", "MAN4", "MAN4", "MAN4", "MAN5", 
"MAN5", "MAN5", "MAN8", "MAN8", "MAN8", "MIC17", "MIC17", "MIC17", 
"N2", "N2", "N2", "N3", "N3", "N3", "PET2", "PET2", "PET2", "PW2", 
"PW2", "PW2", "PW3", "PW3", "PW3", "PW4", "PW4", "PW4", "PW5", 
"PW5", "PW5", "PWA17", "PWA17", "PWA17", "PWA30", "PWA30", "PWA30", 
"Q13", "Q13", "Q13", "Q30", "Q30", "Q30", "RAC8", "RAC8", "RAC8", 
"S2", "S2", "S2", "S3", "S3", "S3", "S4", "S4", "S4", "SB2", 
"SB2", "SB2", "SB3", "SB3", "SB3", "SB4", "SB4", "SB4", "SB5", 
"SB5", "SB5", "SB6", "SB6", "SB6", "SC2", "SC2", "SC2", "SC3", 
"SC3", "SC3", "SC4", "SC4", "SC4", "SC5", "SC5", "SC5", "SHB17", 
"SHB17", "SHB17", "SHB30", "SHB30", "SHB30", "SHB8", "SHB8", 
"SHB8", "STJ17", "STJ17", "STJ17", "STJ30", "STJ30", "STJ30", 
"STJ8", "STJ8", "STJ8", "STJ82", "STJ82", "STJ82", "STO30", "STO30", 
"STO30", "SY1", "SY1", "SY1", "SY4", "SY4", "SY4", "SY5", "SY5", 
"SY5", "V1", "V1", "V1", "WAU8", "WAU8", "WAU8", "X1", "X1", 
"X1", "X2", "X2", "X2"), Depth = c(1, 1.5, 0.5, 1, 1.5, 0.5, 
1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 
1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 
1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 2, 2.5, 3, 1.5, 2, 2.5, 0.5, 
1, 1.5, 2, 2.5, 3, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 
1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 
1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 
1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 
1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 
1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5, 0.5, 1, 1.5), Temp = c(18.3467, 
18.3467, 19.3993, 19.4034, 19.4037, 18.6445, 18.6557, 18.7067, 
18.6094, 18.5384, 18.5233, 18.6765, 18.6779, 18.6745, 20.2061, 
20.2041, 20.2031, 21.1933, 21.278, 21.3736, 21.1167, 21.132, 
21.2108, 20.9386, 20.9494, 20.9513, 19.3293, 19.4498, 19.3897, 
20.3008, 20.3055, 20.308, 20.0363, 20.0443, 20.0486, 19.91, 19.9069, 
19.8949, 19.9188, 19.9327, 19.9283, 19.7936, 19.7922, 19.7973, 
19.3111, 19.6616, 19.6432, 18.9398, 18.9247, 18.9294, 18.1222, 
17.7386, 17.7598, 18.2149, 18.1774, 18.144, 21.5584, 21.5845, 
21.575, 20.5931, 20.5692, 20.5752, 20.443, 20.3992, 20.4119, 
19.4674, 19.3408, 19.2992, 20.1452, 20.1535, 20.1551, 19.0262, 
19.0745, 19.2226, 19.6549, 19.75, 19.7689, 19.5443, 19.5789, 
19.563, 18.4369, 18.3949, 18.3928, 18.6957, 18.6719, 19.1021, 
18.7909, 18.5096, 17.8352, 17.1719, 17.31, 17.2597, 19.6089, 
19.6176, 19.6258, 19.3181, 19.3301, 19.3301, 19.9309, 19.9298, 
19.9311, 20.8297, 20.8177, 20.814, 20.517, 20.5132, 20.5129, 
21.5459, 21.5465, 21.5461, 21.5796, 21.5778, 21.5711, 18.6234, 
18.6231, 18.6227, 20.2869, 20.2899, 20.2828, 19.7011, 19.6954, 
19.6956, 19.6966, 19.6976, 19.7005, 21.3443, 21.296, 20.9851, 
18.9729, 18.9728, 18.9444, 19.8958, 19.8973, 19.9013, 18.7957, 
18.7072, 18.2253), Fluorescence = c(0.4913, 0.49176, 0.23811, 
0.15949, 0.20518, 0.51767, 0.38041, 0.38022, 0.47683, 0.36871, 
0.35445, 0.24683, 0.24215, 0.23905, 0.58206, 0.4907, 0.38369, 
0.42625, 0.41642, 0.41131, 0.79022, 0.65469, 0.5861, 0.43609, 
0.44216, 0.46172, 0.4536, 0.46869, 0.43206, 1.2731, 1.2486, 1.2356, 
0.79053, 0.80851, 0.81715, 0.46273, 0.45185, 0.45092, 0.42194, 
0.4306, 0.44206, 0.48511, 0.2449, 0.27952, 0.33325, 0.29978, 
0.30044, 0.37807, 0.39161, 0.38103, 0.27039, 0.2364, 0.2403, 
0.21091, 0.22255, 0.24443, 0.50026, 0.52019, 0.52482, 0.20827, 
0.15968, 0.1559, 0.16748, 0.15976, 0.16207, 0.41961, 0.32552, 
0.28988, 0.77813, 0.36515, 0.37967, 0.37003, 0.33572, 0.36936, 
0.50158, 0.53267, 0.57139, 0.47693, 0.4819, 0.51325, 0.30893, 
0.25657, 0.26727, 0.361, 0.20831, 0.22788, 0.22303, 0.19556, 
0.20372, 0.20655, 0.19805, 0.20698, 0.34661, 0.33625, 0.35358, 
0.33217, 0.33269, 0.32378, 0.47246, 0.47865, 0.47904, 0.15546, 
0.15833, 0.16342, 0.27456, 0.24677, 0.24984, 0.29446, 0.30664, 
0.31506, 0.29177, 0.28125, 0.30387, 0.37583, 0.42154, 0.36675, 
0.60421, 0.57923, 0.59241, 0.30534, 0.22203, 0.23022, 0.28294, 
0.16516, 0.1772, 0.51532, 0.48948, 0.50991, 0.52351, 0.53712, 
0.61381, 1.0138, 1.0338, 1.0853, 0.29813, 0.27079, 0.273)), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -137L), groups = structure(list(
    Station = c("MAN1", "MAN17", "MAN3", "MAN4", "MAN5", "MAN8", 
    "MIC17", "N2", "N3", "PET2", "PW2", "PW3", "PW4", "PW5", 
    "PWA17", "PWA30", "Q13", "Q30", "RAC8", "S2", "S3", "S4", 
    "SB2", "SB3", "SB4", "SB5", "SB6", "SC2", "SC3", "SC4", "SC5", 
    "SHB17", "SHB30", "SHB8", "STJ17", "STJ30", "STJ8", "STJ82", 
    "STO30", "SY1", "SY4", "SY5", "V1", "WAU8", "X1", "X2"), 
    .rows = structure(list(1:2, 3:5, 6:8, 9:11, 12:14, 15:17, 
        18:20, 21:23, 24:26, 27:29, 30:32, 33:35, 36:38, 39:41, 
        42:44, 45:47, 48:50, 51:53, 54:56, 57:59, 60:62, 63:65, 
        66:68, 69:71, 72:74, 75:77, 78:80, 81:83, 84:86, 87:89, 
        90:92, 93:95, 96:98, 99:101, 102:104, 105:107, 108:110, 
        111:113, 114:116, 117:119, 120:122, 123:125, 126:128, 
        129:131, 132:134, 135:137), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -46L), .drop = TRUE))

And here is the list of the only stations I want plotted. I'm doing this because I need only a small percentage of the generated plots, and flipping through all of them to find the ones I'm looking for is very time-consuming.

chemsites <- structure(list(Station = c("STJ8", "STJ8", "STJ17", "STJ17", 
"STJ30", "STJ30", "MIC8", "MIC8", "MIC17", "MIC17", "N3", "N3", 
"WAU8", "WAU8", "EG12", "EG12", "B5", "B5", "RAC8", "RAC8", "Q30", 
"Q30", "C6", "C6", "PWA8", "PWA8", "PWA17", "PWA17", "PWA30", 
"PWA30", "L260", "L260", "9561", "9561", "STO30", "STO30", "SHB8", 
"SHB8", "SHB17", "SHB17", "SHB30", "SHB30", "Man8", "Man8", "Man17", 
"Man17", "9574", "9574", "LVD8", "LVD8", "LVD17", "LVD17", "9570", 
"9570")), class = "data.frame", row.names = c(NA, -54L))

Thank you so much for any help you can provide!

Provided I'm understanding the problem correctly, it sounds like you just need to filter the data frame down to the listed stations before plotting it. From there, you can either:

  • facet
  • loop

If you want to facet, you can do something like this:

# dt is the data frame from the question
dt %>% 
    filter(
        Station %in% chemsites$Station
    ) %>% 
    ggplot() +
    geom_line(aes(x = Fluorescence, y = Depth), color = 'purple') + 
    # And any other plotting code
    facet_wrap(~Station)
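One thing worth checking before relying on `%in%`: the match is exact and case-sensitive. The posted list spells two stations "Man8" and "Man17", while the data sample uses upper-case names such as "MAN8" and "MAN17". If those are meant to be the same stations (an assumption on my part), they would be dropped silently. A minimal sketch with toy stand-ins for `dt` and `chemsites` (the values below are illustrative, not the real data):

```r
library(dplyr)

# Toy stand-ins for the thread's objects (illustrative values only)
dt <- tibble(
  Station = c("MAN8", "MAN8", "STJ8", "STJ8", "X1"),
  Depth   = c(0.5, 1, 0.5, 1, 0.5)
)
chemsites <- data.frame(Station = c("Man8", "Man8", "STJ8", "STJ8", "LVD8"))

# The list repeats each station, so reduce it to unique names first
wanted <- unique(chemsites$Station)

# Names on the list that have no exact match in the data
unmatched <- setdiff(wanted, dt$Station)
unmatched
# "Man8" "LVD8"

# Case-insensitive filter: compare upper-cased names on both sides
kept <- dt %>%
  filter(toupper(Station) %in% toupper(wanted))
unique(kept$Station)
# "MAN8" "STJ8"
```

Whether upper-casing is the right normalisation depends on how the station names were entered, so it is worth looking at `unmatched` before deciding.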

If you want to loop, you could do this:

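# unique() because each station appears more than once in chemsites
for(station in unique(chemsites$Station)) {
    tmp <- dt %>% 
        filter(
            Station == station
        )
    # Skip stations with no rows; is_empty() would not catch this,
    # because a zero-row data frame still has columns
    if(nrow(tmp) > 0) {
        p <- tmp %>% 
            ggplot() +
            geom_line(aes(x = Fluorescence, y = Depth), color = 'purple')
        
        print(p) # You would probably want to either use ggsave or assign to get the actual plots out of here
        # print will just make them appear in your RStudio Plots pane
    }
}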
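To get the plots out of the loop, one option is to collect them in a named list and write each to disk with `ggsave()`. This is only a sketch: the toy `dt` and `chemsites` below stand in for the real objects, and `out_dir` is a hypothetical output folder.

```r
library(dplyr)
library(ggplot2)

# Toy stand-ins for the thread's objects (illustrative values only)
dt <- tibble(
  Station      = rep(c("STJ8", "N3", "X1"), each = 3),
  Depth        = rep(c(0.5, 1, 1.5), times = 3),
  Fluorescence = c(0.29, 0.31, 0.32, 0.44, 0.44, 0.46, 1.01, 1.03, 1.09)
)
chemsites <- data.frame(Station = c("STJ8", "STJ8", "N3", "N3", "LVD8", "LVD8"))

# Hypothetical output folder; change to wherever the files should go
out_dir <- file.path(tempdir(), "station_plots")
dir.create(out_dir, showWarnings = FALSE)

plots <- list()
for (station in unique(chemsites$Station)) {
  tmp <- dt %>% filter(Station == station)
  if (nrow(tmp) == 0) next  # station is on the list but not in the data

  p <- ggplot(tmp) +
    geom_line(aes(x = Fluorescence, y = Depth), color = "purple") +
    scale_y_reverse() +
    labs(title = station)

  plots[[station]] <- p  # keep the plot object for later use
  ggsave(file.path(out_dir, paste0(station, ".png")),
         plot = p, width = 5, height = 4)
}

names(plots)
```

Afterwards `plots[["STJ8"]]` prints a single plot on demand, and the PNG files can be browsed outside R.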

You understood perfectly! It turns out the only part of that code I needed was

dt %>%
  filter(
    Station %in% chemsites$Station
  )

And then add the rest of my code. Thanks so much for your help!
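For anyone landing here later, the combined pipeline presumably looks something like the sketch below: filter first, then split and plot one station at a time. The toy `dt` and `chemsites` are illustrative stand-ins, and `purrr::imap()` is swapped in for the original `map2(.y = names(.))` call since it passes each element's name as `.y` automatically. The `ungroup()` is there because the posted `dput()` output shows a grouped data frame; it is harmless if the data are not grouped.

```r
library(dplyr)
library(ggplot2)
library(purrr)

# Toy stand-ins for the thread's objects (illustrative values only)
dt <- tibble(
  Station      = rep(c("STJ8", "N3", "X1"), each = 3),
  Depth        = rep(c(0.5, 1, 1.5), times = 3),
  Temp         = c(21.5, 21.5, 21.5, 20.9, 20.9, 21.0, 19.9, 19.9, 19.9),
  Fluorescence = c(0.29, 0.31, 0.32, 0.44, 0.44, 0.46, 1.01, 1.03, 1.09)
)
chemsites <- data.frame(Station = c("STJ8", "STJ8", "N3", "N3", "LVD8", "LVD8"))

plots <- dt %>%
  ungroup() %>%
  filter(Station %in% chemsites$Station) %>%  # keep only the listed stations
  split(.$Station) %>%                        # one data frame per station
  imap(~ ggplot(.x) +
         geom_line(aes(x = Fluorescence, y = Depth), color = "purple") +
         geom_line(aes(x = Temp, y = Depth), color = "blue") +
         scale_y_reverse() +
         labs(title = .y))                    # .y is the station name

names(plots)
```

Because the result is a named list, a single station can be pulled up with `plots[["N3"]]` instead of flipping through every plot.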
