Plotting the Relationship Between Precipitation Depth, Precipitation Intensity, and Increase in Streamflow

I have a .csv file containing precipitation and streamflow data taken over 15 minute intervals between 2010 and 2022. To avoid any changes in streamflow due to snow melt I used only data from May 1 through October 31 of each year. I want to illustrate the relationship between rainfall depth (mm), rainfall intensity (mm/hr), and the resulting increase in stream discharge (m3/s).

Below is the structure of my .csv file:

          Date     Time Precipitation Discharge
1   2010-05-01  0:00:00           0.0     0.299
2   2010-05-01  0:15:00           0.0     0.302
3   2010-05-01  0:30:00           0.0     0.305
4   2010-05-01  0:45:00           0.2     0.308
5   2010-05-01  1:00:00           0.2     0.312
6   2010-05-01  1:15:00           0.0     0.317
7   2010-05-01  1:30:00           0.0     0.321
8   2010-05-01  1:45:00           0.2     0.326
9   2010-05-01  2:00:00           0.4     0.330
10  2010-05-01  2:15:00           0.0     0.339
11  2010-05-01  2:30:00           0.0     0.352
12  2010-05-01  2:45:00           0.2     0.380
13  2010-05-01  3:00:00           0.0     0.410
14  2010-05-01  3:15:00           0.0     0.428
15  2010-05-01  3:30:00           0.0     0.424
16  2010-05-01  3:45:00           0.0     0.419

Using this data, I want to be able to create a plot showing the relationship between precipitation depth (mm), precipitation intensity (mm/hr), and the increase in stream discharge (m3/s) such as what is shown in the plot below. Note that these values are made up for illustrative purposes and are not representative of my data.

Does anyone have any advice as to how to illustrate this?

TIA.

I guess a place to start is with some questions: What code did you use to make the plot you shared, and in what ways do you see the plot you shared as insufficient?

I didn’t use any code to illustrate this, this plot was created in excel. I am trying to create a similar plot in R but am unsure how to code it.

I see — thanks, Brant. Could you describe how you calculate intensity? Is it the rolling hourly sum of precipitation? (I assume depth is the same as precipitation — is that right?)

Yes, precipitation intensity will be calculated as a rolling hourly sum of precipitation and depth is the same as precipitation. I’m also not sure how to write the code to calculate precipitation intensity.

Thank you. My data is contained in the google drive folder "fifteeenminuteintervaldata.csv" (attached).

This .csv file is in the format Date, Time, Precipitation, Discharge. In this .csv file, "Date" is the date ranging from May 1, 2010 until October 31, 2022. "Time" is the time of day (in 15 minute intervals), "Precipitation" is the volume of precipitation that has fallen over each 15 minute time interval, and "Discharge" is the stream discharge measurement taken at each 15 minute interval.

I first want to split this data into two time periods "pre-treatment", and "post-treatment". The pre-treatment time period will contain data from 2010 until 2012 and the post-treatment time period will contain data from 2020 until 2022. My end-goal is to determine whether the increase in stream discharge at varying precipitation depths and intensities differs in the post-treatment scenario compared to the pre-treatment scenario.

I would like to use this data to plot a series of regression lines (similar to the plot above) to compare the relationships between precipitation depth (where precipitation depth is the sum of consecutive non-zero precipitation measurements), precipitation intensity (where precipitation intensity is a rolling hourly sum of the precipitation depth), and the increase in the stream discharge (where increase in stream discharge is the rolling hourly increase in stream discharge).

After plotting individual regression lines for 5, 10, 15, and 20 mm/hr rainfall intensities, I would like to be able to calculate an R-squared and regression slope for each of these relationships, but am also unsure how to go about that. Ultimately I want to be able to determine how much stream discharge increases with increasing precipitation depth for each precipitation intensity (5 mm/hr, 10 mm/hr, 15 mm/hr, and 20 mm/hr and compare if and how this relationship varies between the "pre-treatment" and "post-treatment" scenarios (Although I am not sure how to code this yet). I would appreciate any guidance or advice that anyone has.

Hi Brant,

It would be easier for folks here to help if you supplied data by following these steps:

  1. Import your csv file into RStudio, say, as a table called interval_data.
  2. Run this code, exactly as written:
# redirect console output to the new file, "data_for_posit.txt"
sink("data_for_posit.txt")  
# print contents of first 100 rows to file
head(interval_data, 100) |>
  dput()
# redirect output back to console (very important!)
sink()
  1. Open the file "data_for_posit.txt", select all, copy, and then paste here between a pair of triple backticks, like this:
``` r
[<-- paste dput() output here]
```

@ dromano

Brant's dataset, as downloaded, is 227,455 X 4.

I am not sure that Brant could post any really meaningfully data sample here. One day's sample (2010-05-01) gives us 96 rows of data.

BTW, in what looks like machine-generated data, there is one row of data with missing data.

DT[24958,]
     Date   Time Precipitation Discharge
   <char> <char>         <num>     <num>
1:                          NA 0.2759609

@jrkrideau you are correct, this was all machine-generated data. The data was automatically captured by sensors every 15 minutes.

Any idea why that row is corrupt? Or possibly, from a more practical viewpoint, would it be okay to just delete it?

BTW, am I correct in assuming that there is data for every day from May 1, 2010 until October 31, 2022? I suppose I could work it out but it is easier to just ask.

Forgot.

Is
The pre-treatment time period will contain data from 2010 until 2012 and the post-treatment time period will contain data from 2020 until 2022.
correct? It is okay to drop 8 years of data?

Yes, from a practical viewpoint, this row could be deleted from the dataset. I tried removing missing data, but must have missed an observation.

Ah good. It's easier dealing with no NA's. With that large a data set, cleaning it can be a real pain.

Before 2010 there was no LID construction in the watershed and after 2020 there were four LIDs constructed in the watershed. These LIDs are designed to retain rainfall runoff with the intent of reducing increases in downstream stream discharge and associated flood risk during and following heavy rainfall events. I want to compare the conditions before there were no LIDs (i.e., 2010 until 2012) to the conditions after there were 4 LIDs constructed in the watershed (i.e., after 2020). The first LID was constructed in 2013, the second and third in 2016 and the fourth in 2019. There is going to be a lot of changes during this time period, so I wanted to focus on only the periods 2010 to 2012 and 2020 to 2022 as there were no changes during these time periods that can mask or influence the results.

Can I not just use the function na.omit() to deal with the corrupt row?

For the purpose of addressing the plotting question,

I think 100 rows would be enough, but for the purpose of suggesting alternative approaches to visualization and analysis, I would agree.

If you use ggplot() for the visualization, missing data won't be an issue, but I'm not sure about plot().

Ok, I followed your steps and this is what was provided when I opened the file "data_for_posit.txt"

[<-- structure(list(function..... = c("    verbose = getOption(verbose)", 
"{", "    fileExt <- function(x) {", "        db <- grepl(\\\\.[^.]+\\\\.(gz|bz2|xz)$", 
"        ans <- sub(.*\\\\.", "        ans[db] <- sub(.*\\\\.([^.]+\\\\.)(gz|bz2|xz)$", 
"            x[db])", "        ans", "    }", "    my_read_table <- function(...) {", 
"        lcc <- Sys.getlocale(LC_COLLATE)", "        on.exit(Sys.setlocale(LC_COLLATE", 
"        Sys.setlocale(LC_COLLATE", "        read.table(...)", 
"    }", "    stopifnot(is.character(list))", "    names <- c(as.character(substitute(list(...))[-1L])", 
"    if (!is.null(package)) {", "        if (!is.character(package)) ", 
"            stop('package' must be a character vector or NULL)", 
"    }", "    paths <- find.package(package", "    if (is.null(lib.loc)) ", 
"        paths <- c(path.package(package", "            paths)", 
"    paths <- unique(normalizePath(paths[file.exists(paths)]))", 
"    paths <- paths[dir.exists(file.path(paths", "    dataExts <- tools:::.make_file_exts(data)", 
"    if (length(names) == 0L) {", "        db <- matrix(character()", 
"        for (path in paths) {", "            entries <- NULL", 
"            packageName <- if (file_test(-f", "                DESCRIPTION))) ", 
"                basename(path)", "            else .", "            if (file_test(-f", 
"                data.rds))) {", "                entries <- readRDS(INDEX)", 
"            }", "            else {", "                dataDir <- file.path(path", 
"                entries <- tools::list_files_with_type(dataDir", 
"                  data)", "                if (length(entries)) {", 
"                  entries <- unique(tools::file_path_sans_ext(basename(entries)))", 
"                  entries <- cbind(entries", "                }", 
"            }", "            if (NROW(entries)) {", "                if (is.matrix(entries) && ncol(entries) == 2L) ", 
"                  db <- rbind(db", "                    entries))", 
"                else warning(gettextf(data index for package %s is invalid and will be ignored", 
"                  sQuote(packageName))", "            }", "        }", 
"        colnames(db) <- c(Package", "        footer <- if (missing(package)) ", 
"            paste0(Use ", "                \\n", "        else NULL", 
"        y <- list(title = Data sets", "            footer = footer)", 
"        class(y) <- packageIQR", "        return(y)", "    }", 
"    paths <- file.path(paths", "    for (name in names) {", 
"        found <- FALSE", "        for (p in paths) {", "            tmp_env <- if (overwrite) ", 
"                envir", "            else new.env()", "            if (file_test(-f", 
"                rds <- readRDS(file.path(p", "                if (name %in% names(rds)) {", 
"                  found <- TRUE", "                  if (verbose) ", 
"                    message(sprintf(name=%s:\\t found in Rdata.rds", 
"                      name)", "                  thispkg <- sub(.*/([^/]*)/data$", 
"                  thispkg <- sub(_.*$", "                  thispkg <- paste0(package:", 
"                  objs <- rds[[name]]", "                  lazyLoad(file.path(p", 
"                    filter = function(x) x %in% objs)", "                  break", 
"                }", "                else if (verbose) ", "                  message(sprintf(name=%s:\\t NOT found in names() of Rdata.rds, i.e.,\\n\\t%s\\n", 
"                    name", "                    domain = NA)", 
"            }", "            if (file_test(-f", "                warning(zipped data found for package ", 
"                  .\\nThat is defunct, so please re-install the package.", 
"                  domain = NA)", "                if (file_test(-f", 
"                  files <- file.path(p"), list...character.. = c(" envir = .GlobalEnv", 
"", "", " x)", " ", " \\\\1\\\\2", "", "", "", "", "", " lcc))", 
" C)", "", "", "", " list)", "", "", "", "", " lib.loc", "", 
" TRUE)", "", "", " data))]", "", "", " nrow = 0L", "", "", " file.path(path", 
"", "", "", " INDEX <- file.path(path", "", "", "", "", " data)", 
" ", "", "", "", " )", "", "", "", "", " cbind(packageName", 
"", " ", " domain = NA", "", "", " LibPath", "", " sQuote(paste(data(package =", 
" to list the data sets in all *available* packages.)", "", " header = NULL", 
"", "", "", "", " data)", "", "", "", "", "", "", " file.path(p", 
" Rdata.rds))", "", "", "", " ", " domain = NA)", " \\\\1", " ", 
" thispkg)", "", " Rdata)", "", "", "", "", " ", " paste(names(rds)", 
"", "", " file.path(p", " sQuote(basename(dirname(p)))", " ", 
"", " fp <- file.path(p", " scan(fp"), package...NULL = c(" overwrite = TRUE) ", 
"", "", "", " x)", " ", "", "", "", "", "", "", "", "", "", "", 
"", "", "", "", "", " verbose = verbose)", "", " if (!length(package)) getwd()", 
"", "", "", "", "", " ncol = 4L)", "", "", " ", "", "", "", " Meta", 
"", "", "", "", "", "", "", "", "", "", "", "", "", "", " dirname(path)", 
"", "", " call. = FALSE)", "", "", " Item", "", " .packages(all.available = TRUE))))", 
"", "", " results = db", "", "", "", "", "", "", "", "", "", 
"", "", " Rdata.rds))) {", "", "", "", "", "", "", " p)", " thispkg)", 
"", "", " envir = tmp_env", "", "", "", "", "", " collapse = ,))", 
"", "", " Rdata.zip))) {", " ", "", "", " filelist))) ", " what = "
), lib.loc...NULL = c("", "", "", "", "", "", "", "", "", "", 
"", "", "", "", "", "", "", "", "", "", "", "", "", " ", "", 
"", "", "", "", "", "", "", "", "", "", "", " ", "", "", "", 
"", "", "", "", "", "", "", "", "", "", "", " ", "", "", "", 
"", "", " Title)", "", " ", "", "", " ", "", "", "", "", "", 
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", 
"", " ", "", "", "", "", "", " ", "", "", "", "", "", "", "", 
" quiet = TRUE))"), X = c("", "", "", "", "", "", "", "", "", 
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", 
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", 
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", 
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", 
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", 
"", "", "", "", "", "", "", "", "", "", "")), row.names = c(NA, 
100L), class = "data.frame")]

Very strange — how did you import the csv? Could you post the code you used to do that?

# Read the data
data <- read.csv("fifteenminuteintervaldata.csv")
data # 15 minute interval precipitation and discharge data from 2010 to 2022