How to combine rows based on two columns

Hello,
I have a growing dataset that I am really struggling to condense into a usable form. I am doing field work that requires me repeatedly visiting sites throughout a year, and doing 10 sweeps of the sites each time. So my data is inputted as "Site 1" on "3-July", and "Sweep 1", repeat through sweep 10 for each visit.

On each sweep we count the number of different inverts we find. I need to be able to combine those counts per site, but keep the different dates separate.

> site <- c("A1","A1","A2","A2","A1","A1","A2","A2")
> date <- c("7/31","7/31","7/31","7/31","8/29","8/29","8/29","8/29")
> bugs <- c(10,10,20,20,5,5,10,10)
> data <- data.frame(site,data,bugs)

In the small dataset above, I want to combine the number of bugs based on site and date, e.g. sum the number of bugs in the two "A1" on "7/31" columns so that it just says "Site = A1, Date = 7/31, bugs =20", as the data frame below would show:

> site <- c("A1","A2","A1","A2")
> date <-c("7/31","7/31", "8/29", "8/29")
> bugs <-c(20, 40, 10, 20)
> data <- data.frame(site,date,bugs)

Thank you for any advice.

This is a classic use of the group_by() and summerize() functions of the dplyr package.

library(dplyr)

site <- c("A1","A1","A2","A2","A1","A1","A2","A2")
date <- c("7/31","7/31","7/31","7/31","8/29","8/29","8/29","8/29")
bugs <- c(10,10,20,20,5,5,10,10)
data <- data.frame(site,date,bugs)
STATS <- group_by(data, site, date) %>% summarize(sum(bugs))
STATS
#> # A tibble: 4 x 3
#> # Groups:   site [?]
#>   site  date  `sum(bugs)`
#>   <fct> <fct>       <dbl>
#> 1 A1    7/31           20
#> 2 A1    8/29           10
#> 3 A2    7/31           40
#> 4 A2    8/29           20

Created on 2019-08-27 by the reprex package (v0.2.1)

2 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.