Counting rows in data frame based on certain criteria of a column

I am a new user to R and do not understand how the syntax works. I know what I need to do, but do not know how to write it. I have the following dataframe:

tibble::tribble(~Time.x, ~Site.ID, ~Vol..mL..x, ~Soil.Type.x, 
    ~Radius.x, 0L, "H1", 60L, "Sand", 5L, 30L, "H1", 60L, "Sand", 
    5L, 60L, "H1", 60L, "Sand", 5L, 90L, "H1", 60L, "Sand", 5L, 
    120L, "H1", 60L, "Sand", 5L, 150L, "H1", 60L, "Sand", 5L, 
    180L, "H1", 60L, "Sand", 5L, 210L, "H1", 60L, "Sand", 5L, 
    240L, "H1", 60L, "Sand", 5L, 270L, "H1", 60L, "Sand", 5L, 
    300L, "H1", 60L, "Sand", 5L, 0L, "H2", 60L, "Sand", 5L, 30L, 
    "H2", 60L, "Sand", 5L, 60L, "H2", 60L, "Sand", 5L, 90L, "H2", 
    60L, "Sand", 5L, 120L, "H2", 60L, "Sand", 5L, 150L, "H2", 
    60L, "Sand", 5L, 180L, "H2", 60L, "Sand", 5L, 210L, "H2", 
    60L, "Sand", 5L, 240L, "H2", 60L, "Sand", 5L, 270L, "H2", 
    60L, "Sand", 5L, 300L, "H2", 60L, "Sand", 5L, 330L, "H2", 
    60L, "Sand", 5L, 360L, "H2", 60L, "Sand", 5L)

#> # A tibble: 24 x 5
#>    Time.x Site.ID Vol..mL..x Soil.Type.x Radius.x
#>     <int> <chr>        <int> <chr>          <int>
#>  1      0 H1              60 Sand               5
#>  2     30 H1              60 Sand               5
#>  3     60 H1              60 Sand               5
#>  4     90 H1              60 Sand               5
#>  5    120 H1              60 Sand               5
#>  6    150 H1              60 Sand               5
#>  7    180 H1              60 Sand               5
#>  8    210 H1              60 Sand               5
#>  9    240 H1              60 Sand               5
#> 10    270 H1              60 Sand               5
#> # ... with 14 more rows

I would like to count the rows starting at a row where Time.x = 0 and ending at the row just before Time.x = 0 occurs again. For the first run that should give 11 rows, and for the second run 13. My real data frame is much larger and has many more Site.IDs, so I am not sure whether I need a loop or how to write this. I assume I should use nrow, but I do not know what to do after that.

Cheers!

If I understand correctly, you want to count the number of rows for each Site.ID, something like this:

library(dplyr)

data <- tibble::tribble(~Time.x, ~Site.ID, ~Vol..mL..x, ~Soil.Type.x, 
                        ~Radius.x, 0L, "H1", 60L, "Sand", 5L, 30L, "H1", 60L, "Sand", 
                        5L, 60L, "H1", 60L, "Sand", 5L, 90L, "H1", 60L, "Sand", 5L, 
                        120L, "H1", 60L, "Sand", 5L, 150L, "H1", 60L, "Sand", 5L, 
                        180L, "H1", 60L, "Sand", 5L, 210L, "H1", 60L, "Sand", 5L, 
                        240L, "H1", 60L, "Sand", 5L, 270L, "H1", 60L, "Sand", 5L, 
                        300L, "H1", 60L, "Sand", 5L, 0L, "H2", 60L, "Sand", 5L, 30L, 
                        "H2", 60L, "Sand", 5L, 60L, "H2", 60L, "Sand", 5L, 90L, "H2", 
                        60L, "Sand", 5L, 120L, "H2", 60L, "Sand", 5L, 150L, "H2", 
                        60L, "Sand", 5L, 180L, "H2", 60L, "Sand", 5L, 210L, "H2", 
                        60L, "Sand", 5L, 240L, "H2", 60L, "Sand", 5L, 270L, "H2", 
                        60L, "Sand", 5L, 300L, "H2", 60L, "Sand", 5L, 330L, "H2", 
                        60L, "Sand", 5L, 360L, "H2", 60L, "Sand", 5L)

data %>%
    count(Site.ID)
#> # A tibble: 2 x 2
#>   Site.ID     n
#>   <chr>   <int>
#> 1 H1         11
#> 2 H2         13

Created on 2019-01-08 by the reprex package (v0.2.1)
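
If the Site.ID column were not available, or a single site restarted at Time.x = 0 more than once, the runs could also be derived from Time.x itself. This is only a sketch on a small stand-in tibble (toy and the run column are made-up names, not part of your data), and it assumes every run starts with a row where Time.x is exactly 0:

```r
library(dplyr)

# Small stand-in for the real data: two runs, each starting at Time.x == 0
toy <- tibble::tibble(
  Time.x  = c(0L, 30L, 60L, 0L, 30L, 60L, 90L),
  Site.ID = c("H1", "H1", "H1", "H2", "H2", "H2", "H2")
)

# cumsum() over the logical vector goes up by 1 at every Time.x == 0,
# so all rows of one run share the same id (hypothetical column `run`)
runs <- toy %>%
  mutate(run = cumsum(Time.x == 0L)) %>%
  count(run)

runs
```

With your data the Site.ID approach above is simpler. The cumsum() trick is only worth it when no grouping column exists.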

Yes, that is correct. Could you please explain what "data" is doing? Is that calling/selecting the dataframe?

data is a data frame. I am using the pipe operator %>% (which dplyr re-exports from the magrittr package) to express the operations that I want to apply to data, such as count(). I could also write this as count(data, Site.ID) and get the same result.
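
To make the equivalence concrete, here is a minimal sketch on a made-up tibble (toy is a hypothetical name, not your data). It compares the piped call, the direct call, and the longer group_by() plus summarise() form, which should give the same counts:

```r
library(dplyr)

# Made-up example data: two rows for H1, three rows for H2
toy <- tibble::tibble(Site.ID = c("H1", "H1", "H2", "H2", "H2"))

# The pipe passes `toy` as the first argument of count()
piped <- toy %>% count(Site.ID)

# The same call written without the pipe
direct <- count(toy, Site.ID)

# Longhand version: group the rows, then count the rows in each group
longhand <- toy %>%
  group_by(Site.ID) %>%
  summarise(n = n())

# Check that the piped and direct forms agree
all.equal(piped, direct)
```

The pipe does not change what count() does; it only changes how the call is written, which is easier to read once several steps are chained.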

If you read this chapter of the book that I recommended to you before, you will understand how to use the pipe operator.
