How do I refer to the left hand side dataframe with group_by?

Usually I would use . to refer to the LHS, but that does not seem to work per group after group_by.

df |>
  group_by(col1) |>
  mutate(col2 = rbinom(n = nrow(.),
                       1,
                       prob = 0.8))

Is there a way to get nrow() of that particular group_by group?

{dplyr} has a number of "context-dependent expressions", one of which is cur_data(), which may help here:

mtcars |> 
  dplyr::group_by(cyl, am) |> 
  dplyr::summarise(nrow = nrow(dplyr::cur_data()))
#> `summarise()` has grouped output by 'cyl'. You can override using the `.groups`
#> argument.
#> # A tibble: 6 × 3
#> # Groups:   cyl [3]
#>     cyl    am  nrow
#>   <dbl> <dbl> <int>
#> 1     4     0     3
#> 2     4     1     8
#> 3     6     0     4
#> 4     6     1     3
#> 5     8     0    12
#> 6     8     1     2

Created on 2022-10-17 with reprex v2.0.2
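
A note for newer installations: cur_data() was deprecated in dplyr 1.1.0 in favour of pick(). A minimal sketch of the same count, assuming dplyr 1.1.0 or later is installed (the result name `counts` and the `.groups = "drop"` argument are my own additions, not part of the reprex above):

```r
# Assumes dplyr >= 1.1.0, where pick() is available.
# pick(everything()) returns the current group's non-grouping columns
# as a tibble, so nrow() gives the group size.
counts <- mtcars |>
  dplyr::group_by(cyl, am) |>
  dplyr::summarise(
    nrow = nrow(dplyr::pick(dplyr::everything())),
    .groups = "drop"  # drop grouping so no message about grouped output is printed
  )

counts
```

This should give the same six group counts as the cur_data() version.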


You can also use dplyr::n(), which returns the number of rows in the current group:

mtcars |> 
  dplyr::group_by(cyl, am) |> 
  dplyr::summarise(nrow = dplyr::n())
#> `summarise()` has grouped output by 'cyl'. You can override using the `.groups`
#> argument.
#> # A tibble: 6 × 3
#> # Groups:   cyl [3]
#>     cyl    am  nrow
#>   <dbl> <dbl> <int>
#> 1     4     0     3
#> 2     4     1     8
#> 3     6     0     4
#> 4     6     1     3
#> 5     8     0    12
#> 6     8     1     2

Created on 2022-10-17 by the reprex package (v2.0.1)
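
Applied to the mutate() call from the question, this might look like the sketch below. The small `df` here is a hypothetical stand-in for the original data, and set.seed() is only there to make the draw reproducible:

```r
set.seed(42)  # assumption: fixed seed, only for reproducibility

# Hypothetical stand-in for the question's `df`
df <- dplyr::tibble(col1 = c("a", "a", "b", "b", "b"))

result <- df |>
  dplyr::group_by(col1) |>
  # n() is the size of the current group, so rbinom() draws one value per row
  dplyr::mutate(col2 = rbinom(n = dplyr::n(), size = 1, prob = 0.8)) |>
  dplyr::ungroup()

result
```

Because n() is evaluated once per group, each group gets exactly as many draws as it has rows, which is what nrow(.) was meant to supply.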

Kind regards

