Is it possible to make dplyr::between() non-inclusive?


#1

I'm trying to drop every row from the point where some condition is met, including the row that meets the condition. dplyr::between() gets me most of the way there, except that the result still contains the row that meets the condition.

Here's an example:

library(dplyr)

mtcars %>% 
  as_tibble() %>% 
  filter(between(row_number(), 1, which(mpg == 17.8)))

#> # A tibble: 11 x 11
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
#>  2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
#>  4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
#>  5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
#>  6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
#>  7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
#>  8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4
#> 11  17.8     6  168.   123  3.92  3.44  18.9     1     0     4     4

So, this contains the row in which mpg = 17.8. Any idea how I can get it so it doesn't include that row? Thanks!


#2

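At the moment you have to use inequalities. Note that the strict 1 < row_number() below makes the lower bound exclusive as well, so row 1 is dropped; use 1 <= row_number() if only the upper bound should be exclusive: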

library(dplyr)

mtcars %>% 
    as_tibble() %>% 
    filter(1 < row_number(), row_number() < which(mpg == 17.8))
#> Warning in filter_impl(.data, quo): hybrid evaluation forced for
#> `row_number`. Please use dplyr::row_number() or library(dplyr) to remove
#> this warning.

#> Warning in filter_impl(.data, quo): hybrid evaluation forced for
#> `row_number`. Please use dplyr::row_number() or library(dplyr) to remove
#> this warning.
#> # A tibble: 9 x 11
#>     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1  21       6  160    110  3.9   2.88  17.0     0     1     4     4
#> 2  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
#> 3  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
#> 4  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
#> 5  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
#> 6  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
#> 7  24.4     4  147.    62  3.69  3.19  20       1     0     4     2
#> 8  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
#> 9  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4

(Side note: I have no idea why the warnings are popping up; this usage seems unambiguous. Using dplyr::row_number() does make them go away. Reported.)

I generally use inequalities anyway, since "between" is usually exclusive in everyday English, whereas dplyr::between() follows the SQL BETWEEN operator, which is inclusive at both ends.

Using inequalities removes any ambiguity.
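
If you need this often, the inequalities can be wrapped in a small helper. The following is only a sketch: between_open() and its incl_left / incl_right arguments are hypothetical names made up for illustration, not part of dplyr.

```r
# Hypothetical helper (not a dplyr function): like between(), but each
# bound is exclusive unless explicitly marked as inclusive.
between_open <- function(x, left, right, incl_left = FALSE, incl_right = FALSE) {
  lower_ok <- if (incl_left) x >= left else x > left
  upper_ok <- if (incl_right) x <= right else x < right
  lower_ok & upper_ok
}

# Both bounds exclusive
between_open(1:5, 1, 4)
#> [1] FALSE  TRUE  TRUE FALSE FALSE

# Lower bound inclusive, upper bound exclusive
between_open(1:5, 1, 4, incl_left = TRUE)
#> [1]  TRUE  TRUE  TRUE FALSE FALSE
```

Inside filter() it would be used the same way as between(), e.g. filter(between_open(row_number(), 1, which(mpg == 17.8), incl_left = TRUE)) to keep row 1 but drop the matching row.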


#3

Because row numbers are integers, x <= 1 is equivalent to x < 2.
So if you don't want the row that meets the condition, just use the previous row number as the upper bound (which(mpg == 17.8) - 1).

library(dplyr, warn.conflicts = FALSE)
mtcars %>% 
  as_tibble() %>% 
  filter(between(row_number(), 1, which(mpg == 17.8) - 1))
#> # A tibble: 10 x 11
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
#>  2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
#>  4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
#>  5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
#>  6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
#>  7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
#>  8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4

Created on 2018-08-12 by the reprex package (v0.2.0).
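
One caveat with the which() approaches above: they rely on the condition matching exactly one row. If it matched several rows, which() would return several positions, and with no match it returns a zero-length vector, so neither gives a usable single bound. A possible alternative, sketched here under the assumption that "everything before the first match" is what is wanted, is dplyr's cumany(), which avoids computing a row index at all. The name before_first is just an illustrative variable.

```r
library(dplyr, warn.conflicts = FALSE)

# cumany() becomes TRUE at the first row where the condition holds and
# stays TRUE from then on, so negating it keeps only the rows strictly
# before the first match.
before_first <- mtcars %>%
  as_tibble() %>%
  filter(!cumany(mpg == 17.8))

nrow(before_first)
#> [1] 10

# With no match at all, every row is kept instead of raising an error.
no_match <- mtcars %>%
  as_tibble() %>%
  filter(!cumany(mpg == 999))

nrow(no_match)
#> [1] 32
```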