Help with subset

On learning resources, see The Big Book of R. A good introductory course is R Basics. For tutoring, there's a jobs board here (see this example).

Every R problem can usefully be thought of as the interaction of three objects: an existing object, x, a desired object, y, and a function, f, that returns y given x as an argument. In other words, school algebra: f(x) = y. Any of the objects can be composite.

Here's an illustration of a problem similar to yours, except with rows, rather than columns.

# x is a large data frame composed of integers
# 
# simulated data to substitute for the actual x
# (rbind() of vectors returns an integer matrix, not a data frame)
set.seed(42)
(DF <- rbind(sample(-20:400, 20),
             sample(-20:400, 20),
             sample(-20:400, 20),
             sample(-20:400, 20),
             sample(-20:400, 20)))
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#> [1,]   28  300  132   53  207  125  101  400  107   282     3   306   335    68
#> [2,]  276   68  262   88  -16  191  327  339  238   293   277     3   137   278
#> [3,]  125   88  327  176  -17  205  334  194  224   389    93   241   369   109
#> [4,]  117   19  -16   12   82  207   88  308  136    55   244    14   200    -5
#> [5,]   61  387  348  381  304   89  339  275  128    36    79   277   397    70
#>      [,15] [,16] [,17] [,18] [,19] [,20]
#> [1,]   144    89    -1   349   346   366
#> [2,]   378   385   391   115   271   303
#> [3,]   351   -18   353   237   337   165
#> [4,]   336   199   227   304    97   109
#> [5,]   248   160    33   318   267   187

# Substitute a row with no values outside the range

DF[3,] <- 1:20

# y is a smaller data frame made up of the rows of x in which at least one integer lies outside the range 1:365, so let's create an object to represent that range

boring <- 1:365

# Every row of y is also a row of x, so y is a subset of x. identical(x, y) is
# TRUE if, and only if, every row of x contains at least one integer outside the
# specified range. The objective is to find all rows that have one or more
# integers outside that range.
# 
# f is a function that needs to be composed, so let's start

# determine whether a single integer is outside the specified range

find_outsider <- function(x) !(x %in% boring)

# example

find_outsider(400)
#> [1] TRUE

# determine which integers in a single row are outside the range

find_outsider(DF[1,])
#>  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
#> [13] FALSE FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE

# but we're really only interested in whether there is at least one
any(find_outsider(DF[1, ]))
#> [1] TRUE

# let's glue this together; x is the matrix and i is the index of the row to test

pick_rows <- function(x, i) any(find_outsider(x[i, ]))

# create a logical vector to hold results, one element per row

hits <- logical(nrow(DF))

# loop over rows

for (i in seq_len(nrow(DF))) hits[i] <- pick_rows(DF, i)

# use the hits logical vector to subset DF


DF[hits,]
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#> [1,]   28  300  132   53  207  125  101  400  107   282     3   306   335    68
#> [2,]  276   68  262   88  -16  191  327  339  238   293   277     3   137   278
#> [3,]  117   19  -16   12   82  207   88  308  136    55   244    14   200    -5
#> [4,]   61  387  348  381  304   89  339  275  128    36    79   277   397    70
#>      [,15] [,16] [,17] [,18] [,19] [,20]
#> [1,]   144    89    -1   349   346   366
#> [2,]   378   385   391   115   271   303
#> [3,]   336   199   227   304    97   109
#> [4,]   248   160    33   318   267   187
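
The same result can be had without an explicit loop. What follows is only a minimal sketch, not part of the reprex above: the small matrix m is made up to stand in for DF, and the names outside, row_hits and col_hits are hypothetical. Passing 2 instead of 1 as the second argument of apply() tests columns rather than rows, which is closer to the original question.

```r
# small made-up matrix standing in for DF (an assumption for illustration)
m <- rbind(c(10, 400,  30),
           c( 1,   2,   3),
           c(-5, 100, 200))

boring <- 1:365

# TRUE for every value that lies outside the range
outside <- function(v) !(v %in% boring)

# one logical per row: does the row contain at least one outsider?
row_hits <- apply(m, 1, function(r) any(outside(r)))
row_hits
#> [1]  TRUE FALSE  TRUE

# keep only the rows with an outsider; drop = FALSE keeps a matrix
# even if just one row is selected
m[row_hits, , drop = FALSE]

# the same test by column (second argument 2 instead of 1)
col_hits <- apply(m, 2, function(cl) any(outside(cl)))
m[, col_hits, drop = FALSE]
```

For a purely numeric matrix, a comparison such as rowSums(m < 1 | m > 365) > 0 should give the same logical vector for integer-valued data, but the apply() form reuses the %in% test from the illustration.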

The illustration makes use of the subset operator [, which allows easy selection of rows, columns, or both rows and columns.

mtcars[1:2,]
#>               mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4      21   6  160 110  3.9 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag  21   6  160 110  3.9 2.875 17.02  0  1    4    4
mtcars[,3:4]
#>                      disp  hp
#> Mazda RX4           160.0 110
#> Mazda RX4 Wag       160.0 110
#> Datsun 710          108.0  93
#> Hornet 4 Drive      258.0 110
#> Hornet Sportabout   360.0 175
#> Valiant             225.0 105
#> Duster 360          360.0 245
#> Merc 240D           146.7  62
#> Merc 230            140.8  95
#> Merc 280            167.6 123
#> Merc 280C           167.6 123
#> Merc 450SE          275.8 180
#> Merc 450SL          275.8 180
#> Merc 450SLC         275.8 180
#> Cadillac Fleetwood  472.0 205
#> Lincoln Continental 460.0 215
#> Chrysler Imperial   440.0 230
#> Fiat 128             78.7  66
#> Honda Civic          75.7  52
#> Toyota Corolla       71.1  65
#> Toyota Corona       120.1  97
#> Dodge Challenger    318.0 150
#> AMC Javelin         304.0 150
#> Camaro Z28          350.0 245
#> Pontiac Firebird    400.0 175
#> Fiat X1-9            79.0  66
#> Porsche 914-2       120.3  91
#> Lotus Europa         95.1 113
#> Ford Pantera L      351.0 264
#> Ferrari Dino        145.0 175
#> Maserati Bora       301.0 335
#> Volvo 142E          121.0 109
mtcars[1,1]
#> [1] 21
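
Besides numeric positions, [ also accepts a logical vector, which is how DF[hits,] worked earlier. As a minimal sketch of the column case (the data frame d, its column names and has_outsider are made up for illustration), this selects columns by a logical test. Adding drop = FALSE keeps the result a data frame even when only one column survives.

```r
# made-up data frame for illustration
d <- data.frame(p = c(1, 2, 3),
                q = c(0, 500, 7),
                r = c(10, 20, 30))

# TRUE for each column that contains a value outside 1:365
has_outsider <- vapply(d, function(col) any(!(col %in% 1:365)), logical(1))

# logical subsetting of columns; drop = FALSE keeps a data frame
d[, has_outsider, drop = FALSE]
#>     q
#> 1   0
#> 2 500
#> 3   7

# without drop = FALSE a single column collapses to a plain vector
d[, has_outsider]
#> [1]   0 500   7
```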