How to filter a dataframe according to several columns

I have a data frame named DF with three columns: longitude, latitude, and value. How can I filter DF so that it keeps only the rows whose longitude and latitude pairs appear in a smaller data frame, DF_include?

DF
lon lat value
-105.3125 39.9375 21
-105.4375 39.9375 10
-105.3125 40.0625 16
-105.4375 40.0625 11
...

DF_include
lon lat
-105.3125 39.9375
-105.4375 40.0625
...

I tried the code below, but it does nothing to DF. Thanks for your help.
DF_sub = subset(DF, (lon = DF_include$lon)&(lat = DF_include$lat))

You can perform an inner join on the shared lon and lat columns:

df <- data.frame(
    lon = c(-105.3125, -105.4375, -105.3125, -105.4375),
    lat = c(39.9375, 39.9375, 40.0625, 40.0625),
    value = c(21L, 10L, 16L, 11L))
df_include <- data.frame(
    lon = c(-105.3125, -105.4375),
    lat = c(39.9375, 40.0625))
library(dplyr)
df %>% 
    inner_join(df_include)
#> Joining, by = c("lon", "lat")
#>         lon     lat value
#> 1 -105.3125 39.9375    21
#> 2 -105.4375 40.0625    11

Created on 2019-04-28 by the reprex package (v0.2.1.9000)
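
As a side note on why the original attempt leaves DF unchanged: inside subset(), `=` assigns rather than compares, so the condition most likely evaluates to TRUE for every row. Even with `==`, the comparison would be element-wise (with recycling) rather than a match on coordinate pairs. If only the columns of df are wanted, a filtering join is an alternative to inner_join(). The following is a minimal sketch that reuses the example data from above, with the join columns named explicitly:

```r
library(dplyr)

df <- data.frame(
    lon = c(-105.3125, -105.4375, -105.3125, -105.4375),
    lat = c(39.9375, 39.9375, 40.0625, 40.0625),
    value = c(21L, 10L, 16L, 11L))
df_include <- data.frame(
    lon = c(-105.3125, -105.4375),
    lat = c(39.9375, 40.0625))

# semi_join() keeps the rows of df that have a matching (lon, lat) pair
# in df_include, without adding any columns from df_include
df_sub <- semi_join(df, df_include, by = c("lon", "lat"))
```

Joining on floating-point coordinates relies on the values being exactly equal in both data frames, which holds for this example but may not for coordinates that come from different computations.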

I have a follow-up question. Besides selecting the "DF_include" grid cells, is it possible to fill the other cells in "DF" with NAs and still maintain the rectangular shape of "DF"? Right now, the grid cells not in "DF_include" are blank, and I want them to hold NAs in "DF". Thanks again.

I'm not sure I understand you correctly, but I think you want to replace the inner_join() with a left_join().
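
Note that a plain left_join() against df_include returns df unchanged here, because df_include contributes no columns beyond the join keys. If the goal is to keep every row of df and set value to NA outside df_include, one possible reading can be sketched as follows. The `keep` column is a hypothetical helper introduced only for this illustration, and the data are the example values from above:

```r
library(dplyr)

df <- data.frame(
    lon = c(-105.3125, -105.4375, -105.3125, -105.4375),
    lat = c(39.9375, 39.9375, 40.0625, 40.0625),
    value = c(21L, 10L, 16L, 11L))
df_include <- data.frame(
    lon = c(-105.3125, -105.4375),
    lat = c(39.9375, 40.0625))

df_masked <- df %>%
    # flag the cells to retain; unmatched rows of df get keep = NA
    left_join(mutate(df_include, keep = TRUE), by = c("lon", "lat")) %>%
    # blank out the value wherever the cell is not in df_include
    mutate(value = ifelse(is.na(keep), NA_integer_, value)) %>%
    select(-keep)
```

With this sketch, df_masked keeps all four rows in their original order, and value is NA in the second and third rows.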

That is not quite what I meant, but I will stick with the former solution.
