Choose a subset based on another data.frame, and a View problem


#1

Hello everyone,

I want to pick a subset of a data frame based on whether the values in one of its columns exist in another data frame. For example, I want to keep only the rows of "orig" whose "key_id" also appears in "sub". Because the real data set is large, I am looking for an efficient way to do this.

orig <- tribble(
  ~key_id, ~num,
  1, 0,
  2, 2,
  8, 4,
  13, 2   
)

sub <- tribble(
  ~key_id,
  2,
  13   
)

My second question is about viewing a data.frame: the leftmost column of "treat" is not shown as 1, 2, 3, ... but as 3, 9, 10 and so on. Is there any way to make it look normal?


Many thanks,
Lernst


#2

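What you want to achieve is usually done with joins. In your case the right one is a semi join, dplyr::semi_join, which keeps the rows of the first table that have a match in the second table without adding any columns from it: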

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

orig <- tribble(
  ~key_id, ~num,
  1, 0,
  2, 2,
  8, 4,
  13, 2   
)

sub <- tribble(
  ~key_id,
  2,
  13   
)

semi_join(orig, sub, by = "key_id")
#> # A tibble: 2 x 2
#>   key_id   num
#>    <dbl> <dbl>
#> 1      2     2
#> 2     13     2

Created on 2018-08-05 by the reprex package (v0.2.0).
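
For comparison, the same filtering can be sketched in base R with %in%, which is a different technique from the join above but gives the same rows here. This is only a minimal sketch: it rebuilds the example data as plain data.frames and assumes a single key column; the name "picked" is just for illustration.

```r
# Rebuild the example data as plain data.frames (no packages needed)
orig <- data.frame(key_id = c(1, 2, 8, 13), num = c(0, 2, 4, 2))
sub  <- data.frame(key_id = c(2, 13))

# Keep the rows of orig whose key_id appears anywhere in sub$key_id
picked <- orig[orig$key_id %in% sub$key_id, ]
picked$key_id
```

With several key columns, semi_join is usually the more convenient choice, since %in% compares one vector at a time.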

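What you see in the leftmost column of View are the row names of your data frame. In the original dataset they are simply the integers from 1 to the number of rows, and a subset keeps the row names of the rows it retains, which is presumably why you now see 3, 9, 10 and so on. You can convert them to an explicit column with tibble::rownames_to_column and then do with them what you need.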
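
The row-name behaviour can be reproduced with a small sketch. It assumes "treat" is a plain data.frame produced by subsetting a larger one; "full", "with_ids" and "orig_row" are hypothetical names used only for illustration, and the data is made up to mirror the example above.

```r
library(tibble)

# Hypothetical stand-in for the full dataset, with default row names 1..4
full <- data.frame(key_id = c(1, 2, 8, 13), num = c(0, 2, 4, 2))

# Subsetting keeps the row names of the retained rows (here "2" and "4"),
# which is what View shows in its leftmost column
treat <- full[full$key_id %in% c(2, 13), ]
rownames(treat)

# Option 1: keep the old row names as an explicit column
with_ids <- rownames_to_column(treat, var = "orig_row")

# Option 2: drop the old row names so they count from 1 again
rownames(treat) <- NULL
rownames(treat)
```

If the old positions carry no meaning, resetting with rownames(treat) <- NULL is the simpler of the two options.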