Create a subset of a panel data set

Hi, @MLent! Thanks for including some of your data. There are a couple of things you can do to make it easier for folks here to help with your question. The first is formatting your code as code so it's easier to read and to copy and paste into an R console. Basically, you just enclose your code between lines of three backticks, like this:

``` r
reg <- plm(y~x, data=subset(df, ID[Variable>1000]), model="within")
```

Also, to make it easier for folks here to read and work with, it's better to create an R object with your sample data and post it here. This post has some good tips for how to include sample data:

So, with your example, I would do something like the following:

``` r
# create sample data
my_data <- tibble::tribble(
  ~ID, ~Time, ~Variable,
  1, 1, 123,
  1, 2, 1001,
  1, 3, 90,
  2, 1, 1111,
  2, 2, 222,
  2, 3, 2222,
  3, 1, 200,
  3, 2, 2000,
  3, 3, 4000
)
```

(I added more fake data to make the example a bit more clear.)
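
If your data already sits in a data frame in your R session, base R's `dput()` is one way to get pasteable code for it without typing the values out by hand. Here is a minimal sketch; `small_df` is a made-up stand-in for your own data, not something from your post:

```r
# hypothetical stand-in for a data frame you already have loaded
small_df <- data.frame(
  ID = c(1, 1, 2),
  Time = c(1, 2, 1),
  Variable = c(123, 1001, 1111)
)

# dput() prints R code that recreates the object;
# paste that printed code into your post
dput(small_df)
```

For a large dataset, something like `dput(head(small_df, 20))` keeps the post short while still giving people real rows to work with.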

To manipulate data, I like to use the dplyr package, which is part of the tidyverse. It can sometimes be a little more verbose than other ways of coding in R, but I think it makes the code easier to understand!

So here is how I would create a subset of the data you describe. First I find which IDs meet the conditions you define, and then I use those IDs to subset the full dataset.

``` r
library(dplyr)

# create vector of IDs meeting condition
my_ids <- my_data %>%
  filter(Time == 2 & Variable > 1000) %>%
  pull(ID)
my_ids
#> [1] 1 3

# subset data using that vector
my_subset <- my_data %>%
  filter(ID %in% my_ids)
my_subset
#> # A tibble: 6 x 3
#>      ID  Time Variable
#>   <dbl> <dbl>    <dbl>
#> 1     1     1      123
#> 2     1     2     1001
#> 3     1     3       90
#> 4     3     1      200
#> 5     3     2     2000
#> 6     3     3     4000
```

Created on 2018-11-15 by the reprex package (v0.2.1)
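
If you'd rather not keep the intermediate vector of IDs around, the same subset can probably be built in one pipeline with a grouped filter. This is just a sketch of an alternative, assuming the same sample data and the same condition (`Variable > 1000` at `Time == 2`); `my_subset2` is a name I made up for the result:

```r
library(dplyr)

# same sample data as above, repeated so this runs on its own
my_data <- tibble::tribble(
  ~ID, ~Time, ~Variable,
  1, 1, 123,
  1, 2, 1001,
  1, 3, 90,
  2, 1, 1111,
  2, 2, 222,
  2, 3, 2222,
  3, 1, 200,
  3, 2, 2000,
  3, 3, 4000
)

# within each ID, keep all rows if any row meets the condition
my_subset2 <- my_data %>%
  group_by(ID) %>%
  filter(any(Time == 2 & Variable > 1000)) %>%
  ungroup()
```

Either way, the resulting data frame is what you would then pass as the `data` argument in your `plm()` call, instead of subsetting inside the call.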
