Creating Dummy Variables Based on Two Conditions

Could someone please assist me in creating dummy variables based on two conditions (State = and Year >=)? I am using panel data and would like to create a dummy variable for the revocation of a policy. In other words, certain states revoked this policy at different points in time and I would like to add this variable but am having a very hard time getting it written correctly.

Here is what I am looking for... hypothetically, in 2015 Arizona revokes policy, all others keep it. In 2016 Deleware revokes policy, all others keep it. In 2017 California revokes policy, all others keep it. And so on...

Any help would be greatly appreciated!

|2015| AL| 1|
|2015| AR| 0|
|2015| CA| 1|
|2015| CO| 1|
|2015| DE| 1|
|2016| AL| 1|
|2016| AR| 0|
|2016| CA| 1|
|2016| CO| 1|
|2016| DE| 0|
|2017| AL| 1|
|2017| AR| 0|
|2017| CA| 0|
|2017| CO| 1|
|2017| DE| 0|
|2018| AL| 1|
|2018| AR| 0|
|2018| CA| 0|
|2018| CO| 0|
|2018| DE| 0 |
|2019| AL| 0|
|2019| AR| 0|
|2019| CA| 0|
|2019| CO| 0|
|2019| DE| 0|

Is the third column what you are trying to achieve, or is it something that you already have in your data and that you would like to use as a condition for generating another (fourth) column? Please provide some additional context on what you are trying to achieve so that we might be able to help you out.

It would represent what I am trying to achieve.

Essentially I am trying to run a D.I.D. regression to show differences between states in which the law was enacted and was not. Data is across all 50 states, running from 2001 to 2020.

Perhaps something like this.

library(tidyverse)

policy <- tibble::tribble(
              ~year,  ~state,
            2015L, "AL",
            2015L, "AR",
            2015L, "CA",
            2015L, "CO",
            2015L, "DE",
            2016L, "AL",
            2016L, "AR",
            2016L, "CA",
            2016L, "CO",
            2016L, "DE",
            2017L, "AL",
            2017L, "AR",
            2017L, "CA",
            2017L, "CO",
            2017L, "DE",
            2018L, "AL",
            2018L, "AR",
            2018L, "CA",
            2018L, "CO",
            2018L, "DE",
            2019L, "AL",
            2019L, "AR",
            2019L, "CA",
            2019L, "CO",
            2019L, "DE"
)

policy %>% 
  mutate(dummy = case_when(year >= 2015 & state == "AR" ~ 0,
                           year >= 2016 & state == "DE" ~ 0,
                           year >= 2017 & state == "CA" ~ 0,
                           TRUE ~ 1))
#> # A tibble: 25 × 3
#>     year state dummy
#>    <int> <chr> <dbl>
#>  1  2015 AL        1
#>  2  2015 AR        0
#>  3  2015 CA        1
#>  4  2015 CO        1
#>  5  2015 DE        1
#>  6  2016 AL        1
#>  7  2016 AR        0
#>  8  2016 CA        1
#>  9  2016 CO        1
#> 10  2016 DE        0
#> # … with 15 more rows

Created on 2021-10-11 by the reprex package (v2.0.1)

That was ver close to what I originally had, however, this error kept me from running it....

"Error: Problem with mutate() input repeals_after_2001.
x Case 1 (Year >= 2010 & State == 4) must be a two-sided formula, not a logical vector.
:information_source: Input repeals_after_2001 is case_when(Year >= 2010 & State == 4, 0, TRUE ~ 0)."

Any thoughts?

Thanks again

Looks like you're missing a tilde (~) in your repeals_after_2001, which should be

case_when(Year >= 2010 & State == 4 ~ 0, TRUE ~ 0)

(Although this only makes a new column with only 0s.)

That was a typo carrying over from something else. Adding the tilde and changing to "TRUE ~ 1" produced no errors, so hopefully, when I go back and look at all of the states it will work.

Thank you so much for the help, I have been banging my head against a wall trying to figure out what was going wrong.

Best.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.