Synth package: Error "undefined columns selected"

I'm trying to create a synthetic control using the R package "Synth". However, my code isn't working. I'm receiving the message "undefined columns selected", even though my unit.variable is in fact numeric (from 1 to 27).
I'm working with a panel dataset. My data is annual, ranging from 2012 to 2021, and I have a dependent variable, "hompm", for 27 states, "uf", each with an "id" (a number from 1 to 27).
My code is as follows:

# Loading synthetic control
install.packages("Synth")
library("Synth")

> # dataprep for Synth
> dataprep.out <- dataprep(foo = dfss,
+   predictors = c("pibpc", 
+                  "gini", 
+                  "pop", 
+                  "ocuppc", 
+                  "pp1624", 
+                  "salmed"),
+   predictors.op = "mean",
+   time.predictors.prior = 2012:2018,
+   special.predictors = 
+     list(list("hompm", 2012:2018, "mean"), 
+          list("osppc", 2012:2018, "mean"), 
+          list("gini", 2012:2018, "mean"), 
+          list("popppc", 2012:2018, "mean"),
+          list("pp1624", 2012:2018, "mean")), 
+   dependent = "hompm",
+   unit.variable = "id",
+   unit.names.variable = 23,
+   time.variable = "ano",
+   treatment.identifier = "trat",
+   controls.identifier = c(1:22,24:27),
+   time.optimize.ssr = 2012:2018,
+   time.plot = 2012:2021)
Error in `[.data.frame`(foo, , unit.names.variable) :
undefined columns selected

Here is a sample of my dataset dfss:

> dput(head(dfss))
structure(list(id = c(1, 1, 1, 1, 1, 1), 
uf = c("AC", "AC", "AC", "AC", "AC", "AC"), ano = c(2012, 2013, 2014, 2015, 2016, 2017), 
pop = c(758786, 776463, 790101, 803513, 816687, 829619), 
trat = c(0, 0, 0, 0, 0, 0), 
osppc = c(416.344970703125, 494.075622558594, 521.079528808594, 596.987121582031, 608.531005859375, 511.153869628906), 
gini = c(0.569999992847443, 0.550000011920929, 0.529999971389771, 0.550000011920929, 0.560000002384186, 0.550000011920929), 
salmed = c(1200, 1221, 1305, 1455, 1474, 1446), 
ocuppc = c(0.000390096800401807, 0.000383791630156338, 0.000388557906262577, 0.00038953943294473, 0.00035999104147777, 0.000356790289515629), 
pibpc = c(13.1789464950562, 14.166805267334, 16.453592300415, 17.4234886169434, 17.1424312591553, 16.8752155303955), 
hompm = c(27.4122085571289, 30.1366577148438, 29.3633346557617, 
27.0064086914062, 44.4478721618652, 62.1972236633301), 
pp1624 = c(15.3000001907349, 14.3000001907349, 14.6000003814697, 15.1000003814697, 13.3000001907349, 12.8000001907349)), 
row.names = c(NA, 6L), 
class = "data.frame")

Could any of you help me, please? What is this error about?

I'm struggling with it for my master's thesis! Thanks in advance :)

Hi @edison.jinzo,
I’m not familiar with the package you are using, but in the “special.predictors” argument you refer to a variable called “popppc”, which doesn’t appear in your data frame.
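
A quick way to check this is sketched below in base R. It assumes dfss has exactly the 12 columns shown in the dput output; dfss_toy is a made-up one-row stand-in with placeholder values, not the real data. The sketch also looks at the line the traceback points to, `[.data.frame`(foo, , unit.names.variable): since unit.names.variable = 23 is passed as a number, it seems to be read as a column position, and a 12-column data frame has no column 23.

```r
# Hypothetical one-row stand-in for dfss, using the 12 column names
# from the dput output (values are placeholders, not real data).
cols <- c("id", "uf", "ano", "pop", "trat", "osppc", "gini", "salmed",
          "ocuppc", "pibpc", "hompm", "pp1624")
dfss_toy <- as.data.frame(setNames(as.list(rep(0, length(cols))), cols))
dfss_toy$uf <- "AC"

# Every column name passed to dataprep() in the question
used <- c("pibpc", "gini", "pop", "ocuppc", "pp1624", "salmed",  # predictors
          "hompm", "osppc", "popppc",                            # dependent / special.predictors
          "id", "ano")                                           # unit.variable / time.variable

# Names used in the call that are not columns of the data frame
missing_cols <- setdiff(used, names(dfss_toy))
print(missing_cols)

# Selecting column 23 from a 12-column data frame; capture the error message
print(ncol(dfss_toy))
err <- tryCatch(dfss_toy[, 23], error = function(e) conditionMessage(e))
print(err)
```

If that reading is right, two things to try would be pointing unit.names.variable at the character column "uf" and passing the treated unit's id (23) as treatment.identifier instead of "trat". I have not run this against the Synth package itself, so treat it as a suggestion to test rather than a confirmed fix.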
