Propensity scores

Hi- I am trying to do a propensity matched analysis but am having a lot of trouble. I have a large data set with an exposure coded as 0 (no exposure) and 1 (exposure) and am trying to match based on a couple of variables. Basically I was trying to follow a tutorial on propensity matching via Coursera but am getting a really weird output. My initial dataset has 2,202 distinct observations. However, once I do the matching, my dataset has 3,074 observations, which is obviously not supposed to happen.

Does anyone know what I'm doing wrong? I have been trying to troubleshoot for the past week but keep coming up empty-handed.

Here is what I'm doing:

create a dataset with my covariates, exposure, and outcome

race <- as.numeric(cohort$race_eth)
insurance <- as.numeric(cohort$privateinsurance)
language <- as.numeric(cohort$primarylanguage)
bloodpressure <- as.numeric(cohort$bloodpressure)
bmi <- cohort$bmiatdelivery
exp <- as.numeric(cohort$prechange)
out <- as.numeric(cohort$tdapvaccinedate_yn)

merge new dataset

propensity <- cbind(race, insurance, language, bloodpressure, bmi, exp, out)
propensity <- data.frame(propensity)

covariates to use in matching

xvars <- c("race", "insurance", "language", "bloodpressure", "age", "bmi")
table1 <- CreateTableOne(vars=xvars, strata="exp",data=propensity, test=FALSE)
print(table1, smd=TRUE)

do matching

greedymatch <- Match(Tr=propensity$exp, M=1, X=propensity[xvars])
matched <- propensity[unlist(greedymatch[c("index.treated", "index.control")]),] # THIS IS WHERE THE PROBLEM OCCURS SHOWING THAT I HAVE 3074 OBSERVATIONS

See the FAQ: How to do a minimal reproducible example (reprex) for beginners.

Without cohort or another dataset that surfaces the same issue, it's not practical to attempt an answer.

Hi- thanks for your comment! I have done what you said and have posted my code here. Any help would be greatly appreciated!!

cohort <- data.frame(
outcome = c(0, 0, 0, 1, 0),
exposure = c(0, 0, 0, 1, 1),
insurance = c(1, 1, 1, 1, 1),
language = c(3, 1, 1, 1, 1),
age = c(32, 36, 22, 26, 38),
bmi = c(23.8407, 25.354099, 29.709999, 26.9098, 36.290401),
race_eth = as.factor(c("5", "1", "2", "1", "2")),
nullip = as.factor(c("1", "0", "1", "1", "0"))
)

library(tableone)
library(Matching)

#recode variables to use in matching
race <- as.numeric(cohort$race_eth)
insurance <- as.numeric(cohort$insurance)
language <- as.numeric(cohort$language)
nullip <- as.numeric(cohort$nullip)
age <- cohort$age
bmi <- cohort$bmi
exp <- as.numeric(cohort$exposure)
out <- as.numeric(cohort$outcome)

#create new dataset
prop_score <- cbind(race, insurance, language, nullip, age, bmi, exp, out)
prop_score <- data.frame(prop_score)
xvars <- c("race", "insurance", "language", "nullip", "age", "bmi")

#table 1
table1 <- CreateTableOne(vars=xvars, strata="exp",data=prop_score, test=FALSE)
print(table1, smd=TRUE)

#matching
greedymatch <- Match(Tr=prop_score$exp, M=1, X=prop_score[xvars])
matched <- prop_score[unlist(greedymatch[c("index.treated", "index.control")]),]

The %<>% operator from {magrittr} may be unfamiliar. It just sends an object to a function and overwrites it back.
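As a small illustration on a toy vector (not the thread's data), the two assignments below should leave x and y identical. The names x and y are made up for the example.

```r
library(magrittr)

x <- c(1, 4, 9)
y <- x

x %<>% sqrt       # pipe x into sqrt() and assign the result back to x
y <- y %>% sqrt   # the same step written out in full

identical(x, y)
```

The compound form only saves typing; writing `cohort <- cohort %>% mutate(...)` does the same job.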

suppressPackageStartupMessages({
  library(dplyr)
  library(Matching)
  library(magrittr)
  library(tableone)
})

cohort <- data.frame(
  outcome = c(0, 0, 0, 1, 0),
  exposure = c(0, 0, 0, 1, 1),
  insurance = c(1, 1, 1, 1, 1),
  language = c(3, 1, 1, 1, 1),
  age = c(32, 36, 22, 26, 38),
  bmi = c(23.8407, 25.354099, 29.709999, 26.9098, 36.290401),
  race_eth = as.factor(c("5", "1", "2", "1", "2")),
  nullip = as.factor(c("1", "0", "1", "1", "0"))
)

cohort %<>% mutate(across(where(is.factor), as.numeric))
xvars <- c("race_eth", "insurance", "language", "nullip", "age", "bmi")
table1 <- CreateTableOne(vars=xvars, strata="exposure",data=cohort, test=FALSE)
print(table1, smd=TRUE)
#>                        Stratified by exposure
#>                         0            1            SMD   
#>   n                         3            2              
#>   race_eth (mean (SD))   2.00 (1.00)  1.50 (0.71)  0.577
#>   insurance (mean (SD))  1.00 (0.00)  1.00 (0.00) <0.001
#>   language (mean (SD))   1.67 (1.15)  1.00 (0.00)  0.816
#>   nullip (mean (SD))     1.67 (0.58)  1.50 (0.71)  0.258
#>   age (mean (SD))       30.00 (7.21) 32.00 (8.49)  0.254
#>   bmi (mean (SD))       26.30 (3.05) 31.60 (6.63)  1.027

greedymatch <- Match(Tr=cohort$exposure, M=1, X=cohort[xvars])

matched <- cohort[unlist(greedymatch[c("index.treated", "index.control")]),]

matched
#>   outcome exposure insurance language age     bmi race_eth nullip
#> 4       1        1         1        1  26 26.9098        1      2
#> 5       0        1         1        1  38 36.2904        2      1
#> 3       0        0         1        1  22 29.7100        2      2
#> 2       0        0         1        1  36 25.3541        1      1

Thanks for your reply! However, when I use my actual dataset I am still getting a matched cohort with more observations than my starting cohort. Everything is the same with my code, but I'm just using my actual dataset. Do you know where I could be going wrong?

If you can't use dput() to copy in your data, can you suggest one of the R built-in datasets that reproduces the problem?

I'm not quite sure what you mean by that. I'm sorry- I'm an R beginner!

The one thing that is different between my dataset and the reprex I sent you is that I had to remove all the rows with NAs in order to do the greedymatch. Do you think that could have something to do with the error? This is the code I used:

cohort[complete.cases(cohort), ]
propensity_score <- na.omit(cohort)

Can't really tell without the data.

Why don't you try to emulate the form of your modeling using the lalonde dataset in the Matching package?

Hi- thank you for your patience! I recreated the problem with the lalonde dataset.

library(Matching)
data(lalonde)
View(lalonde)
library(tableone)
library(dplyr)     # for mutate() and across()
library(magrittr)  # for %<>%

race <- as.numeric(lalonde$black)
ethnicity <- as.numeric(lalonde$hisp)
married <- as.numeric(lalonde$married)
nodegr <- as.numeric(lalonde$nodegr)
age <- lalonde$age
exposure <- lalonde$treat
outcome <- as.numeric(lalonde$re78)

#check and exclude missing values
sum(is.na(race))
sum(is.na(ethnicity))
sum(is.na(married))
sum(is.na(nodegr))
sum(is.na(age))
sum(is.na(outcome))
prop_score <- cbind(race, ethnicity, married, nodegr, age, exposure, outcome)
prop_score <- data.frame(prop_score)
prop_score[complete.cases(prop_score),]
prop <- na.omit(prop_score)

prop %<>% mutate(across(where(is.factor), as.numeric))
xvars <- c("race", "ethnicity", "married", "nodegr", "age")
table1 <- CreateTableOne(vars=xvars, strata="exposure",data=prop, test=FALSE)
print(table1, smd=TRUE)
#prop has 445 observations

greedymatch <- Match(Tr=prop$exposure, M=1, X=prop[xvars])
matched <- prop[unlist(greedymatch[c("index.treated", "index.control")]),]
#matched has 2090 observations

matchedtab1 <- CreateTableOne(vars=xvars, strata="exposure", data=matched, test=FALSE)
print(matchedtab1, smd=TRUE)

Here it is simplified. Just so I'm sure that I'm correctly oriented, the issue is why prop has 445 rows while matched has 2,090 and matchedtab1 shows 1,045 in each group?

suppressPackageStartupMessages({
  library(dplyr)
  library(Matching)
  library(magrittr)
  library(tableone)
})

data(lalonde)

which(complete.cases(lalonde) == FALSE)
#> integer(0)

prop <- lalonde[,c("black", "hisp", "married", "nodegr", "age", "treat", "re78")]

colnames(prop) <- c("race", "ethnicity", "married", "nodegr", "age", "exposure", "outcome")

nrow(prop)
#> [1] 445

xvars <- c("race", "ethnicity", "married", "nodegr", "age")

table1 <- CreateTableOne(vars = xvars, strata = "exposure", data = prop, test = FALSE)

print(table1, smd = TRUE)
#>                        Stratified by exposure
#>                         0            1            SMD   
#>   n                       260          185              
#>   race (mean (SD))       0.83 (0.38)  0.84 (0.36)  0.044
#>   ethnicity (mean (SD))  0.11 (0.31)  0.06 (0.24)  0.175
#>   married (mean (SD))    0.15 (0.36)  0.19 (0.39)  0.094
#>   nodegr (mean (SD))     0.83 (0.37)  0.71 (0.46)  0.304
#>   age (mean (SD))       25.05 (7.06) 25.82 (7.16)  0.107

table1$ContTable[1][[1]][1]  + table1$ContTable[2][[1]][1] == nrow(prop)
#> [1] TRUE

greedymatch <- Match(Tr = prop$exposure, M = 1, X = prop[xvars])

greedymatch$orig.nobs
#> [1] 445

matched <- prop[unlist(greedymatch[c("index.treated", "index.control")]), ]

nrow(matched)
#> [1] 2090

matchedtab1 <- CreateTableOne(vars = xvars, strata = "exposure", data = matched, test = FALSE)

print(matchedtab1, smd = TRUE)
#>                        Stratified by exposure
#>                         0            1            SMD   
#>   n                      1045         1045              
#>   race (mean (SD))       0.96 (0.21)  0.96 (0.21) <0.001
#>   ethnicity (mean (SD))  0.02 (0.14)  0.02 (0.14) <0.001
#>   married (mean (SD))    0.07 (0.25)  0.07 (0.25)  0.004
#>   nodegr (mean (SD))     0.91 (0.29)  0.91 (0.29)  0.003
#>   age (mean (SD))       21.98 (5.20) 22.02 (5.41)  0.007

Exactly. I'm wondering how to fix it so the matched sample is actually the matched observations from the initial dataset.

Basically my issue is that I have this large dataset that I'm running an analysis on, and I want to do a propensity matched analysis for the sensitivity analysis. When you have a dataset and create a matched sample, it should be equal to or smaller than the original dataset, depending on how good the matching is. That's why I'm wondering what I'm doing wrong here that the matched dataset has nearly five times as many observations as the original dataset.

OK. This arises from the unlist on greedymatch, which stacks index.treated and index.control (1,045 entries each) into 2,090 row indices:

 str(greedymatch)
List of 23
 $ est              : num [1, 1] 0
 $ se               : NULL
 $ est.noadj        : num 0
 $ se.standard      : num 0
 $ se.cond          : NULL
 $ mdata            :List of 4
  ..$ Y                         : num [1:2090] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ Tr                        : num [1:2090] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ X                         : int [1:2090, 1:5] 1 1 0 1 1 1 1 1 1 1 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2090] "1" "1" "2" "3" ...
  .. .. ..$ : chr [1:5] "race" "ethnicity" "married" "nodegr" ...
  ..$ orig.weighted.treated.nobs: num 185
 $ index.treated    : num [1:1045] 1 1 2 3 4 4 4 4 4 5 ...
 $ index.control    : num [1:1045] 193 211 203 261 208 224 254 292 383 291 ...
 $ index.dropped    : NULL
 $ weights          : num [1:1045] 0.5 0.5 1 1 0.2 0.2 0.2 0.2 0.2 0.5 ...
 $ orig.nobs        : int 445
 $ orig.wnobs       : num 445
 $ orig.treated.nobs: int 185
 $ nobs             : int 445
 $ wnobs            : num 185
 $ caliper          : NULL
 $ ecaliper         : NULL
 $ exact            : NULL
 $ ndrops           : num 0
 $ ndrops.matches   : num 0
 $ MatchLoopC       : num [1:1045, 1:5] 1 1 2 3 4 4 4 4 4 5 ...
 $ version          : chr "fast"
 $ estimand         : chr "ATT"
 - attr(*, "class")= chr "Match"
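
Note that index.treated repeats row numbers: treated row 4 appears five times, each time with weight 0.2. Match() defaults to replace = TRUE and ties = TRUE, so every control tied for closest is kept as its own pair. Subsetting a data frame by an index vector with repeats returns one row per index, not one row per distinct observation. Here is a minimal base-R sketch of that effect. The toy data frame and the index and weight vectors are hypothetical, only modelled on the pattern above.

```r
# Toy data: 6 rows standing in for the original cohort (hypothetical values).
toy <- data.frame(id = 1:6, exposure = c(1, 1, 0, 0, 0, 0))

# Hypothetical matched pairs in the style of Match() output with ties:
# treated row 1 is tied with two controls, treated row 2 with three.
index.treated <- c(1, 1, 2, 2, 2)
index.control <- c(3, 4, 4, 5, 6)
weights       <- c(0.5, 0.5, 1/3, 1/3, 1/3)

# Same subsetting pattern as in the thread: one row per index, repeats included.
matched <- toy[c(index.treated, index.control), ]
nrow(matched)    # 10 rows drawn from a 6-row data frame

# Distinct observations that actually take part in a match.
distinct <- toy[unique(c(index.treated, index.control)), ]
nrow(distinct)   # 6

# The weights sum to 1 for each treated row, which is how the
# duplicated pairs are meant to be counted.
tapply(weights, index.treated, sum)
```

So the 2,090 rows are matched pairs counted with repeats, and any table or model built on them would need the weights to reflect that.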

In turn, the return value of Match doesn't appear to be directly usable as input to CreateTableOne. Balance can be checked with MatchBalance instead:

suppressPackageStartupMessages({
  library(Matching)
})
data(lalonde)

#
# Estimate the propensity model
#
glm1 <- glm(treat ~ age + I(age^2) + educ + I(educ^2) + black +
  hisp + married + nodegr + re74 + I(re74^2) + re75 + I(re75^2) +
  u74 + u75, family = binomial, data = lalonde)


#
# save data objects
#
X <- glm1$fitted
Y <- lalonde$re78
Tr <- lalonde$treat

#
# one-to-one matching with replacement (the "M=1" option).
# Estimating the treatment effect on the treated (the "estimand" option defaults to ATT).
#
rr <- Match(Y = Y, Tr = Tr, X = X, M = 1)
summary(rr)
#> 
#> Estimate...  2153.3 
#> AI SE......  825.4 
#> T-stat.....  2.6088 
#> p.val......  0.0090858 
#> 
#> Original number of observations..............  445 
#> Original number of treated obs...............  185 
#> Matched number of observations...............  185 
#> Matched number of observations  (unweighted).  346

# Let's check the covariate balance

mb <- MatchBalance(treat ~ age + I(age^2) + educ + I(educ^2) + black +
  hisp + married + nodegr + re74 + I(re74^2) + re75 + I(re75^2) +
  u74 + u75, data = lalonde, match.out = rr, nboots = 500)
#> 
#> ***** (V1) age *****
#>                        Before Matching        After Matching
#> mean treatment........     25.816             25.816 
#> mean control..........     25.054             25.006 
#> std mean diff.........     10.655             11.317 
#> 
#> mean raw eQQ diff.....    0.94054            0.41618 
#> med  raw eQQ diff.....          1                  0 
#> max  raw eQQ diff.....          7                  9 
#> 
#> mean eCDF diff........   0.025364           0.010597 
#> med  eCDF diff........   0.022193          0.0086705 
#> max  eCDF diff........   0.065177           0.049133 
#> 
#> var ratio (Tr/Co).....     1.0278             1.0662 
#> T-test p-value........    0.26594            0.23472 
#> KS Bootstrap p-value..      0.504              0.606 
#> KS Naive p-value......     0.7481            0.79781 
#> KS Statistic..........   0.065177           0.049133 
#> 
#> 
#> ***** (V2) I(age^2) *****
#>                        Before Matching        After Matching
#> mean treatment........     717.39             717.39 
#> mean control..........     677.32             673.08 
#> std mean diff.........     9.2937             10.275 
#> 
#> mean raw eQQ diff.....     56.076             28.948 
#> med  raw eQQ diff.....         43                  0 
#> max  raw eQQ diff.....        721                909 
#> 
#> mean eCDF diff........   0.025364           0.010597 
#> med  eCDF diff........   0.022193          0.0086705 
#> max  eCDF diff........   0.065177           0.049133 
#> 
#> var ratio (Tr/Co).....     1.0115            0.91516 
#> T-test p-value........    0.33337            0.31819 
#> KS Bootstrap p-value..      0.504              0.606 
#> KS Naive p-value......     0.7481            0.79781 
#> KS Statistic..........   0.065177           0.049133 
#> 
#> 
#> ***** (V3) educ *****
#>                        Before Matching        After Matching
#> mean treatment........     10.346             10.346 
#> mean control..........     10.088              10.48 
#> std mean diff.........     12.806            -6.6749 
#> 
#> mean raw eQQ diff.....    0.40541            0.16185 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          2                  2 
#> 
#> mean eCDF diff........   0.028698           0.011561 
#> med  eCDF diff........   0.012682          0.0086705 
#> max  eCDF diff........    0.12651           0.052023 
#> 
#> var ratio (Tr/Co).....     1.5513             1.1917 
#> T-test p-value........    0.15017            0.45021 
#> KS Bootstrap p-value..      0.006              0.354 
#> KS Naive p-value......   0.062873            0.73726 
#> KS Statistic..........    0.12651           0.052023 
#> 
#> 
#> ***** (V4) I(educ^2) *****
#>                        Before Matching        After Matching
#> mean treatment........     111.06             111.06 
#> mean control..........     104.37             113.21 
#> std mean diff.........     17.012             -5.466 
#> 
#> mean raw eQQ diff.....     8.7189             3.1098 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....         60                 60 
#> 
#> mean eCDF diff........   0.028698           0.011561 
#> med  eCDF diff........   0.012682          0.0086705 
#> max  eCDF diff........    0.12651           0.052023 
#> 
#> var ratio (Tr/Co).....     1.6625             1.2716 
#> T-test p-value........   0.053676            0.51046 
#> KS Bootstrap p-value..      0.006              0.354 
#> KS Naive p-value......   0.062873            0.73726 
#> KS Statistic..........    0.12651           0.052023 
#> 
#> 
#> ***** (V5) black *****
#>                        Before Matching        After Matching
#> mean treatment........    0.84324            0.84324 
#> mean control..........    0.82692            0.85946 
#> std mean diff.........     4.4767            -4.4482 
#> 
#> mean raw eQQ diff.....   0.016216          0.0086705 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          1                  1 
#> 
#> mean eCDF diff........  0.0081601          0.0043353 
#> med  eCDF diff........  0.0081601          0.0043353 
#> max  eCDF diff........    0.01632          0.0086705 
#> 
#> var ratio (Tr/Co).....    0.92503             1.0943 
#> T-test p-value........    0.64736            0.57783 
#> 
#> 
#> ***** (V6) hisp *****
#>                        Before Matching        After Matching
#> mean treatment........   0.059459           0.059459 
#> mean control..........    0.10769           0.048649 
#> std mean diff.........    -20.341             4.5591 
#> 
#> mean raw eQQ diff.....   0.048649          0.0057803 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          1                  1 
#> 
#> mean eCDF diff........   0.024116          0.0028902 
#> med  eCDF diff........   0.024116          0.0028902 
#> max  eCDF diff........   0.048233          0.0057803 
#> 
#> var ratio (Tr/Co).....    0.58288             1.2083 
#> T-test p-value........   0.064043            0.41443 
#> 
#> 
#> ***** (V7) married *****
#>                        Before Matching        After Matching
#> mean treatment........    0.18919            0.18919 
#> mean control..........    0.15385            0.16667 
#> std mean diff.........     8.9995              5.735 
#> 
#> mean raw eQQ diff.....   0.037838           0.017341 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          1                  1 
#> 
#> mean eCDF diff........   0.017672          0.0086705 
#> med  eCDF diff........   0.017672          0.0086705 
#> max  eCDF diff........   0.035343           0.017341 
#> 
#> var ratio (Tr/Co).....     1.1802             1.1045 
#> T-test p-value........    0.33425            0.46741 
#> 
#> 
#> ***** (V8) nodegr *****
#>                        Before Matching        After Matching
#> mean treatment........    0.70811            0.70811 
#> mean control..........    0.83462            0.69189 
#> std mean diff.........    -27.751             3.5572 
#> 
#> mean raw eQQ diff.....    0.12432           0.014451 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          1                  1 
#> 
#> mean eCDF diff........   0.063254          0.0072254 
#> med  eCDF diff........   0.063254          0.0072254 
#> max  eCDF diff........    0.12651           0.014451 
#> 
#> var ratio (Tr/Co).....     1.4998            0.96957 
#> T-test p-value........  0.0020368            0.49161 
#> 
#> 
#> ***** (V9) re74 *****
#>                        Before Matching        After Matching
#> mean treatment........     2095.6             2095.6 
#> mean control..........       2107             1624.3 
#> std mean diff.........   -0.23437             9.6439 
#> 
#> mean raw eQQ diff.....     487.98             467.33 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....       8413              12410 
#> 
#> mean eCDF diff........   0.019223           0.019782 
#> med  eCDF diff........     0.0158           0.018786 
#> max  eCDF diff........   0.047089           0.046243 
#> 
#> var ratio (Tr/Co).....     0.7381             2.2663 
#> T-test p-value........    0.98186            0.22745 
#> KS Bootstrap p-value..       0.55              0.256 
#> KS Naive p-value......    0.97023             0.8532 
#> KS Statistic..........   0.047089           0.046243 
#> 
#> 
#> ***** (V10) I(re74^2) *****
#>                        Before Matching        After Matching
#> mean treatment........   28141434           28141434 
#> mean control..........   36667413           13117852 
#> std mean diff.........    -7.4721             13.167 
#> 
#> mean raw eQQ diff.....   13311731           10899373 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....  365146387          616156569 
#> 
#> mean eCDF diff........   0.019223           0.019782 
#> med  eCDF diff........     0.0158           0.018786 
#> max  eCDF diff........   0.047089           0.046243 
#> 
#> var ratio (Tr/Co).....    0.50382             7.9006 
#> T-test p-value........    0.51322            0.08604 
#> KS Bootstrap p-value..       0.55              0.256 
#> KS Naive p-value......    0.97023             0.8532 
#> KS Statistic..........   0.047089           0.046243 
#> 
#> 
#> ***** (V11) re75 *****
#>                        Before Matching        After Matching
#> mean treatment........     1532.1             1532.1 
#> mean control..........     1266.9             1297.6 
#> std mean diff.........     8.2363             7.2827 
#> 
#> mean raw eQQ diff.....     367.61             211.42 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....     2110.2             8195.6 
#> 
#> mean eCDF diff........   0.050834           0.023047 
#> med  eCDF diff........   0.061954           0.023121 
#> max  eCDF diff........    0.10748           0.057803 
#> 
#> var ratio (Tr/Co).....     1.0763             1.4291 
#> T-test p-value........    0.38527            0.33324 
#> KS Bootstrap p-value..      0.046              0.176 
#> KS Naive p-value......    0.16449            0.60988 
#> KS Statistic..........    0.10748           0.057803 
#> 
#> 
#> ***** (V12) I(re75^2) *****
#>                        Before Matching        After Matching
#> mean treatment........   12654753           12654753 
#> mean control..........   11196530            8896263 
#> std mean diff.........     2.6024             6.7076 
#> 
#> mean raw eQQ diff.....    2840830            2887443 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....  101657197          344942969 
#> 
#> mean eCDF diff........   0.050834           0.023047 
#> med  eCDF diff........   0.061954           0.023121 
#> max  eCDF diff........    0.10748           0.057803 
#> 
#> var ratio (Tr/Co).....     1.4609              3.559 
#> T-test p-value........    0.77178            0.37741 
#> KS Bootstrap p-value..      0.046              0.176 
#> KS Naive p-value......    0.16449            0.60988 
#> KS Statistic..........    0.10748           0.057803 
#> 
#> 
#> ***** (V13) u74 *****
#>                        Before Matching        After Matching
#> mean treatment........    0.70811            0.70811 
#> mean control..........       0.75            0.68458 
#> std mean diff.........    -9.1895             5.1608 
#> 
#> mean raw eQQ diff.....   0.037838           0.017341 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          1                  1 
#> 
#> mean eCDF diff........   0.020946          0.0086705 
#> med  eCDF diff........   0.020946          0.0086705 
#> max  eCDF diff........   0.041892           0.017341 
#> 
#> var ratio (Tr/Co).....     1.1041            0.95721 
#> T-test p-value........    0.33033            0.52298 
#> 
#> 
#> ***** (V14) u75 *****
#>                        Before Matching        After Matching
#> mean treatment........        0.6                0.6 
#> mean control..........    0.68462            0.62072 
#> std mean diff.........    -17.225            -4.2182 
#> 
#> mean raw eQQ diff.....   0.081081           0.031792 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          1                  1 
#> 
#> mean eCDF diff........   0.042308           0.015896 
#> med  eCDF diff........   0.042308           0.015896 
#> max  eCDF diff........   0.084615           0.031792 
#> 
#> var ratio (Tr/Co).....     1.1133             1.0194 
#> T-test p-value........   0.068031            0.46507 
#> 
#> 
#> Before Matching Minimum p.value: 0.0020368 
#> Variable Name(s): nodegr  Number(s): 8 
#> 
#> After Matching Minimum p.value: 0.08604 
#> Variable Name(s): I(re74^2)  Number(s): 10

Thank you! That code works for my dataset, and it seems like the matching functioned properly. But how do I conduct logistic regression on the matched cohort? It didn't create a dataset with just the matched values. I want to run logistic regressions on the matched cohort.

Also, why do you square the values in the glm model?

Again, I appreciate all your help and patience with my problem!

Hi, the code was from help(Match), which I should have noted. I didn't provide the model myself. I'm going to need to think about how to extract the matched cohort. Ping me after a decent interval?

Hey- I appreciate all your help! Is there any chance you thought of how we can figure this out?

Here's the documentation example modified to limit the variables of interest to those you selected. I have not renamed them, however, to ease comparison with the documentation example.

The number of observations used is identical between the input data and the rr output of class Match.

The source of the disconnect in your code is (where exposure is the example's treat)

greedymatch <- Match(Tr=prop$exposure, M=1, X=prop[xvars])

The Tr and M arguments are ok (although I think the default for M is already one). However, the X argument is problematic.

Although the documentation for the X argument states

X: A matrix containing the variables we wish to match on. This matrix may contain the actual observed covariates or the propensity score or a combination of both. All columns of this matrix must have positive variance or Match will return an error.

so a matrix of the observed values for the selected variables is allowed, the Details note

If one wants to do propensity score matching, one should estimate the propensity model before calling Match, and then place the fitted values in the X matrix—see the provided example.

The introductory paper contrasts the propensity score with multivariate matching based on Mahalanobis distance (§2.2).

I haven't looked into why the Mahalanobis approach should result in the higher n; I suspect that it is because the index vectors count matched pairs rather than distinct observations.
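
One plausible reading, offered as a sketch rather than a confirmed diagnosis: with mostly binary covariates, many controls sit at exactly the same distance from a treated unit, and because Match() keeps ties and matches with replacement by default, each tie becomes an extra pair. With replace = FALSE (which also breaks ties at random), each control is used at most once, so the extracted data frame cannot be larger than the input. The names m, matched and fit are illustrative, and the binary outcome re78 > 0 is invented here only to show a logistic regression on the matched rows. Whether matching without replacement suits the analysis is a separate judgment call.

```r
library(Matching)
data(lalonde)

xvars <- c("black", "hisp", "married", "nodegr", "age")

# 1:1 matching on the covariates, each control used at most once.
# Ties are broken at random here, so the matched set can vary between runs.
m <- Match(Tr = lalonde$treat, X = as.matrix(lalonde[xvars]),
           M = 1, replace = FALSE)

# One row per matched observation; no index should be repeated.
matched <- lalonde[c(m$index.treated, m$index.control), ]

nrow(lalonde)   # 445
nrow(matched)   # no larger than nrow(lalonde)
anyDuplicated(c(m$index.treated, m$index.control)) == 0

# `matched` is an ordinary data frame, so a model can be fitted on it directly.
# The outcome below (any earnings in 1978) is an assumption for illustration.
fit <- glm(I(re78 > 0) ~ treat, family = binomial, data = matched)
summary(fit)$coefficients
```

If the default matching with replacement and ties is kept instead, the weights element of the Match object would need to be carried into any downstream table or model.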

suppressPackageStartupMessages({
  library(Matching)
  library(tableone)
})

data(lalonde)

xvars <- c("black", "hisp", "married", "nodegr", "age")
table1 <- CreateTableOne(vars = xvars, strata = "treat", data = lalonde, test = FALSE)
print(table1, smd = TRUE)
#>                      Stratified by treat
#>                       0            1            SMD   
#>   n                     260          185              
#>   black (mean (SD))    0.83 (0.38)  0.84 (0.36)  0.044
#>   hisp (mean (SD))     0.11 (0.31)  0.06 (0.24)  0.175
#>   married (mean (SD))  0.15 (0.36)  0.19 (0.39)  0.094
#>   nodegr (mean (SD))   0.83 (0.37)  0.71 (0.46)  0.304
#>   age (mean (SD))     25.05 (7.06) 25.82 (7.16)  0.107

glm1 <- glm(treat ~ age +
  black +
  hisp +
  married +
  nodegr +
  re74,
family = binomial,
data = lalonde
)

X <- glm1$fitted
Y <- lalonde$re78
Tr <- lalonde$treat

rr <- Match(Y = Y, Tr = Tr, X = X, M = 1)

mb <- MatchBalance(treat ~ age +
  black +
  hisp +
  married +
  nodegr +
  re74,
data = lalonde,
match.out = rr,
nboots = 500
)
#> 
#> ***** (V1) age *****
#>                        Before Matching        After Matching
#> mean treatment........     25.816             25.816 
#> mean control..........     25.054             26.294 
#> std mean diff.........     10.655            -6.6818 
#> 
#> mean raw eQQ diff.....    0.94054            0.20325 
#> med  raw eQQ diff.....          1                  0 
#> max  raw eQQ diff.....          7                  8 
#> 
#> mean eCDF diff........   0.025364          0.0054201 
#> med  eCDF diff........   0.022193           0.004065 
#> max  eCDF diff........   0.065177            0.01626 
#> 
#> var ratio (Tr/Co).....     1.0278            0.83815 
#> T-test p-value........    0.26594            0.31817 
#> KS Bootstrap p-value..      0.528               0.99 
#> KS Naive p-value......     0.7481            0.99997 
#> KS Statistic..........   0.065177            0.01626 
#> 
#> 
#> ***** (V2) black *****
#>                        Before Matching        After Matching
#> mean treatment........    0.84324            0.84324 
#> mean control..........    0.82692             0.8955 
#> std mean diff.........     4.4767            -14.333 
#> 
#> mean raw eQQ diff.....   0.016216            0.01626 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          1                  1 
#> 
#> mean eCDF diff........  0.0081601          0.0081301 
#> med  eCDF diff........  0.0081601          0.0081301 
#> max  eCDF diff........    0.01632            0.01626 
#> 
#> var ratio (Tr/Co).....    0.92503             1.4125 
#> T-test p-value........    0.64736           0.039941 
#> 
#> 
#> ***** (V3) hisp *****
#>                        Before Matching        After Matching
#> mean treatment........   0.059459           0.059459 
#> mean control..........    0.10769            0.07027 
#> std mean diff.........    -20.341            -4.5591 
#> 
#> mean raw eQQ diff.....   0.048649           0.001355 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          1                  1 
#> 
#> mean eCDF diff........   0.024116         0.00067751 
#> med  eCDF diff........   0.024116         0.00067751 
#> max  eCDF diff........   0.048233           0.001355 
#> 
#> var ratio (Tr/Co).....    0.58288            0.85599 
#> T-test p-value........   0.064043            0.41443 
#> 
#> 
#> ***** (V4) married *****
#>                        Before Matching        After Matching
#> mean treatment........    0.18919            0.18919 
#> mean control..........    0.15385            0.21982 
#> std mean diff.........     8.9995            -7.7996 
#> 
#> mean raw eQQ diff.....   0.037838          0.0081301 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          1                  1 
#> 
#> mean eCDF diff........   0.017672           0.004065 
#> med  eCDF diff........   0.017672           0.004065 
#> max  eCDF diff........   0.035343          0.0081301 
#> 
#> var ratio (Tr/Co).....     1.1802            0.89445 
#> T-test p-value........    0.33425            0.28698 
#> 
#> 
#> ***** (V5) nodegr *****
#>                        Before Matching        After Matching
#> mean treatment........    0.70811            0.70811 
#> mean control..........    0.83462             0.6973 
#> std mean diff.........    -27.751             2.3715 
#> 
#> mean raw eQQ diff.....    0.12432           0.001355 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          1                  1 
#> 
#> mean eCDF diff........   0.063254         0.00067751 
#> med  eCDF diff........   0.063254         0.00067751 
#> max  eCDF diff........    0.12651           0.001355 
#> 
#> var ratio (Tr/Co).....     1.4998            0.97924 
#> T-test p-value........  0.0020368            0.41443 
#> 
#> 
#> ***** (V6) re74 *****
#>                        Before Matching        After Matching
#> mean treatment........     2095.6             2095.6 
#> mean control..........       2107             1706.9 
#> std mean diff.........   -0.23437             7.9544 
#> 
#> mean raw eQQ diff.....     487.98             128.21 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....       8413              10308 
#> 
#> mean eCDF diff........   0.019223          0.0082117 
#> med  eCDF diff........     0.0158           0.004065 
#> max  eCDF diff........   0.047089            0.03252 
#> 
#> var ratio (Tr/Co).....     0.7381             1.4296 
#> T-test p-value........    0.98186            0.30237 
#> KS Bootstrap p-value..      0.584              0.112 
#> KS Naive p-value......    0.97023               0.83 
#> KS Statistic..........   0.047089            0.03252 
#> 
#> 
#> Before Matching Minimum p.value: 0.0020368 
#> Variable Name(s): nodegr  Number(s): 5 
#> 
#> After Matching Minimum p.value: 0.039941 
#> Variable Name(s): black  Number(s): 2

attach(lalonde)
X <- cbind(age, educ, black, hisp, married, nodegr, re74)

# The covariates we want to obtain balance on
BalanceMat <- cbind(age, educ, black, hisp, married, nodegr, re74)

genout <- GenMatch(
  Tr = treat, X = X, BalanceMatrix = BalanceMat, estimand = "ATE", M = 1,
  pop.size = 16, max.generations = 10, wait.generations = 1
)
#> Loading required namespace: rgenoud
#> 
#> 
#> Sat Mar 13 14:38:27 2021
#> Domains:
#>  0.000000e+00   <=  X1   <=    1.000000e+03 
#>  0.000000e+00   <=  X2   <=    1.000000e+03 
#>  0.000000e+00   <=  X3   <=    1.000000e+03 
#>  0.000000e+00   <=  X4   <=    1.000000e+03 
#>  0.000000e+00   <=  X5   <=    1.000000e+03 
#>  0.000000e+00   <=  X6   <=    1.000000e+03 
#>  0.000000e+00   <=  X7   <=    1.000000e+03 
#> 
#> Data Type: Floating Point
#> Operators (code number, name, population) 
#>  (1) Cloning...........................  1
#>  (2) Uniform Mutation..................  2
#>  (3) Boundary Mutation.................  2
#>  (4) Non-Uniform Mutation..............  2
#>  (5) Polytope Crossover................  2
#>  (6) Simple Crossover..................  2
#>  (7) Whole Non-Uniform Mutation........  2
#>  (8) Heuristic Crossover...............  2
#>  (9) Local-Minimum Crossover...........  0
#> 
#> SOFT Maximum Number of Generations: 10
#> Maximum Nonchanging Generations: 1
#> Population size       : 16
#> Convergence Tolerance: 1.000000e-03
#> 
#> Not Using the BFGS Derivative Based Optimizer on the Best Individual Each Generation.
#> Not Checking Gradients before Stopping.
#> Using Out of Bounds Individuals.
#> 
#> Maximization Problem.
#> GENERATION: 0 (initializing the population)
#> Lexical Fit..... 5.748786e-01  6.548664e-01  6.548664e-01  7.128050e-01  7.832412e-01  8.902903e-01  9.960588e-01  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  
#> #unique......... 16, #Total UniqueCount: 16
#> var 1:
#> best............ 9.289670e+02
#> mean............ 4.917288e+02
#> variance........ 8.269529e+04
#> var 2:
#> best............ 4.550728e+02
#> mean............ 3.907238e+02
#> variance........ 6.904880e+04
#> var 3:
#> best............ 3.806766e+02
#> mean............ 4.359593e+02
#> variance........ 6.719554e+04
#> var 4:
#> best............ 3.542337e+02
#> mean............ 3.134204e+02
#> variance........ 5.761572e+04
#> var 5:
#> best............ 7.952554e+02
#> mean............ 4.917387e+02
#> variance........ 7.642902e+04
#> var 6:
#> best............ 8.401417e+02
#> mean............ 5.899149e+02
#> variance........ 6.775066e+04
#> var 7:
#> best............ 3.673146e+02
#> mean............ 4.715795e+02
#> variance........ 7.536119e+04
#> 
#> GENERATION: 1
#> Lexical Fit..... 6.548664e-01  6.548664e-01  7.257700e-01  7.594559e-01  8.301410e-01  8.416173e-01  8.676569e-01  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  
#> #unique......... 13, #Total UniqueCount: 29
#> var 1:
#> best............ 6.588346e+02
#> mean............ 8.448473e+02
#> variance........ 3.821754e+04
#> var 2:
#> best............ 4.354876e+02
#> mean............ 4.121641e+02
#> variance........ 2.528664e+04
#> var 3:
#> best............ 3.350025e+02
#> mean............ 3.745095e+02
#> variance........ 1.169388e+04
#> var 4:
#> best............ 3.460951e+02
#> mean............ 4.016470e+02
#> variance........ 5.002570e+04
#> var 5:
#> best............ 7.955957e+02
#> mean............ 7.247522e+02
#> variance........ 1.103880e+04
#> var 6:
#> best............ 5.189830e+02
#> mean............ 7.363346e+02
#> variance........ 1.765436e+04
#> var 7:
#> best............ 6.338158e+02
#> mean............ 4.119778e+02
#> variance........ 2.159561e+04
#> 
#> GENERATION: 2
#> Lexical Fit..... 6.548664e-01  6.548664e-01  7.274113e-01  7.858046e-01  8.995694e-01  9.451454e-01  9.519317e-01  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  
#> #unique......... 10, #Total UniqueCount: 39
#> var 1:
#> best............ 7.862787e+02
#> mean............ 7.270021e+02
#> variance........ 2.150428e+04
#> var 2:
#> best............ 4.447276e+02
#> mean............ 4.334261e+02
#> variance........ 1.755559e+03
#> var 3:
#> best............ 3.561457e+02
#> mean............ 3.712857e+02
#> variance........ 9.433007e+03
#> var 4:
#> best............ 3.499347e+02
#> mean............ 3.857918e+02
#> variance........ 1.988041e+04
#> var 5:
#> best............ 7.905422e+02
#> mean............ 8.058979e+02
#> variance........ 1.595209e+03
#> var 6:
#> best............ 6.443106e+02
#> mean............ 6.008631e+02
#> variance........ 3.191360e+04
#> var 7:
#> best............ 5.233673e+02
#> mean............ 5.708700e+02
#> variance........ 2.052293e+04
#> 
#> GENERATION: 3
#> Lexical Fit..... 6.548664e-01  6.548664e-01  7.592078e-01  7.740863e-01  8.653295e-01  8.653295e-01  8.937531e-01  9.999994e-01  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  
#> #unique......... 11, #Total UniqueCount: 50
#> var 1:
#> best............ 6.588346e+02
#> mean............ 6.964218e+02
#> variance........ 2.835536e+03
#> var 2:
#> best............ 3.835998e+02
#> mean............ 4.293301e+02
#> variance........ 3.091663e+03
#> var 3:
#> best............ 3.781536e+02
#> mean............ 3.449748e+02
#> variance........ 1.993487e+02
#> var 4:
#> best............ 2.449522e+02
#> mean............ 3.285998e+02
#> variance........ 4.734795e+03
#> var 5:
#> best............ 7.968081e+02
#> mean............ 7.946177e+02
#> variance........ 8.265155e+00
#> var 6:
#> best............ 7.139499e+02
#> mean............ 5.668399e+02
#> variance........ 4.171253e+03
#> var 7:
#> best............ 5.155878e+02
#> mean............ 5.882687e+02
#> variance........ 2.112587e+03
#> 
#> GENERATION: 4
#> Lexical Fit..... 6.548664e-01  6.548664e-01  7.653059e-01  7.677288e-01  8.294727e-01  8.429368e-01  8.670786e-01  9.999995e-01  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  
#> #unique......... 12, #Total UniqueCount: 62
#> var 1:
#> best............ 6.588346e+02
#> mean............ 6.781277e+02
#> variance........ 2.084470e+03
#> var 2:
#> best............ 3.541159e+02
#> mean............ 3.925755e+02
#> variance........ 9.590821e+02
#> var 3:
#> best............ 3.350025e+02
#> mean............ 3.861895e+02
#> variance........ 6.826576e+03
#> var 4:
#> best............ 3.460951e+02
#> mean............ 3.188420e+02
#> variance........ 1.074715e+04
#> var 5:
#> best............ 7.974970e+02
#> mean............ 7.762558e+02
#> variance........ 4.161714e+03
#> var 6:
#> best............ 6.944532e+02
#> mean............ 6.654518e+02
#> variance........ 8.823948e+03
#> var 7:
#> best............ 5.255704e+02
#> mean............ 5.686478e+02
#> variance........ 1.098735e+04
#> 
#> GENERATION: 5
#> Lexical Fit..... 6.548664e-01  6.548664e-01  7.771018e-01  7.911884e-01  8.013732e-01  8.294727e-01  8.670786e-01  9.999995e-01  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  
#> #unique......... 12, #Total UniqueCount: 74
#> var 1:
#> best............ 6.588346e+02
#> mean............ 6.510582e+02
#> variance........ 7.676196e+02
#> var 2:
#> best............ 3.736795e+02
#> mean............ 3.541162e+02
#> variance........ 5.294927e+03
#> var 3:
#> best............ 3.636348e+02
#> mean............ 3.662756e+02
#> variance........ 3.255691e+03
#> var 4:
#> best............ 2.789833e+02
#> mean............ 2.865397e+02
#> variance........ 3.221589e+03
#> var 5:
#> best............ 7.970399e+02
#> mean............ 7.836465e+02
#> variance........ 1.311768e+03
#> var 6:
#> best............ 6.483502e+02
#> mean............ 6.982166e+02
#> variance........ 5.145674e+03
#> var 7:
#> best............ 5.279309e+02
#> mean............ 5.242414e+02
#> variance........ 1.270683e+03
#> 
#> GENERATION: 6
#> Lexical Fit..... 6.548664e-01  6.548664e-01  7.883088e-01  7.928805e-01  8.294727e-01  8.670786e-01  9.477954e-01  9.999915e-01  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  
#> #unique......... 12, #Total UniqueCount: 86
#> var 1:
#> best............ 6.588346e+02
#> mean............ 6.524879e+02
#> variance........ 5.847407e+02
#> var 2:
#> best............ 3.257457e+02
#> mean............ 3.656414e+02
#> variance........ 1.988795e+03
#> var 3:
#> best............ 2.934813e+02
#> mean............ 3.374197e+02
#> variance........ 3.476227e+02
#> var 4:
#> best............ 4.434174e+02
#> mean............ 3.591496e+02
#> variance........ 1.109475e+04
#> var 5:
#> best............ 7.981599e+02
#> mean............ 7.625181e+02
#> variance........ 1.832465e+04
#> var 6:
#> best............ 6.756930e+02
#> mean............ 6.511352e+02
#> variance........ 1.626546e+04
#> var 7:
#> best............ 5.351758e+02
#> mean............ 5.274139e+02
#> variance........ 8.350165e+00
#> 
#> GENERATION: 7
#> Lexical Fit..... 6.548664e-01  6.548664e-01  8.404743e-01  8.670786e-01  8.670786e-01  8.885959e-01  9.480178e-01  9.999296e-01  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  
#> #unique......... 12, #Total UniqueCount: 98
#> var 1:
#> best............ 6.218172e+02
#> mean............ 6.248966e+02
#> variance........ 1.493401e+04
#> var 2:
#> best............ 3.257457e+02
#> mean............ 3.474654e+02
#> variance........ 5.080746e+02
#> var 3:
#> best............ 3.137496e+02
#> mean............ 3.265126e+02
#> variance........ 9.744804e+02
#> var 4:
#> best............ 4.434174e+02
#> mean............ 3.623001e+02
#> variance........ 7.531509e+03
#> var 5:
#> best............ 7.981599e+02
#> mean............ 7.976318e+02
#> variance........ 2.457292e-01
#> var 6:
#> best............ 6.756930e+02
#> mean............ 6.572347e+02
#> variance........ 1.262113e+03
#> var 7:
#> best............ 6.299600e+02
#> mean............ 5.375486e+02
#> variance........ 5.814486e+02
#> 
#> GENERATION: 8
#> Lexical Fit..... 6.548664e-01  6.548664e-01  8.404743e-01  8.670786e-01  8.670786e-01  8.885959e-01  9.480178e-01  9.999296e-01  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  
#> #unique......... 13, #Total UniqueCount: 111
#> var 1:
#> best............ 6.218172e+02
#> mean............ 6.442309e+02
#> variance........ 2.686632e+02
#> var 2:
#> best............ 3.257457e+02
#> mean............ 3.184059e+02
#> variance........ 5.240427e+03
#> var 3:
#> best............ 3.137496e+02
#> mean............ 3.296255e+02
#> variance........ 1.217940e+04
#> var 4:
#> best............ 4.434174e+02
#> mean............ 4.210740e+02
#> variance........ 7.485304e+03
#> var 5:
#> best............ 7.981599e+02
#> mean............ 7.981592e+02
#> variance........ 5.841333e-06
#> var 6:
#> best............ 6.756930e+02
#> mean............ 6.756915e+02
#> variance........ 2.138181e-05
#> var 7:
#> best............ 6.299600e+02
#> mean............ 5.894523e+02
#> variance........ 6.354767e+03
#> 
#> GENERATION: 9
#> Lexical Fit..... 6.548664e-01  6.548664e-01  8.404743e-01  8.670786e-01  8.670786e-01  8.885959e-01  9.480178e-01  9.999296e-01  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  
#> #unique......... 13, #Total UniqueCount: 124
#> var 1:
#> best............ 6.218172e+02
#> mean............ 6.218193e+02
#> variance........ 3.941009e-05
#> var 2:
#> best............ 3.257457e+02
#> mean............ 3.174401e+02
#> variance........ 2.319724e+03
#> var 3:
#> best............ 3.137496e+02
#> mean............ 3.324364e+02
#> variance........ 5.238636e+03
#> var 4:
#> best............ 4.434174e+02
#> mean............ 4.280902e+02
#> variance........ 3.523276e+03
#> var 5:
#> best............ 7.981599e+02
#> mean............ 7.835862e+02
#> variance........ 3.185448e+03
#> var 6:
#> best............ 6.756930e+02
#> mean............ 6.847747e+02
#> variance........ 1.237822e+03
#> var 7:
#> best............ 6.299600e+02
#> mean............ 6.291727e+02
#> variance........ 9.330775e+00
#> 
#> 'wait.generations' limit reached.
#> No significant improvement in 1 generations.
#> 
#> Solution Lexical Fitness Value:
#> 6.548664e-01  6.548664e-01  8.404743e-01  8.670786e-01  8.670786e-01  8.885959e-01  9.480178e-01  9.999296e-01  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  1.000000e+00  
#> 
#> Parameters at the Solution:
#> 
#>  X[ 1] : 6.218172e+02
#>  X[ 2] : 3.257457e+02
#>  X[ 3] : 3.137496e+02
#>  X[ 4] : 4.434174e+02
#>  X[ 5] : 7.981599e+02
#>  X[ 6] : 6.756930e+02
#>  X[ 7] : 6.299600e+02
#> 
#> Solution Found Generation 7
#> Number of Generations Run 9
#> 
#> Sat Mar 13 14:38:27 2021
#> Total run time : 0 hours 0 minutes and 0 seconds

Y <- re78 / 1000

mout <- Match(Y = Y, Tr = treat, X = X, estimand = "ATE", Weight.matrix = genout)
summary(mout)
#> 
#> Estimate...  1.7376 
#> AI SE......  0.75892 
#> T-stat.....  2.2896 
#> p.val......  0.022046 
#> 
#> Original number of observations..............  445 
#> Original number of treated obs...............  185 
#> Matched number of observations...............  445 
#> Matched number of observations  (unweighted).  677
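
A note on the last two lines of that summary, since they bear on the original question: the unweighted matched count (677) is larger than the number of observations (445). As far as I can tell this is expected, because Match() defaults to replace = TRUE and ties = TRUE, so one row can be reused in several pairs, and tied matches each get their own row. Stacking index.treated and index.control then gives more rows than there are distinct observations. Below is a toy sketch in base R; the data frame and the index vectors are made up for illustration and are not taken from the lalonde run above.

```r
# Hypothetical toy data: 2 treated rows, 3 control rows
df <- data.frame(id = 1:5, treat = c(1, 1, 0, 0, 0))

# Made-up indices standing in for index.treated / index.control.
# Treated row 1 is tied between controls 3 and 4, so it appears twice;
# control row 3 is reused for treated row 2 (matching with replacement).
index_treated <- c(1, 1, 2)
index_control <- c(3, 4, 3)

# Same stacking pattern as in the original post
matched <- df[c(index_treated, index_control), ]

n_rows     <- nrow(matched)               # rows in the stacked data
n_distinct <- length(unique(matched$id))  # distinct original observations
print(c(rows = n_rows, distinct = n_distinct))
```

So a larger row count after stacking is not by itself a sign of a bug. The object returned by Match() also carries a weights vector that accounts for ties, and setting ties = FALSE or replace = FALSE changes how pairs are formed (and may change the estimate), so which to use depends on the analysis.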

mb <- MatchBalance(treat ~ age + educ + black + hisp + married + nodegr + re74,
  match.out = mout, nboots = 500
)
#> 
#> ***** (V1) age *****
#>                        Before Matching        After Matching
#> mean treatment........     25.816             25.087 
#> mean control..........     25.054             25.117 
#> std mean diff.........     10.655           -0.44592 
#> 
#> mean raw eQQ diff.....    0.94054            0.25702 
#> med  raw eQQ diff.....          1                  0 
#> max  raw eQQ diff.....          7                  8 
#> 
#> mean eCDF diff........   0.025364          0.0069076 
#> med  eCDF diff........   0.022193          0.0051699 
#> max  eCDF diff........   0.065177           0.032496 
#> 
#> var ratio (Tr/Co).....     1.0278            0.99786 
#> T-test p-value........    0.26594            0.84047 
#> KS Bootstrap p-value..      0.548              0.628 
#> KS Naive p-value......     0.7481            0.86708 
#> KS Statistic..........   0.065177           0.032496 
#> 
#> 
#> ***** (V2) educ *****
#>                        Before Matching        After Matching
#> mean treatment........     10.346             10.209 
#> mean control..........     10.088             10.207 
#> std mean diff.........     12.806            0.13206 
#> 
#> mean raw eQQ diff.....    0.40541           0.072378 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          2                  2 
#> 
#> mean eCDF diff........   0.028698          0.0051699 
#> med  eCDF diff........   0.012682          0.0029542 
#> max  eCDF diff........    0.12651           0.017725 
#> 
#> var ratio (Tr/Co).....     1.5513              1.026 
#> T-test p-value........    0.15017            0.94802 
#> KS Bootstrap p-value..       0.01              0.914 
#> KS Naive p-value......   0.062873            0.99993 
#> KS Statistic..........    0.12651           0.017725 
#> 
#> 
#> ***** (V3) black *****
#>                        Before Matching        After Matching
#> mean treatment........    0.84324             0.8382 
#> mean control..........    0.82692            0.84045 
#> std mean diff.........     4.4767           -0.60952 
#> 
#> mean raw eQQ diff.....   0.016216          0.0014771 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          1                  1 
#> 
#> mean eCDF diff........  0.0081601         0.00073855 
#> med  eCDF diff........  0.0081601         0.00073855 
#> max  eCDF diff........    0.01632          0.0014771 
#> 
#> var ratio (Tr/Co).....    0.92503             1.0114 
#> T-test p-value........    0.64736            0.65487 
#> 
#> 
#> ***** (V4) hisp *****
#>                        Before Matching        After Matching
#> mean treatment........   0.059459            0.08764 
#> mean control..........    0.10769            0.08764 
#> std mean diff.........    -20.341                  0 
#> 
#> mean raw eQQ diff.....   0.048649                  0 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          1                  0 
#> 
#> mean eCDF diff........   0.024116                  0 
#> med  eCDF diff........   0.024116                  0 
#> max  eCDF diff........   0.048233                  0 
#> 
#> var ratio (Tr/Co).....    0.58288                  1 
#> T-test p-value........   0.064043                  1 
#> 
#> 
#> ***** (V5) married *****
#>                        Before Matching        After Matching
#> mean treatment........    0.18919            0.16854 
#> mean control..........    0.15385            0.16854 
#> std mean diff.........     8.9995                  0 
#> 
#> mean raw eQQ diff.....   0.037838                  0 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          1                  0 
#> 
#> mean eCDF diff........   0.017672                  0 
#> med  eCDF diff........   0.017672                  0 
#> max  eCDF diff........   0.035343                  0 
#> 
#> var ratio (Tr/Co).....     1.1802                  1 
#> T-test p-value........    0.33425                  1 
#> 
#> 
#> ***** (V6) nodegr *****
#>                        Before Matching        After Matching
#> mean treatment........    0.70811            0.78202 
#> mean control..........    0.83462            0.78202 
#> std mean diff.........    -27.751                  0 
#> 
#> mean raw eQQ diff.....    0.12432                  0 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....          1                  0 
#> 
#> mean eCDF diff........   0.063254                  0 
#> med  eCDF diff........   0.063254                  0 
#> max  eCDF diff........    0.12651                  0 
#> 
#> var ratio (Tr/Co).....     1.4998                  1 
#> T-test p-value........  0.0020368                  1 
#> 
#> 
#> ***** (V7) re74 *****
#>                        Before Matching        After Matching
#> mean treatment........     2095.6             1918.8 
#> mean control..........       2107             1931.2 
#> std mean diff.........   -0.23437            -0.2616 
#> 
#> mean raw eQQ diff.....     487.98              195.5 
#> med  raw eQQ diff.....          0                  0 
#> max  raw eQQ diff.....       8413             7870.3 
#> 
#> mean eCDF diff........   0.019223           0.008013 
#> med  eCDF diff........     0.0158          0.0059084 
#> max  eCDF diff........   0.047089           0.032496 
#> 
#> var ratio (Tr/Co).....     0.7381            0.85018 
#> T-test p-value........    0.98186             0.8886 
#> KS Bootstrap p-value..       0.59               0.25 
#> KS Naive p-value......    0.97023            0.86708 
#> KS Statistic..........   0.047089           0.032496 
#> 
#> 
#> Before Matching Minimum p.value: 0.0020368 
#> Variable Name(s): nodegr  Number(s): 6 
#> 
#> After Matching Minimum p.value: 0.25 
#> Variable Name(s): re74  Number(s): 7
