From: The closer the sportier? Children’s sports activity and their distance to sports facilities
Step A-1 | Choose one observation in the subsample defined by d = 1 and delete it from that pool |
Step B-1 | Find an observation in the subsample defined by d = 0 that is as close as possible to the one chosen in step A-1 in terms of \( p(x) \) and \( \tilde{x} \). ‘Closeness’ is based on the Mahalanobis distance |
Step C-1 | Repeat A-1 and B-1 until no observation with d = 1 is left |
Step D-1 | Compute the maximum distance (dist) obtained over all comparisons between a member of the reference distribution and its matched comparison observation |
Step A-2 | Repeat A-1 |
Step B-2 | Repeat B-1. If possible, find further observations in the subsample defined by d = 0 whose distance to the one chosen in step A-2 is no larger than R × dist. Do not remove these observations, so that they can be used again. Compute weights for all chosen comparison observations that are proportional to their distance. Normalize the weights such that they add to one |
Step C-2 | Repeat A-2 and B-2 until no observation with d = 1 is left |
Step D-2 | For every potential comparison observation, sum the weights obtained over the iterations of A-2 and B-2 |
Step E | Using the weights \( w(x_i) \) obtained in D-2, run a weighted linear regression of the outcome variable on the variables used to define the distance (and an intercept) |
Step F-1 | Predict the potential outcome of every observation using the coefficients of this regression: \( \hat{y}^0(x_i) \) |
Step F-2 | Estimate the bias of the matching estimator for \( E(Y^0 \mid D = 1) \) as: \( \sum_{i=1}^{N}\left(\frac{d_i\,\hat{y}^0(x_i)}{N_1}-\frac{(1-d_i)\,w_i\,\hat{y}^0(x_i)}{N_0}\right) \) |
Step G | Using the weights obtained by weighted matching in D-2, compute a weighted mean of the outcome variable in d = 0. Subtract the bias from this weighted mean to obtain the estimate of \( E(Y^0 \mid D = 1) \) |
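The matching protocol above (radius matching around a data-driven caliper, with a regression-based bias adjustment) can be sketched as follows. This is an illustrative NumPy implementation, not the authors' code: the function name, the pooled-covariance Mahalanobis metric, and the fallback for zero-distance matches are assumptions; the matching variables `X` stand in for \( p(x) \) and \( \tilde{x} \).

```python
import numpy as np

def radius_match_bias_adjusted(y, d, X, R=1.5):
    """Illustrative sketch of the bias-adjusted radius matching steps above.

    y : outcome vector; d : 0/1 treatment indicator;
    X : matching variables (stand-ins for p(x) and x-tilde).
    """
    y = np.asarray(y, float)
    d = np.asarray(d, int)
    X = np.asarray(X, float).reshape(len(y), -1)
    t_idx = np.flatnonzero(d == 1)               # reference distribution (d = 1)
    c_idx = np.flatnonzero(d == 0)               # comparison pool (d = 0)

    # Mahalanobis metric from the pooled covariance of the matching variables
    VI = np.linalg.pinv(np.atleast_2d(np.cov(X, rowvar=False)))
    def mdist(a, B):
        diff = B - a
        return np.sqrt(np.einsum('ij,jk,ik->i', diff, VI, diff))

    # Steps A-1 to D-1: one-to-one matching WITHOUT replacement;
    # record the maximum matched distance (dist)
    pool = list(c_idx)
    dist_max = 0.0
    for i in t_idx:
        dd = mdist(X[i], X[pool])
        j = int(np.argmin(dd))
        dist_max = max(dist_max, float(dd[j]))
        pool.pop(j)                              # delete the matched control

    # Steps A-2 to D-2: radius matching WITH replacement within R * dist;
    # per-treated weights proportional to distance, normalised to sum to
    # one, then accumulated over treated units
    w = np.zeros(len(y))
    for i in t_idx:
        dd = mdist(X[i], X[c_idx])
        inside = dd <= R * dist_max
        wi = np.where(inside, dd, 0.0)
        if wi.sum() == 0.0:                      # all admissible matches at distance 0
            wi = inside.astype(float)
        w[c_idx] += wi / wi.sum()

    # Step E: weighted regression of y on X (and an intercept) within d = 0
    Z = np.column_stack([np.ones(len(y)), X])
    sw = np.sqrt(w[c_idx])
    beta, *_ = np.linalg.lstsq(Z[c_idx] * sw[:, None], y[c_idx] * sw, rcond=None)

    # Steps F-1 and F-2: predicted potential outcomes and the matching bias
    yhat0 = Z @ beta
    N1, N0 = len(t_idx), len(c_idx)
    bias = float(np.sum(d * yhat0 / N1 - (1 - d) * w * yhat0 / N0))

    # Step G: bias-adjusted weighted mean of the d = 0 outcomes
    return float(np.sum(w * y) / w.sum() - bias)
```

Because `dist` is the largest distance produced by matching without replacement, every treated observation has at least one comparison within R × dist whenever R ≥ 1, so the radius step never leaves a treated unit unmatched.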