The Synthetic Control Method was first proposed by Abadie and Gardeazabal (2003) to evaluate the impact of a specific event or policy, implemented in one geographical area (city, region, country, and so on), on an outcome of interest. It works by generating a synthetic unit that is as similar as possible to the so-called “treated” unit, except for the intervention or event whose causal effect one wants to estimate. This synthetic unit is constructed as a weighted average of similar units that have not received the treatment.
Formally, \(W = (w_2,\dots,w_{J+1})'\) is a \((J \times 1)\) vector of weights, where \(J\) is the number of units in the donor pool. Assigning different values to this vector leads to different weighted averages of the control regions. The value of the outcome variable \(Y\) at post-treatment time \(t \ge t_0\) for a synthetic control indexed by \(W\) is \(\sum_{j=2}^{J + 1}w_jY_{jt}\). To select the optimal vector of weights \(W^*\), we consider a \((K \times 1)\) vector \(X_1\) of pre-intervention covariates for the treated region and a \((K \times J)\) matrix \(X_0\) containing the same pre-intervention covariates for the control units; both include the pre-treatment trend of the dependent variable. \(W^*\) is chosen to minimise the distance \(\| X_1 - X_0W\|\), solving the minimisation problem
\[\begin{equation} \min_W~(X_1 - X_0W)'V(X_1 - X_0W) \quad \text{s.t.} \quad 0 \le w_j \le 1 ~\text{and}~ \sum_{j = 2}^{J + 1}w_j = 1 \end{equation}\]The synthetic control method has become popular in economics, public policy evaluation, and political science, owing to its intuitiveness and its suitability for rigorous causal inference in single case studies. The optimisation described above and in Abadie and Gardeazabal (2003) is implemented in R in the Synth package (Abadie, Diamond, and Hainmueller 2011).
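The weight-selection problem above can be illustrated with a minimal, language-agnostic sketch. This is not the Synth implementation (which uses a nested optimisation over \(V\) as well); it simply solves the constrained quadratic programme for a fixed \(V\), with the function name and toy data chosen for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def synth_weights(X1, X0, V=None):
    """Find donor weights w minimising (X1 - X0 w)' V (X1 - X0 w)
    subject to 0 <= w_j <= 1 and sum(w) = 1."""
    K, J = X0.shape
    V = np.eye(K) if V is None else V

    def loss(w):
        d = X1 - X0 @ w
        return d @ V @ d

    cons = {"type": "eq", "fun": lambda w: w.sum() - 1.0}
    bounds = [(0.0, 1.0)] * J
    w0 = np.full(J, 1.0 / J)  # start from equal weights
    return minimize(loss, w0, bounds=bounds, constraints=cons).x

# Toy example: the treated unit's covariates are exactly the
# midpoint of donors 0 and 1, so the optimal weights are (0.5, 0.5, 0).
X0 = np.array([[1.0, 3.0, 10.0],
               [2.0, 4.0, 20.0]])  # K = 2 covariates, J = 3 donors
X1 = np.array([2.0, 3.0])
w = synth_weights(X1, X0)
```

The simplex constraints (non-negative weights summing to one) are what keep the synthetic control an interpolation of donor units rather than an arbitrary extrapolation.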
However, Synth has not kept up with recent developments in synthetic controls. It has no automated feature to implement the placebo tests suggested by Abadie, Diamond, and Hainmueller (2010, 2015), which are otherwise available only in Stata (Galiani and Quistorff 2017), nor more recent extensions such as synthetic controls for multiple treated units (Kreif et al. 2015). SCtools fills these gaps by providing these extensions to Synth.
Abadie, Diamond, and Hainmueller (2010) suggest using “placebos” to test the significance of the effects identified with synthetic controls. The approach consists of creating a synthetic control for each unit in the donor pool. Given that none of these units was treated, the estimated “effects” for these units, relative to their synthetic controls, represent the difference between a unit and its synthetic control that one could expect by chance under the null hypothesis of no treatment effect. Therefore, if the observed effect for the treated unit is larger than those for the placebos, this is considered evidence of an actual treatment effect, and the pseudo p-value is the proportion of placebos having an effect as large as or larger than that observed for the treated unit.
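The pseudo p-value described above amounts to a simple counting exercise, sketched below with hypothetical placebo gaps (the function name and numbers are illustrative, not part of the SCtools API):

```python
import numpy as np

def placebo_p_value(treated_gap, placebo_gaps):
    """Pseudo p-value: the share of placebo gaps at least as large,
    in absolute value, as the gap observed for the treated unit."""
    gaps = np.abs(np.asarray(placebo_gaps, dtype=float))
    return float(np.mean(gaps >= abs(treated_gap)))

# Hypothetical post-treatment gaps for nine donor-pool placebos:
placebos = [0.4, -1.2, 0.8, 2.9, -0.5, 1.1, -2.0, 0.3, -0.7]

# No placebo gap reaches |3.0|, so the pseudo p-value is 0/9 = 0.
p = placebo_p_value(treated_gap=-3.0, placebo_gaps=placebos)
```

With \(J\) donors, the smallest attainable pseudo p-value is \(1/J\) (or 0 under the strict convention above), which is why this test is most informative with a reasonably large donor pool.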
SCtools automates the generation of placebos from the donor pool, based on objects created when running a synthetic control analysis with Synth. It returns the estimated synthetic control for each donor unit, and a gaps plot with the respective curves comparing each donor-pool unit to its synthetic control, along with the treated unit. Figure 1 shows this plot for the case of the Basque Country, from Abadie and Gardeazabal (2003), made with the plot_placebos function in SCtools after running the Basque Country example from Synth.
Figure 1: Example Placebos Plot Using Basque Country Data from Abadie and Gardeazabal (2003)
Placebos can then be used for another test proposed by Abadie, Diamond, and Hainmueller (2015): the post/pre Mean Squared Prediction Error (MSPE) test. The MSPE measures the discrepancy between the observed outcome of a unit and its synthetic control, and the test compares its values before and after treatment. A higher ratio means a small pre-treatment prediction error (a good synthetic control) combined with a high post-treatment MSPE, that is, a large difference between the unit and its synthetic control after the intervention. By calculating this ratio for all placebos, the test can be interpreted as asking how likely it is that the result obtained for a single treated case could have occurred by chance given no treatment. SCtools also provides a post/pre MSPE plot and its associated pseudo p-value.
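The post/pre MSPE ratio for a single unit can be sketched as follows; the function name and toy series are illustrative, not part of the SCtools API:

```python
import numpy as np

def mspe_ratio(y, y_synth, t0):
    """Post/pre mean squared prediction error ratio for one unit.
    y, y_synth: observed and synthetic outcome series;
    t0: index of the first post-treatment period."""
    gaps = np.asarray(y, float) - np.asarray(y_synth, float)
    pre_mspe = np.mean(gaps[:t0] ** 2)
    post_mspe = np.mean(gaps[t0:] ** 2)
    return post_mspe / pre_mspe

# Toy series: near-perfect pre-treatment fit (periods 0-3),
# then a large jump after treatment (periods 4-5).
y       = np.array([1.0, 1.1, 0.9, 1.0, 3.0, 3.2])
y_synth = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
r = mspe_ratio(y, y_synth, t0=4)  # large ratio: strong apparent effect
```

Ranking this ratio for the treated unit against the same ratio computed for every placebo yields the test's pseudo p-value; a well-fitting synthetic control with a large post-treatment gap lands at the top of the ranking.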
An important advance in synthetic controls has been the estimation of causal effects for interventions with multiple treated units (Kreif et al. 2015; Cavallo et al. 2013). For these, one synthetic control is estimated for each treated unit. Then, the average distance between each treated unit and its synthetic control, before and after the intervention, is taken to indicate the goodness-of-fit (before treatment) and the estimated average treatment effect (after treatment). SCtools implements a modified version of the familiar dataprep() function from Synth to accommodate multiple treated units, and returns a plot with the estimated average path of the treated units and their synthetic controls.
Inference for causal effects with multiple treated units is also done using placebos. In this case, once again, one synthetic control is created for each unit in the donor pool. SCtools implements a bootstrap approach to calculate a p-value for the average treatment effect estimate. It works by sampling k placebos (where k is the number of treated units) n times, calculating the estimated average placebo effect each time, and thereby obtaining a distribution of average placebo effects. Comparing the observed average treatment effect to this distribution of average placebo effects gives an estimated p-value for the observed ATT.
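The bootstrap procedure just described can be sketched in a few lines; this is an illustration of the resampling logic under assumed inputs (function name, placebo gaps, and draw count are all hypothetical), not the SCtools implementation:

```python
import numpy as np

def bootstrap_att_p(att_observed, placebo_gaps, k, n=5000, seed=1):
    """Sample k placebo gaps n times (with replacement), average each
    draw, and return the share of average placebo effects at least as
    extreme (in absolute value) as the observed ATT."""
    rng = np.random.default_rng(seed)
    gaps = np.asarray(placebo_gaps, dtype=float)
    draws = rng.choice(gaps, size=(n, k), replace=True)
    avg_placebo_effects = draws.mean(axis=1)  # one average per draw
    return float(np.mean(np.abs(avg_placebo_effects) >= abs(att_observed)))

# Hypothetical placebo gaps; the largest is 1.1 in absolute value,
# so no average of k = 3 draws can reach the observed ATT of 2.5.
placebos = [0.3, -0.8, 1.1, -0.2, 0.6, -1.0, 0.4, -0.5]
p = bootstrap_att_p(att_observed=2.5, placebo_gaps=placebos, k=3)
```

Averaging k placebo gaps per draw mirrors the averaging over the k treated units, so the reference distribution matches the estimator actually used for the ATT.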
Other recent developments in synthetic controls include the Generalized Synthetic Control (Xu 2017) and the Augmented Synthetic Control (Ben-Michael, Feller, and Rothstein 2018), both of which also rely on placebo tests for inference and could be integrated with SCtools.
SCtools is licensed under the GNU General Public License (v3.0), with all source code hosted on GitHub (https://github.com/bcastanho/SCtools), along with a corresponding issue tracker for bug reports and feature requests.
Abadie, Alberto, and Javier Gardeazabal. 2003. “The Economic Costs of Conflict: A Case Study of the Basque Country.” American Economic Review 93 (1): 113–32. doi:10.1257/000282803321455188.
Abadie, Alberto, Alexis Diamond, and Jens Hainmueller. 2010. “Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Program.” Journal of the American Statistical Association 105 (490): 493–505. doi:10.1198/jasa.2009.ap08746.
———. 2011. “Synth: An R Package for Synthetic Control Methods in Comparative Case Studies.” Journal of Statistical Software 42 (13): 1–17. http://www.jstatsoft.org/v42/i13/.
———. 2015. “Comparative Politics and the Synthetic Control Method.” American Journal of Political Science 59 (2): 495–510. doi:10.1111/ajps.12116.
Ben-Michael, Eli, Avi Feller, and Jesse Rothstein. 2018. “The Augmented Synthetic Control Method.” arXiv:1811.04170 [Econ, Stat], November. http://arxiv.org/abs/1811.04170.
Cavallo, Eduardo, Sebastian Galiani, Ilan Noy, and Juan Pantano. 2013. “Catastrophic Natural Disasters and Economic Growth.” Review of Economics and Statistics 95 (5): 1549–61.
Galiani, Sebastian, and Brian Quistorff. 2017. “The Synth_runner Package: Utilities to Automate Synthetic Control Estimation Using Synth.” The Stata Journal 17 (4). SAGE Publications Sage CA: Los Angeles, CA: 834–49.
Kreif, Noémi, Richard Grieve, Dominik Hangartner, Alex James Turner, Silviya Nikolova, and Matt Sutton. 2015. “How sensitive is physician performance to alternative compensation schedules? Evidence from a large network of primary care clinics.” Health Economics, 1–16. doi:10.1002/hec.3258.
Xu, Yiqing. 2017. “Generalized Synthetic Control Method: Causal Inference with Interactive Fixed Effects Models.” Political Analysis 25 (01): 57–76. doi:10.1017/pan.2016.2.