# Conclusion
In this guide, we have described the basic conceptual and practical steps for running a matching or weighting analysis. Despite their popularity, matching and weighting remain poorly understood methods that require substantial conceptual work prior to the analysis in order for their conclusions to be valid and interpretable. The conceptual issues primarily concern the estimand (the causal quantity being estimated) and the assumptions required for causal inference.
Applying matching and weighting methods in practice requires care. Using the default methods in software packages or the most commonly used methods in a given field will not guarantee accurate results. Matching and weighting require a process of specification, assessment, and respecification before one can move forward with effect estimation. Assessment involves ensuring that the covariate distributions are balanced, that the remaining sample is representative of the target population, and that the remaining sample is large enough to estimate an effect with precision. The most commonly applied methods make sacrifices in the name of simplicity, but those sacrifices often do not pay off. Newer methods that are just as easy to use, though less studied and less familiar to researchers, reviewers, and audiences, often perform much better and should be prioritized over familiar methods when the goal is valid inference.
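As a minimal sketch of this specify-assess-respecify loop, the R code below assumes the MatchIt and cobalt packages and a hypothetical dataset `d` with a binary treatment `treat` and covariates `x1`-`x3` (all placeholders, not objects from this guide). It fits an initial specification, checks balance and the remaining sample size, and tries an alternative when the first attempt falls short.

```r
# Sketch: specify, assess, respecify
# (hypothetical data `d` with treatment `treat` and covariates x1, x2, x3)
library(MatchIt)
library(cobalt)

# Initial specification: 1:1 nearest-neighbor propensity score matching
m1 <- matchit(treat ~ x1 + x2 + x3, data = d,
              method = "nearest", distance = "glm", estimand = "ATT")

# Assess covariate balance before and after matching
bal.tab(m1, un = TRUE, stats = c("mean.diffs", "ks.statistics"))

# Assess how many units remain in the matched sample
summary(m1)$nn

# If balance or the remaining sample size is unsatisfactory, respecify
# (e.g., a different distance) and assess again
m2 <- matchit(treat ~ x1 + x2 + x3, data = d,
              method = "nearest", distance = "mahalanobis",
              estimand = "ATT")
bal.tab(m2, un = TRUE)
```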
In particular, we recommend matching methods that relax the restriction of 1:1 matching when possible and retain the target estimand; (optimal or generalized) full matching and subclassification often outperform 1:1 matching without replacement. Methods that avoid the propensity score should be tried first: we recommend weighting methods that bypass estimation of a propensity score and instead estimate weights that directly balance the covariate distributions, such as entropy balancing and energy balancing. Whenever a propensity score is used, flexible machine learning models that can accommodate potential nonlinearities in the treatment model should be prioritized over simple logistic regression. Of course, every method can be assessed empirically on a given dataset; results from simulation studies and broad recommendations only hint at which methods might work best and cannot guarantee that a particular method will succeed with the data at hand. Simple methods may turn out to be adequate, but this must be verified rather than assumed.
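These recommendations might look something like the following sketch, again assuming MatchIt and WeightIt and the same hypothetical dataset `d`; the specific methods shown (full matching, subclassification, entropy balancing, energy balancing, and a boosted-model propensity score) are illustrative rather than exhaustive.

```r
# Sketch of the recommended alternatives
# (hypothetical data `d`, treatment `treat`, covariates x1-x3)
library(MatchIt)
library(WeightIt)

# Optimal full matching and subclassification retain the target estimand
# and often outperform 1:1 matching without replacement
m_full <- matchit(treat ~ x1 + x2 + x3, data = d,
                  method = "full", estimand = "ATT")
m_sub  <- matchit(treat ~ x1 + x2 + x3, data = d,
                  method = "subclass", estimand = "ATT")

# Weights that directly balance the covariates, bypassing a propensity score
w_ebal   <- weightit(treat ~ x1 + x2 + x3, data = d,
                     method = "ebal", estimand = "ATT")    # entropy balancing
w_energy <- weightit(treat ~ x1 + x2 + x3, data = d,
                     method = "energy", estimand = "ATT")  # energy balancing

# If a propensity score is used, a flexible model (here, boosting) can
# capture potential nonlinearities in the treatment model
w_gbm <- weightit(treat ~ x1 + x2 + x3, data = d,
                  method = "gbm", estimand = "ATT")
```

As with the earlier sketch, each of these specifications should be assessed for balance and remaining sample size before any effects are estimated.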
Although much of the focus in matching and weighting analyses has been on the method itself, estimating the treatment effect after matching or weighting is just as important and can substantially affect the interpretation of the results. We recommend never using the coefficient on treatment in an outcome model as a treatment effect estimate unless no covariates are included in the model; that said, we recommend always including covariates when possible to improve precision. G-computation provides a means of computing a treatment effect on any effect measure scale from any outcome model in a way that allows covariates to be included while retaining the target estimand. Standard errors and confidence intervals require consideration as well; in general, bootstrapping works well for both matching and weighting, though there are faster analytic methods that can yield comparable performance without such a computationally intensive procedure.
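A minimal sketch of g-computation after matching is below, assuming the marginaleffects package, the hypothetical full matching object `m_full` from the earlier sketch, and a placeholder outcome `y`. It fits an outcome model that includes covariates and treatment-covariate interactions, then averages the predicted treatment contrast over the treated units to target the ATT, with cluster-robust standard errors by matched stratum.

```r
# Sketch: g-computation for the ATT after full matching
# (placeholder objects: `m_full` from matchit(), outcome `y`, treatment `treat`)
library(MatchIt)
library(marginaleffects)

md <- match.data(m_full)   # matched data with matching weights and subclass

# Outcome model including covariates and treatment-covariate interactions,
# weighted by the matching weights
fit <- lm(y ~ treat * (x1 + x2 + x3), data = md, weights = weights)

# G-computation: average the predicted treatment contrast over the treated
# units, with standard errors clustered by matched stratum
avg_comparisons(fit, variables = "treat",
                newdata = subset(md, treat == 1),
                wts = "weights",
                vcov = ~subclass)
```

Bootstrapping the entire pipeline (matching or weighting followed by effect estimation) is an alternative way to obtain standard errors when the performance of analytic approximations is in doubt.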
Matching and weighting do not automatically grant causal conclusions, but they can help along the way. Satisfaction of the assumptions of causal inference is what allows the statistical quantities estimated using matching and weighting to be interpreted as causal. In addition to these assumptions, purely statistical assumptions are required for an estimate to be accurate. Researchers must articulate these assumptions and express any uncertainty around them as limitations that are often inherent to observational research. Whenever possible, researchers should provide readers with information about the assessed quality of the method used and about sensitivity to potential violations of these assumptions.
We hope this guide has been helpful in allowing researchers to move toward best practices in matching and weighting for causal inference with the aim of producing more valid and reproducible research.