Difference-in-differences meets tree-based methods: heterogeneous treatment effects estimation with unmeasured confounding
Article No.: 1407, Pages 33792 - 33803
Abstract
This study considers the estimation of conditional causal effects in the presence of unmeasured confounding for a balanced panel in which treatment is imposed at the last time point. To address this problem, we combine difference-in-differences (DiD) with tree-based methods and propose a new identification assumption that allows for violations of the (conditional) parallel trends assumption adopted by most existing DiD methods. Under this new assumption, we prove partial identifiability of the conditional average treatment effect on the treated group (CATT). Our proposed method estimates CATT through a tree-based causal approach, guided by a novel splitting rule that avoids model misspecification and unnecessary auxiliary parameter estimation. The splitting rule simultaneously measures the error of fitting the observed data and the violation of conditional parallel trends. We also develop an ensemble of multiple trees via gradient boosting to further enhance performance. Experimental results on both synthetic and real-world datasets validate the effectiveness of our proposed method.
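As a rough illustration of the ingredients named in the abstract, the sketch below computes a canonical two-period DiD estimate inside subgroups of a toy balanced panel and scores a candidate split with a loss that adds a fit term to a pre-trend penalty. It is not the paper's splitting rule or identification strategy: `leaf_loss`, the weight `lam`, the placebo pre-trend check, and the noise-free data-generating process are all assumptions made here for illustration only.

```python
import numpy as np


def did_att(y_before, y_after, treated):
    """Canonical 2x2 DiD: mean change among treated minus mean change among controls."""
    change = y_after - y_before
    return change[treated].mean() - change[~treated].mean()


def leaf_loss(y0, y1, y2, treated, lam=1.0):
    """Hypothetical leaf loss (NOT the paper's rule).

    fit: within-group squared deviation of the last-period change (y2 - y1).
    gap: placebo DiD on the two pre-treatment periods; it is zero when
         pre-treatment trends are parallel within the leaf.
    """
    change = y2 - y1
    fit = sum(((change[g] - change[g].mean()) ** 2).sum() for g in (treated, ~treated))
    gap = did_att(y0, y1, treated)
    return fit + lam * treated.size * gap ** 2


# Toy balanced panel: three periods, treatment imposed only at the last one.
x = np.array([0, 0, 0, 0, 1, 1, 1, 1])                    # observed covariate
treated = np.array([1, 1, 0, 0, 1, 1, 0, 0], dtype=bool)  # treatment group indicator
level = 2.0 * treated      # unmeasured level shift between groups (assumed)
trend = 1.0 + x            # time trend depends on x but not on treatment
tau = 1.0 + 2.0 * x        # planted effect: 1 when x == 0, 3 when x == 1

y0 = level + 0 * trend
y1 = level + 1 * trend
y2 = level + 2 * trend + tau * treated

left = x == 0
right = ~left

catt_left = did_att(y1[left], y2[left], treated[left])
catt_right = did_att(y1[right], y2[right], treated[right])
att_pooled = did_att(y1, y2, treated)

root_loss = leaf_loss(y0, y1, y2, treated)
split_loss = (leaf_loss(y0[left], y1[left], y2[left], treated[left])
              + leaf_loss(y0[right], y1[right], y2[right], treated[right]))

print(catt_left, catt_right, att_pooled, root_loss, split_loss)
```

On this noise-free toy panel the per-subgroup DiD estimates recover the planted effects, and the split on `x` lowers the illustrative loss because it separates units whose last-period changes differ. With real data the pre-trend term would rarely be exactly zero, which is the situation the relaxed identification assumption in the abstract is meant to handle.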
Published In
July 2023
43479 pages
Copyright © 2023.
Publisher
JMLR.org
Publication History
Published: 23 July 2023
Qualifiers
- Research-article
- Research
- Refereed limited