How can a decision-maker forecast the causal impact of interventions over time? This question lies at the intersection of Causal Inference and Machine Learning.

For example, an e-commerce business may be interested in predicting the future sales of its products under a sequence of different pricing strategies. A quant trader could benefit from forecasting potential outcomes under different trading strategies. The ability to answer these questions can aid prospective and retrospective what-if analyses.

This semester, I took a course on Causal Inference that taught me the techniques and assumptions under which we can perform causal analysis on randomized, observational, and quasi-experimental studies. For my final project, I participated in the American Causal Inference Conference (ACIC 2023) data challenge and delved deeper into the problem of Causal Counterfactual Forecasting. Here are the key contributions of my work.

Deep Thinking on the Brooklyn Bridge
  1. Evaluated the Counterfactual Recurrent Network (CRN) in a business setting. Most recent methods, including the CRN, have been evaluated on a simulated tumor growth dataset or data in the healthcare domain. In this work, I evaluated the model on the ACIC challenge dataset, which mirrors data in an e-commerce business setting.

  2. General purpose implementation of CRN to work with any time-series dataset. The open-source implementation of CRN only works with the simulated tumor dataset. Here I provide a modified script and data processing notebook to repurpose your data to work with CRN.

  3. Plausibility analysis of causal assumptions. While most studies on counterfactual forecasting list the necessary assumptions for causal claims, the plausibility of these assumptions in real-world settings is not discussed. In this work, I discuss these assumptions in the context of the e-commerce example. Although these methods adjust for time-confounding bias that supervised machine learning methods do not, they make strong assumptions that may not hold in practice. It is therefore prudent to use the term “causal” with caution when describing the predictions of these models.

Check out the full paper and repo.