Causal Inference · Machine Learning · Development · Statistics

Why Causal Inference Matters More Than Prediction in Development Research

March 15, 2026

Predictive accuracy is a seductive metric. A model that predicts child stunting with an AUC of 0.90 feels like progress. But in development research, prediction is rarely the goal — intervention is.

The Core Problem

When a ministry of health asks “which children should we target with nutritional support?”, they are not asking for a prediction model. They are asking a causal question: which children will benefit most from the intervention?

These are different questions. A model trained to predict outcomes from observational data conflates correlation with causal effect. High socioeconomic status correlates with good outcomes — but targeting rich children for nutrition programs would be absurd.
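A small simulation makes the gap concrete. Everything below is hypothetical: outcomes depend strongly on socioeconomic status (so an outcome predictor ranks children by SES), while the treatment effect is assumed largest for children near the middle of the SES distribution. Even an oracle outcome predictor then targets a very different group than effect-based targeting would.

```python
import math
import random
import statistics

random.seed(0)

# Hypothetical data-generating process (illustration only):
# - `ses`: socioeconomic status, standard normal
# - untreated outcome y0 rises with SES (prediction picks up this correlation)
# - treatment effect tau peaks at middle SES, not at the worst outcomes
n = 10_000
ses = [random.gauss(0, 1) for _ in range(n)]
y0 = [2.0 * s + random.gauss(0, 0.5) for s in ses]   # untreated outcome
tau = [math.exp(-s * s) for s in ses]                # true treatment effect

# Target the worst 10% by (oracle-)predicted outcome vs. by true effect.
k = n // 10
by_outcome = sorted(range(n), key=lambda i: y0[i])[:k]
by_effect = sorted(range(n), key=lambda i: -tau[i])[:k]

gain_outcome = statistics.mean(tau[i] for i in by_outcome)
gain_effect = statistics.mean(tau[i] for i in by_effect)
print(f"avg benefit, worst-outcome targeting:  {gain_outcome:.2f}")
print(f"avg benefit, highest-effect targeting: {gain_effect:.2f}")
```

Under these assumptions, outcome-based targeting selects the lowest-SES children, whose (simulated) treatment effect is small, and captures only a fraction of the benefit that effect-based targeting delivers. The point is not these particular numbers but that the two rankings can diverge arbitrarily.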

Potential Outcomes Framework

The counterfactual framework makes this precise. For each child $i$, define:

  • $Y_i(1)$: outcome if treated
  • $Y_i(0)$: outcome if untreated
  • $\tau_i = Y_i(1) - Y_i(0)$: individual treatment effect

We can never observe both $Y_i(1)$ and $Y_i(0)$ for the same child — this is the fundamental problem of causal inference. But we can estimate the average treatment effect $\mathbb{E}[\tau_i]$ or the conditional average treatment effect $\mathbb{E}[\tau_i \mid X_i = x]$ under assumptions.
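Under randomization, those assumptions are satisfied by design, and the average treatment effect is identified by a difference in group means. A minimal sketch, with a simulated trial whose true effects are chosen for illustration (0.5 when a binary covariate $x = 0$, 1.5 when $x = 1$, so the true ATE is 1.0):

```python
import random
import statistics

random.seed(1)

# Hypothetical randomized trial: treatment t is assigned by coin flip,
# so E[Y | t=1] - E[Y | t=0] estimates the ATE without confounding.
n = 20_000
x = [random.choice([0, 1]) for _ in range(n)]          # binary covariate
tau = [0.5 + 1.0 * xi for xi in x]                     # true effect: 0.5 or 1.5
t = [random.random() < 0.5 for _ in range(n)]          # randomized assignment
y = [random.gauss(0, 1) + (tau[i] if t[i] else 0.0) for i in range(n)]

def diff_in_means(idx):
    """Difference in mean observed outcome, treated minus untreated."""
    treated = [y[i] for i in idx if t[i]]
    untreated = [y[i] for i in idx if not t[i]]
    return statistics.mean(treated) - statistics.mean(untreated)

ate = diff_in_means(range(n))                                   # ≈ 1.0
cate0 = diff_in_means([i for i in range(n) if x[i] == 0])       # ≈ 0.5
cate1 = diff_in_means([i for i in range(n) if x[i] == 1])       # ≈ 1.5
```

Stratifying the same estimator on $x$ recovers the conditional average treatment effects $\mathbb{E}[\tau_i \mid X_i = x]$; in observational data, by contrast, the identical arithmetic is biased unless assignment is unconfounded given the covariates.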

What This Means for Our Work

In the Green-NAS paper, we were careful to frame our contributions as predictive — we are not claiming that wider transformers cause better performance in all settings. In the child development encoder work, we explicitly model the deployment setting as a transfer learning problem, not a causal one.

The field needs more researchers who can draw this line clearly.