Lant Pritchett

The enthusiasm for the potential of RCTs in development rests in part on the assumption that using the rigorous evidence that emerges from an RCT (or from a small set of studies identified as rigorous in a “systematic” review) leads to the adoption of more effective policies, programs or projects. However, the supposed benefits of using rigorous evidence for “evidence based” policy making depend critically on the extent to which there is external validity. If estimates of causal impact or treatment effects that have internal validity (are unbiased) in one context (where the relevant “context” could be country, region, implementing organization, complementary policies, initial conditions, etc.) cannot be applied to another context, then applying evidence that is rigorous in one context may actually reduce predictive accuracy in other contexts relative to simple evidence from that context, even if that evidence is biased (Pritchett and Sandefur 2015). Using empirical estimates from a large number of developing countries of the difference in student learning between public and private schools (just as one potential policy application), I show that commonly made assumptions about external validity are, in the face of the actual observed heterogeneity across contexts, both logically incoherent and empirically unhelpful. Logically incoherent, in that it is impossible to reconcile general claims about the external validity of rigorous estimates of causal impact with the heterogeneity of the raw facts about differentials. Empirically unhelpful, in that applying a single rigorous estimate (or a small set of them) to all other contexts actually leads to a larger root mean square error (RMSE) of prediction of the “true” causal impact across contexts than just using the estimates from non-experimental data from each country.
In the data about private and public schools, under plausible assumptions, an exclusive reliance on the rigorous evidence produces an RMSE three times larger than that from using the biased OLS result from each context. In making policy decisions one needs to rely on an understanding of the relevant phenomena that encompasses all of the available evidence.
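The RMSE comparison described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's data or method: the true effects are synthetic draws, the local non-experimental estimates are assumed to carry a constant bias, and the function names and parameter values (`compare_strategies`, `effect_sd`, `bias`) are hypothetical choices made for illustration.

```python
import math
import random


def rmse(predictions, truths):
    """Root mean square error between predicted and true effects."""
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, truths)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def compare_strategies(n_contexts=40, effect_sd=0.3, bias=0.1, seed=0):
    """Compare two ways of predicting the true effect in each context.

    All numbers here are illustrative assumptions, not estimates from
    the paper.
    """
    rng = random.Random(seed)

    # Assumption: the true causal effect differs across contexts.
    true_effects = [rng.gauss(0.2, effect_sd) for _ in range(n_contexts)]

    # Assumption: each context's own non-experimental estimate equals
    # the true local effect plus a constant bias.
    local_biased = [t + bias for t in true_effects]

    # One rigorous estimate, taken to be exactly right in context 0.
    rigorous_estimate = true_effects[0]

    # Evaluate both strategies on every context other than context 0.
    targets = true_effects[1:]
    rmse_rigorous = rmse([rigorous_estimate] * len(targets), targets)
    rmse_local = rmse(local_biased[1:], targets)
    return rmse_rigorous, rmse_local


if __name__ == "__main__":
    rmse_rigorous, rmse_local = compare_strategies()
    print(f"single rigorous estimate applied everywhere: {rmse_rigorous:.3f}")
    print(f"biased local estimate in each context:       {rmse_local:.3f}")
```

In this setup the error of the local estimates equals the assumed bias, while the error of the exported rigorous estimate grows with the spread of true effects across contexts. Which strategy does better therefore depends on how cross-context heterogeneity compares with the size of the bias, which is the comparison the abstract describes.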
The dual core ideas of early development economics and practice were that (a) national development was a four-fold transformation of countries towards: (i) a more productive economy, (ii) a more responsive state, (iii) more capable administration, and (iv) a shared identity and equal treatment of citizens and (b) this four-fold transformation of national development would lead to higher levels of human wellbeing. The second idea is strikingly correct: development delivers. National development is empirically necessary for high wellbeing (no country with low levels of national development has high human wellbeing) and also empirically sufficient (no country with high national development has low levels of human wellbeing). Three measures of national development (productive economy, capable administration, and responsive state) explain essentially all of the observed variation in an omnibus indicator of wellbeing, the Social Progress Index, which is based on 58 distinct non-economic indicators. How national development delivers on wellbeing varies in three ways. One, economic growth is much more important for achieving wellbeing at low than at high levels of income. Two, economic growth matters more for “basic needs” than for other dimensions of wellbeing (like social inclusiveness or environmental quality). Three, state capability matters more for wellbeing outcomes that depend on public production than for those that depend on private goods (and for some wellbeing indicators, like physical safety, growth doesn’t matter at all). While these findings may seem too common sense to be worth a paper, national development, and particularly economic growth, is, strangely, under severe challenge as an important and legitimate objective of action within the development industry.