Evidence Based Policy Evaluation and the Design of Natural Experiments: The Origins of Program Evaluation in the Department of Labor and Beyond

April 19, 2019

By: Orley C. Ashenfelter, Joseph Douglas Green 1895 Professor of Economics, Princeton University

This is part of the Michael H. Moskow Honorary Paper Series.

The title above is taken directly from one of the most famous and influential scientific books of the twentieth century, R.A. Fisher’s Design of Experiments. Fisher struggled with, and analyzed, what is now considered the “gold standard” method for making causal inferences. This method, which had been hinted at in scientific work for centuries, finally reached its full development in the twentieth century. Designed to produce highly credible evidence in complex situations, it is based on the idea that we determine the causal effect of a treatment or intervention by randomly assigning the treatment to some fraction of the units we wish to influence and reserving the remainder as a control group. In medicine, it is said that we test a drug or procedure by using randomized clinical trials; Fisher, however, studied primarily agricultural experiments, hence the name “randomized field trials.” 

The key point is that these are experiments that take place in the real world — not a laboratory — and provide the final, conclusive test of the efficacy of a treatment or intervention. An interesting aspect of Fisher’s work is that it was always motivated by actual problems of experimental inference and evolved as a fundamentally practical analysis. Fisher’s lasting contribution, now taken for granted by virtually all scientists, is a set of methods for determining when observed differences are unlikely to be due to chance alone.
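Fisher’s method for deciding whether an observed difference is unlikely to be due to chance alone can be illustrated with a randomization (permutation) test. The sketch below uses entirely hypothetical outcome data, not figures from the text: under the null hypothesis of no treatment effect, the treatment and control labels are arbitrary, so we re-randomize them many times and ask how often a difference at least as large as the observed one arises.

```python
import random

# Hypothetical outcomes for a small randomized trial (illustrative only)
treated = [12.1, 14.3, 11.8, 15.0, 13.2]
control = [10.4, 11.1, 12.0, 9.8, 10.9]

observed_diff = sum(treated) / len(treated) - sum(control) / len(control)

# Fisher's randomization test: shuffle the pooled outcomes into new
# "treatment" and "control" groups many times, and count how often the
# re-randomized difference is at least as large as the observed one.
pooled = treated + control
n_treated = len(treated)
random.seed(0)

n_perms = 10_000
count = 0
for _ in range(n_perms):
    shuffled = random.sample(pooled, len(pooled))
    t, c = shuffled[:n_treated], shuffled[n_treated:]
    diff = sum(t) / len(t) - sum(c) / len(c)
    if diff >= observed_diff:
        count += 1

p_value = count / n_perms
print(f"observed difference: {observed_diff:.2f}, p ~ {p_value:.3f}")
```

A small p-value says the observed gap between groups would rarely arise from the random assignment alone, which is exactly the logic Fisher formalized for his agricultural field trials.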

Much of what I have tried to do in my own research in the last forty years is to find some way to implement highly credible methods for the study of important and controversial problems of inference in economics. These methods, which tend to be opportunistic because they differ with the problem being studied, have come to be called “natural experiments.” Natural experiments are sometimes randomized trials (jokingly called “unnatural experiments” by a few of my colleagues), but often they must depend on some method that falls short of this gold standard. The key point is that these are analyses of what happens in practice, not just in theory, and the emphasis is on the credibility of the results.

Of course, most of the important methodological problems of economics, and especially labor economics where I have often worked, differ from agriculture and medicine. The differences can be categorized into three groups:

  1. We often do not have the data necessary to study a problem.
  2. We often cannot, and would not wish to, control the division of the units to be studied into control and treatment groups.
  3. Finally, even in the best of circumstances we cannot guarantee the integrity of the treatment assignment, so we must make provision for a difference between what we intend to measure and what we do measure.
     

In these remarks, I would like to share with you a few personal comments about how I was “converted” to the view that credible (that is, genuinely believable) inference is the key to progress in economics.


About the Author