Stratified Sampling Using Cluster Analysis: A Sample Selection Strategy for Improved Generalizations From Experiments
In: Evaluation Review: A Journal of Applied Social Research, Volume 37, Issue 2, pp. 109-139
ISSN: 1552-3926
Background: An important question in the design of experiments is how to ensure that the findings from the experiment are generalizable to a larger population. This concern with generalizability is particularly important when treatment effects are heterogeneous and when selecting units into the experiment using random sampling is not possible, two conditions commonly met in large-scale educational experiments. Method: This article introduces a model-based balanced-sampling framework for improving generalizations, with a focus on developing methods that are robust to model misspecification. Additionally, the article provides a new method for sample selection within this framework: first, units in an inference population are divided into relatively homogeneous strata using cluster analysis, and then the sample is selected within each stratum using distance rankings. Results: To demonstrate and evaluate the method, a reanalysis of a completed experiment is conducted. This example compares samples selected using the new method with the actual sample used in the experiment. Results indicate that, even under high nonresponse, the samples selected with the new method are better balanced on most covariates and produce fewer coverage errors. Conclusion: The article concludes with a discussion of additional benefits and limitations of the method.
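The two-step selection described in the abstract can be illustrated with a minimal sketch. It is not the article's implementation: the abstract says only "cluster analysis" and "distance rankings", so the use of k-means, Euclidean distance to the stratum centroid, proportional allocation, and every function name and data value below are assumptions made for illustration.

```python
import math
import random

def standardize(rows):
    """Scale each covariate to mean 0, SD 1 so no covariate dominates distances."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

def dist(a, b):
    """Euclidean distance (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, seed=0, iters=100):
    """Plain k-means, standing in for whichever clustering method is used."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centers

def rank_within_strata(points, labels, centers):
    """For each stratum, list unit indices from closest to farthest from its centroid."""
    strata = {j: [] for j in range(len(centers))}
    for i, (p, l) in enumerate(zip(points, labels)):
        strata[l].append((dist(p, centers[l]), i))
    return {j: [i for _, i in sorted(v)] for j, v in strata.items()}

def allocate(sizes, n):
    """Proportional allocation with largest-remainder rounding (assumed rule)."""
    total = sum(sizes.values())
    quotas = {j: n * s / total for j, s in sizes.items()}
    alloc = {j: int(q) for j, q in quotas.items()}
    leftover = n - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda j: quotas[j] - alloc[j], reverse=True)
    for j in by_remainder[:leftover]:
        alloc[j] += 1
    return alloc

def select(ranking, alloc, refused=frozenset()):
    """Recruit in rank order; a refusal is replaced by the next-ranked unit."""
    chosen = []
    for j, order in ranking.items():
        willing = [i for i in order if i not in refused]
        chosen.extend(willing[:alloc[j]])
    return sorted(chosen)

# Hypothetical inference population: 60 units, 2 covariates.
rng = random.Random(1)
population = [[rng.gauss(mu, 1.0), rng.gauss(10 * mu, 5.0)]
              for mu in (0, 5, 10) for _ in range(20)]

scaled = standardize(population)
labels, centers = kmeans(scaled, k=3)
ranking = rank_within_strata(scaled, labels, centers)
alloc = allocate({j: len(v) for j, v in ranking.items()}, n=12)
sample = select(ranking, alloc)
print("units per stratum:", alloc)
print("selected unit indices:", sample)
```

Ranking units by distance to the stratum centroid gives recruiters an ordered list, so a unit that declines can be replaced by the next most typical unit in the same stratum. The `refused` argument mimics this in a simplified way.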