Listwise Deletion in High Dimensions
In: Political analysis: PA ; the official journal of the Society for Political Methodology and the Political Methodology Section of the American Political Science Association, Band 31, Heft 1, S. 149-155
ISSN: 1476-4989
AbstractWe consider the properties of listwise deletion when both n and the number of variables grow large. We show that when (i) all data have some idiosyncratic missingness and (ii) the number of variables grows superlogarithmically in n, then, for large n, listwise deletion will drop all rows with probability 1. Using two canonical datasets from the study of comparative politics and international relations, we provide numerical illustration that these problems may emerge in real-world settings. These results suggest that, in practice, using listwise deletion may mean using few of the variables available to the researcher.