Randomization tests for experiments embedded in complex surveys
Date
2022
Authors
Brown, David A., author
Breidt, F. Jay, advisor
Sharp, Julia, committee member
Zhou, Tianjian, committee member
Ogle, Stephen, committee member
Journal Title
Journal ISSN
Volume Title
Abstract
Embedding experiments in complex surveys has become increasingly important. For scientific questions, such embedding allows researchers to take advantage of both the internal validity of controlled experiments and the external validity of probability-based samples of a population. Within survey statistics, declining response rates have led to the development of new methods, known as adaptive and responsive survey designs, that try to increase or maintain response rates without negatively impacting survey quality. Such methodologies are assessed experimentally. Examples include a series of embedded experiments in the 2019 Triennial Community Health Survey (TCHS), conducted by the Health District of Northern Larimer County in collaboration with the Department of Statistics at Colorado State University, to determine the effects of monetary incentives, targeted mailing of reminders, and double-stuffed envelopes (including both English and Spanish versions of the survey) on response rates, cost, and representativeness of the sample. This dissertation develops methodology and theory of randomization-based tests embedded in complex surveys, assesses the methodology via simulation, and applies the methods to data from the 2019 TCHS. An important consideration in experiments to increase response rates is the overall balance of the sample, because higher overall response might still underrepresent important groups. There have been advances in recent years on methods to assess the representativeness of samples, including application of the dissimilarity index (DI) to help evaluate the representativeness of a sample under the different conditions in an incentive experiment (Biemer et al. [2018]). We develop theory and methodology for design-based inference for the DI when used in a complex survey. Simulation studies show that the linearization method has good properties, with good confidence interval coverage even in cases when the true DI is close to zero, even though point estimates may be biased. We then develop a class of randomization tests for evaluating experiments embedded in complex surveys. We consider a general parametric contrast, estimated using the design-weighted Narain-Horvitz-Thompson (NHT) approach, in either a completely randomized design or a randomized complete block design embedded in a complex survey. We derive asymptotic normal approximations for the randomization distribution of a general contrast, from which critical values can be derived for testing the null hypothesis that the contrast is zero. The asymptotic results are conditioned on the complex sample, but we include results showing that, under mild conditions, the inference extends to the finite population. Further, we develop asymptotic power properties of the tests under moderate conditions. Through simulation, we illustrate asymptotic properties of the randomization tests and compare the normal approximations of the randomization tests with corresponding Monte Carlo tests, with a design-based test developed by van den Brakel, and with randomization tests developed by Fisher-Pitman-Welch and Neyman. The randomization approach generalizes broadly to other kinds of embedded experimental designs and null hypothesis testing problems, for very general survey designs. The randomization approach is then extended from NHT estimators to generalized regression estimators that incorporate auxiliary information, and from linear contrasts to comparisons of nonlinear functions.