Nonparametric tests for informative selection and small area estimation for reconciling survey estimates

Date

2020

Authors

Liu, Teng, author
Breidt, F. Jay, advisor
Wang, Haonan, committee member
Estep, Donald J., committee member
Doherty, Paul F., Jr., committee member

Abstract

Two topics in the analysis of complex survey data are addressed: testing for informative selection and addressing temporal discontinuities due to survey redesign. Informative selection, in which the distribution of response variables given that they are sampled is different from their distribution in the population, is pervasive in modern complex surveys. Failing to take such informativeness into account could produce severe inferential errors, such as biased parameter estimators, wrong coverage rates of confidence intervals, incorrect test statistics, and erroneous conclusions. While several parametric procedures exist to test for informative selection in the survey design, it is often hard to check the parametric assumptions on which those procedures are based. We propose two classes of nonparametric tests for informative selection, each motivated by a nonparametric test for two independent samples. The first nonparametric class generalizes classic two-sample tests that compare empirical cumulative distribution functions, including Kolmogorov–Smirnov and Cramér–von Mises, by comparing weighted and unweighted empirical cumulative distribution functions. The second nonparametric class adapts two-sample tests that compare distributions based on the maximum mean discrepancy to the setting of weighted and unweighted distributions. The asymptotic distributions of both test statistics are established under the null hypothesis of noninformative selection. Simulation results demonstrate the usefulness of the asymptotic approximations, and show that our tests have competitive power with parametric tests in a correctly specified parametric setting while achieving greater power in misspecified scenarios.
Many surveys face the problem of comparing estimates obtained with different methodology, including differences in frames, measurement instruments, and modes of delivery. Differences may exist within the same survey; for example, multi-mode surveys are increasingly common. Further, it is inevitable that surveys need to be redesigned from time to time. Major redesign of survey processes could affect survey estimates systematically, and it is important to quantify and adjust for such discontinuities between the designs to ensure comparability of estimates over time. We propose a small area estimation approach to reconcile two sets of survey estimates, and apply it to two surveys in the Marine Recreational Information Program (MRIP). We develop a log-normal model for the estimates from the two surveys, accounting for temporal dynamics through regression on population size and state-by-wave seasonal factors, and accounting in part for changing coverage properties through regression on wireless telephone penetration. Using the estimated design variances, we develop a regression model that is analytically consistent with the log-normal mean model. We use the modeled design variances in a Fay-Herriot small area estimation procedure to obtain empirical best linear unbiased predictors of the reconciled effort estimates for all states and waves, and provide an asymptotically valid mean square error approximation.
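
The first class of tests named in the abstract compares weighted and unweighted empirical cumulative distribution functions. The sketch below shows one way such a comparison might be computed. It is an illustration under stated assumptions, not the thesis's procedure: the function name `weighted_ecdf_gap` is hypothetical, the weighted ECDF is normalized by the sum of the weights, the responses are assumed continuous (no ties), and the scaling and critical values needed for an actual test are not reproduced here.

```python
import numpy as np


def weighted_ecdf_gap(y, w):
    """Hypothetical helper: gaps between weighted and unweighted ECDFs.

    Returns a Kolmogorov-Smirnov-type gap (largest absolute difference)
    and a Cramer-von Mises-type gap (mean squared difference), both
    evaluated at the sorted sample points. Assumes no ties in y.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    if y.shape != w.shape or y.ndim != 1:
        raise ValueError("y and w must be 1-D arrays of equal length")
    if np.any(w <= 0):
        raise ValueError("weights must be positive")

    order = np.argsort(y, kind="stable")
    w_sorted = w[order]
    n = y.size

    # Unweighted ECDF at the sorted sample points: 1/n, 2/n, ..., 1.
    f_unweighted = np.arange(1, n + 1) / n
    # Weighted ECDF, normalized by the total weight (an assumption).
    f_weighted = np.cumsum(w_sorted) / w_sorted.sum()

    diff = f_weighted - f_unweighted
    ks_gap = float(np.max(np.abs(diff)))
    cvm_gap = float(np.mean(diff ** 2))
    return ks_gap, cvm_gap


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.normal(size=200)
    # Hypothetical weights that depend on the response, for illustration only.
    w = np.exp(-0.5 * y)
    print(weighted_ecdf_gap(y, w))
```

With constant weights the two ECDFs coincide and both gaps are zero, which matches the intuition that weights unrelated to the response should produce only small gaps.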

Subject

nonparametric tests
informative selection
small area estimation
