Bayesian model averaging and spatial prediction
Abstract
I consider the problem of modeling spatial data, addressing the uncertainty involved in both the selection of explanatory variables and the estimation of parameters. The standard practice of selecting a set of explanatory variables, estimating parameters, and predicting the random process at unobserved locations ignores the uncertainty in the entire procedure. This thesis focuses on Bayesian methods to provide a more 'honest' assessment of prediction uncertainty. I also examine the effect the spatial sampling pattern has on estimation and prediction. I adopt the Matérn class of autocorrelation functions throughout, which allows for flexible modeling of a variety of autocorrelation curves.

Standard practice for model selection for spatially correlated data consists of first choosing a set of explanatory variables assuming independent errors and then modeling the residuals using an autocorrelation function. In this thesis I propose a technique that fits the mean and autocorrelation functions simultaneously. Models are compared using the corrected Akaike information criterion (AICC) (Hurvich and Tsai, 1989). This method removes the ambiguity in determining which set of explanatory variables best fits the data when the process is correlated. Simulation results show an improvement in mean square prediction error and predictive coverage compared with traditional methods for model selection for spatially correlated data.

This thesis includes three methods for incorporating uncertainty into prediction. First, I detail a Bayesian kriging setup, with proper prior distributions on both the regression parameters and the spatial parameters. Second, I develop methodology for Bayesian model averaging for spatial data, incorporating all possible subsets of explanatory variables into the predictive distribution. Third, I develop a Bayesian model selection criterion for selecting one model while incorporating uncertainty in the estimation procedure.
Simulation results for these three methods show improvements in predictive coverage, mean square prediction error, and selection of the model containing the true explanatory variables, compared with traditional methods for spatial prediction.

The configuration of sample locations often plays a key role in modeling the data. Many types of location patterns may arise in spatial analysis, including grid sampling, sampling over geographic areas such as counties, random sampling, or a mixture of grid and clustered sampling. The simulations in this thesis show that sampling patterns can alter the ability to select explanatory variables, estimate parameters, and predict the response at unobserved locations.

Finally, this thesis compares two methods for spatial parameter estimation: restricted maximum likelihood (REML) and maximum likelihood (ML). REML estimation better approximates the true autocorrelation function and gives more accurate parameter estimates, but results vary depending on the configuration of sampling locations.
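
For readers unfamiliar with the two quantities named in the abstract, the following is a minimal illustrative sketch, not the thesis's own code. It assumes one common parameterization of the Matérn correlation function (the thesis may use a different one) and the generic small-sample form of AICC, where the parameter count covers every estimated parameter. The function and argument names (`matern_correlation`, `range_param`, `smoothness`, `aicc`) are hypothetical.

```python
import numpy as np
from scipy.special import gamma, kv  # kv: modified Bessel function of the second kind


def matern_correlation(h, range_param, smoothness):
    """Matern correlation at distance(s) h.

    Assumed parameterization (one of several in common use):
        rho(h) = 2^(1 - nu) / Gamma(nu) * x^nu * K_nu(x),
        x = sqrt(2 * nu) * h / range_param,
    where nu is the smoothness parameter.
    """
    h = np.atleast_1d(np.asarray(h, dtype=float))
    x = np.sqrt(2.0 * smoothness) * h / range_param
    corr = np.ones_like(h)  # correlation is 1 at distance zero
    pos = x > 0             # K_nu diverges at 0, so handle h = 0 separately
    corr[pos] = (
        2.0 ** (1.0 - smoothness) / gamma(smoothness)
        * x[pos] ** smoothness
        * kv(smoothness, x[pos])
    )
    return corr


def aicc(log_likelihood, n_obs, n_params):
    """Corrected AIC in its generic form: -2 log L + 2 k n / (n - k - 1).

    n_params (k) is assumed to count all estimated parameters, for example
    regression coefficients plus spatial covariance parameters.
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICC requires n_obs > n_params + 1")
    return -2.0 * log_likelihood + 2.0 * n_params * n_obs / (n_obs - n_params - 1)


if __name__ == "__main__":
    distances = np.array([0.0, 0.5, 1.0, 2.0])
    # Under this parameterization, smoothness 0.5 gives the exponential
    # correlation exp(-h / range_param).
    print(matern_correlation(distances, range_param=1.0, smoothness=0.5))
    print(aicc(log_likelihood=-10.0, n_obs=20, n_params=3))
```

Under this assumed parameterization, larger smoothness values give correlation curves that are flatter near the origin, which is the flexibility the abstract refers to. Comparing candidate models by AICC then amounts to fitting each one and preferring the smallest value.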
Subject
statistics
