Random effects graphical models for discrete compositional data
Abstract
In this dissertation, we consider state-space models for the analysis of discrete compositional data. Compositional data are non-negative multivariate vectors that lie on the simplex defined by the sum-to-one constraint. The sum-to-one constraint simply means that the elements of every vector in the sample space sum to one (or some other scalar constant). Discrete compositional data are multivariate vectors of integer counts that have been normalized to give the relative abundance of each element of the count vector. The logistic normal (LN) distribution and the associated perturbation operator provide a flexible model for compositional data. However, the LN distribution may be a poor model for discrete compositional data because of the extra sampling variability of integer counts and the possible presence of zeros in the compositional observation. Here, we propose a class of state-space models for compositional data based on traditional graphical models. Graphical models are distributions for analyzing the conditional relationships of a Markov random field. We propose a two-component graphical chain model, the discrete regression distribution, in which a set of categorical (or discrete) random variables is modeled as a response to a set of categorical and continuous covariates. This new graphical model, for a single observation of a multivariate count vector, serves as the basis for a state-space model for compositional data. We examine necessary and sufficient conditions for a discrete regression distribution to be described by the graph of a Markov random field. The discrete regression formulation is extended to a state-space representation for the analysis of many discrete compositional observations. Models are constructed for compositions defined by a single classification criterion and also for those defined by multiple classification criteria.
We define an extended chain graph that possesses an extra vertex associated with the random state. Necessary and sufficient conditions are given for a random effects discrete regression to be Markovian with respect to this extended graph. We also give sufficient conditions for the Markov properties of the marginal distribution of the covariates and categorical response. A Bayesian approach to parameter inference is adopted. Markov chain Monte Carlo (MCMC) methods are used for estimation with data sets concerning the feeding type composition of stream invertebrates in Oregon and fish species richness in the Mid-Atlantic Highlands. Following the analysis of traditional discrete compositional data, we examine a state-space representation of another type of ecological composition analysis, capture-recapture models. Capture-recapture models provide inference on survival in wild animal populations. In state-space capture-recapture models, the survival rate for each time period represents an unobserved composition. We illustrate why capture-recapture models are nearly identical to traditional multi-way discrete composition models. An autoregressive random survival state is incorporated into traditional capture-recapture models, and an MCMC methodology is presented for inference. The methodology is demonstrated on a long-term data set of marked Pintail ducks.
Subject
statistics
