
Simulating species assemblages and evaluating species richness estimators

Date

2012

Authors

Reese, Gordon C., author
Wilson, Kenneth R., advisor
Flather, Curtis H., committee member
Stohlgren, Thomas J., committee member
Angert, Amy L., committee member

Abstract

Conservation efforts have long emphasized protecting biologically diverse areas. Species richness, the number of species in a defined area, is the most frequently used biodiversity measure, and it can be used for selecting among different areas and for studying process effects over time. Despite its intuitive appeal and conceptual simplicity, species richness is often difficult to quantify, even in well-surveyed areas, because of sampling limitations such as survey effort and species detection probability. This has led to the development of numerous species richness estimators. Nonparametric estimators present the least biased option, but no particular estimator has consistently performed best. Factors such as abundance, behavior, and survey design vary widely between locations, species, and datasets, affecting richness estimates and revealing the limitations of estimators. Increasing our understanding of the relationships between estimator performance and important factors can improve prediction and, ultimately, estimator utility. My objective was to evaluate the performance of nonparametric species richness estimators, both established and new, across a wide range of species assemblages. Given the difficulties of surveying many different assemblages and of assessing performance when the true state of an assemblage is unknown, I chose to develop a program for estimating the species richness of assemblages simulated with user-specified parameters. I also sought to use the following studies to develop a framework for selecting the best estimator given particular assemblage attributes and survey design parameters. In the following studies, I assumed that every individual was: 1) independent, i.e., there were no clonal colonies, 2) detectable, 3) correctly identified, and 4) sessile for the duration of a survey. Simulations were used because they are a convenient, and possibly the only, means of simultaneously controlling many characteristics of an assemblage.
By controlling only those factors that are of interest and excluding others, simulations represent a simplification of the real world, i.e., they trade realism for convenience, which can benefit cause-and-effect assessments because the real world involves many additional factors that can complicate estimation efforts. For example, there are difficult-to-detect species, e.g., cryptic and extremely small species, as well as limited sampling efforts that can further reduce estimator performance. The real world might therefore not conform to the trends detected in a simulated environment, particularly beyond the range of evaluated factors. For such reasons, I recommend that application of these results to the real world, especially extrapolation, be done with caution. Simulated environments ultimately represent a best-case scenario, so if estimators perform poorly there, how can we trust them in the much more complicated real world? Several factors influence estimator performance, including the number of species in the assemblage, total abundance or density, distribution of abundances across species, spatial configuration of individuals, species detection probability, and survey effort. In Chapter 2, I developed a species assemblage simulator for assessing estimator performance across a wide range of conditions. The program, SimAssem, allows a user to specify both assemblage and survey parameters and generates encounter histories as input for various estimators. In addition to nonparametric species richness estimators, SimAssem includes: 1) estimators of the additional amount of survey effort required to encounter user-specified proportions of the estimates from the Chao estimators and 2) an option to process existing encounter histories.
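The idea of generating encounter histories from user-specified assemblage and survey parameters can be illustrated with a minimal sketch. It is not SimAssem's algorithm: the function names, the single per-individual encounter probability, and the absence of any spatial structure are all simplifying assumptions made here for illustration.

```python
import random


def simulate_encounter_history(abundances, encounter_prob, n_surveys, seed=None):
    """Return a species-by-survey matrix of detections (1) and non-detections (0).

    abundances     -- number of individuals of each species (hypothetical input)
    encounter_prob -- assumed chance that any one individual is encountered
                      in a single survey, identical for all individuals
    n_surveys      -- number of repeated surveys (survey effort)
    """
    rng = random.Random(seed)
    history = []
    for n_individuals in abundances:
        # A species is recorded in a survey if at least one of its
        # independent individuals is encountered.
        p_species = 1.0 - (1.0 - encounter_prob) ** n_individuals
        row = [1 if rng.random() < p_species else 0 for _ in range(n_surveys)]
        history.append(row)
    return history


def observed_richness(history):
    """Raw count of species detected in at least one survey."""
    return sum(1 for row in history if any(row))
```

Under these assumptions, rare species (small abundances) are missed more often than common ones, so the raw count returned by `observed_richness` tends to fall below the true number of species, which is the gap the richness estimators try to close.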
In Chapter 3, I evaluated the bias, precision, and accuracy of 13 nonparametric estimators across simulated assemblages that were systematically varied in the number of species, distribution of species abundances, total abundance, spatial configuration of individuals, and species detection probability. I also varied sampling effort and survey design. When averaged across all assemblages, the estimators were less negatively biased than a raw count of species in a sample, and there was generally a tradeoff between bias and precision. Two relatively new estimators based on the similarity of repeated subsets of surveys were most accurate and appeared to reach asymptotes more quickly than the other estimators when used with real data. The number of species, distribution of species abundances, and effort had the largest effects on performance, largely by affecting sample coverage, i.e., the proportion of the species pool contained in the sample. Increases in the true number of species and decreases in the evenness of abundances negatively affected bias and accuracy. Increasing the rate of encounters via total abundance, species detection probability, and effort generally improved bias and accuracy. There was a moderate increase in bias when individuals were aggregated and sampled using a non-random survey design. Also, a refined estimator selection framework based on sample coverage showed promising results when applied to real datasets. Point estimates of species richness are of limited value without some measure of reliability; nevertheless, species richness estimates are often reported without any measure of precision. For many species richness estimators, analytically derived variance estimators exist. For others, approaches such as bootstrap and jackknife resampling can be used. In Chapter 4, I evaluated variance estimators across levels of the factors with the largest effects on species richness estimators, using a portion of the data simulated for Chapter 3.
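As a concrete illustration of the kind of comparison made in Chapter 3, the sketch below computes one well-known nonparametric estimator (Chao1, from the counts of species seen exactly once and exactly twice) and three summary measures over repeated estimates. The abstract does not define its measures, so treating bias as mean error, precision as the standard deviation of the estimates, and accuracy as root mean squared error is an assumption, as are the function names.

```python
import math
import statistics
from collections import Counter


def chao1(counts):
    """Chao1 richness estimate from per-species abundance counts in a sample."""
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)                 # species observed in the sample
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]           # singletons and doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    # Bias-corrected form, used here when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / 2.0


def performance(estimates, true_richness):
    """Return (bias, precision, accuracy) under the assumed definitions."""
    errors = [e - true_richness for e in estimates]
    bias = statistics.mean(errors)                      # mean error
    precision = statistics.stdev(estimates)             # spread of estimates
    accuracy = math.sqrt(statistics.mean([err * err for err in errors]))  # RMSE
    return bias, precision, accuracy
```

For example, `chao1([10, 5, 1, 1, 1, 2, 2])` has 7 observed species, 3 singletons, and 2 doubletons, giving 7 + 9/4 = 9.25. Because the true richness is known only in simulation, `performance` can be applied to simulated assemblages but not directly to field data.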
Variation in the species richness estimates generally increased with the true number of species. The analytical variance estimates usually exceeded those of the two resampling procedures, but all three methods were negatively biased at most factor levels. Similarly, the analytical estimators often resulted in the largest confidence interval coverage levels, though coverage was less than the nominal 95% in all except one case. Furthermore, there was generally a negative relationship between the achieved coverage level and the true number of species. Bootstrap resampling always produced the best coverage for the bootstrap species richness estimator and occasionally performed similarly well with other species richness estimators. Confidence interval coverage was, in general: 1) smallest in assemblages with log-series distributions and largest in assemblages with particulate-niche distributions, 2) positively related to effort and species detection probability, and 3) variable across species richness estimators as a function of total abundance. The abundance-based coverage estimator and its associated analytical variance estimator regularly achieved the largest coverage levels, so I recommend its use when there is little or no information to suggest that another estimator is more appropriate.
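The resampling approach to variance and the notion of confidence interval coverage can be sketched as follows. This is an illustrative outline, not the procedure used in Chapter 4: resampling individuals with replacement, the symmetric normal-approximation interval, and all function names are assumptions, and Chao1 stands in for whichever richness estimator is of interest.

```python
import math
import random
from collections import Counter


def chao1(counts):
    """Chao1 richness estimate from per-species abundance counts."""
    counts = [c for c in counts if c > 0]
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]
    if f2 > 0:
        return len(counts) + f1 * f1 / (2.0 * f2)
    return len(counts) + f1 * (f1 - 1) / 2.0


def bootstrap_interval(counts, estimator, n_boot=200, seed=None, z=1.96):
    """Return (estimate, bootstrap variance, approximate 95% interval)."""
    rng = random.Random(seed)
    # Expand counts into one species label per sampled individual.
    individuals = [sp for sp, c in enumerate(counts) for _ in range(c)]
    estimate = estimator(counts)
    replicates = []
    for _ in range(n_boot):
        resample = rng.choices(individuals, k=len(individuals))
        replicates.append(estimator(list(Counter(resample).values())))
    mean_rep = sum(replicates) / n_boot
    variance = sum((r - mean_rep) ** 2 for r in replicates) / (n_boot - 1)
    half_width = z * math.sqrt(variance)
    return estimate, variance, (estimate - half_width, estimate + half_width)


def interval_coverage(intervals, true_richness):
    """Proportion of intervals that contain the true number of species."""
    hits = sum(1 for lo, hi in intervals if lo <= true_richness <= hi)
    return hits / len(intervals)
```

In a simulation study, `bootstrap_interval` would be applied to each simulated sample and `interval_coverage` to the resulting intervals; an achieved coverage below 0.95 indicates intervals that are too narrow or poorly centered for the nominal level.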
