Browsing by Author "Hoeting, Jennifer, committee member"
Item Embargo A case for context in quantitative ecology: statistical techniques to increase efficiency, accuracy, and equity in biodiversity research (Colorado State University. Libraries, 2024) McCaslin, Hanna M., author; Bombaci, Sara, advisor; Hooten, Mevin, committee member; Koons, David, committee member; Hoeting, Jennifer, committee member
The current era of ecological research is characterized by rapid technological innovation, large datasets, and numerous computational and quantitative techniques. Together, big data and advanced computing are expanding our understanding of natural systems, allowing us to capture more complexity in our models, and helping us find solutions for salient challenges facing modern ecology and conservation, including climate change and biodiversity loss. However, large datasets are often characterized by noise, complex observational processes, and other challenges that can impede our ability to apply these data to address ecological research gaps. In each chapter of this dissertation, I seek to address a data problem inherent to the 'big data' that characterizes modern ecological research. Together, they extend the strategies available for addressing a problem facing many ecologists – how to make use of the large volumes of data we are collecting given (1) current computational limitations and (2) specific sampling biases that characterize various methods for data collection. In the first chapter, I present a recursive Bayesian computing (RB) method that can be used to fit Bayesian hierarchical models in sequential MCMC stages to ease computation and streamline hierarchical inference. I also demonstrate the application of transformation-assisted RB (TARB) to a hierarchical animal movement model to create unsupervised MCMC algorithms and obtain inference about individual- and population-level migratory characteristics.
This recursive procedure reduced computation time for fitting our hierarchical movement model by half compared to fitting the model with a single MCMC algorithm. Transformation-assisted RB is a relatively accessible method for reducing the computational demands of fitting complex ecological statistical models, like those for animal movement, multi-species systems, or large spatial and temporal scales. Biodiversity monitoring projects that rely on collaborative, crowdsourced data collection are characterized by huge volumes of data that represent a major facet of 'big data ecology,' and quantitative methods designed to use these data for ecological research and conservation represent a leading edge of contemporary quantitative ecology. However, because participants select where to observe biodiversity, crowdsourced data are often influenced by sampling bias, including being biased toward affluent, white neighborhoods in urban areas. Despite the growing evidence of social sampling bias, research has yet to explore how socially driven sampling bias impacts inference and prediction informed by crowdsourced data, or if existing data pre-processing or analytical methods can effectively mitigate this bias. Thus, in Chapters 2 and 3, I explored social sampling bias in data from the crowdsourced avian biodiversity platform eBird. In Chapter 2, I studied patterns of social sampling bias in the locations of eBird "hotspots" to determine whether hotspots in Fresno, California, U.S.A. are more biased by social factors than the locations of Fresno eBird observations overall. My findings support previous work showing that eBird locations are biased by demographics. Further, I found that demographic bias is most pronounced in the locations of hotspots specifically, with hotspots being more likely to occur in areas with higher proportions of non-Hispanic white residents than eBird locations overall. 
This relationship is reinforced because hotspots in these predominantly white areas also amass more eBird checklists overall than hotspots in areas with more demographic diversity. These findings raise concerns that the eBird hotspot system may be exacerbating spatial bias in sampling and reinforcing patterns of inequity in data availability and eBird participation, by leading to datasets and user-facing maps of birding hotspots that mostly represent predominantly white neighborhoods. Then, in Chapter 3, I investigated the impacts of not accounting for socially biased sampling when using eBird data to study patterns of urban biodiversity. The luxury effect has emerged as a prominent hypothesis in urban ecology, describing a pattern of higher biodiversity associated with greater socioeconomic status observed in many cities. Using eBird data from 2015-2019, I tested whether an avian luxury effect is observed in Raleigh-Durham, North Carolina, U.S.A. before and after accounting for social sampling bias. By jointly modeling sampling intensity and species richness, I found that sampling intensity and species richness are positively correlated and sampling bias influences the estimated relationship between species richness and income. Thus, failing to account for sampling bias can hinder our ability to accurately observe social-ecological dynamics. Additionally, I found that randomly spatially subsampling eBird data prior to analysis, as recommended by existing guidelines to mitigate sampling bias in eBird data, does not reduce biased sampling related to demographics, because there are data gaps in communities of color and low-income communities that cannot be addressed via spatial subsampling. Therefore, it is paramount that crowdsourced and contributory science projects prioritize more equitable participation in their platforms, both for more ethical, equitable practice and because current sampling inequity negatively impacts data quality and project goals. 
Quantitative techniques can help us understand the complex observational processes influencing ecological data, and each chapter of this dissertation highlights how tailoring statistical or computing methods to these observational contexts can advance ecological knowledge – either by extending the complexity of models we can feasibly fit, as in Chapter 1, or by acknowledging and accounting for sampling inequity, in Chapters 2 and 3. We are all participants actively shaping the ecological processes we observe, and the actions, approaches, and assumptions used in our research reflect societal systems and biases. Data are never objective, and it is dangerous and false to assume that quantitative techniques can take data out of the contexts in which they were collected. Instead, quantitative frameworks that embrace, reflect, and seek to improve the ways in which social and observational contexts inform what is observed can elevate analytical techniques to tools towards more just, inclusive, and transparent ecological research and conservation.

Item Open Access Advances in statistical analysis and modeling of extreme values motivated by atmospheric models and data products (Colorado State University. Libraries, 2018) Fix, Miranda J., author; Cooley, Daniel, advisor; Hoeting, Jennifer, committee member; Wilson, Ander, committee member; Barnes, Elizabeth, committee member
This dissertation presents applied and methodological advances in the statistical analysis and modeling of extreme values. We detail three studies motivated by the types of data found in the atmospheric sciences, such as deterministic model output and observational products. The first two investigations represent novel applications and extensions of extremes methodology to climate and atmospheric studies. The third investigation proposes a new model for areal extremes and develops methods for estimation and inference from the proposed model.
We first detail a study which leverages two initial condition ensembles of a global climate model to compare future precipitation extremes under two climate change scenarios. We fit non-stationary generalized extreme value (GEV) models to annual maximum daily precipitation output and compare impacts under the RCP8.5 and RCP4.5 scenarios. A methodological contribution of this work is to demonstrate the potential of a "pattern scaling" approach for extremes, in which we produce predictive GEV distributions of annual precipitation maxima under RCP4.5 given only global mean temperatures for this scenario. We compare results from this less computationally intensive method to those obtained from our GEV model fitted directly to the RCP4.5 output and find that pattern scaling produces reasonable projections. The second study examines, for the first time, the capability of an atmospheric chemistry model to reproduce observed meteorological sensitivities of high and extreme surface ozone (O3). This work develops a novel framework in which we make three types of comparisons between simulated and observational data, comparing (1) tails of the O3 response variable, (2) distributions of meteorological predictor variables, and (3) sensitivities of high and extreme O3 to meteorological predictors. This last comparison is made using quantile regression and a recent tail dependence optimization approach. Across all three study locations, we find substantial differences between simulations and observational data in both meteorology and meteorological sensitivities of high and extreme O3. The final study is motivated by the prevalence of large gridded data products in the atmospheric sciences, and presents methodological advances in the (finite-dimensional) spatial setting. Existing models for spatial extremes, such as max-stable process models, tend to be geostatistical in nature as well as very computationally intensive. 
Instead, we propose a new model for extremes of areal data, with a common-scale extension, that is inspired by the simultaneous autoregressive (SAR) model in classical spatial statistics. The proposed model extends recent work on transformed-linear operations applied to regularly varying random vectors, and is unique among extremes models in being directly analogous to a classical linear model. We specify a sufficient condition on the spatial dependence parameter such that our extreme SAR model has desirable properties. We also describe the limiting angular measure, which is discrete, and corresponding tail pairwise dependence matrix (TPDM) for the model. After examining model properties, we then investigate two approaches to estimation and inference for the common-scale extreme SAR model. First, we consider a censored likelihood approach, implemented using Bayesian MCMC with a data augmentation step, but find that this approach is not robust to model misspecification. As an alternative, we develop a novel estimation method that minimizes the discrepancy between the TPDM for the fitted model and the estimated TPDM, and find that it is able to produce reasonable estimates of extremal dependence even in the case of model misspecification.

Item Open Access Bayesian methods for environmental exposures: mixtures and missing data (Colorado State University. Libraries, 2022) Hoskovec, Lauren, author; Wilson, Ander, advisor; Magzamen, Sheryl, committee member; Hoeting, Jennifer, committee member; Cooley, Dan, committee member
Air pollution exposure has been linked to increased morbidity and mortality. Estimating the association between air pollution exposure and health outcomes is complicated by simultaneous exposure to multiple pollutants, referred to as a multipollutant mixture. In a multipollutant mixture, exposures may have both independent and interactive effects on health. In addition, observational studies of air pollution exposure often involve missing data.
In this dissertation, we address challenges related to model choice and missing data when studying exposure to a mixture of environmental pollutants. First, we conduct a formal simulation study of recently developed methods for estimating the association between a health outcome and exposure to a multipollutant mixture. We evaluate methods on their performance in estimating the exposure-response function, identifying mixture components associated with the outcome, and identifying interaction effects. Other studies have reviewed the literature or compared performance on a single data set; however, none have formally compared such a broad range of new methods in a simulation study. Second, we propose a statistical method to analyze multiple asynchronous multivariate time series with missing data for use in personal exposure assessments. We develop an infinite hidden Markov model for multiple time series to impute missing data and identify shared time-activity patterns in exposures. We estimate hidden states that represent latent environments presenting a unique distribution of a mixture of environmental exposures. Through our multiple imputation algorithm, we impute missing exposure data conditional on the hidden states. Finally, we conduct an individual-level study of the association between long-term exposure to air pollution and COVID-19 severity in a Denver, Colorado, USA cohort. We develop a Bayesian multinomial logistic regression model for data with partially missing categorical outcomes. Our model uses Polya-gamma data augmentation, and we propose a visualization approach for inference on the odds ratio. We conduct one of the first individual-level studies of air pollution exposure and COVID-19 health outcomes using detailed clinical data and individual-level air pollution exposure data.

Item Open Access Characteristics of wildfire-igniting lightning in the western United States (Colorado State University.
Libraries, 2015) Burris, Lucy Ellen, author; Sibold, Jason, advisor; Kelly, Eugene, committee member; Hoeting, Jennifer, committee member
Annually, over half the wildfires on federal lands in the conterminous western United States are caused by lightning. However, broad-scale characteristics of wildfire-igniting lightning flashes are poorly understood, which limits our ability to predict what effect climate change might have on lightning patterns and, in turn, on future patterns of wildfire. I investigated lightning-wildfire relationships by comparing the characteristics of lightning flashes that start fires to those that do not across 29 ecoregions in the western US from 2003-2007. After accounting for ecoregional variation, I found little meaningful difference in characteristics of igniting flashes including the proportion of positive flashes, proportion of negative flashes with long continuing current, number of strokes per flash (multiplier), or flash peak current (all attributes thought to be related to ignition potential). In contrast, I found that wildfires are associated with significantly higher lightning flash densities near fire locations compared to farther away. However, the role of flash density varied significantly between ecoregions. Given the non-uniqueness of igniting flashes, simple proxies such as storm frequency or intensity may be sufficient to estimate likelihood of lightning ignitions under changing climatic conditions. However, these estimates must be mediated based on ecosystem response to potential ignitions.

Item Open Access Classification ensemble methods for mitigating concept drift within online data streams (Colorado State University. Libraries, 2012) Barber, Michael J., author; Howe, Adele E., advisor; Anderson, Charles, committee member; Hoeting, Jennifer, committee member
The task of instance classification within very large data streams is challenged by both the overwhelming amount of data and a phenomenon known as concept drift.
In this research we provide a comprehensive comparison of several state-of-the-art ensemble methods that purport to handle concept drift, and we propose two additional algorithms. Our two new methods, the AMPE and AMPE2 algorithms, are then used to further our understanding of concept drift and the algorithmic factors that influence the performance of ensemble-based concept drift algorithms.

Item Open Access Evaluation of parameter and model uncertainty in simple applications of a 1D sediment transport model (Colorado State University. Libraries, 2011) Sabatine, Shaina M., author; Niemann, Jeffrey D., advisor; Greimann, Blair, committee member; Hoeting, Jennifer, committee member
This paper aims to quantify parameter and model uncertainty in simulations from a 1D sediment transport model using two methods from Bayesian statistics. The first method, Multi-Variable Shuffled Complex Evolution Metropolis - Uncertainty Analysis (MSU), is an algorithm that identifies the most likely parameter values and estimates parameter uncertainty for models with multiple outputs. The other method, Bayesian Model Averaging (BMA), determines a combined prediction based on three sediment transport equations and evaluates the uncertainty associated with the selection of a transport equation. These tools are applied to simulations of three flume experiments. Results show that MSU's ability to consider correlation between parameters improves its estimate of the uncertainty in the model forecasts. Also, BMA results suggest that a combination of transport equations usually provides a better forecast than an individual equation, and the selection of a single transport equation substantially increases the overall uncertainty in the model forecasts.

Item Open Access Improved estimation and prediction for computationally expensive ecological and paleoclimate models (Colorado State University.
Libraries, 2016) Tipton, John, author; Hooten, Mevin, advisor; Opsomer, Jean, advisor; Hoeting, Jennifer, committee member; Aldridge, Cameron, committee member
In this dissertation, we present statistical methods to evaluate estimation and prediction performance for applied ecological problems. We explore a variety of applied problems and, within this context, we investigate how each method performs. We evaluate empirical performance of a model-based estimator of mean percent canopy cover using a representative United States Forest Service Forest Inventory and Analysis dataset. For two paleoclimate reconstructions, we develop novel modeling methodologies and evaluate model performance using both resampling and simulation methods. In each application, we use proper scoring rules while leveraging parallel computing and computational techniques that allow fitting of complex models in a finite amount of time.

Item Open Access Investigating experimental and environmental factors to provide a mechanistic understanding of benthic algal biomass accumulation in freshwater streams (Colorado State University. Libraries, 2019) Beck, Whitney S., author; Poff, N. LeRoy, advisor; Hall, Ed, committee member; Hoeting, Jennifer, committee member; Spaulding, Sarah, committee member
To view the abstract, please see the full text of the document.

Item Open Access Model based analyses of the cesium dynamics in Pond 4, Savannah River Site (Colorado State University. Libraries, 2018) Miller, Vivien, author; Johnson, Thomas E., advisor; Brandl, Alexander, committee member; Sudowe, Ralf, committee member; Hoeting, Jennifer, committee member
To view the abstract, please see the full text of the document.

Item Open Access Probabilistic foundation of nonlocal diffusion and formulation and analysis for elliptic problems on uncertain domains (Colorado State University.
Libraries, 2011) Burch, Nathanial J., author; Estep, Donald, advisor; Hoeting, Jennifer, committee member; Lehoucq, Richard, committee member; Shipman, Patrick, committee member; Tavener, Simon, committee member
In the first part of this dissertation, we study the nonlocal diffusion equation with so-called Lévy measure ν as the master equation for a pure-jump Lévy process. In the case ν ∈ L¹(ℝ), a relationship to fractional diffusion is established in a limit of vanishing nonlocality, which implies the convergence of a compound Poisson process to a stable process. In the case ν ∉ L¹(ℝ), the smoothing of the nonlocal operator is shown to correspond precisely to the activity of the underlying Lévy process and the variation of its sample paths. We introduce volume-constrained nonlocal diffusion equations and demonstrate that they are the master equations for Lévy processes restricted to a bounded domain. The ensuing variational formulation and conforming finite element method provide a powerful tool for studying both Lévy processes and fractional diffusion on bounded, non-simple geometries with volume constraints. In the second part of this dissertation, we consider the problem of estimating the distribution of a quantity of interest computed from the solution of an elliptic partial differential equation posed on a domain Ω(θ) ⊂ ℝ² with a randomly perturbed boundary, where θ is a random vector with given probability structure. We construct a piecewise smooth transformation from a partition of Ω(θ) to a reference domain Ω in order to avoid the complications associated with solving the problems on Ω(θ). The domain decomposition formulation is exploited by localizing the effect of the randomness to boundary elements in order to achieve a computationally efficient Monte Carlo sampling procedure.
An a posteriori error analysis for the approximate distribution, which includes a deterministic error for each sample and a stochastic error from the effect of sampling, is also presented. We thus provide an efficient means to estimate the distribution of a quantity of interest via a Monte Carlo sampling procedure while also providing a posteriori error estimates for each sample.

Item Open Access Semiparametric regression in the presence of complex variance structures arising from small angle x-ray scattering data (Colorado State University. Libraries, 2014) Bugbee, Bruce D., author; Breidt, F. Jay, advisor; Estep, Don, advisor; Meyer, Mary, committee member; Hoeting, Jennifer, committee member; Luger, Karolin, committee member
An ongoing problem in structural biology is how best to infer structural information for complex, biological macromolecules from indirect observational data. Molecular shape dictates functionality but is not always directly observable. There exists a wide class of experimental methods whose data can be used for indirectly inferring molecular shape features with varying degrees of resolution. Of these methods, small angle X-ray scattering (SAXS) is desirable due to low requirements on the sample of interest. However, SAXS data suffers numerous statistical problems that require the development of novel methodologies. A primary concern is the impact of radially reducing two-dimensional sensor data to a series of smooth mean and variance curves. Additionally, pronounced heteroskedasticity is often observed near sensor boundaries. The work presented here focuses on developing general model frameworks and implementation methods appropriate for SAXS data. Semiparametric regression refers to models that combine known parametric structures with flexible nonparametric components. Three semiparametric regression model frameworks that are well-suited for handling smooth data are presented.
The first model introduced is the standard semiparametric regression model, described as a mixed model with low rank penalized splines as random effects. The second model extends the first to the case of heteroskedastic errors, which violate standard model assumptions. The latent variance function in the model is estimated through an additional semiparametric regression, allowing for appropriate uncertainty estimation at the mean level. The final model considers a data structure unique to SAXS experiments. This model incorporates both radial mean and radial variance data in hopes to better infer three-dimensional shape properties and understand experimental effects by including all available data. Each of the three model frameworks is structured hierarchically. Bayesian inference is appealing in this context, as it provides efficient and generalized modeling frameworks in a unified way. The main statistical contributions of this thesis are from the specific methods developed to address the computational challenges of Bayesian inference for these models. The contributions include new Markov Chain Monte Carlo (MCMC) procedures for numerical approximation of posterior distributions and novel variational approximations that are extremely fast and accurate. For the heteroskedastic semiparametric case, known form posterior conditionals are available for all model parameters save for the regression coefficients controlling the latent model variance function. A novel implementation of a multivariate delayed rejection adaptive Metropolis (DRAM) procedure is used to sample from this posterior conditional distribution. The joint model for radial mean and radial variance data is shown to be of comparable structure to the heteroskedastic case and the new DRAM methodology is extended to handle this case. Simulation studies of all three methods are provided, showing that these models provide accurate fits of observed data and latent variance functions. 
The demands of scientific data processing in the context of SAXS, where large data sets are rapidly attained, lead to consideration of fast approximations as alternatives to MCMC. Variational approximations, or variational Bayes, describe a class of approximation methods in which the posterior distribution of the parameters is approximated by minimizing the Kullback-Leibler divergence between the true posterior and a class of distributions under mild structural constraints. Variational approximations have been shown to be good approximations of true posteriors in many cases. A novel variational approximation for the general heteroskedastic semiparametric regression model is derived here. Simulation studies are provided demonstrating fit and coverage properties comparable to the DRAM results at a fraction of the computational cost. A variational approximation for the joint model of radial mean and variance data is also provided but is shown to suffer from poor performance due to high correlation across a subset of regression parameters. The heteroskedastic semiparametric regression framework has some strong structural relationships with a distinct, important problem: spatially adaptive smoothing. A noisy function with different amounts of smoothness over its domain may be systematically under-smoothed or over-smoothed if the smoothing is not spatially adaptive. A novel variational approximation is derived for the problem of spatially adaptive penalized spline regression, and shown to have excellent performance. This approximation method is shown to be able to fit highly oscillatory data while not requiring the traditional tuning and computational resources of standard MCMC implementations. Potential scientific contributions of the statistical methodology developed here are illuminated with SAXS data examples. Analysis of SAXS data typically has two primary concerns: description of experimental effects and estimation of physical shape parameters.
Formal statistical procedures for testing the effect of sample concentration and exposure time are presented as alternatives to current methods, in which data sets are evaluated subjectively and often combined in ad hoc ways. Additionally, estimation procedures for the scattering intensity at zero angle, known to be proportional to molecular weight, and the radius of gyration are described along with appropriate measures of uncertainty. Finally, a brief example of the joint radial mean and variance method is provided. Guidelines for extending the models presented here to more complex SAXS problems are also given.

Item Open Access Statistical models for animal movement and landscape connectivity (Colorado State University. Libraries, 2013) Hanks, Ephraim M., author; Hooten, Mevin B., advisor; Hoeting, Jennifer, committee member; Wang, Haonan, committee member; Alldredge, Mat, committee member; Theobald, David, committee member
This dissertation considers statistical approaches to the study of animal movement behavior and landscape connectivity, with particular attention paid to modeling how movement and connectivity are influenced by landscape characteristics. For animal movement data, a novel continuous-time, discrete-space model of animal movement is proposed. This model yields increased computational efficiency relative to existing discrete-space models for animal movement, and a more flexible modeling framework than existing continuous-space models. In landscape genetic approaches to landscape connectivity, spatially-referenced genetic allele data are used to study landscape effects on gene flow. An explicit link is described between a common circuit-theoretic approach to landscape genetics and variogram fitting for Gaussian Markov random fields.
A hierarchical model for landscape genetic data is also proposed, with a multinomial data model and latent spatial random effects to model spatial correlation.

Item Open Access The pooling of prior distributions via logarithmic and supra-Bayesian methods with application to Bayesian inference in deterministic simulation models (Colorado State University. Libraries, 1998) Roback, Paul J., author; Givens, Geof, advisor; Hoeting, Jennifer, committee member; Howe, Adele, committee member; Tweedie, Richard, committee member
We consider Bayesian inference when priors and likelihoods are both available for inputs and outputs of a deterministic simulation model. Deterministic simulation models are used frequently by scientists to describe natural systems, and the Bayesian framework provides a natural vehicle for incorporating uncertainty in a deterministic model. The problem of making inference about parameters in deterministic simulation models is fundamentally related to the issue of aggregating (i.e., pooling) expert opinion. Alternative strategies for aggregation are surveyed and four approaches are discussed in detail: logarithmic pooling, linear pooling, French-Lindley supra-Bayesian pooling, and Lindley-Winkler supra-Bayesian pooling. The four pooling approaches are compared with respect to three suitability factors (theoretical properties, performance in examples, and the selection and sensitivity of hyperparameters or weightings incorporated in each method), and the logarithmic pool is found to be the most appropriate pooling approach when combining expert opinions in the context of deterministic simulation models. We develop an adaptive algorithm for estimating log pooled priors for parameters in deterministic simulation models. Our adaptive estimation approach relies on importance sampling methods, density estimation techniques for which we numerically approximate the Jacobian, and nearest neighbor approximations in cases in which the model is noninvertible.
This adaptive approach is compared to a nonadaptive approach over several examples ranging from a relatively simple ℝ¹ → ℝ¹ example with normally distributed priors and a linear deterministic model, to a relatively complex ℝ² → ℝ² example based on the bowhead whale population model. In each case, our adaptive approach leads to better and more efficient estimates of the log pooled prior than the nonadaptive estimation algorithm. Finally, we extend our inferential ideas to a higher-dimensional, realistic model for AIDS transmission. Several unique contributions to the statistical discipline are contained in this dissertation, including: 1. the application of logarithmic pooling to inference in deterministic simulation models; 2. the algorithm for estimating log pooled priors using an adaptive strategy; 3. the Jacobian-based approach to density estimation in this context, especially in higher dimensions; 4. the extension of the French-Lindley supra-Bayesian methodology to continuous parameters; 5. the extension of the Lindley-Winkler supra-Bayesian methodology to multivariate parameters; and 6. the proofs and illustrations of the failure of Relative Propensity Consistency under the French-Lindley supra-Bayesian approach.

Item Open Access Time-filtered inverse modeling of land-atmosphere carbon exchange (Colorado State University. Libraries, 2015) Geyer, Nicholas M., author; Denning, Scott, advisor; Hoeting, Jennifer, committee member; O'Dell, Christopher, committee member
The sources and sinks of biospheric carbon dioxide represent one of the least understood and most critical processes in carbon science. Since the 1990s, carbon dioxide inversion models have estimated the magnitude, location, and uncertainty of carbon sources and sinks. These inversions are underconstrained statistical problems that employ aggressive statistical regularizations in both space and time to estimate quantities like net ecosystem exchange (NEE) on weekly timescales over fine spatial scales.
This study developed and tested a new regularization that focuses the available observational information on a small number of estimates associated with the longer-lived, slowly varying biospheric processes that control time-averaged sources and sinks of carbon dioxide. This approach multiplicatively adjusts the longer-lived component fluxes, gross primary production (GPP) and total respiration (RESP), using several timescale harmonics. This methodology was tested by estimating adjustments to either net or component fluxes from the Simple Biosphere Model 4 (SiB4) using observational data from eight eddy-covariance flux towers selected from the North American Carbon Program (NACP) site synthesis dataset. The time-filtering methodology robustly and accurately estimated both net and component fluxes under high observational uncertainty. Furthermore, the methodology was flexible enough to correctly estimate all three fluxes when given a component flux as an additional observational constraint.
Item Open Access Understanding extreme behavior by optimizing tail dependence with application to ground level ozone via data mining and spatial modeling (Colorado State University. Libraries, 2015) Russell, Brook T., author; Cooley, Daniel S., advisor; Hoeting, Jennifer, committee member; Wang, Haonan, committee member; Schumacher, Russ, committee member
This dissertation presents novel work in statistical methods for extremes. Our underlying modeling procedure identifies the linear combination of covariates that is associated with extreme values of a response variable, and is based on the framework of bivariate regular variation. We propose a data mining strategy that is suitable for an analysis of ground level ozone, and spatially model the primary drivers of extreme ozone over a large study region. In this dissertation, we first review statistical methods for univariate and multivariate extremes. 
We then discuss tail dependence parameters and their estimators and introduce γ, a tail dependence metric that is better suited for optimization than other existing metrics. We also introduce the idea of tail dependence estimators that utilize a smooth threshold rather than the 'hard' threshold common to extremes. A smooth threshold is necessary to perform optimization, which has not previously been considered in extremes studies. We also show consistency of estimators with smooth thresholds. Subsequently, we outline our procedure for optimizing tail dependence and discuss parameter estimation. We also propose a model selection procedure that is based on cross-validation. We then present a simulation study in which we demonstrate our method's ability to detect complicated conditions that lead to extreme behavior and compare our approach to competing methods. Next, we propose a data mining procedure that can be used to find the set of covariates that produces the linear combination that has the highest degree of tail dependence with a response variable. Our data mining procedure is a model selection exercise where the model space is too large to be searched exhaustively. We use an automated model search procedure based on simulated annealing. We also give an analysis of ground level ozone, applying our data mining procedure to data from Atlanta, Georgia and Charlotte, North Carolina. We discuss how our method can be modified to deal with non-continuous covariates such as precipitation. Lastly, we seek to model how a set of primary drivers varies spatially over a study region. We utilize data from 160 EPA stations in 13 US states plus the District of Columbia. We model the parameters in our extreme value procedure spatially using a hierarchical modeling technique. For inference, we utilize a two-step procedure.
Item Open Access Wolves, elk, and willows: alternate states and transition thresholds on Yellowstone's northern range (Colorado State University. 
Libraries, 2012) Marshall, Kristin N., author; Cooper, David, advisor; Hobbs, N. Thompson, advisor; Hoeting, Jennifer, committee member; Theobald, David, committee member
The detection and prediction of alternate states of ecosystem configuration is of increasing importance in our changing world. Ecosystems may be perturbed by shifts in climate, or by human activity. Many perturbations to ecosystems can be reversed by reducing the initiating stressor. Sometimes shifts in ecosystem states are irreversible, and alternate configurations persist long after the initiating stressor is reduced. The reintroduction of wolves to Yellowstone National Park 17 years ago provided a rare opportunity to study whether the effects of predation could restore an ecosystem degraded by herbivory. Wolves were absent from the Yellowstone ecosystem for approximately 70 years. When wolves were absent, elk numbers increased and heavy herbivory degraded vegetation communities, particularly in riparian areas. Herbivory induced an alternate state in riparian vegetation, where willows, once dominant, were rare on the landscape and short in stature. My dissertation research describes how the top-down effects of predation and herbivory interact with the bottom-up effects of resource availability in northern range riparian areas. My research addressed three questions: 1) How do water table depth and browsing intensity constrain willow height and annual production? 2) What is the role of landscape heterogeneity in determining spatial variation in the configuration of alternate states? 3) How have climate patterns interacted with trophic effects of ungulates and wolves over the last 40 years to shape willow canopy cover, growth, and establishment? My work provides a broad understanding of limitations to willow growth on the northern range and reveals that wolf reintroduction has not restored riparian areas. 
A decade-long experiment showed that the effects of removing herbivory on willow height and production depend on water table depth. My second study showed that topography and temporal variation in water table depth influence willow height and growth more strongly than does herbivory. My third study found that bottom-up effects of growing season length and precipitation drive patterns in willow height over four decades. Far less support existed for the effects of elk and wolves on willows through time. All of these studies led to the conclusion that bottom-up effects of resource limitation influence northern range willows more strongly than top-down effects of top predators or herbivores. Results from my research show that wolf reintroduction has not uniformly restored riparian areas along small streams on the northern range. Instead, water table depth, topography, and climate drivers influence willows more strongly than herbivory or wolves.