Department of Statistics
These digital collections include theses, dissertations, and datasets from the Department of Statistics. Due to departmental name changes, materials from the following historical department are also included here: Mathematics and Statistics.
Browsing Department of Statistics by Author "Adams, Henry, committee member"
Item Open Access
Modeling the upper tail of the distribution of facial recognition non-match scores
(Colorado State University. Libraries, 2016) Hunter, Brett D., author; Cooley, Dan, advisor; Givens, Geof, advisor; Kokoszka, Piotr, committee member; Fosdick, Bailey, committee member; Adams, Henry, committee member
In facial recognition applications, the upper tail of the distribution of non-match scores is of interest because existing algorithms classify a pair of images as a match if their score exceeds some high quantile of the non-match distribution. I construct a general model for the distribution above the (1-τ)th quantile borrowing ideas from extreme value theory. The resulting distribution can be viewed as a reparameterized generalized Pareto distribution (GPD), but it differs from the traditional GPD in that τ is treated as fixed. Inference for both the (1-τ)th quantile u_τ and the GPD scale and shape parameters is performed via M-estimation, where my objective function is a combination of the quantile regression loss function and reparameterized GPD densities. By parameterizing u_τ and the GPD parameters in terms of available covariates, understanding of these covariates' influence on the tail of the distribution of non-match scores is attained. A simulation study shows that my method is able to estimate both the set of parameters describing the covariates' influence and high quantiles of the non-match distribution. The simulation study also shows that my model is competitive with quantile regression in estimating high quantiles and that it outperforms quantile regression for extremely high quantiles.
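For reference, the two standard ingredients named in this abstract can be written out as below. These are the textbook forms of the quantile regression (check) loss and the GPD tail, not the dissertation's reparameterization, which the abstract does not spell out. The symbols σ (scale), ξ (shape), ρ, r, and p are notation assumed here for illustration.

```latex
% Check loss for the (1-\tau)th quantile, with residual r = y - u_\tau
\rho_{1-\tau}(r) = r \,\bigl[(1-\tau) - \mathbf{1}\{r < 0\}\bigr]

% GPD approximation to the tail above the threshold u_\tau,
% where P(Y > u_\tau) = \tau is held fixed
P(Y > y) = \tau \left(1 + \xi\,\frac{y - u_\tau}{\sigma}\right)^{-1/\xi},
\qquad y > u_\tau,\quad 1 + \xi\,\frac{y - u_\tau}{\sigma} > 0

% Resulting quantile exceeded with probability p < \tau (for \xi \neq 0)
y_p = u_\tau + \frac{\sigma}{\xi}\left[\left(\frac{\tau}{p}\right)^{\xi} - 1\right]
```

The last line is obtained by setting the tail probability equal to p and solving for y, which is how a fitted tail model yields quantiles beyond the threshold.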
I apply my method to a data set of non-match scores and find that covariates such as gender, use of glasses, and age difference have a strong influence on the tail of the non-match distribution.

Item Open Access
New methods for fixed-margin binary matrix sampling, Fréchet covariance, and MANOVA tests for random objects in multiple metric spaces
(Colorado State University. Libraries, 2022) Fout, Alex M., author; Fosdick, Bailey, advisor; Kaplan, Andee, committee member; Cooley, Daniel, committee member; Adams, Henry, committee member
Many approaches to the analysis of network data essentially view the data as Euclidean and apply standard multivariate techniques. In this dissertation, we refrain from this approach, exploring two alternate approaches to the analysis of networks and other structured data. The first approach seeks to determine how unique an observed simple, directed network is by comparing it to like networks which share its degree distribution. Generating networks for comparison requires sampling from the space of all binary matrices with the prescribed row and column margins, since enumeration of all such matrices is often infeasible for even moderately sized networks with 20-50 nodes. We propose two new sampling methods for this problem. First, we extend two Markov chain Monte Carlo methods to sample from the space non-uniformly, allowing flexibility in the case that some networks are more likely than others. We show that non-uniform sampling could impede the MCMC process, but in certain special cases is still valid. Critically, we illustrate the differential conclusions that could be drawn from uniform vs. nonuniform sampling. Second, we develop a generalized divide and conquer approach which recursively divides matrices into smaller subproblems which are much easier to count and sample. Each division step reveals interesting mathematics involving the enumeration of integer partitions and points in convex lattice polytopes.
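To make the fixed-margin sampling problem described above concrete, here is a minimal sketch of the classic "checkerboard swap" Markov chain, which moves between binary matrices while keeping every row and column sum unchanged. The abstract does not name the two MCMC methods the dissertation extends, so this is an illustrative stand-in rather than the author's method. It targets uniform sampling only and ignores the zero-diagonal constraint of a simple directed network. The function name `checkerboard_swap_chain` and its parameters are hypothetical.

```python
import random


def checkerboard_swap_chain(matrix, n_steps, seed=0):
    """Run a swap chain on a 0/1 matrix that preserves all row and column sums.

    Illustrative sketch only: uniform target, no structural zeros,
    and the matrix must have at least two rows and two columns.
    """
    rng = random.Random(seed)
    m = [row[:] for row in matrix]  # work on a copy; the input is left untouched
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_steps):
        # Propose a random 2x2 submatrix (two distinct rows, two distinct columns).
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        # A "checkerboard" is [[1, 0], [0, 1]] or [[0, 1], [1, 0]].
        is_checkerboard = (
            m[r1][c1] == m[r2][c2]
            and m[r1][c2] == m[r2][c1]
            and m[r1][c1] != m[r1][c2]
        )
        if is_checkerboard:
            # Flipping the checkerboard keeps both row sums and both column sums.
            m[r1][c1], m[r1][c2] = m[r1][c2], m[r1][c1]
            m[r2][c1], m[r2][c2] = m[r2][c2], m[r2][c1]
        # Otherwise the chain stays where it is, which keeps the proposal symmetric.
    return m
```

A call such as `checkerboard_swap_chain(adjacency, 10_000)` returns one matrix with the same margins as `adjacency`. Non-uniform sampling, as studied in the dissertation, would require an additional acceptance step that this sketch does not include.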
The second broad approach we explore is comparing random objects in metric spaces lacking a coordinate system. Traditional definitions of the mean and variance no longer apply, and standard statistical tests have needed reconceptualization in terms of only distances in the metric space. We consider the multivariate setting where random objects exist in multiple metric spaces, which can be thought of as distinct views of the random object. We define the notion of Fréchet covariance to measure dependence between two metric spaces, and establish consistency for the sample estimator. We then propose several tests for differences in means and covariance matrices among two or more groups in multiple metric spaces, and compare their performance on scenarios involving random probability distributions and networks with node covariates.
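As a small illustration of working "in terms of only distances", the sketch below computes a sample Fréchet mean and Fréchet variance from a pairwise distance matrix. It makes a simplifying assumption that is not from the abstract: the minimizer is searched only among the observed objects (a medoid-style approximation), whereas the true Fréchet mean minimizes over the whole metric space. The dissertation's Fréchet covariance is not defined in the abstract and is not attempted here. The function name `frechet_mean_index` is hypothetical.

```python
def frechet_mean_index(dist):
    """Return (index, variance) for a sample Fréchet mean restricted to the sample.

    dist[i][j] is the distance between objects i and j. The returned index
    minimizes the average squared distance to all objects, and the returned
    variance is that minimal average squared distance.
    """
    n = len(dist)
    # Average squared distance from each candidate object to every object.
    costs = [sum(d * d for d in dist[i]) / n for i in range(n)]
    best = min(range(n), key=costs.__getitem__)
    return best, costs[best]


# Usage: four points on the real line, described only by their distances.
points = [0.0, 1.0, 2.0, 10.0]
dist = [[abs(a - b) for b in points] for a in points]
mean_idx, variance = frechet_mean_index(dist)
```

Because only `dist` is used, the same function applies unchanged to networks, probability distributions, or any other objects for which a metric is available.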