Browsing by Author "Bates, Dan, committee member"
Now showing 1 - 11 of 11
Item Open Access
Algorithms for feature selection and pattern recognition on Grassmann manifolds (Colorado State University. Libraries, 2015)
Chepushtanova, Sofya, author; Kirby, Michael, advisor; Peterson, Chris, committee member; Bates, Dan, committee member; Ben-Hur, Asa, committee member
This dissertation presents three distinct application-driven research projects united by ideas and topics from geometric data analysis, optimization, computational topology, and machine learning. We first consider the hyperspectral band selection problem, solved using sparse support vector machines (SSVMs). A supervised embedded approach is proposed, exploiting the property of SSVMs to exhibit a model structure with a clearly identifiable gap between zero and non-zero feature-vector weights, which permits important bands to be definitively selected in conjunction with the classification problem. An SSVM is trained using bootstrap aggregating to obtain a sample of SSVM models and thereby reduce variability in the band selection process. This preliminary band selection is followed by a secondary selection in which the SSVM is retrained to further reduce the set of bands retained. We propose and compare three adaptations of the SSVM band selection algorithm for the multiclass problem, and illustrate the performance of these methods on two benchmark hyperspectral data sets. Second, we propose an approach for capturing the signal variability in data using the framework of the Grassmann manifold (Grassmannian). Labeled points from each class are sampled and used to form abstract points on the Grassmannian. The resulting points have representations as orthonormal matrices and as such do not reside in Euclidean space in the usual sense. A variety of metrics allow us to determine distance matrices that can be used to realize the Grassmannian as an embedding in Euclidean space.
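As a concrete illustration of the subspace distances mentioned above, here is a minimal sketch (assuming NumPy; the variable names are illustrative and not from the dissertation) of the chordal distance between two points on a Grassmannian, computed from the principal angles via an SVD:

```python
import numpy as np

def chordal_distance(A, B):
    """Chordal distance between subspaces spanned by the columns of A and B.

    A and B have orthonormal columns (points on a Grassmannian). The singular
    values of A^T B are the cosines of the principal angles between the
    subspaces; the chordal distance is the square root of the sum of squared
    sines of those angles.
    """
    cosines = np.linalg.svd(A.T @ B, compute_uv=False)
    cosines = np.clip(cosines, -1.0, 1.0)   # guard against round-off above 1
    return float(np.sqrt(np.sum(1.0 - cosines**2)))

# Two 2-dimensional subspaces of R^4, given by orthonormal bases.
A = np.array([[1., 0.], [0., 1.], [0., 0.], [0., 0.]])
B = np.array([[0., 0.], [0., 0.], [1., 0.], [0., 1.]])

d_same = chordal_distance(A, A)   # identical subspaces
d_orth = chordal_distance(A, B)   # mutually orthogonal 2-planes
```

For identical subspaces every principal angle is zero, and for orthogonal planes every angle is pi/2, so the two calls above bracket the possible values.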
Multidimensional scaling (MDS) determines a low-dimensional Euclidean embedding of the manifold, preserving or approximating the Grassmannian geometry based on the distance measure. We illustrate that we can achieve an isometric embedding of the Grassmann manifold using the chordal metric, while this is not the case with other distances. However, non-isometric embeddings generated by using the smallest principal angle pseudometric on the Grassmannian lead to the best classification results: we observe that as the dimension of the Grassmannian grows, the accuracy of the classification grows to 100% in binary classification experiments. To build a classification model, we use SSVMs to perform simultaneous dimension selection. The resulting classifier selects a subset of dimensions of the embedding without loss in classification performance. Lastly, we present an application of persistent homology to the detection of chemical plumes in hyperspectral movies. The pixels of the raw hyperspectral data cubes are mapped to the geometric framework of the Grassmann manifold, where they are analyzed, contrasting our approach with the more standard framework in Euclidean space. An advantage of this approach is that it allows the time slices in a hyperspectral movie to be collapsed to a sequence of points in such a way that some of the key structure within and between the slices is encoded by the points on the Grassmannian. This motivates the search for topological structure, associated with the evolution of the frames of a hyperspectral movie, within the corresponding points on the manifold. The proposed framework affords the processing of large data sets, such as the hyperspectral movies explored in this investigation, while retaining valuable discriminative information.
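The MDS embedding step described above reduces to double-centering the squared distance matrix and taking a partial eigendecomposition. A hedged NumPy sketch (not the dissertation's code; the example points are invented):

```python
import numpy as np

def classical_mds(D, dim):
    """Classical MDS: embed n points in R^dim from an n x n distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered Gram matrix
    w, V = np.linalg.eigh(B)                 # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:dim]          # keep the top `dim` eigenpairs
    L = np.sqrt(np.maximum(w[idx], 0.0))     # clip tiny negative eigenvalues
    return V[:, idx] * L                     # one row of coordinates per point

# Pairwise distances among the corners of a unit square.
pts = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

X = classical_mds(D, 2)
D_rec = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
```

When the input distances are exactly Euclidean, as here, the recovered configuration reproduces them up to a rigid motion; for Grassmannian distance matrices the embedding is exact or approximate depending on the chosen metric, as the abstract notes.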
For a particular choice of a distance metric on the Grassmannian, it is possible to generate topological signals that capture changes in the scene after a chemical release.

Item Open Access
Covariance integral invariants of embedded Riemannian manifolds for manifold learning (Colorado State University. Libraries, 2018)
Álvarez Vizoso, Javier, author; Peterson, Christopher, advisor; Kirby, Michael, advisor; Bates, Dan, committee member; Cavalieri, Renzo, committee member; Eykholt, Richard, committee member
This thesis develops an effective theoretical foundation for the integral invariant approach to studying submanifold geometry via the statistics of the underlying point set, i.e., manifold learning from covariance analysis. We perform principal component analysis over a domain determined by the intersection of an embedded Riemannian manifold with spheres or cylinders of varying scale in the ambient space, in order to generalize to arbitrary dimension the relationship between curvature and the eigenvalue decomposition of covariance matrices. In the case of regular curves in general dimension, the covariance eigenvectors converge to the Frenet-Serret frame, and the corresponding eigenvalues have ratios that asymptotically determine the generalized curvatures completely, up to a constant that we determine by proving a recursion relation for a certain sequence of Hankel determinants. For hypersurfaces, the eigenvalue decomposition has a series expansion given in terms of the dimension and the principal curvatures, where the eigenvectors converge to the Darboux frame of principal and normal directions. In the most general case of embedded Riemannian manifolds, the eigenvalues and limit eigenvectors of the covariance matrices are found to have asymptotic behavior given in terms of the curvature information encoded by the third fundamental form of the manifold, a classical tensor that we generalize to arbitrary dimension and that is related to the Weingarten map and Ricci operator.
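The curve case described above can be checked numerically: PCA of the intersection of a densely sampled circle with a small ambient ball yields a top covariance eigenvector aligned with the tangent, the first Frenet-Serret vector. This sketch (assuming NumPy) only demonstrates the eigenvector convergence; the precise eigenvalue asymptotics and constants are the subject of the thesis and are not reproduced here:

```python
import numpy as np

# Dense sample of a circle of radius 2, a 1-manifold embedded in R^2.
t = np.linspace(0.0, 2 * np.pi, 20000, endpoint=False)
curve = 2.0 * np.column_stack([np.cos(t), np.sin(t)])

# Integral-invariant domain: intersect the manifold with a small ambient ball.
p = np.array([2.0, 0.0])                         # base point on the curve
r = 0.1                                          # ball radius (the "scale")
local = curve[np.linalg.norm(curve - p, axis=1) < r]

# PCA of the local patch: eigendecomposition of the covariance matrix.
centered = local - local.mean(axis=0)
cov = centered.T @ centered / len(local)
w, V = np.linalg.eigh(cov)                       # ascending eigenvalues

tangent_est = V[:, -1]                           # top eigenvector
tangent_true = np.array([0.0, 1.0])              # tangent of the circle at p
alignment = abs(tangent_est @ tangent_true)      # approaches 1 as r shrinks
```

The dominant eigenvalue is much larger than the second, reflecting that at small scale the patch is nearly one-dimensional along the tangent.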
These asymptotic results provide descriptors at scale for the principal curvatures and, in turn, for the second fundamental form and the Riemann curvature tensor of a submanifold, which can serve to perform multi-scale geometry processing and manifold learning, making use of the advantages of the integral invariant viewpoint when only a discrete sample of points is available.

Item Open Access
Grassmann, Flag, and Schubert varieties in applications (Colorado State University. Libraries, 2017)
Marrinan, Timothy P., author; Kirby, Michael, advisor; Peterson, Chris, advisor; Azimi-Sadjadi, Mahmood R., committee member; Bates, Dan, committee member; Draper, Bruce, committee member
This dissertation develops mathematical tools for signal processing and pattern recognition tasks where data with the same identity are assumed to vary linearly. We build on the growing canon of techniques for analyzing and optimizing over data on Grassmann manifolds. Specifically, we expand on a recently developed method, referred to as the flag mean, that finds an average representation for a collection of data consisting of linear subspaces of possibly different dimensions. When prior knowledge exists about relationships between these data, we show that a point analogous to the flag mean can be found as an element of a Schubert variety that incorporates this theoretical information. This domain restriction relates closely to a recent result regarding point-to-set functions. This restricted average, along with a property of the flag mean that prioritizes weak but common information, leads to practical applications of the flag mean such as chemical plume detection in long-wave infrared hyperspectral videos, and a modification of the well-known diffusion map for adaptively visualizing data relationships in two dimensions.

Item Open Access
Highly scalable algorithms for scheduling tasks and provisioning machines on heterogeneous computing systems (Colorado State University. Libraries, 2015)
Tarplee, Kyle M., author; Maciejewski, Anthony A., advisor; Siegel, Howard Jay, committee member; Chong, Edwin, committee member; Bates, Dan, committee member
As high performance computing systems increase in size, new and more efficient algorithms are needed to schedule work on the machines, understand the performance trade-offs inherent in the system, and determine which machines to provision. The extreme scale of these newer systems requires unique task scheduling algorithms that are capable of handling millions of tasks and thousands of machines. A highly scalable scheduling algorithm is developed that computes high quality schedules, especially for large problem sizes. Large-scale computing systems also consume vast amounts of electricity, leading to high operating costs. Through the use of novel resource allocation techniques, system administrators can examine this trade-off space to quantify how much a given performance level will cost in electricity, or see what kind of performance can be expected when given an energy budget. Trading off energy and makespan is often difficult for companies because it is unclear how each affects profit. A monetary-based model of high performance computing is presented, and a highly scalable algorithm is developed to quickly find the schedule that maximizes the profit per unit time. As more high performance computing needs are being met with cloud computing, algorithms are needed to determine the types of machines that are best suited to a particular workload. An algorithm is designed to find the best set of computing resources to allocate to the workload that takes into account the uncertainty in the task arrival rates, task execution times, and power consumption.
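The dissertation's scalable scheduling and provisioning algorithms are not reproduced here; as a hedged, minimal illustration of scheduling on heterogeneous machines, here is a classic greedy minimum-completion-time heuristic over an estimated-time-to-compute (ETC) matrix (all names and numbers are invented for the example):

```python
# ETC[i][j]: estimated time to compute task i on machine j in a
# heterogeneous system (illustrative values, not from the dissertation).
ETC = [
    [3.0, 5.0, 9.0],
    [2.0, 4.0, 8.0],
    [6.0, 1.0, 2.0],
    [4.0, 7.0, 3.0],
]

def greedy_min_completion(etc):
    """Assign each task to the machine that would finish it earliest."""
    n_machines = len(etc[0])
    ready = [0.0] * n_machines        # current finish time of each machine
    assignment = []
    for task_times in etc:
        # Completion time if this task were placed on each machine.
        completion = [ready[m] + task_times[m] for m in range(n_machines)]
        best = min(range(n_machines), key=lambda m: completion[m])
        ready[best] = completion[best]
        assignment.append(best)
    return assignment, max(ready)     # schedule and its makespan

assignment, makespan = greedy_min_completion(ETC)
```

Heuristics of this family run in time linear in the number of tasks per machine scan, which is why they remain usable at the million-task scales the abstract describes, though the dissertation's algorithms are considerably more sophisticated.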
Reward rate, cost, failure rate, and power consumption can be optimized, as desired, to trade off these conflicting objectives.

Item Open Access
Homotopy continuation methods, intrinsic localized modes, and cooperative robotic workspaces (Colorado State University. Libraries, 2012)
Brake, Daniel Abram, author; Putkaradze, Vakhtang, advisor; Maciejewski, Tony, advisor; Marconi, Mario, committee member; Bates, Dan, committee member; Shipman, Patrick, committee member
This dissertation considers three topics that are united by the theme of applying geometric and nonlinear mechanics to practical problems. First, we consider the parallel numerical solution of nonlinear polynomial systems depending on parameters. The program written to do this, called Paramotopy, uses the Message Passing Interface to distribute homotopy continuation solves in another program, called Bertini, across a supercomputer. Paramotopy manages the writing of Bertini input files, allows automatic re-solution of the system at points at which paths failed, and makes data management easy. Furthermore, parameter homotopy nets huge performance gains over fresh homotopy continuation runs. Superlinear speedup was achieved, up to hard drive throughput capacity. Various internal settings are demonstrated and explored, and the User's Manual is included. Second, we apply nonlinear theory and simulation to nanomechanical sensor arrays. Using vibrating GaAs pillars, we model intrinsic localized modes (ILMs) and investigate ILM-defect pinning, formation, lifetime, travel, and parameter dependence. Intrinsic localized modes have previously been analyzed on arrays of nonlinear oscillators with a single direction of vibration. In current experiments for single-molecule detection, arrays made of gallium arsenide are innately bidirectional, forced, and dissipative. We extend previous full models to bidirectionality and simulate them using ODE solvers.
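Returning to the homotopy continuation topic above: the core step-and-correct loop can be sketched in one variable. This is a toy, not Bertini's algorithm; production trackers work over the complex numbers with a random gamma factor, adaptive step sizes, and endgames, all of which this sketch omits:

```python
# Toy univariate homotopy continuation. Deform a start system g with known
# roots into the target system f, tracking each root along the way.
def f(x):  return x**2 - 3*x + 2          # target polynomial, roots {1, 2}
def df(x): return 2*x - 3
def g(x):  return x**2 - 9                # start polynomial, roots {3, -3}
def dg(x): return 2*x

def H(x, t):  return (1 - t) * g(x) + t * f(x)     # homotopy H(x, 0) = g
def dH(x, t): return (1 - t) * dg(x) + t * df(x)   # dH/dx

def track(x, steps=200, newton_iters=5):
    """Follow one start root from t=0 to t=1 with a step-and-correct loop."""
    for k in range(1, steps + 1):
        t = k / steps
        for _ in range(newton_iters):     # Newton corrector at this t value
            x = x - H(x, t) / dH(x, t)
    return x

roots = sorted(track(x0) for x0 in (3.0, -3.0))
```

Each start root of g flows continuously to a root of f; parameter homotopy, the source of the performance gains the abstract mentions, reuses the solved system at one parameter value as the start system for nearby parameter values.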
We show that small regions of a very large parameter space permit strong ILM formation. Additionally, we use Hamiltonian mechanics to derive new simplified models for monodirectional ILM travel on an infinite array. These monodirectional ILMs of constant amplitude exhibit unrealistic behavior. Permitting the amplitude of the ILM to vary in time produces much more realistic behavior, including wandering and intermittent pinning. The final set of problems concerns the application of numerical algebraic geometry to untangling the phase space of cooperating robots and optimizing their configuration for fault tolerance. Given two robots in proximity to each other, if one experiences joint failure, the other may be able to assist, restoring lost workspace. We define a new multiplicity-weighted workspace measure and use it to solve the optimization problem of finding the best location for an assistance socket and the best separation distance for the two robots, showing that the solution depends on robot geometry, which link is being grasped, and the choice of objective function.

Item Open Access
Low rank representations of matrices using nuclear norm heuristics (Colorado State University. Libraries, 2014)
Osnaga, Silvia Monica, author; Kirby, Michael, advisor; Peterson, Chris, advisor; Bates, Dan, committee member; Wang, Haonan, committee member
The pursuit of low-dimensional structure in high-dimensional data leads in many instances to finding the lowest rank matrix among a parameterized family of matrices. In its most general setting, this problem is NP-hard, and different heuristics have been introduced for approaching it. Among them is the nuclear norm heuristic for rank minimization. One aspect of this thesis is the application of the nuclear norm heuristic to the Euclidean distance matrix completion problem. As a special case, the approach is applied to the graph embedding problem.
More generally, semidefinite programming, convex optimization, and the nuclear norm heuristic are applied to the graph embedding problem in order to extract invariants such as the chromatic number, R^n embeddability, and Borsuk embeddability. In addition, we apply related techniques to decompose a matrix into components which simultaneously minimize a linear combination of the nuclear norm and the spectral norm. In the case when the Euclidean distance matrix is the distance matrix of a complete k-partite graph, it is shown that the nuclear norm of the associated positive semidefinite matrix can be evaluated in terms of the second elementary symmetric polynomial evaluated at the partition. We prove that for k-partite graphs the maximum value of the nuclear norm of the associated positive semidefinite matrix is attained when there is an equal number of vertices in each set of the partition. We use this result to determine a lower bound on the chromatic number of the graph. Finally, we describe a convex optimization approach to the decomposition of a matrix into two components using the nuclear norm and the spectral norm.

Item Open Access
Mean variants on matrix manifolds (Colorado State University. Libraries, 2012)
Marks, Justin D., author; Peterson, Chris, advisor; Kirby, Michael, advisor; Bates, Dan, committee member; Anderson, Chuck, committee member
The geometrically elegant Stiefel and Grassmann manifolds have become organizational tools for data applications, such as illumination spaces for faces in digital photography. Modern data analysis involves increasingly large-scale data sets, both in terms of the number of samples and the number of features per sample. In circumstances such as when large-scale data has been mapped to a Stiefel or Grassmann manifold, the computation of mean representatives for clusters of points on these manifolds is a valuable tool.
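The flavor of such manifold mean computations can be sketched with a Karcher-style fixed-point iteration on the unit sphere, using the exponential and logarithm maps. This is an illustration on a simpler manifold than the Stiefel or Grassmann settings of the thesis, and it is not the thesis's normal or projection mean:

```python
import numpy as np

def log_map(p, q):
    """Tangent vector at p pointing along the geodesic toward q (unit sphere)."""
    v = q - (q @ p) * p                       # project q onto tangent space at p
    nv = np.linalg.norm(v)
    theta = np.arccos(np.clip(q @ p, -1.0, 1.0))
    return np.zeros_like(p) if nv < 1e-15 else (theta / nv) * v

def exp_map(p, v):
    """Follow the geodesic from p in direction v for arclength |v|."""
    nv = np.linalg.norm(v)
    return p if nv < 1e-15 else np.cos(nv) * p + np.sin(nv) * (v / nv)

def karcher_mean(points, iters=100):
    """Predictor/corrector-style fixed-point iteration for the intrinsic mean."""
    mu = points[0] / np.linalg.norm(points[0])
    for _ in range(iters):
        avg = sum(log_map(mu, q) for q in points) / len(points)  # predictor
        mu = exp_map(mu, avg)                                    # corrector
    return mu

# Cluster of unit vectors around the north pole of S^2 (symmetric by design).
pts = [np.array([0.1, 0.0, 1.0]), np.array([-0.1, 0.0, 1.0]),
       np.array([0.0, 0.1, 1.0]), np.array([0.0, -0.1, 1.0])]
pts = [p / np.linalg.norm(p) for p in pts]
mu = karcher_mean(pts)
```

The loop maps the cluster into the tangent space, averages there, and maps back, repeating until the average tangent vector vanishes; by symmetry the fixed point here is the north pole.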
We derive three algorithms for determining mean representatives for a cluster of points on the Stiefel manifold and the Grassmann manifold. Two algorithms, the normal mean and the projection mean, follow the theme of the Karcher mean, relying upon inversely related maps that operate between the manifold and the tangent bundle. These maps are informed by the geometric definition of the tangent bundle and the normal bundle. From the cluster of points, each algorithm exploits these maps in a predictor/corrector loop until converging, with prescribed tolerance, to a fixed point. The fixed point acts as the normal mean representative, or projection mean representative, respectively, of the cluster. This method shares its principal structural characteristics with the Karcher mean but utilizes a distinct pair of inversely related maps. The third algorithm, called the flag mean, operates in a context comparable to a generalized Grassmannian. It produces a mean subspace of arbitrary dimension. We provide applications and discuss generalizations of these means to other manifolds.

Item Open Access
On the formulation and uses of SVD-based generalized curvatures (Colorado State University. Libraries, 2016)
Arn, Robert T., author; Kirby, Michael, advisor; Peterson, Chris, advisor; Bates, Dan, committee member; Reiser, Raoul, committee member
In this dissertation we consider the problem of computing generalized curvature values from noisy, discrete data, along with applications of the provided algorithms. We first establish a connection between the Frenet-Serret frame, typically defined on an analytic curve, and the vectors from the local singular value decomposition (SVD) of a discretized time series. Next, we expand upon this connection to relate generalized curvature values, or curvatures, to a scaled ratio of singular values. Initially, the local singular value decomposition is centered on a point of the discretized time series.
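The "scaled ratio of singular values" idea can be checked numerically on circles, whose curvature is known: over a window of fixed arclength, the quantity sigma_2 / sigma_1^2 of the mean-centered local sample scales like the curvature, so its ratio between two circles approximates the curvature ratio. A hedged sketch assuming NumPy; this proxy and its scaling are illustrative and are not the thesis's derived formulas:

```python
import numpy as np

def curvature_proxy(radius, arc_half=0.2, n=2001):
    """sigma_2 / sigma_1^2 of a mean-centered circular arc sample.

    The arc has fixed half-arclength `arc_half` regardless of radius, so the
    leading-order constants cancel when two such proxies are compared.
    """
    t = np.linspace(-arc_half / radius, arc_half / radius, n)
    pts = radius * np.column_stack([np.cos(t), np.sin(t)])
    s = np.linalg.svd(pts - pts.mean(axis=0), compute_uv=False)
    return s[1] / s[0] ** 2

# Circles of curvature 1 and 1/2; the ratio of proxies should be near 2.
est = curvature_proxy(1.0) / curvature_proxy(2.0)
```

To leading order sigma_1 grows like the window arclength and sigma_2 like curvature times arclength squared, so the dependence on the window cancels in the comparison, leaving the curvature ratio.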
Centering on a point of the curve provides for an efficient computation of curvatures when the underlying curve is known. However, when the structure of the curve is not known, for example when noise is present in the tabulated data, we propose two modifications. The first modification computes the local singular value decomposition on the mean-centered data of a windowed selection of the time series. We observe that the mean-centered version increases the stability of the curvature estimations in the presence of signal noise. The second modification is an adaptive method for selecting the size of the window, or local ball, to use for the singular value decomposition. This allows us to use a large window size when curvatures are small, which reduces the effects of noise thanks to the use of a large number of points in the SVD, and to use a small window size when curvatures are large, thereby best capturing the local curvature. Overall we observe that adapting the window size to the data enhances the estimates of generalized curvatures. The combination of these two modifications produces a tool for computing generalized curvatures with reasonable precision and accuracy. Finally, we compare our algorithm, with and without modifications, to existing numerical curvature techniques on different types of data, such as that from the Microsoft Kinect 2 sensor. To address the topic of action segmentation and recognition, a popular topic within the field of computer vision, we created a new dataset from this sensor showcasing a pose-space skeletonized representation of individuals performing continuous human actions as defined by the MSRC-12 challenge. When this data is optimally projected onto a low-dimensional space, we observed that each human motion lies on a distinguished line, plane, hyperplane, etc. During transitions between motions, either the dimension of the optimal subspace changes significantly, or the trajectory of the curve through pose space nearly reverses.
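The transition-detection idea can be illustrated on a synthetic trajectory: along a straight segment the local window is nearly rank one, while at a junction between motions the ratio sigma_2 / sigma_1 of the windowed, mean-centered SVD spikes. A minimal sketch (assuming NumPy; the trajectory and window size are invented for the example):

```python
import numpy as np

# Synthetic "pose space" trajectory: two straight segments meeting at index 100.
a = np.column_stack([np.linspace(0.0, 1.0, 101), np.zeros(101)])
b = np.column_stack([np.ones(100), np.linspace(0.01, 1.0, 100)])
traj = np.vstack([a, b])

def planarity_score(X, i, w=10):
    """sigma_2/sigma_1 of a local window: near 0 on a line, larger at a corner."""
    win = X[max(0, i - w): i + w + 1]
    s = np.linalg.svd(win - win.mean(axis=0), compute_uv=False)
    return s[1] / s[0]

scores = [planarity_score(traj, i) for i in range(len(traj))]
change_point = int(np.argmax(scores))    # expected near the junction at 100
```

Thresholding or peak-picking on such a score gives candidate segment boundaries; the thesis's method, using generalized curvatures rather than this toy ratio, pursues the same geometric signal.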
We use our methods of computing generalized curvature values to identify these locations, categorized as either high curvatures or changing curvatures. The geometric characterization of the time series allows us to segment individual, or geometrically distinct, motions. Finally, using these segments, we construct a methodology for selecting motions to conjoin for the task of action classification.

Item Open Access
Performance bounds for greedy strategies in submodular optimization problems (Colorado State University. Libraries, 2018)
Liu, Yajing, author; Chong, Edwin K. P., advisor; Pezeshki, Ali, advisor; Luo, J. Rockey, committee member; Bates, Dan, committee member
To view the abstract, please see the full text of the document.

Item Open Access
The D-neighborhood complex of a graph (Colorado State University. Libraries, 2014)
Previte, Corrine, author; Peterson, Chris, advisor; Hulpke, Alexander, advisor; Bates, Dan, committee member; Gelfand, Martin, committee member
The neighborhood complex of a graph G is an abstract simplicial complex formed by the subsets of the neighborhoods of all vertices in G. The construction of this simplicial complex can be generalized to use any subset of graph distances as a means to form the simplices in the associated simplicial complex. Consider a simple graph G with diameter d, and let D be a subset of {0, 1, ..., d}. For each vertex u, the D-neighborhood is the simplex consisting of all vertices whose graph distance from u lies in D. The D-neighborhood complex of G, denoted DN(G, D), is the simplicial complex generated by the D-neighborhoods of vertices in G. We relate properties of the graph G to the homology of the chain complex associated to DN(G, D).

Item Open Access
The flag of best fit as a representative for a collection of linear subspaces (Colorado State University. Libraries, 2013)
Marrinan, Timothy P., author; Kirby, Michael, advisor; Bates, Dan, committee member; Draper, Bruce, committee member; Peterson, Chris, committee member
This thesis will develop a technique for representing a collection of subspaces with a flag of best fit, and apply it to practical problems within computer vision and pattern analysis. In particular, we will find a nested sequence of subspaces that are central, with respect to an optimization criterion based on the projection Frobenius norm, to a set of points in the disjoint union of a collection of Grassmann manifolds. Referred to as the flag mean, this sequence can be computed analytically. Three existing subspace means in the literature, the Karcher mean, the extrinsic manifold mean, and the L2-median, will be discussed to establish the need for and relevance of the flag mean. One significant point of separation between the flag mean and existing means is that the flag mean can be computed for points that lie on different Grassmann manifolds, under certain constraints. Advantages of this distinction will be discussed. Additionally, results of experiments based on data from DARPA's Mind's Eye Program will be compared between the flag mean and the Karcher mean. Finally, distance measures for comparing flags to other flags, and similarity scores for comparing flags to subspaces, will be discussed and applied to the Carnegie Mellon University 'Pose, Illumination, and Expression' database.
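A hedged sketch of a flag-mean-style computation: stacking orthonormal bases of the input subspaces and taking left singular vectors, ordered by singular value, yields a nested sequence of best-fit subspaces. This assumes NumPy and is a simplification for illustration, not a substitute for the thesis's treatment of the optimization criterion:

```python
import numpy as np

def flag_mean(subspace_bases):
    """Average a collection of subspaces, each given by an orthonormal-column
    basis (the subspaces may have different dimensions). The left singular
    vectors of the concatenated bases, ordered by singular value, define a
    nested sequence (a flag) of central subspaces."""
    stacked = np.hstack(subspace_bases)
    U, s, _ = np.linalg.svd(stacked, full_matrices=False)
    return U, s

# Three 1-dimensional subspaces of R^3 clustered near the x-axis.
lines = [np.array([[1.0], [0.1], [0.0]]),
         np.array([[1.0], [-0.1], [0.0]]),
         np.array([[1.0], [0.0], [0.1]])]
lines = [v / np.linalg.norm(v) for v in lines]

U, s = flag_mean(lines)
first = U[:, 0]        # 1-dimensional part of the flag, near the x-axis
```

Note that the inputs here live on the same Grassmannian only by coincidence of the example; as the abstract emphasizes, the flag mean also accepts points from different Grassmann manifolds, which is one of its distinguishing features.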