Publications
Permanent URI for this collection
Browse
Recent Submissions
Item Open Access Constant depth circuit complexity for generating quasigroups(Colorado State University. Libraries, 2024-07-16) Collins, Nathaniel A., author; Grochow, Joshua A., author; Levet, Michael, author; Weiß, Armin, author; ACM, publisherWe investigate the constant-depth circuit complexity of the Isomorphism Problem, Minimum Generating Set Problem (MGS), and Sub(qasi)group Membership Problem (Membership) for groups and quasigroups (=Latin squares), given as input in terms of their multiplication (Cayley) tables. Despite decades of research on these problems, lower bounds for these problems even against depth-2 AC circuits remain unknown. Perhaps surprisingly, Chattopadhyay, Torán, and Wagner (FSTTCS 2010; ACM Trans. Comput. Theory, 2013) showed that Quasigroup Isomorphism could be solved by AC circuits of depth O(log log n) using O(log2 n) nondeterministic bits, a class we denote ∃log2 nFOLL. We narrow this gap by improving the upper bound for these problems to quasiAC0, thus decreasing the depth to constant. In particular, we show that Membership can be solved in NTIME(polylog(n)) … Our results suggest that understanding the constant-depth circuit complexity may be key to resolving the complexity of problems concerning (quasi)groups in the multiplication table model.Item Open Access Mixed-precision S/DGEMM using the TF32 and TF64 frameworks on low-precision AI tensor cores(Colorado State University. Libraries, 2023-11-12) Valero-Lara, Pedro, author; Liu, Frank, autor; Vetter, Jeffrey S., author; Jorquera, Ian, author; ACM, publisherUsing NVIDIA graphics processing units (GPUs) equipped with Tensor Cores has enabled the significant acceleration of general matrix multiplication (GEMM) for applications in machine learning (ML) and artificial intelligence (AI) and in high-performance computing (HPC) generally. The use of such power-efficient, specialized accelerators can provide a performance increase between 8× and 20×, albeit with a loss in precision. However, a high level of precision is required in many large scientific and HPC applications, and computing in single or double precision is still necessary for many of these applications to maintain accuracy. Fortunately, mixed-precision methods can be employed to maintain a higher level of numerical precision while also taking advantage of the performance increases from computing with lower-precision AI cores. With this in mind, we extend the state of the art by using NVIDIA's new TF32 framework. This new framework not only burdens some constraints of the previous frameworks, such as costly 32 16-bit castings but also provides an equivalent precision and performance by using a much simpler approach. We also propose a new framework called TF64 that attempts double-precision arithmetic with low-precision Tensor Cores. Although this framework does not exist yet, we validated the correctness of this idea and achieved an equivalent of 64-bit precision on 32-bit hardware.Item Open Access Algorithms for parallel generic hp-adaptive finite element software(Colorado State University. Libraries, 2023-09-19) Fehling, Marc, author; Bangerth, Wolfgang, author; ACM, publisherThe hp-adaptive finite element method—where one independently chooses the mesh size (h) and polynomial degree (p) to be used on each cell—has long been known to have better theoretical convergence properties than either h- or p-adaptive methods alone. However, it is not widely used, owing at least in part to the difficulty of the underlying algorithms and the lack of widely usable implementations. This is particularly true when used with continuous finite elements. Herein, we discuss algorithms that are necessary for a comprehensive and generic implementation of hp-adaptive finite element methods on distributed-memory, parallel machines. In particular, we will present a multistage algorithm for the unique enumeration of degrees of freedom suitable for continuous finite element spaces, describe considerations for weighted load balancing, and discuss the transfer of variable size data between processes. We illustrate the performance of our algorithms with numerical examples and demonstrate that they scale reasonably up to at least 16,384 message passage interface processes. We provide a reference implementation of our algorithms as part of the open source library deal.II.Item Open Access Calculus for biological scientists - OER project materials(Colorado State University. Libraries, 2024-05-09) Shriner, Jeff, authorFor interactive version and optimum user experience, please see: https://www.math.colostate.edu/~shriner/meta_frontmatter.html.Item Open Access Mathematics for computational science: lecture notes for MATH 156 - OER project materials(Colorado State University. Libraries, 2022-01-03) Hulpke, Alexander, author; Bonney, K., author; Meade, H., author; Moy, M., author; Neighbors, T., author; Tremaine, R., authorItem Open Access Eigenvalues and completeness for regular and simply irregular two-point differential operators(Colorado State University. Libraries, 2006) Locker, John, authorIn this monograph the author develops the spectral theory for an nth order two-point differential operator L in the Hilbert space L2[0,1], where L is determined by an nth order formal differential operator ℓ having variable coefficients and by n linearly independent boundary values B1,…,Bn. Using the Birkhoff approximate solutions of the differential equation (ρnI−ℓ)u=0, the differential operator L is classified as belonging to one of three possible classes: regular, simply irregular, or degenerate irregular. For the regular and simply irregular classes, the author develops asymptotic expansions of solutions of the differential equation (ρnI−ℓ)u=0, constructs the characteristic determinant and Green's function, characterizes the eigenvalues and the corresponding algebraic multiplicities and ascents, and shows that the generalized eigenfunctions of L are complete in L2[0,1]. He also gives examples of degenerate irregular differential operators illustrating some of the unusual features of this class.