Browsing by Author "Johnson, Mats S., author"


Open Access
Data-driven methods for compact modeling of stochastic processes
(Colorado State University. Libraries, 2024) Johnson, Mats S., author; Aristoff, David, advisor; Cheney, Margaret, committee member; Pinaud, Olivier, committee member; Krapf, Diego, committee member
Stochastic dynamics are prevalent throughout many scientific disciplines, where finding useful compact models is an ongoing pursuit. However, the simulations involved are often high-dimensional, complex problems necessitating vast amounts of data. This thesis addresses two approaches for handling such complications: coarse graining and neural networks. First, by combining Markov renewal processes with Mori-Zwanzig theory, coarse graining error can be eliminated when modeling the transition probabilities of the system. Second, instead of explicitly defining the low-dimensional approximation, the appropriate subspace is uncovered through iteration using kernel approximations and a scaling matrix. The resulting algorithm, named the Fast Committor Machine, applies the recent Recursive Feature Machine of Radhakrishnan et al. to the committor problem using randomized numerical linear algebra. Both projects outline practical data-driven methods for estimating quantities of interest in stochastic processes that are tunable with only a few hyperparameters. The success of these methods is demonstrated numerically against standard methods on the biomolecule alanine dipeptide.
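For readers unfamiliar with the committor problem named in the abstract above, the following is a minimal sketch of what a committor is, computed exactly on a small finite Markov chain. It is not the Fast Committor Machine and uses no kernels or randomized linear algebra. The function name `committor`, the sets `A` and `B`, and the random-walk example are assumptions chosen for illustration.

```python
import numpy as np

def committor(P, A, B):
    """Return q where q[i] is the probability that the chain with
    transition matrix P, started at state i, reaches B before A.
    Illustrative helper only; not taken from the thesis."""
    n = P.shape[0]
    A, B = list(A), list(B)
    boundary = set(A) | set(B)
    interior = [i for i in range(n) if i not in boundary]
    q = np.zeros(n)
    q[B] = 1.0  # boundary conditions: q = 1 on B and q = 0 on A
    # On interior states q = P q, which rearranges to
    # (I - P_II) q_I = P_IB 1.
    M = np.eye(len(interior)) - P[np.ix_(interior, interior)]
    rhs = P[np.ix_(interior, B)].sum(axis=1)
    q[interior] = np.linalg.solve(M, rhs)
    return q

# Assumed toy example: symmetric random walk on {0, ..., N}
# with A = {0} and B = {N}.
N = 5
P = np.zeros((N + 1, N + 1))
for i in range(1, N):
    P[i, i - 1] = P[i, i + 1] = 0.5
P[0, 0] = P[N, N] = 1.0

q = committor(P, A=[0], B=[N])
print(q)
```

For this walk the committor is linear in the state index (the classical gambler's ruin result). In high-dimensional molecular systems no such transition matrix is available, which is the setting in which data-driven estimators are needed.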
Open Access
Weighted ensemble: practical variance reduction techniques
(Colorado State University. Libraries, 2022) Johnson, Mats S., author; Aristoff, David, advisor; Cheney, Margaret, committee member; Krapf, Diego, committee member; Pinaud, Olivier, committee member
Computational biology and chemistry are replete with important constants of interest to researchers. The mean first-passage time (MFPT) is one such quantity and is pursued in molecular dynamics when simulating protein conformational changes, enzyme reaction rates, and more. Often, the simulation of these processes is hindered by such events having a prohibitively small probability of observation. For these rare events, direct estimation by Monte Carlo techniques can be burdened by high variance. We analyzed an importance sampling splitting and killing algorithm called weighted ensemble to address these drawbacks. We used weighted ensemble in the context of a stochastic process governed by a Markov chain (X_t)_{t≥0} with steady state distribution μ to estimate the MFPT. Weighted ensemble works by partitioning the state space into bins and replicating trajectories in an advantageous and unbiased manner. By introducing a recycling boundary condition, we improved the convergence of our problem to steady state and made use of the Hill relation to estimate the MFPT. This change allows relevant conclusions to be drawn from simulations that are much shorter in time scale than those needed for direct estimation of the MFPT. After defining the weighted ensemble algorithm, we decomposed the variance of the weighted ensemble estimator in a way that allows simple optimization problems to be posed. We also defined the relevant coordinate for splitting trajectories in the weighted ensemble method, the flux-discrepancy function, and its associated variance function. When combined with the variance formulas, the flux-discrepancy function was used to guide the choice of binning and replication strategies for the weighted ensemble algorithm. Finally, we discussed practical implementations of solutions to the aforementioned optimization problems and demonstrated their effectiveness in the context of a toy problem.
We found that the techniques we presented offered a significant variance reduction over both a naive implementation of weighted ensemble that is commonly used in practice and direct simulation by naive Monte Carlo. The optimizations we presented correspond to a reduced computational cost for implementing the weighted ensemble algorithm. We further found that our results were applicable even in the case of limited resources, which makes their application even more appealing.
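The Hill relation mentioned in the abstract above can be illustrated on a small finite Markov chain. With a recycling boundary condition, the MFPT equals the reciprocal of the steady-state flux into the target. The sketch below checks this against a direct linear solve. It is not the weighted ensemble algorithm itself, and the function names, the source distribution `rho`, and the lazy random-walk example are assumptions made for illustration.

```python
import numpy as np

def mfpt_direct(P, rho, target):
    """MFPT to `target` from initial distribution rho, found by
    solving (I - P_KK) m = 1 on the non-target states K."""
    n = P.shape[0]
    tset = set(target)
    keep = [i for i in range(n) if i not in tset]
    M = np.eye(len(keep)) - P[np.ix_(keep, keep)]
    m = np.linalg.solve(M, np.ones(len(keep)))
    return float(rho[keep] @ m)

def mfpt_hill(P, rho, target):
    """MFPT via the Hill relation: 1 / (steady-state flux into the
    target) for the chain that is recycled to rho on arrival.
    Assumes rho puts no mass on the target states."""
    n = P.shape[0]
    target = list(target)
    tset = set(target)
    keep = [i for i in range(n) if i not in tset]
    k = len(keep)
    into_target = P[np.ix_(keep, target)].sum(axis=1)
    # Recycling: probability that would enter the target is
    # redistributed according to the source distribution rho.
    Q = P[np.ix_(keep, keep)] + np.outer(into_target, rho[keep])
    # Steady state mu solves mu Q = mu with sum(mu) = 1.
    A = np.vstack([Q.T - np.eye(k), np.ones((1, k))])
    b = np.zeros(k + 1)
    b[-1] = 1.0
    mu, *_ = np.linalg.lstsq(A, b, rcond=None)
    flux = float(mu @ into_target)
    return 1.0 / flux

# Assumed toy example: random walk on {0, ..., N}, lazy at 0,
# started at state 0, with target state N.
N = 5
P = np.zeros((N + 1, N + 1))
P[0, 0] = P[0, 1] = 0.5
for i in range(1, N):
    P[i, i - 1] = P[i, i + 1] = 0.5
P[N, N] = 1.0
rho = np.zeros(N + 1)
rho[0] = 1.0

direct = mfpt_direct(P, rho, target=[N])
hill = mfpt_hill(P, rho, target=[N])
print(direct, hill)
```

The two values agree up to floating-point error. In practice the steady-state flux is estimated from simulation rather than from a known transition matrix, which is where variance reduction of the kind described above becomes relevant.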