Repository logo
 

Weighted ensemble: practical variance reduction techniques

Date

2022

Authors

Johnson, Mats S., author
Aristoff, David, advisor
Cheney, Margaret, committee member
Krapf, Diego, committee member
Pinaud, Olivier, committee member

Journal Title

Journal ISSN

Volume Title

Abstract

Computational biology and chemistry is proliferated with important constants that are desirable for researchers. The mean-first-passage time (MFPT) is one such important quantity of interest and is pursued in molecular dynamics simulating protein conformational changes, enzyme reaction rates, and more. Often, the simulation of these processes is hindered by such events having prohibitively small probability of observation. For these rare-events, direct estimation by Monte Carlo techniques can be burdened by high variance. We analyzed an importance sampling splitting and killing algorithm called weighted ensemble to address these drawbacks. We used weighted ensemble in the context of a stochastic process governed by a Markov chain (Xt)t≥0 with steady state distribution μ to estimate the MFPT. Weighted ensemble works by partitioning the state space into bins and replicating trajectories in an advantageous and unbiased manner. By introducing a recycling boundary condition, we improved the convergence of our problem to steady state and made use of the Hill relation to estimate the MFPT. This change allows relevant conclusions to be drawn from simulations that are much shorter in time scale when compared to direct estimation of the MFPT. After defining the weighted ensemble algorithm, we decomposed the variance of the weighted ensemble estimator in a way that admits simple optimization problems to be posed. We also defined the relevant coordinate, the flux-discrepancy function, for splitting trajectories in the weighted ensemble method and its associated variance function. When combined with the variance formulas, the flux-discrepancy function was used to guide parameter choices for choosing binning and replication strategies for the weighted ensemble algorithm. Finally, we discuss practical implementations of solutions to the aforementioned optimization problems and demonstrate their effectiveness in the context of a toy problem. We found that the techniques we presented offered a significant variance reduction over a naive implementation of weighted ensemble that is commonly used in practice and direct simulation by naive Monte Carlo. The optimizations we presented correspond to a reduced computational cost for implementing the weighted ensemble algorithm. We further found that our results were applicable even in the case of limited resources which makes their application even more appealing.

Description

Rights Access

Subject

path-sampling
weighted
ensemble
weighted ensemble
variance

Citation

Associated Publications