Published January 6, 2023 | Version v1
Journal article Open

Chasing collective variables using temporal data-driven strategies

  • 1. Université de Lorraine
  • 2. University of Chicago

Description

The convergence of free-energy calculations based on importance sampling depends heavily on the choice of collective variables (CVs), which, in principle, should include the slow degrees of freedom of the biological processes to be investigated. Autoencoders (AEs), as emerging data-driven dimension reduction tools, have been utilised for discovering CVs. AEs, however, are often treated as black boxes, and what AEs actually encode during training, and whether the latent variables from encoders are suitable as CVs for further free-energy calculations, remain unknown. In this contribution, we review AEs and their time-series-based variants, including time-lagged AEs (TAEs) and modified TAEs, as well as the closely related model variational approach for Markov processes networks (VAMPnets). We then show through numerical examples that AEs learn the high-variance modes instead of the slow modes. In stark contrast, time-series-based models are able to capture the slow modes. Moreover, both modified TAEs with extensions from slow feature analysis and the state-free reversible VAMPnets (SRVs) can yield orthogonal multidimensional CVs. As an illustration, we employ SRVs to discover the CVs of the isomerizations of N-acetyl-N′-methylalanylamide and trialanine by iterative learning with trajectories from biased simulations. Last, through numerical experiments with anisotropic diffusion, we investigate the potential relationship between time-series-based models and committor probabilities.
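The contrast between high-variance modes and slow modes can be illustrated with a minimal linear sketch. It is not the method of the article: PCA stands in for a linear AE (a linear AE trained with squared error recovers the PCA subspace), and time-lagged independent component analysis (TICA) stands in for a linear time-series-based model. The toy trajectory, the function names (`simulate`, `pca_direction`, `tica_direction`) and all parameters (lag, noise amplitudes, seed) are assumptions chosen for illustration.

```python
import numpy as np


def simulate(n_steps=20000, seed=0):
    """Toy 2D trajectory (hypothetical): a slow, low-variance coordinate
    and a fast, high-variance coordinate."""
    rng = np.random.default_rng(seed)
    slow = np.zeros(n_steps)
    for t in range(1, n_steps):
        # AR(1) process with long memory: slow but small amplitude
        slow[t] = 0.99 * slow[t - 1] + 0.1 * rng.standard_normal()
    # Uncorrelated noise: fast but large amplitude
    fast = 5.0 * rng.standard_normal(n_steps)
    return np.column_stack([slow, fast])


def pca_direction(x):
    """Leading eigenvector of the covariance matrix (highest variance)."""
    xc = x - x.mean(axis=0)
    c0 = xc.T @ xc / len(xc)
    _, vecs = np.linalg.eigh(c0)  # eigenvalues in ascending order
    return vecs[:, -1]


def tica_direction(x, lag=10):
    """Leading solution of C_tau v = lambda C_0 v (highest autocorrelation)."""
    xc = x - x.mean(axis=0)
    a, b = xc[:-lag], xc[lag:]
    # Symmetrised instantaneous and time-lagged covariance estimates
    c0 = (a.T @ a + b.T @ b) / (2 * len(a))
    ct = (a.T @ b + b.T @ a) / (2 * len(a))
    # Whiten with C_0, then solve an ordinary symmetric eigenproblem
    w0, v0 = np.linalg.eigh(c0)
    whiten = v0 / np.sqrt(w0)
    _, vecs = np.linalg.eigh(whiten.T @ ct @ whiten)
    direction = whiten @ vecs[:, -1]
    return direction / np.linalg.norm(direction)


data = simulate()
pca_dir = pca_direction(data)
tica_dir = tica_direction(data)
print("PCA direction (slow, fast):", np.round(np.abs(pca_dir), 3))
print("TICA direction (slow, fast):", np.round(np.abs(tica_dir), 3))
```

With these assumed parameters, the leading PCA direction should align with the noisy, high-variance coordinate, whereas the leading TICA direction should align with the slow coordinate.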

Data availability

The data that support the findings of this study are openly available upon request.

Files

Chasing-collective-variables-using-temporal-data-driven-strategies.pdf

Files (3.8 MB)

Article (3.3 MB)
md5:75c31098221162c0bf6bf0cc17249f7d
Supplementary materials (495.4 kB)
md5:385c935b7a12f3f13359c5ae5982de17

Additional details

Identifiers

DOI
10.1017/qrd.2022.23
Other
oai:uchicago.tind.io:5790

Funding

Agence Nationale de la Recherche (Lorraine Artificial Intelligence – LOR-AI and ProteaseInAction)

UChicago Information

Division(s)
Biological Sciences Division
Department(s)
Biochemistry and Molecular Biology