An autoregressive model generates protein conformations and dynamics
Proteins are not static molecules: they constantly move and transition between different conformations. Understanding these dynamics is crucial for explaining how proteins function. Molecular dynamics (MD) simulations capture atomic motion over time by modeling physical interactions and gradually exploring the conformational space, providing rich data for studying protein conformational ensembles and dynamics. ConfRover learns from MD simulation data to directly generate protein conformations or motion trajectories, offering a fast alternative to costly MD runs.
The key idea is simple: sampling protein conformations or trajectories can be viewed as generating each conformation (frame) either independently or autoregressively, conditioned on the preceding frames, much as a language model generates each token conditioned on the tokens before it. This unified view provides an efficient framework for learning and generating protein conformational dynamics across a variety of tasks.
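The frame-by-frame view can be made concrete with a small sketch. The Python below is a toy illustration only, not ConfRover's implementation: `Frame`, `sample_trajectory`, and `make_toy_sampler` are hypothetical names, a frame is reduced to a short list of floats, and a Gaussian random walk stands in for the learned model. The one point it shows is that a single generation loop covers both cases, with the history passed to the per-frame sampler being the only difference.

```python
import random
from typing import Callable, List, Sequence

# Toy stand-in for one conformation (e.g. flattened coordinates). Hypothetical type.
Frame = List[float]


def sample_trajectory(
    sample_frame: Callable[[Sequence[Frame]], Frame],
    num_frames: int,
    autoregressive: bool = True,
) -> List[Frame]:
    """Generate frames one at a time.

    autoregressive=True:  each frame is conditioned on all preceding frames
                          (trajectory generation).
    autoregressive=False: each frame sees an empty history
                          (independent sampling of an ensemble).
    """
    frames: List[Frame] = []
    for _ in range(num_frames):
        # Pass an immutable snapshot so the sampler cannot alter the history.
        history = tuple(frames) if autoregressive else ()
        frames.append(sample_frame(history))
    return frames


def make_toy_sampler(seed: int, dim: int = 3, step: float = 0.1):
    """Build a placeholder per-frame sampler (an assumption, not a learned model)."""
    rng = random.Random(seed)

    def sample_frame(history: Sequence[Frame]) -> Frame:
        if history:
            # Conditioned case: small random step away from the latest frame.
            return [x + rng.gauss(0.0, step) for x in history[-1]]
        # Unconditioned case: draw a fresh frame from a standard normal.
        return [rng.gauss(0.0, 1.0) for _ in range(dim)]

    return sample_frame


if __name__ == "__main__":
    trajectory = sample_trajectory(make_toy_sampler(0), num_frames=5)
    ensemble = sample_trajectory(make_toy_sampler(0), num_frames=5, autoregressive=False)
    print(len(trajectory), len(ensemble))
```

In this sketch, swapping the placeholder sampler for a learned conditional generator would leave the loop itself unchanged, which is the sense in which the two tasks share one framework.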
ConfRover brings together the strengths of modern protein structure prediction, language-model-like sequence models, and diffusion probabilistic models to capture the complex spatiotemporal dependencies in protein motion and to sample new conformations conditioned on historical context.