Eduardo Almeida Soares, Emilio Ashton Vital Brazil, et al.
Communications Chemistry
We present a novel approach to chemical foundation models, leveraging structured state space sequence models (SSMs) to overcome the limitations of traditional Transformer-based architectures. While Transformers have achieved state-of-the-art results in chemical tasks such as property prediction and molecule generation, their self-attention mechanism is constrained by its inability to model data outside of a finite context window and its quadratic scaling with respect to window length. In contrast, SSMs offer a promising alternative for sequence modeling, enabling the capture of complex patterns and dependencies in molecular structures. Our Mamba architecture, a simplified end-to-end SSM-based neural network, eliminates the need for attention and MLP blocks, allowing for faster inference. We pre-train Mamba on a large, curated dataset of 91 million SMILES samples (equivalent to 4 billion molecular tokens) sourced from PubChem, and evaluate its performance on various benchmark datasets. Our experiments demonstrate the SSM's capacity to provide state-of-the-art results while maintaining fast inference, supporting complex tasks such as molecular property prediction, classification, molecular reconstruction, and synthesis yield prediction. This work advances the state-of-the-art in AI methodology in chemical sciences, offering a promising direction for future research in molecular modeling and discovery.
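As background for the scaling argument above, the following is a minimal sketch of the linear-time recurrence that underlies state space sequence models, assuming a generic discretized SSM; the matrices A, B, C, the NumPy implementation, and all shapes are illustrative choices, not taken from the paper, and the sketch omits the input-dependent selectivity and gating that Mamba adds on top of this recurrence.

```python
# Minimal sketch of a discretized linear state-space recurrence (generic SSM),
# included only to illustrate the linear-time scaling discussed in the abstract.
# All names and shapes here are illustrative assumptions; the paper's Mamba
# architecture additionally makes the dynamics input-dependent ("selective")
# and adds gating, which this sketch omits.
import numpy as np

def ssm_scan(u, A, B, C):
    """Compute y_k = C x_k with x_k = A x_{k-1} + B u_k over a token sequence.

    u: (L, d_in) token embeddings; A: (n, n); B: (n, d_in); C: (d_out, n).
    Returns y with shape (L, d_out).
    """
    x = np.zeros(A.shape[0])
    ys = []
    for k in range(u.shape[0]):        # one pass over the sequence: O(L)
        x = A @ x + B @ u[k]           # recurrent state update
        ys.append(C @ x)               # per-position readout
    return np.stack(ys)

# Toy usage: 16 tokens, 8-dim embeddings, 4-dim hidden state, 2-dim output.
rng = np.random.default_rng(0)
u = rng.normal(size=(16, 8))
A = 0.9 * np.eye(4)                    # stable toy state transition
B = 0.1 * rng.normal(size=(4, 8))
C = rng.normal(size=(2, 4))
print(ssm_scan(u, A, B, C).shape)      # -> (16, 2)
```

Because the hidden state is updated once per token, the cost of processing a length-L sequence grows linearly in L, whereas self-attention compares every pair of positions and therefore scales quadratically with window length.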