Michelle Cheng, Wei-Chih Chien, et al.
MRS Spring Meeting 2023
Given the cost of evaluating molecular properties and the vast size of the molecular search space, there is a need for data-efficient algorithms that actively and strategically select candidate molecules for evaluation. While target properties are costly to compute or measure, in many cases auxiliary physical or chemical properties can be queried at lower cost and are known to be predictive of, and mechanistically related to, the target properties. We introduce a Bayesian active learning algorithm that (i) maintains a graphical, Gaussian-process-based model of the dependencies between molecular structure, auxiliary physical or chemical properties, and target properties such as toxicity and biodegradability, and (ii) adaptively selects which properties to evaluate based on evaluation costs, model uncertainty, and expected task-relevant information gain. We discuss its ability to cost-effectively identify molecules with the desired target property values on a class of anionic photoacid generator (PAG) molecules, and we study how the learned molecule evaluation strategies depend on the relative query costs and on the mutual information between target and auxiliary properties.
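The trade-off between query cost and information gain described in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' algorithm: it assumes a single auxiliary property and a single target property that are jointly Gaussian scalars, and it assumes an information-gain-per-unit-cost selection rule, which is one common heuristic. All function names, the correlation, the variances, and the costs are hypothetical values chosen for illustration.

```python
import math


def gaussian_mi_from_correlation(rho):
    """Mutual information (nats) between two jointly Gaussian scalars
    with correlation coefficient rho."""
    return -0.5 * math.log(1.0 - rho ** 2)


def gaussian_mi_noisy_observation(prior_var, noise_var):
    """Information (nats) gained about a Gaussian variable from one
    observation corrupted by independent Gaussian noise."""
    return 0.5 * math.log(1.0 + prior_var / noise_var)


def select_query(gains, costs):
    """Return the query whose information gain per unit cost is largest
    (assumed selection rule, not taken from the abstract)."""
    return max(gains, key=lambda name: gains[name] / costs[name])


# Hypothetical numbers: the auxiliary property is correlated with the
# target (rho = 0.8); a direct target measurement is noisy but informative.
gains = {
    "auxiliary": gaussian_mi_from_correlation(0.8),
    "target": gaussian_mi_noisy_observation(prior_var=1.0, noise_var=0.1),
}

# When the auxiliary query is much cheaper, it is selected first.
cheap_aux = select_query(gains, {"auxiliary": 1.0, "target": 10.0})

# When its cost approaches that of the target, the target query is selected.
pricey_aux = select_query(gains, {"auxiliary": 8.0, "target": 10.0})

print(cheap_aux, pricey_aux)
```

Under these assumed numbers the preferred query flips as the relative cost changes, which is the kind of dependence on relative query costs and mutual information that the abstract says is studied.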
Kenneth L. Clarkson, Elad Hazan, et al.
Journal of the ACM
Eduardo Almeida Soares, Dmitry Zubarev, et al.
ICLR 2025
Aditya Malik, Nalini Ratha, et al.
CAI 2024