Human-in-the-loop for a Disconnection Aware Retrosynthesis
Abstract
In single-step retrosynthesis, a target molecule is broken down by considering which bonds to change and/or which functional groups to interconvert. In modern computer-assisted synthesis planning tools, these changes are typically predicted automatically. Several deep-learning-based approaches to single-step retrosynthesis treat the prediction of possible disconnections as a translation task [4-7], relying on the Transformer architecture [1] and the simplified molecular-input line-entry system (SMILES) notation [2,3]. Given a target molecule, these approaches suggest the best set of precursors (i.e. reactants, and possibly other reagents) as the outcome of the translation, and they can generate multiple such sets. One possible downside of such models is that they give chemists little control over the disconnections they want to investigate for a given target molecule. Because single-step retrosynthetic models suggest the precursors deemed optimal based on the training dataset, the recommended precursors are not guaranteed to be consistent with the chemist's desired disconnections. In this work, we extend Transformer-based models for single-step retrosynthesis so that chemists can explore user-defined disconnections, giving them greater control when determining a retrosynthetic route. We developed a 'human-in-the-loop' component that combines expert knowledge and experience with the power of deep learning. This lets us draw on human knowledge and decision-making strategies that statistical and machine learning algorithms cannot yet encode, owing to a lack of relevant training data, and so provide an enhanced experience in retrosynthetic problems.
[1] Vaswani, A. et al.; Advances in Neural Information Processing Systems 2017, 5998–6008.
[2] Weininger, D.; J. Chem. Inf. Comput. Sci. 1988, 28, 31–36.
[3] Weininger, D.; Weininger, A.; Weininger, J. L.; J. Chem. Inf. Comput. Sci. 1989, 29, 97–101.
[4] Yang, Q. et al.; Chem. Commun. 2019, 55, 12152–12155.
[5] Karpov, P.; Godin, G.; Tetko, I. V.; International Conference on Artificial Neural Networks 2019, 817–830.
[6] Duan, H.; Wang, L.; Zhang, C.; Guo, L.; Li, J.; RSC Adv. 2020, 10, 1371–1378.
[7] Schwaller, P. et al.; Chem. Sci. 2020, 11, 3316–3325.
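To make the translation framing in the abstract concrete, the sketch below shows how a target molecule and a candidate precursor set might be turned into token sequences for a sequence-to-sequence model. It is an illustration only: the regular expression is a tokenization pattern commonly seen in the SMILES-based Transformer literature, and the function name `tokenize_smiles` and the example amide disconnection are assumptions, not details taken from this work.

```python
import re
from typing import List

# Assumed tokenization pattern (not confirmed by the abstract): bracket atoms
# are kept whole, two-letter halogens (Br, Cl) are single tokens, and every
# other SMILES symbol becomes its own token.
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\|\/|:|~|@|\?|>|\*|\$|\%[0-9]{2}|[0-9])"
)


def tokenize_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into tokens; raise if any character is not covered."""
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    # Guard against silently dropping characters the pattern does not match.
    if "".join(tokens) != smiles:
        raise ValueError(f"Could not tokenize SMILES: {smiles!r}")
    return tokens


# Hypothetical example: an amide target and one possible precursor set
# (acyl chloride + amine), with precursors separated by '.'.
target = "CC(=O)NCc1ccccc1"
precursors = "CC(=O)Cl.NCc1ccccc1"

source_sequence = " ".join(tokenize_smiles(target))      # model input
target_sequence = " ".join(tokenize_smiles(precursors))  # model output
print(source_sequence)
print(target_sequence)
```

In this framing the model input is the tokenized target molecule and the output is the tokenized precursor set, so generating several output sequences corresponds to proposing several alternative disconnections.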