ChemChat | Conversational Expert Assistant in Material Science and Data Visualization
Abstract
In recent decades, remarkable advances in computational chemistry and machine learning (ML) have yielded a plethora of sophisticated tools and artificial intelligence (AI) models. Despite their potential, these resources have yet to be fully harnessed because of their steep learning curves and their tendency to operate in isolation. Furthermore, the need for programming and ML skills constitutes an access barrier for the targeted community, which often consists of lab scientists. Concurrently, the advent of large language models (LLMs) like (Chat)GPT has been revolutionizing various domains. Nevertheless, their efficacy in addressing chemistry-related challenges has been limited. In particular, these models lack knowledge of scientific workflows and the operations they employ (e.g. in drug discovery), access to information sources that provide up-to-date data, and the ability to reference accurately. Instead, they tend to hallucinate in their responses, which calls their credibility, trustworthiness, and applicability into question. However, this crucial gap between AI and science can be overcome by integrating task-specific agents into an LLM-powered conversational application and allowing the LLM to reason over their appropriate usage based on provided instructions. We anticipate that this will significantly increase the utilization of existing cheminformatics tools and AI models and contribute to scientific discovery overall. Here, we present ChemChat, a web application and conversational assistant with a chatbot-driven user interface that is powered by non-GPT/OpenAI LLMs. Through the integration of existing cheminformatics tools, expert-developed AI models, and other knowledge sources, such as PubChem, CIRCA, RDKit, GT4SD, RXN, and MolFormer, the application is capable of assisting chemists in tasks like property calculations, tailored design of molecules, retrosynthesis, forward reaction planning, data visualization, and literature research.
In the talk, we will demonstrate use-case-specific capabilities of ChemChat in comparison to related applications and present the architecture and workflow behind it.