Thermodynamic Prediction Enabled by Automatic Dataset Building and Machine Learning
By: Juejing Liu, Haydn Anderson, Noah I. Waxman and more
Potential Business Impact:
Computers read science papers, predict new materials.
New discoveries in chemistry and materials science demand an ever-expanding volume of requisite knowledge and experimental work, which gives machine learning (ML) a unique opportunity to play a critical role in accelerating research. Here, we demonstrate (1) the use of large language models (LLMs) for automated literature reviews, and (2) the training of an ML model to predict chemical knowledge (thermodynamic parameters). Our LLM-based literature review tool (LMExt) successfully extracted information into a machine-readable structure, covering both chemical data (stability constants for metal cation-ligand interactions and thermodynamic properties) and broader data types (medical research papers and financial reports), effectively overcoming the challenges inherent in each domain. Using the automatically acquired thermodynamic data, we trained an ML model with the CatBoost algorithm to accurately predict thermodynamic parameters (e.g., enthalpy of formation) of minerals. This work highlights the transformative potential of integrated ML approaches to reshape chemistry and materials science research.
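To make "machine-readable structure" concrete, here is a minimal sketch of how extracted stability constants might be validated and stored once a language model has returned them as JSON. It is not LMExt's actual schema or code: the record type, the field names (`metal`, `ligand`, `log_k`, `temperature_k`, `source`), the `parse_extraction` helper, and the example payload are all hypothetical, and the values are placeholders rather than data from the paper.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class StabilityConstantRecord:
    """One extracted metal cation-ligand stability constant (hypothetical schema)."""
    metal: str
    ligand: str
    log_k: float
    temperature_k: float
    source: str


def parse_extraction(raw_json: str) -> list[StabilityConstantRecord]:
    """Convert a JSON array of extracted entries into typed records.

    Entries with missing fields or non-numeric values are skipped
    rather than guessed at.
    """
    records = []
    for entry in json.loads(raw_json):
        try:
            records.append(
                StabilityConstantRecord(
                    metal=str(entry["metal"]),
                    ligand=str(entry["ligand"]),
                    log_k=float(entry["log_k"]),
                    temperature_k=float(entry["temperature_k"]),
                    source=str(entry["source"]),
                )
            )
        except (KeyError, TypeError, ValueError):
            continue
    return records


# Placeholder payload standing in for model output (assumed format).
example_output = """[
  {"metal": "M1", "ligand": "L1", "log_k": "1.5",
   "temperature_k": 298.15, "source": "example-doc-1"},
  {"metal": "M2", "ligand": "L2", "log_k": "n/a",
   "temperature_k": 298.15, "source": "example-doc-2"}
]"""

records = parse_extraction(example_output)
print([asdict(r) for r in records])
```

In this sketch the second entry is dropped because its `log_k` is not numeric. Records of this kind could then be flattened into a feature table for a gradient-boosting regressor such as CatBoost, though how the authors actually build their training set is not described in this summary.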
Similar Papers
Distilling and exploiting quantitative insights from Large Language Models for enhanced Bayesian optimization of chemical reactions
Machine Learning (CS)
Teaches computers to find better ways to make chemicals.
A Machine Learning-Fueled Modelfluid for Flowsheet Optimization
Computational Engineering, Finance, and Science
Helps design better chemical factories using smart predictions.
Machine Learning for Improved Density Functional Theory Thermodynamics
Materials Science
Makes computer predictions of metal mixes more accurate.