
CoTox: Chain-of-Thought-Based Molecular Toxicity Reasoning and Prediction

Published: August 5, 2025 | arXiv ID: 2508.03159v1

By: Jueon Park, Yein Park, Minju Song and more

Potential Business Impact:

Finds drug dangers before testing on people.

Drug toxicity remains a major challenge in pharmaceutical development. Recent machine learning models have improved in silico toxicity prediction, but their reliance on annotated data and lack of interpretability limit their applicability and their ability to capture organ-specific toxicities driven by complex biological mechanisms. Large language models (LLMs) offer a promising alternative through step-by-step reasoning and integration of textual data, yet prior approaches lack biological context and transparent rationale. To address this issue, we propose CoTox, a novel framework that integrates an LLM with chain-of-thought (CoT) reasoning for multi-toxicity prediction. CoTox combines chemical structure data, biological pathways, and gene ontology (GO) terms to generate interpretable toxicity predictions through step-by-step reasoning. Using GPT-4o, we show that CoTox outperforms both traditional machine learning and deep learning models. We further examine its performance across various LLMs to identify where CoTox is most effective. Additionally, we find that representing chemical structures with IUPAC names, which are easier for LLMs to understand than SMILES, enhances the model's reasoning ability and improves predictive performance. To demonstrate its practical utility in drug development, we simulate the treatment of relevant cell types with a drug and incorporate the resulting biological context into the CoTox framework. This approach allows CoTox to generate toxicity predictions aligned with physiological responses, as shown in a case study. This result highlights the potential of LLM-based frameworks to improve interpretability and support early-stage drug safety assessment. The code and prompt used in this work are available at https://github.com/dmis-lab/CoTox.
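As a rough illustration of how the three input types named in the abstract (an IUPAC name, biological pathways, and GO terms) might be combined into a single chain-of-thought prompt, here is a minimal sketch. It is not the authors' prompt or code, which live in the linked repository. The names `CompoundContext` and `build_cot_prompt`, the wording of the reasoning steps, and the organ list are all hypothetical, and no LLM is called.

```python
from dataclasses import dataclass, field


@dataclass
class CompoundContext:
    """Hypothetical container for the three input types named in the abstract."""
    iupac_name: str
    pathways: list[str] = field(default_factory=list)
    go_terms: list[str] = field(default_factory=list)


def build_cot_prompt(ctx: CompoundContext, organs: list[str]) -> str:
    """Assemble a step-by-step toxicity prompt (illustrative wording, not the paper's)."""
    # Fall back to an explicit marker when no biological context is supplied.
    pathways = "; ".join(ctx.pathways) or "none provided"
    go_terms = "; ".join(ctx.go_terms) or "none provided"
    # Assumed reasoning steps: structure first, then mechanism, then a per-organ call.
    steps = [
        "1. Describe structural features of the compound relevant to toxicity.",
        "2. Relate the listed pathways and GO terms to possible mechanisms.",
        "3. For each organ below, answer 'toxic' or 'non-toxic' with a short rationale.",
    ]
    return "\n".join([
        f"Compound (IUPAC name): {ctx.iupac_name}",
        f"Biological pathways: {pathways}",
        f"GO terms: {go_terms}",
        f"Organs to assess: {', '.join(organs)}",
        "Reason step by step:",
        *steps,
    ])


if __name__ == "__main__":
    # Placeholder context values chosen only for illustration, not taken from the paper.
    example = CompoundContext(
        iupac_name="N-(4-hydroxyphenyl)acetamide",
        pathways=["glutathione metabolism"],
        go_terms=["response to oxidative stress"],
    )
    print(build_cot_prompt(example, ["liver", "kidney"]))
```

The resulting string would be sent to whichever LLM is being evaluated. Keeping prompt assembly separate from the model call makes it straightforward to swap the structure field between IUPAC names and SMILES when comparing representations.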

Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification?

Artificial Intelligence

Fixes drug molecules to make them safer.

12 Jun 2025 0

89%

ToxiFrench: Benchmarking and Enhancing Language Models via CoT Fine-Tuning for French Toxicity Detection

Computation and Language

Finds mean online comments better in French.

15 Aug 2025 3

88%

Task-Specific Sparse Feature Masks for Molecular Toxicity Prediction with Chemical Language Models

Computational Engineering, Finance, and Science

Shows drug parts that make them safe or unsafe.

12 Dec 2025 1



Page Count

8 pages
