Preference Learning from Physics-Based Feedback: Tuning Language Models to Design BCC/B2 Superalloys
By: Satanu Ghosh, Collin Holgate, Neal R. Brodnik, and more
Potential Business Impact:
Uses AI to design new super-strong metals.
We apply preference learning to the task of language model-guided design of novel structural alloys. In contrast to prior work that focuses on generating stable inorganic crystals, our approach targets the synthesizability of a specific structural class: BCC/B2 superalloys, an underexplored family of materials with potential applications in extreme environments. Using three open-weight models (LLaMA-3.1, Gemma-2, and OLMo-2), we demonstrate that language models can be optimized for multiple design objectives using a single, unified reward signal through Direct Preference Optimization (DPO). Unlike prior approaches that rely on costly heuristic or human-in-the-loop feedback, our reward signal is derived from thermodynamic phase calculations, offering a scientifically grounded criterion for model tuning. To our knowledge, this is the first demonstration of preference-tuning a language model using physics-grounded feedback for structural alloy design. The resulting framework is general and extensible, providing a path forward for intelligent design-space exploration across a range of physical science domains.
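To make the recipe concrete, below is a minimal sketch of how physics-based feedback could be turned into DPO preference pairs, under assumptions not specified in the abstract: the helper `phase_reward` is a hypothetical stand-in for a thermodynamic phase calculation (e.g., a CALPHAD query scoring how strongly a candidate composition is predicted to form the desired BCC/B2 microstructure), and the loss shown is the standard DPO objective rather than the authors' exact implementation. In practice one would likely train with a library such as Hugging Face TRL on pairs in this prompt/chosen/rejected format.

```python
import torch
import torch.nn.functional as F


def phase_reward(composition: str) -> float:
    """Hypothetical placeholder: score a candidate alloy composition using
    equilibrium phase fractions from a thermodynamic (CALPHAD-style) backend."""
    raise NotImplementedError("Connect to a phase-calculation backend here.")


def make_preference_pair(prompt: str, candidate_a: str, candidate_b: str) -> dict:
    """Rank two model-generated alloy candidates with the physics-based reward,
    producing a single preference pair usable for DPO training."""
    chosen, rejected = (
        (candidate_a, candidate_b)
        if phase_reward(candidate_a) >= phase_reward(candidate_b)
        else (candidate_b, candidate_a)
    )
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}


def dpo_loss(
    policy_chosen_logps: torch.Tensor,    # summed token log-probs of chosen completions under the tuned policy
    policy_rejected_logps: torch.Tensor,  # same for rejected completions
    ref_chosen_logps: torch.Tensor,       # summed token log-probs under the frozen reference model
    ref_rejected_logps: torch.Tensor,
    beta: float = 0.1,
) -> torch.Tensor:
    """Standard DPO objective: push the policy toward the physics-preferred sample."""
    chosen_logratios = policy_chosen_logps - ref_chosen_logps
    rejected_logratios = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_logratios - rejected_logratios)).mean()
```

The key design point mirrored here is that a single scalar reward from the phase calculation is enough to rank candidates, so multiple design objectives can be folded into one preference signal without human annotation.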
Similar Papers
When Data is the Algorithm: A Systematic Study and Curation of Preference Optimization Datasets
Computation and Language
Makes AI understand what you like better.
Alignment as Distribution Learning: Your Preference Model is Explicitly a Language Model
Machine Learning (CS)
Makes AI better at following instructions.
Improving LLMs for Machine Translation Using Synthetic Preference Data
Computation and Language
Makes computer translations much better and more accurate.