NSR-Boost: A Neuro-Symbolic Residual Boosting Framework for Industrial Legacy Models
By: Ziming Dai, Dabiao Ma, Jinle Tong and more
Although Gradient Boosted Decision Trees (GBDTs) dominate industrial tabular applications, upgrading legacy models in high-concurrency production environments still incurs prohibitive retraining costs and systemic risks. To address this problem, we present NSR-Boost, a neuro-symbolic residual boosting framework designed specifically for industrial scenarios. Its core advantage is that it is non-intrusive: it treats the legacy model as frozen and performs targeted repairs on "hard regions" where predictions fail. The framework comprises three key stages. First, it identifies hard regions through residuals. Second, it generates interpretable experts by using a Large Language Model (LLM) to produce symbolic code structures and Bayesian optimization to fine-tune their parameters. Third, it dynamically integrates the experts with the legacy model's output through a lightweight aggregator. We report the successful deployment of NSR-Boost within the core financial risk control system at Qfin Holdings. The framework not only significantly outperforms state-of-the-art (SOTA) baselines across six public datasets and one private dataset but, more importantly, also delivers strong performance gains on real-world online data. In conclusion, it effectively captures long-tail risks missed by traditional models and offers a safe, low-cost evolutionary paradigm for industry.
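The three stages described in the abstract can be illustrated with a minimal, toy-sized sketch. It is not the authors' implementation. Every name in it (`legacy_model`, `find_hard_region`, `expert`, `fit_expert`, `boosted_predict`) is hypothetical, and so are the data. The single-threshold hard region, the linear expert (standing in for LLM-generated symbolic code), the coarse grid search (standing in for Bayesian optimization), and the hard 0/1 gate (standing in for the lightweight aggregator) are all simplifying assumptions.

```python
from statistics import mean


def legacy_model(x):
    """Frozen legacy predictor (hypothetical stand-in for a trained GBDT)."""
    return 2.0 * x


def find_hard_region(xs, ys, quantile=0.8):
    """Stage 1: locate the region where the legacy residuals are largest.

    Assumption: the region is a single threshold x >= t, where t is the
    smallest x among samples whose |residual| reaches the given quantile.
    """
    abs_res = [abs(y - legacy_model(x)) for x, y in zip(xs, ys)]
    cutoff = sorted(abs_res)[int(quantile * len(abs_res))]
    return min(x for x, r in zip(xs, abs_res) if r >= cutoff)


def expert(x, a, b):
    """Stage 2a: symbolic expert. The linear form is an assumed stand-in
    for a code structure that an LLM would propose."""
    return a * x + b


def fit_expert(xs, residuals):
    """Stage 2b: tune the expert's parameters on the hard-region residuals.

    A coarse grid search is used here in place of Bayesian optimization.
    Starting from (0, 0) means the expert never fits worse than no correction.
    """
    best, best_loss = (0.0, 0.0), mean(r ** 2 for r in residuals)
    for i in range(11):          # a in 0.0, 0.5, ..., 5.0
        for j in range(31):      # b in 0.0, -0.1, ..., -3.0
            a, b = 0.5 * i, -0.1 * j
            loss = mean((r - expert(x, a, b)) ** 2
                        for x, r in zip(xs, residuals))
            if loss < best_loss:
                best, best_loss = (a, b), loss
    return best


def boosted_predict(x, threshold, params):
    """Stage 3: add the expert's correction only inside the hard region.
    The legacy model itself is never modified."""
    gate = 1.0 if x >= threshold else 0.0
    return legacy_model(x) + gate * expert(x, *params)


def mse(preds, ys):
    return mean((p - y) ** 2 for p, y in zip(preds, ys))


# Toy data: the legacy model is exact up to x = 0.7 and underpredicts beyond it.
xs = [i / 100 for i in range(100)]
ys = [2.0 * x + max(0.0, 3.0 * (x - 0.7)) for x in xs]

threshold = find_hard_region(xs, ys)
region = [(x, y - legacy_model(x)) for x, y in zip(xs, ys) if x >= threshold]
params = fit_expert([x for x, _ in region], [r for _, r in region])

legacy_mse = mse([legacy_model(x) for x in xs], ys)
boosted_mse = mse([boosted_predict(x, threshold, params) for x in xs], ys)
print(f"legacy MSE={legacy_mse:.4f}  boosted MSE={boosted_mse:.4f}")
```

Because the correction is additive and gated, predictions outside the hard region are identical to the legacy model's. This is one way to read the "non-intrusive" property the abstract emphasizes.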