History Is Not Enough: An Adaptive Dataflow System for Financial Time-Series Synthesis
By: Haochong Xia, Yao Long Teng, Regan Tan and more
In quantitative finance, the gap between training and real-world performance, driven by concept drift and distributional non-stationarity, remains a critical obstacle to building reliable data-driven systems. Models trained on static historical data often overfit and generalize poorly in dynamic markets. The mantra "History Is Not Enough" underscores the need for adaptive data generation that learns to evolve with the market rather than relying solely on past observations. We present a drift-aware dataflow system that integrates machine learning-based adaptive control into the data curation process. The system couples a parameterized data manipulation module, comprising single-stock transformations, multi-stock mix-ups, and curation operations, with an adaptive planner-scheduler that controls the module through gradient-based bi-level optimization. This design unifies data augmentation, curriculum learning, and data workflow management under a single differentiable framework, enabling provenance-aware replay and continuous data quality monitoring. Extensive experiments on forecasting and reinforcement learning trading tasks demonstrate that our framework enhances model robustness and improves risk-adjusted returns. The system provides a generalizable approach to adaptive data management and learning-guided workflow automation for financial data.
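To make the idea of a parameterized data manipulation module concrete, the following is a minimal sketch, not the authors' implementation. It assumes returns are stored as a NumPy array of shape (n_stocks, n_steps) and uses a single hypothetical control parameter, `strength`, to stand in for what the planner-scheduler would adjust. The function names `jitter`, `mixup`, and `augment`, the noise model, and the mixing rule are all illustrative assumptions; the abstract does not specify the actual operations.

```python
import numpy as np


def jitter(returns, sigma, rng):
    """Single-stock transformation (assumed form): add Gaussian noise."""
    return returns + rng.normal(0.0, sigma, size=returns.shape)


def mixup(returns_a, returns_b, lam):
    """Multi-stock mix-up (assumed form): convex combination of two series."""
    return lam * returns_a + (1.0 - lam) * returns_b


def augment(returns, strength, rng):
    """Apply mix-up, then jitter, to an array of shape (n_stocks, n_steps).

    `strength` in [0, 1] is a hypothetical knob: 0 leaves the data
    unchanged, larger values mix and perturb more aggressively.
    """
    n_stocks = returns.shape[0]
    # Pair each stock with a randomly chosen partner stock.
    partner = rng.permutation(n_stocks)
    # Per-stock mixing weight in [1 - strength / 2, 1].
    lam = 1.0 - strength * rng.uniform(0.0, 0.5, size=(n_stocks, 1))
    mixed = mixup(returns, returns[partner], lam)
    # Noise scale grows with strength and with the data's own volatility.
    return jitter(mixed, sigma=strength * returns.std(), rng=rng)


# Usage: synthetic daily returns for 4 stocks over 32 steps.
rng = np.random.default_rng(0)
history = rng.normal(0.0, 0.01, size=(4, 32))
synthetic = augment(history, strength=0.5, rng=rng)
```

In the system described above, parameters such as `strength` would be tuned by the gradient-based bi-level optimization rather than fixed by hand; this sketch shows only the data-manipulation side and leaves the control loop out.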
Similar Papers
Adaptive Information Routing for Multimodal Time Series Forecasting
Machine Learning (CS)
Helps predict prices by reading news.
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI
Machine Learning (CS)
Makes AI learn better from data.
Causify DataFlow: A Framework For High-performance Machine Learning Stream Computing
Machine Learning (CS)
Makes computer programs work the same, always.