
Evaluating the Sensitivity of BiLSTM Forecasting Models to Sequence Length and Input Noise

Published: December 7, 2025 | arXiv ID: 2512.06926v1

By: Salma Albelali, Moataz Ahmed

Potential Business Impact:

Makes computer predictions more accurate by fixing bad data.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Deep learning (DL) models, a specialized class of multilayer neural networks, have become central to time-series forecasting in critical domains such as environmental monitoring and the Internet of Things (IoT). Among these, Bidirectional Long Short-Term Memory (BiLSTM) architectures are particularly effective in capturing complex temporal dependencies. However, the robustness and generalization of such models are highly sensitive to input data characteristics, an aspect that remains underexplored in the existing literature. This study presents a systematic empirical analysis of two key data-centric factors: input sequence length and additive noise. To support this investigation, a modular and reproducible forecasting pipeline is developed, incorporating standardized preprocessing, sequence generation, model training, validation, and evaluation. Controlled experiments are conducted on three real-world datasets with varying sampling frequencies to assess BiLSTM performance under different input conditions. The results yield three key findings: (1) longer input sequences significantly increase the risk of overfitting and data leakage, particularly in data-constrained environments; (2) additive noise consistently degrades predictive accuracy across sampling frequencies; and (3) the simultaneous presence of both factors results in the most substantial decline in model stability. While datasets with higher observation frequencies exhibit greater robustness, they remain vulnerable when both input challenges are present. These findings highlight important limitations in current DL-based forecasting pipelines and underscore the need for data-aware design strategies. This work contributes to a deeper understanding of DL model behavior in dynamic time-series environments and provides practical insights for developing more reliable and generalizable forecasting systems.
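The two data-centric factors named in the abstract, input sequence length and additive noise, can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the function names (`make_windows`, `add_gaussian_noise`, `chronological_split`), the choice of zero-mean Gaussian noise scaled by the series' standard deviation, the 80/20 split, and the synthetic sine series are all assumptions made for illustration. The sketch splits chronologically before windowing and standardizes with training statistics only, which is one common way to avoid leakage between the training and test portions.

```python
import numpy as np

def make_windows(series, seq_len, horizon=1):
    """Slice a 1-D series into (input window, target) pairs.

    Each input is `seq_len` consecutive values; the target is the value
    `horizon` steps after the end of the window.
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - seq_len - horizon + 1
    if n <= 0:
        raise ValueError("series too short for this seq_len and horizon")
    X = np.stack([series[i:i + seq_len] for i in range(n)])
    start = seq_len + horizon - 1
    y = series[start:start + n]
    return X, y

def add_gaussian_noise(series, noise_level, rng):
    """Add zero-mean Gaussian noise (assumed noise model).

    The noise standard deviation is `noise_level` times the series' own
    standard deviation.
    """
    series = np.asarray(series, dtype=float)
    sigma = noise_level * series.std()
    return series + rng.normal(0.0, sigma, size=series.shape)

def chronological_split(series, train_frac=0.8):
    """Split without shuffling so the test part lies strictly in the future."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

# Hypothetical data: a clean sine wave standing in for a real dataset.
rng = np.random.default_rng(0)
t = np.arange(500)
clean = np.sin(2 * np.pi * t / 50)

# Split first, then standardize with training statistics only.
train, test = chronological_split(clean)
mu, sd = train.mean(), train.std()
train_s = (train - mu) / sd
test_s = (test - mu) / sd

# Perturb the training inputs, then window at several sequence lengths.
noisy_train = add_gaussian_noise(train_s, noise_level=0.1, rng=rng)
for seq_len in (10, 50, 100):
    X, y = make_windows(noisy_train, seq_len)
    print(seq_len, X.shape, y.shape)
```

For a series of fixed length, a longer window leaves fewer training samples, so the number of windows shrinks as `seq_len` grows. Windowing each split separately also keeps any window from straddling the train/test boundary.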

Hidden Leaks in Time Series Forecasting: How Data Leakage Affects LSTM Evaluation Across Configurations and Validation Strategies

Machine Learning (CS)

Fixes computer predictions so they don't cheat.

7 Dec 2025

Parallel BiLSTM-Transformer networks for forecasting chaotic dynamics

Machine Learning (CS)

Predicts chaotic systems better than before.

27 Oct 2025

An Explainable, Attention-Enhanced, Bidirectional Long Short-Term Memory Neural Network for Joint 48-Hour Forecasting of Temperature, Irradiance, and Relative Humidity

Machine Learning (CS)

Predicts weather two days ahead for smart homes.

28 Aug 2025

Country of Origin

🇸🇦 Saudi Arabia

Page Count

15 pages
