TimeSAE: Sparse Decoding for Faithful Explanations of Black-Box Time Series Models
By: Khalid Oublal, Quentin Bouniot, Qi Gan and more
As black-box models and pretrained models gain traction in time series applications, understanding and explaining their predictions becomes increasingly vital, especially in high-stakes domains where interpretability and trust are essential. However, most existing methods provide only in-distribution explanations and fail to generalize beyond the training support. In this work, we provide a framework for explaining black-box models on time series data through the dual lenses of Sparse Autoencoders (SAEs) and causality. We show that many current explanation methods are sensitive to distributional shifts, limiting their effectiveness in real-world scenarios. Building on Sparse Autoencoders, we introduce TimeSAE, a framework for black-box model explanation. We conduct extensive evaluations of TimeSAE on both synthetic and real-world time series datasets, comparing it to leading baselines. The results, supported by both quantitative metrics and qualitative insights, show that TimeSAE provides more faithful and robust explanations. Our code is available in an easy-to-use library, TimeSAE-Lib: https://anonymous.4open.science/w/TimeSAE-571D/.
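To make the SAE idea concrete, the sketch below shows a generic sparse autoencoder trained to reconstruct a black-box model's hidden activations through an overcomplete latent code with an L1 sparsity penalty. This is a minimal illustration of the underlying concept only, not the TimeSAE method or the TimeSAE-Lib API; all class names, dimensions, and hyperparameters here are hypothetical.

import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal sparse autoencoder: maps a model's hidden activations to an
    overcomplete, sparse latent code and reconstructs them. Illustrative
    sketch, not the TimeSAE-Lib implementation."""

    def __init__(self, d_model: int, d_latent: int):
        super().__init__()
        # d_latent >> d_model gives an overcomplete dictionary of latent features.
        self.encoder = nn.Linear(d_model, d_latent)
        self.decoder = nn.Linear(d_latent, d_model)

    def forward(self, h: torch.Tensor):
        z = torch.relu(self.encoder(h))   # non-negative codes; ReLU encourages sparsity
        h_hat = self.decoder(z)
        return h_hat, z

def sae_loss(h, h_hat, z, l1_coeff=1e-3):
    # Reconstruction fidelity plus an L1 penalty driving most latents to zero.
    recon = torch.mean((h - h_hat) ** 2)
    sparsity = l1_coeff * z.abs().mean()
    return recon + sparsity

# Usage: h would be activations collected from the black-box time series model.
sae = SparseAutoencoder(d_model=128, d_latent=1024)
h = torch.randn(32, 128)                  # a batch of hidden activations
h_hat, z = sae(h)
loss = sae_loss(h, h_hat, z)
loss.backward()

Because each reconstruction is forced through only a few active latent units, those units can serve as candidate interpretable features of the black-box model's behavior, which is the role SAEs play in explanation frameworks of this kind.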
Similar Papers
Sparse Autoencoders for Sequential Recommendation Models: Interpretation and Flexible Control
Information Retrieval
Explains why computers suggest what they do.
Sparse Autoencoders are Topic Models
CV and Pattern Recognition
Finds hidden themes in pictures and words.
Interpretable and Testable Vision Features via Sparse Autoencoders
CV and Pattern Recognition
Changes AI's understanding of pictures without retraining.