Latent Space Data Fusion Outperforms Early Fusion in Multimodal Mental Health Digital Phenotyping Data
By: Youcef Barkat, Dylan Hamitouche, Deven Parekh and more
Potential Business Impact:
Helps doctors predict depression using phone data.
Background: Mental illnesses such as depression and anxiety require improved methods for early detection and personalized intervention. Traditional predictive models often rely on unimodal data or on early fusion strategies that fail to capture the complex, multimodal nature of psychiatric data. Advanced integration techniques, such as intermediate (latent space) fusion, may offer better accuracy and clinical utility.
Methods: Using data from the BRIGHTEN clinical trial, we evaluated intermediate (latent space) fusion for predicting daily depressive symptoms (PHQ-2 scores). We compared early fusion, implemented with a Random Forest (RF) model, against intermediate fusion, implemented via a Combined Model (CM) that uses autoencoders and a neural network. The dataset included behavioral (smartphone-based), demographic, and clinical features. Experiments were conducted across multiple temporal splits and data stream combinations. Performance was evaluated using mean squared error (MSE) and the coefficient of determination (R²).
Results: The CM outperformed both the RF and Linear Regression (LR) baselines across all setups, achieving lower MSE (0.4985 vs. 0.5305 with RF) and higher R² (0.4695 vs. 0.4356 with RF). The RF model showed signs of overfitting, with a large gap between training and test performance, while the CM generalized consistently. The CM performed best when it integrated all data modalities, which was not the case for RF, underscoring the value of latent space fusion for capturing non-linear interactions in complex psychiatric datasets.
Conclusion: Latent space fusion offers a robust alternative to traditional fusion methods for prediction with multimodal mental health data. Future work should explore model interpretability and individual-level prediction for clinical deployment.
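To make the contrast in the Methods concrete, the following is a minimal structural sketch of early versus latent space fusion, together with the two reported metrics. It is not the authors' implementation: the helper names (`early_fusion`, `latent_fusion`, `mean_encoder`), the feature values, and the use of a simple averaging function in place of a trained autoencoder encoder are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between observed and predicted scores."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def early_fusion(sample: Dict[str, Vector]) -> Vector:
    """Early fusion: concatenate the raw features of every modality
    into one vector that a single model (e.g. a Random Forest) consumes."""
    fused: Vector = []
    for name in sorted(sample):  # fixed modality order
        fused.extend(sample[name])
    return fused


def latent_fusion(
    sample: Dict[str, Vector],
    encoders: Dict[str, Callable[[Vector], Vector]],
) -> Vector:
    """Latent space fusion: encode each modality separately, then
    concatenate the latent codes for a downstream predictor."""
    fused: Vector = []
    for name in sorted(sample):
        fused.extend(encoders[name](sample[name]))
    return fused


def mean_encoder(x: Vector) -> Vector:
    """Hypothetical stand-in for a trained autoencoder's encoder:
    compresses a modality to a single latent value (its mean)."""
    return [sum(x) / len(x)]


# Made-up feature values for one participant-day (illustration only).
sample = {
    "behavioral": [1.0, 2.0, 3.0, 4.0],
    "clinical": [1.0, 3.0],
    "demographic": [0.0, 1.0],
}
encoders = {name: mean_encoder for name in sample}

early = early_fusion(sample)              # 8 raw features
latent = latent_fusion(sample, encoders)  # 3 latent features

# Made-up PHQ-2-style targets and predictions to exercise the metrics.
y_true = [0.0, 1.0, 2.0, 3.0]
y_pred = [1.0, 1.0, 2.0, 2.0]
print(len(early), len(latent), mse(y_true, y_pred), r2(y_true, y_pred))
```

In a setup closer to the one described, each encoder would presumably be learned per modality and the concatenated latent vector passed to a neural network regressor, whereas the early fusion vector would go directly to the Random Forest.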
Similar Papers
Latent Sensor Fusion: Multimedia Learning of Physiological Signals for Resource-Constrained Devices
Signal Processing
Lets computers understand many body signals together.
Meta Fusion: A Unified Framework For Multimodality Fusion with Mutual Learning
Machine Learning (CS)
Combines different data to make better predictions.
Cross-Modal Temporal Fusion for Financial Market Forecasting
Machine Learning (CS)
Predicts stock prices better using different data.