EGRA: Toward Enhanced Behavior Graphs and Representation Alignment for Multimodal Recommendation
By: Xiaoxiong Zhang, Xin Zhou, Zhiwei Zeng and more
Potential Business Impact:
Helps movie apps show you better movies.
MultiModal Recommendation (MMR) systems have emerged as a promising solution for improving recommendation quality by leveraging rich item-side modality information, prompting a surge of diverse methods. Despite these advances, existing methods still face two critical limitations. First, they use raw modality features to construct item-item links for enriching the behavior graph, while giving limited attention to balancing collaborative and modality-aware semantics or to mitigating modality noise in the process. Second, they apply a uniform alignment weight across all entities and keep the alignment strength fixed throughout training, which limits the effectiveness of modality-behavior alignment. To address these challenges, we propose EGRA. First, instead of relying on raw modality features, EGRA alleviates sparsity by augmenting the behavior graph with an item-item graph built from representations generated by a pretrained MMR model. This enables the graph to capture both collaborative patterns and modality-aware similarities, with enhanced robustness against modality noise. Second, EGRA introduces a novel bi-level dynamic alignment weighting mechanism to improve modality-behavior representation alignment: it dynamically assigns alignment strength across entities according to their alignment degree, while gradually increasing the overall alignment intensity throughout training. Extensive experiments on five datasets show that EGRA significantly outperforms recent methods, confirming its effectiveness.
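The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no formulas, so the top-k cosine graph, the rule that weakly aligned entities receive larger weights, the linear ramp over epochs, and every name here (`knn_item_graph`, `alignment_weights`, `max_strength`) are assumptions made purely for illustration.

```python
import numpy as np


def _unit_rows(x):
    """L2-normalise each row, guarding against zero vectors."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def knn_item_graph(item_emb, k):
    """Top-k cosine item-item adjacency (requires k <= n_items - 1).

    item_emb stands in for item representations exported from a
    pretrained MMR model; the choice of cosine top-k is an assumption.
    """
    unit = _unit_rows(item_emb)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-links
    n = sim.shape[0]
    # Indices of the k most similar items per row (order within the k is arbitrary).
    top = np.argpartition(-sim, k - 1, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), top.ravel()] = 1.0
    return adj


def alignment_weights(behavior_emb, modality_emb, epoch, total_epochs,
                      max_strength=1.0):
    """Bi-level weights: a per-entity term times a global schedule.

    Entity level (assumed): weight grows as cosine alignment drops.
    Training level (assumed): linear ramp from near 0 up to max_strength.
    """
    cos = np.sum(_unit_rows(behavior_emb) * _unit_rows(modality_emb), axis=1)
    entity_w = (1.0 - cos) / 2.0                      # in [0, 1]
    global_w = max_strength * (epoch + 1) / total_epochs
    return global_w * entity_w
```

In this sketch the returned weights would multiply a per-entity alignment loss, and the adjacency from `knn_item_graph` would be merged with the user-item behavior graph before message passing; how EGRA actually combines them is not stated in the abstract.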
Similar Papers
Semantic Item Graph Enhancement for Multimodal Recommendation
Information Retrieval
Helps online stores show you better stuff.
MMGraphRAG: Bridging Vision and Language with Interpretable Multimodal Knowledge Graphs
Artificial Intelligence
Helps computers understand pictures and words together better.
CEMG: Collaborative-Enhanced Multimodal Generative Recommendation
Information Retrieval
Helps online stores show you better stuff.