Incremental Human-Object Interaction Detection with Invariant Relation Representation Learning
By: Yana Wei, Zeen Chi, Chongyu Wang and more
Potential Business Impact:
Helps robots learn new object actions over time.
In open-world environments, human-object interactions (HOIs) evolve continuously, challenging conventional closed-world HOI detection models. Inspired by humans' ability to progressively acquire knowledge, we explore incremental HOI detection (IHOID) to develop agents capable of discerning human-object relations in such dynamic environments. This setup confronts not only the common issue of catastrophic forgetting in incremental learning but also the distinct challenges of interaction drift and of detecting zero-shot HOI combinations from sequentially arriving data. We therefore propose a novel exemplar-free incremental relation distillation (IRD) framework. IRD decouples the learning of objects and relations, and introduces two distillation losses for learning invariant relation features across different HOI combinations that share the same relation. Extensive experiments on the HICO-DET and V-COCO datasets demonstrate the superiority of our method over state-of-the-art baselines in mitigating forgetting, strengthening robustness against interaction drift, and generalizing to zero-shot HOIs. Code is available at https://github.com/weiyana/ContinualHOI
Similar Papers
Learning Human-Object Interaction as Groups
CV and Pattern Recognition
Helps computers understand group actions, not just pairs.
HOID-R1: Reinforcement Learning for Open-World Human-Object Interaction Detection Reasoning with Multimodal Large Language Model
CV and Pattern Recognition
Helps robots understand what people do with things.
HOI-R1: Exploring the Potential of Multimodal Large Language Models for Human-Object Interaction Detection
CV and Pattern Recognition
Lets computers understand actions between people and things.