Improvement of Human-Object Interaction Action Recognition Using Scene Information and Multi-Task Learning Approach
By: Hesham M. Shehata, Mohammad Abdolrahmani
Potential Business Impact:
Helps computers see people using objects.
Recent graph convolutional neural networks (GCNs) have shown high performance in human action recognition by using human skeleton poses. However, they fail to detect human-object interaction cases reliably due to the lack of an effective representation of scene information and of an appropriate learning architecture. In this context, we propose a methodology to improve human action recognition performance by incorporating information about fixed objects in the environment and following a multi-task learning approach. To evaluate the proposed method, we collected real data from public environments and prepared a data set that includes interaction classes of hands on fixed objects (e.g., ATM ticketing machines and check-in/out machines) and non-interaction classes of walking and standing. The multi-task learning approach, combined with interaction-area information, recognizes the studied interaction and non-interaction actions with an accuracy of 99.25%, outperforming the base model that uses only human skeleton poses by 2.75%.
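The sketch below illustrates the general idea of combining a shared pose encoder with scene (interaction-area) features and two task heads. It is a minimal sketch in PyTorch, assuming a simplified dense encoder in place of the paper's GCN; the layer sizes, the interaction-area feature vector (e.g., distances of hand joints to fixed-object regions), and the choice of the second task (an interaction/non-interaction flag) are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal multi-task sketch: shared pose encoder + scene branch + two heads.
# Assumptions: dense layers stand in for the GCN; scene features are a small
# vector describing fixed-object interaction areas near the person.
import torch
import torch.nn as nn

class MultiTaskActionNet(nn.Module):
    def __init__(self, num_joints=17, coords=2, scene_dim=4,
                 num_actions=4, hidden=128):
        super().__init__()
        # Shared encoder over flattened skeleton poses (stand-in for a GCN).
        self.pose_encoder = nn.Sequential(
            nn.Linear(num_joints * coords, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Scene branch: fixed-object interaction-area features.
        self.scene_encoder = nn.Sequential(nn.Linear(scene_dim, 32), nn.ReLU())
        fused = hidden + 32
        # Task 1: action class (e.g., ATM use, ticketing, walking, standing).
        self.action_head = nn.Linear(fused, num_actions)
        # Task 2: binary interaction vs. non-interaction flag.
        self.interaction_head = nn.Linear(fused, 2)

    def forward(self, pose, scene):
        z = torch.cat([self.pose_encoder(pose), self.scene_encoder(scene)], dim=-1)
        return self.action_head(z), self.interaction_head(z)

# Joint loss over both tasks, as in a standard multi-task training setup.
model = MultiTaskActionNet()
pose = torch.randn(8, 17 * 2)   # batch of flattened skeletons
scene = torch.randn(8, 4)       # batch of interaction-area features
action_logits, inter_logits = model(pose, scene)
action_y = torch.randint(0, 4, (8,))
inter_y = torch.randint(0, 2, (8,))
loss = nn.functional.cross_entropy(action_logits, action_y) \
     + 0.5 * nn.functional.cross_entropy(inter_logits, inter_y)
loss.backward()
```

The auxiliary interaction/non-interaction head and the 0.5 task weight are placeholders; in practice the second task and its weighting would follow whatever the paper's multi-task setup specifies.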
Similar Papers
Label-Efficient Skeleton-based Recognition with Stable-Invertible Graph Convolutional Networks
CV and Pattern Recognition
Teaches computers to recognize actions with less data.
Active Learning for GCN-based Action Recognition
CV and Pattern Recognition
Teaches computers to recognize actions with less training.