UnCageNet: Tracking and Pose Estimation of Caged Animal
By: Sayak Dutta, Harish Katti, Shashikant Verma and more
Potential Business Impact:
Clears cages from animal videos for better tracking.
Animal tracking and pose estimation systems, such as STEP (Simultaneous Tracking and Pose Estimation) and ViTPose, experience substantial performance drops when processing images and videos that contain cage structures and systematic occlusions. We present a three-stage preprocessing pipeline that addresses this limitation through: (1) cage segmentation using a Gabor-enhanced ResNet-UNet architecture with tunable orientation filters, (2) cage inpainting using CRFill for content-aware reconstruction of occluded regions, and (3) evaluation of pose estimation and tracking on the uncaged frames. Our Gabor-enhanced segmentation model uses orientation-aware features from 72 directional kernels to accurately identify and segment the cage structures that severely impair existing methods. Experimental validation demonstrates that removing cage occlusions through our pipeline enables pose estimation and tracking performance comparable to that achieved in occlusion-free environments. We also observe significant improvements in keypoint detection accuracy and trajectory consistency.
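To illustrate what a bank of 72 directional kernels can look like, the following is a minimal NumPy sketch of a Gabor filter bank. It is not the authors' implementation: the abstract states only the number of kernels, so the even spacing of orientations over [0, π), the kernel size, and the `sigma`, `lambd`, `gamma`, and `psi` values are all assumptions, and the function names `gabor_kernel` and `gabor_bank` are hypothetical.

```python
import numpy as np


def gabor_kernel(theta, ksize=15, sigma=3.0, lambd=6.0, gamma=0.5, psi=0.0):
    """Real-valued Gabor kernel oriented at angle `theta` (radians).

    All default parameter values are illustrative assumptions,
    not values taken from the paper.
    """
    half = ksize // 2
    # Pixel coordinates centred on the kernel.
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    # Rotate coordinates so the sinusoidal carrier varies along theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    # Gaussian envelope (elongated by gamma) times cosine carrier.
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / lambd + psi)
    kernel = envelope * carrier
    # Subtract the mean so flat image regions give zero response.
    return kernel - kernel.mean()


def gabor_bank(n_orientations=72, **kwargs):
    """Stack of kernels at evenly spaced orientations in [0, pi).

    Even spacing is an assumption; with 72 kernels it gives 2.5-degree steps.
    """
    thetas = np.linspace(0.0, np.pi, n_orientations, endpoint=False)
    kernels = np.stack([gabor_kernel(t, **kwargs) for t in thetas])
    return kernels, thetas


bank, thetas = gabor_bank()
print(bank.shape)  # (72, 15, 15)
```

In a segmentation network, a stack like `bank` could initialise the weights of a convolutional layer so that each output channel responds to thin bars at one orientation, which is the kind of structure cage wires present. How the paper's tunable orientation filters are actually integrated into the ResNet-UNet is not specified in the abstract.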
Similar Papers
STEP: Simultaneous Tracking and Estimation of Pose for Animals and Humans
CV and Pattern Recognition
Tracks and guesses body parts of animals and people.
MitUNet: Enhancing Floor Plan Recognition using a Hybrid Mix-Transformer and U-Net Architecture
CV and Pattern Recognition
Builds 3D rooms from flat blueprints accurately.
WALDO: Where Unseen Model-based 6D Pose Estimation Meets Occlusion
CV and Pattern Recognition
Helps robots see objects even when they're partly hidden.