CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective
By: Zongheng Tang, Yi Liu, Yifan Sun and more
Potential Business Impact:
Lets cars see around corners together.
Collaborative perception shares information among different agents and helps solve problems that individual agents face on their own, e.g., occlusions and a small sensing range. Prior methods usually separate multi-agent fusion and multi-time fusion into two consecutive steps. In contrast, this paper proposes an efficient collaborative perception method that aggregates observations from different agents (space) and different times into a unified spatio-temporal space simultaneously. The unified spatio-temporal space brings two benefits, i.e., efficient feature transmission and superior feature fusion. 1) Efficient feature transmission: each static object yields a single observation in the spatio-temporal space and thus requires transmission only once (whereas prior methods re-transmit all object features multiple times). 2) Superior feature fusion: merging multi-agent and multi-time fusion into a unified spatio-temporal aggregation enables a more holistic perspective, thereby enhancing perception performance in challenging scenarios. Consequently, our Collaborative perception with Spatio-temporal Transformer (CoST) gains improvement in both efficiency and accuracy. Notably, CoST is not tied to any specific method and is compatible with a majority of previous methods, enhancing their accuracy while reducing transmission bandwidth.
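The bandwidth claim can be illustrated with a toy counting sketch (this is not the paper's implementation; object IDs and the "send once" policy are simplified assumptions): under per-frame fusion, every observed object feature is re-transmitted each frame, while in a unified spatio-temporal space a static object corresponds to a single observation and is transmitted once.

```python
def total_transmissions(object_tracks, unified=True):
    """Count feature transmissions over a sequence of frames.

    object_tracks: list of per-frame lists of object ids observed.
    unified=False -> prior-style fusion: every observation is re-sent each frame.
    unified=True  -> unified spatio-temporal space: each static object's
                     feature is sent only once (toy assumption).
    """
    if not unified:
        # Frame-by-frame fusion: all observed features re-sent every frame.
        return sum(len(frame) for frame in object_tracks)
    sent = set()
    count = 0
    for frame in object_tracks:
        # Only objects not yet transmitted need to be sent.
        new = [obj for obj in frame if obj not in sent]
        count += len(new)
        sent.update(new)
    return count

# Example: objects A and B are static across 3 frames; C appears in frame 2.
tracks = [["A", "B"], ["A", "B", "C"], ["A", "B", "C"]]
print(total_transmissions(tracks, unified=False))  # 8 (2 + 3 + 3 re-sends)
print(total_transmissions(tracks, unified=True))   # 3 (A, B, C sent once each)
```

Dynamic objects would still require updated observations, but the sketch captures why static scene content dominates the savings.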
Similar Papers
CoCMT: Communication-Efficient Cross-Modal Transformer for Collaborative Perception
Machine Learning (CS)
Lets robots see better by sharing smart information.
SparseCoop: Cooperative Perception with Kinematic-Grounded Queries
CV and Pattern Recognition
Cars share data to see around corners.
MCOP: Multi-UAV Collaborative Occupancy Prediction
CV and Pattern Recognition
Drones see better together, even hidden things.