LeanStereo: A Leaner Backbone based Stereo Network
By: Rafia Rahim, Samuel Woerz, Andreas Zell
Potential Business Impact:
Makes 3D cameras faster and use less power.
Recently, end-to-end deep-network-based stereo matching methods have gained popularity, mainly because of their performance. However, this improvement in performance comes at the cost of increased computational and memory bandwidth requirements, necessitating specialized hardware (GPUs); even then, these methods have large inference times compared to classical methods. This limits their applicability in real-world applications. What we desire are high-accuracy stereo methods with reasonable inference time. To this end, we propose a fast end-to-end stereo matching method. The majority of the speedup comes from integrating a leaner backbone. To recover the performance lost because of the leaner backbone, we propose to use a cost volume based on learned attention weights, combined with a LogL1 loss for stereo matching. Using the LogL1 loss not only improves the overall performance of the proposed network but also leads to faster convergence. We perform a detailed empirical evaluation of different design choices and show that our method requires 4x fewer operations and is about 9 to 14x faster than state-of-the-art methods such as ACVNet [1], LEAStereo [2] and CFNet [3], while giving comparable performance.
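The abstract names a LogL1 loss but does not give its formula. The following is a minimal sketch of one plausible form, assuming the loss is the mean of log(1 + |predicted disparity - ground-truth disparity|) over valid pixels. The function name `log_l1_loss`, the `valid_mask` parameter, and the exact formula are illustrative assumptions, not the authors' confirmed definition.

```python
import numpy as np


def log_l1_loss(pred, target, valid_mask=None):
    """Mean of log(1 + |pred - target|) over valid pixels.

    ASSUMPTION: this form of the LogL1 loss is a guess for illustration;
    the abstract does not state the exact definition used in the paper.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)

    # Per-pixel absolute disparity error (the plain L1 term).
    abs_err = np.abs(pred - target)

    # log1p(x) = log(1 + x): zero at zero error, grows slowly for large errors.
    per_pixel = np.log1p(abs_err)

    if valid_mask is None:
        return float(per_pixel.mean())

    mask = np.asarray(valid_mask, dtype=bool)
    if not mask.any():
        # No valid ground-truth pixels: define the loss as zero.
        return 0.0
    return float(per_pixel[mask].mean())
```

Under this assumed form, the logarithm compresses large disparity errors relative to a plain L1 loss, so a few badly matched pixels contribute less to the average. Whether this is the mechanism behind the faster convergence reported above is not stated in the abstract.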
Similar Papers
Lite Any Stereo: Efficient Zero-Shot Stereo Matching
CV and Pattern Recognition
Makes computers see depth with less power.
Distilling Stereo Networks for Performant and Efficient Leaner Networks
CV and Pattern Recognition
Makes 3D cameras see depth faster and better.
Fast-FoundationStereo: Real-Time Zero-Shot Stereo Matching
CV and Pattern Recognition
Makes 3D cameras work super fast and smart.