Training-Free Dual Hyperbolic Adapters for Better Cross-Modal Reasoning
By: Yi Zhang, Chun-Wun Cheng, Junyi He, and more
Recent research on Vision-Language Models (VLMs) has significantly advanced cross-modal reasoning. However, existing methods either degrade under domain shift or require substantial computational resources for fine-tuning on new domains. To address this issue, we develop a new adaptation method for large vision-language models, called Training-free Dual Hyperbolic Adapters (T-DHA). We characterize the vision-language relationship between semantic concepts, which typically has a hierarchical tree structure, in hyperbolic space rather than the traditional Euclidean space. Hyperbolic spaces exhibit exponential volume growth with radius, unlike the polynomial growth of Euclidean space. We find that this property makes the Poincaré ball model particularly effective for embedding hierarchical data structures, yielding significantly improved representation and discrimination power. Coupled with negative learning, it provides more accurate and robust classification with fewer feature dimensions. Extensive experiments on various datasets demonstrate that T-DHA significantly outperforms existing state-of-the-art methods on few-shot image recognition and domain generalization tasks.
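To make the geometry concrete, here is a minimal, illustrative Python sketch of the Poincaré-ball distance the abstract refers to, applied to nearest-prototype classification. This is not the authors' implementation: the `poincare_distance` and `classify` helpers and the toy 2-D prototypes are assumptions chosen for illustration only.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points inside the unit Poincare ball:
    d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))."""
    sq_norm_u = np.sum(u * u)
    sq_norm_v = np.sum(v * v)
    sq_dist = np.sum((u - v) ** 2)
    denom = max((1.0 - sq_norm_u) * (1.0 - sq_norm_v), eps)
    return np.arccosh(1.0 + 2.0 * sq_dist / denom)

def classify(query, class_prototypes):
    """Assign a query embedding to the nearest class prototype under the
    hyperbolic metric, instead of the usual Euclidean/cosine distance."""
    dists = {name: poincare_distance(query, p)
             for name, p in class_prototypes.items()}
    return min(dists, key=dists.get)

# Toy hierarchy: coarse concepts sit near the origin, fine-grained leaves
# near the boundary, where hyperbolic volume grows exponentially.
protos = {
    "animal": np.array([0.05, 0.02]),
    "dog":    np.array([0.60, 0.55]),
    "cat":    np.array([0.58, -0.52]),
}
query = np.array([0.55, 0.50])
print(classify(query, protos))  # -> "dog"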
Similar Papers
Adapt-As-You-Walk Through the Clouds: Training-Free Online Test-Time Adaptation of 3D Vision-Language Foundation Models
CV and Pattern Recognition
Fixes 3D object recognition in messy data.
dVLM-AD: Enhance Diffusion Vision-Language-Model for Driving via Controllable Reasoning
CV and Pattern Recognition
Makes self-driving cars better at handling tricky situations.
Fine-Grained VLM Fine-tuning via Latent Hierarchical Adapter Learning
CV and Pattern Recognition
Teaches computers to learn new things faster.