UniComp: Rethinking Video Compression Through Informational Uniqueness
By: Chao Yuan, Shimin Chen, Minliang Lin, and more
Potential Business Impact:
Makes videos smaller while keeping important parts clear.
Distinct from attention-based compression methods, this paper presents an information-uniqueness-driven video compression framework, termed UniComp, which aims to maximize the information fidelity of video representations under constrained computational budgets. From an information-theoretic perspective, we formulate visual token compression as an optimization problem that minimizes the conditional entropy (reconstruction error) between the retained tokens and the full token set. To achieve this, we introduce the notion of information uniqueness, which measures intrinsic redundancy among tokens and links it to reconstruction error. Based on uniqueness, we design three modules: Frame Group Fusion, Token Allocation, and Spatial Dynamic Compression. These progressively perform semantic frame grouping, adaptive resource allocation, and fine-grained spatial compression. Extensive experiments demonstrate that UniComp consistently outperforms existing compression methods in preserving essential visual tokens under limited computational budgets, highlighting the pivotal role of information uniqueness in token compression efficacy.
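To give a concrete sense of uniqueness-driven token selection, below is a minimal illustrative sketch, not the paper's actual method: it scores each token's uniqueness as one minus its maximum cosine similarity to any other token (so near-duplicate tokens score low), then keeps the highest-scoring tokens under a fixed budget. The function names and the specific scoring rule are assumptions for illustration only.

```python
import numpy as np

def uniqueness_scores(tokens: np.ndarray) -> np.ndarray:
    """Score each token by how distinct it is from all others.

    Illustrative proxy (not the paper's exact formulation): uniqueness is
    1 minus a token's maximum cosine similarity to any other token. A token
    that nearly duplicates another scores near 0; an outlier scores near 1.
    """
    # L2-normalize rows so dot products are cosine similarities.
    normed = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # ignore self-similarity
    return 1.0 - sim.max(axis=1)

def compress_tokens(tokens: np.ndarray, budget: int) -> np.ndarray:
    """Return the sorted indices of the `budget` most unique tokens."""
    scores = uniqueness_scores(tokens)
    keep = np.argsort(scores)[::-1][:budget]
    return np.sort(keep)

# Toy example: 4 random tokens plus 2 near-duplicates of the first two.
rng = np.random.default_rng(0)
base = rng.normal(size=(4, 8))
tokens = np.vstack([base, base[:2] + 1e-3 * rng.normal(size=(2, 8))])
kept = compress_tokens(tokens, budget=4)
print(kept)  # the non-duplicated tokens (indices 2 and 3) survive compression
```

Under this proxy, the redundant duplicate pairs compete for the remaining budget while the genuinely distinct tokens are always retained, which mirrors the abstract's goal of minimizing reconstruction error between retained and full tokens.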
Similar Papers
Towards Lossless Ultimate Vision Token Compression for VLMs
CV and Pattern Recognition
Makes AI understand pictures much faster.
VideoCompressa: Data-Efficient Video Understanding via Joint Temporal Compression and Spatial Reconstruction
CV and Pattern Recognition
Makes AI learn from videos using way less data.
Can Visual Input Be Compressed? A Visual Token Compression Benchmark for Large Multimodal Models
CV and Pattern Recognition
Makes AI understand pictures faster and better.