Switchable Token-Specific Codebook Quantization For Face Image Compression
By: Yongbo Wang, Haonan Wang, Guodong Mu and more
Potential Business Impact:
Makes photos clearer with less data.
With the ever-increasing volume of visual data, efficient and lossless transmission, along with subsequent interpretation and understanding, has become a critical bottleneck in modern information systems. Emerging codebook-based solutions use a globally shared codebook to quantize and dequantize each token, controlling the bits per pixel (bpp) by adjusting the number of tokens or the codebook size. However, for facial images, which are rich in attributes, such global codebook strategies overlook both the category-specific correlations within images and the semantic differences among tokens, resulting in suboptimal performance, especially at low bpp. Motivated by these observations, we propose Switchable Token-Specific Codebook Quantization for face image compression, which learns distinct codebook groups for different image categories and assigns an independent codebook to each token. By recording the codebook group to which each token belongs with a small number of bits, our method reduces the loss incurred when decreasing the size of each codebook group. This enables a larger total number of codebooks under a lower overall bpp, thereby enhancing expressive capability and improving reconstruction performance. Owing to its generalizable design, our method can be integrated into any existing codebook-based representation learning approach. It has demonstrated its effectiveness on face recognition datasets, achieving an average accuracy of 93.51% for reconstructed images at 0.05 bpp.
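The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the array layout `(G, T, K, D)` (groups, tokens, entries per codebook, feature dimension), and the choice to pick one group per image by lowest total squared error are all assumptions made for illustration. The abstract's wording would also be compatible with recording a group per token, and it does not specify how groups are selected or trained.

```python
import numpy as np

def quantize(tokens, codebooks):
    """Pick the best codebook group, then the nearest entry per token.

    tokens:    (T, D) array, one feature vector per token position.
    codebooks: (G, T, K, D) array, i.e. G groups, each holding a separate
               K-entry codebook for every token position (assumed layout).
    Returns (group, indices) with indices of shape (T,).
    """
    # Squared distance from each token to every entry of its own codebook,
    # evaluated in every group.
    diff = codebooks - tokens[None, :, None, :]      # (G, T, K, D)
    dist = np.sum(diff ** 2, axis=-1)                # (G, T, K)
    indices = np.argmin(dist, axis=-1)               # (G, T)
    errors = np.min(dist, axis=-1).sum(axis=-1)      # (G,) total error per group
    group = int(np.argmin(errors))                   # assumed selection rule
    return group, indices[group]

def dequantize(group, indices, codebooks):
    """Look up each token's entry in its own codebook of the chosen group."""
    positions = np.arange(indices.shape[0])
    return codebooks[group, positions, indices]      # (T, D)

def bits_per_pixel(num_tokens, codebook_size, num_groups, height, width):
    """Index bits for all tokens plus one group index, divided by pixel count."""
    bits = num_tokens * np.log2(codebook_size) + np.log2(num_groups)
    return bits / (height * width)
```

Under these assumptions, the group index adds only `log2(G)` bits per image, while each token still costs `log2(K)` bits. This is one way to read the abstract's claim that the total number of codebooks can grow without a matching increase in bpp.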
Similar Papers
A Theoretically-Grounded Codebook for Digital Semantic Communications
Information Theory
Makes computers understand pictures better.
Codebook-Based Adaptive Feature Compression With Semantic Enhancement for Edge-Cloud Systems
CV and Pattern Recognition
Makes computers understand pictures with less data.