Machines Serve Human: A Novel Variable Human-machine Collaborative Compression Framework
By: Zifu Zhang, Shengxi Li, Xiancheng Sun and more
Potential Business Impact:
Makes pictures smaller for people and computers.
Human-machine collaborative compression has attracted increasing research effort as a way to reduce image/video data while serving both human perception and machine intelligence. Existing collaborative methods are predominantly built upon the de facto human-vision compression pipeline, and they suffer from high complexity and high bit-rates when machine-vision compression is aggregated on top of it. Indeed, machine vision focuses solely on the core regions within the image/video and requires much less information than the compressed information needed for human vision. In this paper, we thus make the first successful attempt at a novel collaborative compression method based on machine-vision-oriented compression, instead of the human-vision pipeline. In other words, machine vision serves as the basis for human vision within collaborative compression. A plug-and-play variable bit-rate strategy is also developed for machine vision tasks. We then propose to progressively aggregate the semantics from the machine-vision compression, whilst seamlessly tailing the diffusion prior to restore high-fidelity details for human vision; the resulting method is named diffusion-prior based feature compression for human and machine visions (Diff-FCHM). Experimental results verify the consistently superior performance of Diff-FCHM on both machine-vision and human-vision compression, by remarkable margins. Our code will be released upon acceptance.
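The abstract does not say how the variable bit-rate strategy is implemented, so the following is only a generic illustration of what "variable bit-rate" means for feature compression, not the Diff-FCHM method. It assumes a hypothetical setup in which features are Gaussian samples and the rate is controlled by a single uniform quantization step; the function names (quantize, rate_distortion) are invented for this sketch. A coarser step lowers the estimated bits per symbol at the cost of higher reconstruction error.

```python
import math
import random
from collections import Counter


def empirical_entropy_bits(symbols):
    """Shannon entropy (bits per symbol) of a sequence of discrete symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def quantize(features, step):
    """Uniform scalar quantization; a larger step gives coarser symbols."""
    return [round(x / step) for x in features]


def dequantize(symbols, step):
    """Map integer symbols back to approximate feature values."""
    return [s * step for s in symbols]


def rate_distortion(features, step):
    """Return (estimated bits per symbol, mean squared error) for one step."""
    symbols = quantize(features, step)
    recon = dequantize(symbols, step)
    rate = empirical_entropy_bits(symbols)
    mse = sum((x - y) ** 2 for x, y in zip(features, recon)) / len(features)
    return rate, mse


# Hypothetical stand-in for machine-vision features (assumption: unit Gaussian).
rng = random.Random(0)
features = [rng.gauss(0.0, 1.0) for _ in range(10000)]

# One set of features, three operating points selected by the step size alone.
points = {step: rate_distortion(features, step) for step in (0.25, 1.0, 4.0)}
for step, (rate, mse) in points.items():
    print(f"step={step}: rate={rate:.2f} bits/symbol, mse={mse:.4f}")
```

In this toy setting the step size acts as the single knob that trades rate against distortion, which is the behaviour a variable bit-rate scheme is meant to expose without retraining a separate model per rate.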
Similar Papers
Diff-ICMH: Harmonizing Machine and Human Vision in Image Compression with Generative Prior
CV and Pattern Recognition
Makes pictures look good and computers understand them.
Guided Diffusion for the Extension of Machine Vision to Human Visual Perception
CV and Pattern Recognition
Makes pictures good for people and computers.
Emerging Standards for Machine-to-Machine Video Coding
CV and Pattern Recognition
Lets computers share video data faster and safer.