A Preprocessing Framework for Video Machine Vision under Compression
By: Fei Zhao, Mengxi Guo, Shijie Zhao, and more
Potential Business Impact:
Makes videos smaller for computers to understand.
There has been a growing trend toward compressing and transmitting videos from terminals for machine vision tasks. Nevertheless, most video coding optimization methods focus on minimizing distortion according to human perceptual metrics, overlooking the heightened demands posed by machine vision systems. In this paper, we propose a video preprocessing framework tailored for machine vision tasks to address this challenge. The proposed method incorporates a neural preprocessor that retains the information crucial for subsequent tasks, boosting rate-accuracy performance. We further introduce a differentiable virtual codec to impose rate and distortion constraints during training. At test time, we apply widely used standard codecs directly, so our solution can be easily deployed in real-world scenarios. We conducted extensive experiments evaluating our compression method on two typical downstream tasks with various backbone networks. The experimental results indicate that our approach saves over 15% of bitrate compared with the standard-codec anchor.
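To make the training setup concrete, below is a minimal sketch of how such a pipeline could be wired up: a neural preprocessor is optimized end-to-end through a differentiable "virtual codec" proxy, with a combined rate and task-accuracy loss. This is not the authors' code; the module names (`Preprocessor`, `VirtualCodec`), the noise-based quantization proxy, the rate estimate, and the loss weighting `lam` are all illustrative assumptions.

```python
# Hedged sketch of preprocessing-for-machine-vision training (assumed design, not the paper's code).
import torch
import torch.nn as nn

class Preprocessor(nn.Module):
    """Lightweight residual CNN applied to frames before encoding (illustrative)."""
    def __init__(self, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, x):
        # Residual filtering: keep content, suppress task-irrelevant detail.
        return x + self.body(x)

class VirtualCodec(nn.Module):
    """Differentiable stand-in for the standard codec, used only during training.
    Here: additive uniform noise mimics quantization error, and a crude
    magnitude-based term serves as a rate proxy (both are assumptions)."""
    def forward(self, x):
        noise = (torch.rand_like(x) - 0.5) / 255.0
        rec = x + noise
        rate = x.abs().mean()
        return rec, rate

def train_step(frames, labels, pre, codec, task_net, task_loss_fn, opt, lam=0.01):
    """One optimization step: preprocess -> virtual codec -> downstream task loss + rate penalty."""
    opt.zero_grad()
    filtered = pre(frames)            # preprocess before (virtual) compression
    rec, rate = codec(filtered)       # differentiable rate/distortion constraints
    loss = task_loss_fn(task_net(rec), labels) + lam * rate
    loss.backward()
    opt.step()
    return loss.item()
```

At test time, the trained preprocessor would sit in front of a real standard codec (e.g., an HEVC or VVC encoder), and the virtual codec is discarded; this mirrors the abstract's claim that only widely used standard codecs are needed for deployment.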
Similar Papers
Generative Preprocessing for Image Compression with Pre-trained Diffusion Models
Image and Video Processing
Makes pictures look better when they're made smaller.
Machines Serve Human: A Novel Variable Human-machine Collaborative Compression Framework
CV and Pattern Recognition
Makes pictures smaller for people and computers.
VideoCompressa: Data-Efficient Video Understanding via Joint Temporal Compression and Spatial Reconstruction
CV and Pattern Recognition
Makes AI learn from videos using way less data.