JPEG (Joint Photographic Experts Group) is a widely used image compression standard designed for reducing the file size of digital images while preserving a reasonable level of visual quality. The JPEG standard employs lossy compression techniques, meaning that some amount of image information is discarded during the compression process. This compression method is particularly effective for photographic images with natural scenes and gradients.

Posts

A Perspective on Deep Vision Performance with Standard Image and Video Codecs

Resource-constrained hardware such as edge devices or cell phones often rely on cloud servers to provide the required computational resources for inference in deep vision models. However transferring image and video data from an edge or mobile device to a cloud server requires coding to deal with network constraints. The use of standardized codecs such as JPEG or H.264 is prevalent and required to ensure interoperability. This paper aims to examine the implications of employing standardized codecs within deep vision pipelines. We find that using JPEG and H.264 coding significantly deteriorates the accuracy across a broad range of vision tasks and models. For instance strong compression rates reduce semantic segmentation accuracy by more than 80% in mIoU. In contrast to previous findings our analysis extends beyond image and action classification to localization and dense prediction tasks thus providing a more comprehensive perspective.

Differentiable JPEG: The Devil is in The Details

JPEG remains one of the most widespread lossy image coding methods. However, the non-differentiable nature of JPEG restricts the application in deep learning pipelines. Several differentiable approximations of JPEG have recently been proposed to address this issue. This paper conducts a comprehensive review of existing diff. JPEG approaches and identifies critical details that have been missed by previous methods. To this end, we propose a novel diff. JPEG approach, overcoming previous limitations. Our approach is differentiable w.r.t. the input image, the JPEG quality, the quantization tables, and the color conversion parameters. We evaluate the forward and backward performance of our diff. JPEG approach against existing methods. Additionally, extensive ablations are performed to evaluate crucial design choices. Our proposed diff. JPEG resembles the (non-diff.) reference implementation best, significantly surpassing the recent-best diff. approach by 3.47dB (PSNR) on average. For strong compression rates, we can even improve PSNR by 9.51dB. Strong adversarial attack results are yielded by our diff. JPEG, demonstrating the effective gradient approximation. Our code is available at https://github.com/necla-ml/Diff-JPEG.