Improving the Efficiency-Accuracy Trade-off of DETR-Style Models in Practice

Publication Date: 6/17/2024

Event: The 7th Workshop on Efficient Deep Learning for Computer Vision at CVPR 2024

Reference: pp. 1-5, 2024

Authors: Yumin Suh, NEC Laboratories America, Inc.; Dongwan Kim, NEC Laboratories America, Inc., Seoul National University; Abhishek Aich, NEC Laboratories America, Inc.; Samuel Schulter, NEC Laboratories America, Inc.; Jong-Chyi Su, NEC Laboratories America, Inc.; Bohyung Han, Seoul National University; Manmohan Chandraker, NEC Laboratories America, Inc.

Abstract: This report aims to provide a comprehensive view of the inference efficiency of DETR-style detection models. We analyze the effect of basic efficiency techniques and identify factors that are easy to apply yet effectively improve the efficiency-accuracy trade-off. Specifically, we explore the effect of input resolution, multi-scale feature enhancement, and backbone pre-training. Our experiments support that 1) improving the detection accuracy for smaller objects while minimizing the increase in inference cost is a good strategy for achieving a better trade-off between accuracy and efficiency, 2) multi-scale feature enhancement can be lightened with marginal accuracy loss, and 3) improved backbone pre-training can further enhance the trade-off.
