Publication Date: 4/12/2018
Authors: Ankan Bansal, University of Maryland, College Park; Karan Sikka, SRI International; Gaurav Sharma, NEC Laboratories America, Inc.; Rama Chellappa, University of Maryland, College Park; Ajay Divakaran, SRI International
Abstract: We introduce and tackle the problem of zero-shot object detection (ZSD), which aims to detect object classes that are not observed during training. We work with a challenging set of object classes, not restricting ourselves to similar and/or fine-grained categories, cf. prior work on zero-shot classification. We follow a principled approach by first adapting visual-semantic embeddings for ZSD. We then discuss the problems associated with selecting a background class and motivate two background-aware approaches for learning robust detectors. One of these models uses a fixed background class and the other is based on iterative latent assignments. We also outline the challenge associated with using a limited number of training classes and propose a solution based on dense sampling of the semantic label space using auxiliary data with a large number of categories. We propose novel splits of two standard detection datasets, MSCOCO and VisualGenome, and discuss extensive empirical results to highlight the benefits of the proposed methods. We provide useful insights into the algorithm and conclude by posing some open questions to encourage further research.
Publication Link: https://arxiv.org/pdf/1804.04340v1.pdf