Leveraging Bottom-Up and Top-Down Attention for Few-Shot Object Detection [preprint]
Preprint date
July 23, 2020
Authors
Xianyu Chen (Ph.D. student), Ming Jiang (postdoctoral researcher), Qi Zhao (assistant professor)
Abstract
Few-shot object detection aims to detect objects from only a few annotated examples, a challenging and still largely unexplored research problem. Recent studies have shown the effectiveness of self-learned top-down attention mechanisms in object detection and other vision tasks. Top-down attention, however, is less effective at improving the performance of few-shot detectors: with insufficient training data, object detectors cannot learn to generate reliable attention maps for few-shot examples. To improve the performance and interpretability of few-shot object detectors, we propose an attentive few-shot object detection network (AttFDNet) that takes advantage of both top-down and bottom-up attention. Because it is task-agnostic, bottom-up attention serves as a prior that helps detect and localize naturally salient objects. We further address specific challenges in few-shot object detection by introducing two novel loss terms and a hybrid few-shot learning strategy. Experimental results and visualizations demonstrate the complementary nature of the two types of attention and their roles in few-shot object detection.
Link to full paper
Leveraging Bottom-Up and Top-Down Attention for Few-Shot Object Detection
Keywords
computer vision