The receptive field of a unit in a neural network is the region of the input image or signal that can influence that unit's output. Each layer processes a local region of its input, so the receptive field grows as information propagates through deeper layers. A receptive field is global when it covers the entire input, so that every input location can influence a given output. The concept is used to gauge how much context a network can capture, especially in tasks like image recognition, where global context is crucial for understanding the relationships between different parts of an image.
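The growth of the receptive field with depth can be illustrated with a short calculation. The sketch below assumes a plain stack of convolution or pooling layers described only by kernel size and stride, and it ignores dilation and padding. The function names and the layer lists are hypothetical, chosen for illustration. It computes the theoretical receptive field along one spatial axis, which is an upper bound on the region that actually influences an output.

```python
def receptive_field(layers):
    """Theoretical receptive field (along one axis) of a stack of layers.

    layers: list of (kernel_size, stride) tuples, in input-to-output order.
    Assumes no dilation; padding does not change the receptive field size.
    """
    r = 1      # a single input element sees only itself
    jump = 1   # distance, in input elements, between adjacent units of the current layer
    for kernel_size, stride in layers:
        r += (kernel_size - 1) * jump
        jump *= stride
    return r


def is_global(layers, input_size):
    """True if the receptive field spans the whole input along this axis."""
    return receptive_field(layers) >= input_size


# Three 3x3 convolutions with stride 1 see a 7-element window.
print(receptive_field([(3, 1), (3, 1), (3, 1)]))
# A stride-2 layer in the middle makes later layers grow the field faster.
print(receptive_field([(3, 1), (3, 2), (3, 1)]))
```

Under these assumptions, a purely convolutional stack becomes global only once it is deep enough (or strided enough) for the computed size to reach the input size, which is one motivation for attention or recurrent units that cover the whole input in a single step.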


Channel Recurrent Attention Networks for Video Pedestrian Retrieval

Full attention, which generates an attention value per element of the input feature maps, has been successfully demonstrated to be beneficial in visual tasks. In this work, we propose a fully attentional network, termed channel recurrent attention network, for the task of video pedestrian retrieval. The main attention unit, channel recurrent attention, identifies attention maps at the frame level by jointly leveraging spatial and channel patterns via a recurrent neural network. This channel recurrent attention is designed to build a global receptive field by recurrently receiving and learning the spatial vectors. Then, a set aggregation cell is employed to generate a compact video representation. Experimental results demonstrate the superior performance of the proposed deep network, which outperforms current state-of-the-art results across standard video person retrieval benchmarks, and a thorough ablation study shows the effectiveness of the proposed units.
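To make the idea of "recurrently receiving the spatial vectors" more concrete, here is a rough, illustrative sketch. It is not the authors' architecture. It assumes that each channel of a frame-level feature map is flattened into one spatial vector, that these vectors are fed one at a time to a simple tanh recurrent cell, and that a sigmoid output layer produces one attention value per element. The function name, the hidden size, and the random untrained weights are all placeholders.

```python
import math
import random


def channel_recurrent_attention(feat, hidden_size=8, seed=0):
    """Illustrative sketch only (assumed design, untrained random weights).

    feat: list of C channels, each a flat list of H*W values (one spatial vector).
    Returns C attention vectors of length H*W with values in (0, 1).
    """
    rng = random.Random(seed)
    n = len(feat[0])  # H * W

    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    def matvec(m, v):
        return [sum(a * b for a, b in zip(row, v)) for row in m]

    w_in = rand_matrix(hidden_size, n)           # spatial vector -> hidden
    w_h = rand_matrix(hidden_size, hidden_size)  # hidden -> hidden
    w_out = rand_matrix(n, hidden_size)          # hidden -> attention logits

    h = [0.0] * hidden_size
    attention = []
    for spatial_vec in feat:  # one recurrent step per channel
        pre = [a + b for a, b in zip(matvec(w_in, spatial_vec), matvec(w_h, h))]
        h = [math.tanh(p) for p in pre]
        attention.append([1.0 / (1.0 + math.exp(-z)) for z in matvec(w_out, h)])
    return attention


# Toy feature map: 3 channels, 2x2 spatial grid flattened to 4 values each.
feat = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2], [0.9, 0.8, 0.7, 0.6]]
attn = channel_recurrent_attention(feat)
# Element-wise reweighting: one attention value per feature element.
attended = [[x * a for x, a in zip(c, m)] for c, m in zip(feat, attn)]
```

In this sketch every recurrent step reads a whole spatial vector, so each attention value depends on all spatial positions of the channels seen so far. That is the sense in which such a unit has a global spatial receptive field, in contrast to a convolution, which only sees a local window.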