SEED: Sound Event Early Detection via Evidential Uncertainty

Publication Date: 5/27/2022

Event: 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2022), Singapore (virtual paper presentations)

Reference: pp. 3618-3622, 2022

Authors: Xujiang Zhao (NEC Laboratories America, Inc.; University of Texas at Dallas), Xuchao Zhang (NEC Laboratories America, Inc.), Wei Cheng (NEC Laboratories America, Inc.), Wenchao Yu (NEC Laboratories America, Inc.), Yuncong Chen (NEC Laboratories America, Inc.), Haifeng Chen (NEC Laboratories America, Inc.), Feng Chen (University of Texas at Dallas)

Abstract: Sound Event Early Detection (SEED) is an essential task in recognizing acoustic environments and soundscapes. However, most existing methods focus on offline sound event detection, suffer from over-confidence in early-stage event detection, and usually yield unreliable results. To solve this problem, we propose a novel Polyphonic Evidential Neural Network (PENet) that models the evidential uncertainty of the class probability with a Beta distribution. Specifically, we use a Beta distribution to model the distribution of class probabilities, and the evidential uncertainty enriches the uncertainty representation with evidence information, which plays a central role in reliable prediction. To further improve event detection performance, we design a backtrack inference method that utilizes both the forward and backward audio features of an ongoing event. Experiments on the DESED database show that the proposed method simultaneously improves time delay by 13.0% and detection F1 score by 3.8% compared to state-of-the-art methods.
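
Illustrative sketch (not the authors' implementation): the abstract's idea of modeling each class probability with a Beta distribution can be illustrated with the standard subjective-logic reading of a Beta(alpha, beta), where alpha = 1 + positive evidence and beta = 1 + negative evidence. The expected class probability is alpha / (alpha + beta), and the uncertainty from lack of evidence (often called vacuity) is 2 / (alpha + beta). The names `BetaOpinion`, `opinion_from_evidence`, and `detect`, as well as both thresholds, are hypothetical and chosen only for this example; how PENet actually produces evidence and makes decisions is not specified in the abstract.

```python
from dataclasses import dataclass


@dataclass
class BetaOpinion:
    """Per-class Beta(alpha, beta) over the probability that an event is active."""
    alpha: float  # 1 + positive evidence (assumed parameterization)
    beta: float   # 1 + negative evidence (assumed parameterization)

    @property
    def expected_probability(self) -> float:
        # Mean of a Beta(alpha, beta) distribution.
        return self.alpha / (self.alpha + self.beta)

    @property
    def vacuity(self) -> float:
        # Subjective-logic uncertainty mass with prior weight W = 2:
        # close to 1 with no evidence, shrinking as evidence accumulates.
        return 2.0 / (self.alpha + self.beta)


def opinion_from_evidence(positive: float, negative: float) -> BetaOpinion:
    """Build a Beta opinion from non-negative evidence counts."""
    if positive < 0 or negative < 0:
        raise ValueError("evidence must be non-negative")
    return BetaOpinion(alpha=1.0 + positive, beta=1.0 + negative)


def detect(opinion: BetaOpinion,
           prob_threshold: float = 0.5,
           vacuity_threshold: float = 0.5) -> bool:
    """Hypothetical decision rule: report an event only when the expected
    probability is high AND there is enough evidence behind it."""
    return (opinion.expected_probability > prob_threshold
            and opinion.vacuity < vacuity_threshold)


# Early in an event: little evidence, so the estimate is not yet trusted.
early = opinion_from_evidence(positive=1.0, negative=0.0)
# Later in the event: more evidence has accumulated.
later = opinion_from_evidence(positive=9.0, negative=1.0)

print(detect(early), detect(later))
```

In this toy setting the early opinion has an expected probability of about 0.67 but a vacuity of about 0.67 as well, so the hypothetical rule withholds a detection; a plain probability threshold would have fired on it, which is the kind of early-stage over-confidence the abstract describes.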

Publication Link: