In smart sensing platforms, retrieval refers to obtaining and accessing relevant information or data from the environment through sensors. These platforms are equipped with sensors and related technologies that collect data to support informed decisions, task execution, and useful insights. Retrieval then means accessing and using the collected data to achieve specific goals or functions.


At the Speed of Sound: Efficient Audio Scene Classification

Efficient audio scene classification is essential for smart sensing platforms such as robots, medical monitors, surveillance systems, and autonomous vehicles. We propose a retrieval-based scene classification architecture that combines recurrent neural networks and attention to compute embeddings for short audio segments. We train our framework using a custom audio loss function that captures both the relevance of audio segments within a scene and the relevance of sound events within a segment. Using experiments on real audio scenes, we show that we can discriminate audio scenes with high accuracy after listening for less than a second. This early decision preserves 93% of the detection accuracy obtained after hearing the entire scene.
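
To make the architecture concrete, the following is a minimal sketch of how a recurrent layer with attention pooling could produce a segment embedding, and how a scene label could then be retrieved from an index of labeled embeddings. The abstract does not specify layer types, dimensions, the loss function, or the retrieval rule, so everything here is an assumption: a plain tanh recurrent layer, a single scoring vector for attention, cosine-similarity nearest-neighbor voting, and untrained random weights. All function and parameter names are hypothetical.

```python
import numpy as np

def rnn_states(frames, W_in, W_rec):
    """Run a simple tanh recurrent layer; returns one hidden state per frame."""
    h = np.zeros(W_rec.shape[0])
    states = []
    for x in frames:                      # frames: (num_frames, feat_dim)
        h = np.tanh(W_in @ x + W_rec @ h)
        states.append(h)
    return np.stack(states)               # (num_frames, hidden_dim)

def attention_pool(states, v):
    """Softmax-weighted sum of hidden states; v is an assumed scoring vector."""
    scores = states @ v
    weights = np.exp(scores - scores.max())   # subtract max for numerical stability
    weights /= weights.sum()
    return weights @ states, weights

def embed(frames, params):
    """Map a short audio segment (frame features) to a unit-norm embedding."""
    states = rnn_states(frames, params["W_in"], params["W_rec"])
    pooled, _ = attention_pool(states, params["v"])
    return pooled / (np.linalg.norm(pooled) + 1e-12)

def retrieve_label(query, index_embs, index_labels, k=3):
    """Majority vote over the k indexed embeddings most similar to the query.

    Assumes all embeddings are unit-norm, so the dot product is cosine similarity.
    """
    index_labels = np.asarray(index_labels)
    sims = index_embs @ query
    top = np.argsort(-sims)[:k]
    labels, counts = np.unique(index_labels[top], return_counts=True)
    return labels[np.argmax(counts)]

# Hypothetical sizes: 40 features per frame, 16 hidden units, 20 frames per segment.
rng = np.random.default_rng(0)
params = {
    "W_in": rng.normal(scale=0.1, size=(16, 40)),
    "W_rec": rng.normal(scale=0.1, size=(16, 16)),
    "v": rng.normal(size=16),
}
segment = rng.normal(size=(20, 40))       # stand-in for real audio features
segment_embedding = embed(segment, params)
```

In a trained system the weights would come from optimizing the custom loss described above, and the index would hold embeddings of labeled training segments. Because classification reduces to a similarity lookup, a decision can be made as soon as the first short segment has been embedded.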