StreamingRAG: Real-time Contextual Retrieval and Generation Framework

Publication Date: 6/3/2024

Event: AI4Sys ’24 at HPDC 2024

Reference: pp. 1-6, 2024

Authors: Murugan Sankaradas, NEC Laboratories America, Inc.; Ravi K. Rajendran, NEC Laboratories America, Inc.; Srimat T. Chakradhar, NEC Laboratories America, Inc.

Abstract: Extracting real-time insights from multi-modal data streams in domains such as healthcare, intelligent transportation, and satellite remote sensing remains a challenge. High computational demands and limited knowledge scope restrict the applicability of Multi-Modal Large Language Models (MM-LLMs) to these data streams. Traditional Retrieval-Augmented Generation (RAG) systems address the knowledge limitations of these models but suffer from slow preprocessing, making them unsuitable for real-time analysis. We propose StreamingRAG, a novel RAG framework designed for streaming data. StreamingRAG constructs evolving knowledge graphs that capture scene-object-entity relationships in real time. The knowledge graph provides temporally aware scene representations using MM-LLMs and enables timely responses to specific events or user queries. StreamingRAG addresses limitations of existing methods, achieving significant improvements in real-time analysis (5-6x faster throughput), contextual accuracy (through a temporal knowledge graph), and resource consumption (2-3x lower, using lightweight models).

Publication Link: N/A
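
Illustrative sketch: The abstract describes an evolving, temporally aware knowledge graph built from a data stream. The Python sketch below shows one way such a structure could be organized; it is not the authors' implementation. Every name in it (TemporalKnowledgeGraph, Edge, observe, query, window_seconds) is hypothetical, and the MM-LLM extraction step is assumed to have already produced (subject, relation, object) triples for each frame or scene.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object); assumed extractor output


@dataclass
class Edge:
    subject: str
    relation: str
    obj: str
    first_seen: float  # timestamp of the first observation
    last_seen: float   # timestamp of the most recent observation


class TemporalKnowledgeGraph:
    """Hypothetical in-memory graph of timestamped relationship edges."""

    def __init__(self, window_seconds: float) -> None:
        # Edges not observed within this sliding window are dropped.
        self.window_seconds = window_seconds
        self._edges: Dict[Triple, Edge] = {}

    def observe(self, timestamp: float, triples: List[Triple]) -> None:
        """Merge the triples extracted at `timestamp` into the graph."""
        for key in triples:
            edge = self._edges.get(key)
            if edge is None:
                self._edges[key] = Edge(*key, first_seen=timestamp, last_seen=timestamp)
            else:
                edge.last_seen = max(edge.last_seen, timestamp)
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        """Remove edges whose last observation is older than the window."""
        cutoff = now - self.window_seconds
        stale = [k for k, e in self._edges.items() if e.last_seen < cutoff]
        for k in stale:
            del self._edges[k]

    def query(self, entity: str, start: float, end: float) -> List[Edge]:
        """Return edges touching `entity` whose lifetime overlaps [start, end]."""
        hits = [
            e for e in self._edges.values()
            if entity in (e.subject, e.obj)
            and e.first_seen <= end
            and e.last_seen >= start
        ]
        return sorted(hits, key=lambda e: e.first_seen)


# Example usage with made-up observations.
graph = TemporalKnowledgeGraph(window_seconds=60.0)
graph.observe(0.0, [("car_1", "in", "scene_A"), ("person_1", "near", "car_1")])
graph.observe(10.0, [("car_1", "in", "scene_A")])
recent = graph.query("car_1", start=5.0, end=20.0)
```

In this sketch, each edge carries its first and last observation times, so a query can be limited to a time range, and the sliding window keeps the graph small as the stream advances. A real system would likely need persistent storage, entity resolution, and a retrieval step that feeds the matching edges to the generator. None of these is specified in the abstract.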