
Our Integrated Systems department innovates, designs, and prototypes high-performance, intelligent distributed systems, applications, and services for complex, large-scale communication networks such as 5G and beyond. We also develop next-generation wireless technologies for sensing the world, localizing critical assets, and improving the capacity, coverage, and scalability of those networks.
New application needs have always sparked human innovation. A decade ago, cloud computing enabled high-value enterprise services with a global reach and scale but with several minutes or seconds of delay. Large-scale services like enterprise resource planning (ERP) were a corner-case scenario, often designed as one-off systems. Today, applications like social networks, automated trading, and video streaming have made large-scale services the norm rather than the exception. In the future, advances in 5G networks and an explosion in smart devices, microservices, databases, networking, and computing tiers will make services so complex that humans cannot tune or manage them.
The sheer scale, dynamic nature, and concurrency in services on 5G slices will require them to be intelligent and autonomic. They will need to continuously self-assess, learn, and automatically adjust for resource needs, data quality, and service reliability. The need for increased efficiency and reduced latency between measurement and action drives our design of real-time distributed systems for feature extraction, computation, and machine learning on multimodal streaming data. We are conducting extensive research on creating end-to-end solutions using multimodal sensing technologies in the retail, public safety, and transportation domains.
Our 5G cellular network research encompasses the development of technologies on the Radio Access Network (RAN), the mobile edge, and the 5G LAN. Within the RAN, we are developing technologies that optimize massive MIMO/MU-MIMO deployments and millimeter-wave access (e.g., transmission at 28 GHz to nomadic/mobile users). At the mobile edge (MEC), we focus on virtualization, scalability, and cloud deployment of appropriate services. Our 5G LAN research extends the benefits of 5G slicing technology to enterprise LANs to position the enterprise as the new MEC.
Read the latest news and publications from the world-class researchers in our Integrated Systems department.
Question-answering (QA) is a significant application of Large Language Models (LLMs), shaping chatbot capabilities across healthcare, education, and customer service. However, widespread LLM integration presents a challenge for small businesses due to the high cost of LLM API usage. Costs rise rapidly when domain-specific data (context) is sent alongside queries to obtain accurate domain-specific LLM responses. Context is extracted from domain-specific data using a Retrieval-Augmented Generation (RAG) approach. One option is to use LLMs to summarize, and thereby reduce, the RAG context. However, this can also filter out information that is necessary to answer some domain-specific queries. In this paper, we shift from human-oriented summarizers to AI model-friendly summaries. Our approach, LeanContext, efficiently extracts k key sentences from the context that are closely aligned with the query. The choice of k is neither static nor random; we introduce a reinforcement learning technique that dynamically determines k based on the query and context. The remaining, less important sentences are either reduced using a free open-source text reduction method or eliminated. We evaluate LeanContext against several recent query-aware and query-unaware context reduction approaches on prominent datasets (arXiv papers, BBC news articles, and NarrativeQA). Despite cost reductions of 37.29% to 67.81%, LeanContext's ROUGE-1 score decreases by only 1.41% to 2.65% compared to a baseline that retains the entire context (no summarization). LeanContext stands out for its ability to provide precise responses, outperforming competitors by leveraging open-source summarization techniques. Human evaluations of the responses further confirm this superiority.
Additionally, if open-source pre-trained LLM-based summarizers are used to reduce context (into human-consumable summaries), LeanContext can further modify the reduced context to improve accuracy (ROUGE-1 score) by 13.22% to 24.61%.
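The idea of keeping only the k sentences most closely aligned with the query can be illustrated with a minimal sketch. This is not the LeanContext implementation: the bag-of-words cosine similarity is a stand-in for whatever sentence scoring the paper uses, k is passed in as a fixed argument rather than chosen by reinforcement learning, and the remaining sentences are simply dropped rather than reduced. The function names `tokenize`, `cosine`, and `top_k_sentences` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase word tokens; a deliberately simple stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_sentences(context: str, query: str, k: int) -> list[str]:
    """Return the k sentences most similar to the query, in their original order.

    Assumption: k is supplied by the caller. In the paper, k is determined
    dynamically from the query and context; that part is not modeled here.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]
    q = Counter(tokenize(query))
    scored = [(cosine(Counter(tokenize(s)), q), i) for i, s in enumerate(sentences)]
    # Highest score first; ties broken by position so the result is deterministic.
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:k]
    return [sentences[i] for i in sorted(i for _, i in best)]


if __name__ == "__main__":
    ctx = ("The cat sat on the mat. Stock prices rose sharply today. "
           "Cats enjoy sleeping on warm mats.")
    print(top_k_sentences(ctx, "Where did the cat sit?", k=1))
```

Returning the selected sentences in their original order keeps the reduced context readable for the downstream model, which is a design choice of this sketch rather than something stated in the abstract.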
NEC Labs America | 2024-06-01 00:00:00 | 2024-07-16 13:45:50 | LeanContext: Cost-efficient Domain-specific Question Answering Using LLMs

For microservices-based real-time stream processing applications, computing at the edge delivers fast responses for low workloads, but as the workload increases, response time degrades due to limited compute capacity. Abundant compute capacity in the cloud delivers fast responses even for higher workloads but incurs a very high operating cost. For applications that can tolerate latencies up to a certain limit, each option has a drawback, and for different applications and edge infrastructures, it is non-trivial to decide when to use only edge resources and when to leverage cloud resources. In this paper, we propose CLAP, which dynamically learns the relationship between workload and application latency and automatically adjusts the placement of microservices across the edge and cloud computing continuum, with the goal of jointly reducing the latency and the cost of running microservices-based streaming applications. CLAP leverages a Reinforcement Learning (RL) technique to learn the optimal placement for a given workload and, based on what it has learned, adjusts the placement of microservices as the application workload changes. We conduct experiments with real-world video analytics applications and show that CLAP adapts the placement of microservices in response to varying workloads and achieves low latency for applications in a cost-efficient manner. In particular, for two real-world video analytics applications, human attributes detection and face recognition, CLAP reduces average cost (across 4 days at different locations) by 47% and 58%, respectively, while consistently maintaining latency below the tolerable limit.
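The idea of learning, per workload level, whether to place work on the edge or in the cloud can be sketched with a small tabular value-learning loop. This is an illustrative toy under stated assumptions, not CLAP itself: it uses a simplified, bandit-style update with epsilon-greedy exploration, a single edge-or-cloud decision instead of per-microservice placement, and an invented latency and cost model. The names `simulate`, `reward`, `learn_placement`, and all numeric constants are hypothetical.

```python
import random

PLACEMENTS = ("edge", "cloud")
LATENCY_LIMIT_MS = 100.0  # hypothetical tolerable latency limit


def simulate(placement: str, workload: int) -> tuple[float, float]:
    """Toy environment returning (latency_ms, cost). All numbers are invented."""
    if placement == "edge":
        return 20.0 + 30.0 * workload, 1.0  # cheap, but slows down as load grows
    return 40.0 + 2.0 * workload, 5.0       # fast at any load, but costly


def reward(latency_ms: float, cost: float) -> float:
    """Penalize cost, and heavily penalize exceeding the latency limit."""
    penalty = 100.0 if latency_ms > LATENCY_LIMIT_MS else 0.0
    return -cost - penalty


def learn_placement(workloads: range, episodes: int = 2000, epsilon: float = 0.1,
                    alpha: float = 0.2, seed: int = 0) -> dict[int, str]:
    """Learn a workload -> placement table with epsilon-greedy value updates."""
    rng = random.Random(seed)
    levels = list(workloads)
    q = {(w, p): 0.0 for w in levels for p in PLACEMENTS}
    for _ in range(episodes):
        w = rng.choice(levels)                      # observe a workload level
        if rng.random() < epsilon:
            p = rng.choice(PLACEMENTS)              # explore
        else:
            p = max(PLACEMENTS, key=lambda a: q[(w, a)])  # exploit
        r = reward(*simulate(p, w))
        q[(w, p)] += alpha * (r - q[(w, p)])        # move estimate toward reward
    return {w: max(PLACEMENTS, key=lambda a: q[(w, a)]) for w in levels}


if __name__ == "__main__":
    print(learn_placement(range(6)))
```

In this toy model the edge exceeds the latency limit once the workload level reaches 3, so the learned table keeps low workloads on the edge and moves higher workloads to the cloud, mirroring the trade-off the abstract describes.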
NEC Labs America | 2024-05-06 00:00:00 | 2024-10-15 13:09:14 | CLAP: Cost and Latency-Aware Placement of Microservices on the Computing Continuum

Retrieval augmented generation (RAG) systems combine the strengths of language generation and information retrieval to power many real-world applications like chatbots. Using RAG for combined understanding of multimodal data such as text, images, and videos is appealing, but two critical limitations exist: one-time, upfront capture of all content in large multimodal data as text descriptions entails high processing times, and the text descriptions typically do not capture all the information in the rich multimodal data. Since user queries are not known a priori, developing a system for multimodal-to-text conversion and interactive querying of multimodal data is challenging. To address these limitations, we propose iRAG, which augments RAG with a novel incremental workflow to enable interactive querying of a large corpus of multimodal data. Unlike traditional RAG, iRAG quickly indexes large repositories of multimodal data, and in the incremental workflow, it uses the index to opportunistically extract more details from select portions of the multimodal data to retrieve context relevant to an interactive user query. Such an incremental workflow avoids long multimodal-to-text conversion times, overcomes information loss by performing on-demand, query-specific extraction of details from multimodal data, and ensures high-quality responses to interactive user queries that are often not known a priori. To the best of our knowledge, iRAG is the first system to augment RAG with an incremental workflow to support efficient interactive querying of large, real-world multimodal data.
Experimental results on real-world long videos demonstrate 23x to 25x faster video-to-text ingestion, while ensuring that the quality of responses to interactive user queries is comparable to that of a traditional RAG system in which all video data is converted to text upfront, before any querying.
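The incremental workflow described above (a quick, cheap index at ingestion time, followed by on-demand extraction of details only for the portions a query touches) can be sketched as follows. This is a minimal illustration under assumptions, not the iRAG system: coarse keyword tags stand in for the fast index, `_extract_details` is a placeholder for an expensive model call, and the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    segment_id: int
    coarse_tags: set[str]           # cheap metadata captured at ingestion time
    details: Optional[str] = None   # expensive description, filled in lazily


@dataclass
class IncrementalIndex:
    segments: list[Segment]
    extractions: int = 0            # counts expensive calls, for illustration only

    def _extract_details(self, seg: Segment) -> str:
        """Placeholder for an expensive step such as detailed captioning."""
        self.extractions += 1
        tags = ", ".join(sorted(seg.coarse_tags))
        return f"detailed description of segment {seg.segment_id}: {tags}"

    def retrieve(self, query: str) -> list[str]:
        """Return detailed context only for segments whose coarse tags match the query."""
        terms = set(query.lower().split())
        context = []
        for seg in self.segments:
            if seg.coarse_tags & terms:
                if seg.details is None:       # extract once, then reuse
                    seg.details = self._extract_details(seg)
                context.append(seg.details)
        return context


if __name__ == "__main__":
    index = IncrementalIndex([
        Segment(0, {"car", "street"}),
        Segment(1, {"dog", "park"}),
        Segment(2, {"car", "garage"}),
    ])
    print(index.retrieve("red car"))
    print(index.extractions)
```

Because details are cached on each segment, a repeated or overlapping query does not pay the extraction cost again, which is one plausible way to keep interactive querying responsive.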
NEC Labs America | 2024-04-24 00:00:00 | 2025-01-29 14:59:18 | iRAG: An Incremental Retrieval Augmented Generation System for Videos

In the evolving Artificial Intelligence (AI) era, the need for real-time algorithm processing in marine edge environments has become a crucial challenge. Data acquisition, analysis, and processing in complex marine situations require sophisticated and highly efficient platforms. This study optimizes real-time operations on a containerized distributed processing platform designed for Autonomous Surface Vehicles (ASVs) to help safeguard the marine environment. The primary objective is to improve the efficiency and speed of data processing by adopting a microservice management system called DataX. DataX leverages containerization to break down operations into modular units, and resource coordination is based on Kubernetes. This combination of technologies enables more efficient resource management and optimization of real-time operations, contributing significantly to the success of marine missions. The platform was developed to address the unique challenges of managing data and running advanced algorithms in a marine context, which often involves limited connectivity, high latencies, and energy restrictions. Finally, as a proof of concept to justify this platform's evolution, experiments were carried out using a cluster of single-board computers equipped with GPUs, running an AI-based marine litter detection application and demonstrating the tangible benefits of this solution and its suitability for the needs of maritime missions.
NEC Labs America | 2024-03-20 00:00:00 | 2024-05-06 14:22:10 | Improving Real-time Data Streams Performance on Autonomous Surface Vehicles using DataX