Real-Time refers to the capability of processing, analyzing, and reacting to data immediately as it is generated or received, with minimal delay. Real-time data systems aim to provide instantaneous responses to changing data conditions, enabling organizations to make timely decisions, monitor events as they occur, and respond promptly to emerging situations. The term “real-time” is often used to describe systems that operate with extremely low latency, where the delay between data generation and system response is minimized.

Posts

DataXc: Flexible and efficient communication in microservices-based stream analytics pipelines

A major challenge in converting a monolithic application into a performant microservices-based application is designing efficient mechanisms for microservices to communicate with each other. Prior proposals range from custom point-to-point communication among microservices using protocols like gRPC, to service meshes like Linkerd, to flexible, many-to-many communication using broker-based messaging systems like NATS. We propose a new communication mechanism, DataXc, that is more efficient than prior proposals in terms of message latency, jitter, message processing rate and use of network resources. To the best of our knowledge, DataXc is the first communication design that has the desirable flexibility of a broker-based messaging system like NATS and the high performance of a rigid, custom point-to-point communication method. DataXc proposes a novel “pull” based communication method (i.e., consumers fetch messages from producers). This is unlike prior proposals like NATS, gRPC or Linkerd, all of which are “push” based (i.e., producers send messages to consumers). Push-based methods make it difficult to take advantage of the differing processing rates of consumers such as video analytics tasks. In contrast, DataXc's “pull” based design avoids unnecessary communication of messages that are eventually discarded by the consumers. Also, unlike prior proposals, DataXc successfully addresses several key challenges in streaming video analytics pipelines, such as non-uniform processing of frames from multiple cameras and high variance in the latency of frames processed by consumers, all of which adversely affect the quality of insights from streaming video analytics. We report results on two popular real-world streaming video analytics pipelines (video surveillance and video action recognition).
Compared to NATS, DataXc is just as flexible, but it has far superior performance: up to 80% higher processing rate, 3X lower latency, 7.5X lower jitter and 4.5X lower network bandwidth usage. Compared to gRPC or Linkerd, DataXc is highly flexible and achieves up to 2X higher processing rate, lower latency and lower jitter, but it also consumes more network bandwidth.
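
The push versus pull distinction described in the abstract can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions, not DataXc's implementation: the function names `simulate_push` and `simulate_pull` are hypothetical, time is modeled as discrete ticks with one frame produced per tick, and the consumer is assumed to be free once every `consumer_period` ticks and to care only about the newest frame.

```python
from collections import deque


def simulate_push(num_frames, consumer_period):
    """Push model: the producer sends every frame to the consumer.

    The slow consumer keeps only the newest frame in a one-slot inbox,
    so most transferred frames are overwritten and never processed.
    """
    transferred = 0
    processed = []
    inbox = deque(maxlen=1)  # consumer-side buffer holding the latest frame
    for tick in range(num_frames):
        inbox.append(tick)   # frame crosses the network regardless of demand
        transferred += 1
        if tick % consumer_period == 0:  # consumer is free on this tick
            processed.append(inbox.pop())
    return transferred, processed


def simulate_pull(num_frames, consumer_period):
    """Pull model: the producer keeps the newest frame locally.

    A frame crosses the network only when the consumer asks for one.
    """
    transferred = 0
    processed = []
    latest = None  # producer-side slot, overwritten without any transfer
    for tick in range(num_frames):
        latest = tick
        if tick % consumer_period == 0:  # consumer is free and fetches
            processed.append(latest)
            transferred += 1
    return transferred, processed


if __name__ == "__main__":
    print("push:", simulate_push(12, 4))
    print("pull:", simulate_pull(12, 4))
```

In this toy model both variants process the same frames, but the push variant transfers all 12 frames while the pull variant transfers only the 3 that are actually consumed. Real systems add buffering, multiple streams and network effects that this sketch ignores.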