Mohammad Khojastepour NEC Labs America

Mohammad A. Khojastepour

Senior Researcher

Integrated Systems

Posts

Deep Learning-Based Real-Time Quality Control of Standard Video Compression for Live Streaming

Ensuring high-quality video content for wireless users has become increasingly vital. Nevertheless, maintaining a consistent level of video quality faces challenges due to the fluctuating encoded bitrate, primarily caused by dynamic video content, especially in live streaming scenarios. Video compression is typically employed to eliminate unnecessary redundancies within and between video frames, thereby reducing the required bandwidth for video transmission. The encoded bitrate and the quality of the compressed video depend on encoder parameters, specifically the quantization parameter (QP). Poor choices of encoder parameters can result in reduced bandwidth efficiency and a high likelihood of non-conformance, where non-conformance refers to the violation of the peak signal-to-noise ratio (PSNR) constraint for an encoded video segment. To address these issues, a real-time deep learning-based H.264 controller is proposed. This controller dynamically estimates the optimal encoder parameters based on the content of a video chunk with minimal delay. The objective is to maintain video quality, in terms of PSNR, above a specified threshold while minimizing the average bitrate of the compressed video. Experimental results, conducted on both the QCIF dataset and a diverse range of random videos from public datasets, validate the effectiveness of this approach. Notably, it achieves improvements of up to 2.5 times in average bandwidth usage compared to state-of-the-art adaptive bitrate video streaming, with a negligible non-conformance probability below 10^-2.
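The controller described above is a deep network that predicts the QP directly from video content; the brute-force search below is only a sketch of the constraint it learns to satisfy: pick the largest QP (lowest bitrate) whose decoded chunk still conforms to the PSNR threshold. The `encode` callback and all names here are hypothetical stand-ins, not the paper's implementation.

```python
import numpy as np

def psnr(ref, rec, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference and a reconstruction."""
    mse = np.mean((ref.astype(np.float64) - rec.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def select_qp(chunk, encode, psnr_min=35.0, qp_range=range(51, -1, -1)):
    """Largest QP (hence lowest bitrate) whose encoded chunk still meets the
    PSNR constraint; fall back to the finest QP if nothing conforms."""
    for qp in qp_range:  # H.264 QP runs from 0 (finest) to 51 (coarsest)
        if psnr(chunk, encode(chunk, qp)) >= psnr_min:
            return qp
    return min(qp_range)
```

The deep controller replaces this per-chunk exhaustive search with a single forward pass, which is what makes real-time operation feasible.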

Deep Learning-Based Real-Time Rate Control for Live Streaming on Wireless Networks

Providing wireless users with high-quality video content has become increasingly important. However, ensuring consistent video quality poses challenges due to the variable encoded bitrate caused by dynamic video content and the fluctuating channel bitrate caused by wireless fading effects. Suboptimal selection of encoder parameters can lead to video quality loss due to underutilized bandwidth, or to the introduction of video artifacts due to packet loss. To address this, a real-time deep learning-based H.264 controller is proposed. This controller leverages instantaneous channel quality data derived from the physical layer, along with the video chunk, to dynamically estimate the optimal encoder parameters with negligible delay in real time. The objective is to maintain an encoded video bitrate slightly below the available channel bitrate. Experimental results, conducted on both the QCIF dataset and a diverse selection of random videos from public datasets, validate the effectiveness of the approach. Remarkably, improvements of 10-20 dB in PSNR with respect to state-of-the-art adaptive bitrate video streaming are achieved, with an average packet drop rate as low as 0.002.
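The rate-matching rule ("encoded bitrate slightly below channel bitrate") can be sketched independently of the learned controller. The `rate_model` callback is a hypothetical predictor supplied by the caller; the "bitrate roughly halves every 6 QP steps" relation used in the usage example is a common H.264 rule of thumb, not the paper's model.

```python
def select_qp_for_rate(channel_kbps, rate_model, margin=0.9, qp_range=range(0, 52)):
    """Smallest QP (highest quality) whose predicted encoded bitrate fits
    within a safety margin of the instantaneous channel bitrate, keeping the
    stream slightly below the channel rate to avoid packet loss."""
    budget = margin * channel_kbps
    for qp in qp_range:               # QP 0 = finest quantization, most bits
        if rate_model(qp) <= budget:
            return qp
    return max(qp_range)              # even the coarsest QP overshoots
```

In the paper, the deep controller effectively learns both the content-dependent rate model and this selection rule jointly, using the physical-layer channel estimate as an extra input.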

Transformer-Aided Semantic Communications

The transformer structure employed in large language models (LLMs), as a specialized category of deep neural networks (DNNs) featuring attention mechanisms, stands out for its ability to identify and highlight the most relevant aspects of input data. Such a capability is particularly beneficial in addressing a variety of communication challenges, notably in the realm of semantic communication, where proper encoding of the relevant data is critical, especially in systems with limited bandwidth. In this work, we employ vision transformers specifically for the purpose of compression and compact representation of the input image, with the goal of preserving semantic information throughout the transmission process. Through the use of the attention mechanism inherent in transformers, we create an attention mask. This mask effectively prioritizes critical segments of images for transmission, ensuring that the reconstruction phase focuses on key objects highlighted by the mask. Our methodology significantly improves the quality of semantic communication and optimizes bandwidth usage by encoding different parts of the data in accordance with their semantic information content, thus enhancing overall efficiency. We evaluate the effectiveness of our proposed framework using the TinyImageNet dataset, focusing on both reconstruction quality and accuracy. Our evaluation results demonstrate that our framework successfully preserves semantic information, even when only a fraction of the encoded data is transmitted, according to the intended compression rates.
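The masking step can be illustrated with a minimal sketch. In the paper the attention weights come from a vision transformer; here `cls_attn` is assumed to be a precomputed vector of CLS-token attention over image patches, and only the top-attended patches survive transmission. All names and shapes are illustrative assumptions.

```python
import numpy as np

def attention_mask(cls_attn, keep_ratio=0.25):
    """Binary mask over image patches keeping the top-k patches ranked by the
    CLS token's attention weight."""
    k = max(1, int(round(keep_ratio * cls_attn.size)))
    mask = np.zeros(cls_attn.shape, dtype=bool)
    mask[np.argsort(cls_attn)[-k:]] = True
    return mask

def transmit(patch_embeddings, mask):
    """Keep only the masked patch embeddings; unmasked patches are dropped
    (zeroed), so bandwidth is spent on the semantically important regions."""
    out = np.zeros_like(patch_embeddings)
    out[mask] = patch_embeddings[mask]
    return out
```

Varying `keep_ratio` corresponds to the intended compression rates mentioned in the evaluation: lower ratios transmit fewer patches while the mask keeps the key objects intact.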

Enabling Cooperative Hybrid Beamforming in TDD-based Distributed MIMO Systems

Distributed massive MIMO networks are envisioned to realize cooperative multi-point transmission in next-generation wireless systems. For efficient cooperative hybrid beamforming (CHBF), the cluster of access points (APs) needs to obtain precise estimates of the uplink channel to perform reliable downlink precoding. However, due to the radio frequency (RF) impairments between the transceivers at the two end-points of the wireless channel, full channel reciprocity does not hold, which results in performance degradation in CHBF unless a suitable reciprocity calibration mechanism is in place. We propose a two-step approach to calibrate any two hybrid nodes in the distributed MIMO system. We then present and utilize the novel concept of a reciprocal tandem to propose a low-complexity approach for jointly calibrating the cluster of APs and estimating the downlink channel. Finally, we validate our calibration technique's effectiveness through numerical simulation.

Semantic Multi-Resolution Communications

Deep learning-based joint source-channel coding (JSCC) has demonstrated significant advancements in data reconstruction compared to separate source-channel coding (SSCC). This superiority arises from the suboptimality of SSCC when dealing with finite block-length data. Moreover, SSCC falls short in reconstructing data in a multi-user and/or multi-resolution fashion, as it only tries to satisfy the worst channel and/or the highest quality data. To overcome these limitations, we propose a novel deep learning multi-resolution JSCC framework inspired by the concept of multi-task learning (MTL). The proposed framework excels at encoding data for different resolutions through hierarchical layers and effectively decodes it by leveraging both current and past layers of encoded data. Moreover, this framework holds great potential for semantic communication, where the objective extends beyond data reconstruction to preserving specific semantic attributes throughout the communication process. These semantic features could be crucial elements such as class labels, essential for classification tasks, or other key attributes that require preservation. Within this framework, each level of encoded data can be carefully designed to retain specific data semantics. As a result, the precision of a semantic classifier can be progressively enhanced across successive layers, emphasizing the preservation of targeted semantics throughout the encoding and decoding stages. We conduct experiments on the MNIST and CIFAR10 datasets. The experiments on both datasets illustrate that our proposed method is capable of surpassing the SSCC method in reconstructing data at different resolutions, enabling the extraction of semantic features with heightened confidence in successive layers. This capability is particularly advantageous for prioritizing and preserving the more crucial semantic features within the datasets.
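The layered encode/decode idea can be sketched with a fixed (non-learned) transform: a coarse base layer plus a refinement residual, decoded progressively by combining past and current layers. The paper trains hierarchical DNN encoders instead; this numpy toy (which assumes even image dimensions) only mirrors the multi-resolution structure.

```python
import numpy as np

def encode_layers(x):
    """Two-resolution encoding: a 2x-downsampled base layer (layer 1) plus a
    full-resolution refinement residual (layer 2)."""
    base = x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).mean(axis=(1, 3))
    coarse = np.repeat(np.repeat(base, 2, axis=0), 2, axis=1)
    return base, x - coarse

def decode(base, residual=None):
    """Progressive decoding: layer 1 alone yields a coarse reconstruction;
    adding layer 2 restores full resolution."""
    coarse = np.repeat(np.repeat(base, 2, axis=0), 2, axis=1)
    return coarse if residual is None else coarse + residual
```

A receiver with a poor channel can stop after layer 1, while a better channel also decodes layer 2; in the learned framework each layer is additionally shaped to retain specific semantics (e.g. class labels).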

Blind Cyclic Prefix-based CFO Estimation in MIMO-OFDM Systems

Low-complexity estimation and correction of carrier frequency offset (CFO) are essential in orthogonal frequency division multiplexing (OFDM). In this paper, we propose a low-overhead blind CFO estimation technique based on the cyclic prefix (CP) in multiple-input multiple-output (MIMO) OFDM systems. We propose to use antenna diversity for CFO estimation. Given that the RF chains for all antenna elements at a communication node share the same clock, the CFO between two nodes may be estimated by combining the received signals at all antennas. We improve our method by combining antenna diversity with time diversity, considering the CP of multiple OFDM symbols. We provide a closed-form expression for CFO estimation and present algorithms that can considerably improve the CFO estimation performance at the expense of a linear increase in computational complexity. We validate the effectiveness of our estimation scheme via extensive numerical analysis.
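The classical CP-correlation estimator underlying this line of work correlates each CP sample with its copy one FFT length later; a CFO of ε subcarrier spacings rotates that product by exactly 2πε. The sketch below adds the abstract's antenna-combining idea by summing the correlations over antennas before taking the angle (valid since all RF chains share one clock); the paper's closed form and refinement algorithms are more elaborate, and the shapes/names here are illustrative.

```python
import numpy as np

def estimate_cfo(rx, n_fft, cp_len):
    """Blind CP-based CFO estimate, as a fraction of the subcarrier spacing
    (valid for offsets within +/- 0.5 subcarrier). rx is a 1-D array (one
    antenna) or a 2-D (n_antennas, n_samples) array; per-antenna CP
    correlations are summed before the angle is taken."""
    rx = np.atleast_2d(rx)
    corr = 0j
    for r in rx:  # antenna diversity: accumulate the CP correlation
        corr += np.vdot(r[:cp_len], r[n_fft:n_fft + cp_len])
    return float(np.angle(corr)) / (2 * np.pi)
```

Extending the sum over the CPs of several consecutive OFDM symbols gives the time-diversity variant mentioned in the abstract, at a linear cost in complexity.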

RIS-aided mmWave Beamforming for Two-way Communications of Multiple Pairs

Millimeter-wave (mmWave) communications is a key enabler towards realizing enhanced Mobile Broadband (eMBB), a key promise of 5G and beyond, due to the abundance of bandwidth available at mmWave bands. An mmWave coverage map contains blind spots due to shadowing and fading, especially in dense urban environments. Beamforming employing massive MIMO is primarily used to address the high attenuation of the mmWave channel. Due to their ability to manipulate impinging electromagnetic waves in an energy-efficient fashion, Reconfigurable Intelligent Surfaces (RISs) are considered a great match to complement massive MIMO systems in realizing the beamforming task and thereby effectively filling in the mmWave coverage gap. In this paper, we propose a novel RIS architecture, namely RIS-UPA, where the RIS elements are arranged in a Uniform Planar Array (UPA). We show how RIS-UPA can be used in an RIS-aided MIMO system to fill the coverage gap in mmWave by forming beams of a custom footprint, with optimized main-lobe gain, minimum leakage, and fairly sharp edges. Further, we propose a configuration for RIS-UPA that can support multiple two-way communication pairs simultaneously. We theoretically obtain closed-form, low-complexity solutions for our design and validate our theoretical findings by extensive numerical experiments.
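As background for the custom-footprint beams above, the textbook UPA steering baseline can be sketched in a few lines: a per-element linear phase profile points the main lobe at a chosen direction (θ, φ), and the array factor gives the resulting gain. This is the standard model such designs build on, not the paper's RIS-UPA solution; element spacing `d` is in wavelengths and angles are in radians.

```python
import numpy as np

def upa_phases(m_rows, n_cols, theta, phi, d=0.5):
    """Per-element phase profile steering a uniform planar array toward
    elevation theta and azimuth phi."""
    m, n = np.meshgrid(np.arange(m_rows), np.arange(n_cols), indexing="ij")
    return -2 * np.pi * d * np.sin(theta) * (m * np.cos(phi) + n * np.sin(phi))

def array_gain(phases, theta, phi, d=0.5):
    """Normalized power gain of the phased UPA in direction (theta, phi);
    equals the element count M*N at the steered direction."""
    m_rows, n_cols = phases.shape
    steer = upa_phases(m_rows, n_cols, theta, phi, d)
    af = np.sum(np.exp(1j * (phases - steer)))
    return np.abs(af) ** 2 / (m_rows * n_cols)
```

A single steering profile like this yields one pencil beam; the RIS-UPA design instead optimizes the phase profile to shape an entire beam footprint with controlled main-lobe gain, leakage, and edge sharpness.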

Channel Reciprocity Calibration for Hybrid Beamforming in Distributed MIMO Systems

Time Division Duplex (TDD)-based distributed massive MIMO systems are envisioned as a candidate solution for the physical layer of 6G multi-antenna systems, supporting cooperative hybrid beamforming that heavily relies on the obtained uplink channel estimates for efficient coherent downlink precoding. However, due to hardware impairments between the transmitter and the receiver, full channel reciprocity does not hold between the downlink and uplink directions. Such reciprocity mismatch deteriorates the performance of mmWave hybrid beamforming and has to be estimated and compensated for to avoid performance degradation in the cooperative hybrid beamforming. In this paper, we address the channel reciprocity calibration between any two nodes at two levels, decomposing the problem into two sub-problems. In the first sub-problem, we calibrate the digital chain, i.e., obtain the mismatch coefficients of the DAC/ADC up to a constant scaling factor. In the second sub-problem, we obtain the PA/LNA mismatch coefficients. At each step, we formulate the channel reciprocity calibration as a least-squares optimization problem that can be solved efficiently and with high accuracy via conventional methods such as alternating optimization. Finally, we verify the performance of our channel reciprocity calibration approach through extensive numerical experiments.
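The basic principle such calibration schemes build on can be sketched with a scalar-per-antenna model (the paper's two-level procedure for hybrid analog-digital chains is considerably more involved). Bidirectional pilot measurements against a reference antenna yield per-antenna coefficients, known only up to one common scale, which suffices because precoding tolerates a common complex scalar. All variable names are hypothetical.

```python
import numpy as np

def calibrate(forward, backward):
    """Per-antenna reciprocity coefficients, up to one common scale factor.
    forward[i]  = r_ref * h_i * t_i  (antenna i transmits, reference receives)
    backward[i] = r_i * h_i * t_ref  (reference transmits, antenna i receives)
    The ratio is proportional to t_i / r_i, the TX/RX impairment mismatch."""
    c = forward / backward
    return c / c[0]  # normalize out the unknown common scale

def predict_downlink(h_ul, coeffs):
    """Downlink channel estimate from uplink pilots after calibration,
    correct up to a single complex scalar common to all antennas."""
    return coeffs * h_ul
```

In the TDD setting, the AP then precodes on `predict_downlink(h_ul, coeffs)` instead of raw uplink estimates, removing the reciprocity mismatch caused by the DAC/ADC and PA/LNA stages.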

The Trade-off between Scanning Beam Penetration and Transmission Beam Gain in mmWave Beam Alignment

Beam search algorithms have been proposed to align the beams from an access point to a user equipment. The process relies on sending beams from a set of scanning beams (SBs) and tailoring a transmission beam (TB) using the received feedback. In this paper, we discuss a fundamental trade-off between the gain of SBs and TBs. The higher the gain of an SB, the better its penetration; the higher the gain of the TB, the better the communication link performance. However, the TB depends on the set of SBs: by increasing the coverage of each SB, and in turn reducing its penetration, there is more opportunity to find a sharper TB with higher beamforming gain. We define a quantitative measure for this trade-off in terms of a trade-off curve. We introduce an SB set design, namely the Tulip design, and formally prove that it achieves this fundamental trade-off curve for channels with a single dominant path. We also find closed-form solutions for the trade-off curve in special cases and provide an algorithm, along with its performance evaluation results, to find the trade-off curve, revealing the need for further optimization of the SB sets used in state-of-the-art beam search algorithms.
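The trade-off can be illustrated numerically with a deliberately idealized sectored-beam model (gain taken as the reciprocal of the covered angular fraction) by comparing two extreme SB designs that use the same number of probes. This toy is an assumption-laden sketch for intuition only, not the paper's trade-off curve or its Tulip design.

```python
def sb_tb_extremes(m):
    """Two extreme scanning-beam designs using m probes, in an idealized
    sectored-beam model where gain = 1 / covered fraction.
    Exhaustive: each SB covers 1/m of the space, so SB gain is m (deep
      penetration), but the feedback only identifies one of m sectors,
      bounding the TB gain by m.
    Binary-coded: each SB covers half the space, so SB gain is only 2
      (shallow penetration), but the m feedback bits can resolve up to
      2**m sectors, allowing a far sharper TB."""
    exhaustive = {"sb_gain": m, "tb_gain": m}
    coded = {"sb_gain": 2, "tb_gain": 2 ** m}
    return exhaustive, coded
```

Between these extremes lies a continuum of SB sets trading penetration for TB sharpness, which is exactly the curve the paper characterizes.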