Degeneracy in Self-Calibration Revisited and a Deep Learning Solution for Uncalibrated SLAM

Self-calibration of camera intrinsics and radial distortion has a long history of research in the computer vision community. However, it remains rare to see real applications of such techniques to modern Simultaneous Localization And Mapping (SLAM) systems, especially in driving scenarios. In this paper, we revisit the geometric approach to this problem, and provide a theoretical proof that explicitly shows the ambiguity between radial distortion and scene depth when two-view geometry is used to self-calibrate the radial distortion. In view of such geometric degeneracy, we propose a learning approach that trains a convolutional neural network (CNN) on a large amount of synthetic data. We demonstrate the utility of our proposed method by applying it as a checkerboard-free calibration tool for SLAM, achieving comparable or superior performance to previous learning and hand-crafted method

Learning 2D to 3D Lifting for Object Detection in 3D for Autonomous Vehicles

We address the problem of 3D object detection from 2D monocular images in autonomous driving scenarios. We propose to lift the 2D images to 3D representations using learned neural networks and leverage existing networks working directly on 3D data to perform 3D object detection and localization. We show that, with a carefully designed training mechanism and automatically selected minimally noisy data, such a method is not only feasible, but achieves better results than many methods working on actual 3D inputs acquired from physical sensors. On the challenging KITTI benchmark, we show that our 2D to 3D lifted method outperforms many recent competitive 3D networks while significantly outperforming the previous state of the art for 3D detection from monocular images. We also show that a late fusion of the output of the network trained on generated 3D images with that of the network trained on real 3D images improves performance. We find the results very interesting and argue that such a method could serve as a highly reliable backup in case of malfunction of expensive 3D sensors, if not potentially making them redundant, at least in autonomous navigation scenarios with low risk of human injury, such as warehouse automation.

Rethinking Zero-Shot Learning: A Conditional Visual Classification Perspective

Zero-shot learning (ZSL) aims to recognize instances of unseen classes solely based on the semantic descriptions of the classes. Existing algorithms usually formulate it as a semantic-visual correspondence problem, learning mappings from one feature space to the other. Although reasonable, these approaches implicitly discard the discriminative power of visual features and thus produce undesirable results. We instead reformulate ZSL as a conditioned visual classification problem, i.e., classifying visual features based on the classifiers learned from the semantic descriptions. With this reformulation, we develop algorithms targeting various ZSL settings: for the conventional setting, we propose to train a deep neural network that directly generates visual feature classifiers from the semantic attributes with an episode-based training scheme; for the generalized setting, we concatenate the learned highly discriminative classifiers for seen classes and the generated classifiers for unseen classes to classify visual features of all classes; for the transductive setting, we exploit unlabeled data to effectively calibrate the classifier generator using a novel learning-without-forgetting self-training mechanism and guide the process by a robust generalized cross-entropy loss. Extensive experiments show that our proposed algorithms outperform state-of-the-art methods by large margins on most benchmark datasets in all the ZSL settings.

Domain Adaptation for Structured Output via Discriminative Patch Representations

Predicting structured outputs such as semantic segmentation relies on expensive per-pixel annotations to learn supervised models like convolutional neural networks. However, models trained on one data domain may not generalize well to other domains without annotations for model finetuning. To avoid the labor-intensive process of annotation, we develop a domain adaptation method to adapt the source data to the unlabeled target domain. We propose to learn discriminative feature representations of patches in the source domain by discovering multiple modes of patch-wise output distribution through the construction of a clustered space. With such representations as guidance, we use an adversarial learning scheme to push the feature representations of target patches in the clustered space closer to the distributions of source patches. In addition, we show that our framework is complementary to existing domain adaptation techniques and achieves consistent improvements on semantic segmentation. Extensive ablations and results are demonstrated on numerous benchmark datasets with various settings, such as synthetic-to-real and cross-city scenarios.

GLoSH: Global-Local Spherical Harmonics for Intrinsic Image Decomposition

Traditional intrinsic image decomposition focuses on decomposing images into reflectance and shading, leaving surface normals and lighting entangled in shading. In this work, we propose a Global-Local Spherical Harmonics (GLoSH) lighting model to improve the lighting component, and jointly predict reflectance and surface normals. The global SH models the holistic lighting while the local SH accounts for the spatial variation of lighting. In addition, a novel non-negative lighting constraint is proposed to encourage the estimated SH to be physically meaningful. To reflect the GLoSH model, we design a coarse-to-fine network structure. The coarse network predicts global SH, reflectance and normals, and the fine network predicts their local residuals. Lacking labels for reflectance and lighting, we use synthetic data for model pre-training and fine-tune the model with real data in a self-supervised way. Compared to state-of-the-art methods that target only normals or only reflectance and shading, our method recovers all components and achieves consistently better results on three real datasets: IIW, SAW and NYUv2.
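
As a rough illustration of how a global SH term and a local residual could combine into per-pixel shading, the sketch below evaluates a second-order (9-coefficient) real SH basis at a surface normal. The choice of second order, the single colour channel, and the names `sh_basis` and `shading` are assumptions made for this example; the abstract does not specify them.

```python
# Real spherical harmonics basis up to second order, evaluated at a unit normal.
# The 9-coefficient, single-channel setup is an assumption for illustration.
def sh_basis(normal):
    x, y, z = normal
    return [
        0.282095,                      # l=0
        0.488603 * y,                  # l=1, m=-1
        0.488603 * z,                  # l=1, m=0
        0.488603 * x,                  # l=1, m=1
        1.092548 * x * y,              # l=2, m=-2
        1.092548 * y * z,              # l=2, m=-1
        0.315392 * (3.0 * z * z - 1.0),  # l=2, m=0
        1.092548 * x * z,              # l=2, m=1
        0.546274 * (x * x - y * y),    # l=2, m=2
    ]


def shading(normal, global_sh, local_residual=None):
    """Shading at one pixel: basis(normal) dotted with (global SH + local residual)."""
    if local_residual is None:
        local_residual = [0.0] * len(global_sh)
    coeffs = [g + r for g, r in zip(global_sh, local_residual)]
    return sum(b * c for b, c in zip(sh_basis(normal), coeffs))


# Hypothetical coefficients: ambient light plus a component from the +z direction.
global_sh = [1.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
facing_light = shading((0.0, 0.0, 1.0), global_sh)
facing_away = shading((0.0, 0.0, -1.0), global_sh)
```

Under the usual intrinsic image model, the rendered pixel would then be the reflectance multiplied by this shading value.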

VeCharge: Intelligent Energy Management for Electric Vehicle Charging

According to Navigant, the 1.2 million North American charging ports of 2018 will grow roughly tenfold to over 12.6 million by 2027, which could overwhelm the nation’s grids. DC fast charging requires grid upgrades to supply the new charging demand. However, because the utilization ratio of those charging stations is currently low, demand charges can reach up to 90% of the total bill. Combining fast charging with energy storage can mitigate grid impacts and reduce demand charges. Many energy suppliers have also proposed EV-specific pricing for EV charging. Without managed charging, EV owners lose the benefit of lower charging costs from avoiding peak-hour charging and miss the periods when renewable energy generation is abundant.
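
To make the demand-charge argument concrete, the following worked sketch splits a monthly bill into a demand part (billed on peak kW) and an energy part (billed on kWh). All tariff rates, charger ratings, and session counts are hypothetical values chosen for illustration; none come from the source.

```python
def bill_breakdown(peak_kw, energy_kwh, demand_rate, energy_rate):
    """Return (demand_cost, energy_cost, demand_share) for one billing month.

    demand_rate is in $ per kW of monthly peak, energy_rate in $ per kWh.
    """
    demand_cost = peak_kw * demand_rate
    energy_cost = energy_kwh * energy_rate
    return demand_cost, energy_cost, demand_cost / (demand_cost + energy_cost)


# Hypothetical low-utilization station: one 150 kW DC fast charger,
# three 30 kWh sessions per day over a 30-day month.
energy_kwh = 3 * 30.0 * 30
_, _, share_grid_only = bill_breakdown(150.0, energy_kwh, demand_rate=15.0, energy_rate=0.12)

# Same energy delivered, but on-site storage (assumed) caps the grid peak at 50 kW.
_, _, share_with_storage = bill_breakdown(50.0, energy_kwh, demand_rate=15.0, energy_rate=0.12)

print(f"demand share, grid only:    {share_grid_only:.1%}")
print(f"demand share, with storage: {share_with_storage:.1%}")
```

Because the demand part depends only on the peak, a station that delivers little energy still pays the full peak charge, which is why shaving the peak with storage lowers the bill even when the energy delivered is unchanged.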

Wavelength Modulation Spectroscopy Enhanced by Machine Learning for Early Fire Detection

We proposed and demonstrated a new machine learning algorithm for wavelength modulation spectroscopy to enhance the accuracy of fire detection. The results show an accuracy improvement of more than 8% by analyzing CO/CO₂ 2f signals.

Data-Driven Day-Ahead PV Estimation Using Hybrid Deep Learning

Ongoing smart grid activities and the associated automation have resulted in a rich set of data. These data can be utilized for monitoring and estimation of real-time photovoltaic (PV) generation. The inherent variability of PV and its impact on power systems is a challenging problem. Improving the accuracy of PV generation estimation is beneficial for both PV owners and grid operators. Recently, deep learning algorithms, made possible by the availability of data, have shown their advantages for time series estimation; however, their application to PV generation estimation is still at an early stage. In this paper, a hybrid estimation model combining a long short-term memory (LSTM) network and a persistence model (PM) is developed to provide day-ahead PV estimation at 15-minute intervals with high accuracy and robustness. Simulation results show the superior performance of the proposed method over existing methods for most of the test c
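
For readers unfamiliar with the persistence model, here is a minimal sketch of a day-ahead persistence forecast at 15-minute resolution (96 samples per day). The `blend` function is a purely hypothetical way to combine it with another forecaster's output; the abstract does not say how the LSTM and PM are actually combined.

```python
STEPS_PER_DAY = 96  # 24 hours at 15-minute resolution


def persistence_forecast(history):
    """Day-ahead persistence: tomorrow's profile repeats the last observed day."""
    if len(history) < STEPS_PER_DAY:
        raise ValueError("need at least one full day of 15-minute samples")
    return list(history[-STEPS_PER_DAY:])


def blend(pm_forecast, other_forecast, weight):
    """Hypothetical hybrid rule: convex combination of two forecasts.

    weight=1.0 returns the persistence forecast, weight=0.0 the other one.
    """
    if len(pm_forecast) != len(other_forecast):
        raise ValueError("forecasts must cover the same time steps")
    return [weight * p + (1.0 - weight) * o for p, o in zip(pm_forecast, other_forecast)]


# Two days of made-up PV output in kW: zero at night, a flat midday plateau.
day1 = [0.0] * 32 + [3.0] * 32 + [0.0] * 32
day2 = [0.0] * 32 + [4.0] * 32 + [0.0] * 32
pm = persistence_forecast(day1 + day2)          # repeats day2
other = [0.0] * 32 + [5.0] * 32 + [0.0] * 32    # stand-in for a learned model's output
hybrid = blend(pm, other, weight=0.5)
```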

Beam Training Optimization in Millimeter-wave Systems under Beamwidth, Modulation and Coding Constraints

Millimeter-wave (mmWave) bands have the potential to enable significantly higher data rates in wireless systems. In order to overcome intense path loss and severe shadowing in these bands, it is essential to employ directional beams for data transmission. Furthermore, it is known that the mmWave channel comprises a small number of spatial clusters, necessitating additional time to align the corresponding beams with the channel prior to data transmission. This procedure is known as beam training (BT). While a longer BT leads to more directional beams (equivalently, higher beamforming gains), it leaves less time for data communication. In this paper, this trade-off is investigated for a time-slotted system under practical constraints such as finite beamwidth resolution and discrete modulation and coding schemes. At each BT time slot, the access point (AP) scans a region of uncertainty by transmitting a probing packet and refines the angle of arrival (AoA) estimate based on user equipment (UE) feedback. Given a total number of time slots, the objective is to find the optimum allocation between BT and data transmission and a feasible beamwidth for the estimation of AoA at each BT time slot such that the expected throughput is maximized. It is shown that the problem satisfies the optimal substructure property, enabling the use of a backward dynamic programming approach to find the optimal solution with polynomial computational complexity. Simulation results reveal that in practical scenarios, the proposed approach outperforms existing techniques such as exhaustive and bisection search.
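
The trade-off between training time and data time can be illustrated with a deliberately simplified sketch. It is not the paper's dynamic program: it assumes error-free bisection (each BT slot halves the uncertainty region), a beamforming gain inversely proportional to beamwidth, a Shannon-style rate, and a minimum beamwidth standing in for finite beamwidth resolution. The function names and all numeric values are assumptions.

```python
import math


def expected_throughput(total_slots, bt_slots, region=math.pi,
                        min_beamwidth=math.radians(2.0), snr0=0.1):
    """Throughput (bits/s/Hz summed over data slots) for a given number of BT slots.

    Idealised model: bisection halves the uncertainty region in every BT slot,
    but the beam cannot be narrower than min_beamwidth.
    """
    beamwidth = max(region / 2 ** bt_slots, min_beamwidth)
    gain = 2.0 * math.pi / beamwidth          # narrower beam, higher gain
    rate = math.log2(1.0 + snr0 * gain)       # rate per data slot
    return (total_slots - bt_slots) * rate


def best_split(total_slots):
    """Enumerate every BT/data split and return the number of BT slots that wins."""
    return max(range(total_slots + 1),
               key=lambda k: expected_throughput(total_slots, k))


total = 20
k_star = best_split(total)
print(k_star, round(expected_throughput(total, k_star), 2))
```

In this toy model the best split stops training once the minimum beamwidth is reached, since further BT slots cost data time without adding gain. The paper's formulation additionally chooses a beamwidth per slot and accounts for discrete modulation and coding, which this sketch omits.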

Opportunistic Temporal Fair Mode Selection and User Scheduling for Full-duplex Systems

In-band full-duplex (FD) communication, enabled by recent advances in antenna and RF circuit design, has emerged as one of the promising techniques to improve data rates in wireless systems. One of the major roadblocks in enabling high data rates in FD systems is the inter-user interference (IUI) caused by activating pairs of uplink and downlink users on the same time-frequency resource block. Opportunistic user scheduling has been proposed as a means to manage IUI and fully exploit the multiplexing gains in FD systems. In this paper, scheduling under long-term and short-term temporal fairness for single-cell FD wireless networks is considered. Temporal fair scheduling is of interest in delay-sensitive applications, and leads to predictable latency and power consumption. The feasible region of user temporal demand vectors is derived, and a scheduling strategy maximizing the system utility while satisfying long-term temporal fairness is proposed. Furthermore, a short-term temporal fair scheduling strategy is devised which satisfies user temporal demands over a finite window length. It is shown that the strategy achieves optimal average system utility asymptotically as the window length increases. Subsequently, practical construction algorithms for long-term and short-term temporal fair scheduling are introduced. Simulations are provided to verify the derivations and investigate the multiplexing gains. It is observed that using successive interference cancellation at downlink users improves FD gains significantly in the presence of strong IUI.