Publications Archives | Page 54 of 67

Progressive Processing of System-Behavioral Query

December 13, 2019/in Publications/by NEC Labs America

System monitoring has recently emerged as an effective way to analyze and counter advanced cyber attacks. The monitoring data records a series of system events and provides a global view of system behaviors in an organization. Querying such data to identify potential system risks and malicious behaviors helps security analysts detect and analyze abnormal system behaviors caused by attacks. However, since the data volume is huge, queries could easily run for a long time, making it difficult for system experts to obtain prompt and continuous feedback. To support interactive querying over system monitoring data, we propose ProbeQ, a system that progressively processes system-behavioral queries. It allows users to concisely compose queries that describe system behaviors and specify an update frequency to obtain partial results progressively. The query engine of ProbeQ is built based on a framework that partitions ProbeQ queries into sub-queries for parallel execution and retrieves partial results periodically based on the specified update frequency. We concretize the framework with three partition strategies that predict the workloads for sub-queries, where the adaptive workload partition strategy (AdWd) dynamically adjusts the predicted workloads for subsequent sub-queries based on the latest execution information. We evaluate the prototype system of ProbeQ on commonly used queries for suspicious behaviors over real-world system monitoring data, and the results show that the ProbeQ system can provide partial updates progressively (on average 9.1% deviation from the update frequencies) with only 1.2% execution overhead compared to the execution without progressive processing.

Contextual Grounding of Natural Language Entities in Images

December 13, 2019/in Publications/by NEC Labs America

In this paper, we introduce a contextual grounding approach that captures the context in corresponding text entities and image regions to improve the grounding accuracy. Specifically, the proposed architecture accepts pre-trained text token embeddings and image object features from an off-the-shelf object detector as input. Additional encoding to capture the positional and spatial information can be added to enhance the feature quality. There are separate text and image branches facilitating respective architectural refinements for different modalities. The text branch is pre-trained on a large-scale masked language modeling task while the image branch is trained from scratch. Next, the model learns the contextual representations of the text tokens and image objects through layers of high-order interaction respectively. The final grounding head ranks the correspondence between the textual and visual representations through cross-modal interaction. In the evaluation, we show that our model achieves the state-of-the-art grounding accuracy of 71.36% over the Flickr30K Entities dataset. No additional pre-training is necessary to deliver competitive results compared with related work that often requires task-agnostic and task-specific pre-training on cross-modal datasets. The implementation is publicly available at https://gitlab.com/necla-ml/grounding.

Multivariate Long-Term State Forecasting in Cyber-Physical Systems: A Sequence to Sequence Approach

December 9, 2019/in Publications/by NEC Labs America

Cyber-physical systems (CPS) are ubiquitous in several critical infrastructure applications. Forecasting the state of CPS, is essential for better planning, resource allocation and minimizing operational costs. It is imperative to forecast the state of a CPS multiple steps into the future to afford enough time for planning of CPS operation to minimize costs and component wear. Forecasting system state also serves as a precursor to detecting process anomalies and faults. Concomitantly, sensors used for data collection are commodity hardware and experience frequent failures resulting in periods with sparse or no data. In such cases, re-construction through imputation of the missing data sequences is imperative to alleviate data sparsity and enable better performance of down-stream analytic models. In this paper, we tackle the problem of CPS state forecasting and data imputation and characterize the performance of a wide array of deep learning architectures – unidirectional gated and non-gated recurrent architectures, sequence to sequence (Seq2Seq) architectures as well as bidirectional architectures – with a specific focus towards applications in CPS. We also study the impact of procedures like scheduled sampling and attention, on model training. Our results indicate that Seq2Seq models are superior to traditional step ahead forecasting models and yield an improvement of at least 28.5% for gated recurrent architectures and about 87.6% for non-gated architectures in terms of forecasting performance. We also notice that bidirectional models learn good representations for forecasting as well as for data imputation. Bidirectional Seq2Seq models show an average improvement of 17.6% in forecasting performance over their unidirectional counterparts. We also demonstrate the effect of employing an attention mechanism in the context of Seq2Seq architectures and find that it provides an average improvement of 57.12% in the case of unidirectional Seq2Seq architectures while causing a performance decline in the case of bidirectional Seq2Seq architectures. Finally, we also find that scheduled sampling helps in training better models that yield significantly lower forecasting error.

Adaptive Neural Network for Node Classification in Dynamic Networks

November 11, 2019/in Publications/by NEC Labs America

Given a network with the labels for a subset of nodes, transductive node classification targets to predict the labels for the remaining nodes in the network. This technique has been used in a variety of applications such as voxel functionality detection in brain network and group label prediction in social network. Most existing node classification approaches are performed in static networks. However, many real-world networks are dynamic and evolve over time. The dynamics of both node attributes and network topology jointly determine the node labels. In this paper, we study the problem of classifying the nodes in dynamic networks. The task is challenging for three reasons. First, it is hard to effectively learn the spatial and temporal information simultaneously. Second, the network evolution is complex. The evolving patterns lie in both node attributes and network topology. Third, for different networks or even different nodes in the same network, the node attributes, the neighborhood node representations and the network topology usually affect the node labels differently, it is desirable to assess the relative importance of different factors over evolutionary time scales. To address the challenges, we propose AdaNN, an adaptive neural network for transductive node classification. AdaNN learns node attribute information by aggregating the node and its neighbors, and extracts network topology information with a random walk strategy. The attribute information and topology information are further fed into two connected gated recurrent units to learn the spatio-temporal contextual information. Additionally, a triple attention module is designed to automatically model the different factors that influence the node representations. AdaNN is the first node classification model that is adaptive to different kinds of dynamic networks. Extensive experiments on real datasets demonstrate the effectiveness of AdaNN.

Learning Robust Representations with Graph Denoising Policy Network

November 11, 2019/in Publications/by NEC Labs America

Existing representation learning methods based on graph neural networks and their variants rely on the aggregation of neighborhood information, which makes it sensitive to noises in the graph, e.g. erroneous links between nodes, incorrect/missing node features. In this paper, we propose Graph Denoising Policy Network (short for GDPNet) to learn robust representations from noisy graph data through reinforcement learning. GDPNet first selects signal neighborhoods for each node, and then aggregates the information from the selected neighborhoods to learn node representations for the down-stream tasks. Specifically, in the signal neighborhood selection phase, GDPNet optimizes the neighborhood for each target node by formulating the process of removing noisy neighborhoods as a Markov decision process and learning a policy with task-specific rewards received from the representation learning phase. In the representation learning phase, GDPNet aggregates features from signal neighbors to generate node representations for down-stream tasks, and provides task-specific rewards to the signal neighbor selection phase. These two phases are jointly trained to select optimal sets of neighbors for target nodes with maximum cumulative task-specific rewards, and to learn robust representations for nodes. Experimental results on node classification task demonstrate the effectiveness of GDNet, outperforming the state-of-the-art graph representation learning methods on several well-studied datasets.

Self-Attentive Attributed Network Embedding Through Adversarial Learning

November 11, 2019/in Publications/by NEC Labs America

Network embedding aims to learn the low-dimensional representations/embeddings of vertices which preserve the structure and inherent properties of the networks. The resultant embeddings are beneficial to downstream tasks such as vertex classification and link prediction. A vast majority of real-world networks are coupled with a rich set of vertex attributes, which could be potentially complementary in learning better embeddings. Existing attributed network embedding models, with shallow or deep architectures, typically seek to match the representations in topology space and attribute space for each individual vertex by assuming that the samples from the two spaces are drawn uniformly. The assumption, however, can hardly be guaranteed in practice. Due to the intrinsic sparsity of sampled vertex sequences and incompleteness in vertex attributes, the discrepancy between the attribute space and the network topology space inevitably exists. Furthermore, the interactions among vertex attributes, a.k.a cross features, have been largely ignored by existing approaches. To address the above issues, in this paper, we propose Nettention, a self-attentive network embedding approach that can efficiently learn vertex embeddings on attributed network. Instead of sample-wise optimization, Nettention aggregates the two types of information through minimizing the difference between the representation distributions in the low-dimensional topology and attribute spaces. The joint inference is encapsulated in a generative adversarial training process, yielding better generalization performance and robustness. The learned distributions consider both locality-preserving and global reconstruction constraints which can be inferred from the learning of the adversarially regularized autoencoders. Additionally, a multi-head self-attention module is developed to explicitly model the attribute interactions. Extensive experiments on benchmark datasets have verified the effectiveness of the proposed Nettention model on a variety of tasks, including vertex classification and link prediction.

Contextual Grounding of Natural Language Phrases in Images

November 5, 2019/in Publications/by NEC Labs America

In this paper, we introduce a contextual grounding approach that captures the context in corresponding text entities and image regions to improve the grounding accuracy. Specifically, the proposed architecture accepts pre-trained text token embeddings and image object features from an off-the-shelf object detector as input. Additional encoding to capture the positional and spatial information can be added to enhance the feature quality. There are separate text and image branches facilitating respective architectural refinements for different modalities. The text branch is pre-trained on a large-scale masked language modeling task while the image branch is trained from scratch. Next, the model learns the contextual representations of the text tokens and image objects through layers of high-order interaction respectively. The final grounding head ranks the correspondence between the textual and visual representations through cross-modal interaction. In the evaluation, we show that our model achieves the state-of-the-art grounding accuracy of 71.36% over the Flickr30K Entities dataset. No additional pre-training is necessary to deliver competitive results compared with related work that often requires task-agnostic and task-specific pre-training on cross-modal datasets. The implementation is publicly available at https://gitlab.com/necla-ml/Grounding

On Novel Object Recognition: A Unified Framework for Discriminability and Adaptability

November 4, 2019/in Publications/by NEC Labs America

The rich and accessible labeled data fueled the revolutionary successes of deep learning in object recognition. However, recognizing objects of novel classes with limited supervision information provided, i.e., Novel Object Recognition (NOR), remains a challenging task. We identify in this paper two key factors for the success of NOR that previous approaches fail to simultaneously guarantee. The first is producing discriminative feature representations for images of novel classes, and the second is generating a flexible classifier readily adapted to novel classes provided with limited supervision signals. To secure both key factors, we propose a framework which decouples a deep classification model into a feature extraction module and a classification module. We learn the former to ensure feature discriminability with a standard multi-class classification task by fully utilizing the competing information among all classes within a training set, and learn the latter to secure adaptability by training a meta-learner network which generates classifier weights whenever provided with minimal supervision information of target classes. Extensive experiments on common benchmark datasets in the settings of both zero-shot and few-shot learning demonstrate our method achieves state-of-the-art performance.

Degeneracy in Self-Calibration Revisited and a Deep Learning Solution for Uncalibrated SLAM

November 3, 2019/in Publications/by NEC Labs America

Self-calibration of camera intrinsics and radial distortion has a long history of research in the computer vision community. However, it remains rare to see real applications of such techniques to modern Simultaneous Localization And Mapping (SLAM) systems, especially in driving scenarios. In this paper, we revisit the geometric approach to this problem, and provide a theoretical proof that explicitly shows the ambiguity between radial distortion and scene depth when two-view geometry is used to self-calibrate the radial distortion. In view of such geometric degeneracy, we propose a learning approach that trains a convolutional neural network (CNN) on a large amount of synthetic data. We demonstrate the utility of our proposed method by applying it as a checkerboard-free calibration tool for SLAM, achieving comparable or superior performance to previous learning and hand-crafted method

Learning 2D to 3D Lifting for Object Detection in 3D for Autonomous Vehicles

November 3, 2019/in Publications/by NEC Labs America

We address the problem of 3D object detection from 2D monocular images in autonomous driving scenarios. We propose to lift the 2D images to 3D representations using learned neural networks and leverage existing networks working directly on 3D data to perform 3D object detection and localization. We show that, with carefully designed training mechanism and automatically selected minimally noisy data, such a method is not only feasible, but gives higher results than many methods working on actual 3D inputs acquired from physical sensors. On the challenging KITTI benchmark, we show that our 2D to 3D lifted method outperforms many recent competitive 3D networks while significantly outperforming previous state-of-the-art for 3D detection from monocular images. We also show that a late fusion of the output of the network trained on generated 3D images, with that trained on real 3D images, improves performance. We find the results very interesting and argue that such a method could serve as a highly reliable backup in case of malfunction of expensive 3D sensors, if not potentially making them redundant, at least in the case of low human injury risk autonomous navigation scenarios like warehouse automation.

Progressive Processing of System-Behavioral Query

Multivariate Long-Term State Forecasting in Cyber-Physical Systems: A Sequence to Sequence Approach

Adaptive Neural Network for Node Classification in Dynamic Networks

Learning Robust Representations with Graph Denoising Policy Network

Self-Attentive Attributed Network Embedding Through Adversarial Learning

On Novel Object Recognition: A Unified Framework for Discriminability and Adaptability

Degeneracy in Self-Calibration Revisited and a Deep Learning Solution for Uncalibrated SLAM

Learning 2D to 3D Lifting for Object Detection in 3D for Autonomous Vehicles

Contact Us

About Us

Our Pages

Recent Publications

News & Events