A Query System for Efficiently Investigating Complex Attack Behaviors for Enterprise Security

The need to counter Advanced Persistent Threat (APT) attacks has led to solutions that ubiquitously monitor system activities on each enterprise host and perform timely attack investigation over the monitoring data to uncover the attack sequence. However, existing general-purpose query systems lack explicit language constructs for expressing key properties of major attack behaviors, and their semantics-agnostic design often produces inefficient execution plans for queries. To address these limitations, we build Aiql, a novel query system designed with domain-specific optimizations to enable efficient attack investigation. Aiql provides (1) a domain-specific data model and storage for the massive system monitoring data, (2) a domain-specific query language, the Attack Investigation Query Language (Aiql), that integrates critical primitives for expressing major attack behaviors, and (3) an optimized query engine that uses the characteristics of the data and the semantics of the query to schedule execution efficiently. We have deployed Aiql at NEC Labs America across 150 hosts. In our demo, we aim to show the complete usage scenario of Aiql by (1) performing an APT attack in a controlled environment, and (2) using Aiql to investigate the attack by querying the collected system monitoring data that contains the attack traces. The audience will have the option to perform the APT attack themselves under our guidance, interact with the system, and investigate the attack by issuing queries and checking the query results through our web UI.

Model transfer of QoT prediction in optical networks based on artificial neural networks

An artificial neural network (ANN) based transfer learning model is built for quality of transmission (QoT) prediction in optical systems with different modulation formats. Knowledge learned from one optical system can be transferred to a similar optical system by adjusting weights in ANN hidden layers with a few additional training samples, where highly related information from both systems is integrated and redundant information is discarded. Homogeneous and heterogeneous ANN structures are implemented to achieve accurate Q-factor-based QoT prediction with low root-mean-square error. The transfer learning accuracy under different modulation formats, transmission distances, and fiber types is evaluated. Using transfer learning, the number of retraining samples is reduced from 1000 to as low as 20, and the training time is reduced by up to a factor of four.
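
The accuracy metric named above, root-mean-square error, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `rmse` and the Q-factor values are hypothetical, and only the sample counts (1000 and 20) come from the abstract.

```python
import math

def rmse(predicted, measured):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical Q-factor values in dB, for illustration only.
predicted_q = [10.0, 12.0, 14.0]
measured_q = [10.0, 12.0, 16.0]
error_db = rmse(predicted_q, measured_q)

# Reduction in retraining samples reported in the abstract (1000 down to 20).
sample_reduction = 1000 / 20
```

A lower `error_db` means the predicted Q-factors track the measured ones more closely; `sample_reduction` expresses the reported saving as a ratio.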

Heterogeneous Graph Matching Networks for Unknown Malware Detection

Information systems have widely been the target of malware attacks. Traditional signature-based malicious program detection algorithms can only detect known malware and are prone to evasion techniques such as binary obfuscation, while behavior-based approaches rely heavily on malware training samples and incur prohibitively high training costs. To address the limitations of existing techniques, we propose MatchGNet, a heterogeneous Graph Matching Network model that learns the graph representation and similarity metric simultaneously based on invariant graph modeling of the program’s execution behaviors. We conduct a systematic evaluation of our model and show that it is accurate in detecting malicious program behavior and can help detect malware attacks with fewer false positives. MatchGNet outperforms the state-of-the-art algorithms in malware detection by generating 50% fewer false positives while keeping zero false negatives.

Spatio-Temporal Attentive RNN for Node Classification in Temporal Attributed Graphs

Node classification in graph-structured data aims to classify the nodes when labels are available for only a subset of nodes. This problem has attracted considerable research effort in recent years. In real-world applications, both graph topology and node attributes evolve over time. Existing techniques, however, mainly focus on static graphs and lack the capability to simultaneously learn both temporal and spatial/structural features. Node classification in temporal attributed graphs is challenging in two major respects. First, effectively modeling the spatio-temporal contextual information is hard. Second, as temporal and spatial dimensions are entangled, it is desirable yet challenging, when learning the feature representation of a target node, to differentiate the relative importance of different factors, such as different neighbors and time periods. In this paper, we propose STAR, a spatio-temporal attentive recurrent network model, to deal with the above challenges. STAR extracts the vector representation of a neighborhood by sampling and aggregating local neighbor nodes. It further feeds both the neighborhood representation and node attributes into a gated recurrent unit network to jointly learn the spatio-temporal contextual information. On top of that, we take advantage of the dual attention mechanism to perform a thorough analysis of the model interpretability. Extensive experiments on real datasets demonstrate the effectiveness of the STAR model.
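
The sampling and aggregation step described above might look roughly like the sketch below. It is not the STAR implementation: the mean aggregator, the function names, and the toy graph are assumptions made for illustration, and the gated recurrent unit and attention layers are omitted.

```python
import random

def aggregate_neighbors(node, adjacency, features, sample_size, rng):
    """Mean of the feature vectors of up to `sample_size` sampled neighbors.

    The mean aggregator is an assumption; the abstract only says that
    neighbors are sampled and aggregated.
    """
    neighbors = adjacency[node]
    if len(neighbors) > sample_size:
        neighbors = rng.sample(neighbors, sample_size)
    dim = len(features[node])
    if not neighbors:
        return [0.0] * dim
    return [sum(features[n][d] for n in neighbors) / len(neighbors)
            for d in range(dim)]

def recurrent_inputs(node, snapshots, sample_size, rng):
    """One input vector per time step: node attributes + neighborhood vector."""
    return [features[node] + aggregate_neighbors(node, adjacency, features,
                                                 sample_size, rng)
            for adjacency, features in snapshots]

# Hypothetical two-step temporal graph: topology and attributes both change.
snapshots = [
    ({0: [1, 2], 1: [0], 2: [0]},
     {0: [1.0, 0.0], 1: [2.0, 4.0], 2: [4.0, 0.0]}),
    ({0: [1], 1: [0], 2: []},
     {0: [1.0, 1.0], 1: [6.0, 2.0], 2: [4.0, 0.0]}),
]
inputs = recurrent_inputs(0, snapshots, sample_size=5, rng=random.Random(0))
```

Each element of `inputs` would be fed to the recurrent network at the corresponding time step, so that temporal and structural context are learned jointly.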

Aggregation of BTM Battery Storages to Provide Ancillary Services in Wholesale Electricity Markets

Behind-the-meter battery energy storage systems (BTM-BESSs) have been widely deployed in industrial and commercial buildings to manage electricity transactions with utilities and reduce customers’ electricity bills. Commercial BTM battery storage is mainly employed to cut customers’ monthly demand peaks, which is rewarded by a significant decrease in the monthly demand charge. However, given the complexity of demand charge management problems, the rates of return on investment for installing BTM-BESSs are not appealing enough. In this paper, an aggregation model for BTM-BESSs is proposed that gives BTM-EMS units the opportunity to participate in multiple wholesale markets and provide ancillary services, in addition to demand charge management, in order to maximize owners’ payoff from installing BTM-BESSs. Finally, the efficiency of the proposed aggregation model is validated through simulation studies on real-world data.
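
The link between peak shaving and the demand charge can be illustrated with a small calculation. This sketch assumes a demand charge billed as the monthly peak demand times a flat rate, and it ignores battery energy and power limits; the load values, rate, and threshold are hypothetical.

```python
def demand_charge(load_kw, rate_per_kw):
    """Monthly demand charge, assumed to be peak demand times a flat rate."""
    return max(load_kw) * rate_per_kw

def shave_peaks(load_kw, threshold_kw):
    """Idealized peak shaving: the battery covers all demand above the threshold.

    Battery capacity and power limits are ignored (a simplifying assumption).
    """
    return [min(p, threshold_kw) for p in load_kw]

# Hypothetical demand readings (kW) and tariff ($/kW).
load_kw = [300.0, 450.0, 520.0, 480.0]
rate_per_kw = 15.0

charge_before = demand_charge(load_kw, rate_per_kw)
charge_after = demand_charge(shave_peaks(load_kw, 400.0), rate_per_kw)
saving = charge_before - charge_after
```

Because the charge depends only on the single highest reading, the saving here comes entirely from lowering the peak from 520 kW to the 400 kW threshold.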

Conditional GAN with Discriminative Filter Generation for Text-to-Video Synthesis

Developing conditional generative models for text-to-video synthesis is an extremely challenging yet important topic of research in machine learning. In this work, we address this problem by introducing the Text-Filter conditioning Generative Adversarial Network (TFGAN), a conditional GAN model with a novel multi-scale text-conditioning scheme that improves text-video associations. By combining the proposed conditioning scheme with a deep GAN architecture, TFGAN generates high-quality videos from text on challenging real-world video datasets. In addition, we construct a synthetic dataset of text-conditioned moving shapes to systematically evaluate our conditioning scheme. Extensive experiments demonstrate that TFGAN significantly outperforms existing approaches and can also generate videos of novel categories not seen during training.

Learning K-way D-dimensional Discrete Embedding for Hierarchical Data Visualization and Retrieval

Traditional embedding approaches associate a real-valued embedding vector with each symbol or data point, which is equivalent to applying a linear transformation to the “one-hot” encoding of discrete symbols or data objects. Despite their simplicity, these methods generate storage-inefficient representations and fail to effectively encode the internal semantic structure of data, especially when the number of symbols or data points and the dimensionality of the real-valued embedding vectors are large. In this paper, we propose a regularized autoencoder framework to learn compact Hierarchical K-way D-dimensional (HKD) discrete embeddings of symbols or data points, aiming to capture the essential semantic structures of data. Experimental results on synthetic and real-world datasets show that our proposed HKD embedding can effectively reveal the semantic structure of data via hierarchical data visualization and greatly reduce the search space of nearest neighbor retrieval while preserving high accuracy.
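
The equivalence between an embedding lookup and a linear map of a one-hot vector, and the storage argument for discrete codes, can be checked with a short sketch. All values are hypothetical, and the bit counts assume 32-bit floats and a plain fixed-width encoding of each K-way digit, which the abstract does not specify.

```python
def one_hot(index, size):
    """One-hot row vector with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def vector_times_matrix(vec, matrix):
    """Row vector times matrix, written out explicitly."""
    cols = len(matrix[0])
    return [sum(vec[r] * matrix[r][c] for r in range(len(matrix)))
            for c in range(cols)]

# Hypothetical embedding table: 3 symbols, 2 real-valued dimensions.
embedding = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
looked_up = embedding[1]
projected = vector_times_matrix(one_hot(1, 3), embedding)

def real_embedding_bits(dim, bits_per_float=32):
    """Storage per symbol for a real-valued vector (32-bit floats assumed)."""
    return dim * bits_per_float

def kd_code_bits(k, d):
    """Storage per symbol for D digits, each taking one of K values."""
    return d * (k - 1).bit_length()

dense_bits = real_embedding_bits(300)  # hypothetical 300-dimensional vector
code_bits = kd_code_bits(k=16, d=8)    # hypothetical K=16, D=8 code
```

Here `projected` equals `looked_up`, and the discrete code needs far fewer bits per symbol than the dense vector under these assumptions.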

A Study on Traffic Flow Monitoring Using Optical Fiber Sensor Technology

Traffic conditions on highways are observed at fixed spots by traffic counters and CCTV cameras, so the detection of traffic disturbances that occur away from these observation spots may be delayed. In addition, because traffic flow is estimated indirectly from point-observation data, there is a problem with the accuracy of capturing time-series trends and changes in local conditions. We therefore focused on optical fiber sensing technology that uses the existing optical fiber infrastructure along highways, and we are working to measure the travel vibrations of vehicles along the infrastructure as a continuous line and to grasp traffic flow from a bird’s-eye view based on vehicle trajectories. In this study, we attempted traffic flow observation and average speed estimation on the Tomei and New Tomei Expressways. As a result, traffic flow could be observed in real time over a 45 km demonstration section, and average speeds equivalent to those from existing traffic counters were calculated successfully. This technology has thus shown promise as a wide-area, real-time, bird’s-eye technique for monitoring traffic flow.
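
One simple way to turn vehicle trajectories into an average speed is sketched below. The abstract does not describe the estimation algorithm, so this is an assumption: each trajectory is taken to be a time-ordered list of (time in seconds, position along the fiber in meters) pairs, and the values are hypothetical.

```python
def trajectory_speed_kmh(trajectory):
    """Average speed of one vehicle from its first and last trajectory points."""
    (t0, x0), (t1, x1) = trajectory[0], trajectory[-1]
    if t1 <= t0:
        raise ValueError("trajectory must span a positive time interval")
    return abs(x1 - x0) / (t1 - t0) * 3.6  # m/s to km/h

def mean_speed_kmh(trajectories):
    """Arithmetic mean of per-vehicle speeds over a road section."""
    speeds = [trajectory_speed_kmh(t) for t in trajectories]
    return sum(speeds) / len(speeds)

# Hypothetical trajectories: (time_s, position_m) along the fiber.
trajectories = [
    [(0.0, 1000.0), (10.0, 1250.0), (20.0, 1500.0)],  # 25 m/s
    [(5.0, 2000.0), (15.0, 2200.0), (25.0, 2400.0)],  # 20 m/s
]
section_speed = mean_speed_kmh(trajectories)
```

Because every vehicle along the fiber contributes a trajectory, the same calculation can be repeated per section to build a continuous speed profile instead of point readings.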

Deep Supervision with Intermediate Concepts (IEEE)

Recent data-driven approaches to scene interpretation predominantly pose inference as an end-to-end black-box mapping, commonly performed by a Convolutional Neural Network (CNN). However, decades of work on perceptual organization in both human and machine vision suggest that there are often intermediate representations that are intrinsic to an inference task and that provide essential structure to improve generalization. In this work, we explore an approach for injecting prior domain structure into neural network training by supervising hidden layers of a CNN with intermediate concepts that normally are not observed in practice. We formulate a probabilistic framework that formalizes these notions and predicts improved generalization via this deep supervision method. One advantage of this approach is that we are able to train only on synthetic CAD renderings of cluttered scenes, where concept values can be extracted, but apply the results to real images. Our implementation achieves state-of-the-art performance in 2D/3D keypoint localization and image classification on real image benchmarks including KITTI, PASCAL VOC, PASCAL3D+, IKEA, and CIFAR100. We provide additional evidence that our approach outperforms alternative forms of supervision, such as multi-task networks.

Pose-variant 3D Facial Attribute Generation

We address the challenging problem of generating facial attributes using a single image in an unconstrained pose. In contrast to prior works that largely consider generation on 2D near-frontal images, we propose a GAN-based framework to generate attributes directly on a dense 3D representation given by UV texture and position maps, resulting in photorealistic, geometrically consistent and identity-preserving outputs. Starting from a self-occluded UV texture map obtained by applying an off-the-shelf 3D reconstruction method, we propose two novel components. First, a texture completion generative adversarial network (TC-GAN) completes the partial UV texture map. Second, a 3D attribute generation GAN (3DA-GAN) synthesizes the target attribute while obtaining an appearance consistent with 3D face geometry and preserving identity. Extensive experiments on CelebA, LFW and IJB-A show that our method achieves consistently better attribute generation accuracy than prior methods, a higher degree of qualitative photorealism, and better preservation of face identity information.