Deep Co-Clustering

Co-clustering partitions instances and features simultaneously by leveraging the duality between them, and it often yields impressive performance improvements over traditional clustering algorithms. Recent developments in learning deep representations have demonstrated their advantage in extracting effective features. However, research on leveraging deep learning frameworks for co-clustering is limited for two reasons: 1) current deep clustering approaches usually decouple feature learning and cluster assignment into two separate steps, which cannot yield task-specific feature representations; 2) existing deep clustering approaches cannot learn representations for instances and features simultaneously. In this paper, we propose a deep learning model for co-clustering called DeepCC. DeepCC utilizes a deep autoencoder for dimension reduction and employs a variant of the Gaussian Mixture Model (GMM) to infer the cluster assignments. A mutual information loss is proposed to bridge the training of instances and features. DeepCC jointly optimizes the parameters of the deep autoencoder and the mixture model in an end-to-end fashion on both the instance and the feature spaces, which helps the deep autoencoder escape from local optima and the mixture model circumvent the Expectation-Maximization (EM) algorithm. To the best of our knowledge, DeepCC is the first deep learning model for co-clustering. Experimental results on various datasets demonstrate the effectiveness of DeepCC.
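To make the bridging idea concrete, below is a minimal numpy sketch of a mutual-information objective between instance-side and feature-side soft cluster assignments, in the spirit of information-theoretic co-clustering. The variable names and the exact form of the loss are our illustrative assumptions, not DeepCC's published formulation.

```python
import numpy as np

def mutual_information_loss(data, inst_assign, feat_assign, eps=1e-12):
    """Sketch of a co-clustering mutual-information bridge.
    data:        (n, m) nonnegative data matrix
    inst_assign: (n, k) soft cluster assignments of instances (rows sum to 1)
    feat_assign: (m, l) soft cluster assignments of features (rows sum to 1)
    Returns the negative mutual information between the instance-cluster
    and feature-cluster variables, to be minimized."""
    joint = data / data.sum()                    # empirical p(x, y)
    p_co = inst_assign.T @ joint @ feat_assign   # p(x_hat, y_hat), shape (k, l)
    p_co = p_co / p_co.sum()
    p_row = p_co.sum(axis=1, keepdims=True)      # marginal p(x_hat)
    p_col = p_co.sum(axis=0, keepdims=True)      # marginal p(y_hat)
    mi = np.sum(p_co * np.log((p_co + eps) / (p_row @ p_col + eps)))
    return -mi                                   # maximizing MI = minimizing -MI
```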

Attentional Heterogeneous Graph Neural Network: Application to Program Reidentification

A program or process is an integral part of almost every IT/OT system. Can we trust the identity/ID (e.g., executable name) of a program? To avoid detection, malware may disguise itself using the ID of a legitimate program, and a system tool (e.g., PowerShell) used by attackers may carry the fake ID of another, less sensitive piece of common software. However, existing intrusion detection techniques often overlook this critical program reidentification problem (i.e., checking a program's identity). In this paper, we propose an attentional heterogeneous graph neural network model (DeepHGNN) to verify a program's identity based on its system behaviors. The key idea is to leverage representation learning on the heterogeneous program behavior graph to guide the reidentification process. We formulate program reidentification as a graph classification problem and develop an effective attentional heterogeneous graph embedding algorithm to solve it. Extensive experiments using real-world enterprise monitoring data and real attacks demonstrate the effectiveness of DeepHGNN across multiple popular metrics and its robustness to normal dynamic changes such as program version upgrades.
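As a rough illustration of the attentional aggregation pattern over typed neighbors in a heterogeneous behavior graph (our own sketch; the paper's exact architecture, scoring function, and layer sizes are not reproduced here), consider:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(center, neighbors, W, a):
    """Single-head additive attention: score each projected neighbor
    against the projected center node, then take the weighted sum."""
    h_c = center @ W                              # (d,)
    h_n = neighbors @ W                           # (num_nbrs, d)
    pairs = np.concatenate([np.tile(h_c, (len(h_n), 1)), h_n], axis=1)
    alpha = softmax(np.tanh(pairs) @ a)           # attention over neighbors
    return alpha @ h_n                            # (d,)

def hetero_embed(center, neighbors_by_type, W_by_type, a_by_type):
    """Attend within each edge type (e.g., process->file, process->socket),
    then sum the per-type summaries: a common heterogeneous-GNN pattern."""
    return sum(attend(center, nbrs, W_by_type[t], a_by_type[t])
               for t, nbrs in neighbors_by_type.items())

# Toy usage with random features and two hypothetical edge types.
rng = np.random.default_rng(0)
d = 8
center = rng.normal(size=d)
nbrs = {"opens_file": rng.normal(size=(5, d)),
        "spawns_proc": rng.normal(size=(3, d))}
W = {t: rng.normal(size=(d, d)) for t in nbrs}
a = {t: rng.normal(size=2 * d) for t in nbrs}
print(hetero_embed(center, nbrs, W, a).shape)     # (8,)
```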

A Deep Spatio-Temporal Fuzzy Neural Network for Passenger Demand Prediction

In spite of its importance, passenger demand prediction is a highly challenging problem, because demand is simultaneously influenced by complex interactions among many spatial and temporal factors as well as external factors such as weather. To address this problem, we propose a Spatio-TEmporal Fuzzy neural Network (STEF-Net) to accurately predict passenger demand by incorporating the complex interactions of all known important factors. We design an end-to-end learning framework in which different neural networks model different factors. Specifically, we capture spatio-temporal feature interactions via a convolutional long short-term memory network and model external factors via a fuzzy neural network, which handles data uncertainty significantly better than deterministic methods. To preserve temporal relations when fusing the two networks and to emphasize discriminative spatio-temporal feature interactions, we employ a novel feature fusion method with a convolution operation and an attention layer. As far as we know, our work is the first to fuse a deep recurrent neural network and a fuzzy neural network to model complex spatio-temporal feature interactions with additional uncertain input features for predictive learning. Experiments on a large-scale real-world dataset show that our model achieves more than a 10% improvement over state-of-the-art approaches.
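The following toy sketch illustrates the fusion idea of keeping the time axis while mixing the two feature streams and reweighting time steps with attention. Shapes, weights, and the 1x1-convolution choice are our assumptions for illustration, not STEF-Net's exact layers.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse(conv_lstm_out, fuzzy_out, conv_w, attn_w):
    """Fuse a (T, d) ConvLSTM stream with a (T, d) fuzzy-NN stream:
    concatenate per time step, mix with a 1x1 convolution (a shared
    linear map over time), then attend over time steps.
    conv_w: (2d, d) mixing weights; attn_w: (d,) attention vector."""
    x = np.concatenate([conv_lstm_out, fuzzy_out], axis=1)  # (T, 2d)
    h = np.tanh(x @ conv_w)                                 # (T, d), time axis kept
    alpha = softmax(h @ attn_w)                             # (T,) attention weights
    return alpha @ h                                        # fused (d,) summary

# Toy usage: 12 time steps, 16-dim features per stream.
rng = np.random.default_rng(0)
T, d = 12, 16
print(fuse(rng.normal(size=(T, d)), rng.normal(size=(T, d)),
           rng.normal(size=(2 * d, d)), rng.normal(size=d)).shape)  # (16,)
```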

Spectrally-Efficient 200G Probabilistically-Shaped 16QAM over 9000km Straight Line Transmission with Flexible Multiplexing Scheme

Flexible wavelength-multiplexing techniques have been deployed in backbone submarine networks to accommodate the trend toward variable-rate modulation formats. In this paper, we propose a new design of flexible-rate transponders under a flexible multiplexing scheme to achieve near-Shannon performance. Probabilistically shaped (PS) M-QAM is capable of adjusting the bit rate at a very fine granularity by adapting the entropy of the distribution matcher. Instead of delivering variable bit rates at a fixed baud rate, 200Gb/s PS-16QAM at various baud rates is demonstrated to fit into flexible grid slots in multiples of 3.125GHz. This flexible baud rate conserves the limited optical bandwidth assigned by the flexible multiplexing scheme and improves bandwidth utilization. The 200G PS-16QAM signals are experimentally demonstrated over a 9000km straight-line testbed, achieving 3.05b/s/Hz~5.33b/s/Hz spectral efficiency (SE) with up to 4dB Q margin. In addition, high baud rate signals are used for lower-SE transmission while low baud rate signals target high-SE transmission to reduce the implementation penalty.
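To illustrate the entropy knob mentioned above, the standard approach shapes constellation points with a Maxwell-Boltzmann distribution whose rate parameter tunes the per-symbol entropy. The short numpy sketch below computes that entropy for 16QAM; the specific lambda values are arbitrary examples, not the paper's operating points.

```python
import numpy as np

def mb_entropy(points, lam):
    """Entropy (bits/symbol) of a Maxwell-Boltzmann distribution
    p(x) ~ exp(-lam * |x|^2) over a constellation. Increasing `lam`
    favors low-energy points and lowers the entropy, which is the
    distribution-matcher knob for adjusting the information rate."""
    p = np.exp(-lam * np.abs(points) ** 2)
    p /= p.sum()
    return -np.sum(p * np.log2(p))

# Unit-spaced 16QAM constellation.
levels = np.array([-3, -1, 1, 3])
const = np.array([a + 1j * b for a in levels for b in levels])
for lam in (0.0, 0.05, 0.15):
    print(f"lambda={lam:.2f}: H={mb_entropy(const, lam):.3f} bits/symbol")
# lambda=0 recovers uniform 16QAM (4 bits/symbol); larger lambda lowers H.
```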

Fiber Nonlinearity Compensation by Neural Networks

A neural network (NN) is proposed to work together with a perturbation-based nonlinearity compensation (NLC) algorithm by feeding it with intra-channel cross-phase modulation (IXPM) and intra-channel four-wave mixing (IFWM) triplets. Without prior knowledge of the transmission link or the signal pulse shaping/baud rate, the optimum NN architecture and its tensor weights are constructed entirely through a data-driven approach by exploring the training datasets. After trimming the unnecessary input tensors based on their weights, the complexity is further reduced by applying the trained NN model at the transmitter side, thanks to the limited alphabet size of the modulation formats. The performance advantage of Tx-side NN-NLC is experimentally demonstrated using both single-channel and WDM-channel 32Gbaud dual-polarization 16QAM over 2800km transmission.
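For context, a first-order perturbation model expresses the nonlinear distortion on a symbol as a weighted sum of triplets of neighboring symbols. The sketch below builds such triplet features as NN inputs; the windowing, padding, and feature layout are our illustrative assumptions rather than the paper's exact preprocessing.

```python
import numpy as np

def build_triplets(symbols, k, window):
    """Illustrative first-order perturbation triplets around symbol k:
    T(m, n) = A[k+m] * A[k+n] * conj(A[k+m+n]), covering IXPM/IFWM
    terms. Assumes `symbols` is padded so all indices stay in range.
    In practice the real/imaginary parts become NN input features."""
    feats = []
    for m in range(-window, window + 1):
        for n in range(-window, window + 1):
            feats.append(symbols[k + m] * symbols[k + n]
                         * np.conj(symbols[k + m + n]))
    return np.array(feats)

# Toy usage on random QPSK-like symbols with a guard band.
rng = np.random.default_rng(1)
syms = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], size=64)
syms = np.pad(syms, 16)                            # keep indices valid
print(build_triplets(syms, k=32, window=4).shape)  # (81,) complex features
```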

Coupled-Core Fiber Design For Enhancing Nonlinearity Tolerance

Fiber nonlinearity is a major limitation on the achievable maximum capacity per fiber core. Digital signal processing (DSP) can be used directly to compensate for nonlinear impairments, but with limited effectiveness. It is well known that fibers with higher chromatic dispersion (CD) reduce nonlinear impairments, and CD can be handled with DSP. Since the maximum CD is limited by the material dispersion of the fiber, we propose using strongly-coupled multi-core fibers with large group delay (GD) between the cores. Nonlinear mitigation is achieved through strong mode coupling and group delay between the cores, which suppress the four-wave mixing interaction by inducing a large, albeit stochastic, phase mismatch. Through simulations we determine that the threshold GD required for noticeable nonlinearity suppression depends on the fiber CD. In particular, for dispersion-uncompensated links, a large GD on the order of 1ns per 1000km is required to improve the optimum Q by 1dB. Furthermore, beyond this threshold, larger GD results in larger suppression without any signs of saturation.
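As a back-of-the-envelope illustration of the phase-mismatch mechanism (our own sanity check, not a result from the paper): two spectral components separated by df that accumulate a differential group delay dtau acquire a relative phase of 2*pi*df*dtau, and once this greatly exceeds one cycle, four-wave-mixing contributions stop adding coherently.

```python
import numpy as np

df = 10e9     # assumed 10 GHz spectral separation within the signal band
dtau = 1e-9   # ~1 ns group delay per 1000 km, the order quoted above
dphi = 2 * np.pi * df * dtau
print(f"relative phase: {dphi:.1f} rad (~{dphi / (2 * np.pi):.0f} cycles)")
# ~63 rad (~10 cycles): far past coherent build-up of the FWM terms.
```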

PoLPer: Process-Aware Restriction of Over-Privileged Setuid Calls in Legacy Applications

Setuid system calls enable critical functions such as user authentication and modular privileged components. Such operations must only be executed after careful validation. However, current systems do not perform rigorous checks, allowing exploitation of privileges through memory corruption vulnerabilities in privileged programs. As a solution, understanding which setuid system calls can be invoked in what context of a process allows precise enforcement of least privilege. We propose a novel comprehensive method to systematically extract and enforce the least privilege of setuid system calls to prevent misuse. Our approach learns the required process contexts of setuid system calls along multiple dimensions, namely process hierarchy, call stack, and parameters, in a process-aware way. Every setuid system call is then restricted to its per-process context by our kernel-level context enforcer. Previous approaches without process-awareness are too coarse-grained to control setuid system calls, resulting in over-privilege. Our method reduces available privileges even for identical code depending on whether it is run by a parent or a child process. We present our prototype, PoLPer, which systematically discovers only the required setuid system calls and effectively prevents real-world exploits targeting vulnerabilities of the setuid family of system calls in popular desktop and server software at near-zero overhead.
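A minimal sketch of the per-process context check follows. This is our simplification in user-space Python, not PoLPer's kernel-level enforcer; the lineage string, stack entries, and argument encoding are hypothetical.

```python
# A setuid call is allowed only if its (process lineage, call stack,
# argument) tuple was observed during the learning phase.
ALLOWED = {
    ("sshd>sshd", ("do_setusercontext", "permanently_set_uid"), "uid:1000"),
}

def check_setuid(lineage, call_stack, arg):
    """Return True iff this exact process-aware context was learned."""
    return (lineage, tuple(call_stack), arg) in ALLOWED

# The learned context passes; the same code reached via an unexpected
# child process and stack (e.g., after a memory-corruption hijack) is
# denied, illustrating parent/child-specific privilege reduction.
print(check_setuid("sshd>sshd",
                   ["do_setusercontext", "permanently_set_uid"],
                   "uid:1000"))                        # True
print(check_setuid("sshd>bash", ["exploit_gadget"], "uid:0"))  # False
```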

On the Performance Metric and Design of Non-Uniformly Shaped Constellation

Asymmetric information is shown to be more accurate in characterizing the performance of quadrant-folding shaped (QFS) M-QAM. The performance difference among QFS M-QAM schemes strongly depends on the FEC coding rate, and the optimum FEC coding rate is found to be around 0.8, independent of the QFS M-QAM order and the designed rates.

First Field Trial of Sensing Vehicle Speed, Density, and Road Conditions by Using Fiber Carrying High Speed Data

For the first time, we demonstrate the detection of vehicle speed, density, and road conditions using deployed fiber carrying high-speed data transmission, and prove that carriers' large-scale fiber infrastructures can also be used as ubiquitous sensing networks.

NODOZE: Combatting Threat Alert Fatigue with Automated Provenance Triage

Large enterprises increasingly rely on threat detection software (e.g., Intrusion Detection Systems) to spot suspicious activities. This software generates alerts that must be investigated by cyber analysts to determine whether they are true attacks. Unfortunately, in practice, there are more alerts than cyber analysts can properly investigate, leading to a “threat alert fatigue” or information overload problem in which cyber analysts miss true attack alerts in the noise of false alarms. In this paper, we present NoDoze to combat this challenge using the contextual and historical information of generated threat alerts in an enterprise. NoDoze first generates a causal dependency graph of an alert event. Then, it assigns an anomaly score to each event in the dependency graph based on the frequency with which related events have happened before in the enterprise. NoDoze then propagates those scores along the edges of the graph using a novel network diffusion algorithm and generates a subgraph with an aggregate anomaly score, which is used to triage alerts. Evaluation on our dataset of 364 threat alerts shows that NoDoze decreases the volume of false alarms by 86%, saving more than 90 hours of analysts' time that would otherwise be required to investigate those false alarms. Furthermore, the dependency graphs NoDoze generates for true alerts are two orders of magnitude smaller than those generated by traditional tools, without sacrificing the vital information needed for investigation. Our system has a low average runtime overhead and can be deployed with any threat detection software.
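The toy sketch below conveys the diffusion step: starting from per-event anomaly scores (rarer events score higher), scores are repeatedly blended with dependency-graph neighbors so that anomalousness flows along causal edges. The decay parameter, iteration count, and symmetric row-normalization are our simplifying assumptions; NoDoze's actual network diffusion algorithm is more involved.

```python
import numpy as np

def propagate(adjacency, freq_scores, decay=0.5, iters=10):
    """Blend each event's frequency-based anomaly score with its
    neighbors' scores along dependency edges, personalized-PageRank
    style, so rare events lift the causally related events around them."""
    A = adjacency / np.maximum(adjacency.sum(axis=1, keepdims=True), 1)
    s = freq_scores.copy()
    for _ in range(iters):
        s = (1 - decay) * freq_scores + decay * (A @ s)
    return s

# Three-event causal chain: the rare middle event (score 0.9) raises
# the scores of its otherwise common neighbors, shrinking what an
# analyst must inspect to the anomalous subgraph.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)
scores = np.array([0.1, 0.9, 0.1])
print(propagate(A, scores))
```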