Leveraging Knowledge Bases for Future Prediction with Memory Comparison Networks

Making predictions about what might happen in the future is important for reacting adequately in many situations. For example, observing that “Man kidnaps girl” may have the consequence that “Man kills girl”. While this is part of common sense reasoning for humans, it is not obvious how machines can acquire and generalize over such knowledge. In this article, we propose a new type of memory network that can predict the next event even for observations that are not in the knowledge base. We evaluate our proposed method on two knowledge bases: Reuters KB (events from news articles) and Regneri KB (events from scripts). For both knowledge bases, our proposed method shows similar or better prediction accuracy on unseen events (or scripts) than recently proposed deep neural networks and rankSVM. We also demonstrate that the attention mechanism of our proposed method can be helpful for error analysis and manual expansion of the knowledge base.

Teaching Syntax by Adversarial Distraction

Existing entailment datasets mainly pose problems which can be answered without attention to grammar or word order. Learning syntax requires comparing examples where different grammar and word order change the desired classification. We introduce several datasets based on synthetic transformations of natural entailment examples in SNLI or FEVER, to teach aspects of grammar and word order. We show that without retraining, popular entailment models are unaware that these syntactic differences change meaning. With retraining, some but not all popular entailment models can learn to compare the syntax properly.

Team Papelo: Transformer Networks at FEVER

We develop a system for the FEVER fact extraction and verification challenge that uses a high precision entailment classifier based on transformer networks pretrained with language modeling, to classify a broad set of potential evidence. The precision of the entailment classifier allows us to enhance recall by considering every statement from several articles to decide upon each claim. We include not only the articles best matching the claim text by TF-IDF score, but also additional articles whose titles match named entities and capitalized expressions occurring in the claim text. The entailment module evaluates potential evidence one statement at a time, together with the title of the page the evidence came from (providing a hint about possible pronoun antecedents). In preliminary evaluation, the system achieves .5736 FEVER score, .6108 label accuracy, and .6485 evidence F1 on the FEVER shared task test set.
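As a rough illustration of the article selection step described above, the following Python sketch ranks articles against a claim by TF-IDF cosine similarity and then adds any article whose title occurs in the claim. It is not the system's actual retrieval code: the function names, the whitespace tokenizer, the smoothed IDF formula, and the use of plain substring matching in place of named-entity and capitalized-expression matching are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed IDF (an assumed variant, chosen so every weight stays positive).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vecs = [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]
    return vecs, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def candidate_articles(claim, articles, k=1):
    """articles maps title -> text. Returns the top-k titles by TF-IDF,
    followed by any other title that appears verbatim in the claim."""
    titles = list(articles)
    vecs, idf = tfidf_vectors([articles[t] for t in titles])
    claim_tf = Counter(claim.lower().split())
    claim_vec = {t: tf * idf.get(t, 0.0) for t, tf in claim_tf.items()}
    scores = {t: cosine(claim_vec, v) for t, v in zip(titles, vecs)}
    chosen = sorted(titles, key=scores.get, reverse=True)[:k]
    for t in titles:
        # Hypothetical stand-in for entity / capitalized-expression matching.
        if t.lower() in claim.lower() and t not in chosen:
            chosen.append(t)
    return chosen
```

Each selected article would then be split into statements and passed, together with its title, to the entailment classifier; that stage is outside the scope of this sketch.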

Learning Context-Sensitive Convolutional Filters for Text Processing

Convolutional neural networks (CNNs) have recently emerged as a popular building block for natural language processing (NLP). Despite their success, most existing CNN models employed in NLP share the same learned (and static) set of filters for all input sentences. In this paper, we consider an approach of using a small meta network to learn context-sensitive convolutional filters for text processing. The role of the meta network is to abstract the contextual information of a sentence or document into a set of input-sensitive filters. We further generalize this framework to model sentence pairs, where a bidirectional filter generation mechanism is introduced to encapsulate co-dependent sentence representations. In our benchmarks on four different tasks, including ontology classification, sentiment analysis, answer sentence selection, and paraphrase identification, our proposed model, a modified CNN with context-sensitive filters, consistently outperforms the standard CNN and attention-based CNN baselines. By visualizing the learned context-sensitive filters, we further validate and rationalize the effectiveness of the proposed framework.
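To make the idea of input-sensitive filters concrete, here is a minimal pure-Python sketch in which a filter is computed from the sentence itself rather than stored as a fixed parameter. The abstract does not specify the meta network's architecture, so everything here is an assumption: the meta network is reduced to mean pooling followed by one linear map per filter entry, and `generate_filter`, `convolve`, and `meta_weights` are hypothetical names.

```python
def generate_filter(sentence, meta_weights, window):
    """Hypothetical meta network: mean-pool the word vectors into a context
    vector, then map it linearly to each entry of a (window x dim) filter.

    sentence:     list of word vectors, each a list of `dim` floats
    meta_weights: meta_weights[i][j] is a length-`dim` weight vector that
                  produces filter entry (i, j) from the context vector
    """
    dim = len(sentence[0])
    context = [sum(word[k] for word in sentence) / len(sentence) for k in range(dim)]
    return [
        [sum(m * c for m, c in zip(meta_weights[i][j], context)) for j in range(dim)]
        for i in range(window)
    ]


def convolve(sentence, filt):
    """Slide the filter over the sentence (valid positions only) and return
    one activation per window position."""
    window = len(filt)
    dim = len(sentence[0])
    return [
        sum(filt[i][j] * sentence[start + i][j]
            for i in range(window) for j in range(dim))
        for start in range(len(sentence) - window + 1)
    ]
```

In a standard CNN the filter passed to `convolve` would be the same for every input; here two different sentences yield two different filters from the same `meta_weights`, which are the only learned parameters in this sketch.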

Demand Charge and Response with Energy Storage

Commercial and industrial (C&I) customers incur two types of electricity charges on their bills: one for the amount of energy used and another for the maximum demand during certain billing periods. The second charge type is known as Demand Charge (DC), which could account for over half of a customer’s electricity bill. C&I customers often sign up for Demand Response (DR) programs to contribute to peak demand reduction and to receive incentives and rewards for participating. The critical factor in achieving both DR and DC reduction is to recognize the nature of these two problems and create an effective strategy that handles them at the same time, so that the benefits from DR incentives and DC reduction are maximized. This paper discusses the possible DR scenarios within a DC reduction framework for C&I customers who use Behind-the-Meter (BTM) energy storage, and proposes a consistent real-time procedure for deciding the battery’s charging and discharging set points to maximize both the rewards from conducting DR and the savings from reducing DC costs.
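The two charge types can be illustrated with a small Python sketch. It computes an energy charge from total consumption and a demand charge from the peak interval, then applies a simple greedy threshold rule for battery dispatch. The greedy rule is only a stand-in to show how storage lowers the billed peak; it is not the real-time procedure the paper proposes and it ignores DR events entirely. The function names, rates, and battery parameters are hypothetical.

```python
def bill_components(load_kw, interval_hours, energy_rate, demand_rate):
    """Return (energy_charge, demand_charge) for one billing period.

    load_kw:        average demand in kW for each metering interval
    interval_hours: length of one interval in hours
    energy_rate:    price per kWh consumed (hypothetical value)
    demand_rate:    price per kW of peak demand (hypothetical value)
    """
    energy_kwh = sum(p * interval_hours for p in load_kw)
    peak_kw = max(load_kw)
    return energy_kwh * energy_rate, peak_kw * demand_rate


def peak_shave(load_kw, threshold_kw, interval_hours, capacity_kwh, max_power_kw):
    """Greedy dispatch: discharge when load exceeds the threshold, recharge
    when it is below, never pushing net load above the threshold by charging.
    Assumes a lossless battery that starts fully charged."""
    soc = capacity_kwh  # state of charge in kWh
    net = []
    for p in load_kw:
        if p > threshold_kw:
            discharge = min(p - threshold_kw, max_power_kw, soc / interval_hours)
            soc -= discharge * interval_hours
            net.append(p - discharge)
        else:
            headroom = (capacity_kwh - soc) / interval_hours
            charge = min(threshold_kw - p, max_power_kw, headroom)
            soc += charge * interval_hours
            net.append(p + charge)
    return net
```

With a toy load of `[100, 120, 300, 280, 110, 90]` kW over one-hour intervals and a 200 kW threshold, a 250 kWh battery limited to 150 kW holds the net peak at 200 kW, so the demand charge is computed on 200 kW instead of 300 kW while total energy consumed is unchanged.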

SkyCore: Moving Core to the Edge for Untethered and Reliable UAV-based LTE Networks

The advances in unmanned aerial vehicle (UAV) technology have empowered mobile operators to deploy LTE base stations (BSs) on UAVs, and provide on-demand, adaptive connectivity to hotspot venues as well as emergency scenarios. However, today’s evolved packet core (EPC) that orchestrates the LTE RAN faces fundamental limitations in catering to such a challenging, wireless and mobile UAV environment, particularly in the presence of multiple BSs (UAVs). In this work, we argue for and propose an alternate, radical edge EPC design, called SkyCore, that pushes the EPC functionality to the extreme edge of the core network: it collapses the EPC into a single, light-weight, self-contained entity that is co-located with each UAV BS. SkyCore incorporates elements that are designed to address the unique challenges facing such a distributed design in the UAV environment, namely the resource constraints of UAV platforms and the distributed management of pronounced UAV and UE mobility. We build and deploy a fully functional version of SkyCore on a two-UAV LTE network and showcase its (i) ability to interoperate with commercial LTE BSs as well as smartphones, (ii) support for both hotspot and standalone multi-UAV deployments, and (iii) superior control and data plane performance compared to other EPC variants in this environment.

TGNet: Learning to Rank Nodes in Temporal Graphs

Node ranking in temporal networks is often impacted by heterogeneous context from node content, temporal, and structural dimensions. This paper introduces TGNet, a deep-learning framework for node ranking in heterogeneous temporal graphs. TGNet utilizes a variant of Recurrent Neural Network to adapt to context evolution and extract context features for nodes. It incorporates a novel influence network to dynamically estimate temporal and structural influence among nodes over time. To cope with label sparsity, it integrates graph smoothness constraints as a weak form of supervision. We show that the application of TGNet is feasible for large-scale networks by developing efficient learning and inference algorithms with optimization techniques. Using real-life data, we experimentally verify the effectiveness and efficiency of TGNet techniques. We also show that TGNet yields intuitive explanations for applications such as alert detection and academic impact ranking, as verified by our case study.
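The role of a graph smoothness constraint as weak supervision can be sketched in a few lines of Python. The abstract does not give the exact form TGNet uses, so this example assumes the common Laplacian-style regularizer, which penalizes score differences between connected nodes in proportion to edge weight. The function names, the squared-error supervised term, and the weight `lam` are hypothetical.

```python
def smoothness_penalty(scores, edges):
    """Sum of w * (s_u - s_v)^2 over weighted edges (u, v, w).

    Small when strongly connected nodes receive similar ranking scores."""
    return sum(w * (scores[u] - scores[v]) ** 2 for u, v, w in edges)


def ranking_loss(scores, labels, edges, lam=0.1):
    """Supervised squared error on the few labeled nodes plus the smoothness
    term, which also constrains the scores of unlabeled nodes."""
    supervised = sum((scores[n] - y) ** 2 for n, y in labels.items())
    return supervised + lam * smoothness_penalty(scores, edges)
```

For scores `{"a": 1.0, "b": 0.5, "c": 0.0}` on edges `a-b` (weight 1) and `b-c` (weight 2), the penalty is 0.25 + 2 × 0.25 = 0.75. Only node `a` needs a label for the loss to be defined, yet nodes `b` and `c` still receive a training signal through their edges.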

Collaborative Alert Ranking for Anomaly Detection

Given a large number of low-quality heterogeneous categorical alerts collected from an anomaly detection system, how can we characterize the complex relationships between different alerts and deliver trustworthy rankings to end users? While existing techniques focus on either mining alert patterns or filtering out false positive alerts, it can be more advantageous to consider the two perspectives simultaneously in order to improve detection accuracy and better understand abnormal system behaviors. In this paper, we propose CAR, a collaborative alert ranking framework that exploits both temporal and content correlations from heterogeneous categorical alerts. CAR first builds a hierarchical Bayesian model to capture both short-term and long-term dependencies in each alert sequence. Then, an entity embedding-based model is proposed to learn the content correlations between alerts via their heterogeneous categorical attributes. Finally, by incorporating both temporal and content dependencies into a unified optimization framework, CAR ranks both alerts and their corresponding alert patterns. Our experiments, using both synthetic and real-world enterprise security alert data, show that CAR can accurately identify true positive alerts and successfully reconstruct the attack scenarios at the same time.

Behavior-based Community Detection: Application to Host Assessment in Enterprise Information Networks

Community detection in complex networks is a fundamental problem that attracts much attention across various disciplines. Previous studies have mostly focused on external connections between nodes (i.e., topology structure) in the network while largely ignoring the internal intricacies (i.e., local behavior) of each node. A pair of nodes without any interaction can still share similar internal behaviors. For example, in an enterprise information network, compromised computers controlled by the same intruder often demonstrate similar abnormal behaviors even if they do not connect with each other. In this paper, we study the problem of community detection in enterprise information networks, where large-scale internal events and external events coexist on each host. The discovered host communities, capturing behavioral affinity, can benefit many comparative analysis tasks such as host anomaly assessment. In particular, we propose a novel community detection framework to identify behavior-based host communities in enterprise information networks, purely based on large-scale heterogeneous event data. We then propose an efficient method for assessing a host’s anomaly level by leveraging the detected host communities. Experimental results on enterprise networks demonstrate the effectiveness of our model.

NodeMerge: Template Based Efficient Data Reduction For Big-Data Causality Analysis

Today’s enterprises are exposed to sophisticated attacks, such as Advanced Persistent Threat (APT) attacks, which usually consist of multiple stealthy steps. To counter these attacks, enterprises often rely on causality analysis of the system activity data collected from ubiquitous system monitoring to discover the initial penetration point, and from there identify previously unknown attack steps. However, one major challenge for causality analysis is that ubiquitous system monitoring generates a colossal amount of data, and hosting such a huge amount of data is prohibitively expensive. Thus, there is a strong demand for techniques that reduce the storage of data for causality analysis and yet preserve the quality of the causality analysis. To address this problem, in this paper, we propose NodeMerge, a template based data reduction system for online system event storage. Specifically, our approach can directly work on the stream of system dependency data and achieve data reduction on the read-only file events based on their access patterns. It can either reduce the storage cost or improve the performance of causality analysis under the same budget. With only a reasonable amount of resources for online data reduction, it nearly completely preserves the accuracy of causality analysis. The reduced form of data can be used directly with little overhead. To evaluate our approach, we conducted a set of comprehensive evaluations, which show that for different categories of workloads, our system can reduce the storage capacity of raw system dependency data by as much as 75.7 times, and the storage capacity of the state-of-the-art approach by as much as 32.6 times. Furthermore, the results also demonstrate that our approach keeps all the causality analysis information and has a reasonably small overhead in memory and hard disk.