Neural Networks are a class of machine learning models inspired by the structure and function of the human brain. These models consist of interconnected nodes, called neurons or artificial neurons, organized in layers. Neural networks are used for various tasks, including pattern recognition, classification, regression, and decision-making.

Posts

Weight Pruning Techniques for Nonlinear Impairment Compensation using Neural Networks

Neural networks (NNs) are attractive for nonlinear impairment compensation applications in communication systems, such as optical fiber nonlinearity, nonlinearity of driving amplifiers, and nonlinearity of semiconductor optical amplifiers. Once the NN structure is given, optimal parameters are constructed entirely by a data-driven approach from training datasets, without prior knowledge of the transmission link or the hardware characteristics. On the other hand, because of limits on computational power and energy consumption, especially in high-speed communication systems, the computational complexity of the optimized NN must fit within the target hardware, such as an FPGA or ASIC, without sacrificing its performance improvement. In this paper, two approaches are presented to accommodate NN-based algorithms for high-speed communication systems. The first approach is to reduce the computational complexity of the NN-based nonlinearity compensation algorithms on the basis of weight pruning (WP). WP can significantly reduce the computational complexity, especially because the nonlinear compensation task studied here results in a sparse NN. The authors have studied an enhanced approach to WP that imposes an additional restriction on the selection of non-zero weights in each hidden layer. The second approach is to implement NNs on a silicon-photonic integrated platform, enabling power efficiency to be further improved without sacrificing high-speed operation.
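To make the idea of weight pruning concrete, the following is a minimal sketch of plain magnitude-based pruning on one layer's weight matrix. It is not the enhanced method described in the abstract (the additional per-layer restriction is not specified here); the function names `prune_by_magnitude` and `count_multiplications`, the example weights, and the 50% sparsity target are all assumptions made for illustration.

```python
def prune_by_magnitude(weights, sparsity):
    """Return a copy of a 2-D weight list with the smallest-magnitude
    fraction `sparsity` of its entries set to zero.

    Generic magnitude pruning, used only as an illustration; the
    abstract's enhanced variant adds a restriction not modeled here.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")

    # Collect (magnitude, row, column) for every weight and sort ascending.
    entries = [
        (abs(w), i, j)
        for i, row in enumerate(weights)
        for j, w in enumerate(row)
    ]
    entries.sort()

    # Zero out the n_prune entries with the smallest magnitude.
    n_prune = int(len(entries) * sparsity)
    pruned = [list(row) for row in weights]
    for _, i, j in entries[:n_prune]:
        pruned[i][j] = 0.0
    return pruned


def count_multiplications(weights):
    """Count non-zero weights, i.e. multiplications a sparse
    matrix-vector product would actually need."""
    return sum(1 for row in weights for w in row if w != 0.0)


# Hypothetical 2x3 layer, pruned to 50% sparsity.
layer = [[0.9, -0.05, 0.3],
         [0.01, -0.7, 0.2]]
sparse_layer = prune_by_magnitude(layer, sparsity=0.5)
```

In this sketch the saving comes from skipping multiplications by zero, so the benefit depends on the hardware or library being able to exploit the sparsity pattern.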

You Are What and Where You Are: Graph Enhanced Attention Network for Explainable POI Recommendation

Point-of-interest (POI) recommendation is an emerging area of research on location-based social networks to analyze user behaviors and contextual check-in information. For this problem, existing approaches, with shallow or deep architectures, have two major drawbacks. First, for these approaches, the attributes of individuals have been largely ignored. Therefore, it would be hard, if not impossible, to gather sufficient user attribute features to have complete coverage of possible motivation factors. Second, most existing models preserve the information of users or POIs by latent representations without explicitly highlighting salient factors or signals. Consequently, the trained models with unjustifiable parameters provide few persuasive rationales to explain why users favor or dislike certain POIs and what really causes a visit. To overcome these drawbacks, we propose GEAPR, a POI recommender that is able to interpret the POI prediction in an end-to-end fashion. Specifically, GEAPR learns user representations by aggregating different factors, such as structural context, neighbor impact, user attributes, and geolocation influence. GEAPR takes advantage of a triple attention mechanism to quantify the influences of different factors for each resulting recommendation and performs a thorough analysis of the model interpretability. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed model. GEAPR is deployed and under test on an internal web server. An example interface is presented to showcase its application on explainable POI recommendation.
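As a rough illustration of how attention weights can quantify the influence of different factors, the sketch below combines several factor representations with softmax weights. It is a generic attention aggregation, not GEAPR's triple attention mechanism, whose details are not given in the abstract; the function names, the factor vectors, and the relevance scores are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_factors(factor_vectors, scores):
    """Combine per-factor vectors into one user representation.

    factor_vectors: dict mapping factor name -> list of floats (same length)
    scores: dict mapping factor name -> unnormalized relevance score
    Returns (combined_vector, attention_weights_by_factor).
    """
    names = list(factor_vectors)
    weights = softmax([scores[name] for name in names])

    dim = len(factor_vectors[names[0]])
    combined = [0.0] * dim
    for name, weight in zip(names, weights):
        for k, value in enumerate(factor_vectors[name]):
            combined[k] += weight * value
    return combined, dict(zip(names, weights))


# Hypothetical factor representations and scores for one user.
vectors = {
    "structural_context": [1.0, 0.0],
    "neighbor_impact": [0.0, 1.0],
    "user_attributes": [1.0, 1.0],
}
raw_scores = {
    "structural_context": 2.0,
    "neighbor_impact": 0.5,
    "user_attributes": 1.0,
}
user_vector, influence = aggregate_factors(vectors, raw_scores)
```

Because the weights sum to one, `influence` can be read as each factor's relative contribution to the combined representation, which is the kind of signal an explanation interface could surface.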

Interpreting Convolutional Sequence Model by Learning Local Prototypes with Adaptation Regularization

In many high-stakes applications of machine learning models, outputting only predictions or providing statistical confidence is usually insufficient to gain trust from end users, who often prefer a transparent reasoning paradigm. Despite the recent encouraging developments on deep networks for sequential data modeling, due to the highly recursive functions, the underlying rationales of their predictions are difficult to explain. Thus, in this paper, we aim to develop a sequence modeling approach that explains its own predictions by breaking input sequences down into evidencing segments (i.e., sub-sequences) in its reasoning. To this end, we build our model upon convolutional neural networks, which, in their vanilla forms, associate local receptive fields with outputs in an obscure manner. To unveil this association, we resort to case-based reasoning, and design prototype modules whose units (i.e., prototypes) resemble exemplar segments in the problem domain. Each prediction is obtained by combining the comparisons between the prototypes and the segments of an input. To enhance interpretability, we propose a training objective that delicately adapts the distribution of prototypes to the data distribution in latent spaces, and design an algorithm to map prototypes to human-understandable segments. Through extensive experiments in a variety of domains, we demonstrate that our model can generally achieve high interpretability, together with accuracy competitive with state-of-the-art approaches.
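The comparison between prototypes and input segments can be sketched as a sliding-window similarity search. This is a simplified illustration only: it compares raw values rather than latent representations, uses an assumed similarity of exp(-squared distance), and does not reproduce the paper's training objective. The function names and the example sequence are hypothetical.

```python
import math


def segment_similarities(sequence, prototype):
    """Similarity between a prototype and every same-length segment
    of the sequence, using exp(-squared Euclidean distance)."""
    width = len(prototype)
    sims = []
    for start in range(len(sequence) - width + 1):
        segment = sequence[start:start + width]
        dist_sq = sum((a - b) ** 2 for a, b in zip(segment, prototype))
        sims.append(math.exp(-dist_sq))
    return sims


def best_match(sequence, prototype):
    """Return (start_index, similarity) of the segment closest to the
    prototype, i.e. the evidence segment for that prototype."""
    sims = segment_similarities(sequence, prototype)
    best_start = max(range(len(sims)), key=sims.__getitem__)
    return best_start, sims[best_start]


# Hypothetical input sequence and a prototype shaped like a small peak.
signal = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
peak_prototype = [1.0, 2.0, 1.0]
start, similarity = best_match(signal, peak_prototype)
```

In a prototype-based classifier, the best similarity per prototype would typically feed a final linear layer, and the matched segment (here the one starting at `start`) is what gets shown to the user as evidence.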

Leveraging Knowledge Bases for Future Prediction with Memory Comparison Networks

Making predictions about what might happen in the future is important for reacting adequately in many situations. For example, observing that “Man kidnaps girl” may have the consequence that “Man kills girl”. While this is part of common sense reasoning for humans, it is not obvious how machines can acquire and generalize over such knowledge. In this article, we propose a new type of memory network that can predict the next event even for observations that are not in the knowledge base. We evaluate our proposed method on two knowledge bases: Reuters KB (events from news articles) and Regneri KB (events from scripts). For both knowledge bases, our proposed method shows similar or better prediction accuracy on unseen events (or scripts) than recently proposed deep neural networks and rankSVM. We also demonstrate that the attention mechanism of our proposed method can be helpful for error analysis and manual expansion of the knowledge base.

Deep Learning IP Network Representations

We present DIP, a deep learning-based framework to learn structural properties of the Internet, such as node clustering or distance between nodes. Existing embedding-based approaches use linear algorithms on a single source of data, such as latency or hop count information, to approximate the position of a node in the Internet. In contrast, DIP computes low-dimensional representations of nodes that preserve structural properties and non-linear relationships across multiple, heterogeneous sources of structural information, such as IP, routing, and distance information. Using a large real-world data set, we show that DIP learns representations that preserve the real-world clustering of the associated nodes and predicts the distance between them more than 30% better than a mean-based approach. Furthermore, DIP accurately imputes hop count distance to unknown hosts (i.e., not used in training) given only their IP addresses and routable prefixes. Our framework is extensible to new data sources and applicable to a wide range of problems in network monitoring and security.