Deep Learning IP Network Representations

Publication Date: 8/24/2018

Event: Big-DAMA 2018 – ACM SIGCOMM 2018 Workshop on Big Data Analytics and Machine Learning for Data Communication Networks

Reference: pp. 33-39, 2018

Authors: Mingda Li, University of California, Los Angeles; Cristian Lumezanu, NEC Laboratories America, Inc.; Bo Zong, NEC Laboratories America, Inc.; Haifeng Chen, NEC Laboratories America, Inc.

Abstract: We present DIP, a deep-learning-based framework to learn structural properties of the Internet, such as node clustering or distance between nodes. Existing embedding-based approaches use linear algorithms on a single source of data, such as latency or hop count information, to approximate the position of a node in the Internet. In contrast, DIP computes low-dimensional representations of nodes that preserve structural properties and non-linear relationships across multiple, heterogeneous sources of structural information, such as IP, routing, and distance information. Using a large real-world data set, we show that DIP learns representations that preserve the real-world clustering of the associated nodes and predicts the distance between them more than 30% better than a mean-based approach. Furthermore, DIP accurately imputes hop count distance to unknown hosts (i.e., hosts not used in training) given only their IP addresses and routable prefixes. Our framework is extensible to new data sources and applicable to a wide range of problems in network monitoring and security.
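
Illustrative sketch: the abstract states that DIP imputes hop count distance for unseen hosts from their IP addresses and routable prefixes alone. The abstract does not say how these inputs are encoded, so the Python sketch below only shows one plausible way to turn an IPv4 address and its prefix into a fixed-length numeric vector that a learned model could consume. The bit-level encoding and the function names (ip_to_bits, prefix_features, host_features) are assumptions made for illustration, not the authors' method. The addresses come from the reserved documentation range 192.0.2.0/24.

```python
import ipaddress
from typing import List


def ip_to_bits(ip: str) -> List[int]:
    """Return the 32 bits of an IPv4 address, most significant bit first."""
    value = int(ipaddress.IPv4Address(ip))
    return [(value >> shift) & 1 for shift in range(31, -1, -1)]


def prefix_features(prefix: str) -> List[int]:
    """Return the 32 network-address bits followed by the prefix length."""
    # strict=True rejects prefixes with host bits set, e.g. "192.0.2.1/24".
    net = ipaddress.IPv4Network(prefix, strict=True)
    return ip_to_bits(str(net.network_address)) + [net.prefixlen]


def host_features(ip: str, prefix: str) -> List[int]:
    """Concatenate address bits and prefix features (hypothetical encoding)."""
    if ipaddress.IPv4Address(ip) not in ipaddress.IPv4Network(prefix):
        raise ValueError(f"{ip} is not inside {prefix}")
    return ip_to_bits(ip) + prefix_features(prefix)


if __name__ == "__main__":
    vector = host_features("192.0.2.1", "192.0.2.0/24")
    print(len(vector), vector[-1])
```

A vector like this could serve as one input source alongside routing or distance data. Any real system would need to choose its own encoding and handle IPv6 separately.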

Publication Link: https://dl.acm.org/doi/10.1145/3229607.3229609