Multi-source Inductive Knowledge Graph Transfer

Multi-source Inductive Knowledge Graph Transfer Large-scale information systems, such as knowledge graphs (KGs), enterprise system networks, often exhibit dynamic and complex activities. Recent research has shown that formalizing these information systems as graphs can effectively characterize the entities (nodes) and their relationships (edges). Transferring knowledge from existing well-curated source graphs can help construct the target graph of newly-deployed systems faster and better which no doubt will benefit downstream tasks such as link prediction and anomaly detection for new systems. However, current graph transferring methods are either based on a single source, which does not sufficiently consider multiple available sources, or not selectively learns from these sources. In this paper, we propose MSGT-GNN, a graph knowledge transfer model for efficient graph link prediction from multiple source graphs. MSGT-GNN consists of two components: the Intra-Graph Encoder, which embeds latent graph features of system entities into vectors, and the graph transferor, which utilizes graph attention mechanism to learn and optimize the embeddings of corresponding entities from multiple source graphs, in both node level and graph level. Experimental results on multiple real-world datasets from various domains show that MSGT-GNN outperforms other baseline approaches in the link prediction and demonstrate the merit of attentive graph knowledge transfer and the effectiveness of MSGT-GNN.

Learning K-way D-dimensional Discrete Code For Compact Embedding Representations

Conventional embedding methods directly associate each symbol with a continuous embedding vector, which is equivalent to applying a linear transformation based on a “one-hot” encoding of the discrete symbols. Despite its simplicity, such approach yields the number of parameters that grows linearly with the vocabulary size and can lead to overfitting. In this work, we propose a much more compact K-way D-dimensional discrete encoding scheme to replace the “one-hot” encoding. In the proposed “KD encoding”, each symbol is represented by a D-dimensional code with a cardinality of K, and the final symbol embedding vector is generated by composing the code embedding vectors. To end-to-end learn semantically meaningful codes, we derive a relaxed discrete optimization approach based on stochastic gradient descent, which can be generally applied to any differentiable computational graph with an embedding layer. In our experiments with various applications from natural language processing to graph convolutional networks, the total size of the embedding layer can be reduced up to 98% while achieving similar or better performance.