Learning K-way D-dimensional Discrete Embedding for Hierarchical Data Visualization and Retrieval

Publication Date: 8/10/2019

Event: IJCAI 2019

Reference: pp. 2966-2972, 2019

Authors: Xiaoyuan Liang, NEC Laboratories America, Inc., New Jersey Institute of Technology; Martin Renqiang Min, NEC Laboratories America, Inc.; Hongyu Guo, National Research Council Canada; Guiling Wang, New Jersey Institute of Technology

Abstract: Traditional embedding approaches associate a real-valued embedding vector with each symbol or data point, which is equivalent to applying a linear transformation to the “one-hot” encoding of discrete symbols or data objects. Despite their simplicity, these methods generate storage-inefficient representations and fail to effectively encode the internal semantic structure of data, especially when the number of symbols or data points and the dimensionality of the real-valued embedding vectors are large. In this paper, we propose a regularized autoencoder framework to learn compact Hierarchical K-way D-dimensional (HKD) discrete embeddings of symbols or data points, aiming to capture the essential semantic structure of data. Experimental results on synthetic and real-world datasets show that our proposed HKD embedding can effectively reveal the semantic structure of data via hierarchical data visualization and greatly reduce the search space of nearest neighbor retrieval while preserving high accuracy.
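
Illustration: The abstract's two starting points can be checked with a small sketch. It is not the authors' HKD method or their autoencoder. It only shows (a) that looking up a row of an embedding table gives the same vector as multiplying a one-hot vector by that table, and (b) a rough storage comparison between a real-valued vector and a K-way D-dimensional discrete code. The table values, the helper names (one_hot, vec_times_matrix, kd_code_bits, real_embedding_bits), the 32-bit float size, and the choices of K and D are all assumptions made for illustration and do not come from the paper.

```python
import math


def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def vec_times_matrix(vec, matrix):
    """Multiply a length-N row vector by an N x d matrix (lists of lists)."""
    cols = len(matrix[0])
    return [sum(v * row[j] for v, row in zip(vec, matrix)) for j in range(cols)]


# Hypothetical toy embedding table: 4 symbols, 3 real-valued dimensions.
embedding_table = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, 1.1, 1.2],
]

symbol = 2
via_lookup = embedding_table[symbol]
via_one_hot = vec_times_matrix(one_hot(symbol, 4), embedding_table)


def kd_code_bits(K, D):
    """Bits needed to store D code positions, each taking one of K values."""
    return D * math.ceil(math.log2(K))


def real_embedding_bits(dim, bits_per_float=32):
    """Bits needed for a real-valued vector (32-bit floats assumed)."""
    return dim * bits_per_float


if __name__ == "__main__":
    print(via_lookup == via_one_hot)
    print(kd_code_bits(16, 32), real_embedding_bits(128))
```

Under these assumed sizes, the discrete code needs D * ceil(log2 K) bits per symbol, while the real-valued vector needs one float per dimension. The hierarchical structure and the training procedure described in the paper are outside the scope of this sketch.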

Publication Link: https://www.ijcai.org/proceedings/2019/411