Hui Zhang works at Ant Financial Group.


TGNet: Learning to Rank Nodes in Temporal Graphs

Node ranking in temporal networks is often impacted by heterogeneous context from the node content, temporal, and structural dimensions. This paper introduces TGNet, a deep-learning framework for node ranking in heterogeneous temporal graphs. TGNet uses a variant of a recurrent neural network to adapt to context evolution and extract context features for nodes. It incorporates a novel influence network to dynamically estimate temporal and structural influence among nodes over time. To cope with label sparsity, it integrates graph smoothness constraints as a weak form of supervision. We show that applying TGNet to large-scale networks is feasible by developing efficient learning and inference algorithms with optimization techniques. Using real-life data, we experimentally verify the effectiveness and efficiency of the TGNet techniques. We also show that TGNet yields intuitive explanations for applications such as alert detection and academic impact ranking, as verified by our case study.
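
The abstract does not give the form of the graph smoothness constraint. As a rough illustration only, the sketch below assumes the common Laplacian-style penalty, which sums w * (s_u - s_v)^2 over weighted edges so that strongly connected nodes are pushed toward similar ranking scores. The function name, the edge format, and the toy data are hypothetical and are not taken from TGNet.

```python
def smoothness_penalty(scores, edges):
    """Sum w * (s_u - s_v)^2 over weighted edges (u, v, w).

    Assumed Laplacian-style regularizer; not the TGNet formulation.
    """
    return sum(w * (scores[u] - scores[v]) ** 2 for u, v, w in edges)


# Toy graph: ranking scores per node and weighted edges (hypothetical data).
scores = {"a": 1.0, "b": 0.5, "c": 0.0}
edges = [("a", "b", 1.0), ("b", "c", 2.0)]

penalty = smoothness_penalty(scores, edges)
print(penalty)
```

In a setting with sparse labels, a term like this would typically be added to the supervised loss with a tunable weight, so that unlabeled nodes still receive a training signal from their neighbors.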

LogLens: A Real-time Log Analysis System

Administrators of most user-facing systems depend on periodic log data to get an idea of the health and status of production applications. Logs report information that is crucial for diagnosing the root cause of complex problems. In this paper, we present a real-time log analysis system called LogLens that automates anomaly detection from logs with no (or minimal) target-system knowledge or user specification. In LogLens, we employ unsupervised machine-learning techniques to discover patterns in application logs, and then leverage these patterns along with real-time log parsing to design advanced log analytics applications. Compared to existing systems, which are primarily limited to log indexing and search capabilities, LogLens is an extensible system that supports both stateless and stateful log analysis applications. Currently, LogLens runs at the core of a commercial log analysis solution that handles millions of logs generated in large-scale industrial environments, and it has reported up to a 12096x reduction in the man-hours spent troubleshooting operational problems compared to the manual approach.
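
The abstract does not describe how LogLens discovers patterns. To make the general idea concrete, the sketch below uses a deliberately simple stand-in technique: any token containing a digit is masked with a wildcard, the resulting templates are counted over historical logs, and a new line counts as anomalous when its template was never seen. The function names, the `<*>` wildcard, and the sample log lines are all hypothetical.

```python
import re
from collections import Counter


def to_template(line):
    """Mask every whitespace-separated token that contains a digit."""
    return " ".join("<*>" if re.search(r"\d", tok) else tok
                    for tok in line.split())


def discover_patterns(lines):
    """Count how often each template occurs in the training logs."""
    return Counter(to_template(line) for line in lines)


def is_anomalous(line, patterns):
    """Stateless check: flag a line whose template was never observed."""
    return to_template(line) not in patterns


# Hypothetical training logs.
training = [
    "Connected to 10.0.0.1 in 35 ms",
    "Connected to 10.0.0.2 in 41 ms",
    "Job 17 finished",
]
patterns = discover_patterns(training)

print(is_anomalous("Connected to 10.0.0.9 in 12 ms", patterns))
print(is_anomalous("Disk failure on node 4", patterns))
```

This corresponds only to the stateless case mentioned in the abstract. A stateful analysis would additionally track relationships across lines, for example whether every start event is eventually followed by a matching end event.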