Publication Date: 10/26/2018
Event: Proceedings of the 27th ACM International Conference on Information and Knowledge Management (CIKM 2018)
Reference: pp. 1987-1995, 2018
Authors: Ying Lin, University of Houston; Zhengzhang Chen, NEC Laboratories America, Inc.; Kai Zhang, Temple University; Cheng Cao, Texas A&M University; Lu-An Tang, NEC Laboratories America, Inc.; Wei Cheng, University of Washington; Zhichun Li, NEC Laboratories America, Inc.
Abstract: Given a large number of low-quality heterogeneous categorical alerts collected from an anomaly detection system, how to characterize the complex relationships between different alerts and deliver trustworthy rankings to end users? While existing techniques focus on either mining alert patterns or filtering out false positive alerts, it can be more advantageous to consider the two perspectives simultaneously in order to improve detection accuracy and better understand abnormal system behaviors. In this paper, we propose CAR, a collaborative alert ranking framework that exploits both temporal and content correlations from heterogeneous categorical alerts. CAR first builds a hierarchical Bayesian model to capture both short-term and long-term dependencies in each alert sequence. Then, an entity embedding-based model is proposed to learn the content correlations between alerts via their heterogeneous categorical attributes. Finally, by incorporating both temporal and content dependencies into a unified optimization framework, CAR ranks both alerts and their corresponding alert patterns. Our experiments, using both synthetic and real-world enterprise security alert data, show that CAR can accurately identify true positive alerts and successfully reconstruct the attack scenarios at the same time.
Publication Link: https://dl.acm.org/doi/10.1145/3269206.3272013