This is Why We Can’t Cache Nice Things: Lightning-Fast Threat Hunting using Suspicion-Based Hierarchical Storage

Publication Date: 12/11/2020

Event: 2020 Annual Computer Security Applications Conference

Reference: pp. 165-178, 2020

Authors: Wajih Ul Hassan, University of Illinois Urbana-Champaign; Ding Li, NEC Laboratories America, Inc.; Kangkook Jee, University of Texas at Dallas; Xiao Yu, NEC Laboratories America, Inc.; Kexuan Zou, University of Illinois Urbana-Champaign; Dawei Wang, University of Illinois Urbana-Champaign; Zhengzhang Chen, NEC Laboratories America, Inc.; Zhichun Li, Stellar Cyber; Junghwan Rhee, NEC Laboratories America, Inc.; Jiaping Gui, NEC Laboratories America, Inc.; Adam Bates, University of Illinois Urbana-Champaign

Abstract: Recent advances in causal analysis can accelerate incident response time, but only after a causal graph of the attack has been constructed. Unfortunately, existing causal graph generation techniques are mainly offline and may take hours or days to respond to investigator queries, creating greater opportunity for attackers to hide their attack footprint, gain persistence, and propagate to other machines. To address that limitation, we present Swift, a threat investigation system that provides high-throughput causality tracking and real-time causal graph generation capabilities. We design an in-memory graph database that enables space-efficient graph storage and online causality tracking with minimal disk operations. We propose a hierarchical storage system that keeps the forensically relevant part of the causal graph in main memory while evicting the rest to disk. To identify the part of the causal graph that is likely to be relevant during the investigation, we design an asynchronous cache eviction policy that calculates the most suspicious part of the causal graph and caches only that part in main memory. We evaluated Swift on a real-world enterprise to demonstrate how our system scales to process typical event loads and how it responds to forensic queries when security alerts occur. Results show that Swift is scalable, modular, and answers forensic queries in real time even when analyzing audit logs containing tens of millions of events.
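
Illustrative sketch (not from the paper): the abstract's idea of keeping the most suspicious part of a causal graph in memory and evicting the rest to disk can be pictured with a small two-tier store. Everything below is an assumption made for illustration: the class name TieredGraphStore, the numeric suspicion score, the fixed node capacity, and the plain dictionary standing in for disk are all hypothetical. Eviction here runs synchronously on every insert, whereas the abstract describes an asynchronous policy, so this is a simplified picture and not Swift's actual design.

```python
from dataclasses import dataclass, field


@dataclass
class TieredGraphStore:
    """Hypothetical two-tier store: hot nodes in memory, cold nodes on 'disk'."""

    capacity: int                                # max nodes kept in memory (assumed knob)
    memory: dict = field(default_factory=dict)   # node_id -> (suspicion, out_edges)
    disk: dict = field(default_factory=dict)     # stand-in for on-disk storage

    def put(self, node_id, suspicion, out_edges):
        """Insert or update a node, then evict if memory is over capacity."""
        self.disk.pop(node_id, None)             # a node lives in exactly one tier
        self.memory[node_id] = (suspicion, list(out_edges))
        self._evict()

    def _evict(self):
        """Move the least suspicious nodes to the disk tier until we fit."""
        while len(self.memory) > self.capacity:
            victim = min(self.memory, key=lambda n: self.memory[n][0])
            self.disk[victim] = self.memory.pop(victim)

    def get(self, node_id):
        """Return (entry, tier); raises KeyError if the node is unknown."""
        if node_id in self.memory:
            return self.memory[node_id], "memory"
        return self.disk[node_id], "disk"


store = TieredGraphStore(capacity=2)
store.put("bash", 0.9, ["curl"])      # process names and scores are made up
store.put("cron", 0.1, [])
store.put("curl", 0.7, ["/tmp/x"])    # third insert forces one eviction
```

In this toy run the low-suspicion "cron" node is the one pushed to the disk tier, so a query that follows the suspicious "bash" to "curl" chain is served entirely from memory.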

Publication Link: https://dl.acm.org/doi/10.1145/3427228.3427255