This is Why We Can’t Cache Nice Things: Lightning-Fast Threat Hunting using Suspicion-Based Hierarchical Storage

Authors:

Wajih Ul Hassan,

Zhengzhang Chen,

Adam Bates

ACSAC '20: Proceedings of the 36th Annual Computer Security Applications Conference

Pages 165 - 178

https://doi.org/10.1145/3427228.3427255

Published: 08 December 2020

Abstract

Recent advances in causal analysis can accelerate incident response, but only after a causal graph of the attack has been constructed. Unfortunately, existing causal graph generation techniques are mainly offline and may take hours or days to respond to investigator queries, giving attackers more opportunity to hide their attack footprint, gain persistence, and propagate to other machines. To address that limitation, we present Swift, a threat investigation system that provides high-throughput causality tracking and real-time causal graph generation. We design an in-memory graph database that enables space-efficient graph storage and online causality tracking with minimal disk operations. We propose a hierarchical storage system that keeps the forensically relevant part of the causal graph in main memory while evicting the rest to disk. To identify the part of the causal graph that is likely to be relevant during an investigation, we design an asynchronous cache eviction policy that calculates the most suspicious part of the causal graph and caches only that part in main memory. We evaluated Swift on a real-world enterprise to demonstrate how our system scales to process typical event loads and how it responds to forensic queries when security alerts occur. Results show that Swift is scalable, modular, and answers forensic queries in real time even when analyzing audit logs containing tens of millions of events.
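The idea of keeping only the most suspicious part of a graph in main memory can be illustrated with a minimal sketch. It is not Swift's implementation: the class name `SuspicionCache`, the numeric suspicion scores, and the plain dictionary standing in for disk are all assumptions made for illustration, and eviction here runs synchronously on each insert rather than asynchronously as the abstract describes.

```python
class SuspicionCache:
    """Toy two-tier store: the most suspicious entries stay in memory,
    everything else is moved to a slower tier (a dict standing in for disk)."""

    def __init__(self, capacity):
        self.capacity = capacity  # max number of entries held in memory
        self.memory = {}          # key -> (suspicion score, value)
        self.disk = {}            # hypothetical stand-in for on-disk storage

    def put(self, key, value, score):
        # A re-inserted key is promoted back to memory before eviction runs.
        self.disk.pop(key, None)
        self.memory[key] = (score, value)
        self._evict()

    def _evict(self):
        # Evict the least suspicious entries until memory fits the budget.
        while len(self.memory) > self.capacity:
            victim = min(self.memory, key=lambda k: self.memory[k][0])
            self.disk[victim] = self.memory.pop(victim)

    def get(self, key):
        # Report which tier answered, to make the cost difference visible.
        if key in self.memory:
            return self.memory[key][1], "memory"
        if key in self.disk:
            return self.disk[key][1], "disk"
        raise KeyError(key)


cache = SuspicionCache(capacity=2)
cache.put("p1", "bash -> curl", 0.9)
cache.put("p2", "cron -> logrotate", 0.1)
cache.put("p3", "nc -> /etc/passwd", 0.8)
print(cache.get("p2"))  # the low-score entry is served from the disk tier
```

In this sketch a query for a high-score entry is answered from memory, while a low-score entry costs a lookup in the slower tier. That is the trade-off a suspicion-based policy makes in contrast to recency-based policies such as LRU.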

Published In

ACSAC '20: Proceedings of the 36th Annual Computer Security Applications Conference

December 2020

962 pages

ISBN:9781450388580

DOI:10.1145/3427228

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Science Foundation

Conference

ACSAC '20

ACSAC '20: Annual Computer Security Applications Conference

December 7 - 11, 2020

Austin, USA

Acceptance Rates

Overall Acceptance Rate 104 of 497 submissions, 21%
