Research article. DOI: 10.1145/3427228.3427255

This is Why We Can’t Cache Nice Things: Lightning-Fast Threat Hunting using Suspicion-Based Hierarchical Storage

Published: 08 December 2020

Abstract

Recent advances in causal analysis can accelerate incident response, but only after a causal graph of the attack has been constructed. Unfortunately, existing causal graph generation techniques are mainly offline and may take hours or days to respond to investigator queries, giving attackers more opportunity to hide their attack footprint, gain persistence, and propagate to other machines. To address that limitation, we present Swift, a threat investigation system that provides high-throughput causality tracking and real-time causal graph generation. We design an in-memory graph database that enables space-efficient graph storage and online causality tracking with minimal disk operations. We propose a hierarchical storage system that keeps the forensically relevant part of the causal graph in main memory while evicting the rest to disk. To identify the part of the causal graph that is likely to be relevant during an investigation, we design an asynchronous cache eviction policy that calculates the most suspicious part of the causal graph and caches only that part in main memory. We evaluated Swift on a real-world enterprise to demonstrate how our system scales to process typical event loads and how it responds to forensic queries when security alerts occur. Results show that Swift is scalable, modular, and answers forensic queries in real time even when analyzing audit logs containing tens of millions of events.
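The idea of a suspicion-based hierarchical store can be illustrated with a minimal sketch. This is not Swift's implementation: the class name `SuspicionCache`, the numeric suspicion scores, and the dictionary standing in for disk are all assumptions made for illustration, and eviction here runs synchronously on each insert, whereas the abstract describes an asynchronous policy.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class SuspicionCache:
    """Toy two-tier store (hypothetical): hot nodes in memory, the rest on 'disk'."""
    capacity: int                                # max nodes kept in memory
    memory: dict = field(default_factory=dict)   # node_id -> (score, edges)
    disk: dict = field(default_factory=dict)     # stand-in for on-disk storage

    def put(self, node_id, score, edges):
        """Insert or update a node, then evict the least suspicious overflow."""
        self.disk.pop(node_id, None)             # a fresh write supersedes the disk copy
        self.memory[node_id] = (score, list(edges))
        self._evict()

    def _evict(self):
        overflow = len(self.memory) - self.capacity
        if overflow <= 0:
            return
        # Pick the nodes with the lowest suspicion scores as eviction victims.
        victims = heapq.nsmallest(
            overflow, self.memory, key=lambda n: self.memory[n][0]
        )
        for node_id in victims:
            self.disk[node_id] = self.memory.pop(node_id)

    def get(self, node_id):
        """Return (entry, tier) so callers can see which tier served the query."""
        if node_id in self.memory:
            return self.memory[node_id], "memory"
        if node_id in self.disk:
            return self.disk[node_id], "disk"
        raise KeyError(node_id)


cache = SuspicionCache(capacity=2)
cache.put("a", 0.9, ["b"])   # highly suspicious node
cache.put("b", 0.1, [])      # benign-looking node
cache.put("c", 0.5, [])      # triggers eviction of the lowest-scored node
```

With a capacity of two, the lowest-scored node is the one moved to the disk tier, so queries about the more suspicious nodes are served from memory. How the scores are computed is outside the scope of this sketch.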



Published In

ACSAC '20: Proceedings of the 36th Annual Computer Security Applications Conference
December 2020
962 pages
ISBN: 9781450388580
DOI: 10.1145/3427228

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States


Author Tags

1. Auditing
2. Data Provenance
3. Digital Forensics

Qualifiers

• Research-article
• Research
• Refereed limited

Conference

ACSAC '20

Acceptance Rates

Overall Acceptance Rate 104 of 497 submissions, 21%
