Research article. DOI: 10.1145/3460120.3484759

An Inside Look into the Practice of Malware Analysis

Published: 13 November 2021

Abstract

Malware analysis aims to understand how malicious software carries out the actions necessary for a successful attack and to identify the possible impacts of that attack. Although malware analysis has been the focus of substantial research and is an important tool for practitioners in industry, the overall analysis process used by practitioners has not been studied. As a result, an understanding of common malware analysis workflows and their goals is lacking. A better understanding of these workflows could help identify new research directions that are impactful in practice. To better understand malware analysis processes, we present the results of a user study with 21 professional malware analysts with diverse backgrounds who work at 18 different companies. The study focuses on answering three research questions: (1) What are the different objectives of malware analysts in practice? (2) What comprises a typical professional malware analyst workflow? (3) When analysts decide to conduct dynamic analysis, what factors do they consider when setting up a dynamic analysis system? Based on participant responses, we propose a taxonomy of malware analysts and identify five common analysis workflows. We also identify challenges that analysts face during the different stages of their workflow. From the results of the study, we propose two potential directions for future research, informed by challenges described by the participants. Finally, we recommend guidelines for developers of malware analysis tools to consider in order to improve the usability of such tools.

Supplementary Material

MP4 File (CCS21-fp217.mp4)
In this presentation, Miuyin Yong Wong presents "An Inside Look into the Practice of Malware Analysis", a user study that aims to understand the process of malware analysis in practice. The study focuses on answering three research questions: (1) What are the different objectives of malware analysts in practice? (2) What comprises a typical professional malware analyst workflow? (3) When analysts decide to conduct dynamic analysis, what factors do they consider when setting up a dynamic analysis system? Miuyin describes the study's methodology, gives a high-level overview of the participants, and presents the study's findings. The three main findings are the categorization of malware analysts, the five most common analysis workflows, and the key decisions malware analysts make when setting up their dynamic analysis systems. The presentation concludes with a discussion of the challenges malware analysts still face in practice.

Published In

CCS '21: Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security
November 2021
3558 pages
ISBN: 9781450384544
DOI: 10.1145/3460120

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. malware analysis
  2. usable security

Qualifiers

  • Research-article

Funding Sources

  • National Science Foundation Graduate Research Fellowship

Conference

CCS '21: 2021 ACM SIGSAC Conference on Computer and Communications Security
November 15 - 19, 2021
Virtual Event, Republic of Korea

Acceptance Rates

Overall Acceptance Rate 1,261 of 6,999 submissions, 18%
