BINO: Automatic recognition of inline binary functions from template classes

Published: 01 September 2023

Highlights

Reverse engineering is a complex process essential to vulnerability detection.
Compiler optimizations such as inlining make the task of reverse engineering significantly harder.
Functions from library template classes, such as the methods of the classes vector and map, are prime candidates for inlining.
Automatic recognition of inline library functions simplifies reverse engineering because it gives the analyst hints about the semantics of the code.

Abstract

In this paper, we propose BINO, a static analysis approach that relieves reverse engineers of the challenging task of recognizing library functions that have been inlined. BINO recognizes inline calls to methods of C++ template classes (even with unknown data types) through a binary fingerprinting and matching approach. Our fingerprint model captures syntactic and semantic features of an assembly function, along with its Control-Flow Graph (CFG) structure. Using these fingerprints and subgraph isomorphism, BINO recognizes inline method calls in a target binary. BINO automates the fingerprint generation phase by parsing the source code of the template classes and automatically building appropriate binaries with representative inline calls to those methods. We evaluate BINO on a dataset of 555 GitHub C++ projects containing 10,600 inline functions, exploring several optimization levels that allow the compiler to inline function calls. We show that our approach can recognize inline calls to the most used methods of well-known template classes with an F1-Score of up to 63% at the -O2, -O3, and -Ofast optimization levels.
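The matching step described above can be illustrated with a minimal sketch. It is not BINO's implementation: it assumes a fingerprint can be reduced to a small labelled CFG, uses a plain backtracking search rather than any specific published algorithm, and every name in it (`find_embedding`, the node names, the labels such as "cmp" or "store") is hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Set

# Assumption: a CFG is modelled as node -> set of successor nodes,
# with one coarse label per basic block (hypothetical labels).
Graph = Dict[str, Set[str]]
Labels = Dict[str, str]


def find_embedding(pattern: Graph, p_labels: Labels,
                   target: Graph, t_labels: Labels) -> Optional[Dict[str, str]]:
    """Return a mapping pattern-node -> target-node, or None if none exists.

    Every pattern edge must also exist between the mapped target nodes,
    and mapped nodes must carry equal labels. Extra edges in the target
    are tolerated, since inlined code sits inside a larger function.
    """
    order: List[str] = sorted(pattern)

    def consistent(p: str, t: str, mapping: Dict[str, str]) -> bool:
        if p_labels[p] != t_labels[t]:
            return False
        if p in pattern[p] and t not in target[t]:  # self-loop in the pattern
            return False
        for q, u in mapping.items():
            if q in pattern[p] and u not in target[t]:  # edge p -> q
                return False
            if p in pattern[q] and t not in target[u]:  # edge q -> p
                return False
        return True

    def extend(i: int, mapping: Dict[str, str]) -> Optional[Dict[str, str]]:
        if i == len(order):
            return dict(mapping)
        p = order[i]
        used = set(mapping.values())
        for t in sorted(target):
            if t in used or not consistent(p, t, mapping):
                continue
            mapping[p] = t
            found = extend(i + 1, mapping)
            if found is not None:
                return found
            del mapping[p]  # backtrack and try the next candidate
        return None

    return extend(0, {})


# Hypothetical fingerprint: a comparison block branching to two blocks.
fp_graph: Graph = {"a": {"b", "c"}, "b": set(), "c": set()}
fp_labels: Labels = {"a": "cmp", "b": "store", "c": "call"}

# Hypothetical target function CFG that contains the fingerprint.
fn_graph: Graph = {"n0": {"n1"}, "n1": {"n2", "n3"},
                   "n2": {"n4"}, "n3": {"n4"}, "n4": set()}
fn_labels: Labels = {"n0": "mov", "n1": "cmp", "n2": "store",
                     "n3": "call", "n4": "ret"}

match = find_embedding(fp_graph, fp_labels, fn_graph, fn_labels)
print(match)
```

A non-`None` result here only says that the fingerprint's structure and labels occur inside the target graph. The backtracking search is exponential in the worst case, so a practical tool would need richer node features and pruning to keep matching tractable on real binaries.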

Published In

Computers and Security, Volume 132, Issue C
Sep 2023
997 pages

Publisher

Elsevier Advanced Technology Publications

United Kingdom

Author Tags

  1. Reverse engineering
  2. Function inlining
  3. Template classes
  4. Function recognition
  5. Graph isomorphism

Qualifiers

  • Research-article
