DOI: 10.1145/2024569.2024571
research-article

Labeling library functions in stripped binaries

Published: 05 September 2011

Abstract

Binary code presents unique analysis challenges, particularly when debugging information has been stripped from the executable. Among the valuable information lost in stripping are the identities of standard library functions linked into the executable; knowing the identities of such functions can help to optimize automated analysis and is instrumental in understanding program behavior. Library fingerprinting attempts to restore the names of library functions in stripped binaries, using signatures extracted from reference libraries. Existing methods are brittle in the face of variations in the toolchain that produced the reference libraries and do not generalize well to new library versions. We introduce semantic descriptors, high-level representations of library functions that avoid the brittleness of existing approaches. We have extended a tool, unstrip, to apply this technique to fingerprint wrapper functions in the GNU C library. unstrip discovers functions in a stripped binary and outputs a new binary, with meaningful names added to the symbol table. Other tools can leverage these symbols to perform further analysis. We demonstrate that our semantic descriptors generalize well and substantially outperform existing library fingerprinting techniques.
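
The matching step described in the abstract can be sketched as a lookup from descriptors to names. The abstract does not say what a semantic descriptor contains, so this sketch assumes one plausible form for wrapper functions: the set of system calls a function issues, together with any constant arguments. The names `make_descriptor`, `label_functions`, and `REFERENCE` are hypothetical and are not part of unstrip; the system call numbers are the x86-64 Linux ones and serve only as sample data.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# ASSUMPTION: a descriptor is the set of (syscall number, constant args)
# pairs a wrapper function issues. The abstract does not define the format.
Descriptor = FrozenSet[Tuple[int, Tuple[int, ...]]]


def make_descriptor(calls: Iterable[Tuple[int, Iterable[int]]]) -> Descriptor:
    """Build an order-independent, hashable descriptor from observed calls."""
    return frozenset((number, tuple(args)) for number, args in calls)


# Hypothetical reference database, as if extracted from a library that still
# has its symbols. Numbers follow the x86-64 Linux system call table.
REFERENCE: Dict[Descriptor, str] = {
    make_descriptor([(0, ())]): "read",
    make_descriptor([(1, ())]): "write",
    make_descriptor([(3, ())]): "close",
    make_descriptor([(39, ())]): "getpid",
}


def label_functions(
    found: Dict[int, Descriptor], reference: Dict[Descriptor, str]
) -> Dict[int, str]:
    """Map function entry addresses to names where the descriptor matches."""
    return {
        address: reference[descriptor]
        for address, descriptor in found.items()
        if descriptor in reference
    }


if __name__ == "__main__":
    # Descriptors as they might be recovered from a stripped binary.
    stripped = {
        0x401000: make_descriptor([(1, ())]),
        0x401020: make_descriptor([(39, ())]),
        0x401040: make_descriptor([(999, ())]),  # no match, stays unnamed
    }
    for address, name in sorted(label_functions(stripped, REFERENCE).items()):
        print(f"{address:#x} -> {name}")
```

Because the lookup key is what the function does rather than its instruction bytes, two builds of the same wrapper produced by different compilers would map to the same entry in this sketch, which is the kind of robustness the abstract claims over byte-pattern signatures.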

Published In

PASTE '11: Proceedings of the 10th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools
September 2011
46 pages
ISBN:9781450308496
DOI:10.1145/2024569

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. reverse engineering
  2. static analysis
  3. stripped binaries

Qualifiers

  • Research-article

Conference

ESEC/FSE'11

Acceptance Rates

Overall Acceptance Rate 57 of 159 submissions, 36%
