Computer Science > Cryptography and Security
[Submitted on 18 May 2023 (this version), latest version 18 Jul 2023 (v2)]
Title: GraphMoco: a Graph Momentum Contrast Model that Using Multimodel Structure Information for Large-scale Binary Function Representation Learning
Abstract: The ability to compute similarity scores for binary code at the function level is essential for cybersecurity. A single binary file can contain tens of thousands of functions, so a deployable learning framework for cybersecurity applications must be not only accurate but also efficient on large amounts of data. Traditional methods suffer from two drawbacks. First, it is very difficult to annotate pairs of functions with accurate labels, and supervised learning methods can easily overfit to inaccurate labels. Second, they rely on either a pre-trained encoder or fine-grained graph comparison, both of which have shortcomings in time or memory consumption. We focus on large-scale Binary Code Similarity Detection (BCSD). To mitigate these problems, we propose GraphMoco, a graph momentum contrast model that uses multimodal structure information for large-scale binary function representation learning. We take an unsupervised learning approach that makes full use of the structural information in binary code and does not require manually labelled similar or dissimilar pairs. Our models perform efficiently on large amounts of training data. Our experimental results show that our method outperforms state-of-the-art methods in terms of accuracy.
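The abstract names momentum contrast but does not give the details of the model, so the following is only a generic sketch of the two ingredients a momentum-contrast setup usually has: a slowly updated key encoder and a contrastive (InfoNCE-style) loss. It is not the authors' implementation. The function names, the momentum value 0.999, the temperature 0.07, and the use of plain float lists in place of graph encoder weights and function embeddings are all assumptions made for illustration.

```python
import math


def momentum_update(key_params, query_params, m=0.999):
    """Move key-encoder weights slowly toward query-encoder weights.

    key <- m * key + (1 - m) * query  (hypothetical flat weight lists)
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def normalize(v):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(query, positive_key, negative_keys, temperature=0.07):
    """Contrastive loss: negative log-softmax of the positive key's score.

    `query` and `positive_key` stand in for embeddings of two views of the
    same binary function; `negative_keys` stand in for other functions.
    """
    q = normalize(query)
    logits = [dot(q, normalize(positive_key)) / temperature]
    logits += [dot(q, normalize(k)) / temperature for k in negative_keys]
    # Log-sum-exp with max subtraction for numerical stability.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum - logits[0]


if __name__ == "__main__":
    # A query aligned with its positive key gives a much lower loss
    # than a query aligned with a negative key.
    print(info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]]))
    print(info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]]))
```

No labels appear anywhere in this loss: the positive pair comes from two views of the same function, which is what lets such an approach avoid manually annotated similar or dissimilar pairs.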
Submission history
From: Ruijin Sun
[v1] Thu, 18 May 2023 09:07:40 UTC (1,204 KB)
[v2] Tue, 18 Jul 2023 16:05:16 UTC (1,996 KB)