Detecting malware based on DNS graph mining

Authors:

Ping Yi

International Journal of Distributed Sensor Networks, Volume 2015

Article No.: 1, Page 1

https://doi.org/10.1155/2015/102687

Published: 01 January 2015

Abstract

Malware remains a major threat to today's Internet. In this paper, we propose a malware detection approach based on DNS graph mining. A DNS graph is composed of DNS nodes, which represent the server IPs, client IPs, and queried domain names involved in DNS resolution. After constructing the graph, we transform the problem of malware detection into the graph mining task of inferring the reputation scores of graph nodes using the belief propagation algorithm. Nodes with lower reputation scores are inferred to be infected by malware with higher probability. We evaluate the proposed approach on a real-world dataset collected from campus DNS servers over three months, from which we built a DNS graph consisting of 19,340,820 vertices and 24,277,564 edges. On this graph, we achieve a true positive rate of 80.63% at a false positive rate of 0.023%. At a false positive rate of 1.20%, the true positive rate improves to 95.66%. We detected 88,592 hosts that are either infected by malware or C&C servers, accounting for 5.47% of all hosts. Meanwhile, 117,971 domains are considered to be related to malicious activities, accounting for 1.5% of all domains. The results indicate that our method is efficient and effective in detecting malware.
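
The inference step described in the abstract can be illustrated with a minimal sketch of loopy belief propagation over a toy DNS graph. This is not the authors' implementation: the function name `belief_propagation`, the node labels, the prior values, the homophily edge potential, and the parameter `eps` are all assumptions chosen for illustration, since the abstract does not specify them. The sketch treats each node as either benign or malicious and reports the benign belief as its reputation score.

```python
from collections import defaultdict

BENIGN, MALICIOUS = 0, 1


def belief_propagation(edges, priors, eps=0.1, iterations=20):
    """Return an assumed reputation score, P(benign), for every node.

    edges:  iterable of undirected (node, node) pairs
    priors: dict mapping a node to its prior P(malicious); nodes that
            are missing get an uninformative prior of 0.5
    eps:    strength of the (assumed) homophily edge potential
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    def prior(node):
        p_mal = priors.get(node, 0.5)
        return (1.0 - p_mal, p_mal)

    # Assumed edge potential: linked nodes tend to share the same state.
    psi = ((0.5 + eps, 0.5 - eps),
           (0.5 - eps, 0.5 + eps))

    # messages[(i, j)] is the message from i to j over the states of j.
    messages = {(i, j): (0.5, 0.5) for i in neighbors for j in neighbors[i]}

    for _ in range(iterations):
        updated = {}
        for (i, j) in messages:
            phi = prior(i)
            out = [0.0, 0.0]
            for xj in (BENIGN, MALICIOUS):
                for xi in (BENIGN, MALICIOUS):
                    # Product of messages into i from all neighbors except j.
                    incoming = 1.0
                    for k in neighbors[i]:
                        if k != j:
                            incoming *= messages[(k, i)][xi]
                    out[xj] += phi[xi] * psi[xi][xj] * incoming
            total = out[0] + out[1]
            updated[(i, j)] = (out[0] / total, out[1] / total)
        messages = updated

    # Belief of each node: prior times all incoming messages, normalized.
    reputation = {}
    for i in neighbors:
        belief = list(prior(i))
        for k in neighbors[i]:
            belief[BENIGN] *= messages[(k, i)][BENIGN]
            belief[MALICIOUS] *= messages[(k, i)][MALICIOUS]
        reputation[i] = belief[BENIGN] / (belief[BENIGN] + belief[MALICIOUS])
    return reputation


# Hypothetical toy graph: client hosts, queried domains, and one server IP.
edges = [
    ("host:10.0.0.1", "dom:evil.example"),
    ("host:10.0.0.1", "dom:unknown.example"),
    ("host:10.0.0.2", "dom:good.example"),
    ("dom:evil.example", "ip:203.0.113.5"),
]
# Hypothetical ground truth, e.g. from a blacklist and a whitelist.
priors = {"dom:evil.example": 0.99, "dom:good.example": 0.01}

reputation = belief_propagation(edges, priors)
for node, score in sorted(reputation.items(), key=lambda item: item[1]):
    print(f"{node}: {score:.3f}")
```

In this toy graph, the host that queried the blacklisted domain ends up with a reputation below 0.5, and the unlabeled domain it also queried is pulled below 0.5 as well, which mirrors the idea that low scores propagate along DNS relationships. A graph at the scale reported in the abstract would need a distributed or sparse implementation rather than this dictionary-based sketch.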

Published In

International Journal of Distributed Sensor Networks, Volume 2015

Special issue on Big Data in Future Sensing

January 2015

88 pages

ISSN:1550-1329

EISSN:1550-1477

Publisher

Hindawi Limited

London, United Kingdom

Publication History

Accepted: 17 April 2015

Published: 01 January 2015

Received: 24 August 2014

Qualifiers

Article

Cited By

Nabeel M, Khalil I, Guan B, Yu T (2020). Following Passive DNS Traces to Detect Stealthy Malicious Domains Via Graph Inference. ACM Transactions on Privacy and Security 23:4 (1-36). Online publication date: 6-Jul-2020.
https://dl.acm.org/doi/10.1145/3401897

Blaise A, Bouet M, Conan V, Secci S (2020). BotFP: FingerPrints Clustering for Bot Detection. NOMS 2020 - 2020 IEEE/IFIP Network Operations and Management Symposium (1-7). Online publication date: 20-Apr-2020.
https://dl.acm.org/doi/10.1109/NOMS47738.2020.9110420

Zhauniarovich Y, Khalil I, Yu T, Dacier M (2018). A Survey on Malicious Domains Detection through DNS Data Analysis. ACM Computing Surveys 51:4 (1-36). Online publication date: 6-Jul-2018.
https://dl.acm.org/doi/10.1145/3191329

Khalil I, Guan B, Nabeel M, Yu T, Zhao Z, Ahn G, Krishnan R, Ghinita G (2018). A Domain is only as Good as its Buddies. Proceedings of the Eighth ACM Conference on Data and Application Security and Privacy (330-341). Online publication date: 13-Mar-2018.
https://dl.acm.org/doi/10.1145/3176258.3176329