Detecting malware based on DNS graph mining

Published: 01 January 2015

Abstract

Malware remains a major threat to today's Internet. In this paper, we propose a malware detection approach based on DNS graph mining. A DNS graph is composed of DNS nodes, which represent the server IPs, client IPs, and queried domain names involved in DNS resolution. After constructing the graph, we transform the malware detection problem into the graph mining task of inferring the reputation scores of graph nodes using the belief propagation algorithm. Nodes with lower reputation scores are inferred to be infected by malware with higher probability. For demonstration, we evaluate the proposed approach on a real-world dataset collected from campus DNS servers over three months, from which we built a DNS graph consisting of 19,340,820 vertices and 24,277,564 edges. On this graph, we achieve a true positive rate of 80.63% at a false positive rate of 0.023%. At a false positive rate of 1.20%, the true positive rate improves to 95.66%. We detected 88,592 hosts infected by malware or acting as C&C servers, accounting for 5.47% of all hosts. Meanwhile, 117,971 domains are considered to be related to malicious activities, accounting for 1.5% of all domains. The results indicate that our method is efficient and effective in detecting malware.
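
The pipeline described in the abstract (build a graph of client IPs, domains, and server IPs, then infer per-node reputation with belief propagation) can be illustrated with a minimal sketch. It is not the authors' implementation. The record format, the function names `build_dns_graph` and `belief_propagation`, the two-state model, the homophily strength `eps`, and the seed priors are all assumptions chosen for illustration. A graph with millions of vertices would also need log-space messages and a parallel framework, which this toy version omits.

```python
from collections import defaultdict


def build_dns_graph(records):
    """Build an undirected graph from (client_ip, domain, server_ip) tuples.

    The tuple layout is an assumed input format, not taken from the paper.
    """
    adj = defaultdict(set)
    for client, domain, server in records:
        c, d, s = ("client", client), ("domain", domain), ("server", server)
        adj[c].add(d)
        adj[d].add(c)  # the client queried the domain
        adj[d].add(s)
        adj[s].add(d)  # the domain resolved to the server IP
    return adj


def belief_propagation(adj, priors, eps=0.1, iters=20):
    """Loopy sum-product BP. Returns P(benign) for every node.

    State 0 = benign, state 1 = malicious. Nodes without a prior start
    at (0.5, 0.5). eps controls how strongly neighbours share a state.
    """
    psi = [[0.5 + eps, 0.5 - eps], [0.5 - eps, 0.5 + eps]]
    phi = {v: priors.get(v, (0.5, 0.5)) for v in adj}
    msgs = {(u, v): [1.0, 1.0] for u in adj for v in adj[u]}
    for _ in range(iters):
        new = {}
        for (u, v) in msgs:
            out = [0.0, 0.0]
            for xv in (0, 1):
                for xu in (0, 1):
                    # Prior of u times all incoming messages except from v.
                    prod = phi[u][xu]
                    for w in adj[u]:
                        if w != v:
                            prod *= msgs[(w, u)][xu]
                    out[xv] += prod * psi[xu][xv]
            z = out[0] + out[1]
            new[(u, v)] = [out[0] / z, out[1] / z]
        msgs = new
    scores = {}
    for v in adj:
        b = list(phi[v])
        for w in adj[v]:
            b[0] *= msgs[(w, v)][0]
            b[1] *= msgs[(w, v)][1]
        scores[v] = b[0] / (b[0] + b[1])  # normalised belief in "benign"
    return scores


# Toy data: the names and addresses below are made up for the example.
records = [
    ("10.0.0.1", "good.example", "192.0.2.1"),
    ("10.0.0.2", "good.example", "192.0.2.1"),
    ("10.0.0.2", "bad.example", "198.51.100.7"),
    ("10.0.0.3", "bad.example", "198.51.100.7"),
]
priors = {
    ("domain", "good.example"): (0.99, 0.01),  # assumed whitelist seed
    ("domain", "bad.example"): (0.01, 0.99),   # assumed blacklist seed
}
scores = belief_propagation(build_dns_graph(records), priors)
for node, score in sorted(scores.items()):
    print(node, round(score, 3))
```

In this toy graph, the client that only queries the seeded-benign domain ends up with a reputation above 0.5, and the client that only queries the seeded-malicious domain ends up below 0.5. Thresholding these scores is what would trade true positives against false positives.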

Information & Contributors

Information

Published In

International Journal of Distributed Sensor Networks  Volume 2015, Issue
Special issue on Big Data in Future Sensing
January 2015
88 pages
ISSN:1550-1329
EISSN:1550-1477

Publisher

Hindawi Limited

London, United Kingdom

Publication History

Accepted: 17 April 2015
Published: 01 January 2015
Received: 24 August 2014

Qualifiers

  • Article
