DOI:10.1007/978-3-031-36805-9_47
Article

Structural Node Representation Learning for Detecting Botnet Nodes

Published: 03 July 2023

Abstract

Private consumers, small businesses, and even large enterprises are increasingly at risk from botnets. Botnets are known for spearheading Distributed Denial-of-Service (DDoS) attacks, spamming large populations of users, and causing critical harm to major organizations. The development of Internet-of-Things (IoT) devices has led to their use for cryptocurrency mining, in-transit data interception, and sending logs containing private data to the botnet master. Different techniques have been developed to identify these botnet activities, but only a few use Graph Neural Networks (GNNs) to analyze host activity by representing host communications as a directed graph. Although GNNs are intended to extract structural graph properties, they are prone to overfitting, which leads to failure when they are applied to a previously unseen network. In this study, we test the hypothesis that structural graph patterns can be used for efficient botnet detection. We also present SIR-GN, a structural iterative representation learning methodology for graph nodes. Our approach is built to work well on unseen data, and our model provides a vector representation for every node that captures its structural information. Finally, we demonstrate that, when the node representation vectors are fed into a neural network classifier, our model outperforms state-of-the-art GNN-based algorithms in detecting bot nodes within unknown networks.
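
The idea of a purely structural, iteratively refined node vector can be illustrated with a minimal sketch. This is not the SIR-GN algorithm, whose details the abstract does not give. It is a simplified stand-in, assuming host communications are available as (source, destination) pairs. The function name `structural_embeddings`, the `depth` parameter, and the aggregation rule are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def structural_embeddings(edges, depth=2):
    """Return {node: tuple} of structural features for a directed graph.

    Hypothetical simplification for illustration only; NOT the SIR-GN method.
    Node identities are never used as features, only connectivity patterns.
    """
    succ = defaultdict(set)  # node -> hosts it sends traffic to
    pred = defaultdict(set)  # node -> hosts that send traffic to it
    nodes = set()
    for src, dst in edges:
        succ[src].add(dst)
        pred[dst].add(src)
        nodes.update((src, dst))

    # Depth 0: describe each node by (in-degree, out-degree).
    current = {v: (len(pred[v]), len(succ[v])) for v in nodes}
    embedding = {v: list(current[v]) for v in nodes}

    for _ in range(depth):
        # Refine: summarise the previous layer over in-neighbours and
        # out-neighbours separately, so edge direction is preserved.
        refined = {}
        for v in nodes:
            from_in = sum(sum(current[u]) for u in pred[v])
            from_out = sum(sum(current[u]) for u in succ[v])
            refined[v] = (from_in, from_out)
        for v in nodes:
            embedding[v].extend(refined[v])
        current = refined

    return {v: tuple(vec) for v, vec in embedding.items()}


if __name__ == "__main__":
    # Two disjoint "star" patterns: hub a -> {b, c, d} and hub x -> {y, z, w}.
    demo_edges = [("a", "b"), ("a", "c"), ("a", "d"),
                  ("x", "y"), ("x", "z"), ("x", "w")]
    for node, vec in sorted(structural_embeddings(demo_edges).items()):
        print(node, vec)
```

Because the vectors depend only on connectivity, the two hubs in the demo receive identical vectors even though they sit in disjoint components. Structural features of this kind can therefore be computed on a network that was never seen during training and passed to a downstream classifier.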

Information

Published In

Computational Science and Its Applications – ICCSA 2023: 23rd International Conference, Athens, Greece, July 3–6, 2023, Proceedings, Part I
Jul 2023
818 pages
ISBN:978-3-031-36804-2
DOI:10.1007/978-3-031-36805-9

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Machine Learning
  2. Botnet Detection

Qualifiers

  • Article
