Malla: demystifying real-world large language model integrated malicious services
Article No.: 263, Pages 4693–4710
Abstract
The underground exploitation of large language models (LLMs) for malicious services (i.e., Malla) is on the rise, amplifying the cyber threat landscape and raising questions about the trustworthiness of LLM technologies. However, there has been little effort to understand this new cybercrime in terms of its magnitude, impact, and techniques. In this paper, we conduct the first systematic study of 212 real-world Mallas, uncovering their proliferation in underground marketplaces and exposing their operational modalities. Our study discloses the Malla ecosystem, revealing its significant growth and impact on today's public LLM services. Through this examination, we uncovered eight backend LLMs used by Mallas, along with 182 prompts that circumvent the protective measures of public LLM APIs. We further demystify the tactics employed by Mallas, including the abuse of uncensored LLMs and the exploitation of public LLM APIs through jailbreak prompts. Our findings enable a better understanding of the real-world exploitation of LLMs by cybercriminals and offer insights into strategies for counteracting this cybercrime.
Published In
Copyright © 2024 The USENIX Association.
Sponsors
- Bloomberg Engineering
- Google Inc.
- NSF
- Futurewei Technologies
- IBM
Publisher
USENIX Association
United States
Publication History
Published: 12 August 2024
Acceptance Rates
Overall Acceptance Rate 40 of 100 submissions, 40%