A chit-chat between Llama 2 and ChatGPT for the automated creation of exploits

Caturano, F.; Ciotola, J.; Romano, S. P.; Varlese, M.

doi:10.1016/j.comnet.2025.111501

Software exploitation is the process of taking advantage of vulnerabilities in software systems in order to perform unintended activities. Understanding it leads to improved defensive measures and to informed decisions about which security mechanisms to prioritize. However, creating a software exploit is typically a time-consuming, manual task that demands a deep understanding of programming, network protocols, operating system internals, and computer architectures. It also requires the ability to integrate this knowledge through complex reasoning and problem-solving techniques. This paper proposes an approach to tackle these problems by encouraging a conversation between Large Language Models (LLMs) with the purpose of generating software exploits. First, the chosen LLMs are provided with the necessary contextual knowledge through modern fine-tuning and prompt engineering techniques. Then, the exploitation methodology is divided into several steps: vulnerable program analysis, identification of the exploit, planning of the exploitation process, discovery of architecture internals, and production of the exploit software. A first LLM is designed to ask a second LLM questions about the execution of the above-mentioned steps. The final output of the second LLM is functional exploit code, produced in a fully automated manner. This method demonstrates how two LLMs – one possessing capabilities in exploitation and coding, and the other with expertise in computer architecture – can collaborate to successfully exploit Buffer Overflow vulnerabilities.
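
The question-and-answer structure described in the abstract can be illustrated with a minimal Python sketch of the orchestration loop alone. It is not the authors' implementation: the names `run_conversation`, `stub_asker`, and `stub_answerer` are hypothetical, the two agents are placeholder functions rather than real model calls, and the prompts are invented for illustration. Only the five step names are taken from the abstract.

```python
from typing import Callable, List, Tuple

# The five methodology steps, in the order the abstract lists them.
STEPS = [
    "vulnerable program analysis",
    "identification of the exploit",
    "planning of the exploitation process",
    "discovery of architecture internals",
    "production of the exploit software",
]

# An agent is anything that maps a text prompt to a text reply.
Agent = Callable[[str], str]


def run_conversation(
    asker: Agent, answerer: Agent, steps: List[str] = STEPS
) -> List[Tuple[str, str, str]]:
    """Let `asker` question `answerer` once per step; return the transcript."""
    transcript = []
    context = ""  # answers accumulated so far (an assumption of this sketch)
    for step in steps:
        question = asker(f"{context}\nNext step: {step}")
        answer = answerer(question)
        transcript.append((step, question, answer))
        context += f"\n[{step}] {answer}"
    return transcript


# Placeholder agents (hypothetical); a real system would call the two LLMs here.
def stub_asker(prompt: str) -> str:
    step = prompt.rsplit("Next step: ", 1)[1]
    return f"How would you carry out: {step}?"


def stub_answerer(question: str) -> str:
    return f"(placeholder answer to: {question})"


if __name__ == "__main__":
    for step, question, answer in run_conversation(stub_asker, stub_answerer):
        print(f"{step}\n  Q: {question}\n  A: {answer}")
```

Passing the agents in as plain callables keeps the loop independent of any particular model or API, so the stubs can be swapped for real clients without changing the control flow.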

A chit-chat between Llama 2 and ChatGPT for the automated creation of exploits / Caturano, F.; Ciotola, J.; Romano, S. P.; Varlese, M. - In: COMPUTER NETWORKS. - ISSN 1389-1286. - 270:(2025). [10.1016/j.comnet.2025.111501]