Sort by:  

Conference:  Defcon 31
Authors: Xavier Cadena

Large Language Models are already revolutionizing the software development landscape. As hackers we can only do what we've always done, embrace the machine and use it to do our bidding. There are many valid criticisms of GPT models for writing code like the tendency to hallucinate functions, not being able to reason about architecture, training done on amateur code, limited context due to token length, and more. None of which are particularly important when writing fuzz tests. This presentation will delve into the integration of LLMs into fuzz testing, providing attendees with the insights and tools necessary to transform and automate their security assessment strategies. The presentation will kick off with an introduction to LLMs; how they work, the potential use cases and challenges for hackers, prompt writing tips, and the deficiencies of current models. We will then provide a high level overview explaining the purpose, goals, and obstacles of fuzzing, why this research was undertaken, and why we chose to start with 'memory safe' Python. We will then explore efficient usage of LLMs for coding, and the Primary benefits LLMs offer for security work, paving the way for a comprehensive understanding of how LLMs can automate tasks traditionally performed by humans in fuzz testing engagements. We will then introduce FuzzForest, an open source tool that harnesses the power of LLMs to automatically write, fix, and triage fuzz tests on Python code. A thorough discussion on the workings of FuzzForest will follow, with a focus on the challenges faced during development and our solutions. The highlight of the talk will showcase the results of running the tool on the 20 most popular open-source Python libraries which resulted in identifying dozens of bugs. We will end the talk with an analysis of efficacy and question if we'll all be replaced with a SecurityGPT model soon. To maximize the benefits of this talk, attendees should possess a fundamental understanding of fuzz testing, programming languages, and basic AI concepts. However, a high-level refresher will be provided to ensure a smooth experience for all participants.