SAST tools are notoriously hard to evaluate and benchmark. The most important thing you want to know about a tool before spending time and money on it: does it give you relevant results? Does it really find the vulnerabilities it promises? Vendors are quick to tell you that their technology will find every vulnerability category out there, and claim to cover every CWE under the sun. But how do you verify such bold claims? How many vulnerabilities will their tool really uncover, and how many frustrating false positives will you have to trawl through?

We've all been there: planting mock vulnerabilities in our code bases to challenge a SAST product. It takes a lot of time, and it only gets you a synthetic set of vulnerabilities to test against. Or you might run tools against one of the many synthetic benchmarking repositories that are riddled with vulnerabilities. Deep down, you know that those codebases have aged, don't really test coverage for modern web frameworks, and rarely test for vulnerabilities that arise from the complex interplay between dependencies and your own code.

If only we could test tools against *real* vulnerabilities! But hold on… We carefully give every major security vulnerability a globally unique CVE identifier and a collection of metadata. Why not use those? We've triaged hundreds of CVEs in open source codebases and identified the fix commit(s) for every single vulnerability. At Black Hat Europe, we will release this benchmarking dataset and tooling to the open source community.

This is an initiative by the recently founded Open Source Security Foundation, a part of the Linux Foundation. The working group in which this initiative was developed includes partners from GitHub, Google, Microsoft, Mozilla, and OWASP.