Public benchmark · AI-app SAST
The SAST tools your team already trusts weren't designed for AI apps.
AI applications introduce a different attack surface: prompt injection,
unsafe role merging, client-side LLM keys, PII flowing into prompts,
unbounded streams, and unsafe tool outputs. None of these patterns existed when
Snyk, Checkmarx, Veracode, GitHub Advanced Security, or any other enterprise
SAST tool calibrated its rule set. CodeSecBench is the public benchmark that
measures which tools catch which of these patterns.
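To make one of the patterns above concrete, here is a minimal sketch of what "unsafe role merging" could look like in application code. It assumes the term refers to untrusted user text being concatenated into the system role. The function names and the message shape are hypothetical illustrations and are not taken from the CodeSecBench corpus or from any particular SDK.

```javascript
// Hypothetical illustration only; not a CodeSecBench test case.

// Risky: untrusted user text is concatenated into the system message,
// so the model receives it with the same authority as the instructions.
function buildMessagesUnsafe(systemPrompt, userInput) {
  return [{ role: "system", content: systemPrompt + "\n" + userInput }];
}

// Safer: instructions and untrusted input stay in separate roles,
// so downstream code and the model can tell them apart.
function buildMessagesSeparated(systemPrompt, userInput) {
  return [
    { role: "system", content: systemPrompt },
    { role: "user", content: userInput },
  ];
}
```

Nothing in the first function resembles a classic injection sink such as a SQL query or a shell command, which is one plausible reason a rule set written for traditional applications would pass over it.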
Maintain a SAST tool? Get on the leaderboard.
CodeSecBench will grade any tool that is submitted. The current maintainer
arrangement is documented openly (see governance), and a multi-maintainer
model takes over the moment a second tool's maintainer joins.
The corpus, the truth files, and the score.js harness go public once
Tier C lands its final two repositories (cycles 5 and 6 are in flight).
The methodology, results, targets, and SAST landscape are all browsable now.