CodeSecBench

Landscape · 2026

AI app security tooling, mapped honestly.

The AI security tools market split into clear quadrants in 2026. Two axes matter for positioning: buyer motion (enterprise sales-led vs. dev-led self-serve) and product angle (offensive "AI hacker" vs. continuous SAST vs. AI code review).

Most companies pick a quadrant. The picks reveal who they're really selling to. CodeSecBench grades the SAST quadrant specifically — pre-deploy static analysis tools — and lets adjacent tools (runtime, eval, code review) sit beside it for landscape clarity.

The 2×2 — buyer motion × product angle

Where each tool sits on the two axes that matter for AI security tooling.

SAST (scan + fix)
Enterprise sales-led: Snyk Code, Checkmarx, Veracode, Fortify, GitHub Advanced Security, ZeroPath
Dev-led / self-serve: getdebug, Gecko Security, Semgrep, SonarQube

Offensive ("AI hacker")
Enterprise sales-led: Hex Security, Veria Labs, HackerOne (legacy)
Dev-led / self-serve: Winfunc (small team)

Code review ("AI reviewer")
Macroscope, CodeRabbit, Cursor BugBot, Greptile

The dev-led SAST quadrant got its first dedicated AI-SAST competitor in 2024 — Gecko Security. Gecko is generic SAST (no AI-app-specific patterns out of the box); getdebug is AI-app-native. Both are in the same buyer quadrant. The category is not empty — it's young.

Tools, by category

Dev-led = installable without a sales call, free or low-cost tier, dev-time integration. In bench = currently scored on /results. AI-app coverage is the maintainer's best read of the tool's documentation as of 2026 — disputes welcome via PR.

AI-app SAST: code-side, AI-app-native (the narrow category this benchmark is for) · 3 tools

Code-side SAST built for AI-app patterns. As of 2026, this category is sparsely populated. getdebug ships full per-category coverage in a free CLI + hosted tier. Snyk has begun adding AI-related rules to its enterprise tier. Vulnhuntr from Protect AI is research-stage and currently doesn't run cleanly on the modern stack. CodeSecBench is partly a public scoreboard for this category and partly a call for more entrants — especially generic SAST vendors who add AI-app rule packs.

Tool · Positioning · Pricing shape · Dev-led · AI-app coverage · In bench

Tool: getdebug
Vendor: getdebug
Positioning: AI-app SAST across six behavioral categories + secrets + dep-CVE + auto-fix. Free CLI with true local mode.
Pricing shape: Free CLI + Free/Pro/Pro Plus hosted
AI-app coverage: Built for AI-app patterns

Tool: Snyk Code (AI rules)
Vendor: Snyk Limited
Positioning: Recent AI-related rule additions inside the enterprise Snyk Code SAST product
Notes: Coverage breadth not publicly benchmarked. Embedded in the broader Snyk Code product, not a standalone tool.
Pricing shape: Per-developer/seat, enterprise tier
Dev-led: org-facing
AI-app coverage: Partial AI-app coverage

Tool: Vulnhuntr
Vendor: Protect AI (acquired by Palo Alto Networks)
Positioning: LLM-powered AI-app vulnerability scanner, research-stage
Notes: CodeSecBench attempted to run it in our June 2026 calibration and it failed on the current Python/Node stack. Maintenance status uncertain post Protect AI acquisition.
Pricing shape: AGPL-licensed open source
AI-app coverage: Partial AI-app coverage

Dev-led generic SAST: closest competitive quadrant for AI-app-native tools · 3 tools

Dev-led generic SAST is the quadrant getdebug sits in. Same buyer motion (no sales call, free tier, dev-time integration), same product shape (scan + fix in the loop), but generic CWE coverage rather than AI-app patterns. Gecko Security (YC F24) is the closest direct competitor; Semgrep + SonarQube round out the field.

Tool · Positioning · Pricing shape · Dev-led · AI-app coverage · In bench

Tool: Gecko Security
Vendor: Gecko (YC F24)
Positioning: Dev-led AI-SAST with compiler-accurate semantic call-chain indexer + auto-fix PR bot. Generic SAST, not AI-app-specific.
Notes: Closest direct competitor in the dev-led SAST quadrant. Strong taint-analysis primitive; published 30+ CVEs in famous OSS. No CLI as of late 2026 per their site; on-prem is Enterprise-tier only.
Pricing shape: Free (10 scans + 1 public repo) · Pro $100/mo · Enterprise
AI-app coverage: No AI-app detection

Tool: Semgrep
Vendor: Semgrep Inc.
Positioning: Pattern-matching SAST, community + paid platform
Notes: Community rule packs touch some AI-app patterns; coverage depends on which packs are enabled. CodeSecBench scores Semgrep on its default community rules.
Pricing shape: Open core; paid tier per-dev/mo
AI-app coverage: Partial AI-app coverage

Tool: SonarQube
Vendor: Sonar
Positioning: Code quality + SAST, community + commercial
Pricing shape: Community + Developer/Enterprise tiers
AI-app coverage: No AI-app detection

Enterprise SAST: org-facing · 6 tools

Mature SAST suites built before AI-app patterns existed. Strong on classic CWE coverage; AI-app coverage is either absent or recently bolted on as a rule pack. Sold to security leadership at mid-large companies.

Tool · Positioning · Pricing shape · Dev-led · AI-app coverage · In bench

Tool: Snyk Code
Vendor: Snyk Limited
Positioning: Enterprise SAST + SCA + container, sales-led
Notes: Listed twice: once here for the core SAST product, once above for the AI rule additions.
Pricing shape: Per-developer/seat, enterprise tier
Dev-led: org-facing
AI-app coverage: Partial AI-app coverage

Tool: Checkmarx SAST
Vendor: Checkmarx
Positioning: Enterprise SAST, mature CWE coverage, sales-led
Pricing shape: Enterprise licensing, custom quote
Dev-led: org-facing
AI-app coverage: No AI-app detection

Tool: Veracode SAST
Vendor: Veracode
Positioning: Enterprise SAST + SCA + DAST, sales-led
Pricing shape: Enterprise licensing, custom quote
Dev-led: org-facing
AI-app coverage: No AI-app detection

Tool: GitHub Advanced Security
Vendor: Microsoft / GitHub
Positioning: Native GitHub SAST (CodeQL) + secret scanning + dep review + Copilot Autofix
Pricing shape: Per-active-committer, enterprise tier
Dev-led: org-facing
AI-app coverage: No AI-app detection

Tool: Fortify Static Code Analyzer
Vendor: OpenText
Positioning: Enterprise SAST, mature coverage, on-prem option
Pricing shape: Enterprise licensing
Dev-led: org-facing
AI-app coverage: No AI-app detection

Tool: ZeroPath
Vendor: ZeroPath
Positioning: AI-native AppSec platform: SAST + SCA + secrets + IaC + dynamic testing + auto-fix
Notes: Closest in feature surface to getdebug, but has gone enterprise. Same category, different shelf.
Pricing shape: Enterprise platform
Dev-led: org-facing
AI-app coverage: Partial AI-app coverage

AI security platforms: enterprise org-facing (model + runtime + supply chain) · 5 tools

Enterprise org-facing AI security platforms — blend of model-side, runtime, and supply-chain protection. Code-side SAST coverage varies; some include partial code analysis as part of a broader platform. Sold to enterprise AI/ML programs, not to individual developers.

Tool · Positioning · Pricing shape · Dev-led · AI-app coverage · In bench

Tool: Lasso Security
Vendor: Lasso Security
Positioning: AI application security observability + runtime protection
Notes: Runtime-side + some code analysis; closer to a platform than a CI-time SAST tool.
Pricing shape: Enterprise SaaS, custom quote
Dev-led: org-facing
AI-app coverage: Partial AI-app coverage

Tool: Mindgard
Vendor: Mindgard
Positioning: AI red-team + security testing platform
Notes: Primarily model / runtime red-teaming.
Pricing shape: Enterprise SaaS
Dev-led: org-facing
AI-app coverage: Partial AI-app coverage

Tool: Protect AI platform
Vendor: Protect AI (acquired by Palo Alto Networks, 2025)
Positioning: Model + ML supply chain security platform
Notes: Includes ModelScan + NB Defense + Recon. Vulnhuntr was their open-source AI-app SAST research arm.
Pricing shape: Enterprise platform
Dev-led: org-facing
AI-app coverage: Partial AI-app coverage

Tool: CalypsoAI
Vendor: CalypsoAI
Positioning: AI security + governance platform
Pricing shape: Enterprise SaaS
Dev-led: org-facing
AI-app coverage: Partial AI-app coverage

Tool: HiddenLayer
Vendor: HiddenLayer
Positioning: Model-side AI security (model integrity, inference protection)
Notes: Model-side, not code-side. Included for landscape completeness.
Pricing shape: Enterprise SaaS
Dev-led: org-facing
AI-app coverage: Different category

AI code review (adjacent, not security) · 4 tools

AI code review tools that catch correctness regressions — "your function signature change breaks 3 callers." Different category from security SAST. Listed here for clarity, because the technical primitives (AST + reference graph indexers) overlap.

Tool · Positioning · Pricing shape · Dev-led · AI-app coverage · In bench

Tool: Macroscope
Vendor: Macroscope
Positioning: AI code reviewer with language-specific AST + reference graph indexer (10+ languages)
Notes: Technically the closest indexer primitive to deep security taint analysis, but explicitly positioned as code review / quality, not SAST.
Pricing shape: Per-developer/mo paid tier ~$30
AI-app coverage: Different category

Tool: CodeRabbit
Vendor: CodeRabbit
Positioning: AI PR reviewer with diff-focused review, summaries, and inline comments
Notes: Generic code review, not security-first. If they ship a security-first mode, the wedge narrows.
Pricing shape: Per-developer/mo paid tier + free for OSS
AI-app coverage: Different category

Tool: Cursor BugBot
Vendor: Cursor
Positioning: Cursor's AI bug-review agent for diffs (in beta)
Pricing shape: Bundled with Cursor
AI-app coverage: Different category

Tool: Greptile
Vendor: Greptile (YC W24)
Positioning: AI code review with full-repo context
Pricing shape: Per-developer/mo paid tier
AI-app coverage: Different category
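
The "signature change breaks 3 callers" case from this category's description can be illustrated with a small sketch built on Python's standard `ast` module. It is not how any listed tool works internally: the `CallIndex` class, the `broken_callers` helper, and the sample module are all hypothetical, and the check only handles direct calls by bare name within one file, ignoring `*args`, methods, and imports.

```python
import ast

# Hypothetical module under review: send() just gained a required `retries` parameter.
NEW_SOURCE = """
def send(msg, retries):
    return msg

def a():
    send("x")

def b():
    send("y")

def c():
    send("z")

def d():
    send("w", 3)
"""


class CallIndex(ast.NodeVisitor):
    """Tiny reference index: callee name -> [(enclosing function, line, args supplied)]."""

    def __init__(self) -> None:
        self.scope = ["<module>"]
        self.calls: dict[str, list[tuple[str, int, int]]] = {}

    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        # Track which function we are inside while visiting its body.
        self.scope.append(node.name)
        self.generic_visit(node)
        self.scope.pop()

    def visit_Call(self, node: ast.Call) -> None:
        # Only direct calls by bare name are indexed (a deliberate simplification).
        if isinstance(node.func, ast.Name):
            supplied = len(node.args) + len(node.keywords)
            self.calls.setdefault(node.func.id, []).append(
                (self.scope[-1], node.lineno, supplied)
            )
        self.generic_visit(node)


def required_args(func: ast.FunctionDef) -> int:
    """Count positional parameters that have no default value."""
    positional = len(func.args.posonlyargs) + len(func.args.args)
    return positional - len(func.args.defaults)


def broken_callers(source: str, name: str) -> list[tuple[str, int]]:
    """Return (caller, line) pairs that pass fewer arguments than `name` requires."""
    tree = ast.parse(source)
    target = next(
        n for n in ast.walk(tree)
        if isinstance(n, ast.FunctionDef) and n.name == name
    )
    index = CallIndex()
    index.visit(tree)
    need = required_args(target)
    return [
        (caller, line)
        for caller, line, supplied in index.calls.get(name, [])
        if supplied < need
    ]


if __name__ == "__main__":
    for caller, line in broken_callers(NEW_SOURCE, "send"):
        print(f"{caller}() at line {line} passes too few arguments to send()")
```

On the sample module this flags `a`, `b`, and `c`, while `d` already passes both arguments. A security taint analysis would walk a similar index but follow data from sources to sinks rather than compare argument counts, which is why the primitives overlap.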

AI offensive / pentesting (adjacent, different buyer) · 3 tools

AI-driven pentesting / red-teaming. Different code object in some cases (deployed app vs source), different buyer (CISO vs engineering team). Natural partnership candidates more than competitors per getdebug's competitive map.

Tool · Positioning · Pricing shape · Dev-led · AI-app coverage · In bench

Tool: Winfunc
Vendor: Winfunc (YC W24)
Positioning: AI hacker that autonomously finds, verifies, and patches vulnerabilities; "zero FP via formal verification"
Notes: Same find-verify-fix loop as getdebug; different wedge (formal verification + famous-repo findings).
Pricing shape: Enterprise
Dev-led: org-facing
AI-app coverage: Partial AI-app coverage

Tool: Hex Security
Vendor: Hex Security (YC W26)
Positioning: Agentic offensive security at scale: autonomous AI agents running continuous pentests
Notes: Different code object (deployed app vs source). Natural partnership candidate, not competitor.
Pricing shape: Enterprise
Dev-led: org-facing
AI-app coverage: Different category

Tool: Veria Labs
Vendor: Veria Labs
Positioning: Autonomous AppSec spanning code + cloud, generates PoCs against staging
Notes: Credential-led enterprise sale. Adjacent threat, not direct competitor.
Pricing shape: Enterprise
Dev-led: org-facing
AI-app coverage: Partial AI-app coverage

Secret scanners · 3 tools

Single-purpose secret detection. Designed for credential leaks, not behavioral AI-app patterns. CodeSecBench tests them on the secret-shape category for a fair like-for-like comparison.

Tool · Positioning · Pricing shape · Dev-led · AI-app coverage · In bench

Tool: gitleaks
Vendor: Open source
Positioning: Regex + entropy secret detection. Generic credential leaks across providers.
Pricing shape: MIT-licensed open source
AI-app coverage: Different category

Tool: trufflehog
Vendor: Truffle Security
Positioning: Secret detection + live-credential verification, OSS + commercial
Pricing shape: OSS + Enterprise platform
AI-app coverage: Different category

Tool: getdebug (secrets layer)
Vendor: getdebug
Positioning: Multi-purpose: secrets + dep-CVE + AI-app SAST + auto-fix. The secrets layer competes with gitleaks/trufflehog head-to-head; the AI-app SAST layer is the wedge that single-purpose secret scanners don't have.
Notes: On the AI-app-context secret category (client-side LLM keys), getdebug currently outperforms trufflehog on recall (100% vs 0% on the Section A fixtures). On generic credential leaks (Section A's leaky-repo-baseline), gitleaks and trufflehog still lead on finding count. The wedge isn't "we beat secret scanners at their job"; it's "we do secrets and AI-app SAST in one pass."
Pricing shape: Free CLI + Free/Pro/Pro Plus hosted
AI-app coverage: Built for AI-app patterns
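
For readers unfamiliar with the "regex + entropy" approach named in the gitleaks row, here is a minimal sketch of the general idea. It does not reproduce the rules of gitleaks or any other listed tool: the `CANDIDATE` pattern, the `3.5` entropy threshold, and the sample strings are all assumptions chosen for illustration. A regex first finds quoted values assigned to key-like names, then a Shannon-entropy check discards low-randomness placeholders.

```python
import math
import re
from collections import Counter

# Hypothetical pattern: a name containing key/token/secret, assigned a quoted
# value of 20 or more key-like characters. Real rule sets are far more specific.
CANDIDATE = re.compile(
    r"""(?i)(\w*(?:key|token|secret)\w*)\s*[:=]\s*["']([A-Za-z0-9_\-]{20,})["']"""
)


def shannon_entropy(value: str) -> float:
    """Bits of entropy per character, based on character frequencies."""
    counts = Counter(value)
    total = len(value)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def find_secrets(text: str, min_entropy: float = 3.5) -> list[tuple[int, str]]:
    """Return (line number, variable name) for each high-entropy candidate."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in CANDIDATE.finditer(line):
            name, value = match.group(1), match.group(2)
            if shannon_entropy(value) >= min_entropy:
                hits.append((lineno, name))
    return hits


# Made-up client-side snippet: one random-looking value, one obvious placeholder.
SAMPLE = (
    'const apiKey = "Zx9Qw3Lm7Tb2Vc8Rk5Np1Hd4";\n'
    'const placeholderKey = "' + "a" * 24 + '";\n'
)

if __name__ == "__main__":
    for lineno, name in find_secrets(SAMPLE):
        print(f"line {lineno}: possible hard-coded secret in {name}")
```

The entropy filter is what keeps the placeholder on the second line from being reported. It says nothing about whether the file ships to a browser, which is the extra context the AI-app-context secret category depends on.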

Python SAST · 1 tool

Python-only SAST baselines. CWE-aware, not AI-app-aware.

Tool · Positioning · Pricing shape · Dev-led · AI-app coverage · In bench

Tool: bandit
Vendor: PyCQA
Positioning: Python-only SAST, mature CWE-based rules
Pricing shape: Apache-2.0 open source
AI-app coverage: No AI-app detection

Runtime AI guard (adjacent, not SAST) · 2 tools

Runtime defenses in front of model calls. Adjacent to SAST but a different layer — catches attacks at request time, not pre-deploy.

Tool · Positioning · Pricing shape · Dev-led · AI-app coverage · In bench

Tool: Lakera Guard
Vendor: Lakera
Positioning: Runtime prompt-injection / abuse filtering
Pricing shape: Per-request / enterprise
Dev-led: org-facing
AI-app coverage: Different category

Tool: Robust Intelligence
Vendor: Robust Intelligence (acquired by Cisco, 2024)
Positioning: Model + runtime AI security
Pricing shape: Enterprise
Dev-led: org-facing
AI-app coverage: Different category

Eval framework (adjacent, not SAST) · 3 tools

Pre-deploy evaluation and red-teaming frameworks. Test model behavior; do not analyze application code.

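Tool · Positioning · Pricing shape · Dev-led · AI-app coverage · In bench

Tool: Promptfoo
Vendor: Promptfoo
Positioning: Prompt eval + red-teaming framework
Pricing shape: OSS + Enterprise
AI-app coverage: Different category

Tool: Garak
Vendor: NVIDIA
Positioning: LLM vulnerability scanner (model-side eval)
Pricing shape: Apache-2.0 open source
AI-app coverage: Different category

Tool: Patronus AI
Vendor: Patronus AI
Positioning: LLM eval + safety testing platform
Pricing shape: SaaS
Dev-led: org-facing
AI-app coverage: Different category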

A call for more AI-app-native SAST

The AI-app SAST category — code-side static analysis built specifically for AI-app patterns — is young. Gecko has a strong dev-led SAST primitive but doesn't ship AI-app-specific rules. Snyk has begun adding AI rules but only in their enterprise tier. Vulnhuntr is research-stage. The category needs more entrants — generic SAST vendors adding AI-app rule packs, new dev-first tools, anything that grades against the same public corpus.

The corpus, the truth files, and the score.js harness go public once Tier C lands its final two repositories (cycles 5 and 6 in flight). The methodology and current scoring rules are already published: see /methodology and the multi-maintainer model on /governance. Once the public release lands, the submit-a-tool flow will live here.
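
Until the harness is public, a rough sketch may help readers interpret recall figures such as the 100% vs 0% quoted in the secret scanners section. It is not the score.js harness and does not reflect its actual file formats or matching rules: the `Finding` shape, the category names, and the file paths below are hypothetical. Recall here is simply the share of expected findings in a truth set that a tool also reported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding identity: which file, which category."""
    file: str
    category: str


def recall(truth: set[Finding], reported: set[Finding]) -> float:
    """Fraction of expected findings that the tool reported."""
    if not truth:
        raise ValueError("truth set is empty; recall is undefined")
    return len(truth & reported) / len(truth)


# Made-up truth file for one fixture and two made-up tool outputs.
TRUTH = {
    Finding("web/chat.tsx", "client-side-llm-key"),
    Finding("web/settings.tsx", "client-side-llm-key"),
}
TOOL_A = TRUTH | {Finding("web/unused.tsx", "client-side-llm-key")}  # finds both, plus one extra
TOOL_B = {Finding("config/.env", "generic-credential")}  # finds neither expected item

if __name__ == "__main__":
    print(f"tool A recall: {recall(TRUTH, TOOL_A):.0%}")
    print(f"tool B recall: {recall(TRUTH, TOOL_B):.0%}")
```

Note that recall alone ignores extra findings: tool A's unexpected report does not lower its score here, so a full harness would need a precision or false-positive measure alongside it.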