Landscape · 2026
AI app security tooling, mapped honestly.
The AI security tools market split into clear quadrants in 2026. Two axes matter for positioning: buyer motion (enterprise sales-led vs. dev-led self-serve) and product angle (offensive "AI hacker" vs. continuous SAST vs. AI code review).
Most companies pick a quadrant. The picks reveal who they're really selling to. CodeSecBench grades the SAST quadrant specifically — pre-deploy static analysis tools — and lets adjacent tools (runtime, eval, code review) sit beside it for landscape clarity.
The 2×2 — buyer motion × product angle
Where each tool sits on the two axes that matter for AI security tooling.
| | Enterprise sales-led | Dev-led / self-serve |
|---|---|---|
| SAST scan + fix | Snyk Code, Checkmarx, Veracode, Fortify, GitHub Advanced Security, ZeroPath | getdebug, Gecko Security, Semgrep, SonarQube |
| Offensive "AI hacker" | Hex Security, Veria Labs, HackerOne (legacy) | Winfunc (small team) |
| Code review "AI reviewer" | — | Macroscope, CodeRabbit, Cursor BugBot, Greptile |
The dev-led SAST quadrant got its first dedicated AI-SAST competitor in 2024 — Gecko Security. Gecko is generic SAST (no AI-app-specific patterns out of the box); getdebug is AI-app-native. Both are in the same buyer quadrant. The category is not empty — it's young.
Tools, by category
Dev-led = installable without a sales call, free or low-cost tier, dev-time integration. In bench = currently scored on /results. AI-app coverage is the maintainer's best read of the tool's documentation as of 2026 — disputes welcome via PR.
AI-app SAST — code-side, AI-app-native (the narrow category this benchmark is for) 3 tools
Code-side SAST built for AI-app patterns. As of 2026, this category is sparsely populated. getdebug ships full per-category coverage in a free CLI + hosted tier. Snyk has begun adding AI-related rules to its enterprise tier. Vulnhuntr from Protect AI is research-stage and failed to run on the current Python/Node stack in CodeSecBench's June 2026 calibration. CodeSecBench is partly a public scoreboard for this category and partly a call for more entrants, especially generic SAST vendors who add AI-app rule packs.
| Tool | Positioning | Pricing shape | Dev-led | AI-app coverage | In bench |
|---|---|---|---|---|---|
| getdebug getdebug | AI-app SAST across six behavioral categories + secrets + dep-CVE + auto-fix. Free CLI with true local mode. | Free CLI + Free/Pro/Pro Plus hosted | ✓ | Built for AI-app patterns | ✓ |
| Snyk Code (AI rules) Snyk Limited | Recent AI-related rule additions inside the enterprise Snyk Code SAST product. Coverage breadth not publicly benchmarked. Embedded in the broader Snyk Code product, not a standalone tool. | Per-developer/seat, enterprise tier | org-facing | Partial AI-app coverage | — |
| Vulnhuntr Protect AI (acquired by Palo Alto Networks) | LLM-powered AI-app vulnerability scanner, research-stage. CodeSecBench attempted to run it in its June 2026 calibration, and it failed on the current Python/Node stack. Maintenance status uncertain after the Protect AI acquisition. | AGPL-licensed open source | ✓ | Partial AI-app coverage | — |
Dev-led generic SAST — closest competitive quadrant for AI-app-native tools 3 tools
Dev-led generic SAST is the quadrant getdebug sits in. Same buyer motion (no sales call, free tier, dev-time integration), same product shape (scan + fix in the loop), but generic CWE coverage rather than AI-app patterns. Gecko Security (YC F24) is the closest direct competitor; Semgrep + SonarQube round out the field.
| Tool | Positioning | Pricing shape | Dev-led | AI-app coverage | In bench |
|---|---|---|---|---|---|
| Gecko Security Gecko (YC F24) | Dev-led AI-SAST with a compiler-accurate semantic call-chain indexer + auto-fix PR bot. Generic SAST, not AI-app-specific. Closest direct competitor in the dev-led SAST quadrant. Strong taint-analysis primitive; published 30+ CVEs in well-known open-source projects. No CLI as of late 2026 per their site; on-prem is Enterprise-tier only. | Free (10 scans + 1 public repo) · Pro $100/mo · Enterprise | ✓ | No AI-app detection | — |
| Semgrep Semgrep Inc. | Pattern-matching SAST, community + paid platform. Community rule packs touch some AI-app patterns; coverage depends on which packs are enabled. CodeSecBench scores Semgrep on its default community rules. | Open core; paid tier per-dev/mo | ✓ | Partial AI-app coverage | ✓ |
| SonarQube Sonar | Code quality + SAST, community + commercial | Community + Developer/Enterprise tiers | ✓ | No AI-app detection | — |
Enterprise SAST — org-facing 6 tools
Mature SAST suites built before AI-app patterns existed. Strong on classic CWE coverage; AI-app coverage is either absent or recently bolted on as a rule pack. Sold to security leadership at mid-size and large companies.
| Tool | Positioning | Pricing shape | Dev-led | AI-app coverage | In bench |
|---|---|---|---|---|---|
| Snyk Code Snyk Limited | Enterprise SAST + SCA + container, sales-led. Listed twice: once here for the core SAST product, once above for the AI rule additions. | Per-developer/seat, enterprise tier | org-facing | Partial AI-app coverage | — |
| Checkmarx SAST Checkmarx | Enterprise SAST, mature CWE coverage, sales-led | Enterprise licensing, custom quote | org-facing | No AI-app detection | — |
| Veracode SAST Veracode | Enterprise SAST + SCA + DAST, sales-led | Enterprise licensing, custom quote | org-facing | No AI-app detection | — |
| GitHub Advanced Security Microsoft / GitHub | Native GitHub SAST (CodeQL) + secret scanning + dep review + Copilot Autofix | Per-active-committer, enterprise tier | org-facing | No AI-app detection | — |
| Fortify Static Code Analyzer OpenText | Enterprise SAST, mature coverage, on-prem option | Enterprise licensing | org-facing | No AI-app detection | — |
| ZeroPath ZeroPath | AI-native AppSec platform: SAST + SCA + secrets + IaC + dynamic testing + auto-fix. Closest in feature surface to getdebug, but has gone enterprise. Same category, different shelf. | Enterprise platform | org-facing | Partial AI-app coverage | — |
AI security platforms — enterprise org-facing (model + runtime + supply chain) 5 tools
Enterprise org-facing AI security platforms — blend of model-side, runtime, and supply-chain protection. Code-side SAST coverage varies; some include partial code analysis as part of a broader platform. Sold to enterprise AI/ML programs, not to individual developers.
| Tool | Positioning | Pricing shape | Dev-led | AI-app coverage | In bench |
|---|---|---|---|---|---|
| Lasso Security Lasso Security | AI application security observability + runtime protection. Runtime-side + some code analysis; closer to a platform than a CI-time SAST tool. | Enterprise SaaS, custom quote | org-facing | Partial AI-app coverage | — |
| Mindgard Mindgard | AI red-team + security testing platform. Primarily model / runtime red-teaming. | Enterprise SaaS | org-facing | Partial AI-app coverage | — |
| Protect AI platform Protect AI (acquired by Palo Alto Networks, 2025) | Model + ML supply chain security platform. Includes ModelScan + NB Defense + Recon. Vulnhuntr was their open-source AI-app SAST research arm. | Enterprise platform | org-facing | Partial AI-app coverage | — |
| CalypsoAI CalypsoAI | AI security + governance platform | Enterprise SaaS | org-facing | Partial AI-app coverage | — |
| HiddenLayer HiddenLayer | Model-side AI security (model integrity, inference protection). Model-side, not code-side. Included for landscape completeness. | Enterprise SaaS | org-facing | Different category | — |
AI code review (adjacent — not security) 4 tools
AI code review tools that catch correctness regressions — "your function signature change breaks 3 callers." Different category from security SAST. Listed here for clarity, because the technical primitives (AST + reference graph indexers) overlap.
| Tool | Positioning | Pricing shape | Dev-led | AI-app coverage | In bench |
|---|---|---|---|---|---|
| Macroscope Macroscope | AI code reviewer with language-specific AST + reference graph indexer (10+ languages). Technically the closest indexer primitive to deep security taint analysis, but explicitly positioned as code review / quality, not SAST. | Per-developer/mo paid tier ~$30 | ✓ | Different category | — |
| CodeRabbit CodeRabbit | AI PR reviewer with diff-focused review, summaries, and inline comments. Generic code review, not security-first. If they ship a security-first mode, the wedge narrows. | Per-developer/mo paid tier + free for OSS | ✓ | Different category | — |
| Cursor BugBot Cursor | Cursor's AI bug-review agent for diffs (in beta) | Bundled with Cursor | ✓ | Different category | — |
| Greptile Greptile (YC W24) | AI code review with full-repo context | Per-developer/mo paid tier | ✓ | Different category | — |
AI offensive / pentesting (adjacent — different buyer) 3 tools
AI-driven pentesting / red-teaming. Different code object in some cases (deployed app vs source), different buyer (CISO vs engineering team). Per getdebug's competitive map, these are natural partnership candidates more than competitors.
| Tool | Positioning | Pricing shape | Dev-led | AI-app coverage | In bench |
|---|---|---|---|---|---|
| Winfunc Winfunc (YC W24) | AI hacker that autonomously finds, verifies, and patches vulnerabilities; "zero FP via formal verification". Same find-verify-fix loop as getdebug; different wedge (formal verification + famous-repo findings). | Enterprise | org-facing | Partial AI-app coverage | — |
| Hex Security Hex Security (YC W26) | Agentic offensive security at scale: autonomous AI agents running continuous pentests. Different code object (deployed app vs source). Natural partnership candidate, not competitor. | Enterprise | org-facing | Different category | — |
| Veria Labs Veria Labs | Autonomous AppSec spanning code + cloud, generates PoCs against staging. Credential-led enterprise sale. Adjacent threat, not direct competitor. | Enterprise | org-facing | Partial AI-app coverage | — |
Secret scanners 3 tools
Single-purpose secret detection. Designed for credential leaks, not behavioral AI-app patterns. CodeSecBench tests them on the secret-shape category for a fair like-for-like comparison.
| Tool | Positioning | Pricing shape | Dev-led | AI-app coverage | In bench |
|---|---|---|---|---|---|
| gitleaks Open source | Regex + entropy secret detection. Generic credential leaks across providers. | MIT-licensed open source | ✓ | Different category | ✓ |
| trufflehog Truffle Security | Secret detection + live-credential verification, OSS + commercial | OSS + Enterprise platform | ✓ | Different category | ✓ |
| getdebug (secrets layer) getdebug | Multi-purpose: secrets + dep-CVE + AI-app SAST + auto-fix. The secrets layer competes with gitleaks/trufflehog head-to-head; the AI-app SAST layer is the wedge that single-purpose secret scanners don't have. On the AI-app-context secret category (client-side LLM keys), getdebug currently outperforms trufflehog on recall (100% vs 0% on the Section A fixtures). On generic credential leaks (Section A's leaky-repo-baseline), gitleaks and trufflehog still lead on finding count. The wedge isn't "we beat secret scanners at their job" — it's "we do secrets and AI-app SAST in one pass." | Free CLI + Free/Pro/Pro Plus hosted | ✓ | Built for AI-app patterns | ✓ |
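
The getdebug (secrets layer) row contrasts recall on a fixture set with raw finding count, and the two can point in opposite directions. The sketch below is a minimal illustration of that difference, not the score.js harness: the `Finding` shape, the file paths, the rule names, and the exact-match rule on (path, line, rule) are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One flagged location; frozen so findings are hashable and can live in sets."""
    path: str
    line: int
    rule: str


def recall(expected: set[Finding], reported: set[Finding]) -> float:
    """Fraction of ground-truth findings that the tool also reported."""
    if not expected:
        raise ValueError("recall is undefined without expected findings")
    return len(expected & reported) / len(expected)


# Hypothetical truth file: two client-side LLM keys that should be flagged.
expected = {
    Finding("web/app.tsx", 12, "client-side-llm-key"),
    Finding("web/chat.tsx", 40, "client-side-llm-key"),
}

# Hypothetical tool A reports exactly the two expected findings.
tool_a = set(expected)

# Hypothetical tool B reports three other secrets but neither expected one.
tool_b = {
    Finding("config/.env", 3, "generic-api-key"),
    Finding("config/.env", 4, "generic-api-key"),
    Finding("scripts/deploy.sh", 9, "aws-access-key"),
}

print(recall(expected, tool_a), len(tool_a))  # 1.0 2
print(recall(expected, tool_b), len(tool_b))  # 0.0 3
```

In this toy data, tool B has the higher finding count but zero recall on the fixture category, which is why the two numbers are reported separately. A real harness would likely need looser matching than exact equality, since tools name rules and report line numbers differently.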
Python SAST 1 tool
Python-only SAST baselines. CWE-aware, not AI-app-aware.
| Tool | Positioning | Pricing shape | Dev-led | AI-app coverage | In bench |
|---|---|---|---|---|---|
| bandit PyCQA | Python-only SAST, mature CWE-based rules | Apache-2.0 open source | ✓ | No AI-app detection | ✓ |
Runtime AI guard (adjacent — not SAST) 2 tools
Runtime defenses in front of model calls. Adjacent to SAST but a different layer — catches attacks at request time, not pre-deploy.
| Tool | Positioning | Pricing shape | Dev-led | AI-app coverage | In bench |
|---|---|---|---|---|---|
| Lakera Guard Lakera | Runtime prompt-injection / abuse filtering | Per-request / enterprise | org-facing | Different category | — |
| Robust Intelligence Robust Intelligence (acquired by Cisco, 2024) | Model + runtime AI security | Enterprise | org-facing | Different category | — |
Eval framework (adjacent — not SAST) 3 tools
Pre-deploy evaluation and red-teaming frameworks. Test model behavior; do not analyze application code.
| Tool | Positioning | Pricing shape | Dev-led | AI-app coverage | In bench |
|---|---|---|---|---|---|
| Promptfoo Promptfoo | Prompt eval + red-teaming framework | OSS + Enterprise | ✓ | Different category | — |
| Garak NVIDIA | LLM vulnerability scanner (model-side eval) | Apache-2.0 open source | ✓ | Different category | — |
| Patronus AI Patronus AI | LLM eval + safety testing platform | SaaS | org-facing | Different category | — |
A call for more AI-app-native SAST
The AI-app SAST category — code-side static analysis built specifically for AI-app patterns — is young. Gecko has a strong dev-led SAST primitive but doesn't ship AI-app-specific rules. Snyk has begun adding AI rules but only in their enterprise tier. Vulnhuntr is research-stage. The category needs more entrants — generic SAST vendors adding AI-app rule packs, new dev-first tools, anything that grades against the same public corpus.
The corpus, the truth files, and the score.js harness go public once Tier C lands its final two repositories (cycles 5 and 6 in flight). The methodology and current scoring rules are already published: see /methodology and the multi-maintainer model on /governance. Once the public release lands, this page will host the submit-a-tool flow.