Landscape · 2026

AI app security tooling, mapped honestly.

The AI security tools market split into clear quadrants in 2026. Two axes matter for positioning: buyer motion (enterprise sales-led vs. dev-led self-serve) and product angle (offensive "AI hacker" vs. continuous SAST vs. AI code review).

Most companies pick a quadrant. The picks reveal who they're really selling to. CodeSecBench grades the SAST quadrant specifically — pre-deploy static analysis tools — and lets adjacent tools (runtime, eval, code review) sit beside it for landscape clarity.

The grid: buyer motion × product angle

Where each tool sits on the two axes that matter for AI security tooling.

	Enterprise sales-led	Dev-led / self-serve
SAST scan + fix	Snyk Code, Checkmarx, Veracode, Fortify, GitHub Advanced Security, ZeroPath	getdebug, Gecko Security, Semgrep, SonarQube
Offensive "AI hacker"	Hex Security, Veria Labs, HackerOne (legacy)	Winfunc (small team)
Code review "AI reviewer"	—	Macroscope, CodeRabbit, Cursor BugBot, Greptile

The dev-led SAST quadrant got its first dedicated AI-SAST competitor in 2024 — Gecko Security. Gecko is generic SAST (no AI-app-specific patterns out of the box); getdebug is AI-app-native. Both are in the same buyer quadrant. The category is not empty — it's young.

Tools, by category

Dev-led = installable without a sales call, free or low-cost tier, dev-time integration. In bench = currently scored on /results. AI-app coverage is the maintainer's best read of the tool's documentation as of 2026 — disputes welcome via PR.
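The dev-led definition above reads as a three-part test, which can be sketched as a small predicate. This is only an illustration of the legend, not the maintainer's tooling: the `Tool` fields, the `is_dev_led` function, and both example entries are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical record holding the three legend criteria for one tool."""
    name: str
    needs_sales_call: bool           # must a buyer talk to sales before installing?
    has_free_or_low_cost_tier: bool  # is there a free or low-cost tier?
    integrates_at_dev_time: bool     # does it run in the developer's own loop?


def is_dev_led(tool: Tool) -> bool:
    """A tool counts as dev-led only when all three legend criteria hold."""
    return (
        not tool.needs_sales_call
        and tool.has_free_or_low_cost_tier
        and tool.integrates_at_dev_time
    )


# Made-up entries for illustration; they are not rows from the tables.
example_cli = Tool("example-cli", needs_sales_call=False,
                   has_free_or_low_cost_tier=True, integrates_at_dev_time=True)
example_suite = Tool("example-suite", needs_sales_call=True,
                     has_free_or_low_cost_tier=False, integrates_at_dev_time=True)

print(is_dev_led(example_cli))    # True
print(is_dev_led(example_suite))  # False
```

Treating the criteria as a strict conjunction is an assumption; the legend does not say how a tool that meets only two of the three is classified.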

AI-app SAST: code-side, AI-app-native (the narrow category this benchmark is for) · 3 tools

Code-side SAST built for AI-app patterns. As of 2026, this category is sparsely populated. getdebug ships full per-category coverage in a free CLI + hosted tier. Snyk has begun adding AI-related rules to its enterprise tier. Vulnhuntr from Protect AI is research-stage and failed to run on the current Python/Node stack in CodeSecBench's June 2026 calibration. CodeSecBench is partly a public scoreboard for this category and partly a call for more entrants, especially generic SAST vendors who add AI-app rule packs.

Tool	Positioning	Pricing shape	Dev-led	AI-app coverage	In bench
getdebug getdebug	AI-app SAST across six behavioral categories + secrets + dep-CVE + auto-fix. Free CLI with true local mode.	Free CLI + Free/Pro/Pro Plus hosted	✓	Built for AI-app patterns	✓
Snyk Code (AI rules) Snyk Limited	Recent AI-related rule additions inside the enterprise Snyk Code SAST product. Coverage breadth not publicly benchmarked. Embedded in the broader Snyk Code product, not a standalone tool.	Per-developer/seat, enterprise tier	org-facing	Partial AI-app coverage	—
Vulnhuntr Protect AI (acquired by Palo Alto Networks)	LLM-powered AI-app vulnerability scanner, research-stage. CodeSecBench attempted to run it in its June 2026 calibration, and it failed on the current Python/Node stack. Maintenance status is uncertain after the Protect AI acquisition.	AGPL-licensed open source	✓	Partial AI-app coverage	—

Dev-led generic SAST: closest competitive quadrant for AI-app-native tools · 3 tools

Dev-led generic SAST is the quadrant getdebug sits in. Same buyer motion (no sales call, free tier, dev-time integration), same product shape (scan + fix in the loop), but generic CWE coverage rather than AI-app patterns. Gecko Security (YC F24) is the closest direct competitor; Semgrep + SonarQube round out the field.

Tool	Positioning	Pricing shape	Dev-led	AI-app coverage	In bench
Gecko Security Gecko (YC F24)	Dev-led AI-SAST with a compiler-accurate semantic call-chain indexer + auto-fix PR bot. Generic SAST, not AI-app-specific. Closest direct competitor in the dev-led SAST quadrant. Strong taint-analysis primitive; published 30+ CVEs in well-known open-source projects. No CLI as of late 2026 per their site; on-prem is Enterprise-tier only.	Free (10 scans + 1 public repo) · Pro $100/mo · Enterprise	✓	No AI-app detection	—
Semgrep Semgrep Inc.	Pattern-matching SAST, community + paid platform. Community rule packs touch some AI-app patterns; coverage depends on which packs are enabled. CodeSecBench scores Semgrep on its default community rules.	Open core; paid tier per-dev/mo	✓	Partial AI-app coverage	✓
SonarQube Sonar	Code quality + SAST, community + commercial	Community + Developer/Enterprise tiers	✓	No AI-app detection	—

Enterprise SAST: org-facing · 6 tools

Mature SAST suites built before AI-app patterns existed. Strong on classic CWE coverage; AI-app coverage is either absent or recently bolted on as a rule pack. Sold to security leadership at mid-large companies.

Tool	Positioning	Pricing shape	Dev-led	AI-app coverage	In bench
Snyk Code Snyk Limited	Enterprise SAST + SCA + container, sales-led. Listed twice: once here for the core SAST product, once above for the AI rule additions.	Per-developer/seat, enterprise tier	org-facing	Partial AI-app coverage	—
Checkmarx SAST Checkmarx	Enterprise SAST, mature CWE coverage, sales-led	Enterprise licensing, custom quote	org-facing	No AI-app detection	—
Veracode SAST Veracode	Enterprise SAST + SCA + DAST, sales-led	Enterprise licensing, custom quote	org-facing	No AI-app detection	—
GitHub Advanced Security Microsoft / GitHub	Native GitHub SAST (CodeQL) + secret scanning + dep review + Copilot Autofix	Per-active-committer, enterprise tier	org-facing	No AI-app detection	—
Fortify Static Code Analyzer OpenText	Enterprise SAST, mature coverage, on-prem option	Enterprise licensing	org-facing	No AI-app detection	—
ZeroPath ZeroPath	AI-native AppSec platform: SAST + SCA + secrets + IaC + dynamic testing + auto-fix. Closest in feature surface to getdebug, but has gone enterprise. Same category, different shelf.	Enterprise platform	org-facing	Partial AI-app coverage	—

AI security platforms: enterprise org-facing (model + runtime + supply chain) · 5 tools

Enterprise org-facing AI security platforms — blend of model-side, runtime, and supply-chain protection. Code-side SAST coverage varies; some include partial code analysis as part of a broader platform. Sold to enterprise AI/ML programs, not to individual developers.

Tool	Positioning	Pricing shape	Dev-led	AI-app coverage	In bench
Lasso Security Lasso Security	AI application security observability + runtime protection. Runtime-side + some code analysis; closer to a platform than a CI-time SAST tool.	Enterprise SaaS, custom quote	org-facing	Partial AI-app coverage	—
Mindgard Mindgard	AI red-team + security testing platform. Primarily model / runtime red-teaming.	Enterprise SaaS	org-facing	Partial AI-app coverage	—
Protect AI platform Protect AI (acquired by Palo Alto Networks, 2025)	Model + ML supply chain security platform. Includes ModelScan + NB Defense + Recon. Vulnhuntr was their open-source AI-app SAST research arm.	Enterprise platform	org-facing	Partial AI-app coverage	—
CalypsoAI CalypsoAI	AI security + governance platform	Enterprise SaaS	org-facing	Partial AI-app coverage	—
HiddenLayer HiddenLayer	Model-side AI security (model integrity, inference protection). Model-side, not code-side. Included for landscape completeness.	Enterprise SaaS	org-facing	Different category	—

AI code review (adjacent, not security) · 4 tools

AI code review tools that catch correctness regressions — "your function signature change breaks 3 callers." Different category from security SAST. Listed here for clarity, because the technical primitives (AST + reference graph indexers) overlap.

Tool	Positioning	Pricing shape	Dev-led	AI-app coverage	In bench
Macroscope Macroscope	AI code reviewer with language-specific AST + reference graph indexer (10+ languages). Technically the closest indexer primitive to deep security taint analysis, but explicitly positioned as code review / quality, not SAST.	Per-developer/mo paid tier ~$30	✓	Different category	—
CodeRabbit CodeRabbit	AI PR reviewer with diff-focused review, summaries, and inline comments. Generic code review, not security-first. If they ship a security-first mode, the wedge narrows.	Per-developer/mo paid tier + free for OSS	✓	Different category	—
Cursor BugBot Cursor	Cursor's AI bug-review agent for diffs (in beta)	Bundled with Cursor	✓	Different category	—
Greptile Greptile (YC W24)	AI code review with full-repo context	Per-developer/mo paid tier	✓	Different category	—

AI offensive / pentesting (adjacent, different buyer) · 3 tools

AI-driven pentesting / red-teaming. Different code object in some cases (deployed app vs source), different buyer (CISO vs engineering team). Per getdebug's competitive map, these are natural partnership candidates more than competitors.

Tool	Positioning	Pricing shape	Dev-led	AI-app coverage	In bench
Winfunc Winfunc (YC W24)	AI hacker that autonomously finds, verifies, and patches vulnerabilities; "zero FP via formal verification". Same find-verify-fix loop as getdebug; different wedge (formal verification + famous-repo findings).	Enterprise	org-facing	Partial AI-app coverage	—
Hex Security Hex Security (YC W26)	Agentic offensive security at scale: autonomous AI agents running continuous pentests. Different code object (deployed app vs source). Natural partnership candidate, not competitor.	Enterprise	org-facing	Different category	—
Veria Labs Veria Labs	Autonomous AppSec spanning code + cloud, generates PoCs against staging. Credential-led enterprise sale. Adjacent threat, not direct competitor.	Enterprise	org-facing	Partial AI-app coverage	—

Secret scanners · 3 tools

Single-purpose secret detection. Designed for credential leaks, not behavioral AI-app patterns. CodeSecBench tests them on the secret-shape category for a fair like-for-like comparison.
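For readers unfamiliar with the regex + entropy approach named in the gitleaks row, here is a minimal sketch of the general technique. It is not the rule set of gitleaks or any other listed tool: the `CANDIDATE` pattern, the 3.5-bit threshold, and the sample values are all assumptions chosen for illustration, and the sample key is a made-up dummy string.

```python
import math
import re
from collections import Counter

# Hypothetical rule: a key-like name assigned a quoted token of 20+ characters.
CANDIDATE = re.compile(
    r"(?:api[_-]?key|secret|token)\s*[:=]\s*['\"]([A-Za-z0-9_-]{20,})['\"]",
    re.IGNORECASE,
)


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_secrets(source: str, min_entropy: float = 3.5) -> list[str]:
    """Return regex matches whose token looks random enough to be a credential."""
    return [
        match.group(1)
        for match in CANDIDATE.finditer(source)
        if shannon_entropy(match.group(1)) >= min_entropy
    ]


# Dummy input: one random-looking token and one low-entropy placeholder.
sample = '''
API_KEY = "q8Zt3LmW0xVb7RkYp2NsJd5HfGc1Ae9U"
token = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
'''
print(find_secrets(sample))  # ['q8Zt3LmW0xVb7RkYp2NsJd5HfGc1Ae9U']
```

The regex narrows the search to credential-shaped assignments, and the entropy filter drops placeholders such as repeated characters. Neither step looks at how the key is used, which is why this style of scanner targets credential leaks and not behavioral AI-app patterns.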

Tool	Positioning	Pricing shape	Dev-led	AI-app coverage	In bench
gitleaks Open source	Regex + entropy secret detection. Generic credential leaks across providers.	MIT-licensed open source	✓	Different category	✓
trufflehog Truffle Security	Secret detection + live-credential verification, OSS + commercial	OSS + Enterprise platform	✓	Different category	✓
getdebug (secrets layer) getdebug	Multi-purpose: secrets + dep-CVE + AI-app SAST + auto-fix. The secrets layer competes with gitleaks/trufflehog head-to-head; the AI-app SAST layer is the wedge that single-purpose secret scanners don't have. On the AI-app-context secret category (client-side LLM keys), getdebug currently outperforms trufflehog on recall (100% vs 0% on the Section A fixtures). On generic credential leaks (Section A's leaky-repo-baseline), gitleaks and trufflehog still lead on finding count. The wedge isn't "we beat secret scanners at their job" — it's "we do secrets and AI-app SAST in one pass."	Free CLI + Free/Pro/Pro Plus hosted	✓	Built for AI-app patterns	✓
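The recall figures quoted in the getdebug row can be read against the standard definition of recall, sketched below. This is not the CodeSecBench score.js harness, whose internals are not described here; the fixture IDs, tool outputs, and the `recall` helper are hypothetical.

```python
def recall(expected: set[str], reported: set[str]) -> float:
    """Fraction of expected findings that the tool actually reported."""
    if not expected:
        raise ValueError("recall is undefined when there are no expected findings")
    return len(expected & reported) / len(expected)


# Hypothetical truth set and tool outputs, for illustration only.
truth = {"fixture-1:client-key", "fixture-2:client-key"}
tool_a = {"fixture-1:client-key", "fixture-2:client-key", "fixture-2:extra"}
tool_b = {"fixture-9:unrelated"}

print(recall(truth, tool_a))  # 1.0
print(recall(truth, tool_b))  # 0.0
```

Recall ignores findings outside the truth set, so a tool can lead on raw finding count in one category and still score 0% recall in another. That is consistent with the row reporting both measures separately.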

Python SAST · 1 tool

Python-only SAST baselines. CWE-aware, not AI-app-aware.

Tool	Positioning	Pricing shape	Dev-led	AI-app coverage	In bench
bandit PyCQA	Python-only SAST, mature CWE-based rules	Apache-2.0-licensed open source	✓	No AI-app detection	✓

Runtime AI guard (adjacent, not SAST) · 2 tools

Runtime defenses in front of model calls. Adjacent to SAST but a different layer — catches attacks at request time, not pre-deploy.

Tool	Positioning	Pricing shape	Dev-led	AI-app coverage	In bench
Lakera Guard Lakera	Runtime prompt-injection / abuse filtering	Per-request / enterprise	org-facing	Different category	—
Robust Intelligence Robust Intelligence (acquired by Cisco, 2024)	Model + runtime AI security	Enterprise	org-facing	Different category	—

Eval framework (adjacent, not SAST) · 3 tools

Pre-deploy evaluation and red-teaming frameworks. Test model behavior; do not analyze application code.

Tool	Positioning	Pricing shape	Dev-led	AI-app coverage	In bench
Promptfoo Promptfoo	Prompt eval + red-teaming framework	OSS + Enterprise	✓	Different category	—
Garak NVIDIA	LLM vulnerability scanner (model-side eval)	Apache-2.0 open source	✓	Different category	—
Patronus AI Patronus AI	LLM eval + safety testing platform	SaaS	org-facing	Different category	—

A call for more AI-app-native SAST

The AI-app SAST category — code-side static analysis built specifically for AI-app patterns — is young. Gecko has a strong dev-led SAST primitive but doesn't ship AI-app-specific rules. Snyk has begun adding AI rules but only in their enterprise tier. Vulnhuntr is research-stage. The category needs more entrants — generic SAST vendors adding AI-app rule packs, new dev-first tools, anything that grades against the same public corpus.

The corpus, the truth files, and the score.js harness go public once Tier C lands its final two repositories (cycles 5 and 6 in flight). The methodology and current scoring rules are already published; see /methodology and the multi-maintainer model on /governance. Once the public release lands, the submit-a-tool flow will live on this page.
