Claude Code is Anthropic's CLI tool that lets you build entire features by describing them in natural language. It writes clean, well-structured code fast — and that speed is exactly the problem. Claude Code's output looks production-ready but consistently contains overly permissive security defaults, placeholder error handling that silently swallows failures, missing rate limiting, and dependencies with known CVEs. These patterns pass linting, pass CI, and break in production. This guide covers what Claude Code gets right, what it reliably gets wrong, and how to review its output before shipping.
Claude Code is genuinely good at certain categories of work. It excels at scaffolding project architecture — setting up folder structures, configuring build tools, and creating boilerplate that would otherwise take hours. It handles refactoring well: renaming variables across files, extracting functions, reorganizing modules. It writes clear documentation and generates solid first drafts of API endpoints, database schemas, and test suites.
The problem is not that Claude Code writes bad code. The problem is that it writes confident code that optimizes for "make it work in development" rather than "make it safe in production." And because the output is clean, well-formatted, and often passes linting on the first try, developers trust it more than they should.
The security risks of Claude-generated code are not hypothetical. Claude Code itself has had confirmed vulnerabilities: CVE-2025-59536 (CVSS 8.7, code injection) and CVE-2026-21852 (CVSS 5.3, information disclosure that could exfiltrate API keys). Both were patched by Anthropic. In March 2026, Anthropic accidentally published a debugging JavaScript sourcemap for Claude Code v2.1.88 to npm — exposing 512,000 lines of TypeScript across 1,900 files. Days later, Adversa AI disclosed a critical permission bypass: deny rules, security validators, and command injection detection could all be skipped under certain conditions. The broader IDEsaster research found 30+ vulnerabilities across 10+ AI coding tools, 24 of which were assigned CVEs.
In April 2026, Anthropic launched its own Code Review feature — dispatching specialized agents on every PR. Internally, on large PRs (1,000+ lines), 84% get findings, averaging 7.5 issues per review. Less than 1% of findings are marked incorrect by engineers. This is Anthropic's acknowledgment that Claude-generated code needs structured review — and their own data confirms the scale of issues that slip through without it.
Claude Code consistently struggles with six recurring patterns: overly permissive CORS and security defaults, missing rate limiting, placeholder error handling that swallows failures, outdated dependencies with known CVEs, secrets committed to version control, and SQL injection in raw queries. These are not random bugs. They are systematic patterns that appear across Claude-generated codebases regardless of the project type. The following sections break down each one with concrete code examples.
After reviewing dozens of Claude-generated codebases at Vibers, these six patterns account for the majority of security and reliability issues. Each one looks correct at first glance — which is exactly why they survive code review by developers and automated tools alike.
When you ask Claude Code to "add CORS" or "set up authentication," it optimizes for making the app work immediately. That means permissive defaults that should never reach production.
What Claude Code generates:
// Express.js — Claude's typical CORS setup
const cors = require('cors');
app.use(cors({
  origin: '*',       // Allows ANY domain
  credentials: true  // Sends cookies to ANY origin
}));
# Flask — same pattern
CORS(app, resources={r"/api/*": {"origins": "*"}})
What it should be:
// Restrict to your actual domains
app.use(cors({
  origin: ['https://yourapp.com', 'https://staging.yourapp.com'],
  credentials: true
}));
The origin: '*' with credentials: true combination is particularly dangerous — it allows any website to make authenticated requests to your API on behalf of your users. This is a textbook cross-site request forgery (CSRF) scenario. Claude writes it because it works in development, and because its training data contains thousands of tutorials that use this exact pattern.
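The safe pattern is framework-agnostic: echo the request's Origin back only when it appears on an explicit allowlist, and send no CORS headers at all otherwise. Here is a minimal sketch in plain Python (the ALLOWED_ORIGINS domains and the cors_headers helper are illustrative, not from any specific framework):

```python
# Allowlist of origins this API trusts. Placeholder domains —
# substitute your real frontend URLs.
ALLOWED_ORIGINS = {"https://yourapp.com", "https://staging.yourapp.com"}

def cors_headers(request_origin):
    """Return CORS response headers, or an empty dict for unknown origins."""
    if request_origin in ALLOWED_ORIGINS:
        return {
            # Echo the specific origin back, never '*'
            "Access-Control-Allow-Origin": request_origin,
            "Access-Control-Allow-Credentials": "true",
            # Keep shared caches from reusing a response across origins
            "Vary": "Origin",
        }
    # Unknown origin: no CORS headers, so the browser blocks the read
    return {}
```

In a real app the CORS middleware does this for you once it is configured with a list instead of a wildcard; the point is that the allowed origin is looked up, never reflected blindly.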
Claude Code almost never adds rate limiting unless you explicitly ask for it. Login endpoints, signup forms, password reset flows, API endpoints — all exposed without throttling.
// Claude generates a clean login endpoint — with no rate limiting
app.post('/api/login', async (req, res) => {
  const { email, password } = req.body;
  const user = await User.findOne({ email });
  if (!user || !await bcrypt.compare(password, user.password)) {
    return res.status(401).json({ error: 'Invalid credentials' });
  }
  const token = jwt.sign({ id: user._id }, process.env.JWT_SECRET);
  res.json({ token });
});
This endpoint accepts unlimited login attempts per second. An attacker can brute-force passwords, enumerate valid email addresses via timing differences, and overwhelm your database — all because Claude treated rate limiting as optional. Adding express-rate-limit or an equivalent takes five lines of code, but Claude does not include it unless prompted.
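The underlying mechanism is simple enough to sketch in a few lines. Below is a minimal in-memory sliding-window limiter in Python (the class name and limits are illustrative; production services typically back this with Redis so counts survive restarts and are shared across instances):

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` hits per `window` seconds, per key (e.g. client IP)."""

    def __init__(self, limit=5, window=60.0):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # key -> timestamps of recent hits

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[key]
        while q and now - q[0] >= self.window:  # drop hits outside the window
            q.popleft()
        if len(q) >= self.limit:
            return False  # too many recent attempts — caller should return 429
        q.append(now)
        return True

# In a login handler: if not limiter.allow(client_ip): return 429
limiter = SlidingWindowLimiter(limit=5, window=60)
```

With express-rate-limit the equivalent is a one-liner of middleware on the login route; either way, the limiter must sit in front of the password check, not after it.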
This is the most insidious pattern. Claude generates error handling that appears comprehensive — try/catch blocks, error responses, status codes — but silently swallows critical failures.
// Looks correct. Is not correct.
async function processPayment(orderId, amount, token) {
  try {
    const charge = await stripe.charges.create({
      amount: amount * 100,
      currency: 'usd',
      source: token
    });
    await Order.updateOne({ _id: orderId }, { status: 'paid' });
    return { success: true };
  } catch (error) {
    console.log('Payment error:', error.message);
    return { success: false, error: 'Payment failed' };
  }
}
The problems: (1) console.log instead of a proper logger — in production this disappears into stdout with no alerting; (2) the generic catch swallows Stripe-specific errors that require different handling (card declined vs network error vs fraud detection); (3) if the Stripe charge succeeds but the database update fails, the user is charged but the order is not marked as paid — and nobody gets notified because the error is logged as a string.
What production error handling looks like:
async function processPayment(orderId, amount, token) {
  let charge;
  try {
    charge = await stripe.charges.create({
      amount: amount * 100,
      currency: 'usd',
      source: token
    }, {
      idempotencyKey: orderId // Prevent duplicate charges on retry
    });
  } catch (error) {
    logger.error('Stripe charge failed', { orderId, error: error.type });
    if (error.type === 'StripeCardError') {
      return { success: false, error: error.message };
    }
    throw error; // Re-throw unexpected errors — don't swallow them
  }
  try {
    await Order.updateOne({ _id: orderId }, {
      status: 'paid',
      chargeId: charge.id
    });
  } catch (dbError) {
    logger.error('DB update failed after successful charge', {
      orderId, chargeId: charge.id
    });
    // Alert on-call — user was charged but order not updated
    await alertOncall('payment-db-mismatch', { orderId, chargeId: charge.id });
    throw dbError;
  }
  return { success: true };
}
Claude Code pulls packages from its training data. If a package had a critical CVE published after Claude's training cutoff, Claude will still recommend the vulnerable version. It does not run npm audit or pip audit after generating code.
Running npm audit or pip audit after Claude generates your package.json or requirements.txt is not optional — it is the bare minimum.
Claude Code frequently creates .env files with database credentials, API keys, and JWT secrets — and does not always add them to .gitignore. Even when it does create a .gitignore, it may miss files like .env.local, .env.production, or custom config files containing secrets.
# Claude creates .env with real-looking placeholder secrets
DATABASE_URL=postgres://admin:password123@localhost:5432/myapp
JWT_SECRET=super-secret-key-change-in-production
STRIPE_SECRET_KEY=sk_test_EXAMPLE_KEY_REPLACE_ME
SENDGRID_API_KEY=SG.xxxxx
If this gets committed — even once — the secrets are in your git history permanently. GitHub's secret scanning catches some patterns, but not custom secrets like database URLs or JWT signing keys. Always verify: git log --all --diff-filter=A -- .env will show you if .env was ever committed.
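A lightweight pre-commit check can catch the obvious cases before they reach history. The sketch below shows the idea with a handful of illustrative regexes (real scanners like gitleaks or trufflehog ship hundreds of rules; these patterns only cover the examples above):

```python
import re

# Illustrative patterns only — extend for your own key formats.
SECRET_PATTERNS = {
    "stripe_key": re.compile(r"sk_(live|test)_[A-Za-z0-9]+"),
    "sendgrid_key": re.compile(r"SG\.[A-Za-z0-9_.-]{10,}"),
    "database_url": re.compile(r"postgres(ql)?://\w+:[^@\s]+@"),  # creds in URL
    "jwt_secret": re.compile(r"(?i)jwt_secret\s*=\s*\S+"),
}

def scan_text(text):
    """Return the names of secret patterns found in a blob of text."""
    return [name for name, pat in SECRET_PATTERNS.items() if pat.search(text)]
```

Wire this into a pre-commit hook over staged files, and it blocks the commit before the secret ever enters git history — which is far cheaper than rotating keys after the fact.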
Claude generally uses ORMs correctly. But when you ask for raw SQL — complex queries, bulk operations, database migrations — it sometimes falls back to string interpolation instead of parameterized queries.
# Claude's raw query for a search endpoint
@app.get("/api/search")
async def search(q: str, sort_by: str = "created_at"):
    query = f"SELECT * FROM products WHERE name ILIKE '%{q}%' ORDER BY {sort_by}"
    results = await database.fetch_all(query)
    return results
Both q and sort_by are injected directly into the SQL string. An attacker can pass sort_by=created_at; DROP TABLE products;-- and execute arbitrary SQL. This is CWE-89, the most basic category of injection vulnerability — and Claude generates it when constructing queries with dynamic column names or search patterns.
Parameterized version:
ALLOWED_SORT_COLUMNS = {"created_at", "name", "price"}

@app.get("/api/search")
async def search(q: str, sort_by: str = "created_at"):
    if sort_by not in ALLOWED_SORT_COLUMNS:
        sort_by = "created_at"
    # Identifier via whitelist, value via placeholder
    query = f"SELECT * FROM products WHERE name ILIKE :q ORDER BY {sort_by}"
    results = await database.fetch_all(query, {"q": f"%{q}%"})
    return results
Vibers reviews your Claude Code output against your spec. We catch the issues that pass linting, pass CI, and break in production.
Get Your Free First Review

You do not need a professional reviewer to catch the most common issues. This checklist covers roughly 80% of the security and reliability problems we find in Claude-generated codebases. Run through it after every significant feature Claude builds.
1. Search for origin: '*' and wildcard Access-Control headers. Every CORS configuration should list specific domains. If you see a wildcard, replace it with your actual frontend URLs. Check both application code and any reverse proxy configs (nginx, Cloudflare).

2. Read every try/catch and except block. Search for catch (e) and except Exception. For each one: does it log with a real logger (not console.log)? Does it re-throw unexpected errors? Does it handle specific error types differently? If the catch block just returns a generic error message, it is placeholder code that needs to be replaced.

3. Run npm audit / pip audit / cargo audit. Do this after every package.json or requirements.txt change. Fix critical and high severity findings before shipping. Claude does not check for CVEs — you must.

4. Verify .gitignore covers all secrets. Check for .env, .env.*, *.pem, *.key, credentials.json, and any project-specific config files. Run git log --all --diff-filter=A -- .env to confirm no secrets were ever committed. If they were, rotate them immediately — removing from git history is not enough.

5. Hunt for string-built SQL. Search for f"SELECT, f"INSERT, f"UPDATE, f"DELETE in Python, and template literals near query( in JavaScript. Every user-supplied value must go through parameterized queries. Column names in ORDER BY and GROUP BY must be validated against a whitelist.

This checklist is not exhaustive — it does not cover business logic validation, race conditions, or requirement compliance. But it catches the systematic security gaps that Claude Code produces reliably. For the deeper categories, you need someone who has read your spec.
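The grep-able parts of this checklist can be automated. Here is a hypothetical helper (the scan_repo function and its patterns are illustrative sketches, not a real tool) that flags the textual signatures of wildcard CORS, f-string SQL, and console.log-only catch blocks:

```python
import re
from pathlib import Path

# Illustrative signatures for checklist items a regex can catch.
RISKY_PATTERNS = {
    "wildcard CORS": re.compile(r"origin:\s*['\"]\*['\"]"),
    "f-string SQL": re.compile(r'f"(SELECT|INSERT|UPDATE|DELETE)', re.I),
    "console.log in catch": re.compile(r"catch[^}]*console\.log", re.S),
}

def scan_repo(root):
    """Return (filename, issue) pairs for every pattern hit under root."""
    findings = []
    for path in Path(root).rglob("*"):
        if path.suffix not in {".js", ".ts", ".py"}:
            continue
        text = path.read_text(errors="ignore")
        for issue, pat in RISKY_PATTERNS.items():
            if pat.search(text):
                findings.append((path.name, issue))
    return findings
```

A script like this is a pre-review triage step, not a replacement for reading the code: it cannot tell whether rate limiting is missing or whether the business logic matches the spec.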
The self-review checklist handles common patterns. But there are situations where a professional review is not optional — it is a risk management necessity:
Vibers is a GitHub App built specifically for reviewing AI-generated codebases — including code written by Claude Code, Cursor, GitHub Copilot, and similar tools. Here is the workflow:
The first review is free — all we ask is a GitHub star on the Vibers repository. Standard reviews are billed at $15 per hour. A typical Claude-generated MVP (3,000 to 10,000 lines of code) takes two to four hours.
Why not simply ask Claude to review its own output? If Claude wrote cors: '*' because it optimized for "make it work," the same reasoning still applies when it reviews the code. Human reviewers bring external context — production requirements, threat models, compliance constraints — that the AI does not have.
AI code review tools and human review are not competitors — they operate on different layers of the problem. Here is what each catches:
| Issue Category | AI Tools (CodeRabbit, Qodo) | Human Review (Vibers) |
|---|---|---|
| Syntax errors, linting | Yes | Yes |
| Known security anti-patterns | Yes | Yes |
| Overly permissive CORS | Sometimes | Yes |
| Missing rate limiting | No | Yes |
| Placeholder error handling | No | Yes |
| Business logic correctness | No | Yes (spec-verified) |
| Auth flow completeness | Partial | Yes |
| Payment race conditions | No | Yes |
| Requirement mismatches | No (no spec access) | Yes |
| Review speed | Instant (every push) | Within 24 hours |
| Output format | PR comments | Fix PRs + summary |
| Price | $24–25 / user / month | Free first + $15/hr |
The practical recommendation: use an AI review tool (CodeRabbit or Qodo) for fast, automated feedback on every push. Use human review before launch, before handling real payments, and whenever Claude generates code in a domain you are not deeply familiar with. The two approaches are complementary — AI tools catch the 46–57% of issues they can see; human review catches the business logic, auth flows, and requirement mismatches that automated tools structurally cannot find.
"AI is great at catching code-level issues, but it doesn't understand your business goals, your team's coding standards, or the long-term vision for a feature." — Tembo.io Engineering Blog, March 2026
If Claude wrote cors: '*' because it optimized for "make it work," it will not flag that as a security issue on review, because the same reasoning still applies. Human reviewers bring external context — production requirements, compliance constraints, threat models — that the AI does not have.

To review Claude Code's output yourself: (1) search for cors, origin, and Access-Control headers — verify they are restricted to your actual domains; (2) check every database query for parameterized inputs; (3) verify .env and secrets files are in .gitignore; (4) check that every public endpoint has rate limiting; (5) run npm audit or pip audit; (6) search for generic try/catch blocks that swallow errors without logging; (7) verify auth middleware is applied to every protected route. This checklist catches roughly 80% of the security issues found in Claude-generated codebases.

Install Vibers, share your spec, push code. We review against your requirements and send fix PRs — not comments. First review is free.
Install Vibers GitHub App — Free First Review