April 14, 2026 14 min read Blog · Code Review

Claude Code Review: What AI-Generated Code Gets Wrong (And How to Catch It)

Claude Code is Anthropic's CLI tool that lets you build entire features by describing them in natural language. It writes clean, well-structured code fast — and that speed is exactly the problem. Claude Code's output looks production-ready but consistently contains overly permissive security defaults, placeholder error handling that silently swallows failures, missing rate limiting, and dependencies with known CVEs. These patterns pass linting, pass CI, and break in production. This guide covers what Claude Code gets right, what it reliably gets wrong, and how to review its output before shipping.

Claude Code Is Powerful — But Not Infallible

What is Claude Code? Claude Code is Anthropic's official command-line tool that lets developers interact with Claude directly from the terminal. It can read your codebase, create and edit files, run commands, search the web, and manage git workflows — all through natural language. It is the engine behind "vibe coding" with Claude: describe what you want, and Claude Code builds it.

Claude Code is genuinely good at certain categories of work. It excels at scaffolding project architecture — setting up folder structures, configuring build tools, and creating boilerplate that would otherwise take hours. It handles refactoring well: renaming variables across files, extracting functions, reorganizing modules. It writes clear documentation and generates solid first drafts of API endpoints, database schemas, and test suites.

The problem is not that Claude Code writes bad code. The problem is that it writes confident code that optimizes for "make it work in development" rather than "make it safe in production." And because the output is clean, well-formatted, and often passes linting on the first try, developers trust it more than they should.

AI-generated pull requests contain 1.7x more defects than human-written ones. Security issues are up to 2.74x higher, and performance problems like excessive I/O operations are 8x more common. Source: CodeRabbit, State of AI vs Human Code Generation Report, December 2025 (470 real-world PRs analyzed).

Real Vulnerabilities in Claude Code Itself (2025–2026)

The security risks of Claude-generated code are not hypothetical. Claude Code itself has had confirmed vulnerabilities: CVE-2025-59536 (CVSS 8.7, code injection) and CVE-2026-21852 (CVSS 5.3, information disclosure that could exfiltrate API keys). Both were patched by Anthropic. In March 2026, Anthropic accidentally published a debugging JavaScript sourcemap for Claude Code v2.1.88 to npm — exposing 512,000 lines of TypeScript across 1,900 files. Days later, Adversa AI disclosed a critical permission bypass: deny rules, security validators, and command injection detection could all be skipped under certain conditions. The broader IDEsaster research found more than 30 vulnerabilities across 10+ AI coding tools, 24 of which were assigned CVEs.

In April 2026, Anthropic launched its own Code Review feature — dispatching specialized agents on every PR. Internally, on large PRs (1,000+ lines), 84% get findings, averaging 7.5 issues per review. Less than 1% of findings are marked incorrect by engineers. This is Anthropic's acknowledgment that Claude-generated code needs structured review — and their own data confirms the scale of issues that slip through without it.

Here is what Claude Code consistently struggles with:

  1. Overly permissive CORS and auth defaults
  2. Missing rate limiting on public endpoints
  3. Placeholder error handling that silently swallows failures
  4. Dependencies with known CVEs
  5. Secrets and .env files missing from .gitignore
  6. SQL injection in dynamic queries

These are not random bugs. They are systematic patterns that appear across Claude-generated codebases regardless of the project type. The following section breaks down each one with concrete code examples.

6 Common Issues in Claude-Generated Code

After reviewing dozens of Claude-generated codebases at Vibers, these six patterns account for the majority of security and reliability issues. Each one looks correct at first glance — which is exactly why they survive code review by developers and automated tools alike.

1. Overly permissive CORS and auth defaults

When you ask Claude Code to "add CORS" or "set up authentication," it optimizes for making the app work immediately. That means permissive defaults that should never reach production.

What Claude Code generates:

// Express.js — Claude's typical CORS setup
const cors = require('cors');
app.use(cors({
  origin: '*',           // Allows ANY domain
  credentials: true      // Sends cookies to ANY origin
}));

# Flask — same pattern
CORS(app, resources={r"/api/*": {"origins": "*"}})

What it should be:

// Restrict to your actual domains
app.use(cors({
  origin: ['https://yourapp.com', 'https://staging.yourapp.com'],
  credentials: true
}));

The origin: '*' with credentials: true combination is particularly dangerous — it allows any website to make authenticated requests to your API on behalf of your users. This is textbook cross-site request forgery territory. Claude writes it because it works in development, and because its training data contains thousands of tutorials that use this exact pattern.
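Regardless of framework, the fix has the same shape: echo the request's Origin header back only when it appears on an explicit allowlist, and send no CORS headers otherwise. A minimal framework-agnostic sketch in Python (the helper name and header dict are illustrative, not a real library API):

```python
# Illustrative helper: reflect the Origin header only when it is on an
# explicit allowlist; return no CORS headers for unknown origins.
ALLOWED_ORIGINS = {"https://yourapp.com", "https://staging.yourapp.com"}

def cors_headers(request_origin):
    """Return CORS response headers, or {} for origins not on the allowlist."""
    if request_origin not in ALLOWED_ORIGINS:
        return {}
    return {
        "Access-Control-Allow-Origin": request_origin,  # echo, never "*"
        "Access-Control-Allow-Credentials": "true",
        "Vary": "Origin",  # tell caches the response differs per origin
    }
```

The `Vary: Origin` header matters once you echo origins: without it, a shared cache could serve one origin's response (and its Access-Control-Allow-Origin value) to a different origin.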

2. Missing rate limiting on public endpoints

Claude Code almost never adds rate limiting unless you explicitly ask for it. Login endpoints, signup forms, password reset flows, API endpoints — all exposed without throttling.

// Claude generates a clean login endpoint — with no rate limiting
app.post('/api/login', async (req, res) => {
  const { email, password } = req.body;
  const user = await User.findOne({ email });
  if (!user || !await bcrypt.compare(password, user.password)) {
    return res.status(401).json({ error: 'Invalid credentials' });
  }
  const token = jwt.sign({ id: user._id }, process.env.JWT_SECRET);
  res.json({ token });
});

This endpoint accepts unlimited login attempts per second. An attacker can brute-force passwords, enumerate valid email addresses via timing differences, and overwhelm your database — all because Claude treated rate limiting as optional. Adding express-rate-limit or an equivalent takes five lines of code, but Claude does not include it unless prompted.
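The underlying mechanism is simple: a per-client counter over a time window. Here is a sliding-window sketch in Python — in-memory and single-process, so a stand-in for express-rate-limit or a Redis-backed limiter in real deployments (the class and method names are illustrative):

```python
import time
from collections import defaultdict

# Sliding-window rate limiter sketch. In-memory only: state is lost on
# restart and not shared across processes, so treat this as a model of
# what express-rate-limit or a Redis-backed limiter does for you.
class RateLimiter:
    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self.hits = defaultdict(list)  # client key -> recent request times

    def allow(self, key, now=None):
        """Record one request for `key`; return False if over the limit."""
        now = time.monotonic() if now is None else now
        recent = [t for t in self.hits[key] if now - t < self.window]
        if len(recent) >= self.max_requests:
            self.hits[key] = recent
            return False
        recent.append(now)
        self.hits[key] = recent
        return True
```

Keyed by client IP (or by email for login attempts), with something like five requests per minute on auth endpoints, this turns unlimited brute-force into a handful of guesses per window.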

3. Placeholder error handling that looks real

This is the most insidious pattern. Claude generates error handling that appears comprehensive — try/catch blocks, error responses, status codes — but silently swallows critical failures.

// Looks correct. Is not correct.
async function processPayment(orderId, amount, cardToken) {
  try {
    const charge = await stripe.charges.create({
      amount: amount * 100,
      currency: 'usd',
      source: cardToken
    });
    await Order.updateOne({ _id: orderId }, { status: 'paid' });
    return { success: true };
  } catch (error) {
    console.log('Payment error:', error.message);
    return { success: false, error: 'Payment failed' };
  }
}

The problems: (1) console.log instead of a proper logger — in production this disappears into stdout with no alerting; (2) the generic catch swallows Stripe-specific errors that require different handling (card declined vs network error vs fraud detection); (3) if the Stripe charge succeeds but the database update fails, the user is charged but the order is not marked as paid — and nobody gets notified because the error is logged as a string.

What production error handling looks like:

async function processPayment(orderId, amount, cardToken) {
  let charge;
  try {
    charge = await stripe.charges.create({
      amount: amount * 100,
      currency: 'usd',
      source: cardToken
    }, {
      idempotencyKey: orderId  // Prevent duplicate charges on retry
    });
  } catch (error) {
    logger.error('Stripe charge failed', { orderId, error: error.type });
    if (error.type === 'StripeCardError') {
      return { success: false, error: error.message };
    }
    throw error;  // Re-throw unexpected errors — don't swallow them
  }

  try {
    await Order.updateOne({ _id: orderId }, {
      status: 'paid',
      chargeId: charge.id
    });
  } catch (dbError) {
    logger.error('DB update failed after successful charge', {
      orderId, chargeId: charge.id
    });
    // Alert on-call — user was charged but order not updated
    await alertOncall('payment-db-mismatch', { orderId, chargeId: charge.id });
    throw dbError;
  }
  return { success: true };
}

4. Dependencies with known vulnerabilities

Claude Code pulls packages from its training data. If a package had a critical CVE published after Claude's training cutoff, Claude will still recommend the vulnerable version. It does not run npm audit or pip audit after generating code.

Every Claude-generated project we have reviewed at Vibers had at least one dependency with a known vulnerability. Most had three to five. Running npm audit or pip audit after Claude generates your package.json or requirements.txt is not optional — it is the bare minimum.
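Once the audit runs, gate the release on severity. A sketch of that triage step — note the finding shape here is a simplified stand-in, since real `npm audit --json` or `pip-audit` output has its own schema that needs parsing:

```python
# Release-gate sketch for dependency audit output. The dicts below are a
# simplified stand-in for real audit JSON, which varies by tool and version.
BLOCKING_SEVERITIES = {"critical", "high"}

def blocking_findings(findings):
    """Return the findings severe enough to block a release."""
    return [f for f in findings
            if f.get("severity", "").lower() in BLOCKING_SEVERITIES]

def release_allowed(findings):
    """True when no critical or high severity findings remain."""
    return len(blocking_findings(findings)) == 0
```

Wiring this into CI (fail the build when `release_allowed` is false) makes "fix critical and high before shipping" a hard rule rather than a habit.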

5. .env files and secrets not in .gitignore

Claude Code frequently creates .env files with database credentials, API keys, and JWT secrets — and does not always add them to .gitignore. Even when it does create a .gitignore, it may miss files like .env.local, .env.production, or custom config files containing secrets.

# Claude creates .env with real-looking placeholder secrets
DATABASE_URL=postgres://admin:password123@localhost:5432/myapp
JWT_SECRET=super-secret-key-change-in-production
STRIPE_SECRET_KEY=sk_test_EXAMPLE_KEY_REPLACE_ME
SENDGRID_API_KEY=SG.xxxxx

If this gets committed — even once — the secrets are in your git history permanently. GitHub's secret scanning catches some patterns, but not custom secrets like database URLs or JWT signing keys. Always verify: git log --all --diff-filter=A -- .env will show you if .env was ever committed.
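The coverage check can be mechanical. This sketch does naive exact-line matching (real .gitignore semantics — negation, directory rules, globs — are richer), so treat misses as prompts to look closer rather than definitive answers:

```python
# Sanity-check .gitignore coverage of common secret files before the
# first commit. Naive exact-line matching, not full gitignore semantics.
REQUIRED_PATTERNS = [".env", ".env.*", "*.pem", "*.key", "credentials.json"]

def missing_ignore_patterns(gitignore_text):
    """Return the secret-file patterns not listed verbatim in .gitignore."""
    lines = {line.strip() for line in gitignore_text.splitlines()}
    return [p for p in REQUIRED_PATTERNS if p not in lines]
```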

6. SQL injection in dynamic queries

Claude generally uses ORMs correctly. But when you ask for raw SQL — complex queries, bulk operations, database migrations — it sometimes falls back to string interpolation instead of parameterized queries.

# Claude's raw query for a search endpoint
@app.get("/api/search")
async def search(q: str, sort_by: str = "created_at"):
    query = f"SELECT * FROM products WHERE name ILIKE '%{q}%' ORDER BY {sort_by}"
    results = await database.fetch_all(query)
    return results

Both q and sort_by are injected directly into the SQL string. An attacker can pass sort_by=created_at; DROP TABLE products;-- and execute arbitrary SQL. This is CWE-89, the most basic category of injection vulnerability — and Claude generates it when constructing queries with dynamic column names or search patterns.

Parameterized version:

ALLOWED_SORT_COLUMNS = {"created_at", "name", "price"}

@app.get("/api/search")
async def search(q: str, sort_by: str = "created_at"):
    if sort_by not in ALLOWED_SORT_COLUMNS:
        sort_by = "created_at"
    query = f"SELECT * FROM products WHERE name ILIKE :pattern ORDER BY {sort_by}"
    results = await database.fetch_all(query, values={"pattern": f"%{q}%"})
    return results

Shipping Claude-generated code to production?

Vibers reviews your Claude Code output against your spec. We catch the issues that pass linting, pass CI, and break in production.

Get Your Free First Review

How to Review Claude-Generated Code: A 7-Point Checklist

You do not need a professional reviewer to catch the most common issues. This checklist covers roughly 80% of the security and reliability problems we find in Claude-generated codebases. Run through it after every significant feature Claude builds.

  1. Search for origin: '*' and Access-Control headers. Every CORS configuration should list specific domains. If you see a wildcard, replace it with your actual frontend URLs. Check both application code and any reverse proxy configs (nginx, Cloudflare).
  2. Verify rate limiting on every public endpoint. Login, signup, password reset, API endpoints, webhook receivers. If there is no rate limiter middleware, add one. Five requests per minute for auth endpoints, 100 per minute for general API is a reasonable starting point.
  3. Audit every try/catch and except block. Search for catch (e) and except Exception. For each one: does it log with a real logger (not console.log)? Does it re-throw unexpected errors? Does it handle specific error types differently? If the catch block just returns a generic error message, it is placeholder code that needs to be replaced.
  4. Run npm audit / pip audit / cargo audit. Do this after every package.json or requirements.txt change. Fix critical and high severity findings before shipping. Claude does not check for CVEs — you must.
  5. Verify .gitignore covers all secrets. Check for .env, .env.*, *.pem, *.key, credentials.json, and any project-specific config files. Run git log --all --diff-filter=A -- .env to confirm no secrets were ever committed. If they were, rotate them immediately — removing from git history is not enough.
  6. Search for string interpolation near SQL keywords. Grep for f"SELECT, f"INSERT, f"UPDATE, f"DELETE in Python, and template literals near query( in JavaScript. Every user-supplied value must go through parameterized queries. Column names in ORDER BY and GROUP BY must be validated against a whitelist.
  7. Check auth middleware coverage. List every route in your application. For each route that should be protected, verify that auth middleware is actually applied — not just assumed. Claude sometimes adds auth to the routes you mentioned but forgets adjacent routes that also need protection (admin endpoints, settings pages, file upload paths).
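Checklist items 1 and 6 are grep-shaped and easy to automate. A rough scanner sketch (the regexes are heuristics, not a parser; expect false positives and review each hit by hand):

```python
import re

# Heuristic scanner for two checklist items: wildcard CORS origins and
# f-string SQL. Flags lines for human review; it does not parse code.
RISKY_PATTERNS = {
    "wildcard CORS origin": re.compile(r"origins?\s*[:=]\s*['\"]\*['\"]"),
    "f-string SQL": re.compile(r"f['\"]\s*(SELECT|INSERT|UPDATE|DELETE)\b",
                               re.IGNORECASE),
}

def scan_source(text):
    """Return (line_number, issue) pairs for risky patterns in source text."""
    hits = []
    for n, line in enumerate(text.splitlines(), start=1):
        for issue, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                hits.append((n, issue))
    return hits
```

Run it over your source tree (or wire it into a pre-commit hook) and treat every hit as a review item, not an automatic failure — dynamic ORDER BY columns validated against a whitelist are fine, for example.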

This checklist is not exhaustive — it does not cover business logic validation, race conditions, or requirement compliance. But it catches the systematic security gaps that Claude Code produces reliably. For the deeper categories, you need someone who has read your spec.

When You Need a Professional Claude Code Review

The self-review checklist handles common patterns. But there are situations where a professional review is not optional — it is a risk management necessity: before launch, before handling real payments, and whenever Claude generates code in a domain you are not deeply familiar with.

Security issues in AI-generated code: 2.74x higher than in human-written code, with improper password handling and missing auth checks as the most common categories. Source: CodeRabbit, December 2025.

How Vibers Reviews Claude-Generated Code

Vibers is a GitHub App built specifically for reviewing AI-generated codebases — including code written by Claude Code, Cursor, GitHub Copilot, and similar tools. Here is the workflow:

  1. Install the GitHub App (one click). Go to github.com/apps/vibers-review and install on your repository. No configuration needed.
  2. Share your spec. After installation, you are prompted to share a link to your Google Doc, Notion page, Figma file, or product brief — whatever describes what your app should do. This is the step that differentiates Vibers from every automated tool: a human reviewer reads your spec before looking at any code.
  3. Push code. When you push to your repository, the Vibers reviewer is notified. They read your changes in the context of your full codebase and your spec — not just the diff.
  4. Receive a review within 24 hours. The reviewer checks security (OWASP Top 10), business logic compliance, auth flows, error handling, and the six Claude-specific patterns described in this article. Issues are documented with explanations.
  5. Get fix PRs, not just comments. When issues are found, you receive pull requests with the fixes already written. Review, approve, merge. No manual copy-pasting from comment threads.

The first review is free — all we ask is a GitHub star on the Vibers repository. Standard reviews are billed at $15 per hour. A typical Claude-generated MVP (3,000 to 10,000 lines of code) takes two to four hours.

Why can't Claude Code review its own output? You can ask Claude to review code it generated, and it will catch some surface-level issues. But an AI reviewing its own output has a fundamental blind spot: it tends to validate its own reasoning rather than challenge it. If Claude chose cors: '*' because it optimized for "make it work," the same reasoning still applies when it reviews the code. Human reviewers bring external context — production requirements, threat models, compliance constraints — that the AI does not have.

AI Review Tools vs Human Review for Claude Code

AI code review tools and human review are not competitors — they operate on different layers of the problem. Here is what each catches:

| Issue Category               | AI Tools (CodeRabbit, Qodo) | Human Review (Vibers) |
|------------------------------|-----------------------------|-----------------------|
| Syntax errors, linting       | Yes                         | Yes                   |
| Known security anti-patterns | Yes                         | Yes                   |
| Overly permissive CORS       | Sometimes                   | Yes                   |
| Missing rate limiting        | No                          | Yes                   |
| Placeholder error handling   | No                          | Yes                   |
| Business logic correctness   | No                          | Yes (spec-verified)   |
| Auth flow completeness       | Partial                     | Yes                   |
| Payment race conditions      | No                          | Yes                   |
| Requirement mismatches       | No (no spec access)         | Yes                   |
| Review speed                 | Instant (every push)        | Within 24 hours       |
| Output format                | PR comments                 | Fix PRs + summary     |
| Price                        | $24–25 / user / month       | Free first + $15/hr   |

The practical recommendation: use an AI review tool (CodeRabbit or Qodo) for fast, automated feedback on every push. Use human review before launch, before handling real payments, and whenever Claude generates code in a domain you are not deeply familiar with. The two approaches are complementary — AI tools catch the 46–57% of issues they can see; human review catches the business logic, auth flows, and requirement mismatches that automated tools structurally cannot find.

"AI is great at catching code-level issues, but it doesn't understand your business goals, your team's coding standards, or the long-term vision for a feature." Tembo.io Engineering Blog, March 2026

Frequently Asked Questions

Is code generated by Claude Code safe for production?
Not without review. Claude Code writes syntactically correct, well-structured code that passes linting and basic tests. But it consistently produces overly permissive defaults (CORS set to *, missing rate limiting, broad auth rules), placeholder error handling that looks real but swallows errors silently, and occasionally uses string interpolation in SQL queries. These patterns are not caught by CI/CD pipelines or standard linters — they require a human reviewing the code with security and production context.
What are the most common issues in Claude-generated code?
The six most common issues: (1) overly permissive CORS and auth defaults, (2) missing rate limiting on public endpoints, (3) placeholder error handling that catches exceptions but does nothing useful with them, (4) dependencies with known CVEs from Claude's training data, (5) .env files and secrets not properly excluded from version control, and (6) SQL injection vulnerabilities from string interpolation in dynamic queries. These are systematic patterns, not random bugs — Claude produces them reliably across different projects.
Can Claude Code review its own code?
You can ask Claude to review code it generated, and it will catch some surface-level issues. But an AI reviewing its own output has a fundamental blind spot: it tends to validate its own reasoning patterns rather than challenge them. If Claude chose cors: '*' because it optimized for "make it work," it will not flag that as a security issue on review because the same reasoning still applies. Human reviewers bring external context — production requirements, compliance constraints, threat models — that the AI does not have.
How do I review Claude-generated code for security?
Start with a targeted checklist: (1) search for cors, origin, and Access-Control headers — verify they are restricted to your actual domains; (2) check every database query for parameterized inputs; (3) verify .env and secrets files are in .gitignore; (4) check that every public endpoint has rate limiting; (5) run npm audit or pip audit; (6) search for generic try/catch blocks that swallow errors without logging; (7) verify auth middleware is applied to every protected route. This checklist catches roughly 80% of the security issues found in Claude-generated codebases.
Should I use an AI review tool or a human to review Claude Code output?
Use both — they catch different things. AI review tools like CodeRabbit and Qodo catch syntax-level issues and known anti-patterns fast, on every commit. But they detect only 46–57% of bugs and cannot verify whether code matches your business requirements. Human review catches business logic errors, auth flow gaps, payment race conditions, and requirement mismatches — the issues that cause production incidents. For Claude-generated code specifically, human review is critical because Claude's failure modes (permissive defaults, confident placeholder code) are designed to look correct to automated tools.

Your Claude Code deserves a human review.

Install Vibers, share your spec, push code. We review against your requirements and send fix PRs — not comments. First review is free.

Install Vibers GitHub App — Free First Review

Alex Noxon — Founder, Vibers

Alex has reviewed over 40 AI-generated codebases for indie hackers and solo founders since 2024, including projects built with Claude Code, Cursor, and GitHub Copilot. He writes about the practical limits of vibe coding for production software and builds tools at the intersection of human judgment and AI automation. Vibers is his answer to the question no AI tool has solved: reviewing code against the spec that generated it.