7 AI Code Vulnerabilities That Show Up in Almost Every Repo
The most common AI code vulnerabilities explained with real examples. See what Cursor, Copilot, and ChatGPT keep putting in your code — find them fast.
Rod
Founder & Developer
The most common AI code vulnerabilities aren't random — they're the same 7 patterns, showing up in repo after repo. Veracode's State of Software Security 2025 report found that 45% of AI-generated code contains at least one vulnerability — a finding we covered in depth in our post on vibe coding security risks. In December 2025, Tenzai researchers tested 5 popular vibe coding tools and found an average of 69 vulnerabilities per tool. Palo Alto Networks launched its SHIELD framework in January 2026 specifically in response to AI code security risks. These aren't outliers. They're the norm.
This post is the practical reference: what each AI code vulnerability looks like in real output from Cursor, Copilot, and ChatGPT, why these tools keep producing it, and what the fix looks like. Items 1-4 are high-frequency — you probably have at least one of them. Items 5-7 are real but appear less often.
If you want to check your own repo right now before reading the list, you can scan it free in 60 seconds. Otherwise, here's what to look for.
Why AI Coding Tools Keep Making the Same Security Mistakes
Before the list, the mechanism — because "AI has bugs" isn't useful knowledge. Understanding why these tools produce insecure patterns tells you exactly where to look.
Training data is full of insecure code. LLMs learn from code scraped from public repositories, Stack Overflow answers, and tutorial sites. That corpus is full of hardcoded credentials, unparameterized queries, and missing auth checks — because historically, that's what code looked like before security became a mainstream concern. The model doesn't know that sk_live_xxxxxxxx pasted directly into a file is wrong. It knows that it appears alongside Stripe integration code in its training data.
These tools have no project context. When Cursor generates your /api/payments route, it doesn't know your project has a .env.local file. It doesn't know you set up NextAuth in a different file. It doesn't know which routes are supposed to be public and which aren't. It's solving a local problem — "write code that does X" — without seeing the whole system. LLM code security flaws cluster at integration points precisely because that's where missing context matters most.
They optimize for "works," not "secure." The training signal for these models is code that compiles, passes tests, and matches the prompt. There's no loss function for "this code exposes user data." A route that returns a database record without checking who's asking is a correct implementation of "write a route that returns a database record."
No dependency audit happens after generation. When Copilot suggests npm install some-package@2.3.1, it's suggesting the version it saw in training data. That version may have had a vulnerability discovered six months after the training cutoff. The AI doesn't know, and npm install doesn't tell you.
This is why AI-generated code bugs are predictable and pattern-based. They're not random errors — they're systematic gaps between "code that works" and "code that's secure."
#1: Hardcoded Secrets and API Keys
What it is: A secret — API key, database password, authentication token — written directly into source code instead of loaded from an environment variable at runtime.
This is the finding we see most often in scans. Nearly every AI-assisted repo we've scanned that was built quickly has at least one credential somewhere it shouldn't be. The scan data from 50 real Cursor repos put the number at 62%.
Why AI tools produce it: Cursor generates code in the context of the current file. When you paste your OpenAI key into a prompt and ask it to write an integration, it inlines that key in the most direct way it can. It has no global awareness that you have a .env.local file. It's optimizing for "code that matches what you showed me," not "code that handles secrets correctly."
What it looks like:
// BAD: AI-generated code with API key hardcoded in source
const openai = new OpenAI({ apiKey: "sk-proj-xxxxxxxxxxxxxxxxxxxxxxxx" });

// GOOD: Key loaded from environment variable at runtime
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

The real-world consequence: A secret committed to a public repo is usually scraped by automated bots within minutes. Leaked AWS keys have generated $50,000 bills overnight. Leaked Stripe keys can expose customer payment data. If you've already committed a secret, changing the file isn't enough — it's still in Git history. Rotate the key in your provider's dashboard immediately, then purge the history with BFG Repo-Cleaner. Also check if your .env file is publicly accessible on your deployed URL — that's a separate exposure path.
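A cheap guardrail on top of the environment-variable fix: validate required variables at startup so a missing .env fails loudly instead of silently shipping an undefined key. A minimal sketch — the helper name is illustrative, not part of any framework:

```javascript
// Fail fast when a required secret is missing from the environment.
function requireEnv(name) {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage at startup, before constructing any client:
// const openai = new OpenAI({ apiKey: requireEnv("OPENAI_API_KEY") });
```

A thrown error at boot is far easier to diagnose than an integration that fails with a cryptic 401 in production.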
#2: Outdated Dependencies with Known Vulnerabilities
What it is: A package in your package.json (or requirements.txt, go.mod, etc.) that has a documented security vulnerability that was discovered after it was installed.
Why AI tools produce it: Every AI coding tool has a training cutoff. When Copilot generates npm install express@4.17.1, it's suggesting a version that was current in its training data. Vulnerabilities get discovered after that cutoff date. The package gets committed and forgotten. Nobody runs npm audit because the app worked fine in development.
What makes this particularly sneaky: the code itself looks completely clean. The vulnerability isn't in anything you wrote — it's sitting in node_modules waiting for someone to send the right request.
What to look for: Check your package-lock.json against the OSV vulnerability database or run:
# Quick first pass — doesn't catch everything but flags the obvious ones
npm audit --audit-level=high

The real-world consequence: A vulnerable dependency can give an attacker access to your file system, let them forge authentication tokens, or enable server-side request forgery — depending on the package and the CVE. Supply chain attacks targeting popular npm packages are documented in multiple OWASP advisories. Static analysis catches more of these than npm audit alone.
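Under the hood, tools like npm audit compare your locked versions against an advisory database. As a rough illustration of what that check does, here's a sketch that matches a package-lock.json "packages" map against a hand-maintained advisory list — the package name and versions below are made up for illustration, not real CVEs:

```javascript
// Sketch: flag locked dependency versions that appear in an advisory list.
// The advisory entries below are made-up examples, not real CVEs.
const advisories = {
  "example-lib": ["2.3.1"], // versions with known issues (illustrative)
};

function findVulnerable(lockPackages) {
  const findings = [];
  for (const [path, info] of Object.entries(lockPackages)) {
    const name = path.replace(/^node_modules\//, "");
    if ((advisories[name] || []).includes(info.version)) {
      findings.push({ name, version: info.version });
    }
  }
  return findings;
}

// Usage with the "packages" object from a package-lock.json (v2/v3 format):
// const lock = JSON.parse(fs.readFileSync("package-lock.json", "utf8"));
// console.log(findVulnerable(lock.packages));
```

Real scanners resolve version ranges and pull advisories from live databases like OSV, but the core operation is this lookup.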
#3: Missing Authentication on API Routes
What it is: An API endpoint that should verify the requesting user's identity — but doesn't. Anyone who knows the URL can call it.
Why AI tools produce it: You ask for "a route that returns user data." The AI writes a route that returns user data. It doesn't know that this route should only return data for the currently authenticated user — unless your prompt made that explicit, and often it doesn't. The missing auth check isn't a bug in the AI's code generation. It's a missing requirement in the prompt, and the AI filled the gap with the simplest implementation.
What it looks like:
// BAD: No auth check — any caller gets this user's data
export async function GET(req: Request) {
const { searchParams } = new URL(req.url);
const userId = searchParams.get("userId");
const user = await db.users.findUnique({ where: { id: userId } });
return Response.json(user);
}

// GOOD: Session verified before any data is returned
export async function GET(req: Request) {
const session = await getServerSession(authOptions); // your NextAuth config
if (!session) return Response.json({ error: "Unauthorized" }, { status: 401 });
// Only return the authenticated user's own data
const user = await db.users.findUnique({ where: { id: session.user.id } });
return Response.json(user);
}

The real-world consequence: One unprotected route can expose your entire user table, payment records, or admin functionality to anyone who finds it. This vulnerability class — Broken Access Control — has been #1 on the OWASP Top 10 for years. If you're using Supabase, database-level access control is just as important — the Supabase RLS security checklist covers the 10 SQL checks you should run. Not sure if your routes are protected? Scan your repo free and you'll know in under a minute.
#4: SQL Injection Through Unsafe Input Handling
What it is: User-supplied input gets passed directly into a database query without sanitization, letting an attacker insert malicious database commands through form fields or URL parameters.
Why AI tools produce it: When you ask for a "custom query" or the prompt context doesn't make ORM usage obvious, AI tools sometimes generate raw SQL with string concatenation. The model has seen this pattern in its training data — it appears in tutorials, Stack Overflow answers, and older codebases. It generates what it's seen.
What it looks like:
// BAD: User input concatenated directly into SQL — classic injection target
const query = `SELECT * FROM users WHERE email = '${req.body.email}'`;
const result = await db.query(query);

// GOOD: Parameterized query — input never touches the SQL structure
const result = await db.query(
"SELECT * FROM users WHERE email = $1",
[req.body.email] // safely passed as a parameter, not concatenated
);

The real-world consequence: SQL injection lets an attacker bypass login, dump your entire database, or delete records. Modern ORMs like Prisma and Drizzle prevent this by default — but AI-generated raw SQL bypasses those protections. If any part of your codebase uses raw queries, check them.
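To make the danger concrete, here's what the concatenated query string actually becomes when an attacker submits a crafted email value — runnable as a plain demonstration, no database needed:

```javascript
// Demonstration: a crafted input rewrites the SQL the database receives.
const attackerInput = "' OR '1'='1";
const concatenated = `SELECT * FROM users WHERE email = '${attackerInput}'`;

console.log(concatenated);
// → SELECT * FROM users WHERE email = '' OR '1'='1'
// The quote in the input closed the string literal early; the rest is
// live SQL, and '1'='1' is true for every row in the table.
```

With a parameterized query, that same input is treated as a literal string value and matches nothing.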
#5: Cross-Site Scripting (XSS) via Unescaped Output
What it is: Cross-Site Scripting (XSS) happens when user-controlled input gets rendered directly into a webpage without being escaped — letting an attacker inject JavaScript that runs in other users' browsers.
Why AI tools produce it: AI generates front-end code that renders data back to the user. When that data comes from user input and the prompt doesn't specify sanitization, the AI takes the shortest path: render it directly. React escapes JSX output by default, which prevents most of this — but dangerouslySetInnerHTML bypasses that protection entirely, and AI tools sometimes reach for it when building rich-text displays or markdown renderers.
What to look for:
// BAD: dangerouslySetInnerHTML with unsanitized user content
function Comment({ content }: { content: string }) {
return <div dangerouslySetInnerHTML={{ __html: content }} />;
}

// GOOD: Sanitize before rendering, or avoid dangerouslySetInnerHTML entirely
import DOMPurify from "isomorphic-dompurify"; // SSR-safe for Next.js
function Comment({ content }: { content: string }) {
return <div dangerouslySetInnerHTML={{ __html: DOMPurify.sanitize(content) }} />;
}

The real-world consequence: An XSS vulnerability lets attackers steal session cookies, redirect users to phishing pages, or make requests on behalf of logged-in users. This is less common than items 1-4 in the repos we've scanned, but when it appears, it tends to appear in user-generated content features — comments, bios, notes, rich text editors.
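When the content doesn't need to be HTML at all, escaping beats sanitizing: render the user's text as text and the attack surface disappears. A minimal escaper sketch — React's JSX already does this for you, so this applies when you're building HTML strings outside JSX (email templates, server-rendered snippets):

```javascript
// Minimal HTML escaper — user text renders as text, never as markup.
function escapeHtml(input) {
  return input
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```

Order matters here: ampersands must be escaped first, otherwise the entities produced by the later replacements would get double-escaped.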
#6: Overly Permissive CORS Configuration
What it is: Cross-Origin Resource Sharing (CORS) is the browser mechanism that controls which websites can make requests to your API. A permissive CORS config — specifically Access-Control-Allow-Origin: * — tells the browser that any website can call your API.
Why AI tools produce it: The tutorials and Stack Overflow answers in the AI's training data are full of Access-Control-Allow-Origin: * as the quick fix for "my API isn't working from my front end." It's the pattern that resolves the immediate error, so that's what the model learned to suggest. The fact that this setting is dangerous in production isn't part of the prompt resolution.
What it looks like:
// BAD: Any origin can call this API — including malicious ones
res.setHeader("Access-Control-Allow-Origin", "*");

// GOOD: Restrict to your own domain(s) only
const allowedOrigins = ["https://yourdomain.com", "https://app.yourdomain.com"];
const origin = req.headers.get("origin") ?? "";
if (allowedOrigins.includes(origin)) {
res.setHeader("Access-Control-Allow-Origin", origin);
}

The real-world consequence: A wildcard CORS config combined with permissive credential handling lets malicious websites make authenticated requests to your API from a victim's browser. For APIs that rely on cookie-based auth, this is particularly dangerous.
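One refinement on the echo-the-origin pattern: send Vary: Origin alongside it, so shared caches don't replay a response meant for one origin to another. A sketch of the allowlist check as a reusable helper (the domain names are placeholders):

```javascript
// Sketch: allowlist-based CORS helper. Domain names are placeholders.
const allowedOrigins = new Set([
  "https://yourdomain.com",
  "https://app.yourdomain.com",
]);

function corsHeaders(origin) {
  const headers = { Vary: "Origin" }; // keep caches origin-aware
  if (allowedOrigins.has(origin)) {
    headers["Access-Control-Allow-Origin"] = origin;
  }
  return headers;
}
```

An unknown origin simply gets no Access-Control-Allow-Origin header, and the browser blocks the cross-origin read.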
#7: Exposed Debug Endpoints and Stack Traces
What it is: Error handling code that returns full stack traces, internal file paths, or debug information in API responses — information that's useful for developers but equally useful for attackers mapping your system.
Why AI tools produce it: AI generates helpful error handling. When you ask it to write a catch block, it writes a catch block that gives you the most information possible — which often means returning error.message or the full error object directly. In development, this is exactly what you want. In production, it's a data leak.
What it looks like:
// BAD: Full error details returned to the caller — maps your internals for attackers
export async function POST(req: Request) {
try {
const result = await processPayment(req);
return Response.json(result);
} catch (error: any) {
return Response.json({ error: error.message, stack: error.stack }, { status: 500 });
}
}

// GOOD: Generic message to the caller, full details logged server-side only
export async function POST(req: Request) {
try {
const result = await processPayment(req);
return Response.json(result);
} catch (error) {
console.error("[payment] Unexpected error:", error); // full details in your logs
return Response.json({ error: "Payment processing failed" }, { status: 500 });
}
}

The real-world consequence: Stack traces expose file paths, library versions, and database query structures. An attacker who knows your tech stack, your file layout, and which libraries you use can target known vulnerabilities in those specific versions. Debug endpoints that were left on in production have been the entry point for several high-profile breaches. It's a low-severity finding that enables high-severity attacks.
Not sure where your app stands on the full security spectrum? Take our free Security Score Calculator — 10 questions, instant results, no signup.
How to Find All 7 in Your Repo Without Reading Every Line
Checking for all 7 patterns manually is possible. It's also slow, inconsistent, and easy to miss things — especially in a large project where AI has generated hundreds of files you've barely reviewed.
Here's the faster path. Data Hogo runs over 350 security checks, including all 7 vulnerability types above, in parallel in under 60 seconds.
Step 1: Go to datahogo.com and connect your GitHub account. It's a standard OAuth flow — read-only access to your code.
Step 2: Select the repository you want to scan. Public repos work on the free plan.
Step 3: Start the scan. The scanner runs five engines in parallel: secrets detection (Gitleaks + custom pattern matching), dependency auditing (npm audit + OSV database), code pattern analysis (Semgrep with 250+ security rules), configuration file review, and URL/header analysis.
Step 4: Read your security score and findings list. Each finding has a plain-English explanation: what it is, where it is in your code, why it matters, and what to change. No jargon walls, no CVSS ratings.
Step 5: Fix the critical findings first. Exposed secrets get rotated. Vulnerable packages get updated. Unprotected routes get session checks added. The scan prioritizes by severity so you know where to start.
The hardcoded secrets and missing auth findings — items 1 and 3 on this list — are the ones that cause immediate damage when exploited. Start there.
The free plan gives you 3 scans per month on 1 public repository. No credit card required.
Check your code — it's free, no credit card →
Frequently Asked Questions
What are the most common security vulnerabilities in AI-generated code?
The most common AI code vulnerabilities are hardcoded secrets and API keys, outdated dependencies with known vulnerabilities, missing authentication checks on API routes, SQL injection from unsafe input handling, Cross-Site Scripting (XSS) from unescaped user output, overly permissive CORS configuration, and exposed debug endpoints that leak stack traces. This order reflects real-world frequency — items 1-4 appear in the majority of AI-assisted repos we've scanned, while items 5-7 are less frequent but still consistently present.
Why does AI code have security bugs?
AI coding tools learn from training data that includes decades of insecure code patterns from public repositories and tutorials. They also operate without awareness of your project's security context — they can't see your .env file, don't know your auth setup, and generate code file-by-file without understanding your full application. The optimization target is code that works and matches your prompt, not code that's secure. LLM code security flaws are predictable and pattern-based as a result, which is what makes them scannable.
Does GitHub Copilot write insecure code?
Copilot can write secure code, but it frequently produces insecure patterns — especially around secrets, dependency selection, and authentication. The Veracode State of Software Security 2025 report found that 45% of AI-generated code has at least one vulnerability. Copilot doesn't audit your dependencies, can't see your .env file, and generates code based on the current file's context without a view of your full security model. Its output needs a security pass before you ship anything that handles user data.
How do I find AI code vulnerabilities in my repo?
The fastest approach is a static analysis scan that covers the full surface: secrets detection, dependency auditing, code pattern analysis, and configuration review. Data Hogo runs all of these in parallel and returns a prioritized findings list in under 60 seconds. Manual review is possible but slow — there are 7 common patterns to check and potentially hundreds of files to cover. Scan your repo free →
Is ChatGPT code safe to use in production?
ChatGPT-generated code has the same structural problems as other LLM tools — it doesn't see your environment, can't audit your dependencies, and defaults to inline API keys when not given explicit context about secrets management. The code may work perfectly and still expose credentials or skip authentication. Run a security scan before shipping anything that handles user accounts, payments, or external service integrations. If your project touches real user data, a 60-second scan is a reasonable step before you go live.
What security problems does Cursor AI code have?
Based on scanning 50 public Cursor repos, the most common Cursor AI security problems are exposed secrets (62% of repos), insecure dependencies with known CVEs (56%), and missing security headers (82%). Missing auth checks appeared in 28% of repos. Cursor is particularly prone to inlining API keys because it generates code file-by-file without global context about your .env setup — a mechanism explained in more detail in the vibe coding security risks post.
Your first 3 scans are free. No credit card. No sales call.
Related Posts
Cursor Code Security Scan 2026: 50 Repos Analyzed
I ran a cursor code security scan on 50 public GitHub repos built with Cursor AI. Here's the exact breakdown of findings — and how to scan your own repo free.
Vibe Coding Security Risks in 2026: 45% of AI Code Has Flaws
45% of AI-generated code has at least one vulnerability. Here are the 5 most common vibe coding security risks — and how to scan your repo free in 60 seconds.