Medium · CWE-770 · OWASP LLM10:2025

No AI Rate Limiting

Your app makes AI API calls with no per-user limits, letting a single user (or bot) trigger thousands of requests and drain your API budget in minutes.

How It Works

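AI API calls are expensive. Providers bill per token, and frontier models such as GPT-4o and Claude Opus have list prices ranging from a few dollars to tens of dollars per million tokens, depending on the model and on whether the tokens are input or output. Without rate limiting, a free user can write a script that hammers your /api/ai endpoint all day. This is a cost attack: they pay nothing, you pay the bill. It's also a denial-of-service vector: once that traffic exhausts your provider's rate limits or quota, requests from legitimate users start failing.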
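To make the cost attack concrete, here is a rough back-of-the-envelope estimate. The prices, token counts, and request rate are illustrative assumptions, not any provider's actual rates, and `estimateCostUsd` is a hypothetical helper written for this sketch.

```typescript
// Hypothetical prices in USD per million tokens; substitute your provider's current rates.
interface ModelPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Estimated cost in USD of `requests` calls, each using the given token counts.
function estimateCostUsd(
  requests: number,
  inputTokensPerRequest: number,
  outputTokensPerRequest: number,
  pricing: ModelPricing,
): number {
  const inputCost = (requests * inputTokensPerRequest * pricing.inputPerMillion) / 1000000;
  const outputCost = (requests * outputTokensPerRequest * pricing.outputPerMillion) / 1000000;
  return inputCost + outputCost;
}

// Illustrative scenario: a script sending 10 requests per second for one hour,
// each with 1,000 input tokens and 500 output tokens.
const examplePricing: ModelPricing = { inputPerMillion: 5, outputPerMillion: 15 };
const requestsPerHour = 10 * 60 * 60; // 36,000 requests
const hourlyCost = estimateCostUsd(requestsPerHour, 1000, 500, examplePricing);
console.log(`Estimated cost for one hour: $${hourlyCost.toFixed(2)}`); // prints $450.00
```

Under these assumed numbers a single unthrottled script costs about $450 per hour, which is why the limit has to be enforced before the provider call rather than discovered on the invoice.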

Vulnerable Code
// BAD: no rate limiting on AI endpoint
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

export async function POST(req: Request) {
  const { message } = await req.json();
  // Any caller who reaches this handler can invoke it an unlimited number of times
  const response = await openai.chat.completions.create({
    model: 'gpt-4o', // example model
    messages: [{ role: 'user', content: message }],
  });
  return Response.json({ result: response.choices[0].message.content });
}
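
Secure Code
// GOOD: enforce per-user rate limits before calling the AI
// getAuthUser, getUserAiUsage and incrementAiUsage are app-specific helpers; `openai` is an initialized OpenAI client
export async function POST(req: Request) {
  const user = await getAuthUser(req);
  if (!user) return Response.json({ error: 'Unauthorized' }, { status: 401 });
  const usage = await getUserAiUsage(user.id); // check Redis/DB counter
  if (usage.requestsThisHour >= 20) {
    return Response.json({ error: 'Rate limit exceeded' }, { status: 429 });
  }
  // In production, make the check and the increment one atomic operation to avoid races
  await incrementAiUsage(user.id);
  const { message } = await req.json();
  const response = await openai.chat.completions.create({
    model: 'gpt-4o', // example model
    messages: [{ role: 'user', content: message }],
  });
  return Response.json({ result: response.choices[0].message.content });
}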
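
The usage helpers in the handler above are left undefined, so here is one possible shape for the counter behind them: a minimal in-memory fixed-window limiter. `FixedWindowRateLimiter` and its method names are hypothetical, and the sketch assumes a single server process; with several instances the counter would need to live in a shared store such as Redis.

```typescript
interface WindowState {
  windowStart: number; // epoch milliseconds when the current window began
  count: number; // requests counted in the current window
}

interface RateLimitResult {
  allowed: boolean;
  remaining: number;
  retryAfterSeconds: number;
}

// In-memory fixed-window limiter. Entries are never evicted in this sketch,
// so a long-running process would also need periodic cleanup.
class FixedWindowRateLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Checks and counts the request in one synchronous step, so two concurrent
  // requests in the same process cannot both act on a stale count.
  consume(userId: string): RateLimitResult {
    const currentTime = this.now();
    let state = this.windows.get(userId);
    if (!state || currentTime - state.windowStart >= this.windowMs) {
      // First request from this user, or the previous window has expired.
      state = { windowStart: currentTime, count: 0 };
      this.windows.set(userId, state);
    }
    if (state.count >= this.limit) {
      const msUntilReset = state.windowStart + this.windowMs - currentTime;
      return { allowed: false, remaining: 0, retryAfterSeconds: Math.ceil(msUntilReset / 1000) };
    }
    state.count += 1;
    return { allowed: true, remaining: this.limit - state.count, retryAfterSeconds: 0 };
  }
}

// Example: 20 requests per user per hour, matching the limit used above.
const aiLimiter = new FixedWindowRateLimiter(20, 60 * 60 * 1000);
```

In a route handler, the result of `aiLimiter.consume(user.id)` would be checked before the provider call; when `allowed` is false, returning status 429 with a `Retry-After` header set to `retryAfterSeconds` tells well-behaved clients when to try again.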

Real-World Example

Multiple indie developers have reported waking up to $500-$2000 AI API bills after forgetting to add rate limiting. Bots scrape public apps for unprotected AI endpoints within days of launch. OpenAI now offers spend limits, but a spend limit only caps the bill. Once an attacker pushes you to that cap, your own requests can start failing and the app breaks for real users.

How to Prevent It

  • Implement per-user rate limits (e.g., 20 requests/hour for free users, 200 for paid users), tracked in Redis or your database
  • Set hard spending limits on your AI provider dashboard as a safety net
  • Add authentication to all AI endpoints — never expose them publicly without auth
  • Track token usage per user, not just request count (one request can consume thousands of tokens)
  • Alert when usage spikes — a user making 10x their normal usage is a red flag
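
The token-tracking and spike-alert items above can be sketched together. This is a minimal in-memory illustration, not a production design: `TokenBudgetTracker`, its method names, and the thresholds are all assumptions made for the example, and the per-user daily average is assumed to be computed elsewhere (for example by a nightly job).

```typescript
interface UsageRecord {
  tokensToday: number;
  dailyAverageTokens: number; // hypothetical rolling average, maintained elsewhere
}

class TokenBudgetTracker {
  private readonly usage = new Map<string, UsageRecord>();
  private readonly dailyTokenLimit: number;
  private readonly spikeMultiplier: number;

  constructor(dailyTokenLimit: number, spikeMultiplier: number = 10) {
    this.dailyTokenLimit = dailyTokenLimit;
    this.spikeMultiplier = spikeMultiplier;
  }

  private getRecord(userId: string): UsageRecord {
    let record = this.usage.get(userId);
    if (!record) {
      record = { tokensToday: 0, dailyAverageTokens: 0 };
      this.usage.set(userId, record);
    }
    return record;
  }

  // True while the user has consumed fewer tokens today than the daily limit.
  hasBudget(userId: string): boolean {
    return this.getRecord(userId).tokensToday < this.dailyTokenLimit;
  }

  // Record the tokens actually consumed, as reported by the provider after a call.
  recordUsage(userId: string, totalTokens: number): void {
    this.getRecord(userId).tokensToday += totalTokens;
  }

  // Store the user's typical daily usage so spikes can be measured against it.
  setDailyAverage(userId: string, averageTokens: number): void {
    this.getRecord(userId).dailyAverageTokens = averageTokens;
  }

  // True when today's usage is at least `spikeMultiplier` times the user's average.
  isSpiking(userId: string): boolean {
    const record = this.getRecord(userId);
    return (
      record.dailyAverageTokens > 0 &&
      record.tokensToday >= record.dailyAverageTokens * this.spikeMultiplier
    );
  }

  // Clear today's counters at the start of a new day; averages are kept.
  resetDay(): void {
    for (const record of this.usage.values()) {
      record.tokensToday = 0;
    }
  }
}
```

With the OpenAI Node SDK, the number passed to `recordUsage` could come from `response.usage?.total_tokens` on the chat completion response. A handler would call `hasBudget` before the request, `recordUsage` after it, and send an alert whenever `isSpiking` returns true.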

Affected Technologies

Node.js, Python
