Severity: Low · CWE-636

Insecure AI Model Fallback

When the primary AI model fails, your app silently falls back to a weaker or unvalidated model that bypasses your safety configurations.

How It Works

You carefully configure your primary model with system prompts, safety filters, and output validation. Then you add a try/catch fallback to a different model — without applying the same safety configurations. An attacker who can trigger errors in the primary model (through crafted inputs that cause timeouts or rate limit errors) can force execution through the unvalidated fallback path.

Vulnerable Code
// BAD: fallback model has no system prompt or safety config
try {
  response = await anthropic.messages.create({ model: 'claude-opus-4-6', system: SAFETY_PROMPT, ...params });
} catch {
  // Fallback with no safety config applied
  response = await openai.chat.completions.create({ model: 'gpt-4o', messages: [{ role: 'user', content: userInput }] });
}
Secure Code
// GOOD: apply the same safety config to both primary and fallback
const safeParams = { system: SAFETY_PROMPT, maxTokens: 1024 };
try {
  response = await callWithSafetyConfig(anthropic, safeParams, userInput);
} catch {
  // Fallback uses identical safety configuration
  response = await callWithSafetyConfig(openai, safeParams, userInput);
}
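The helper above is not defined in this entry, so here is a minimal sketch of what a `callWithSafetyConfig` abstraction could look like. The client shape (`defaultModel`, `complete`) and the `validate` hook are assumptions for illustration, not the real Anthropic or OpenAI SDK interfaces; the point is that every provider call funnels through one code path that injects the safety configuration.

```javascript
// Hypothetical sketch of a callWithSafetyConfig helper.
// `client` is a stand-in adapter (not the real SDK object) exposing a
// uniform complete() method, so the safety config cannot be forgotten
// on any path, primary or fallback.
async function callWithSafetyConfig(client, safeParams, userInput) {
  const request = {
    model: client.defaultModel,
    // Safety configuration is injected here, unconditionally.
    system: safeParams.system,
    max_tokens: safeParams.maxTokens,
    messages: [{ role: 'user', content: userInput }],
  };
  const response = await client.complete(request);
  // Output validation also lives in the shared path, so a fallback
  // model cannot bypass it.
  if (typeof safeParams.validate === 'function' && !safeParams.validate(response)) {
    throw new Error('Response failed output validation');
  }
  return response;
}
```

With adapters like this, swapping or adding a fallback provider means writing a small `complete()` shim, not re-deriving the safety setup.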

Real-World Example

A customer service bot was configured with strict topic filters on its primary model. The fallback path, added as a quick fix for reliability, had no filters. Testers discovered they could force the fallback by sending very long inputs that triggered a timeout, then receive unconstrained responses.

How to Prevent It

  • Apply identical safety configurations (system prompts, output validation, rate limits) to all fallback models
  • Log whenever a fallback is triggered so you can detect if it's being exploited
  • Test your fallback paths explicitly — they often have different behavior than the primary
  • Consider failing closed instead of falling back if safety is critical to your use case
  • Use a single abstraction layer that applies safety config regardless of which model is chosen
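The logging and fail-closed points above could be combined in one wrapper. The function and option names below (`completeWithFallback`, `failClosed`) are hypothetical, and the primary/fallback arguments are assumed to be plain async functions; it is a sketch of the pattern, not a library API.

```javascript
// Hypothetical sketch: log every fallback activation, and fail closed
// (refuse to answer) instead of degrading when the caller marks the
// request as safety-critical.
async function completeWithFallback(primary, fallback, safeParams, userInput, opts = {}) {
  try {
    return await primary(safeParams, userInput);
  } catch (err) {
    // Log every trigger so an attacker repeatedly forcing the fallback
    // path shows up in monitoring.
    console.warn('model_fallback_triggered', { reason: err.message });
    if (opts.failClosed) {
      // Safety-critical request: refuse rather than fall back.
      throw new Error('Primary model unavailable; refusing to fall back');
    }
    // Fallback receives the identical safety configuration.
    return await fallback(safeParams, userInput);
  }
}
```

A spike in `model_fallback_triggered` log entries is the signal to investigate whether the errors are organic or attacker-induced.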

Affected Technologies

Node.js, Python

Data Hogo detects this vulnerability automatically.
