Severity: Low · CWE-636

Insecure AI Model Fallback

When the primary AI model fails, your app silently falls back to a weaker or unvalidated model that bypasses your safety configurations.

How It Works

You carefully configure your primary model with system prompts, safety filters, and output validation. Then you add a try/catch fallback to a different model — without applying the same safety configurations. An attacker who can trigger errors in the primary model (through crafted inputs that cause timeouts or rate limit errors) can force execution through the unvalidated fallback path.

Vulnerable Code
// BAD: fallback model has no system prompt or safety config
try {
  response = await anthropic.messages.create({ model: 'claude-opus-4-6', system: SAFETY_PROMPT, ...params });
} catch {
  // Fallback with no safety config applied
  response = await openai.chat.completions.create({ model: 'gpt-4o', messages: [{ role: 'user', content: userInput }] });
}
Secure Code
// GOOD: apply the same safety config to both primary and fallback
const safeParams = { system: SAFETY_PROMPT, maxTokens: 1024 };
try {
  response = await callWithSafetyConfig(anthropic, safeParams, userInput);
} catch {
  // Fallback uses identical safety configuration
  response = await callWithSafetyConfig(openai, safeParams, userInput);
}
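The helper above is not defined in this entry, so here is a minimal sketch of what a `callWithSafetyConfig` abstraction could look like. The client shape (`defaultModel`, `complete`) and the `validate` hook are assumptions for illustration, not the real Anthropic or OpenAI SDK interfaces; the point is that every provider call funnels through one code path that injects the safety configuration.

```javascript
// Hypothetical sketch of a callWithSafetyConfig helper.
// `client` is a stand-in adapter (not the real SDK object) exposing a
// uniform complete() method, so the safety config cannot be forgotten
// on any path, primary or fallback.
async function callWithSafetyConfig(client, safeParams, userInput) {
  const request = {
    model: client.defaultModel,
    // Safety configuration is injected here, unconditionally.
    system: safeParams.system,
    max_tokens: safeParams.maxTokens,
    messages: [{ role: 'user', content: userInput }],
  };
  const response = await client.complete(request);
  // Output validation also lives in the shared path, so a fallback
  // model cannot bypass it.
  if (typeof safeParams.validate === 'function' && !safeParams.validate(response)) {
    throw new Error('Response failed output validation');
  }
  return response;
}
```

With adapters like this, swapping or adding a fallback provider means writing a small `complete()` shim, not re-deriving the safety setup.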

Real-World Example

A customer service bot was configured with strict topic filters on its primary model. The fallback path, added as a quick fix for reliability, had no filters. Testers discovered they could force the fallback by sending very long inputs that triggered a timeout, then receive unconstrained responses.

How to Prevent It

  • Apply identical safety configurations (system prompts, output validation, rate limits) to all fallback models
  • Log whenever a fallback is triggered so you can detect if it's being exploited
  • Test your fallback paths explicitly — they often have different behavior than the primary
  • Consider failing closed instead of falling back if safety is critical to your use case
  • Use a single abstraction layer that applies safety config regardless of which model is chosen
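The logging and fail-closed points above could be combined in one wrapper. The function and option names below (`completeWithFallback`, `failClosed`) are hypothetical, and the primary/fallback arguments are assumed to be plain async functions; it is a sketch of the pattern, not a library API.

```javascript
// Hypothetical sketch: log every fallback activation, and fail closed
// (refuse to answer) instead of degrading when the caller marks the
// request as safety-critical.
async function completeWithFallback(primary, fallback, safeParams, userInput, opts = {}) {
  try {
    return await primary(safeParams, userInput);
  } catch (err) {
    // Log every trigger so an attacker repeatedly forcing the fallback
    // path shows up in monitoring.
    console.warn('model_fallback_triggered', { reason: err.message });
    if (opts.failClosed) {
      // Safety-critical request: refuse rather than fall back.
      throw new Error('Primary model unavailable; refusing to fall back');
    }
    // Fallback receives the identical safety configuration.
    return await fallback(safeParams, userInput);
  }
}
```

A spike in `model_fallback_triggered` log entries is the signal to investigate whether the errors are organic or attacker-induced.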

Affected Technologies

Node.js, Python

Data Hogo detects this vulnerability automatically.
