AI Model Fallback Insecure
When the primary AI model fails, your app silently falls back to a weaker or unvalidated model that bypasses your safety configurations.
How It Works
You carefully configure your primary model with system prompts, safety filters, and output validation. Then you add a try/catch fallback to a different model — without applying the same safety configurations. An attacker who can trigger errors in the primary model (through crafted inputs that cause timeouts or rate limit errors) can force execution through the unvalidated fallback path.
```javascript
// BAD: fallback model has no system prompt or safety config
try {
  response = await anthropic.messages.create({ model: 'claude-opus-4-6', system: SAFETY_PROMPT, ...params });
} catch {
  // Fallback with no safety config applied
  response = await openai.chat.completions.create({ model: 'gpt-4o', messages: [{ role: 'user', content: userInput }] });
}
```

```javascript
// GOOD: apply the same safety config to both primary and fallback
const safeParams = { system: SAFETY_PROMPT, maxTokens: 1024 };
try {
  response = await callWithSafetyConfig(anthropic, safeParams, userInput);
} catch {
  // Fallback uses identical safety configuration
  response = await callWithSafetyConfig(openai, safeParams, userInput);
}
```

Real-World Example
A customer service bot was configured with strict topic filters on its primary model. The fallback path, added as a quick fix for reliability, had no filters. Testers discovered they could force the fallback by sending very long inputs that triggered a timeout, and then received unconstrained responses.
How to Prevent It
- Apply identical safety configurations (system prompts, output validation, rate limits) to all fallback models
- Log whenever a fallback is triggered so you can detect if it's being exploited
- Test your fallback paths explicitly — they often have different behavior than the primary
- Consider failing closed instead of falling back if safety is critical to your use case
- Use a single abstraction layer that applies safety config regardless of which model is chosen
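The last point is the structural fix: if safety configuration is applied in exactly one place, no fallback path can skip it. Below is a minimal sketch of such a layer. The `Provider` interface, `callWithSafetyConfig`, `completeWithFallback`, and `SAFETY_PROMPT` are illustrative names, not a real SDK API; a real implementation would wrap the Anthropic and OpenAI clients behind this interface.

```typescript
// Illustrative safety config shared by every model call.
type SafetyConfig = { system: string; maxTokens: number };

// Hypothetical provider interface; real code would adapt each vendor SDK to it.
interface Provider {
  name: string;
  complete(opts: { system: string; maxTokens: number; input: string }): Promise<string>;
}

const SAFETY_PROMPT = "You are a support bot. Refuse off-topic requests.";
const safeParams: SafetyConfig = { system: SAFETY_PROMPT, maxTokens: 1024 };

// The single choke point: safety config is applied here, so no call path can omit it.
async function callWithSafetyConfig(provider: Provider, cfg: SafetyConfig, userInput: string): Promise<string> {
  return provider.complete({ system: cfg.system, maxTokens: cfg.maxTokens, input: userInput });
}

async function completeWithFallback(primary: Provider, fallback: Provider, cfg: SafetyConfig, userInput: string): Promise<string> {
  try {
    return await callWithSafetyConfig(primary, cfg, userInput);
  } catch (err) {
    // Log every fallback so you can detect if the error path is being exploited.
    console.warn(`fallback triggered: ${primary.name} -> ${fallback.name}`, err);
    return callWithSafetyConfig(fallback, cfg, userInput);
  }
}

// Demo with stub providers: the primary always fails; the fallback echoes the
// start of its system prompt, showing the safety config survived the fallback.
const primary: Provider = {
  name: "primary",
  complete: async () => { throw new Error("timeout"); },
};
const fallback: Provider = {
  name: "fallback",
  complete: async ({ system, input }) => `[${system.slice(0, 7)}] ${input}`,
};

completeWithFallback(primary, fallback, safeParams, "hello").then((r) => console.log(r));
```

Because both branches of `completeWithFallback` go through the same function, adding a third or fourth fallback model cannot silently drop the system prompt, and the warning log gives you a metric to alarm on if fallback frequency spikes.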
Data Hogo detects this vulnerability automatically.
Related Vulnerabilities
Prompt Injection (high)
User input is concatenated directly into an LLM prompt, letting attackers override your instructions and make the AI do things you never intended.

PII Leakage to AI Models (high)
Your app sends personally identifiable information, such as emails, names, passwords, and phone numbers, to external AI APIs, exposing user data to third-party model providers.

AI Response Without Validation (medium)
LLM output is rendered or executed directly without checking whether it matches the expected format or contains harmful content.

AI API Key in Frontend (critical)
Your OpenAI, Anthropic, or other AI API key is exposed in client-side code, where anyone can steal it and rack up charges on your account.