PDF Generation Injection
Injecting HTML or JavaScript into PDF generation templates allows attackers to read server-side files, make internal network requests, or execute scripts in the PDF viewer.
How It Works
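Many applications generate PDFs from HTML templates using tools like Puppeteer, wkhtmltopdf, or WeasyPrint. When user input is inserted into the HTML template without escaping or sanitization, attackers can inject their own HTML and, in engines that run it, JavaScript. Because the PDF engine renders the HTML on the server, injected content is processed from the server's position: it can reach the network the server can reach and, depending on the engine's configuration, the local file system. An attacker can use <script> tags to request local files such as file:///etc/passwd with XMLHttpRequest, make requests to internal services (SSRF), or exfiltrate environment variables. Engines that do not execute JavaScript, such as WeasyPrint, can still be abused through injected markup that loads local or internal URLs. The resulting PDF contains the leaked data, which the attacker then downloads.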
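To make the mechanism concrete, the following minimal sketch shows how naive string interpolation turns a request field into live markup, and how escaping renders the same input inert. The `escapeHtml` helper and the sample payload are hypothetical and for illustration only; they are not part of any library named above.

```javascript
// Hypothetical helper: escape the five HTML-significant characters so that
// user input is rendered as text instead of being parsed as markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Illustrative attacker-controlled value (harmless placeholder script).
const customerName = '<script>document.write("injected")</script>';

// Naive interpolation: the payload becomes a real <script> element.
const unsafeHtml = `<h1>Invoice for ${customerName}</h1>`;

// Escaped interpolation: the payload is displayed as plain text.
const safeHtml = `<h1>Invoice for ${escapeHtml(customerName)}</h1>`;

console.log(unsafeHtml);
console.log(safeHtml);
```

Escaping is the simpler choice when a field is meant to hold plain text, such as a customer name; a sanitizer is only needed when some user-supplied markup must survive.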
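// Vulnerable: request fields are interpolated into the HTML without escaping.
const express = require('express');
const puppeteer = require('puppeteer');

const app = express();
app.use(express.json());

app.post('/invoice', async (req, res) => {
  const { customerName, items } = req.body;
  const html = `<h1>Invoice for ${customerName}</h1>
    <ul>${items.map(i => `<li>${i}</li>`).join('')}</ul>`;
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.setContent(html); // injected markup and scripts are rendered here
    const pdf = await page.pdf();
    res.type('application/pdf').send(Buffer.from(pdf));
  } finally {
    await browser.close(); // always release the browser process
  }
});

// Safer: sanitize every field and disable JavaScript before rendering.
const express = require('express');
const puppeteer = require('puppeteer');
const DOMPurify = require('isomorphic-dompurify');

const app = express();
app.use(express.json());

app.post('/invoice', async (req, res) => {
  // Coerce to strings and tolerate a missing or non-array items field.
  const name = DOMPurify.sanitize(String(req.body.customerName ?? ''));
  const rawItems = Array.isArray(req.body.items) ? req.body.items : [];
  const items = rawItems.map(i => DOMPurify.sanitize(String(i)));
  const html = `<h1>Invoice for ${name}</h1>
    <ul>${items.map(i => `<li>${i}</li>`).join('')}</ul>`;
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.setJavaScriptEnabled(false); // any script that slips through will not run
    await page.setContent(html);
    const pdf = await page.pdf();
    // Buffer.from covers Puppeteer versions that return a Uint8Array.
    res.type('application/pdf').send(Buffer.from(pdf));
  } finally {
    await browser.close(); // always release the browser process
  }
});

Real-World Example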
In 2022, researchers demonstrated SSRF attacks through PDF generation in multiple SaaS platforms. By injecting HTML like <iframe src='http://169.254.169.254/latest/meta-data/'>, they accessed AWS metadata endpoints and extracted IAM credentials from PDF invoices and reports.
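A renderer can refuse targets like the metadata endpoint above before fetching them. The sketch below is one possible check, not a complete SSRF defense: the function name `isInternalTarget` is hypothetical, the range list is an assumption covering common loopback, private, and link-local addresses, and it does not resolve DNS, so a public hostname that points at an internal address is not caught.

```javascript
// Hypothetical check: returns true when a URL should NOT be fetched by the
// PDF renderer. Anything that is not plain http(s) to a public-looking host
// is treated as internal. Hostnames are not resolved (assumption: DNS-level
// protections are handled elsewhere).
function isInternalTarget(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return true; // unparseable input is treated as unsafe
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    return true; // file:, ftp:, and other schemes
  }
  const host = url.hostname.replace(/^\[|\]$/g, '').toLowerCase(); // strip IPv6 brackets
  if (host === 'localhost' || host.endsWith('.localhost')) {
    return true;
  }
  if (/^\d{1,3}(\.\d{1,3}){3}$/.test(host)) {
    const [a, b] = host.split('.').map(Number);
    return (
      a === 0 || a === 10 || a === 127 ||   // "this" network, private, loopback
      (a === 169 && b === 254) ||           // link-local, incl. cloud metadata
      (a === 172 && b >= 16 && b <= 31) ||  // private
      (a === 192 && b === 168)              // private
    );
  }
  if (host.includes(':')) {
    return (
      host === '::' || host === '::1' ||    // unspecified, loopback
      /^fe[89ab]/.test(host) ||             // link-local fe80::/10
      /^f[cd]/.test(host) ||                // unique local fc00::/7
      host.startsWith('::ffff:')            // IPv4-mapped, rejected conservatively
    );
  }
  return false;
}

console.log(isInternalTarget('http://169.254.169.254/latest/meta-data/'));
console.log(isInternalTarget('https://example.com/logo.png'));
```

A deny-by-default policy, where the renderer may load only a short list of known asset hosts, is generally easier to reason about than a blocklist like this one.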
How to Prevent It
- Sanitize all user input with DOMPurify before inserting it into HTML templates, or HTML-escape it when the field should contain plain text only
- Disable JavaScript execution in the PDF rendering engine
- Use text-only PDF libraries instead of HTML-to-PDF converters when possible
- Block network access from the PDF rendering process using sandboxing
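As one way to approach the last point from inside the application, Puppeteer's request interception can abort every request the page tries to make. The sketch below assumes a template that needs no external assets; the function name `blockRendererNetwork` and its default policy (allow only `data:` URLs) are hypothetical. It complements, rather than replaces, OS-level sandboxing or network egress rules.

```javascript
// Hypothetical guard: abort every request made by the page unless the
// supplied policy allows it. `page` is expected to be a Puppeteer Page
// (only setRequestInterception and on('request') are used).
async function blockRendererNetwork(
  page,
  isAllowed = (url) => url.startsWith('data:') // assumption: inline assets only
) {
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    if (isAllowed(request.url())) {
      request.continue();
    } else {
      request.abort(); // covers http(s), file:, and anything else not allowed
    }
  });
}
```

In the invoice handler this would be called right after `browser.newPage()` and before `page.setContent(html)`, so that the guard is in place before any injected markup is parsed.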
Affected Technologies
Data Hogo detects this vulnerability automatically.
Related Vulnerabilities
Path Traversal (high)
File paths constructed with unvalidated user input allow attackers to read or write arbitrary files on the server using ../ sequences.
File Upload No Validation (high)
Accepting file uploads without verifying type, size, or content allows attackers to upload malicious executables, web shells, or oversized files that crash the server.
SVG with JavaScript (medium)
Accepting SVG uploads without sanitization allows attackers to embed JavaScript in SVG files, enabling XSS attacks when the SVG is rendered in a browser.
EXIF Not Stripped (low)
Images served without stripping EXIF metadata can leak GPS coordinates, device information, timestamps, and other sensitive data about the person who took the photo.