Portable Data exFiltration: XSS for PDFs

Conference: Black Hat Europe 2020



The presentation discusses the vulnerabilities of PDFs and how they can be exploited through injection attacks. The speaker demonstrates various techniques for injecting code into PDFs and stealing their contents.
  • PDFs can be vulnerable to injection attacks if user input is not properly validated
  • The two most common readers, Acrobat and Chrome's PDFium, are both susceptible to these injections
  • Injection attacks can be used to steal the contents of a PDF or execute code on a victim's machine
  • Techniques for injecting code into PDFs include using annotations, form buttons, and text fields
  • The speaker provides examples of how to extract text from a PDF and perform SSRF attacks through injection
  • To prevent PDF injection attacks, libraries should escape parentheses and backslashes in PDF strings and validate user input
  • The speaker was inspired by the work of other researchers in the field, including InsertScript (Alex Inführ) and Ange Albertini
The speaker demonstrates how they were able to inject code into a PDF and steal its contents by using a form button and annotations. They also show how they were able to bypass a WAF by using cached resources. The speaker emphasizes the importance of properly validating user input and escaping PDF strings to prevent injection attacks.
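
The escaping advice can be illustrated with a minimal sketch. It assumes the untrusted value ends up inside a PDF literal string, that is, between "(" and ")", as in the URI of a link annotation. The function names escape_pdf_string and uri_action are hypothetical and do not come from the talk or from any particular PDF library.

```python
def escape_pdf_string(value: str) -> str:
    """Escape a value so it can be placed between ( and ) in a PDF literal string."""
    # The backslash is replaced first so that the escapes added for the
    # parentheses are not themselves doubled.
    return (
        value.replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )


def uri_action(url: str) -> str:
    """Build a URI action dictionary for a link annotation (hypothetical helper)."""
    return f"/A << /S /URI /URI ({escape_pdf_string(url)}) >>"


if __name__ == "__main__":
    # An unescaped ")" would close the string early; the escaped form
    # keeps the whole input inside the literal string.
    print(uri_action("https://example.com/) /Extra (x"))
```

Escaping every parenthesis is stricter than the PDF format requires, since balanced parentheses are legal inside a literal string, but it avoids having to decide whether the input is balanced.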


PDF documents and PDF generators are ubiquitous on the web, and so are injection vulnerabilities. Did you know that controlling a measly HTTP hyperlink can provide a foothold into the inner workings of a PDF? In this session, you will learn how to use a single link to compromise the contents of a PDF and exfiltrate it to a remote server, just like a blind XSS attack.

I'll show how you can inject PDF code to escape objects, hijack links, and even execute arbitrary JavaScript: basically XSS within the bounds of a PDF document. I evaluate several popular PDF libraries for injection attacks, as well as the most common readers: Acrobat and Chrome's PDFium. You'll learn how to create the "alert(1)" of PDF injection and how to improve it to inject JavaScript that can steal the contents of a PDF on both readers. I'll share how I used a custom JavaScript enumerator to enumerate the various PDF objects and discover functions that make external requests, which enable you to exfiltrate data from the PDF.

Even PDFs loaded from the filesystem in Acrobat, which have more rigorous protection, can still be made to make external requests. I've successfully crafted an injection that can perform an SSRF attack on a PDF rendered server-side. I've also managed to read the contents of files from the same domain, even when the Acrobat user agent is blocked by a WAF. Finally, I'll show you how to steal the contents of a PDF without user interaction, and wrap up with a hybrid PDF that works on both PDFium and Acrobat.
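
To make the "escape objects" step concrete, the sketch below checks whether a piece of user input would terminate a PDF literal string early if a generator wrote it between "(" and ")" without escaping. It is a simplified model for illustration, not a full PDF parser and not code from the talk; the names literal_string_end and breaks_out are hypothetical.

```python
def literal_string_end(data: str, start: int) -> int:
    """Return the index of the ')' that closes the literal string opened at data[start]."""
    if data[start] != "(":
        raise ValueError("start must point at '('")
    depth = 0
    i = start
    while i < len(data):
        ch = data[i]
        if ch == "\\":
            i += 2  # a backslash escapes the next character, so skip both
            continue
        if ch == "(":
            depth += 1  # balanced nested parentheses are legal
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    raise ValueError("unterminated literal string")


def breaks_out(user_input: str) -> bool:
    """True if user_input, written unescaped between ( and ), does not stay inside the string."""
    wrapped = f"({user_input})"
    try:
        return literal_string_end(wrapped, 0) != len(wrapped) - 1
    except ValueError:
        # The generator's closing ')' was swallowed, which also corrupts the object.
        return True


if __name__ == "__main__":
    print(breaks_out("https://example.com/a(b)c"))
    print(breaks_out("https://example.com/) /Extra (x"))
```

In the second call the first ")" closes the string, so everything after it would be read as PDF syntax rather than as part of the link. A check like this could serve as a regression test for a generator's escaping routine.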