Blind Spots: Automatically detecting ignored program inputs
Henrik Brodin, Evan Sultanik, Marek Surovič

TL;DR
This paper introduces an automated dynamic analysis technique that detects blind spots in program inputs, revealing exploitable bugs and ignored data in parsers for complex file formats such as PDF, with broad applicability to other formats.
Contribution
The paper formalizes the operational semantics of blind spots, develops a detection technique based on dynamic information flow tracking, and demonstrates its effectiveness at identifying parser bugs and ignored input in PDFs.
Findings
Exploitable bugs were detected in the MuPDF PDF parser
The missed detection rate is no higher than 11%
At least 5% of each PDF is ignored by the parser
Abstract
A blind spot is any input to a program that can be arbitrarily mutated without affecting the program's output. Blind spots can be used for steganography or to embed malware payloads. If blind spots overlap file format keywords, they indicate parsing bugs that can lead to exploitable differentials. For example, one could craft a document that renders one way in one viewer and a completely different way in another viewer. They have also been used to circumvent code signing in Android binaries, to coerce certificate authorities to misbehave, and to execute HTTP request smuggling and parameter pollution attacks. This paper formalizes the operational semantics of blind spots, leading to a technique based on dynamic information flow tracking that automatically detects blind spots. An efficient implementation is introduced and evaluated against a corpus of over a thousand diverse PDFs parsed…
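To make the definition in the abstract concrete, the following is a minimal sketch of a blind spot in a toy format. It is not the paper's method: the paper detects blind spots with dynamic information flow tracking, whereas this sketch uses brute-force single-byte mutation, which only approximates "arbitrarily mutated" because it never changes several bytes at once. The names `toy_parser` and `find_blind_spots`, and the length-prefixed format itself, are hypothetical and chosen only for illustration.

```python
from typing import Callable, List


def toy_parser(data: bytes) -> bytes:
    """Hypothetical parser: a 1-byte length header followed by a payload.

    Bytes beyond the declared length are never read, so they cannot
    influence the output.
    """
    if not data:
        return b""
    length = data[0]
    return data[1:1 + length]


def find_blind_spots(program: Callable[[bytes], bytes], data: bytes) -> List[int]:
    """Return offsets where every single-byte mutation leaves the output unchanged.

    This is a brute-force approximation (an assumption of this sketch),
    not the information-flow-based technique described in the paper.
    """
    baseline = program(data)
    blind: List[int] = []
    for offset in range(len(data)):
        mutated = bytearray(data)
        unchanged = True
        for value in range(256):
            if value == data[offset]:
                continue  # skip the original byte value
            mutated[offset] = value
            if program(bytes(mutated)) != baseline:
                unchanged = False
                break
        if unchanged:
            blind.append(offset)
    return blind


# Header says 3 payload bytes; the trailing 4 bytes are never parsed.
sample = bytes([3]) + b"abc" + b"PAD!"
print(find_blind_spots(toy_parser, sample))  # [4, 5, 6, 7]
```

In this toy case the trailing four bytes could carry a hidden payload without changing what the parser produces, which is the steganography and malware-embedding risk the abstract describes. Mutation testing of this kind costs up to 255 program runs per input byte, which suggests why a tracking-based approach is attractive for real parsers.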
Taxonomy
Topics: Advanced Malware Detection Techniques · Advanced Data Storage Technologies · Security and Verification in Computing
