An Incremental Learner for Language-Based Anomaly Detection in XML
Harald Lampesberger

TL;DR
This paper presents an incremental, automaton-based learner for detecting anomalies in XML-based protocols. It is reported to identify security issues, resist poisoning attacks, and outperform traditional schema validation.
Contribution
It introduces datatyped XML visibly pushdown automata (dXVPAs) and an incremental learning algorithm for stream validation and anomaly detection in XML protocols.
Findings
The learned automaton achieves zero false positives in all evaluated scenarios.
It outperforms traditional schema validation.
It remains effective under adversarial poisoning attacks.
Abstract
The Extensible Markup Language (XML) is a complex language, and consequently, XML-based protocols are susceptible to entire classes of implicit and explicit security problems. Message formats in XML-based protocols are usually specified in XML Schema, and as a first-line defense, schema validation should reject malformed input. However, extension points in most protocol specifications break validation. Extension points are wildcards and considered best practice for loose composition, but they also enable an attacker to add unchecked content in a document, e.g., for a signature wrapping attack. This paper introduces datatyped XML visibly pushdown automata (dXVPAs) as language representation for mixed-content XML and presents an incremental learner that infers a dXVPA from example documents. The learner generalizes XML types and datatypes in terms of automaton states and transitions,…
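To make the idea of learning a language from example documents concrete, here is a minimal Python sketch. It is not the paper's dXVPA or its learner: it only memorizes parent/child element pairs and a coarse datatype for each element's leading text, whereas the paper's learner generalizes types into automaton states and transitions. The names `ToyLearner` and `datatype_of`, the two-way integer/string distinction, and the sample `order` documents are all assumptions made for illustration. Text that follows a child element (mixed content) is ignored here.

```python
import io
import xml.etree.ElementTree as ET


def datatype_of(text):
    """Coarse lexical datatype of a text node (hypothetical, illustration only)."""
    text = (text or "").strip()
    if not text:
        return "empty"
    try:
        int(text)
        return "integer"
    except ValueError:
        return "string"


class ToyLearner:
    """Memorizes element nesting and text datatypes from example documents."""

    def __init__(self):
        self.children = set()   # (parent_tag, child_tag) pairs seen in training
        self.datatypes = set()  # (tag, datatype) pairs seen in training

    def _events(self, xml_bytes):
        # Walk the document as a stream of start/end events with a stack,
        # mirroring how a pushdown automaton tracks the open elements.
        stack = ["#root"]
        parser = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
        for event, elem in parser:
            if event == "start":
                yield ("child", stack[-1], elem.tag)
                stack.append(elem.tag)
            else:
                stack.pop()
                yield ("text", elem.tag, datatype_of(elem.text))

    def learn(self, xml_bytes):
        # Collect all events first so a malformed example is not half-learned.
        for kind, a, b in list(self._events(xml_bytes)):
            (self.children if kind == "child" else self.datatypes).add((a, b))

    def accepts(self, xml_bytes):
        for kind, a, b in self._events(xml_bytes):
            seen = self.children if kind == "child" else self.datatypes
            if (a, b) not in seen:
                return False
        return True


learner = ToyLearner()
learner.learn(b"<order><id>17</id><note>rush</note></order>")
learner.learn(b"<order><id>42</id></order>")

benign = b"<order><id>99</id><note>later</note></order>"
# Unseen wrapper element, as an extension point might admit under a schema.
wrapped = b"<order><id>17</id><ext><order><id>1</id></order></ext></order>"
wrong_type = b"<order><id>abc</id></order>"

print(learner.accepts(benign))      # True
print(learner.accepts(wrapped))     # False
print(learner.accepts(wrong_type))  # False
```

Because the sketch only accepts what it has seen, each new example can be added with another `learn` call without retraining, which is the incremental aspect in its simplest form. It offers none of the generalization or poisoning resistance the abstract describes.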
