An Incremental Learner for Language-Based Anomaly Detection in XML

Harald Lampesberger

arXiv:1603.07924·cs.CR·March 28, 2016

TL;DR

This paper presents an incremental, automaton-based learner for anomaly detection in XML-based protocols. The method identifies security issues, resists poisoning attacks, and outperforms traditional schema validation.

Contribution

It introduces datatyped XML visibly pushdown automata (dXVPAs) and an incremental learning algorithm for stream validation and anomaly detection in XML protocols.

Findings

01. Automaton achieves zero false positives in all scenarios.

02. Outperforms traditional schema validation methods.

03. Effective against adversarial poisoning attacks.

Abstract

The Extensible Markup Language (XML) is a complex language, and consequently, XML-based protocols are susceptible to entire classes of implicit and explicit security problems. Message formats in XML-based protocols are usually specified in XML Schema, and as a first-line defense, schema validation should reject malformed input. However, extension points in most protocol specifications break validation. Extension points are wildcards and considered best practice for loose composition, but they also enable an attacker to add unchecked content in a document, e.g., for a signature wrapping attack. This paper introduces datatyped XML visibly pushdown automata (dXVPAs) as language representation for mixed-content XML and presents an incremental learner that infers a dXVPA from example documents. The learner generalizes XML types and datatypes in terms of automaton states and transitions,…
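
The idea of learning a language from example documents and flagging content that falls outside it can be illustrated with a toy sketch. This is not the paper's dXVPA construction: it ignores datatypes and text content entirely, and the class name `ToyIncrementalLearner`, its methods, and the sample `order` documents are hypothetical names chosen for illustration. It only records which element may follow which sibling under a given parent, reads each document as a stream of start and end events, and treats any unseen transition as an anomaly.

```python
import io
import xml.etree.ElementTree as ET

START, END = "<start>", "<end>"  # hypothetical sentinel symbols


class ToyIncrementalLearner:
    """Toy model: a set of (parent, previous sibling, next element) transitions."""

    def __init__(self):
        self.transitions = set()

    def _transitions(self, xml_text):
        # Each stack frame is [element tag, last child seen under it].
        stack = [[None, START]]
        source = io.BytesIO(xml_text.encode("utf-8"))
        for event, elem in ET.iterparse(source, events=("start", "end")):
            if event == "start":
                parent = stack[-1]
                yield (parent[0], parent[1], elem.tag)
                parent[1] = elem.tag
                stack.append([elem.tag, START])
            else:
                tag, last_child = stack.pop()
                yield (tag, last_child, END)

    def learn(self, xml_text):
        # Collect first so a malformed example never updates the model partially.
        new = list(self._transitions(xml_text))
        self.transitions.update(new)

    def is_anomalous(self, xml_text):
        try:
            return any(t not in self.transitions
                       for t in self._transitions(xml_text))
        except ET.ParseError:
            return True  # documents that are not well-formed are rejected


learner = ToyIncrementalLearner()
learner.learn("<order><item/><item/><total/></order>")

normal = "<order><item/><total/></order>"
wrapped = "<order><item/><total/><Signature><evil/></Signature></order>"
print(learner.is_anomalous(normal))
print(learner.is_anomalous(wrapped))
```

In this sketch the second document is flagged because `Signature` was never observed after `total`, which is the kind of unchecked addition a schema wildcard would let through. Calling `learn` again with a new example extends the accepted language without retraining from scratch, which is the sense in which the sketch is incremental.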
