HDNA: A graph-based change detection in HTML pages (Deface Attack Detection)
Mahdi Akhi, Nona Ghazizadeh

TL;DR
HDNA is a novel graph-based method for detecting changes in HTML DOM trees, aiding in security, testing, and development by identifying structural differences even in dynamic content.
Contribution
Introduces HDNA, a new approach for analyzing and comparing DOM trees to detect structural changes in HTML pages, including dynamic content.
Findings
Effective in identifying DOM changes caused by server updates or user interactions.
Useful for security analysis and detecting vulnerabilities in web pages.
Supports web development and testing by tracking page evolution.
Abstract
In this paper, a new approach called HDNA (HTML DNA) is introduced for analyzing and comparing Document Object Model (DOM) trees in order to detect differences in HTML pages. This method assigns an identifier to each HTML page based on its structure, which proves particularly useful for detecting variations caused by server-side updates, user interactions, or potential security risks. The process involves preprocessing the HTML content, generating a DOM tree, and calculating the disparities between two or more trees. By assigning weights to the nodes, valuable insights about their hierarchical importance are obtained. The effectiveness of the HDNA approach has been demonstrated in identifying changes in DOM trees even when dynamically generated content is involved. Not only does this method benefit web developers, testers, and security analysts by offering a deeper understanding of how web…
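The pipeline the abstract outlines (preprocess the HTML, build a tree, weight the nodes, compare trees) can be illustrated with a minimal sketch. This is not the authors' HDNA algorithm, whose details are not given here. The names `StructureParser`, `structure_paths`, `fingerprint`, and `weighted_difference` are hypothetical, and so are the choices to ignore text and attributes, to hash root-to-node tag paths as the page identifier, and to weight a node at depth d by 1/d.

```python
from collections import Counter
from html.parser import HTMLParser
import hashlib

# Void elements never get a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class StructureParser(HTMLParser):
    """Records the root-to-node tag path of every element (text and attributes ignored)."""

    def __init__(self):
        super().__init__()
        self.stack = []   # tags of the currently open elements
        self.paths = []   # one tuple of tags per element, in document order

    def handle_starttag(self, tag, attrs):
        self.paths.append(tuple(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def structure_paths(html):
    parser = StructureParser()
    parser.feed(html)
    parser.close()
    return parser.paths


def fingerprint(html):
    """Assumed structural identifier: SHA-256 over the ordered tag paths."""
    joined = "\n".join("/".join(path) for path in structure_paths(html))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def weighted_difference(html_a, html_b):
    """Depth-weighted multiset difference of tag paths, normalised to [0, 1].

    Assumed weighting: a node at depth d contributes 1/d, so changes
    closer to the root count more than changes deep in the tree.
    """
    a = Counter(structure_paths(html_a))
    b = Counter(structure_paths(html_b))
    keys = a.keys() | b.keys()
    changed = sum(abs(a[p] - b[p]) / len(p) for p in keys)
    total = sum(max(a[p], b[p]) / len(p) for p in keys)
    return changed / total if total else 0.0


original = "<html><body><h1>News</h1><p>Hello</p></body></html>"
text_only = "<html><body><h1>News</h1><p>Goodbye</p></body></html>"
defaced = "<html><body><h1>News</h1><script>x()</script><img src='x.png'></body></html>"

print(fingerprint(original) == fingerprint(text_only))   # same structure, different text
print(weighted_difference(original, text_only))
print(weighted_difference(original, defaced))
```

Because this sketch looks only at tag structure, a page whose text changes but whose layout stays the same keeps its fingerprint, while inserted or removed elements raise the difference score. A real detector would likely also need to account for attributes and content, which this example deliberately leaves out.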
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Web Data Mining and Analysis
