SpecMap: Hierarchical LLM Agent for Datasheet-to-Code Traceability Link Recovery in Systems Engineering
Vedant Nipane, Pulkit Agrawal, Amit Singh

TL;DR
This paper introduces SpecMap, a hierarchical LLM-based approach that recovers traceability links between datasheets and code in embedded systems more accurately and efficiently than traditional lexical-similarity and information-retrieval methods.
Contribution
It presents a novel hierarchical methodology leveraging large language models for semantic and structural analysis across multiple abstraction levels in embedded systems.
Findings
Achieves up to 73.3% file mapping accuracy.
Reduces LLM token consumption by 84%.
Cuts end-to-end runtime by approximately 80%.
Abstract
Establishing precise traceability between embedded systems datasheets and their corresponding code implementations remains a fundamental challenge in systems engineering, particularly for low-level software where manual mapping between specification documents and large code repositories is infeasible. Existing Traceability Link Recovery approaches primarily rely on lexical similarity and information retrieval techniques, which struggle to capture the semantic, structural, and symbol-level relationships prevalent in embedded systems software. We present a hierarchical datasheet-to-code mapping methodology that employs large language models for semantic analysis while explicitly structuring the traceability process across multiple abstraction levels. Rather than performing direct specification-to-code matching, the proposed approach progressively narrows the search space through…
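The coarse-to-fine narrowing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two levels used here (directory, then file), the repository layout, and all function names are hypothetical, and a simple Jaccard word overlap stands in for the LLM-based semantic analysis the paper relies on. The point is only to show how shortlisting at a coarse level limits how many fine-grained candidates need to be scored.

```python
import re

def tokens(text):
    """Lowercased alphanumeric words of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap(spec, candidate):
    """Jaccard word overlap. Assumption: a stand-in for an LLM relevance judgement."""
    s, c = tokens(spec), tokens(candidate)
    union = s | c
    return len(s & c) / len(union) if union else 0.0

def top_k(spec, candidates, k):
    """Names of the k candidates (name -> text) scoring highest against spec."""
    ranked = sorted(candidates, key=lambda name: overlap(spec, candidates[name]), reverse=True)
    return ranked[:k]

def hierarchical_map(spec, repo, k_dirs=1, k_files=1):
    """Map one specification sentence to files in two stages (hypothetical levels)."""
    # Stage 1 (coarse): score each directory by its name plus its file summaries.
    dir_text = {d: d + " " + " ".join(files.values()) for d, files in repo.items()}
    links = []
    for d in top_k(spec, dir_text, k_dirs):
        # Stage 2 (fine): score only the files inside shortlisted directories.
        for f in top_k(spec, repo[d], k_files):
            links.append(f"{d}/{f}")
    return links

# Hypothetical repository: directory -> file -> short summary of the file.
REPO = {
    "drivers/uart": {
        "uart.c": "uart baud rate register transmit receive fifo",
        "uart_irq.c": "uart interrupt handler receive overrun",
    },
    "drivers/spi": {"spi.c": "spi clock polarity phase chip select transfer"},
    "drivers/i2c": {"i2c.c": "i2c start stop condition slave address ack"},
}
SPEC = "The baud rate register sets the UART transmit and receive speed"

links = hierarchical_map(SPEC, REPO)
print(links)  # ['drivers/uart/uart.c']
```

With one directory shortlisted, only the two UART files are compared at the file level; the SPI and I2C files are never scored individually.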
Taxonomy
Topics: Software Engineering Research · Advanced Software Engineering Methodologies · Software System Performance and Reliability
