H2LooP Spark Preview: Continual Pretraining of Large Language Models for Low-Level Embedded Systems Code
Amit Singh, Vedant Nipane, Pulkit Agrawal, Jatin Kishnani, Sairanjan Mishra

TL;DR
This paper presents H2LooP Spark Preview, a continual pretraining pipeline that adapts a large language model to low-level embedded systems programming, significantly improving its code generation performance in this specialized domain.
Contribution
It introduces a novel continual pretraining approach using high-rank LoRA on a large, domain-specific dataset, enabling smaller models to excel in embedded systems code generation.
Findings
70.4% reduction in in-domain perplexity
Outperforms larger models on embedded code benchmarks
Open-source release of the trained model checkpoint
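To make the perplexity finding concrete, here is a minimal sketch of how a relative perplexity reduction can be computed. It assumes perplexity is defined as the exponential of the mean per-token negative log-likelihood in nats, and that both models are scored on the same tokens. The function names and sample loss values are hypothetical and are not taken from the paper.

```python
import math


def perplexity(token_nlls):
    """Perplexity as exp of the mean per-token negative log-likelihood (nats)."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


def relative_reduction(base_ppl, adapted_ppl):
    """Fractional drop in perplexity going from the base to the adapted model."""
    return (base_ppl - adapted_ppl) / base_ppl


def nll_drop_for_reduction(reduction):
    """Mean NLL decrease (nats per token) implied by a fractional perplexity reduction."""
    return -math.log(1.0 - reduction)


# Hypothetical per-token losses, for illustration only.
base_nlls = [2.0, 2.2, 1.8]
adapted_nlls = [1.0, 1.2, 0.8]

reduction = relative_reduction(perplexity(base_nlls), perplexity(adapted_nlls))
print(f"relative perplexity reduction: {reduction:.1%}")
print(f"NLL drop implied by a 70.4% reduction: {nll_drop_for_reduction(0.704):.3f} nats/token")
```

Under this definition, a fractional reduction depends only on the difference in mean loss between the two models, not on the absolute perplexity values.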
Abstract
Large language models (LLMs) demonstrate strong code generation abilities in general-purpose programming languages but remain limited in specialized domains such as low-level embedded systems programming. This domain involves hardware register manipulation, vendor-specific SDKs, real-time operating system APIs, and hardware abstraction layers that are underrepresented in standard pretraining corpora. We introduce H2LooP Spark Preview, a continual pretraining (CPT) pipeline that adapts OLMo-3-7B, a fully open language model, to the embedded systems domain using BF16 LoRA with rank-stabilized scaling on 8 NVIDIA H100 GPUs. Our training corpus is constructed from repository-datasheet pairs covering 100B tokens of raw embedded systems data across 117 manufacturers, processed using the hierarchical datasheet-to-code mapping approach proposed in SpecMap (Nipane et al., 2026). The resulting…
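As an illustration of why rank-stabilized scaling matters for high-rank LoRA, the sketch below compares the standard LoRA adapter scale, alpha / r, with the rank-stabilized variant, alpha / sqrt(r). The alpha value and the ranks are hypothetical, since the abstract does not state the paper's actual hyperparameters, and `lora_scale` is an illustrative helper rather than an API from any library.

```python
import math


def lora_scale(alpha, rank, rank_stabilized=False):
    """Multiplier applied to the low-rank update B @ A.

    Standard LoRA uses alpha / rank; rank-stabilized LoRA uses alpha / sqrt(rank).
    """
    if rank <= 0:
        raise ValueError("rank must be positive")
    if rank_stabilized:
        return alpha / math.sqrt(rank)
    return alpha / rank


# Hypothetical settings, for illustration only (not taken from the paper).
alpha = 16
for rank in (8, 64, 512):
    standard = lora_scale(alpha, rank)
    stabilized = lora_scale(alpha, rank, rank_stabilized=True)
    print(f"rank={rank:4d}  standard={standard:.4f}  rank-stabilized={stabilized:.4f}")
```

With the standard scale, the multiplier shrinks in proportion to the rank, so the adapter's contribution is increasingly damped at high rank. The square-root form shrinks more slowly, which is the usual motivation for pairing it with high-rank adapters.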
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Software Engineering Research · Embedded Systems Design Techniques
