Block-Level Parallelism in Parsing Block Structured Languages
Abhinav Jangda

TL;DR
This paper introduces a block-level parallel parser for LR(1) grammars that significantly speeds up parsing large, block-structured source code by enabling parallel processing of independent blocks.
Contribution
It presents a novel block-parallel parser, derived from the Incremental Jump Shift Reduce Parser, that parses independent code blocks in parallel without major grammar modifications.
Findings
Achieved a 28% to 52% performance improvement when parsing Linux kernel files.
Successfully developed a block parallel parser compatible with LR(1) grammars.
Demonstrated the parser's effectiveness on real-world large codebases.
Abstract
Software source code is becoming large and complex, and compiling a large code base is time-consuming. Parallel compilation helps reduce compilation time. Parsing is one of the compiler phases in which a significant share of compilation time is spent. Techniques have already been developed to extract the parallelism available in a parser. Current LR(k) parallel parsing techniques either have difficulty creating an Abstract Syntax Tree, require modification of the grammar, or are specific to less expressive grammars. Most programming languages, such as C and ALGOL, are block-structured, and in most languages' grammars the grammar of different blocks is independent, allowing different blocks to be parsed in parallel. We propose a block-level parallel parser derived from the Incremental Jump Shift Reduce Parser of [13]. The Block Parallelized Parser (BPP) can…
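The idea that independent blocks can be handed to separate workers can be illustrated with a minimal sketch. This is not the BPP algorithm or the Incremental Jump Shift Reduce Parser; it only shows the overall shape of "split on block boundaries, then parse each block concurrently". All names here (`split_top_level_blocks`, `parse_block`, `parse_blocks_in_parallel`) are hypothetical, the per-block "parser" is a stand-in that merely counts statements and nesting depth, and the splitter assumes braces never appear inside strings or comments.

```python
from concurrent.futures import ThreadPoolExecutor


def split_top_level_blocks(source: str) -> list[str]:
    """Return the text of each top-level `{...}` block, braces included.

    Assumption: braces inside string literals or comments are not handled.
    """
    blocks, depth, start = [], 0, None
    for i, ch in enumerate(source):
        if ch == "{":
            if depth == 0:
                start = i  # remember where this top-level block opens
            depth += 1
        elif ch == "}":
            if depth == 0:
                raise ValueError(f"unmatched '}}' at offset {i}")
            depth -= 1
            if depth == 0:
                blocks.append(source[start:i + 1])
    if depth != 0:
        raise ValueError("unclosed '{'")
    return blocks


def parse_block(block: str) -> dict:
    """Hypothetical stand-in for a real per-block parser.

    It only counts semicolons and the maximum brace nesting depth.
    """
    depth = max_depth = 0
    for ch in block:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return {"statements": block.count(";"), "max_depth": max_depth}


def parse_blocks_in_parallel(source: str, workers: int = 4) -> list[dict]:
    """Parse every top-level block in its own task; results keep source order."""
    blocks = split_top_level_blocks(source)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(parse_block, blocks))


if __name__ == "__main__":
    src = "int f() { a; b; } int g() { if (x) { c; } }"
    for summary in parse_blocks_in_parallel(src):
        print(summary)
```

A thread pool is used here only to keep the sketch short and self-contained. In CPython, threads do not speed up CPU-bound work, so a real experiment would likely swap in a process pool or a language with native threads.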
Taxonomy
Topics: Natural Language Processing Techniques · Semigroups and Automata Theory · Algorithms and Data Compression
