On-the-Fly Syntax Highlighting using Neural Networks
Marco Edoardo Palma, Pasquale Salza, Harald C. Gall

TL;DR
This paper introduces a neural network-based method for real-time syntax highlighting that improves accuracy and efficiency over traditional regex-based tools, especially for complex language features.
Contribution
It presents a novel deep learning approach for on-the-fly syntax highlighting that handles both correct and incorrect code derivations, surpassing existing regex-based methods.
Findings
Achieves near-perfect accuracy in syntax highlighting
Outperforms regex-based strategies in speed and correctness
Effective across multiple programming languages
Abstract
With the presence of online collaborative tools for software developers, source code is shared and consulted frequently, from code viewers to merge requests and code snippets. Typically, code highlighting quality in such scenarios is sacrificed in favor of system responsiveness. In these on-the-fly settings, performing a formal grammatical analysis of the source code is not only expensive but also intractable in the many cases where the input is an invalid derivation of the language. Indeed, current popular highlighters rely heavily on a system of regular expressions, typically far from the specification of the language's lexer. Due to their complexity, regular expressions need to be periodically updated as more feedback is collected from the users, and their design discourages the detection of more complex language formations. This paper delivers a deep learning-based approach suitable for…
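To make the regex-based baseline described in the abstract concrete, here is a minimal sketch of such a highlighter. It is not the paper's method and not the rule set of any real tool: the token classes, the patterns, and the `highlight` helper are all hypothetical, chosen only to illustrate the idea.

```python
import re

# Hypothetical token classes and patterns, for illustration only.
# Order matters: earlier alternatives win when several could match.
TOKEN_PATTERNS = [
    ("comment", r"//[^\n]*"),
    ("string", r'"(?:\\.|[^"\\\n])*"'),
    ("keyword", r"\b(?:class|int|new|public|return|void)\b"),
    ("number", r"\b\d+\b"),
    ("identifier", r"[A-Za-z_]\w*"),
]

# One combined pattern with a named group per token class.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_PATTERNS)
)


def highlight(source):
    """Return (text, token_class) pairs for every pattern match in source."""
    return [(m.group(), m.lastgroup) for m in MASTER.finditer(source)]


# An incomplete statement: a parser would reject it, the regexes do not care.
snippet = "public List items = new ArrayList( // unfinished"
tokens = highlight(snippet)
for text, token_class in tokens:
    print(f"{token_class:<10} {text}")
```

The sketch tolerates the unfinished statement, which is why this style suits on-the-fly settings. It also labels both `List` (a type) and `items` (a variable) as plain identifiers, since telling them apart requires grammatical context that a flat list of patterns does not carry.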
