# CPEG: A Typed Tree Construction from Parsing Expression Grammars with Regex-Like Captures

**Authors:** Daisuke Yamaguchi, Kimio Kuramitsu

arXiv: 1812.07429 · 2018-12-19

## TL;DR

This paper introduces CPEG, an extended parsing expression grammar with regex-like captures that constructs syntax trees with guaranteed structural constraints, reducing user effort and enabling formal type inference.

## Contribution

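The paper presents CPEG, a parsing expression grammar extended with two annotations (capture and left-folding), together with a formally developed type inference that derives a regular expression type describing the syntax trees a grammar can produce. The inference is proved sound and unique.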

## Key findings

- CPEG guarantees structural constraints of syntax trees for any input.
- The paper proves the soundness and the uniqueness of the type inference.
- Because tree structure is guaranteed, less user code is needed to check whether the intended elements exist.

## Abstract

CPEG is an extended parsing expression grammar with regex-like capture annotation. Two annotations (capture and left-folding) allow a flexible construction of syntax trees from arbitrary parsing patterns. More importantly, CPEG is designed to guarantee structural constraints of syntax trees for any input strings. This reduces the amount of user code needed to check whether the intended elements exist. To represent the structural constraints, we focus on regular expression types, a variant formalism of tree automata, which have been intensively studied in the context of XML schemas. Regular expression type is inferred from a given CPEG by the type inference that is formally developed in this paper. We prove the soundness and the uniqueness of the type inference. The type inference enables a CPEG to serve both as a syntactic specification of the input and a schematic specification of the output.
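To make the two annotations concrete, the following is a minimal Python sketch of the general idea, not the paper's formal semantics or notation. It hand-codes a parser for the pattern `Int ('-' Int)*`, where a capture wraps matched text in a labeled node and a left-fold makes the tree built so far the first child of a new node. The names `Tree`, `capture_int`, `parse_sub`, and `show`, and the labels `Int` and `Sub`, are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass, field

DIGITS = re.compile(r"[0-9]+")


@dataclass
class Tree:
    """A labeled syntax tree node (hypothetical shape, for illustration)."""
    label: str
    children: list = field(default_factory=list)
    text: str = ""  # matched text, used by leaf nodes only


def capture_int(src, pos):
    """Capture: wrap the digits matched at `pos` in a labeled 'Int' leaf."""
    m = DIGITS.match(src, pos)
    if m is None:
        return None
    return Tree("Int", text=m.group()), m.end()


def parse_sub(src):
    """Match Int ('-' Int)* and left-fold on every repetition."""
    first = capture_int(src, 0)
    if first is None:
        return None
    tree, pos = first
    while pos < len(src) and src[pos] == "-":
        nxt = capture_int(src, pos + 1)
        if nxt is None:
            break  # repetition stops; the dangling '-' is left unconsumed
        right, pos = nxt
        # Left-fold: the tree built so far becomes the first child.
        tree = Tree("Sub", [tree, right])
    # Accept only if the whole input was consumed.
    return tree if pos == len(src) else None


def show(t):
    """Render a tree as a compact string."""
    if not t.children:
        return f"{t.label}({t.text})"
    return f"{t.label}[" + ", ".join(show(c) for c in t.children) + "]"


print(show(parse_sub("1-2-3")))  # Sub[Sub[Int(1), Int(2)], Int(3)]
```

In this sketch every successful parse yields either an `Int` leaf or a `Sub` node with exactly two children, so code consuming the tree does not need to check for a missing operand. That is the kind of structural guarantee the abstract says the inferred regular expression type expresses, although the paper derives it from the grammar rather than from hand-written parser code.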

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.07429/full.md

## Figures

51 figures with captions in the complete paper: https://tomesphere.com/paper/1812.07429/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/1812.07429/full.md

---
Source: https://tomesphere.com/paper/1812.07429