Integrating Structural Description of Data Format Information into Programming to Auto-generate File Reading Programs
Xinghua Cheng, Erjie Hu, Di Hu

TL;DR
This paper presents a novel method for automatically generating file reading programs by leveraging structured data format descriptions, significantly reducing manual effort in handling heterogeneous data formats.
Contribution
It introduces a new approach that uses Data Format Markup Language (DFML) to automatically generate file reading programs for various data formats, including binary and text files.
Findings
Effective automatic program generation demonstrated on binary and text files.
DFML-based approach reduces manual coding effort.
A tool, DFML Editor, supports generating and editing DFML documents.
Abstract
File reading is the basis for data sharing and scientific computing. However, manual programming for file reading is labour-intensive and time-consuming, as data formats are heterogeneous and complex. To address this issue, this study proposes a novel approach for automatically generating file reading programs from structured, self-described data format information. The approach supports two reading modes, sequential and random, and proceeds in three steps. First, the file's data format is described in Data Format Markup Language (DFML), producing DFML documents. Second, data type sequences are formed by parsing those DFML documents. Third, programs for sequential or random reading are generated from the data type sequences and the general programming rules of the target programming language. A tool named DFML Editor was developed for generating and editing DFML documents. Case studies on…
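The pipeline described in the abstract (format description, then data type sequence, then sequential or random reading) can be illustrated with a minimal Python sketch. The XML element and attribute names below (`format`, `header`, `record`, `field`, `byteorder`) are hypothetical and are not taken from the actual DFML schema, which this summary does not show. The sketch also builds reader functions at runtime instead of emitting source code as the paper's approach does, and it assumes a fixed-size header followed by fixed-size records.

```python
import io
import struct
import xml.etree.ElementTree as ET

# Hypothetical format description. These element and attribute names are
# illustrative only and are NOT the real DFML schema.
DESCRIPTION = """
<format byteorder="little">
  <header>
    <field name="count" type="int32"/>
  </header>
  <record>
    <field name="id" type="int16"/>
    <field name="value" type="float64"/>
  </record>
</format>
"""

# Map the illustrative type names to struct format characters.
TYPE_CODES = {"int16": "h", "int32": "i", "float32": "f", "float64": "d"}


def type_sequence(node, byteorder):
    """Turn the <field> children of a node into (names, struct.Struct)."""
    prefix = "<" if byteorder == "little" else ">"
    fields = node.findall("field")
    names = [f.get("name") for f in fields]
    codes = "".join(TYPE_CODES[f.get("type")] for f in fields)
    return names, struct.Struct(prefix + codes)


def build_reader(description):
    """Parse the description and return (read_sequential, read_random)."""
    root = ET.fromstring(description)
    order = root.get("byteorder", "little")
    head_names, head = type_sequence(root.find("header"), order)
    rec_names, rec = type_sequence(root.find("record"), order)

    def read_header(stream):
        stream.seek(0)
        return dict(zip(head_names, head.unpack(stream.read(head.size))))

    def read_sequential(stream):
        # Read the header, then every record in file order.
        count = read_header(stream)["count"]
        return [
            dict(zip(rec_names, rec.unpack(stream.read(rec.size))))
            for _ in range(count)
        ]

    def read_random(stream, index):
        # Jump straight to record `index` using the fixed record size.
        stream.seek(head.size + index * rec.size)
        return dict(zip(rec_names, rec.unpack(stream.read(rec.size))))

    return read_sequential, read_random


# Build a small in-memory binary file: a count of 2, then two records.
data = (
    struct.pack("<i", 2)
    + struct.pack("<hd", 1, 0.5)
    + struct.pack("<hd", 2, 1.5)
)
read_sequential, read_random = build_reader(DESCRIPTION)
print(read_sequential(io.BytesIO(data)))
# prints [{'id': 1, 'value': 0.5}, {'id': 2, 'value': 1.5}]
print(read_random(io.BytesIO(data), 1))
# prints {'id': 2, 'value': 1.5}
```

Random access works here only because every record has the same size, so an offset can be computed from the record index. Variable-length records would need an index table or a sequential scan.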
Taxonomy
Topics: Online Learning and Analytics · Intelligent Tutoring Systems and Adaptive Learning · Parallel Computing and Optimization Techniques
