Automated Test Data Generation for Enterprise Protobuf Systems: A Metaclass-Enhanced Statistical Approach
Y. Du

TL;DR
This paper introduces a metaclass-based statistical framework for generating realistic test data in enterprise protobuf systems, reporting up to a 95% reduction in test data preparation time and an 80% improvement in test coverage.
Contribution
It proposes a novel approach combining Python metaclasses and statistical analysis to automatically generate complex, realistic protobuf test data for large-scale enterprise systems.
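To make the combination of metaclasses and statistical analysis concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the names `SampledMeta`, `FIELDS`, `observe`, `generate`, and `Order` are hypothetical, and an ordinary class with a field tuple stands in for a generated protobuf message type. The metaclass attaches a per-field frequency table to each class it creates, and generation draws values in proportion to how often they were observed.

```python
import random
from collections import Counter


class SampledMeta(type):
    """Hypothetical metaclass: gives each class a per-field table of observed values."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        # FIELDS is an assumed stand-in for the field list of a protobuf descriptor.
        cls._observed = {field: Counter() for field in namespace.get("FIELDS", ())}
        return cls

    def observe(cls, record):
        """Count the values seen for each known field in one logged record."""
        for field, counts in cls._observed.items():
            if field in record:
                counts[record[field]] += 1

    def generate(cls, rng):
        """Draw each field from its empirical (frequency-weighted) distribution."""
        sample = {}
        for field, counts in cls._observed.items():
            if not counts:
                sample[field] = None  # no log evidence for this field
                continue
            values = list(counts)
            weights = [counts[value] for value in values]
            sample[field] = rng.choices(values, weights=weights, k=1)[0]
        return sample


class Order(metaclass=SampledMeta):
    FIELDS = ("currency", "quantity")
```

Because `observe` and `generate` live on the metaclass, they are called on the class itself, for example `Order.observe({"currency": "USD", "quantity": 5})` followed by `Order.generate(random.Random(0))`. Passing the random generator explicitly keeps generated data reproducible across test runs.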
Findings
Up to 95% reduction in test data preparation time
80% improvement in test coverage
Handles protobuf structures with up to 15 nesting levels
Abstract
Large-scale enterprise systems utilizing Protocol Buffers (protobuf) present significant challenges for performance testing, particularly when targeting intermediate business interfaces with complex nested data structures. Traditional test data generation approaches are inadequate for handling the intricate hierarchical and graph-like structures inherent in enterprise protobuf schemas. This paper presents a novel test data generation framework that leverages Python's metaclass system for dynamic type enhancement and statistical analysis of production logs for realistic value domain extraction. Our approach combines automatic schema introspection, statistical value distribution analysis, and recursive descent algorithms for handling deeply nested structures. Experimental evaluation on three real-world enterprise systems demonstrates up to 95% reduction in test data preparation time and…
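The recursive descent step for nested structures might look roughly like the sketch below. It rests on several assumptions that are not from the paper: the schema is a plain nested dict rather than a protobuf descriptor, each leaf is a list of values seen in logs (repeated entries act as weights), and `build_record` and `max_depth` are hypothetical names. The default limit of 15 simply mirrors the nesting depth reported in the findings. Graph-like (cyclic) schemas are not handled here beyond the depth guard.

```python
import random


def build_record(schema, rng, depth=0, max_depth=15):
    """Recursively build one test record from a nested schema description."""
    if depth > max_depth:
        raise ValueError(f"schema nests deeper than {max_depth} levels")
    record = {}
    for field, spec in schema.items():
        if isinstance(spec, dict):
            # Nested message: descend one level.
            record[field] = build_record(spec, rng, depth + 1, max_depth)
        else:
            # Leaf: pick one of the logged values; duplicates raise its probability.
            record[field] = rng.choice(spec)
    return record


# Assumed example schema; the field names and values are illustrative only.
schema = {
    "id": [1, 2, 3],
    "customer": {
        "tier": ["gold", "gold", "basic"],
        "address": {"country": ["DE"]},
    },
}
record = build_record(schema, random.Random(1))
```

The depth guard turns runaway recursion on a self-referencing schema into an explicit error instead of a stack overflow.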
