DevMuT: Testing Deep Learning Framework via Developer Expertise-Based Mutation

Yanzhou Mu; Juan Zhai; Chunrong Fang; Xiang Chen; Zhixiang Cao; Peiran Yang; Yinglong Zou; Tao Zheng; Zhenyu Chen

arXiv:2507.04360·cs.SE·July 8, 2025

TL;DR

DevMuT is a testing method for deep learning frameworks. It generates diverse models using mutation operators and constraints derived from developer expertise, detects defects that matter to developers across the model lifecycle, and outperforms existing methods.

Contribution

The paper introduces DevMuT, a new DL framework testing method that incorporates developer expertise into mutation operators to identify critical bugs more effectively.

Findings

01. Achieves at least a 71.68% improvement in model diversity

02. Detects 117 defects, of which 63 are confirmed and 24 fixed

03. Outperforms state-of-the-art baselines in defect detection

Abstract

Deep learning (DL) frameworks are the fundamental infrastructure for various DL applications. Framework defects can profoundly cause disastrous accidents, thus requiring sufficient detection. In previous studies, researchers adopt DL models as test inputs combined with mutation to generate more diverse models. Though these studies demonstrate promising results, most detected defects are considered trivial (i.e., either treated as edge cases or ignored by the developers). To identify important bugs that matter to developers, we propose a novel DL framework testing method DevMuT, which generates models by adopting mutation operators and constraints derived from developer expertise. DevMuT simulates developers' common operations in development and detects more diverse defects within more stages of the DL model lifecycle (e.g., model training and inference). We evaluate the performance of…
