PHM-Bench: A Domain-Specific Benchmarking Framework for Systematic Evaluation of Large Models in Prognostics and Health Management

Puyu Yang; Laifa Tao; Zijian Huang; Haifei Liu; Wenyan Cao; Hao Ji; Jianan Qiu; Qixuan Huang; Xuanyuan Su; Yuhang Xie; Jun Zhang; Shangyu Li; Chen Lu; Zhixuan Lian

arXiv:2508.02490 · cs.AI · August 5, 2025

Open Access

TL;DR

This paper introduces PHM-Bench, a comprehensive evaluation framework for assessing large models in Prognostics and Health Management, addressing current gaps in evaluation methods and supporting the development of specialized PHM models.

Contribution

The study presents PHM-Bench, a three-dimensional, multi-level evaluation framework specifically designed for large models in PHM, incorporating diverse metrics and datasets.

Findings

01  PHM-Bench enables systematic evaluation of large models across PHM tasks.

02  It provides a structured approach for assessing both general-purpose and domain-specific models.

03  The framework can guide the development of PHM-specialized large models.

Abstract

With the rapid advancement of generative artificial intelligence, large language models (LLMs) are increasingly adopted in industrial domains, offering new opportunities for Prognostics and Health Management (PHM). These models help address challenges such as high development costs, long deployment cycles, and limited generalizability. However, despite the growing synergy between PHM and LLMs, existing evaluation methodologies often fall short in structural completeness, dimensional comprehensiveness, and evaluation granularity. This hampers the in-depth integration of LLMs into the PHM domain. To address these limitations, this study proposes PHM-Bench, a novel three-dimensional evaluation framework for PHM-oriented large models. Grounded in the triadic structure of fundamental capability, core task, and entire lifecycle, PHM-Bench is tailored to the unique demands of PHM system…
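As an illustration of the triadic structure named in the abstract, the sketch below tags each evaluation item along the three dimensions (fundamental capability, core task, entire lifecycle) and averages scores along any one of them. It is a minimal sketch under stated assumptions, not the authors' implementation: the class `EvalItem`, the function `aggregate`, the axis labels such as `"fault_diagnosis"`, and the scores are all hypothetical and do not come from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class EvalItem:
    """One scored benchmark item, indexed on three axes (all labels hypothetical)."""
    capability: str  # position on the "fundamental capability" axis
    task: str        # position on the "core task" axis
    stage: str       # position on the "entire lifecycle" axis
    score: float     # assumed to be normalized to [0, 1]


def aggregate(items, axis):
    """Return the mean score for each label found along the given axis."""
    buckets = defaultdict(list)
    for item in items:
        buckets[getattr(item, axis)].append(item.score)
    return {label: mean(scores) for label, scores in buckets.items()}


# Made-up items, used only to show the mechanics.
items = [
    EvalItem("knowledge", "fault_diagnosis", "operation", 1.0),
    EvalItem("reasoning", "fault_diagnosis", "operation", 0.5),
    EvalItem("knowledge", "rul_prediction", "maintenance", 0.0),
]

by_task = aggregate(items, "task")
by_capability = aggregate(items, "capability")
```

Indexing every item on all three axes means the same result set can be sliced per capability, per task, or per lifecycle stage without re-running the evaluation, which is one plausible reading of "multi-level" here.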

Taxonomy

Topics: Machine Fault Diagnosis Techniques · Topic Modeling · Artificial Intelligence in Healthcare and Education