Seeing the Unseen: Towards Zero-Shot Inspection for Wind Turbine Blades using Knowledge-Augmented Vision Language Models
Yang Zhang, Qianyu Zhou, Farhad Imani, Jiong Tang

TL;DR
This paper introduces a zero-shot inspection framework for wind turbine blades that leverages knowledge-augmented vision-language models and retrieval techniques to detect diverse damages without extensive labeled data.
Contribution
It presents a novel retrieval-augmented vision-language approach that integrates domain knowledge for zero-shot damage detection in wind turbine blades.
Findings
The framework correctly classified all damage samples in the small evaluation set (30 labeled blade images).
Retrieval grounding improves accuracy and precision over baseline models.
The method enhances explainability and generalizability in industrial inspection.
Abstract
Wind turbine blades operate in harsh environments, making timely damage detection essential for preventing failures and optimizing maintenance. Drone-based inspection and deep learning are promising, but typically depend on large, labeled datasets, which limit their ability to detect rare or evolving damage types. To address this, we propose a zero-shot-oriented inspection framework that integrates Retrieval-Augmented Generation (RAG) with Vision-Language Models (VLM). A multimodal knowledge base is constructed, comprising technical documentation, representative reference images, and domain-specific guidelines. A hybrid text-image retriever with keyword-aware reranking assembles the most relevant context to condition the VLM at inference, injecting domain knowledge without task-specific training. We evaluate the framework on 30 labeled blade images covering diverse damage categories.…
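The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `KBEntry` schema, the toy two-dimensional embeddings, the weighting parameter `alpha`, the additive `boost`, and all function names are assumptions made for illustration. The sketch scores each knowledge-base entry by a weighted mix of text and image cosine similarity, reranks with a keyword-overlap bonus, and assembles the top entries into a context string that would condition a VLM at inference.

```python
import math
from dataclasses import dataclass


@dataclass
class KBEntry:
    """One knowledge-base item (hypothetical schema, not from the paper)."""
    entry_id: str
    text: str
    text_vec: list   # toy text embedding (assumed precomputed)
    image_vec: list  # toy reference-image embedding (assumed precomputed)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def keyword_overlap(keywords, text):
    """Fraction of query keywords that appear as tokens in the entry text."""
    if not keywords:
        return 0.0
    tokens = set(text.lower().split())
    return sum(1 for k in keywords if k.lower() in tokens) / len(keywords)


def hybrid_retrieve(q_text_vec, q_image_vec, keywords, kb,
                    alpha=0.5, boost=0.2, top_k=2):
    """Rank entries by mixed text/image similarity plus a keyword bonus.

    alpha weights text against image similarity; boost scales the keyword
    bonus. Both are illustrative assumptions, not values from the paper.
    """
    scored = []
    for entry in kb:
        base = (alpha * cosine(q_text_vec, entry.text_vec)
                + (1 - alpha) * cosine(q_image_vec, entry.image_vec))
        score = base + boost * keyword_overlap(keywords, entry.text)
        scored.append((score, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:top_k]]


def build_prompt(question, entries):
    """Assemble retrieved entries into context for a VLM prompt."""
    context = "\n".join(f"- [{e.entry_id}] {e.text}" for e in entries)
    return f"Reference knowledge:\n{context}\n\nTask: {question}"


# Toy knowledge base with made-up embeddings.
KB = [
    KBEntry("crack", "surface crack running along the blade trailing edge",
            [1.0, 0.0], [1.0, 0.0]),
    KBEntry("erosion", "leading edge erosion with pitted coating",
            [0.0, 1.0], [0.0, 1.0]),
    KBEntry("lightning", "lightning strike burn mark near the blade tip",
            [0.7, 0.7], [0.7, 0.7]),
]

if __name__ == "__main__":
    hits = hybrid_retrieve([0.2, 0.9], [0.1, 1.0], ["erosion", "leading"], KB)
    print(build_prompt("Classify the damage visible in the blade image.", hits))
```

In a real pipeline the toy vectors would come from text and image encoders, and the assembled prompt would be passed to the VLM together with the inspection image. The additive keyword bonus is only one plausible way to realize "keyword-aware reranking".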
