YOLOv10 with Kolmogorov-Arnold networks and vision-language foundation models for interpretable object detection and trustworthy multimodal AI in computer vision perception
Marios Impraimakis, Daniel Vazquez, and Feiyu Zhou

TL;DR
This paper introduces a novel interpretable object detection framework combining Kolmogorov-Arnold networks with YOLOv10 and a multimodal foundation model, enhancing transparency and trustworthiness in autonomous vision systems.
Contribution
It develops a post-hoc surrogate model for trust estimation in object detection, providing visual interpretability and reliable confidence scores in complex scenes.
Findings
Accurately identifies low-trust predictions under challenging conditions.
Enables visualization of feature influence on confidence scores.
Integrates multimodal captions for scene understanding without compromising interpretability.
Abstract
The interpretable object detection capabilities of a novel Kolmogorov-Arnold network framework are examined here. The approach addresses a key limitation in computer vision for autonomous vehicle perception and beyond: these systems offer limited transparency regarding the reliability of their confidence scores in visually degraded or ambiguous scenes. To address this limitation, a Kolmogorov-Arnold network is employed as an interpretable post-hoc surrogate to model the trustworthiness of You Only Look Once (YOLOv10) detections using seven geometric and semantic features. The additive spline-based structure of the Kolmogorov-Arnold network enables direct visualisation of each feature's influence. This produces smooth and transparent functional mappings that reveal when the model's confidence is well supported and when it is unreliable. Experiments on both Common Objects in Context…
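To make the additive spline idea in the abstract concrete, here is a minimal sketch of a one-layer additive spline surrogate. It is not the authors' implementation and not a full Kolmogorov-Arnold network: it assumes seven features scaled to [0, 1], uses piecewise-linear (degree-1) splines fitted by ridge least squares, and runs on synthetic data with a made-up "trust" target. All function names, the knot count, and the target formula are illustrative assumptions.

```python
import numpy as np

def hat_basis(x, knots):
    """Piecewise-linear (degree-1 B-spline) basis on uniform knots; shape (n, k)."""
    x = np.clip(x, knots[0], knots[-1])
    h = knots[1] - knots[0]  # uniform spacing is assumed
    return np.maximum(0.0, 1.0 - np.abs(x[:, None] - knots[None, :]) / h)

def fit_additive(X, y, knots, ridge=1e-3):
    """Fit y ~ sum_j f_j(X[:, j]); each f_j is a linear spline. Returns coef (d, k)."""
    d = X.shape[1]
    B = np.hstack([hat_basis(X[:, j], knots) for j in range(d)])
    # The ridge term also resolves the constant-offset ambiguity between features.
    A = B.T @ B + ridge * np.eye(B.shape[1])
    return np.linalg.solve(A, B.T @ y).reshape(d, len(knots))

def predict(X, coef, knots):
    """Sum the per-feature spline contributions."""
    return sum(hat_basis(X[:, j], knots) @ coef[j] for j in range(X.shape[1]))

def feature_curve(coef, knots, j, grid):
    """Learned one-dimensional function f_j evaluated on a grid (what one would plot)."""
    return hat_basis(grid, knots) @ coef[j]

# Synthetic stand-in data: 7 hypothetical features in [0, 1]; the target depends
# only on features 0 and 1 (an arbitrary choice for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(500, 7))
y = 0.3 * np.sin(2 * np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.01 * rng.normal(size=500)

knots = np.linspace(0.0, 1.0, 10)
coef = fit_additive(X, y, knots)
rmse = float(np.sqrt(np.mean((predict(X, coef, knots) - y) ** 2)))

grid = np.linspace(0.0, 1.0, 50)
curve_used = feature_curve(coef, knots, 0, grid)    # feature that drives the target
curve_unused = feature_curve(coef, knots, 5, grid)  # feature the target ignores
```

Because the model is a sum of one-dimensional functions, plotting `feature_curve` for each feature shows that feature's contribution directly; in this toy setup the curve for an unused feature should come out nearly flat, while the curve for feature 0 should trace the sinusoid.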
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
