Vision Transformer for femur fracture classification
Leonardo Tanzi, Andrea Audisio, Giansalvo Cirrincione, Alessandro Aprato, Enrico Vezzetti

TL;DR
This study shows that a Vision Transformer (ViT) classifies proximal femur fracture subtypes more accurately than CNN-based approaches, and that specialists diagnose more accurately when assisted by it. It uses 4207 manually annotated images, described as the largest labeled dataset of proximal femur fractures in the literature.
Contribution
The paper introduces a modified Vision Transformer for femur fracture classification that outperforms CNN-based approaches on the largest labeled dataset of proximal femur fractures in the literature, and it evaluates how the model's assistance affects specialists' diagnostic accuracy.
Findings
ViT correctly predicted 83% of test images
Average diagnostic improvement of 29% with ViT assistance
First successful application of ViT to fracture subtype classification
Abstract
In recent years, the scientific community has focused on developing CAD tools to improve the classification of bone fractures, mostly based on Convolutional Neural Networks (CNNs). However, their accuracy in discriminating fracture subtypes has been far from optimal. This paper proposes a modified version of a recent and powerful deep learning technique, the Vision Transformer (ViT), which outperforms CNN-based approaches and consequently increases specialists' diagnostic accuracy. The study used 4207 manually annotated images, divided into fracture types following the AO/OTA classification; this is the largest labeled dataset of proximal femur fractures used in the literature. The ViT architecture was compared with a classic CNN and with a multistage architecture composed of successive CNNs in cascade. To demonstrate the reliability of this approach, 1) the attention maps were…
Taxonomy
Topics: COVID-19 diagnosis using AI · Artificial Intelligence in Healthcare and Education · Orthopedic Infections and Treatments
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam · Dropout · Vision Transformer · Label Smoothing · Residual Connection
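
Two of the methods listed above can be illustrated in a few lines. The following is a minimal sketch, not the authors' implementation: it shows how a Vision Transformer turns an image into a sequence of flattened, non-overlapping patches, and how label smoothing softens a one-hot target. The function names, the 4×4 toy image, the patch size, the class count, and the value of `epsilon` are all assumptions chosen for illustration; none of them is taken from the paper.

```python
def split_into_patches(image, patch_size):
    """Split a 2D image (list of rows) into flattened, non-overlapping patches.

    Patches are returned in row-major order, as in the standard ViT input step.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single vector.
            patch = [image[r][c]
                     for r in range(top, top + patch_size)
                     for c in range(left, left + patch_size)]
            patches.append(patch)
    return patches


def smooth_labels(class_index, num_classes, epsilon=0.1):
    """Label smoothing: (1 - epsilon) * one_hot + epsilon / num_classes."""
    uniform = epsilon / num_classes
    return [(1.0 - epsilon) + uniform if k == class_index else uniform
            for k in range(num_classes)]


# Toy 4x4 "image" with pixel values 0..15 (illustrative only).
toy_image = [[4 * r + c for c in range(4)] for r in range(4)]
patches = split_into_patches(toy_image, patch_size=2)
target = smooth_labels(class_index=1, num_classes=4, epsilon=0.2)
```

In a full model, each flattened patch would then pass through a linear layer and receive a position encoding before entering the attention blocks; those steps are omitted here to keep the sketch dependency-free.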
