Vision Transformer for COVID-19 CXR Diagnosis using Chest X-ray Feature Corpus
Sangjoon Park, Gwanghyun Kim, Yujin Oh, Joon Beom Seo, Sang Min Lee, Jin Hwan Kim, Sungjun Moon, Jae-Kwang Lim, Jong Chul Ye

TL;DR
This paper introduces a novel vision Transformer architecture that leverages low-level CXR features for improved COVID-19 diagnosis, demonstrating superior performance and generalization across diverse datasets.
Contribution
The study proposes a new vision Transformer that uses abnormal CXR features as its corpus, improving feature embedding and model robustness for COVID-19 detection.
Findings
Achieved state-of-the-art performance on multiple datasets
Demonstrated improved generalization across different institutions
Utilized low-level CXR features for better feature embedding
Abstract
Under the global COVID-19 crisis, developing a robust diagnosis algorithm for COVID-19 using CXR is hampered by the lack of a well-curated COVID-19 data set, although CXR data for other diseases are abundant. This situation suits the vision Transformer architecture, which can exploit abundant unlabeled data through pre-training. However, directly using an existing vision Transformer, whose corpus is generated by a ResNet, is not optimal for correct feature embedding. To mitigate this problem, we propose a novel vision Transformer that uses a low-level CXR feature corpus obtained by extracting abnormal CXR features. Specifically, the backbone network is trained on large public datasets to obtain the abnormal features seen in routine diagnosis, such as consolidation, ground-glass opacity (GGO), etc. Then, the embedded features from the backbone network are used as corpus…
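The abstract is truncated here, so the exact architecture is not given. As a rough illustration of the general idea only, the sketch below treats a backbone feature map as a sequence of tokens (the "corpus") and passes it through one self-attention step. Everything in it is an assumption made for illustration: the random feature map standing in for the backbone output, the tiny sizes, the zero-valued class token, the identity query/key/value projections, and the helper names `feature_map_to_tokens` and `self_attention`. Position encodings and learned weights are left out, and this is not the authors' implementation.

```python
import math
import random


def feature_map_to_tokens(fmap):
    """Flatten a C x H x W feature map into H*W tokens of dimension C."""
    channels = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    return [[fmap[c][y][x] for c in range(channels)]
            for y in range(height) for x in range(width)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Queries, keys, and values are the tokens themselves (identity
    projections), an assumption that keeps the sketch short.
    """
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, tokens))
                    for d in range(dim)])
    return out


# Hypothetical stand-in for the backbone output: 4 channels on a 3 x 3 grid.
rng = random.Random(0)
fmap = [[[rng.random() for _ in range(3)] for _ in range(3)] for _ in range(4)]

tokens = feature_map_to_tokens(fmap)   # 9 tokens, each of dimension 4
cls_token = [0.0] * 4                  # placeholder for a learned class token
sequence = [cls_token] + tokens        # 10 tokens fed to the attention step
encoded = self_attention(sequence)
cls_embedding = encoded[0]             # vector a classification head would read
```

In a trained model the class token and the projections would be learned, and several attention layers would be stacked. The sketch only shows how spatial backbone features can be arranged as a token sequence for a Transformer.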
Taxonomy
Topics: COVID-19 diagnosis using AI · Anomaly Detection Techniques and Applications · Seismology and Earthquake Studies
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Average Pooling · Dropout · Attention Is All You Need · Label Smoothing · Adam
