Identification of key factors for early detection of rheumatoid arthritis in primary care using machine learning
Fatemeh Rahimi, Elham Rajaei, Noushin Movafagh, Ali Mohammad Hadianfard

TL;DR
This study uses machine learning to identify key factors for early detection of rheumatoid arthritis in primary care, aiming to reduce delays in specialist referral.
Contribution
The study introduces a machine learning approach to identify critical early indicators of rheumatoid arthritis in primary care settings.
Findings
The CatBoost model performed best, with an AUC-ROC of 0.966, an accuracy of 0.947, and an F1-score of 0.951.
Anti-CCP, tender joint count, and swollen joint count were identified as the most significant factors for early RA detection.
Fatigue, age, and a positive rheumatoid factor (RF) test were also found to significantly increase the likelihood of rheumatoid arthritis.
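For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how accuracy, F1-score, and AUC-ROC are defined for a binary classifier. It is not the authors' evaluation code: the labels, scores, and the 0.5 decision threshold are hypothetical toy values chosen only to make the arithmetic easy to follow.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case is scored above a random
    negative case, with ties counted as half (pairwise definition of AUC)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = RA, 0 = not RA (not taken from the study).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.2, 0.7, 0.6, 0.1]
# Assumed decision threshold of 0.5 for turning scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("F1-score:", f1_score(y_true, y_pred))
print("AUC-ROC:", auc_roc(y_true, scores))
```

Accuracy and F1 depend on the chosen threshold, while AUC-ROC is computed from the raw scores and so summarizes ranking quality across all thresholds.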
Abstract
Rheumatoid arthritis (RA) is a chronic disease that causes irreversible joint damage. Early detection, especially in primary care settings, is crucial for effective disease management. This study aimed to identify the factors that help screen individuals at risk of RA to reduce delays in referral to rheumatologists. This analytical and applied research used a questionnaire to gather data from 377 patients at a rheumatology diagnostic center in Ahvaz, Iran, between August and November 2024. Study variables included patients’ articular and extra-articular symptoms at disease onset, demographic data, and initial laboratory markers. After performing statistical correlation analysis, the dataset was split into training (80%) and testing (20%) subsets. Five machine learning models were developed, and the SHAP method was applied to the best-performing model to identify influential features.…
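The SHAP method mentioned in the abstract is built on Shapley values, which split a single prediction into per-feature contributions. The sketch below computes exact Shapley values by enumerating every feature coalition, which is only practical for a handful of features; the SHAP library uses faster model-specific algorithms instead. Everything here is an assumption made for illustration: the scoring function `toy_risk_score`, its coefficients, the feature names, and the example patient are hypothetical and are not the study's model or data.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict

Features = Dict[str, float]


def shapley_values(predict: Callable[[Features], float],
                   x: Features,
                   baseline: Features) -> Dict[str, float]:
    """Exact Shapley values of predict(x) relative to predict(baseline).

    A feature that is absent from a coalition takes its baseline value.
    """
    names = list(x)
    n = len(names)

    def value(coalition: frozenset) -> float:
        mixed = {name: (x[name] if name in coalition else baseline[name])
                 for name in names}
        return predict(mixed)

    phi: Dict[str, float] = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of `name` when added to coalition s.
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_risk_score(f: Features) -> float:
    """Hypothetical risk score with made-up coefficients (not from the paper)."""
    score = (0.5 * f["anti_ccp_positive"]
             + 0.03 * f["tender_joint_count"]
             + 0.02 * f["swollen_joint_count"])
    # Made-up interaction term: both antibodies positive adds extra risk.
    score += 0.1 * f["anti_ccp_positive"] * f["rf_positive"]
    return score


# Hypothetical patient and an all-zero reference patient.
patient = {"anti_ccp_positive": 1, "rf_positive": 1,
           "tender_joint_count": 6, "swollen_joint_count": 4}
baseline = {name: 0 for name in patient}

phi = shapley_values(toy_risk_score, patient, baseline)
for name, contribution in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {contribution:.3f}")
```

The contributions always sum to the difference between the patient's score and the baseline score, and the interaction term is shared equally between the two features involved. Ranking features by the average magnitude of such contributions across many patients is how a SHAP-style importance ordering is usually produced.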
Taxonomy
Topics: Rheumatoid Arthritis Research and Therapies · Imbalanced Data Classification Techniques · Artificial Intelligence in Healthcare
