Diaformer: Automatic Diagnosis via Symptoms Sequence Generation
Junying Chen, Dongfang Li, Qingcai Chen, Wenxiu Zhou, Xin Liu

TL;DR
Diaformer reformulates automatic medical diagnosis as a symptoms sequence generation task using a Transformer-based model, achieving higher accuracy and efficiency than existing reinforcement learning methods across multiple datasets.
Contribution
The paper introduces Diaformer, a novel Transformer-based model that treats diagnosis as a symptoms sequence generation problem, with new orderless training mechanisms for improved performance.
Findings
Outperforms baselines on three datasets, with accuracy gains of 1%, 6%, and 11.5%.
Trains more efficiently than reinforcement learning methods.
Demonstrates the potential of symptoms sequence generation for automatic diagnosis.
Abstract
Automatic diagnosis has attracted increasing attention but remains challenging because it requires multi-step reasoning. Recent works usually address it with reinforcement learning methods. However, these methods show low efficiency and require task-specific reward functions. Because the conversation between doctor and patient allows doctors to probe for symptoms and make diagnoses, the diagnosis process can naturally be seen as the generation of a sequence that includes symptoms and diagnoses. Inspired by this, we reformulate automatic diagnosis as a symptoms Sequence Generation (SG) task and propose a simple but effective automatic Diagnosis model based on Transformer (Diaformer). We first design the symptom attention framework to learn the generation of symptom inquiries and the disease diagnosis. To alleviate the discrepancy between sequential generation and the disorder of implicit symptoms, we…
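As a rough illustration of the sequence-generation framing described in the abstract, the Python sketch below serializes one diagnosis record into a flat token sequence: the symptoms the patient reports up front, then the symptoms found by inquiry, then the disease. The function name `build_sequence`, the special tokens `[BOS]`, `[SEP]`, and `[END]`, and the optional shuffling step are all assumptions made for this example and are not taken from the paper; the shuffle is only one simple way to reflect that inquired symptoms have no inherent order.

```python
import random


def build_sequence(explicit, implicit, disease, rng=None):
    """Serialize one diagnosis record as a flat token sequence.

    explicit: symptoms the patient reports up front (used as context).
    implicit: symptoms found by inquiry (targets to generate).
    disease:  the final diagnosis label.
    rng:      optional random.Random; if given, the implicit symptoms
              are shuffled, since they have no inherent order.

    All token names here are hypothetical, chosen for illustration.
    """
    implicit = list(implicit)  # copy so the caller's list is not mutated
    if rng is not None:
        rng.shuffle(implicit)
    return ["[BOS]", *explicit, "[SEP]", *implicit, "[END]", disease]


if __name__ == "__main__":
    record = (["cough"], ["fever", "headache", "chills"], "flu")
    print(build_sequence(*record))
    print(build_sequence(*record, rng=random.Random(0)))
```

Under this framing, a model trained to predict the next token after `[SEP]` is in effect choosing the next symptom to ask about, and predicting the token after `[END]` amounts to making the diagnosis.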
Taxonomy
Topics: Machine Learning in Healthcare
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Position-Wise Feed-Forward Layer · Adam · Residual Connection · Layer Normalization · Absolute Position Encodings · Dropout
