TL;DR
HypEHR is a compact hyperbolic-geometry model for EHR question answering that exploits the hierarchical structure of clinical data and approaches the performance of LLM-based methods with far fewer parameters.
Contribution
HypEHR embeds codes, visits, and questions in Lorentzian hyperbolic space, explicitly modeling the hierarchical structure of clinical data for parameter-efficient question answering.
Findings
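On two MIMIC-IV-based EHR-QA benchmarks, HypEHR approaches LLM-based methods while using far fewer parameters.
Pretraining with next-visit diagnosis prediction and hierarchy-aware regularization aligns the representations with the ICD ontology.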
Code is publicly available at https://github.com/yuyuliu11037/HypEHR.
Abstract
Electronic health record (EHR) question answering is often handled by LLM-based pipelines that are costly to deploy and do not explicitly leverage the hierarchical structure of clinical data. Motivated by evidence that medical ontologies and patient trajectories exhibit hyperbolic geometry, we propose HypEHR, a compact Lorentzian model that embeds codes, visits, and questions in hyperbolic space and answers queries via geometry-consistent cross-attention with type-specific pointer heads. HypEHR is pretrained with next-visit diagnosis prediction and hierarchy-aware regularization to align representations with the ICD ontology. On two MIMIC-IV-based EHR-QA benchmarks, HypEHR approaches LLM-based methods while using far fewer parameters. Our code is publicly available at https://github.com/yuyuliu11037/HypEHR.
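To make the "Lorentzian model" in the abstract concrete, here is a minimal sketch of the standard Lorentz (hyperboloid) operations such a model typically builds on: the Lorentzian inner product, the exponential map at the origin, and the geodesic distance. This is not taken from the HypEHR repository; the function names, the curvature parameter `c`, and the toy vectors are illustrative assumptions.

```python
import math

# Illustrative sketch only: names and parameters are assumptions,
# not the HypEHR implementation.

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def exp_map_origin(v, c=1.0):
    """Map a Euclidean tangent vector v at the origin onto the
    hyperboloid {x : <x, x>_L = -1/c, x0 > 0} (curvature -c)."""
    sqrt_c = math.sqrt(c)
    norm = math.sqrt(sum(a * a for a in v))
    if norm < 1e-12:
        # The zero vector maps to the origin of the hyperboloid.
        return [1.0 / sqrt_c] + [0.0] * len(v)
    time = math.cosh(sqrt_c * norm) / sqrt_c
    scale = math.sinh(sqrt_c * norm) / (sqrt_c * norm)
    return [time] + [scale * a for a in v]

def lorentz_distance(x, y, c=1.0):
    """Geodesic distance: arccosh(-c * <x, y>_L) / sqrt(c)."""
    # Clamp to 1.0 to guard against rounding slightly below the domain.
    arg = max(1.0, -c * lorentz_inner(x, y))
    return math.acosh(arg) / math.sqrt(c)

# Toy example: a hypothetical "parent" code near the origin and a
# hypothetical "child" code farther out along a similar direction.
origin = exp_map_origin([0.0, 0.0])
parent = exp_map_origin([0.3, 0.0])
child = exp_map_origin([1.5, 0.2])

print(lorentz_inner(child, child))      # close to -1 (point lies on the hyperboloid)
print(lorentz_distance(origin, parent)) # equals the Euclidean norm of the tangent vector
print(lorentz_distance(parent, child))
```

In hyperbolic embeddings of hierarchies, general concepts tend to sit near the origin and specific ones farther out, which is why distance from the origin is often read as a proxy for depth in the hierarchy. How HypEHR uses these quantities in its cross-attention and pointer heads is described in the paper and code, not here.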
