Classification of hierarchical text using geometric deep learning: the case of clinical trials corpus
Sohrab Ferdowsi, Nikolay Borissov, Julien Knafou, Poorya Amini, Douglas Teodoro

TL;DR
This paper introduces a geometric deep learning approach with a novel selective graph pooling method to classify hierarchical clinical trial documents, achieving high accuracy and providing interpretability insights.
Contribution
It proposes a new selective graph pooling technique for hierarchical document classification and applies it to clinical trial protocol categorization with state-of-the-art results.
Findings
F1-scores around 0.85 on a publicly available registry of around 360K clinical trial protocols
Selective graph pooling improves classification performance over message passing alone
Selective pooling adds insight into which parts of the document hierarchy matter
Abstract
We consider the hierarchical representation of documents as graphs and use geometric deep learning to classify them into different categories. While graph neural networks can efficiently handle the variable structure of hierarchical documents using permutation-invariant message-passing operations, we show that we can gain extra performance improvements using our proposed selective graph pooling operation, which arises from the fact that some parts of the hierarchy are invariable across different documents. We apply our model to classify clinical trial (CT) protocols into completed and terminated categories. We use both bag-of-words-based and pre-trained transformer-based embeddings to featurize the graph nodes, achieving F1-scores around 0.85 on a publicly available large-scale CT registry of around 360K protocols. We further demonstrate how the selective pooling can add insights…
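The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the exact pooling operation, so the mean aggregation, the per-section pooling rule, and every name below (`message_pass`, `selective_pool`, the section types "title", "eligibility", and "arm") are assumptions chosen for illustration. The sketch treats a protocol as a small tree, runs one round of message passing, then pools the sections assumed to be present in every document separately from the variable ones.

```python
from collections import defaultdict


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def message_pass(features, edges):
    """One round of mean-aggregation message passing on an undirected graph."""
    neighbors = defaultdict(list)
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    # Each node averages its own features with those of its neighbors.
    return {
        node: mean([feat] + [features[n] for n in neighbors[node]])
        for node, feat in features.items()
    }


def selective_pool(features, node_types, fixed_types):
    """Assumed pooling rule: one mean per fixed section type, in a fixed
    order, followed by one mean over all remaining (variable) nodes."""
    dim = len(next(iter(features.values())))
    pooled = []
    for section in fixed_types:
        group = [features[n] for n in features if node_types[n] == section]
        pooled.extend(mean(group) if group else [0.0] * dim)
    rest = [features[n] for n in features if node_types[n] not in fixed_types]
    pooled.extend(mean(rest) if rest else [0.0] * dim)
    return pooled


# Hypothetical toy protocol: a root with two fixed sections and two arms.
node_types = {0: "root", 1: "title", 2: "eligibility", 3: "arm", 4: "arm"}
edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
features = {
    0: [0.0, 0.0],
    1: [1.0, 0.0],
    2: [0.0, 1.0],
    3: [2.0, 2.0],
    4: [4.0, 4.0],
}

hidden = message_pass(features, edges)
embedding = selective_pool(hidden, node_types, ["title", "eligibility"])
```

Under these assumptions the embedding has a fixed length (here three slots of two dimensions each) regardless of how many "arm" nodes a document contains, so it could be passed to any standard classifier. Keeping the fixed sections in separate slots is also what would let a downstream model attribute its decision to a particular part of the hierarchy.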
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Advanced Graph Neural Networks · Topic Modeling
