Contextual Graph Transformer: A Small Language Model for Enhanced Engineering Document Information Extraction
Karan Reddy, Mayukha Pal

TL;DR
The paper introduces the Contextual Graph Transformer (CGT), a hybrid neural architecture combining GNNs and Transformers, designed to improve information extraction from complex engineering documents with enhanced structure and context understanding.
Contribution
It presents a novel hybrid model that integrates graph neural networks with Transformers for domain-specific question answering, offering improved accuracy and efficiency over existing models.
Findings
CGT achieves 24.7% higher accuracy than GPT-2.
CGT uses 62.4% fewer parameters than comparable models.
The model effectively captures structural token interactions and semantic coherence.
Abstract
Standard transformer-based language models, while powerful for general text, often struggle with the fine-grained syntax and entity relationships in complex technical and engineering documents. To address this, we propose the Contextual Graph Transformer (CGT), a hybrid neural architecture that combines Graph Neural Networks (GNNs) and Transformers for domain-specific question answering. CGT constructs a dynamic graph over input tokens using sequential, skip-gram, and semantic similarity edges, which is processed by GATv2Conv layers for local structure learning. These enriched embeddings are then passed to a Transformer encoder to capture global dependencies. Technical domains often require specialized language models with stronger contextualization and structure awareness than generic large models provide. CGT offers a parameter-efficient solution for such use cases. Integrated into a…
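The graph construction step named in the abstract (sequential, skip-gram, and semantic similarity edges over input tokens) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_token_graph`, the `max_skip` distance, the `sim_threshold` cutoff, and the choice of cosine similarity are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_token_graph(embeddings, max_skip=2, sim_threshold=0.9):
    """Build undirected typed edges over token positions.

    Hypothetical parameters (not taken from the paper):
      max_skip      -- largest positional distance for skip-gram edges
      sim_threshold -- minimum cosine similarity for a semantic edge
    Returns a sorted list of (src, dst, edge_type) with src < dst.
    """
    n = len(embeddings)
    edges = []
    connected = set()  # position pairs that already have an edge

    # Sequential edges: each token links to its immediate successor.
    for i in range(n - 1):
        edges.append((i, i + 1, "sequential"))
        connected.add((i, i + 1))

    # Skip-gram edges: tokens at distance 2..max_skip.
    for dist in range(2, max_skip + 1):
        for i in range(n - dist):
            edges.append((i, i + dist, "skip"))
            connected.add((i, i + dist))

    # Semantic edges: remaining pairs whose embeddings are similar enough.
    for i in range(n):
        for j in range(i + 1, n):
            if (i, j) in connected:
                continue
            if cosine(embeddings[i], embeddings[j]) >= sim_threshold:
                edges.append((i, j, "semantic"))

    return sorted(edges)


# Toy 2-dimensional "embeddings" for five tokens.
toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.0, 1.0], [1.0, 0.0]]
graph = build_token_graph(toy, max_skip=2, sim_threshold=0.9)
for edge in graph:
    print(edge)
```

In this sketch each undirected edge is stored once. A graph attention layer such as GATv2Conv would typically consume the edges as a directed index, so each pair would likely need to be emitted in both directions before being passed on; that conversion is left out here.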
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Machine Learning in Healthcare
