Multimodal Quantum Vision Transformer for Enzyme Commission Classification from Biochemical Representations
Murat Isik, Mandeep Kaur Saggi, Humaira Gowher, Sabre Kais

TL;DR
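This paper introduces a multimodal Quantum Vision Transformer framework that integrates four biochemical data types (protein sequence embeddings, quantum-derived electronic descriptors, molecular graphs, and 2D molecular images) for enzyme classification, reaching 85.1% top-1 accuracy and outperforming sequence-only and other QML models.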
Contribution
A novel multimodal QML framework that combines four biochemical modalities with a Quantum Vision Transformer backbone and cross-attention fusion to improve Enzyme Commission classification.
Findings
Achieved 85.1% top-1 accuracy in EC classification.
Outperformed sequence-only and other QML models.
Effectively captures stereoelectronic interactions behind enzyme function.
Abstract
Accurately predicting enzyme functionality remains one of the major challenges in computational biology, particularly for enzymes with limited structural annotations or sequence homology. We present a novel multimodal Quantum Machine Learning (QML) framework that enhances Enzyme Commission (EC) classification by integrating four complementary biochemical modalities: protein sequence embeddings, quantum-derived electronic descriptors, molecular graph structures, and 2D molecular image representations. The framework is built on a Quantum Vision Transformer (QVT) backbone equipped with modality-specific encoders and a unified cross-attention fusion module. By integrating graph features and spatial patterns, our method captures key stereoelectronic interactions behind enzyme function. Experimental results demonstrate that our multimodal QVT model achieves a top-1 accuracy of 85.1%, outperforming sequence-only…
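To make the fusion step in the abstract concrete, here is a minimal classical sketch of cross-attention over four modality embeddings. It is not the authors' quantum implementation: the embedding sizes, random weights, mean-pooled query, and the choice of seven output classes (the top-level EC classes) are all assumptions made for illustration, and every function and variable name below is hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(query, tokens, w_q, w_k, w_v):
    """Fuse modality tokens with single-head scaled dot-product attention.

    query:  (d,)   vector that attends over the modalities
    tokens: (m, d) one embedding per modality
    Returns the fused vector (d_k,) and the attention weights (m,).
    """
    q = query @ w_q                          # (d_k,)
    k = tokens @ w_k                         # (m, d_k)
    v = tokens @ w_v                         # (m, d_k)
    scores = k @ q / np.sqrt(q.shape[-1])    # (m,)
    weights = softmax(scores)                # sums to 1 over modalities
    return weights @ v, weights

rng = np.random.default_rng(0)
d, d_k, n_classes = 16, 8, 7                 # assumed sizes, not from the paper

# Stand-ins for the outputs of the four modality-specific encoders.
modalities = ["sequence", "electronic", "graph", "image"]
tokens = rng.normal(size=(len(modalities), d))

# Assumption: use the mean of the modality embeddings as the query.
query = tokens.mean(axis=0)

# Random projection matrices in place of learned parameters.
w_q, w_k, w_v = (rng.normal(size=(d, d_k)) / np.sqrt(d) for _ in range(3))
w_out = rng.normal(size=(d_k, n_classes)) / np.sqrt(d_k)

fused, weights = cross_attention_fuse(query, tokens, w_q, w_k, w_v)
probs = softmax(fused @ w_out)               # class probabilities
predicted_class = int(np.argmax(probs))      # top-1 prediction (0-based index)
```

The attention weights give one score per modality, so in a trained model of this shape they would indicate how much each input type contributes to a given prediction. Top-1 accuracy is then the fraction of samples whose predicted class matches the true label.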
Taxonomy
Topics: Machine Learning in Bioinformatics
