FOCA: Multimodal Malware Classification via Hyperbolic Cross-Attention
Nitin Choudhury, Bikrant Bikram Pratap Maurya, Orchid Chetia Phukan, Arun Balaji Buduru

TL;DR
FOCA is a multimodal malware classification framework that fuses audio and visual representations of binaries in hyperbolic space, outperforming unimodal models and most Euclidean multimodal baselines.
Contribution
FOCA is presented as the first framework to use hyperbolic geometry for multimodal malware classification, with the aim of better modeling the hierarchical relationships between audio and visual features.
Findings
Consistently outperforms unimodal models on the Mal-Net and CICMalDroid2020 benchmarks.
Surpasses most Euclidean multimodal baselines.
Achieves state-of-the-art results in malware classification.
Abstract
In this work, we introduce FOCA, a novel multimodal framework for malware classification that jointly leverages audio and visual modalities. Unlike conventional Euclidean-based fusion methods, FOCA is the first to exploit the intrinsic hierarchical relationships between audio and visual representations within hyperbolic space. To achieve this, raw binaries are transformed into both audio and visual representations, which are then processed through three key components: (i) a hyperbolic projection module that maps Euclidean embeddings into the Poincaré ball, (ii) a hyperbolic cross-attention mechanism that aligns multimodal dependencies under curvature-aware constraints, and (iii) a Möbius addition-based fusion layer. Comprehensive experiments on two benchmark datasets, Mal-Net and CICMalDroid2020, show that FOCA consistently outperforms unimodal models, surpasses most Euclidean…
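The abstract names a projection into the Poincaré ball and a Möbius addition-based fusion layer but gives no formulas. The following is a minimal sketch of those two operations using their standard textbook definitions (the exponential map at the origin and Möbius addition for a ball of curvature -c). The function names, the curvature parameter `c`, and the toy embeddings are assumptions for illustration and are not taken from the paper; the cross-attention component is not sketched because the abstract does not specify it.

```python
import math


def norm(x):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(v * v for v in x))


def expmap0(v, c=1.0, eps=1e-9):
    """Exponential map at the origin (standard definition, assumed here).

    Maps a Euclidean vector v into the Poincare ball of curvature -c,
    i.e. the open ball of radius 1 / sqrt(c).
    """
    n = norm(v)
    if n < eps:
        # Near the origin the map is approximately the identity.
        return list(v)
    sqrt_c = math.sqrt(c)
    scale = math.tanh(sqrt_c * n) / (sqrt_c * n)
    return [scale * x for x in v]


def mobius_add(x, y, c=1.0):
    """Mobius addition of two points inside the Poincare ball."""
    xy = sum(a * b for a, b in zip(x, y))
    x2 = sum(a * a for a in x)
    y2 = sum(b * b for b in y)
    coef_x = 1 + 2 * c * xy + c * y2
    coef_y = 1 - c * x2
    denom = 1 + 2 * c * xy + c * c * x2 * y2
    return [(coef_x * a + coef_y * b) / denom for a, b in zip(x, y)]


# Hypothetical toy embeddings standing in for audio and visual features.
audio_h = expmap0([0.5, -1.2, 2.0])
visual_h = expmap0([1.0, 0.3, -0.7])
fused = mobius_add(audio_h, visual_h)
print(norm(fused) < 1.0)  # the fused point stays inside the unit ball
```

With `c = 1.0` both projected embeddings lie strictly inside the unit ball, and Möbius addition keeps the result inside it, which is why it can serve as a fusion operator where ordinary vector addition could leave the ball.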
Taxonomy
Topics: Advanced Malware Detection Techniques · Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning
