GoCoMA: Hyperbolic Multimodal Representation Fusion for Large Language Model-Generated Code Attribution
Nitin Choudhury, Bikrant Bikram Pratap Maurya, Bhavinkumar Vinodbhai Kuwar, Arun Balaji Buduru

TL;DR
GoCoMA is a hyperbolic multimodal framework that fuses code stylometry with image representations of binary pre-executable artifacts to attribute code to its originating LLM, outperforming unimodal baselines on CoDET-M4 and LLMAuthorBench.
Contribution
It introduces a Poincaré-ball embedding of modality features and a geodesic-cosine similarity-based cross-modal attention (GCSA) fusion mechanism for attributing code from multimodal data.
Findings
GoCoMA outperforms unimodal baselines on the CoDET-M4 and LLMAuthorBench datasets.
Hyperbolic embeddings enhance the modeling of hierarchical relationships in code attribution.
The geodesic-cosine similarity-based cross-modal attention (GCSA) fusion mechanism combines stylometric and binary-image representations for source identification.
Abstract
Large Language Models (LLMs) trained on massive code corpora are now increasingly capable of generating code that is hard to distinguish from human-written code. This raises practical concerns, including security vulnerabilities and licensing ambiguity, and also motivates a forensic question: 'Who (or which LLM) wrote this piece of code?' We present GoCoMA, a multimodal framework that models an extrinsic hierarchy between (i) code stylometry, capturing higher-level structural and stylistic signatures, and (ii) image representations of binary pre-executable artifacts (BPEA), capturing lower-level, execution-oriented byte semantics shaped by compilation and toolchains. GoCoMA projects modality embeddings into a hyperbolic Poincaré ball, fuses them via a geodesic-cosine similarity-based cross-modal attention (GCSA) fusion mechanism, and back-projects the fused representation to Euclidean…
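The projection and back-projection steps mentioned in the abstract can be illustrated with the standard Poincaré-ball maps at the origin. The following is a minimal sketch, not the authors' implementation: the function names, the curvature parameter `c`, and the example vectors are assumptions made for illustration, and the GCSA fusion step itself is not reproduced because the summary does not specify it.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def exp_map_origin(v, c=1.0):
    """Project a Euclidean (tangent) vector onto the Poincare ball of curvature -c."""
    n = norm(v)
    if n == 0.0:
        return list(v)
    scale = math.tanh(math.sqrt(c) * n) / (math.sqrt(c) * n)
    return [scale * x for x in v]

def log_map_origin(y, c=1.0):
    """Back-project a point of the Poincare ball to Euclidean (tangent) space."""
    n = norm(y)
    if n == 0.0:
        return list(y)
    scale = math.atanh(math.sqrt(c) * n) / (math.sqrt(c) * n)
    return [scale * x for x in y]

def poincare_distance(x, y, c=1.0):
    """Geodesic distance between two points inside the Poincare ball."""
    diff2 = sum((a - b) ** 2 for a, b in zip(x, y))
    denom = (1 - c * sum(a * a for a in x)) * (1 - c * sum(b * b for b in y))
    return math.acosh(1 + 2 * c * diff2 / denom) / math.sqrt(c)

# Hypothetical stand-ins for a stylometry embedding and a BPEA image embedding.
style_vec = [0.3, -1.2, 0.8]
bpea_vec = [2.0, 0.5, -0.4]

style_h = exp_map_origin(style_vec)   # lies strictly inside the unit ball
bpea_h = exp_map_origin(bpea_vec)
geodesic = poincare_distance(style_h, bpea_h)
recovered = log_map_origin(style_h)   # approximately equal to style_vec
```

Any fusion operating on `style_h` and `bpea_h` would sit between the two maps; here the round trip only shows that the back-projection recovers the original Euclidean vector up to floating-point error.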
