FuDoBa: Fusing Document and Knowledge Graph-based Representations with Bayesian Optimisation
Boshko Koloski, Senja Pollak, Roberto Navigli, Blaž Škrlj

TL;DR
FuDoBa is a Bayesian optimization method that combines LLM-based embeddings with domain knowledge to create efficient, interpretable, and task-relevant document representations that improve classification performance.
Contribution
It introduces a novel fusion approach that integrates LLM embeddings with structured knowledge, reducing dimensionality and complexity for domain-specific tasks.
Findings
Outperforms LLM-only embeddings on six datasets across two domains.
Produces low-dimensional, interpretable representations.
Enhances classification accuracy with AutoML classifiers.
Abstract
Building on the success of Large Language Models (LLMs), LLM-based representations have dominated the document representation landscape, achieving great performance on the document embedding benchmarks. However, the high-dimensional, computationally expensive embeddings from LLMs tend to be either too generic or inefficient for domain-specific applications. To address these limitations, we introduce FuDoBa, a Bayesian optimisation-based method that integrates LLM-based embeddings with domain-specific structured knowledge, sourced both locally and from external repositories like WikiData. This fusion produces low-dimensional, task-relevant representations while reducing training complexity and yielding interpretable early-fusion weights for enhanced classification performance. We demonstrate the effectiveness of our approach on six datasets in two domains, showing that when paired with…
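The idea of weighted early fusion described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes each document has two precomputed views (a hypothetical LLM embedding and a hypothetical knowledge-graph embedding), uses made-up toy vectors, scores candidates with a toy nearest-centroid classifier instead of AutoML, and swaps in plain random search where the paper uses Bayesian optimisation. It also omits the dimensionality reduction step. All function names are illustrative.

```python
import math
import random


def l2_normalise(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(views, weights):
    """Early fusion: weight each L2-normalised view, then concatenate."""
    fused = []
    for vec, w in zip(views, weights):
        fused.extend(w * x for x in l2_normalise(vec))
    return fused


def centroid_accuracy(X, y):
    """Toy objective: training accuracy of a nearest-centroid classifier."""
    labels = sorted(set(y))
    centroids = {}
    for lab in labels:
        rows = [x for x, t in zip(X, y) if t == lab]
        centroids[lab] = [sum(col) / len(rows) for col in zip(*rows)]
    correct = 0
    for x, t in zip(X, y):
        pred = min(
            labels,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
        )
        correct += pred == t
    return correct / len(y)


def search_fusion_weights(docs, y, n_trials=50, seed=0):
    """Random search over per-view weights (a stand-in for Bayesian optimisation)."""
    rng = random.Random(seed)
    n_views = len(docs[0])

    def score(weights):
        return centroid_accuracy([fuse(views, weights) for views in docs], y)

    # Start from equal weights so the result is never worse than that baseline.
    best_w = [1.0] * n_views
    best_s = score(best_w)
    for _ in range(n_trials):
        w = [rng.random() for _ in range(n_views)]
        s = score(w)
        if s > best_s:
            best_w, best_s = w, s
    return best_w, best_s


# Made-up toy data: (hypothetical LLM view, hypothetical knowledge-graph view).
docs = [
    ([1.0, 0.2], [1.0, 0.0]),
    ([0.2, 1.0], [0.9, 0.1]),
    ([1.0, 0.1], [0.0, 1.0]),
    ([0.1, 1.0], [0.1, 0.9]),
]
y = [0, 0, 1, 1]

best_weights, best_score = search_fusion_weights(docs, y)
print("weights per view:", best_weights, "accuracy:", best_score)
```

The learned weights are what makes this kind of fusion inspectable: a view whose weight ends up near zero contributes little to the fused representation. A Bayesian optimiser would replace the random proposals with ones guided by a surrogate model of the objective.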
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Machine Learning in Healthcare
