Financial Bond Similarity Search Using Representation Learning

Amin Haeri; Mahdi Ghelichi; Nishant Agrawal; David Li; Catalina Gomez Sanchez

arXiv:2602.07020 · q-fin.ST · February 10, 2026

TL;DR

This paper introduces embedding models that capture categorical bond attributes such as issuer sector and domicile, improving similarity search and risk modeling in fixed-income analytics over one-hot encoding and other baselines.

Contribution

The paper presents an embedding approach that captures semantic similarities among categorical bond attributes and outperforms one-hot encoding and other baselines.

Findings

01. Embedding models outperform one-hot encoding in bond similarity tasks.

02. Semantic embeddings of bond attributes improve risk modeling and curve construction.

03. Better representation of categorical attributes enhances fixed-income analytics.

Abstract

Finding similar bonds remains challenging in fixed-income analytics, as numerical financial attributes often overshadow categorical non-financial ones such as issuer sector and domicile. This paper shows that these categorical attributes dominate the predictability of spread curves and proposes embedding models to capture their semantic similarities, outperforming one-hot encoding and many other baselines. Evaluated via sparse-issuer augmentation, the approach improves risk modeling and curve construction.
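
To illustrate why dense embeddings can help with categorical attributes, the following is a minimal sketch and not the authors' model. The sector names and the two-dimensional embedding values are hypothetical and hand-picked for illustration; in practice such vectors would be learned from data. It shows that one-hot vectors treat every pair of distinct categories as equally dissimilar, while dense vectors can place related categories closer together.

```python
import numpy as np

# Hypothetical sector labels, chosen for illustration (not from the paper).
SECTORS = ["banking", "insurance", "utilities"]


def one_hot(index, size):
    """Return a one-hot vector of the given size with a 1 at `index`."""
    v = np.zeros(size)
    v[index] = 1.0
    return v


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# One-hot encoding: every pair of distinct sectors is orthogonal.
one_hot_vectors = {s: one_hot(i, len(SECTORS)) for i, s in enumerate(SECTORS)}

# Illustrative dense embeddings. These values are assumptions, set by hand
# so that the two financial sectors sit close together; they are not learned.
embeddings = {
    "banking": np.array([0.9, 0.1]),
    "insurance": np.array([0.8, 0.3]),
    "utilities": np.array([0.1, 0.9]),
}


def most_similar(query, vectors):
    """Return the other category with the highest cosine similarity to `query`."""
    candidates = [s for s in vectors if s != query]
    return max(candidates, key=lambda s: cosine(vectors[query], vectors[s]))


onehot_sim = cosine(one_hot_vectors["banking"], one_hot_vectors["insurance"])
embed_sim_close = cosine(embeddings["banking"], embeddings["insurance"])
embed_sim_far = cosine(embeddings["banking"], embeddings["utilities"])
nearest = most_similar("banking", embeddings)

print(f"one-hot banking vs insurance:   {onehot_sim:.3f}")
print(f"embedding banking vs insurance: {embed_sim_close:.3f}")
print(f"embedding banking vs utilities: {embed_sim_far:.3f}")
print(f"nearest sector to banking:      {nearest}")
```

Under one-hot encoding the similarity between any two different sectors is zero, so a similarity search cannot rank one sector as closer than another. With dense vectors the ranking depends on the geometry of the embedding space, which is the property the abstract refers to as capturing semantic similarity.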

Taxonomy

Topics: Financial Distress and Bankruptcy Prediction · Stock Market Forecasting Methods · Machine Learning in Healthcare