# Joint Representations of Text and Knowledge Graphs for Retrieval and Evaluation

**Authors:** Teven Le Scao, Claire Gardent

arXiv: 2302.14785 · 2023-03-01

## TL;DR

This paper introduces a method for learning aligned vector representations of text and knowledge graph elements using contrastive training, enabling cross-modal retrieval and evaluation of data-to-text output without reference texts.

## Contribution

It presents an approach to jointly embedding text and knowledge graphs that works around the lack of parallel data through contrastive training on heuristics-based datasets and data augmentation, and it introduces EREDAT, a new similarity metric for data-to-text evaluation.

## Key findings

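- EREDAT outperforms or matches state-of-the-art metrics in correlation with human judgments on WebNLG, even though it does not require a reference text.
- On WebNLG, the embedding models learn aligned representations suitable for retrieval.
- Contrastive training on heuristics-based datasets, combined with data augmentation, compensates for the lack of parallel (KB graph, text) data.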
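
As an illustration of the contrastive training mentioned above, the following is a minimal sketch of a symmetric, InfoNCE-style loss over a batch of (KB graph, text) embedding pairs. The exact objective used in the paper is not stated in this summary, so the loss form, the use of in-batch negatives, the `temperature` value, and all function names here are assumptions made for illustration only.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits):
    """Row-wise log-softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def symmetric_contrastive_loss(graph_emb, text_emb, temperature=0.05):
    """Hypothetical symmetric contrastive loss with in-batch negatives.

    Row i of graph_emb and row i of text_emb are assumed to form a matching
    pair; every other row in the batch acts as a negative example.
    """
    g = l2_normalize(graph_emb)
    t = l2_normalize(text_emb)
    logits = g @ t.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(len(g))
    loss_g2t = -log_softmax(logits)[idx, idx].mean()    # graph -> text direction
    loss_t2g = -log_softmax(logits.T)[idx, idx].mean()  # text -> graph direction
    return 0.5 * (loss_g2t + loss_t2g)

# Toy check: aligned pairs should yield a lower loss than misaligned ones.
graphs = np.eye(4)
aligned_loss = symmetric_contrastive_loss(graphs, graphs)
misaligned_loss = symmetric_contrastive_loss(graphs, np.roll(graphs, 1, axis=0))
print(aligned_loss < misaligned_loss)
```

Minimizing a loss of this kind pulls each graph embedding toward its paired text and pushes it away from the other texts in the batch, which is the property that makes the shared vector space usable for retrieval.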

## Abstract

A key feature of neural models is that they can produce semantic vector representations of objects (texts, images, speech, etc.) ensuring that similar objects are close to each other in the vector space. While much work has focused on learning representations for other modalities, there are no aligned cross-modal representations for text and knowledge base (KB) elements. One challenge for learning such representations is the lack of parallel data, which we use contrastive training on heuristics-based datasets and data augmentation to overcome, training embedding models on (KB graph, text) pairs. On WebNLG, a cleaner manually crafted dataset, we show that they learn aligned representations suitable for retrieval. We then fine-tune on annotated data to create EREDAT (Ensembled Representations for Evaluation of DAta-to-Text), a similarity metric between English text and KB graphs. EREDAT outperforms or matches state-of-the-art metrics in terms of correlation with human judgments on WebNLG even though, unlike them, it does not require a reference text to compare against.
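
Once graphs and texts share a vector space, both retrieval and reference-free scoring reduce to comparing embeddings. The sketch below assumes cosine similarity as the comparison function and uses hand-written toy vectors in place of real encoder outputs; `retrieve_top_k` and `reference_free_score` are hypothetical helper names, and the way EREDAT ensembles its representations is not described in this summary, so it is not modeled here.

```python
import numpy as np

def cosine_similarity_matrix(a, b, eps=1e-12):
    """Cosine similarity between every row of a and every row of b."""
    a = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), eps)
    b = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), eps)
    return a @ b.T

def retrieve_top_k(query_emb, candidate_embs, k=1):
    """Return indices and similarities of the k candidates closest to the query."""
    sims = cosine_similarity_matrix(query_emb[None, :], candidate_embs)[0]
    order = np.argsort(-sims)[:k]           # descending similarity
    return order, sims[order]

def reference_free_score(graph_emb, generated_text_emb):
    """Score a generated text directly against its input graph, with no reference text."""
    return float(cosine_similarity_matrix(graph_emb[None, :],
                                          generated_text_emb[None, :])[0, 0])

# Toy embeddings standing in for encoder outputs (assumed values).
graph_query = np.array([1.0, 0.0])
text_candidates = np.array([[0.9, 0.1],    # close to the graph
                            [0.0, 1.0],    # unrelated
                            [-1.0, 0.0]])  # opposite direction
indices, similarities = retrieve_top_k(graph_query, text_candidates, k=2)
print(indices, similarities)
print(reference_free_score(graph_query, text_candidates[0]))
```

The same similarity function serves both uses: ranking candidate texts for a graph (or graphs for a text), and scoring a single generated text against the graph it was meant to verbalize.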

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.14785/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/2302.14785/full.md

## References

48 references — full list in the complete paper: https://tomesphere.com/paper/2302.14785/full.md

---
Source: https://tomesphere.com/paper/2302.14785