# Learning to Predict Novel Noun-Noun Compounds

**Authors:** Prajit Dhar, Lonneke van der Plas

arXiv: 1906.03634 · 2019-09-26

## TL;DR

This paper presents models that predict plausible but previously unseen noun-noun compounds by training on a time-stamped corpus. For the best model, around 85% of the generated novel compounds are attested in previously unseen data.

## Contribution

It introduces temporally and contextually aware compositional models for predicting novel noun-noun compounds, which the authors present as a new task.

## Key findings

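- Around 85% of the novel compounds generated by the best model are attested in previously unseen data
- An additional estimated 5% are plausible despite not being attested in the recent corpus, based on judgments from independent human raters
- The models generalize from observed compounds and corrupted negative instances to judge the plausibility of combinations they were not trained on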

## Abstract

We introduce temporally and contextually-aware models for the novel task of predicting unseen but plausible concepts, as conveyed by noun-noun compounds in a time-stamped corpus. We train compositional models on observed compounds, more specifically the composed distributed representations of their constituents across a time-stamped corpus, while giving it corrupted instances (where head or modifier are replaced by a random constituent) as negative evidence. The model captures generalisations over this data and learns what combinations give rise to plausible compounds and which ones do not. After training, we query the model for the plausibility of automatically generated novel combinations and verify whether the classifications are accurate. For our best model, we find that in around 85% of the cases, the novel compounds generated are attested in previously unseen data. An additional estimated 5% are plausible despite not being attested in the recent corpus, based on judgments from independent human raters.
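The negative-evidence step described in the abstract (replacing the head or the modifier with a random constituent) can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `corrupt_compounds`, the toy compound list, the 50/50 choice between corrupting head and modifier, and the filter that discards candidates already observed as compounds are all assumptions made for illustration.

```python
import random


def corrupt_compounds(compounds, seed=0, max_tries=100):
    """Build one corrupted (negative) instance per observed compound.

    Each compound is a (modifier, head) pair. Either the modifier or the
    head is swapped for a random constituent drawn from the observed
    vocabulary. Candidates that are themselves observed compounds are
    rejected (an assumption of this sketch, not a detail from the paper).
    """
    rng = random.Random(seed)
    observed = set(compounds)
    modifiers = sorted({m for m, _ in compounds})
    heads = sorted({h for _, h in compounds})

    negatives = []
    for modifier, head in compounds:
        for _ in range(max_tries):  # bounded retries to avoid looping forever
            if rng.random() < 0.5:
                candidate = (rng.choice(modifiers), head)  # corrupt modifier
            else:
                candidate = (modifier, rng.choice(heads))  # corrupt head
            if candidate not in observed:
                negatives.append(candidate)
                break
    return negatives


# Hypothetical toy data: observed compounds serve as positive instances.
compounds = [
    ("water", "bottle"),
    ("coffee", "cup"),
    ("road", "map"),
    ("apple", "tree"),
]
negatives = corrupt_compounds(compounds)

# Label positives 1 and corrupted instances 0 for a binary classifier.
training_data = [(c, 1) for c in compounds] + [(n, 0) for n in negatives]
```

In the paper, the classifier operates on composed distributed representations of the constituents rather than on the raw word pairs. The sketch covers only how positive and corrupted pairs might be assembled before that step.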

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.03634/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1906.03634/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/1906.03634/full.md

---
Source: https://tomesphere.com/paper/1906.03634