Towards a continuous modeling of natural language domains
Sebastian Ruder, Parsa Ghaffari, and John G. Breslin

TL;DR
This paper proposes treating natural language domains as a continuous spectrum rather than as discrete categories, in order to better capture language variation and adaptation. It suggests representation learning-based models for this purpose and proposes dialogue modeling as a test bed.
Contribution
It extends the notion of discrete domains from domain adaptation research to a continuous spectrum, proposes representation learning-based models that can adapt to continuous domains, and proposes dialogue modeling as a test bed.
Findings
Proposed representation learning-based models are designed to adapt to continuous domains
Discrete domains contrast with the many heterogeneous, overlapping domains that humans navigate
The continuous framing can be used to investigate change and variation in language
Abstract
Humans continuously adapt their style and language to a variety of domains. However, a reliable definition of 'domain' has eluded researchers thus far. Additionally, the notion of discrete domains stands in contrast to the multiplicity of heterogeneous domains that humans navigate, many of which overlap. In order to better understand the change and variation of human language, we draw on research in domain adaptation and extend the notion of discrete domains to the continuous spectrum. We propose representation learning-based models that can adapt to continuous domains and detail how these can be used to investigate variation in language. To this end, we propose to use dialogue modeling as a test bed due to its proximity to language modeling and its social component.
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
