Learning Architectures from an Extended Search Space for Language Modeling
Yinqiao Li, Chi Hu, Yuhao Zhang, Nuo Xu, Yufan Jiang, Tong Xiao, Jingbo Zhu, Tongran Liu, Changliang Li

TL;DR
This paper introduces an extended neural architecture search method that learns both intra-cell and inter-cell structures, leading to improved language models and transferability to other NLP tasks.
Contribution
The paper proposes a general approach to extending the NAS search space and a joint learning method that searches intra-cell and inter-cell architectures simultaneously, implemented in a differentiable architecture search system.
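As a rough illustration of what searching intra-cell and inter-cell structure in a differentiable system can look like, the sketch below uses the generic softmax relaxation common in differentiable NAS. It is not the paper's ESS formulation, which this summary does not spell out. The candidate operation set, the scalar states, and every name (`INTRA_OPS`, `alpha`, `beta`, `cell_step`, `discretize`) are assumptions made for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations inside a cell (assumed, not from the paper).
INTRA_OPS = {
    "tanh": math.tanh,
    "sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "relu": lambda x: max(0.0, x),
    "identity": lambda x: x,
}


def mixed_intra_op(x, alpha):
    """Intra-cell relaxation: softmax-weighted sum of all candidate ops on x."""
    weights = softmax(alpha)
    return sum(w * op(x) for w, op in zip(weights, INTRA_OPS.values()))


def mixed_inter_input(prev_states, beta):
    """Inter-cell relaxation: softmax-weighted sum over earlier cell outputs."""
    weights = softmax(beta)
    return sum(w * h for w, h in zip(weights, prev_states))


def cell_step(x, prev_states, alpha, beta):
    """One cell: combine earlier cell outputs (inter), then apply the mixed op (intra)."""
    context = mixed_inter_input(prev_states, beta)
    return mixed_intra_op(x + context, alpha)


def discretize(logits, names):
    """After search, keep only the choice with the largest architecture weight."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return names[best]
```

In a setup like this, `alpha` and `beta` would both be trained by gradient descent alongside the model weights, which is one simple way to read "joint" search. Calling `discretize(alpha, list(INTRA_OPS))` then selects the single operation to keep in the final architecture.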
Findings
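The learned architecture significantly outperforms a strong baseline on the PTB and WikiText language modeling tasks.
It achieves a new state of the art on PTB.
The learned architectures transfer to other systems, improving state-of-the-art results on the CoNLL and WNUT NER tasks and the CoNLL chunking task.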
Abstract
Neural architecture search (NAS) has advanced significantly in recent years but most NAS systems restrict search to learning architectures of a recurrent or convolutional cell. In this paper, we extend the search space of NAS. In particular, we present a general approach to learn both intra-cell and inter-cell architectures (call it ESS). For a better search result, we design a joint learning method to perform intra-cell and inter-cell NAS simultaneously. We implement our model in a differentiable architecture search system. For recurrent neural language modeling, it outperforms a strong baseline significantly on the PTB and WikiText data, with a new state-of-the-art on PTB. Moreover, the learned architectures show good transferability to other systems. E.g., they improve state-of-the-art systems on the CoNLL and WNUT named entity recognition (NER) tasks and CoNLL chunking task,…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning
