Deriving Neural Architectures from Sequence and Graph Kernels

Tao Lei; Wengong Jin; Regina Barzilay; Tommi Jaakkola

arXiv:1705.09037 · cs.NE · October 31, 2017 · 22 cites

Open Access

TL;DR

This paper introduces a kernel-based method to derive neural architectures for structured data, enabling end-to-end training and achieving state-of-the-art results in language modeling and molecular graph tasks.

Contribution

It formalizes a new class of recurrent neural modules derived from combinatorial structure kernels, bridging kernel methods and neural architecture design.

Findings

01. Achieved state-of-the-art results in language modeling.

02. Achieved state-of-the-art results on molecular graph regression.

03. Proposed a framework for deriving neural operations from kernels over sequences and graphs.

Abstract

The design of neural architectures for structured objects is typically guided by experimental insights rather than a formal process. In this work, we appeal to kernels over combinatorial structures, such as sequences and graphs, to derive appropriate neural operations. We introduce a class of deep recurrent neural operations and formally characterize their associated kernel spaces. Our recurrent modules compare the input to virtual reference objects (cf. filters in CNN) via the kernels. Similar to traditional neural operations, these reference objects are parameterized and directly optimized in end-to-end training. We empirically evaluate the proposed class of neural architectures on standard applications such as language modeling and molecular graph regression, achieving state-of-the-art results across these applications.
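To make the "compare the input to virtual reference objects via the kernels" idea concrete, here is a minimal sketch for the sequence case. It assumes one particular kernel, a decayed bigram string kernel, which this page does not spell out. The names `lam`, `w1`, `w2` and both function names are illustrative, not the authors' notation. The pair `(w1, w2)` plays the role of a learnable reference bigram, and the recurrence computes the same value as the explicit sum over all ordered position pairs, in a single left-to-right pass.

```python
def dot(u, v):
    """Inner product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def bigram_kernel_bruteforce(xs, w1, w2, lam):
    """Explicit kernel value between the sequence xs and the reference bigram.

    Sums <w1, x_i> * <w2, x_j> over all positions i < j (0-indexed),
    weighted by lam ** (T - 2 - i), so earlier first tokens are decayed more.
    """
    T = len(xs)
    total = 0.0
    for i in range(T):
        for j in range(i + 1, T):
            total += lam ** (T - 2 - i) * dot(w1, xs[i]) * dot(w2, xs[j])
    return total


def bigram_kernel_recurrent(xs, w1, w2, lam):
    """Same value, computed with a two-state recurrence in O(T) steps.

    c1 accumulates decayed matches against the first reference token;
    c2 accumulates decayed matches against the full reference bigram.
    """
    c1 = 0.0
    c2 = 0.0
    for x in xs:
        # c2 must use c1 from the previous position, so update it first.
        c2 = lam * c2 + c1 * dot(w2, x)
        c1 = lam * c1 + dot(w1, x)
    return c2


if __name__ == "__main__":
    # Toy input: three 2-dimensional token vectors (hypothetical data).
    xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    w1, w2 = [1.0, 0.0], [0.0, 1.0]
    print(bigram_kernel_bruteforce(xs, w1, w2, lam=0.5))
    print(bigram_kernel_recurrent(xs, w1, w2, lam=0.5))
```

In a trainable module one would presumably keep many such reference pairs as weight matrices, apply a nonlinearity, and stack layers. The sketch omits all of that, as well as the graph case.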

Taxonomy

Topics: Topic Modeling · Machine Learning in Materials Science · Natural Language Processing Techniques