GeoSeq2Seq: Information Geometric Sequence-to-Sequence Networks
Alessandro Bay, Biswa Sengupta

TL;DR
GeoSeq2Seq introduces an information geometric approach to sequence-to-sequence networks by encoding latent embeddings as Fisher kernels, improving graph routing predictions over traditional methods.
Contribution
The paper bridges deep recurrent neural networks and information geometry by encoding latent embeddings as Fisher kernels, a novel formalism for Seq2Seq models.
Findings
Probabilistic embeddings outperform non-probabilistic ones by 10-15%.
The method effectively predicts shortest routes in graphs.
Fisher information geometry enhances sequence-to-sequence learning.
Abstract
The Fisher information metric is an important foundation of information geometry, where it allows us to approximate the local geometry of a probability distribution. Recurrent neural networks such as Sequence-to-Sequence (Seq2Seq) networks, which have lately yielded state-of-the-art performance on speech translation or image captioning, have so far ignored the geometry of the latent embedding that they iteratively learn. We propose the information geometric Seq2Seq (GeoSeq2Seq) network, which bridges the gap between deep recurrent neural networks and information geometry. Specifically, the latent embedding offered by a recurrent network is encoded as a Fisher kernel of a parametric Gaussian Mixture Model, a formalism common in computer vision. We utilise such a network to predict the shortest routes between two nodes of a graph by learning the adjacency matrix using the…
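To make the Fisher-kernel encoding mentioned in the abstract more concrete, the following is a minimal sketch of a Fisher vector computed from a diagonal-covariance Gaussian Mixture Model, in the form commonly used in computer vision. It is not taken from the paper: the function name `fisher_vector`, the restriction to gradients with respect to means and standard deviations, and the toy GMM parameters (which would normally be fitted by EM on the recurrent network's latent embeddings) are all assumptions made for illustration.

```python
import numpy as np


def fisher_vector(X, weights, means, sigmas):
    """Encode embeddings X (T x D) as a Fisher vector of a diagonal GMM.

    Hypothetical helper, not the authors' code.
    weights: (K,) mixing proportions; means, sigmas: (K, D).
    Returns a vector of length 2 * K * D.
    """
    T, D = X.shape
    # Whitened residuals (x_t - mu_k) / sigma_k, shape (T, K, D).
    diff = (X[:, None, :] - means[None, :, :]) / sigmas[None, :, :]

    # log( w_k * N(x_t | mu_k, diag(sigma_k^2)) ), shape (T, K).
    log_joint = (
        np.log(weights)[None, :]
        - 0.5 * np.sum(diff ** 2, axis=2)
        - np.sum(np.log(sigmas), axis=1)[None, :]
        - 0.5 * D * np.log(2.0 * np.pi)
    )
    # Posterior responsibilities via a numerically stable softmax over k.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    gamma = np.exp(log_joint)
    gamma /= gamma.sum(axis=1, keepdims=True)

    # Normalised gradients with respect to the means and standard deviations.
    g_mu = (gamma[:, :, None] * diff).sum(axis=0)
    g_mu /= T * np.sqrt(weights)[:, None]
    g_sigma = (gamma[:, :, None] * (diff ** 2 - 1.0)).sum(axis=0)
    g_sigma /= T * np.sqrt(2.0 * weights)[:, None]
    return np.concatenate([g_mu.ravel(), g_sigma.ravel()])


# Toy usage: 50 stand-in "latent embeddings" of dimension 4, 3 components.
# All parameters here are made up; a real GMM would be fitted to the data.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
K = 3
weights = np.full(K, 1.0 / K)
means = rng.normal(size=(K, 4))
sigmas = np.ones((K, 4))
fv = fisher_vector(X, weights, means, sigmas)
print(fv.shape)  # (24,)
```

The resulting fixed-length vector (2 × K × D entries) summarises how the set of embeddings deviates from the mixture model. In this sketch it vanishes when a single-component model already matches the sample mean and variance.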
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence
