Canonical Correlation Inference for Mapping Abstract Scenes to Text
Nikos Papasarantopoulos, Helen Jiang, Shay B. Cohen

TL;DR
This paper introduces a structured prediction method using canonical correlation analysis to map abstract visual scenes to descriptive text, demonstrating its effectiveness on a language-vision task.
Contribution
The paper presents a novel application of canonical correlation analysis to structured prediction for language-vision mapping tasks.
Findings
Effective projection of input and output into a shared space
Improved accuracy in scene-to-text mapping
Demonstrated on abstract scene description task
Abstract
We describe a technique for structured prediction, based on canonical correlation analysis. Our learning algorithm finds two projections for the input and the output spaces that aim at projecting a given input and its correct output into points close to each other. We demonstrate our technique on a language-vision problem, namely the problem of giving a textual description to an "abstract scene".
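As a rough illustration of the idea in the abstract, the following is a minimal sketch of generic canonical correlation analysis in NumPy: it learns one projection for inputs and one for outputs, then picks, from a given candidate set, the output whose projection lies closest to the projected input. It is not the authors' implementation. The function names (fit_cca, predict), the regularization constant, the Euclidean nearest-candidate decoding step, and the synthetic data are all assumptions made for illustration; the paper's actual features and inference procedure are not described in this summary.

```python
import numpy as np

def fit_cca(X, Y, k, reg=1e-3):
    """Fit a k-dimensional CCA (hypothetical helper, not from the paper).

    X: (n, dx) input features, Y: (n, dy) output features.
    Returns the two means, the two projection matrices, and the
    top-k canonical correlations.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularized covariance and cross-covariance estimates.
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n

    def inv_sqrt(C):
        # Inverse square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** -0.5) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # SVD of the whitened cross-covariance gives the canonical directions.
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    A = Wx @ U[:, :k]      # projection for inputs
    B = Wy @ Vt.T[:, :k]   # projection for outputs
    return mx, my, A, B, s[:k]

def predict(x, candidates, mx, my, A, B):
    """Return the index of the candidate output closest to x in the shared space."""
    u = (x - mx) @ A
    V = (candidates - my) @ B
    return int(np.argmin(np.linalg.norm(V - u, axis=1)))

# Synthetic demo (assumed data): outputs are a noisy linear function of inputs.
rng = np.random.default_rng(0)
M = rng.normal(size=(5, 4))
X_train = rng.normal(size=(200, 5))
Y_train = X_train @ M + 0.01 * rng.normal(size=(200, 4))
X_test = rng.normal(size=(20, 5))
Y_test = X_test @ M + 0.01 * rng.normal(size=(20, 4))

mx, my, A, B, corrs = fit_cca(X_train, Y_train, k=4)
hits = [predict(X_test[i], Y_test, mx, my, A, B) == i for i in range(len(X_test))]
accuracy = float(np.mean(hits))
print("top canonical correlation:", corrs[0])
print("retrieval accuracy on held-out pairs:", accuracy)
```

In this sketch, decoding is restricted to a fixed candidate list. A structured output space such as text is far too large to enumerate, so a real system would need a search or generation procedure in place of the argmin over candidates.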
Taxonomy
Topics: Natural Language Processing Techniques · Neural Networks and Applications
