Action Recognition and State Change Prediction in a Recipe Understanding Task Using a Lightweight Neural Network Model

Qing Wan; Yoonsuck Choe

arXiv:2001.08665 · cs.CL · January 24, 2020 · 1 citation

TL;DR

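This paper introduces a lightweight neural network model that recognizes actions and predicts the resulting state changes in recipe instructions. It reports higher state change prediction accuracy than the more complex simulator-based model of Bosselut et al. (2018) while training on fewer samples.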

Contribution

A simplified neural network that separates action recognition from state change prediction and couples the two through a novel loss function, so that learning in each task indirectly influences the other.

Findings

01 Achieves 67% average accuracy in state change prediction, vs. 55% for Bosselut et al. (2018)

02 Requires fewer training samples (10K vs. 65K+)

03 Outperforms the earlier, more complex model in both accuracy and training efficiency

Abstract

Consider a natural language sentence describing a specific step in a food recipe. In such instructions, recognizing actions (such as press, bake, etc.) and the resulting changes in the state of the ingredients (shape molded, custard cooked, temperature hot, etc.) is a challenging task. One way to cope with this challenge is to explicitly model a simulator module that applies actions to entities and predicts the resulting outcome (Bosselut et al. 2018). However, such a model can be unnecessarily complex. In this paper, we propose a simplified neural network model that separates action recognition and state change prediction, while coupling the two through a novel loss function. This coupling allows learning in the two tasks to influence each other indirectly. Our model, although simpler, achieves higher state change prediction performance (67% average accuracy for ours vs. 55% in Bosselut et al. 2018) and takes…
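The abstract does not spell out the loss function, so the following is only a generic sketch of how two separate prediction heads can be coupled through a single training objective. It is not the authors' formulation: the function names, the `weight` parameter, and the choice of a weighted sum of two cross-entropy terms are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[target_index])


def coupled_loss(action_logits, action_target,
                 state_logits, state_target, weight=1.0):
    """Hypothetical joint objective: one scalar combining both tasks.

    ASSUMPTION: a weighted sum stands in for the paper's novel loss,
    which is not described in the text above. Because both terms feed
    one scalar, an optimizer minimizing it updates any parameters the
    two heads share using signals from both tasks.
    """
    action_loss = cross_entropy(action_logits, action_target)
    state_loss = cross_entropy(state_logits, state_target)
    return action_loss + weight * state_loss


# Example step "bake the custard": made-up scores over 3 actions
# and 2 state values (e.g. cooked / not cooked).
loss = coupled_loss([0.2, 2.1, -0.5], 1, [1.5, -0.3], 0, weight=0.5)
print(f"joint loss: {loss:.4f}")
```

In this sketch the two heads remain separate predictors, and the only link between them is the shared scalar objective, which is one simple way to read "coupling the two through a loss function".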

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications