Action Recognition and State Change Prediction in a Recipe Understanding Task Using a Lightweight Neural Network Model
Qing Wan, Yoonsuck Choe

TL;DR
This paper introduces a lightweight neural network model for recognizing actions and predicting state changes in recipe instructions, outperforming complex models in accuracy and training efficiency.
Contribution
A simplified neural network that separates action recognition and state change prediction, using a novel loss function to improve learning and performance.
Findings
Achieves 67% average accuracy in state change prediction, vs. 55% for Bosselut et al. (2018)
Requires fewer training samples (10K vs. 65K+)
Outperforms previous complex models in accuracy and efficiency
Abstract
Consider a natural language sentence describing a specific step in a food recipe. In such instructions, recognizing actions (such as press, bake, etc.) and the resulting changes in the state of the ingredients (shape molded, custard cooked, temperature hot, etc.) is a challenging task. One way to cope with this challenge is to explicitly model a simulator module that applies actions to entities and predicts the resulting outcome (Bosselut et al. 2018). However, such a model can be unnecessarily complex. In this paper, we propose a simplified neural network model that separates action recognition and state change prediction, while coupling the two through a novel loss function. This allows the learning of the two tasks to influence each other indirectly. Our model, although simpler, achieves higher state change prediction performance (67% average accuracy for ours vs. 55% in Bosselut et al. 2018) and takes…
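The idea of two separate prediction heads coupled only through the loss can be illustrated with a small sketch. This excerpt does not give the paper's actual loss function, so the product-style coupling term below, the `coupling` weight, the layer sizes, and all function names are assumptions chosen for illustration, not the authors' method.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    # Mean negative log-likelihood of the correct class.
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked + 1e-12).mean())

def coupled_loss(action_logits, action_labels,
                 state_logits, state_labels, coupling=0.5):
    # Hypothetical coupling (NOT the paper's loss): the product term makes
    # the gradient reaching each head scale with the other head's loss,
    # so the two tasks influence each other only through the objective.
    action_loss = cross_entropy(action_logits, action_labels)
    state_loss = cross_entropy(state_logits, state_labels)
    return action_loss + state_loss + coupling * action_loss * state_loss

# Toy forward pass: 4 encoded recipe steps, 5 action classes, 3 state classes.
# The encoding and the weights are random stand-ins for a trained model.
rng = np.random.default_rng(0)
sentence_vecs = rng.normal(size=(4, 16))
w_action = rng.normal(size=(16, 5)) * 0.1   # action recognition head
w_state = rng.normal(size=(16, 3)) * 0.1    # state change prediction head

action_logits = sentence_vecs @ w_action
state_logits = sentence_vecs @ w_state
action_labels = np.array([0, 1, 2, 3])
state_labels = np.array([0, 1, 2, 0])

loss = coupled_loss(action_logits, action_labels, state_logits, state_labels)
print(f"coupled loss: {loss:.4f}")
```

With `coupling=0` the objective reduces to a plain sum of the two cross-entropy losses, which is the usual multi-task baseline such a coupling would be compared against.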
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
