# Cross-task weakly supervised learning from instructional videos

**Authors:** Dimitri Zhukov, Jean-Baptiste Alayrac, Ramazan Gokberk Cinbis, David Fouhey, Ivan Laptev, Josef Sivic

arXiv: 1903.08225 · 2019-04-30

## TL;DR

This paper presents a weakly supervised learning framework for recognizing task steps in instructional videos. It shares components across tasks to improve performance and generalization, and introduces a new dataset, CrossTask, for evaluating that sharing.

## Contribution

It introduces a component-based model and a weakly supervised learning method for step recognition, along with a new dataset for cross-task sharing analysis.

## Key findings

- Sharing components improves step recognition accuracy.
- Component models can generalize to unseen tasks.
- Weak supervision with narration and step lists is effective.
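
To make the compositionality idea concrete, here is a minimal sketch of a component-based step classifier. It assumes a step name splits into word components and that a step's linear classifier is the average of its components' weight vectors; the function names and the toy weights in `COMPONENT_WEIGHTS` are hypothetical illustrations, not the authors' code or learned values.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def step_components(step: str) -> List[str]:
    """Split a step name such as 'pour egg' into its component words."""
    return step.lower().split()


def step_weights(step: str, component_weights: Dict[str, Vector]) -> Vector:
    """Compose a step classifier by averaging its component weight vectors.

    Assumption: averaging is one simple way to combine components;
    the paper's exact formulation may differ.
    """
    parts = step_components(step)
    missing = [p for p in parts if p not in component_weights]
    if missing:
        raise KeyError(f"no weights learned for components: {missing}")
    dim = len(component_weights[parts[0]])
    return [
        sum(component_weights[p][i] for p in parts) / len(parts)
        for i in range(dim)
    ]


def step_score(
    step: str, component_weights: Dict[str, Vector], features: Sequence[float]
) -> float:
    """Linear score of one video segment's features for the given step."""
    weights = step_weights(step, component_weights)
    return sum(w * x for w, x in zip(weights, features))


# Hypothetical toy weights, imagined as learned from tasks that contain
# 'pour egg' and 'add milk' but never 'pour milk'.
COMPONENT_WEIGHTS: Dict[str, Vector] = {
    "pour": [1.0, 0.0, 0.0],
    "egg": [0.0, 1.0, 0.0],
    "milk": [0.0, 0.0, 1.0],
}

# A classifier for the unseen step 'pour milk' is composed from parts
# that were each learned elsewhere.
unseen_step_weights = step_weights("pour milk", COMPONENT_WEIGHTS)
```

Because `pour` and `milk` each have weights learned from other steps, a classifier for `pour milk` can be assembled without any training example of that step, which is the kind of transfer the second finding describes.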

## Abstract

In this paper we investigate learning visual models for the steps of ordinary tasks using weak supervision via instructional narrations and an ordered list of steps instead of strong supervision via temporal annotations. At the heart of our approach is the observation that weakly supervised learning may be easier if a model shares components while learning different steps: "pour egg" should be trained jointly with other tasks involving "pour" and "egg". We formalize this in a component model for recognizing steps and a weakly supervised learning framework that can learn this model under temporal constraints from narration and the list of steps. Past data does not permit systematic studying of sharing and so we also gather a new dataset, CrossTask, aimed at assessing cross-task sharing. Our experiments demonstrate that sharing across tasks improves performance, especially when done at the component level and that our component model can parse previously unseen tasks by virtue of its compositionality.
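
One way to picture the temporal constraint from an ordered step list is as a constrained assignment problem. The sketch below assumes each step occurs exactly once and that steps appear in the listed order, then uses dynamic programming to pick one frame per step that maximizes the total classifier score. The function name `assign_steps_in_order` and the toy scores are hypothetical, and the paper's actual optimization may differ.

```python
from typing import List, Sequence


def assign_steps_in_order(scores: Sequence[Sequence[float]]) -> List[int]:
    """Pick one frame per step, in strictly increasing order of time.

    scores[t][k] is the score of step k at frame t. Returns the frame
    index chosen for each step so that the summed score is maximal.
    Assumption: every step happens exactly once, in list order.
    """
    num_frames = len(scores)
    num_steps = len(scores[0]) if num_frames else 0
    if num_steps == 0:
        return []
    if num_steps > num_frames:
        raise ValueError("need at least one frame per step")

    neg_inf = float("-inf")
    # best[k][t]: best total for steps 0..k placed within frames 0..t.
    best = [[neg_inf] * num_frames for _ in range(num_steps)]
    # take[k][t]: True if the optimum for (k, t) places step k at frame t.
    take = [[False] * num_frames for _ in range(num_steps)]

    for k in range(num_steps):
        for t in range(k, num_frames):
            previous = 0.0 if k == 0 else best[k - 1][t - 1]
            use = previous + scores[t][k]          # step k at frame t
            skip = best[k][t - 1] if t > k else neg_inf  # step k earlier
            if use > skip:
                best[k][t] = use
                take[k][t] = True
            else:
                best[k][t] = skip

    # Backtrack from the last frame to recover the chosen frames.
    frames = [0] * num_steps
    t = num_frames - 1
    for k in range(num_steps - 1, -1, -1):
        while not take[k][t]:
            t -= 1
        frames[k] = t
        t -= 1
    return frames


# Hypothetical scores for 4 frames and 2 ordered steps. Taken alone,
# step 1 scores highest at frame 0, before step 0's best frame.
toy_scores = [
    [0.1, 0.9],
    [0.8, 0.2],
    [0.3, 0.4],
    [0.2, 0.7],
]
ordered_frames = assign_steps_in_order(toy_scores)
```

In the toy example, picking each step's best frame independently would place the second step before the first; the ordering constraint instead yields frames 1 and 3, which illustrates how a step list can correct noisy per-frame scores without temporal annotations.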

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.08225/full.md

## Figures

17 figures with captions in the complete paper: https://tomesphere.com/paper/1903.08225/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/1903.08225/full.md

---
Source: https://tomesphere.com/paper/1903.08225