Intent Lenses: Inferring Capture-Time Intent to Transform Opportunistic Photo Captures into Structured Visual Notes

Ashwin Ram; Aeneas Leon Sommer; Martin Schmitz; Jürgen Steimle

arXiv:2604.09438 · cs.HC · April 13, 2026

TL;DR

This paper presents Intent Lenses, an approach that uses large language models to infer a user's capture-time intent from photos and turn opportunistic captures into structured visual notes that support sensemaking.

Contribution

The paper introduces Intent Lenses, a conceptual primitive for intent-mediated note generation in which large language models dynamically generate reusable, interactive objects from captured information.

Findings

1. Intent Lenses produce notes aligned with user expectations.
2. The system facilitates exploration and deeper understanding of photo captures.
3. A user study shows improved sensemaking with intent-mediated notes.

Abstract

Opportunistic photo capture (e.g., slides, exhibits, or artifacts) is a common strategy for preserving information encountered in information-rich environments for later revisitation. While fast and minimally disruptive, such photo collections rarely become meaningful notes. Existing automatic note-generation approaches provide some support but often produce generic summaries that fail to reflect what users intended to capture. We introduce Intent Lenses, a conceptual primitive for intent-mediated note generation and sensemaking. Intent Lenses reify users' capture-time intent inferred from captured information into reusable interactive objects that encode the function to perform, the information sources to focus on, and how results are represented at an appropriate level of detail. These lenses are dynamically generated using the reasoning capabilities of large language models. To…
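
The three things the abstract says a lens encodes (the function to perform, the information sources to focus on, and the level of detail of the result) can be pictured as a small record. The following Python sketch is illustrative only and is not the authors' implementation: the names `IntentLens`, `apply_lens`, the field names, and the "brief"/"full" detail levels are all assumptions, and the lens here is written by hand rather than generated by a large language model.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class IntentLens:
    """Hypothetical record for one inferred capture-time intent."""
    function: str             # what to do, e.g. "summarize" (assumed label)
    sources: Tuple[str, ...]  # ids of the captures to focus on
    detail: str               # assumed levels: "brief" or "full"


def apply_lens(lens: IntentLens, captures: Dict[str, str]) -> str:
    """Render a plain-text note from the text of the selected captures.

    `captures` maps a capture id to text already extracted from the photo;
    the extraction step itself is out of scope for this sketch.
    """
    selected = [captures[s] for s in lens.sources if s in captures]
    if lens.detail == "brief":
        # Keep only the first sentence of each selected capture.
        selected = [t.split(". ")[0].rstrip(".") + "." for t in selected]
    return "\n".join([f"[{lens.function}]"] + [f"- {t}" for t in selected])


captures = {
    "p1": "Slide on transformers. It covers attention.",
    "p2": "Exhibit label.",
}
lens = IntentLens(function="summarize", sources=("p1",), detail="brief")
note = apply_lens(lens, captures)
```

Because the lens is a plain value, the same lens can be reapplied to a different set of captures, which is one way to read the abstract's description of lenses as reusable objects.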
