OmniQuery: Contextually Augmenting Captured Multimodal Memory to Enable Personal Question Answering

Jiahao Nick Li; Zhuohao Jerry Zhang; Jiaju Ma

arXiv:2409.08250 · cs.HC · February 24, 2025 · 2 cites

TL;DR

OmniQuery is a system that enhances personal memory retrieval by integrating scattered contextual information from multiple memories, enabling it to answer complex questions with 71.5% accuracy, surpassing traditional retrieval methods.

Contribution

The paper introduces OmniQuery, a novel system that augments captured memories with contextual information to answer complex personal questions, leveraging large language models for inference.

Findings

01. Achieved 71.5% accuracy in answering complex personal questions.
02. Outperformed conventional RAG systems in human evaluations.
03. Effectively integrated scattered contextual memories for improved retrieval.

Abstract

People often capture memories through photos, screenshots, and videos. While existing AI-based tools enable querying this data using natural language, they only support retrieving individual pieces of information like certain objects in photos, and struggle with answering more complex queries that involve interpreting interconnected memories like sequential events. We conducted a one-month diary study to collect realistic user queries and generated a taxonomy of necessary contextual information for integrating with captured memories. We then introduce OmniQuery, a novel system that is able to answer complex personal memory-related questions that require extracting and inferring contextual information. OmniQuery augments individual captured memories through integrating scattered contextual information from multiple interconnected memories. Given a question, OmniQuery retrieves relevant…
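
The augment-then-retrieve idea in the abstract can be illustrated with a toy sketch. This is not OmniQuery's implementation: the `Memory` schema, the 12-hour time window, the stopword list, and the keyword-overlap matching are all assumptions made for illustration, and the language-model inference step mentioned in the paper is left out entirely. The sketch only shows why context borrowed from neighboring memories can make a memory retrievable when its own caption would not match the question.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Hypothetical stopword list; any real system would use something richer.
STOPWORDS = {"a", "at", "of", "the", "in", "for", "what", "did", "i", "during", "my"}


@dataclass
class Memory:
    """A captured memory reduced to a timestamp and a caption (assumed schema)."""
    taken_at: datetime
    caption: str
    context: list[str] = field(default_factory=list)


def tokens(text):
    """Lowercased words with surrounding punctuation and stopwords removed."""
    return {w.strip(".,?!").lower() for w in text.split()} - STOPWORDS - {""}


def augment(memories, window=timedelta(hours=12)):
    """Attach to each memory the captions of other memories captured within `window`."""
    for m in memories:
        m.context = [
            other.caption
            for other in memories
            if other is not m and abs(other.taken_at - m.taken_at) <= window
        ]
    return memories


def retrieve(question, memories):
    """Return memories whose caption or attached context shares a word with the question."""
    q = tokens(question)
    scored = [(len(q & tokens(" ".join([m.caption, *m.context]))), m) for m in memories]
    hits = [(s, m) for s, m in scored if s > 0]
    # sorted() is stable, so ties keep their original order.
    return [m for s, m in sorted(hits, key=lambda pair: pair[0], reverse=True)]


memories = [
    Memory(datetime(2024, 5, 1, 9, 0), "Screenshot of conference badge for CHI 2024 in Honolulu"),
    Memory(datetime(2024, 5, 1, 19, 0), "Photo of a poke bowl at dinner"),
    Memory(datetime(2024, 7, 10, 12, 0), "Photo of a sandwich at the office"),
]

before = retrieve("What did I eat during CHI?", memories)
after = retrieve("What did I eat during CHI?", augment(memories))
```

In this toy data, the dinner photo never mentions the conference in its own caption, so it can only match the question once the badge screenshot taken the same day has been attached as context. The unrelated sandwich photo from July stays outside the window and is not pulled in.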

Taxonomy

Topics: Speech and dialogue systems · Context-Aware Activity Recognition Systems · Personal Information Management and User Behavior

Methods: Attention Is All You Need · Byte Pair Encoding · Softmax · Layer Normalization · Dropout · WordPiece · Residual Connection · Attention Dropout · Linear Layer