Document Understanding, Measurement, and Manipulation Using Category Theory
Jared Claypoole, Yunye Gong, Noson S. Yanofsky, Ajay Divakaran

TL;DR
This paper introduces a category-theoretic framework for understanding multimodal documents, enabling new methods for information measurement, summarization, extension, and self-supervised model improvement.
Contribution
The paper represents a document as a category of question-answer pairs, introduces an orthogonalization procedure that divides the information in one or more documents into non-overlapping pieces, and uses the question-answer pairs for a rate-distortion analysis of summarization.
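To give a rough intuition for the orthogonalization idea, the sketch below splits overlapping pieces of information into disjoint ones. It is not the paper's procedure: it assumes, purely for illustration, that each question-answer pair can be reduced to a set of atomic facts, and the names `orthogonalize`, `qa_facts`, `Q1`, and `Q2` are hypothetical.

```python
from collections import defaultdict


def orthogonalize(qa_facts):
    """Split overlapping fact sets into non-overlapping pieces.

    ASSUMPTION (not from the paper): each question-answer pair is modeled
    as a plain set of atomic facts. The result maps each group of QA-pair
    labels to the facts covered by exactly that group.
    """
    # Record which QA pairs cover each fact.
    owners = defaultdict(set)
    for label, facts in qa_facts.items():
        for fact in facts:
            owners[fact].add(label)

    # Group facts by the exact set of QA pairs that cover them.
    pieces = defaultdict(set)
    for fact, labels in owners.items():
        pieces[frozenset(labels)].add(fact)
    return dict(pieces)


# Hypothetical toy input: two QA pairs that share the fact "b".
qa_facts = {
    "Q1": {"a", "b"},
    "Q2": {"b", "c"},
}
pieces = orthogonalize(qa_facts)
for labels, facts in pieces.items():
    print(sorted(labels), sorted(facts))
```

Under this toy model every fact lands in exactly one piece, so counting facts per piece gives a simple, non-double-counted tally of the information shared between or unique to the QA pairs.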
Findings
Effective multimodal document structure extraction
Novel summarization and extension techniques
Self-supervised model enhancement using RLVR
Abstract
We apply category theory to extract multimodal document structure, which leads us to develop information-theoretic measures, content summarization and extension, and self-supervised improvement of large pretrained models. We first develop a mathematical representation of a document as a category of question-answer pairs. Second, we develop an orthogonalization procedure to divide the information contained in one or more documents into non-overlapping pieces. The structures extracted in the first and second steps lead us to develop methods to measure and enumerate the information contained in a document. We also build on those steps to develop new summarization techniques, as well as a solution to a new problem, viz. exegesis, resulting in an extension of the original document. Our question-answer pair methodology enables a novel rate distortion analysis of summarization…
