FLACON: An Information-Theoretic Approach to Flag-Aware Contextual Clustering for Large-Scale Document Organization
Sungwook Yoon

TL;DR
FLACON is a new clustering method that organizes enterprise documents by combining content similarity with contextual flags like priority and workflow status.
Contribution
FLACON introduces a novel information-theoretic clustering approach using a six-dimensional flag system for context-aware document organization.
Findings
FLACON achieves a 7.8-fold improvement in clustering quality compared to traditional methods.
It reaches 89% of GPT-4's clustering quality while running 7× faster on 10,000 documents.
The method is deterministic and handles incremental updates in O(m log n) time.
Abstract
Enterprise document management faces a significant challenge: traditional clustering methods focus solely on content similarity while ignoring organizational context, such as priority, workflow status, and temporal relevance. This paper introduces FLACON (Flag-Aware Context-sensitive Clustering), an information-theoretic approach that captures multi-dimensional document context through a six-dimensional flag system encompassing Type, Domain, Priority, Status, Relationship, and Temporal dimensions. FLACON formalizes document clustering as an entropy minimization problem, where the objective is to group documents with similar contextual characteristics. The approach combines a composite distance function—integrating semantic content, contextual flags, and temporal factors—with adaptive hierarchical clustering and efficient incremental updates. This design addresses key limitations of…
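The abstract names a composite distance (content, flags, time) and an entropy objective over contextual flags but does not give their formulas. The sketch below is one plausible reading, not the paper's definition. The weights, the mismatch-fraction flag distance, the exponential temporal term, the 30-day scale, and every identifier (`Doc`, `composite_distance`, `flag_entropy`) are assumptions chosen for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass

# The six flag dimensions named in the abstract.
FLAG_DIMS = ("type", "domain", "priority", "status", "relationship", "temporal")


@dataclass
class Doc:
    embedding: tuple   # content vector (hypothetical representation)
    flags: dict        # one categorical value per entry in FLAG_DIMS
    timestamp: float   # document age reference, in days (assumed unit)


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 1.0
    return 1.0 - dot / (nu * nv)


def flag_distance(f, g):
    """Fraction of flag dimensions on which two documents disagree (assumed form)."""
    return sum(f[d] != g[d] for d in FLAG_DIMS) / len(FLAG_DIMS)


def temporal_distance(t1, t2, scale_days=30.0):
    """Saturating distance in [0, 1); scale_days is an illustrative constant."""
    return 1.0 - math.exp(-abs(t1 - t2) / scale_days)


def composite_distance(a, b, w_content=0.5, w_flags=0.3, w_time=0.2):
    """Weighted sum of the three components; the weights are assumptions."""
    return (w_content * cosine_distance(a.embedding, b.embedding)
            + w_flags * flag_distance(a.flags, b.flags)
            + w_time * temporal_distance(a.timestamp, b.timestamp))


def flag_entropy(cluster):
    """Sum of Shannon entropies (bits) of each flag dimension within a cluster."""
    n = len(cluster)
    total = 0.0
    for d in FLAG_DIMS:
        counts = Counter(doc.flags[d] for doc in cluster)
        total += -sum((c / n) * math.log2(c / n) for c in counts.values())
    return total


base_flags = {d: "x" for d in FLAG_DIMS}
doc_a = Doc((1.0, 0.0), dict(base_flags), 0.0)
doc_b = Doc((1.0, 0.0), dict(base_flags, priority="high"), 0.0)
```

Under these assumptions, `doc_a` and `doc_b` differ only in their priority flag, so their composite distance comes entirely from the flag term, and a cluster holding both has nonzero flag entropy. An entropy-minimizing clustering in the abstract's sense would prefer groupings in which such flag disagreement within a cluster is low.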
Taxonomy
Topics: Business Process Modeling and Analysis · Advanced Clustering Algorithms Research · Text and Document Classification Technologies
