$\texttt{InfoHier}$: Hierarchical Information Extraction via Encoding and Embedding
Tianru Zhang, Li Ju, Prashant Singh, Salman Toor

TL;DR
$\texttt{InfoHier}$ introduces a novel framework combining self-supervised learning and hierarchical clustering to learn hierarchical data representations, improving analysis of complex datasets.
Contribution
It proposes a joint learning approach that integrates SSL with HC, enhancing the capture of multi-level data relationships.
Findings
Improves hierarchical structure learning from unlabelled data.
Enhances representation quality for complex, high-dimensional datasets.
Facilitates better data analysis and retrieval tasks.
Abstract
Analyzing large-scale datasets, especially involving complex and high-dimensional data like images, is particularly challenging. While self-supervised learning (SSL) has proven effective for learning representations from unlabelled data, it typically focuses on flat, non-hierarchical structures, missing the multi-level relationships present in many real-world datasets. Hierarchical clustering (HC) can uncover these relationships by organizing data into a tree-like structure, but it often relies on rigid similarity metrics that struggle to capture the complexity of diverse data types. To address these limitations, we envision $\texttt{InfoHier}$, a framework that combines SSL with HC to jointly learn robust latent representations and hierarchical structures. This approach leverages SSL to provide adaptive representations, enhancing HC's ability to capture complex patterns. Simultaneously, it…
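As a rough illustration of the HC half of this idea, the Python sketch below builds a tree over a handful of fixed embedding vectors using average-linkage agglomerative clustering with cosine distance. It is not the joint training the abstract describes: the toy `embeddings` list stands in for the output of an SSL encoder, and the function names, the linkage rule, and the distance are all assumptions chosen for illustration.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def agglomerative_tree(embeddings):
    """Average-linkage agglomerative clustering (hypothetical helper).

    Returns a nested tuple whose leaves are row indices into `embeddings`.
    """
    # Each cluster is a pair: (subtree, list of member indices).
    clusters = [(i, [i]) for i in range(len(embeddings))]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            members_a, members_b = clusters[a][1], clusters[b][1]
            # Average pairwise distance between the two clusters.
            dist = sum(
                cosine_distance(embeddings[i], embeddings[j])
                for i in members_a
                for j in members_b
            ) / (len(members_a) * len(members_b))
            if best is None or dist < best[0]:
                best = (dist, a, b)
        _, a, b = best
        merged = (
            (clusters[a][0], clusters[b][0]),
            clusters[a][1] + clusters[b][1],
        )
        # Drop the two merged clusters and append their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0][0]


# Toy 2-D vectors standing in for learned representations (assumed data).
embeddings = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.2, 0.8],
]

tree = agglomerative_tree(embeddings)
print(tree)  # ((0, 1), (2, 3))
```

In a joint setup such as the one the abstract envisions, the embeddings would themselves be updated during training, so the tree and the representation would influence each other. In this sketch the embeddings are fixed and only the tree is computed.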
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
