Thesis: Document Summarization with applications to Keyword extraction and Image Retrieval
Jayaprakash Sundararaj

TL;DR
This paper presents methods for document summarization with applications to keyword extraction, image retrieval, and opinion summarization, using probabilistic models, rank aggregation, and submodular functions to improve relevance and sentiment preservation.
Contribution
It introduces novel approaches combining probabilistic models, rank aggregation, and submodular functions for enhanced keyword extraction, image recommendation, and sentiment-aware summarization.
Findings
Proposed image retrieval method outperforms existing baselines.
Rank aggregation with relevance feedback improves image and text retrieval.
Submodular functions effectively balance sentiment and summarization quality.
Abstract
Automatic summarization is the process of reducing a text document in order to generate a summary that retains the most important points of the original document. In this work, we study two problems: (i) summarizing a text document as a set of keywords or a caption for image recommendation, and (ii) generating an opinion summary that offers a good mix of relevance and sentiment with respect to the text document. Initially, we present our work on recommending images to enhance a substantial amount of existing plain-text news articles. We use probabilistic models and word similarity heuristics to generate captions and extract key-phrases, which are re-ranked using a rank aggregation framework with a relevance feedback mechanism. We show that such rank aggregation and relevance feedback, which are typically used in document tagging and text information retrieval, also help improve image retrieval. These queries are fed…
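The abstract does not say which rank aggregation scheme the thesis uses. As a rough illustration of the general idea only, the sketch below merges two candidate key-phrase rankings with a plain Borda count. The function name `borda_aggregate`, the two input lists, and the choice of Borda count are all assumptions made for this example, and the relevance feedback step is not modeled.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Merge several ranked lists into one using a simple Borda count.

    Hypothetical helper for illustration; not the thesis's method.
    """
    # Collect every candidate that appears in at least one ranking.
    candidates = set()
    for ranking in rankings:
        candidates.update(ranking)
    n = len(candidates)

    # An item at position p (0-based) in a ranking earns n - p points.
    # Items absent from a ranking earn nothing from that ranking.
    scores = defaultdict(float)
    for ranking in rankings:
        for position, item in enumerate(ranking):
            scores[item] += n - position

    # Highest total first; ties are broken alphabetically.
    return sorted(candidates, key=lambda c: (-scores[c], c))


# Made-up key-phrase rankings from two hypothetical scorers.
prob_model = ["flood", "river", "rescue", "city"]
similarity = ["rescue", "flood", "boat", "river"]

aggregated = borda_aggregate([prob_model, similarity])
print(aggregated)
```

A phrase ranked highly by both scorers ends up near the top, while a phrase seen by only one scorer is pushed down. A feedback-aware variant could, for instance, reweight each ranking before summing, but that design is not described in the abstract.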
Taxonomy
Topics: Advanced Text Analysis Techniques · Biomedical Text Mining and Ontologies · Web Data Mining and Analysis
Methods: Sparse Evolutionary Training
