TakeLab Retriever: AI-Driven Search Engine for Articles from Croatian News Outlets
David Dukić, Marin Petričević, Sven Ćurković, Jan Šnajder

TL;DR
TakeLab Retriever is an AI-powered search engine that semantically analyzes Croatian news articles, enabling researchers to explore trends and patterns in Croatian online media with advanced NLP techniques.
Contribution
It introduces a microservice-based semantic search engine tailored for Croatian news articles, addressing scalability and software engineering challenges.
Findings
Handles over ten million articles from two decades
Utilizes advanced NLP for semantic analysis
Provides a platform for trend and pattern discovery
Abstract
TakeLab Retriever is an AI-driven search engine designed to discover, collect, and semantically analyze news articles from Croatian news outlets. It offers a unique perspective on the history and current landscape of Croatian online news media, making it an essential tool for researchers seeking to uncover trends, patterns, and correlations that general-purpose search engines cannot provide. TakeLab Retriever utilizes cutting-edge natural language processing (NLP) methods, enabling users to sift through articles using named entities, phrases, and topics through the web application. This technical report is divided into two parts: the first explains how TakeLab Retriever is utilized, while the second provides a detailed account of its design. In the second part, we also address the software engineering challenges involved and propose solutions for developing a microservice-based semantic…
Taxonomy
Topics: Semantic Web and Ontologies
