Iterative NLP Query Refinement for Enhancing Domain-Specific Information Retrieval: A Case Study in Career Services
Elham Peimani (1), Gurpreet Singh (1), Nisarg Mahyavanshi (1), Aman Arora (1), Awais Shaikh (1) ((1) Humber College, Toronto, Canada)

TL;DR
This paper presents an iterative, semi-automated query refinement method tailored for improving document retrieval in niche domains, demonstrated through a case study in career services at Humber College, significantly boosting retrieval accuracy.
Contribution
It introduces a novel semi-automated query refinement approach combining domain-aware terms and keyword extraction to enhance retrieval in specialized fields.
Findings
Top similarity scores increased from 0.18 to 0.42
Iterative refinement significantly improves retrieval performance
Automated keyword extraction aids query expansion
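The keyword-extraction finding can be pictured with a small sketch. The summary above does not say which extractor the authors used, so this example assumes a plain TF-IDF ranking of terms within one page; the page texts, the stopword list, and the helper names `top_keywords` and `expand_query` are all hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the paper's).
STOPWORDS = {"and", "for", "the", "to", "of", "a", "in"}

def tokenize(text):
    """Lowercase, keep alphabetic runs, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def top_keywords(docs, doc_id, k=3):
    """Return the k highest-scoring TF-IDF terms of one document."""
    n = len(docs)
    df = Counter()
    for text in docs.values():
        df.update(set(tokenize(text)))
    tf = Counter(tokenize(docs[doc_id]))
    # Smoothed IDF; ties are broken alphabetically so the result is deterministic.
    scored = {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
    ranked = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:k]]

def expand_query(query, keywords):
    """Append extracted keywords that the query does not already contain."""
    seen = set(tokenize(query))
    extra = [k for k in keywords if k not in seen]
    return " ".join([query] + extra)

# Hypothetical stand-ins for career-services pages.
PAGES = {
    "interview-tools": (
        "Online resume and interview improvement tools. Resume templates, "
        "resume feedback, and interview practice tools for students."
    ),
    "advising": "career advising appointments for students",
    "events": "career fair events and employer sessions for students",
}

keywords = top_keywords(PAGES, "interview-tools", k=3)
expanded = expand_query("interview preparation", keywords)
print(keywords)
print(expanded)
```

Terms that are frequent on one page but rare across the others rise to the top, which is why they are plausible candidates for query expansion.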
Abstract
Retrieving semantically relevant documents in niche domains poses significant challenges for traditional TF-IDF-based systems, often resulting in low similarity scores and suboptimal retrieval performance. This paper addresses these challenges by introducing an iterative and semi-automated query refinement methodology tailored to Humber College's career services webpages. Initially, generic queries related to interview preparation yield low top-document similarities (approximately 0.2–0.3). To enhance retrieval effectiveness, we implement a two-fold approach: first, domain-aware query refinement by incorporating specialized terms such as resources-online-learning, student-online-services, and career-advising; second, the integration of structured educational descriptors like "online resume and interview improvement tools." Additionally, we automate the extraction of domain-specific…
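To make the refinement idea concrete, here is a minimal sketch of TF-IDF retrieval with cosine similarity, comparing a generic query against one that borrows a descriptor quoted in the abstract. It is not the authors' implementation: the three documents are invented stand-ins for career-services pages, the smoothed IDF formula is one common choice, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for career-services pages; not the paper's corpus.
DOCS = {
    "career-advising": "career advising appointments help students plan interview preparation and job search",
    "resources-online-learning": "online resume and interview improvement tools for students",
    "campus-parking": "parking permits and campus maps for visitors",
}

def tokenize(text):
    """Lowercase and keep alphabetic runs."""
    return re.findall(r"[a-z]+", text.lower())

def build_idf(docs):
    """Smoothed inverse document frequency over the corpus vocabulary."""
    n = len(docs)
    df = Counter()
    for text in docs.values():
        df.update(set(tokenize(text)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def vectorize(text, idf):
    """Sparse TF-IDF vector; terms outside the corpus vocabulary are ignored."""
    tf = Counter(t for t in tokenize(text) if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query, docs, idf):
    """Return (doc_id, similarity) pairs, best match first."""
    q = vectorize(query, idf)
    scores = [(name, cosine(q, vectorize(text, idf))) for name, text in docs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

IDF = build_idf(DOCS)
generic = rank("how to get ready for an interview", DOCS, IDF)
refined = rank("online resume and interview improvement tools", DOCS, IDF)
print("generic top:", generic[0])
print("refined top:", refined[0])
```

On this toy corpus the refined query scores higher against its top document than the generic one, because more of its terms overlap with the page's vocabulary. The absolute numbers are not comparable to the similarities reported in the paper.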
Taxonomy
Topics: Semantic Web and Ontologies · Web Data Mining and Analysis · Educational Technology and Assessment
