SearchGym: A Modular Infrastructure for Cross-Platform Benchmarking and Hybrid Search Orchestration
Jerome Tze-Hou Hsu

TL;DR
SearchGym introduces a modular, reproducible infrastructure for cross-platform benchmarking and hybrid search orchestration, addressing gaps between prototypes and production systems in retrieval-augmented generation.
Contribution
It presents a decoupled, hierarchical framework with a compositional algebra for flexible system synthesis and analyzes retrieval pipeline strategies for optimal performance.
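One way to picture composing a system from hierarchical configurations is a layered merge, where each layer describes one abstraction and later layers override earlier ones. The sketch below is illustrative only: `compose`, the config keys, and the values are hypothetical and are not taken from SearchGym's actual API.

```python
from copy import deepcopy


def compose(base: dict, override: dict) -> dict:
    """Right-biased deep merge (hypothetical helper, not SearchGym's API).

    Nested dicts are merged key by key; any other value in `override`
    replaces the value in `base`. Neither input is mutated.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = compose(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Assumed layers, one per abstraction named in the abstract.
dataset_cfg = {"dataset": {"name": "litsearch", "fields": ["title", "abstract"]}}
vectorset_cfg = {"vectorset": {"model": "example-embedder", "dim": 384}}
app_cfg = {"app": {"top_k": 100, "order": "filter_then_rank"}}

# An experiment-level override that changes a single nested value.
experiment = {"vectorset": {"dim": 768}}

system = compose(compose(compose(dataset_cfg, vectorset_cfg), app_cfg), experiment)
print(system["vectorset"])
```

Because the merged dictionary fully determines the system in this sketch, storing it alongside results is enough to rebuild the same setup later, which is the kind of reproducibility a configuration-driven design aims for.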
Findings
Achieves a 70% Top-100 retrieval rate on the LitSearch benchmark.
Demonstrates the importance of filter strength in retrieval pipeline sequencing.
Reveals a trade-off between generalizability and optimizability in system design.
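The dependence on filter strength noted above can be illustrated with a toy example. The sketch below shows one plausible mechanism, assuming a fixed top-k cutoff: ranking first and filtering afterwards can drop matching documents when the filter is selective, while the two orders agree when the filter is weak. The corpus, the `year` field, and all function names are hypothetical, and the paper may weigh other factors (such as cost) that this sketch ignores.

```python
from typing import Callable


def dot(a: list, b: list) -> float:
    """Plain dot product used as the similarity score."""
    return sum(x * y for x, y in zip(a, b))


def rank(docs: list, query: list, k: int) -> list:
    """Return the k documents most similar to the query."""
    return sorted(docs, key=lambda d: dot(d["vec"], query), reverse=True)[:k]


def filter_then_rank(docs: list, query: list, keep: Callable, k: int) -> list:
    """Apply the structured filter first, then rank the survivors."""
    return rank([d for d in docs if keep(d)], query, k)


def rank_then_filter(docs: list, query: list, keep: Callable, k: int) -> list:
    """Take the semantic top-k first, then drop non-matching documents."""
    return [d for d in rank(docs, query, k) if keep(d)]


def ids(docs: list) -> list:
    return [d["id"] for d in docs]


# Toy corpus: similarity to the query decreases with the id.
corpus = [
    {"id": 0, "vec": [0.9, 0.1], "year": 2019},
    {"id": 1, "vec": [0.8, 0.2], "year": 2020},
    {"id": 2, "vec": [0.7, 0.3], "year": 2024},
    {"id": 3, "vec": [0.6, 0.4], "year": 2018},
    {"id": 4, "vec": [0.5, 0.5], "year": 2024},
    {"id": 5, "vec": [0.4, 0.6], "year": 2017},
]
query = [1.0, 0.0]
k = 3


def strong(d: dict) -> bool:
    """Selective filter: few documents pass."""
    return d["year"] >= 2024


def weak(d: dict) -> bool:
    """Permissive filter: most documents pass."""
    return d["year"] >= 2018


strong_ftr = ids(filter_then_rank(corpus, query, strong, k))  # [2, 4]
strong_rtf = ids(rank_then_filter(corpus, query, strong, k))  # [2]
weak_ftr = ids(filter_then_rank(corpus, query, weak, k))      # [0, 1, 2]
weak_rtf = ids(rank_then_filter(corpus, query, weak, k))      # [0, 1, 2]
print(strong_ftr, strong_rtf, weak_ftr, weak_rtf)
```

With the selective filter, document 4 matches the filter but falls outside the semantic top 3, so the rank-first order never sees it. With the permissive filter, both orders return the same three documents.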
Abstract
The rapid growth of Retrieval-Augmented Generation (RAG) has created a proliferation of toolkits, yet a fundamental gap remains between experimental prototypes and robust, production-ready systems. We present SearchGym, a modular infrastructure designed for cross-platform benchmarking and hybrid search orchestration. Unlike existing model-centric frameworks, SearchGym decouples data representation, embedding strategies, and retrieval logic into stateful abstractions: Dataset, VectorSet, and App. This separation enables a Compositional Config Algebra, allowing designers to synthesize entire systems from hierarchical configurations while ensuring perfect reproducibility. Moreover, we analyze the "Top-k Cognizance" in hybrid retrieval pipelines, demonstrating that the optimal sequence of semantic ranking and structured filtering is highly dependent on filter strength. Evaluated on the…
Taxonomy
Topics: Information Retrieval and Search Behavior · Semantic Web and Ontologies · Advanced Graph Neural Networks
