Alibaba International E-commerce Product Search Competition DILAB Team Technical Report
Hyewon Lee, Junghyun Oh, Minkyung Song, Soyoung Park, Seunghoon Han

TL;DR
This paper describes the multilingual e-commerce search system developed by the DILAB team, which placed 5th in the competition with an overall score of 0.8819 using a multi-stage pipeline of data refinement, preprocessing, and adaptive modeling.
Contribution
The paper introduces a comprehensive multi-stage pipeline for multilingual e-commerce search, emphasizing data curation, preprocessing, and model tuning to improve robustness and performance.
Findings
Achieved 5th place with a score of 0.8819 in the competition.
Demonstrated robustness across multiple languages and domains.
Showcased the effectiveness of systematic data curation and iterative evaluation.
Abstract
This study presents the multilingual e-commerce search system developed by the DILAB team, which achieved 5th place on the final leaderboard with a competitive overall score of 0.8819, demonstrating stable and high-performing results across evaluation metrics. To address challenges in multilingual query-item understanding, we designed a multi-stage pipeline integrating data refinement, lightweight preprocessing, and adaptive modeling. The data refinement stage enhanced dataset consistency and category coverage, while language tagging and noise filtering improved input quality. In the modeling phase, multiple architectures and fine-tuning strategies were explored, and hyperparameters were optimized using curated validation sets to balance performance across the query-category (QC) and query-item (QI) tasks. The proposed framework exhibited robustness and adaptability across languages and domains,…
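The abstract mentions language tagging and noise filtering but does not say how they were implemented. The following is a minimal sketch of what such lightweight preprocessing could look like, not the team's actual code. All function names, the `[lang]` prefix format, and the `[SEP]` separator are assumptions made for illustration.

```python
import re
import unicodedata

# Hypothetical helpers: none of these names or formats come from the report.
_TAG_RE = re.compile(r"<[^>]+>")   # HTML-like markup left over in item titles
_WS_RE = re.compile(r"\s+")        # runs of whitespace


def clean_text(text: str) -> str:
    """Noise filtering: normalize Unicode, drop markup and control chars."""
    text = unicodedata.normalize("NFKC", text)
    text = _TAG_RE.sub(" ", text)
    # Replace control/format characters (Unicode categories starting with "C").
    text = "".join(
        " " if unicodedata.category(ch).startswith("C") else ch for ch in text
    )
    return _WS_RE.sub(" ", text).strip()


def tag_language(text: str, lang: str) -> str:
    """Language tagging: prefix the text with an assumed language marker."""
    return f"[{lang}] {text}"


def build_input(query: str, target: str, lang: str) -> str:
    """Build one model input for a query paired with a category or item."""
    return f"{tag_language(clean_text(query), lang)} [SEP] {clean_text(target)}"


if __name__ == "__main__":
    print(build_input("  zapatos <i>rojos</i> ", "Red running shoes", "es"))
```

In this sketch the same `build_input` function serves both tasks: `target` would be a category path for QC pairs and an item title for QI pairs.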
Taxonomy
Topics: Information Retrieval and Search Behavior · Text and Document Classification Technologies · Topic Modeling
