Results Merging in the Patent Domain

Vasileios Stamatis; Michail Salampasis

arXiv:2203.00350·cs.IR·March 2, 2022

TL;DR

This study evaluates six machine learning models for results merging in patent document retrieval on the CLEF-IP 2011 test collection. Random forest produces the best results and fits the data better than linear and polynomial models.

Contribution

The paper compares multiple ML models under two results merging methods, multiple models (MM) and global model (GM), for patent retrieval, and reports that random forest is more effective than linear and polynomial models.

Findings

01 Random forest achieves the best results among the tested models.

02 Random forest fits the data better than linear and polynomial models.

03 The ranking of document scores is not linearly explainable.

Abstract

In this paper, we test machine learning methods for results merging in patent document retrieval. Specifically, we examine random forest, decision tree, support vector regression (SVR), linear regression, polynomial regression, and deep neural networks (DNNs). We use two different methods for results merging: the multiple models (MM) method and the global model (GM) method. Furthermore, we examine whether the ranking of the documents' scores is linearly explainable. The CLEF-IP 2011 standard test collection was used in our experiments. Random forest produces the best results in comparison to all other models, and it fits the data better than the linear and polynomial approaches.
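The difference between the two merging methods can be illustrated with a minimal sketch. It assumes, going only by the method names since the abstract does not spell out the details, that MM trains one regression model per source to map source-specific scores onto a common scale, while GM trains a single model on the pooled data from all sources. A plain least-squares line stands in for the regressors the paper evaluates (random forest, SVR, and so on) to keep the example dependency-free. All function names, source names, and scores are hypothetical and are not taken from the paper.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # all x equal: fall back to a constant model
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def fit_multiple_models(training):
    """MM (assumed): one model per source.

    training maps source -> [(source_score, common_scale_score), ...].
    """
    return {src: fit_line(*zip(*pairs)) for src, pairs in training.items()}


def fit_global_model(training):
    """GM (assumed): a single model fitted on the pooled pairs of all sources."""
    pooled = [pair for pairs in training.values() for pair in pairs]
    return fit_line(*zip(*pooled))


def merge(results, model_for):
    """Map every source score to the common scale and return one ranked list.

    results maps source -> [(doc_id, source_score), ...];
    model_for(source) returns the (a, b) model to apply to that source.
    """
    merged = []
    for src, docs in results.items():
        a, b = model_for(src)
        merged.extend((doc_id, a * score + b) for doc_id, score in docs)
    return sorted(merged, key=lambda item: item[1], reverse=True)


# Hypothetical training pairs: two sources that score on very different scales.
training = {
    "A": [(1.0, 0.6), (2.0, 0.8), (3.0, 1.0)],
    "B": [(10.0, 0.1), (20.0, 0.2), (30.0, 0.3)],
}
# Hypothetical per-source result lists to be merged.
results = {
    "A": [("a1", 3.0), ("a2", 1.0)],
    "B": [("b1", 20.0), ("b2", 10.0)],
}

mm_models = fit_multiple_models(training)
gm_model = fit_global_model(training)

mm_ranking = merge(results, lambda src: mm_models[src])
gm_ranking = merge(results, lambda src: gm_model)

print("MM:", [doc_id for doc_id, _ in mm_ranking])
print("GM:", [doc_id for doc_id, _ in gm_ranking])
```

In this toy data the per-source models recover each source's own score scale, whereas the pooled linear model has to compromise across two incompatible scales. That is one intuition for why a non-linear regressor such as random forest might fit merged score data better than a line, although the sketch itself does not demonstrate the paper's result.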
