HYCEDIS: HYbrid Confidence Engine for Deep Document Intelligence System

Bao-Sinh Nguyen; Quang-Bach Tran; Tuan-Anh Nguyen Dang; Duc Nguyen; Hung Le

arXiv:2206.02628·cs.IR·October 11, 2022

TL;DR

This paper introduces HYCEDIS, a novel hybrid confidence estimation system for deep document information extraction, combining conformal prediction and anomaly detection to provide reliable confidence scores without modifying existing models.

Contribution

The paper presents a new architecture that accurately estimates confidence in deep learning-based document information extraction without altering the original models.

Findings

01. Outperforms existing confidence estimators significantly

02. Demonstrates strong generalization to out-of-distribution data

03. Effective on real-world scanned document datasets

Abstract

Measuring the confidence of AI models is critical for safely deploying AI in real-world industrial systems. One important application of confidence measurement is information extraction from scanned documents. However, no existing solution provides reliable confidence scores for current state-of-the-art deep-learning-based information extractors. In this paper, we propose a complete and novel architecture to measure the confidence of current deep learning models in the document information extraction task. Our architecture consists of a Multi-modal Conformal Predictor and a Variational Cluster-oriented Anomaly Detector, trained to faithfully estimate confidence in the host model's outputs without requiring any modification of the host model. We evaluate our architecture on real-world datasets, not only outperforming competing confidence estimators by a huge margin but also demonstrating generalization…
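
To give a feel for the conformal prediction idea the abstract names, here is a minimal sketch of a generic inductive conformal p-value used as a confidence score. It is not the paper's Multi-modal Conformal Predictor: the nonconformity function (one minus the host model's probability), the function names, and the calibration numbers are all illustrative assumptions. What it does share with the approach described above is that the host model is treated as a black box and only its outputs are used.

```python
from bisect import bisect_left
from typing import Sequence


def nonconformity(prob: float) -> float:
    """Hypothetical nonconformity score: higher means a stranger prediction."""
    return 1.0 - prob


def conformal_p_value(calibration_scores: Sequence[float], test_score: float) -> float:
    """Smoothed fraction of calibration scores at least as large as the test score."""
    sorted_scores = sorted(calibration_scores)
    n = len(sorted_scores)
    # Count calibration scores >= test_score.
    at_least = n - bisect_left(sorted_scores, test_score)
    return (at_least + 1) / (n + 1)


# Made-up probabilities a host model assigned to the true label
# on held-out calibration fields (assumption, not data from the paper).
calibration_probs = [0.99, 0.95, 0.90, 0.80, 0.60]
calibration_scores = [nonconformity(p) for p in calibration_probs]

# Confidence for two new predictions, given the host model's probability
# for its predicted label.
confident = conformal_p_value(calibration_scores, nonconformity(0.97))
doubtful = conformal_p_value(calibration_scores, nonconformity(0.30))
print(confident, doubtful)
```

A prediction that looks typical relative to the calibration set receives a p-value near 1, while one that looks unusual receives a p-value near 1/(n + 1), so the value can be thresholded to decide which extracted fields to send for human review.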

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Machine Learning and Data Classification · Topic Modeling