A categorical error sensitivity index (ISEC): A preventive ordinal decision-support measure for irrecoverable errors in manual data entry systems
Ricardo Raúl Palma, Mauro Anibal Benetti, Fabricio Orlando Sanchez Varretti

TL;DR
The paper presents ISEC, a novel ordinal index to assess and prevent irrecoverable categorical errors in manual data entry, improving data quality and decision-making in SMEs.
Contribution
Introduces ISEC, a comprehensive, scalable index combining semantic, morphological, and frequency data to identify vulnerable category pairs in SME data systems.
Findings
ISEC effectively ranks category pairs by susceptibility to confusion.
ISEC achieves a 195x performance improvement over brute-force approaches.
Validated across diverse datasets, demonstrating scalability and robustness.
Abstract
Data entry systems remain structurally vulnerable to categorical misclassifications, particularly in small and medium-sized enterprises (SMEs). When nominal categories exhibit semantic or morphological proximity, human-machine interaction may produce errors that are irrecoverable ex post. In the absence of automated input controls, manual data entry frequently generates irrecoverable categorical distortions that propagate into Key Performance Indicators (KPIs), thereby misleading managerial decision-making. State-of-the-art normalization tools typically evaluate semantic and morphological dimensions in isolation and rely heavily on standard dictionaries, rendering them ineffective for SME master data rich in custom SKUs, abbreviations, and domain-specific technical jargon. This paper introduces the Categorical Error Sensitivity Index (ISEC), an ordinal composite score designed to rank…
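To make the idea of an ordinal composite score concrete, the following Python sketch ranks category pairs by a weighted mix of three signals. It is an illustration only and not the authors' ISEC formula: the function names, the weights, the linear combination, the token-overlap stand-in for semantic proximity, and the use of difflib for morphological similarity are all assumptions made for this example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def morphological_similarity(a: str, b: str) -> float:
    """String-shape similarity in [0, 1] (stand-in measure, not from the paper)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_overlap(a: str, b: str) -> float:
    """Crude placeholder for semantic proximity: Jaccard overlap of word tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def rank_pairs(freq: dict[str, int], w_sem=0.4, w_morph=0.4, w_freq=0.2):
    """Return (category_a, category_b, score) tuples, most confusable first.

    The weights and the linear combination are illustrative assumptions.
    """
    total = sum(freq.values())
    scored = []
    for a, b in combinations(sorted(freq), 2):
        # Frequency term: share of all entries that involve either category.
        exposure = (freq[a] + freq[b]) / total
        score = (w_sem * token_overlap(a, b)
                 + w_morph * morphological_similarity(a, b)
                 + w_freq * exposure)
        scored.append((a, b, score))
    scored.sort(key=lambda t: t[2], reverse=True)
    return scored


if __name__ == "__main__":
    # Hypothetical usage counts for four master-data categories.
    counts = {"Bolt M8x40": 120, "Bolt M8x45": 95, "Washer 8mm": 60, "Paint red 1L": 10}
    for rank, (a, b, score) in enumerate(rank_pairs(counts), start=1):
        print(rank, a, "|", b, round(score, 3))
```

Only the resulting order is meant to be read, in keeping with the ordinal framing: pairs near the top would be the first candidates for preventive input controls. The sketch scores every pair exhaustively and does not attempt to reproduce the speed-up reported in the findings.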
