Evaluating Front-end & Back-end of Human Automation Interaction Applications: Δ-EVAL, A Hypothetical Benchmark

Gonçalo Hora de Carvalho

arXiv:2407.18953·cs.HC·March 11, 2025

TL;DR

This paper proposes a comprehensive benchmark framework for evaluating human-automation interaction systems, focusing on both user interface and underlying processes, inspired by AI benchmarking techniques to ensure reliability and future-proofing.

Contribution

It introduces a structured set of metrics and tests for assessing HAI systems' efficacy, reliability, and design, unifying existing guidelines within a formal benchmarking approach.

Findings

01

Proposes a formal benchmark framework for HAI systems.

02

Integrates cognitive engineering principles into evaluation metrics.

03

Aims for reproducible, general, and insightful assessment methods.

Abstract

Human Factors, Cognitive Engineering, and Human-Automation Interaction (HAI) form a trifecta, where users and technological systems of ever increasing autonomous control occupy a centre position. But with great autonomy comes great responsibility. It is in this context that we propose metrics and a benchmark framework based on known regimes in Artificial Intelligence (AI). A benchmark is a set of tests and metrics or measurements conducted on those tests or tasks. We hypothesise about possible tasks designed to assess operator-system interactions and both the front-end and back-end components of HAI applications. Here, front-end pertains to the user interface and direct interactions the user has with a system, while the back-end is composed of the underlying processes and mechanisms that support the front-end experience. By evaluating HAI systems through the proposed metrics, based on…

Taxonomy

Topics: Human-Automation Interaction and Safety