HIDAgent: A Toolkit Enabling "Personal Agents" on HID-Compatible Devices
Jeffrey P. Bigham

TL;DR
HIDAgent is an affordable, open-source toolkit that enables AI-powered UI agents to control HID-compatible devices by emulating keyboard and mouse inputs, facilitating research into new interaction paradigms.
Contribution
The paper introduces HIDAgent, a low-cost hardware/software toolkit that allows UI agents to operate HID devices through emulation, expanding platform compatibility and research possibilities.
Findings
Successfully built five diverse prototypes across platforms
Supports research into new interaction scenarios
Uses three off-the-shelf components costing less than $30
Abstract
UI agents powered by increasingly performant AI promise to eventually use computers the way that people do: by visually interpreting UIs on screen and issuing appropriate actions to control them (e.g., mouse clicks and keyboard entry). While significant progress has been made on interpreting visual UIs computationally, and in sequencing together steps to complete tasks, controlling UIs is still done with system-specific APIs or VNC connections, which limits the platforms and use cases that can be explored. This paper introduces HIDAgent, an open-source hardware/software toolkit enabling UI agents to operate HID-compatible computing systems by emulating the physical keyboard and mouse. HIDAgent is built using three off-the-shelf components costing less than $30 and a Python library supporting flexible integration. We validated the HIDAgent toolkit by building five diverse use case…
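To make "emulating the physical keyboard and mouse" concrete, here is a minimal sketch of the input reports that a standard USB HID boot-protocol keyboard and mouse send to a host. It is not HIDAgent's actual API, which this summary does not describe: the function names (`key_usage`, `keyboard_report`, `type_text`, `mouse_report`) are hypothetical, a US keyboard layout is assumed, and only a small subset of characters is mapped. The sketch only builds the report bytes; delivering them to a target device would depend on the hardware used.

```python
import struct

MOD_LEFT_SHIFT = 0x02      # left-shift bit in the keyboard modifier byte
RELEASE = bytes(8)         # all-zero report: no keys pressed


def key_usage(ch):
    """Map a character to (modifier, HID usage ID). Assumes a US layout."""
    if "a" <= ch <= "z":
        return 0, 0x04 + ord(ch) - ord("a")
    if "A" <= ch <= "Z":
        return MOD_LEFT_SHIFT, 0x04 + ord(ch) - ord("A")
    if "1" <= ch <= "9":
        return 0, 0x1E + ord(ch) - ord("1")
    if ch == "0":
        return 0, 0x27
    if ch == "\n":
        return 0, 0x28     # Enter
    if ch == " ":
        return 0, 0x2C     # Space
    raise ValueError(f"character not covered by this sketch: {ch!r}")


def keyboard_report(modifier, usage_id):
    """8-byte boot keyboard report: modifiers, reserved byte, six key slots."""
    return bytes([modifier, 0, usage_id, 0, 0, 0, 0, 0])


def type_text(text):
    """Return press and release reports for each character, in order."""
    reports = []
    for ch in text:
        reports.append(keyboard_report(*key_usage(ch)))
        reports.append(RELEASE)
    return reports


def mouse_report(buttons, dx, dy):
    """3-byte boot mouse report: button bits, then signed relative X and Y."""
    def clamp(v):
        return max(-127, min(127, v))
    return struct.pack("<Bbb", buttons & 0x07, clamp(dx), clamp(dy))


if __name__ == "__main__":
    for report in type_text("Hi"):
        print(report.hex())
    print(mouse_report(1, 300, -5).hex())
```

Because the host sees only ordinary HID reports, an agent driving a device this way would need no software installed on the target, which is consistent with the abstract's point about avoiding system-specific APIs and VNC connections.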
Taxonomy
Topics: Social Robot Interaction and HRI · Interactive and Immersive Displays · Speech and dialogue systems
