Automating the Enterprise with Foundation Models
Michael Wornow, Avanika Narayan, Krista Opsahl-Ong, Quinn McIntyre, Nigam H. Shah, Christopher Re

TL;DR
This paper introduces ECLAIR, a system leveraging multimodal foundation models like GPT-4 to automate enterprise workflows with minimal human supervision, addressing traditional RPA limitations such as high setup costs and unreliable execution.
Contribution
The paper presents ECLAIR, a novel system that uses multimodal foundation models for end-to-end enterprise workflow automation with minimal human input, demonstrating high understanding accuracy and quick setup.
Findings
93% accuracy on a workflow understanding task
40% end-to-end completion rate with natural language input
Addresses RPA limitations like setup time and reliability
Abstract
Automating enterprise workflows could unlock $4 trillion/year in productivity gains. Despite being of interest to the data management community for decades, the ultimate vision of end-to-end workflow automation has remained elusive. Current solutions rely on process mining and robotic process automation (RPA), in which a bot is hard-coded to follow a set of predefined rules for completing a workflow. Through case studies of a hospital and large B2B enterprise, we find that the adoption of RPA has been inhibited by high set-up costs (12-18 months), unreliable execution (60% initial accuracy), and burdensome maintenance (requiring multiple FTEs). Multimodal foundation models (FMs) such as GPT-4 offer a promising new approach for end-to-end workflow automation given their generalized reasoning and planning abilities. To study these capabilities we propose ECLAIR, a system to automate…
Taxonomy
Topics: Business Process Modeling and Analysis
Methods: Attention Is All You Need · Sparse Evolutionary Training · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer · Dropout · Label Smoothing · Residual Connection · Softmax · Absolute Position Encodings
