What Kinds of Contracts Do ML APIs Need?
Samantha Syeda Khairunnesa, Shibbir Ahmed, Sayem Mohammad Imtiaz, Hridesh Rajan, Gary T. Leavens

TL;DR
This paper investigates which kinds of contracts ML APIs need by analyzing Stack Overflow posts. It identifies common contract violations and suggests how contract mining can improve early error detection in ML pipelines.
Contribution
It provides an empirical analysis of ML API contract violations, categorizes necessary contracts, and highlights the potential of contract mining to enhance API usability and error detection.
Findings
Most needed contracts check single argument constraints or call order
Existing contract mining approaches can be adapted for ML APIs
Combining behavioral and temporal contract mining is beneficial
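The two contract kinds named in the findings can be illustrated with a small sketch. It is not the paper's tooling and does not use any real ML library: `ToyModel`, `requires`, `must_follow`, `records_call`, and `ContractViolation` are hypothetical names introduced only for illustration. The `requires` decorator checks a constraint on a single argument (a behavioral contract), and `must_follow` checks call order (a temporal contract).

```python
import functools
import inspect


class ContractViolation(Exception):
    """Raised when one of the illustrative contracts is broken."""


def requires(arg_name, predicate, message):
    """Behavioral contract: check a single named argument before the call."""
    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            if not predicate(bound.arguments[arg_name]):
                raise ContractViolation(f"{func.__name__}: {message}")
            return func(*args, **kwargs)
        return wrapper
    return decorator


def records_call(method):
    """Remember on the instance that this method completed successfully."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        result = method(self, *args, **kwargs)
        self.__dict__.setdefault("_called", set()).add(method.__name__)
        return result
    return wrapper


def must_follow(earlier):
    """Temporal contract: `earlier` must have run on this object first."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            if earlier not in getattr(self, "_called", set()):
                raise ContractViolation(
                    f"{method.__name__} called before {earlier}"
                )
            return method(self, *args, **kwargs)
        return wrapper
    return decorator


class ToyModel:
    """Hypothetical stand-in for an ML estimator; it predicts the mean label."""

    @requires(
        "learning_rate",
        lambda lr: isinstance(lr, float) and 0.0 < lr <= 1.0,
        "learning_rate must be a float in (0, 1]",
    )
    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate

    @records_call
    def fit(self, xs, ys):
        self.mean_ = sum(ys) / len(ys)
        return self

    @must_follow("fit")
    def predict(self, xs):
        return [self.mean_ for _ in xs]
```

With these checks, `ToyModel(learning_rate=5.0)` fails at construction and `ToyModel().predict([1])` fails at the call site, so both mistakes surface before any later pipeline stage runs.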
Abstract
Recent work has shown that Machine Learning (ML) programs are error-prone and called for contracts for ML code. Contracts, as in the design by contract methodology, help document APIs and aid API users in writing correct code. The question is: what kinds of contracts would provide the most help to API users? We are especially interested in what kinds of contracts help API users catch errors at earlier stages in the ML pipeline. We describe an empirical study of posts on Stack Overflow of the four most often-discussed ML libraries: TensorFlow, Scikit-learn, Keras, and PyTorch. For these libraries, our study extracted 413 informal (English) API specifications. We used these specifications to understand the following questions. What are the root causes and effects behind ML contract violations? Are there common patterns of ML contract violations? When does understanding ML contracts…
Taxonomy
Topics: Software Engineering Research · Imbalanced Data Classification Techniques · Advanced Malware Detection Techniques
