On Using Information Retrieval to Recommend Machine Learning Good Practices for Software Engineers
Laura Cabra-Acela, Anamaria Mojica-Hanke, Mario Linares-Vásquez, Steffen Herbold

TL;DR
This paper introduces Idaka, a tool that uses information retrieval and large language models to recommend machine learning best practices to software engineers, aiming to improve ML system performance and usability.
Contribution
The paper presents a novel recommender system combining IR and LLM approaches to help users find relevant ML practices from various sources.
Findings
Implemented Idaka with BM25 and Alpaca models
Provides a platform for comparative evaluation of retrieval methods
Available publicly for further research and development
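As a rough illustration of the BM25 side of the findings above, the sketch below ranks a handful of practice descriptions against a query using the standard Okapi BM25 formula. It is not Idaka's implementation: the function names, the whitespace tokenizer, the parameter defaults (k1 = 1.5, b = 0.75), the smoothed idf variant, and the example practice texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25.

    Returns (ranking, scores): document indices sorted by descending score,
    and the raw score of each document in its original position.
    """
    if not docs:
        return [], []

    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    query_terms = tokenize(query)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed idf variant that stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)

    ranking = sorted(range(n_docs), key=lambda i: scores[i], reverse=True)
    return ranking, scores


# Hypothetical practice descriptions, not taken from the paper's dataset.
practices = [
    "normalize features before training the model",
    "use a separate test set to evaluate the model",
    "track experiments and hyperparameters",
]
ranking, scores = bm25_rank("test set evaluate", practices)
print(practices[ranking[0]])
```

A lexical ranker like this only matches exact query terms, which is one reason a comparison against an LLM-based retriever, as the findings describe, can be informative.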
Abstract
Machine learning (ML) is nowadays widely used for different purposes and in several disciplines. From self-driving cars to automated medical diagnosis, machine learning models extensively support users' daily activities, and software engineering tasks are no exception. Not embracing good ML practices may lead to pitfalls that hinder the performance of an ML system and potentially lead to unexpected results. Despite the existence of documentation and literature about ML best practices, many non-ML experts turn towards gray literature like blogs and Q&A systems when looking for help and guidance when implementing ML systems. To better aid users in distilling relevant knowledge from such sources, we propose a recommender system that recommends ML practices based on the user's context. As a first step in creating a recommender system for machine learning practices, we implemented Idaka. A…
Taxonomy
Topics: Software Engineering Research · Machine Learning and Data Classification · Big Data and Business Intelligence
