Making Software FAIR: A machine-assisted workflow for the research software lifecycle
Petr Knoth (CORE, Knowledge Media Institute, The Open University), Laurent Romary (Inria), Patrice Lopez (Science Miner), Roberto Di Cosmo (Inria), Pavel Smrz (Brno University of Technology), Tomasz Umerle (Polish Academy of Sciences)

TL;DR
This paper presents SoFAIR, a project developing a machine-assisted workflow to improve the discoverability, attribution, and FAIRness of research software by identifying, validating, and registering software assets with persistent identifiers.
Contribution
It introduces a novel workflow integrating machine learning and existing infrastructures to enhance the management and FAIR compliance of research software.
Findings
Development of ML-assisted identification of research software
Integration with existing repositories and tools
Enhanced registration and archival of software assets
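The three stages named above (identify, validate, register) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not SoFAIR's implementation: a regular expression over repository URLs stands in for the ML-based mention extractor, a set of confirmed URLs stands in for validation, and the `example-pid` prefix is a hypothetical placeholder, not a real identifier scheme.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: a repository-URL pattern stands in for the ML mention extractor.
REPO_URL = re.compile(r"https?://(?:github\.com|gitlab\.com)/[\w.-]+/[\w.-]+")


@dataclass
class SoftwareMention:
    url: str
    validated: bool = False
    pid: Optional[str] = None  # filled in by register()


def identify(text: str) -> list:
    """Collect candidate software mentions from manuscript text, in order, without duplicates."""
    seen = []
    for url in REPO_URL.findall(text):
        url = url.rstrip(".")  # drop sentence-final punctuation caught by the pattern
        if url not in seen:
            seen.append(url)
    return [SoftwareMention(u) for u in seen]


def validate(mentions: list, confirmed_urls: set) -> list:
    """Keep only mentions that were confirmed (hypothetical stand-in for a validation step)."""
    for m in mentions:
        m.validated = m.url in confirmed_urls
    return [m for m in mentions if m.validated]


def register(mentions: list, prefix: str = "example-pid") -> list:
    """Attach a placeholder identifier to each validated mention (not a real PID service)."""
    for i, m in enumerate(mentions, start=1):
        m.pid = f"{prefix}:{i:04d}"
    return mentions


text = (
    "We release our code at https://github.com/example/tool. "
    "We also used https://gitlab.com/other/lib and "
    "https://github.com/example/tool again."
)
candidates = identify(text)
confirmed = validate(candidates, {"https://github.com/example/tool"})
records = register(confirmed)
```

In a real deployment each stage would call out to an external service. The sketch only shows how candidate mentions narrow down to validated, identified records.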
Abstract
A key issue hindering discoverability, attribution and reusability of open research software is that its existence often remains hidden within the manuscript of research papers. For these resources to become first-class bibliographic records, they first need to be identified and subsequently registered with persistent identifiers (PIDs) to be made FAIR (Findable, Accessible, Interoperable and Reusable). To this day, much open research software fails to meet FAIR principles and software resources are mostly not explicitly linked from the manuscripts that introduced them or used them. SoFAIR is a 2-year international project (2024-2025) which proposes a solution to the above problem realised over the content available through the global network of open repositories. SoFAIR will extend the capabilities of widely used open scholarly infrastructures (CORE, Software Heritage, HAL) and tools…
Taxonomy
Topics: Scientific Computing and Data Management · Research Data Management Practices
