Challenges and Governance Solutions for Data Science Services based on Open Data and APIs
Juha-Pekka Joutsenlahti, Timo Lehtonen, Mikko Raatikainen, Elina Kettunen, Tommi Mikkonen

TL;DR
This paper discusses the challenges faced in developing data science services using open data and APIs, particularly in marine traffic, and proposes governance solutions to address these issues.
Contribution
It provides firsthand insights into practical challenges and suggests governance practices to improve open data and API utilization in data science applications.
Findings
Identified five key challenges: relevant data, historical data, licensing, runtime quality, API evolution.
Better governance practices for the provided open APIs and data could alleviate these challenges.
Enhanced data governance can enable more effective data science services.
Abstract
Increasingly common open data and open application programming interfaces (APIs), together with progress in data science such as artificial intelligence (AI) and especially machine learning (ML), create opportunities to build novel services by combining data from different sources. In this experience report, we describe our firsthand experiences with open data and APIs in the domain of marine traffic in Finland and Sweden, and identify technological opportunities for novel services. We enumerate five challenges that we have encountered in applying open data: relevant data, historical data, licensing, runtime quality, and API evolution. These challenges affect both the business model and the technical implementation. We discuss how these challenges could be alleviated by better governance practices for the provided open APIs and data.
