Fine-Grained Traceability for Transparent ML Pipelines

Liping Chen; Mujie Liu; Haytham Fayek

arXiv:2601.14971 · cs.LG · January 22, 2026

Open Access

TL;DR

FG-Trac is a novel, model-agnostic framework that provides verifiable, fine-grained traceability of individual data samples throughout machine learning pipelines, enhancing transparency and accountability.

Contribution

FG-Trac introduces a mechanism for sample-level traceability that integrates cryptographic commitments and contribution scoring without altering existing models.

Findings

01. Preserves predictive performance while enabling traceability

02. Provides verifiable evidence of sample usage and propagation

03. Works with diverse ML pipeline architectures

Abstract

Modern machine learning systems are increasingly realised as multistage pipelines, yet existing transparency mechanisms typically operate at a model level: they describe what a system is and why it behaves as it does, but not how individual data samples are operationally recorded, tracked, and verified as they traverse the pipeline. This absence of verifiable, sample-level traceability leaves practitioners and users unable to determine whether a specific sample was used, when it was processed, or whether the corresponding records remain intact over time. We introduce FG-Trac, a model-agnostic framework that establishes verifiable, fine-grained sample-level traceability throughout machine learning pipelines. FG-Trac defines an explicit mechanism for capturing and verifying sample lifecycle events across preprocessing and training, computes contribution scores explicitly grounded in…
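To make the idea of verifiable sample lifecycle records concrete, the following is a minimal illustrative sketch and not FG-Trac's actual mechanism, which the abstract excerpt above does not specify. It assumes a salted SHA-256 hash as the commitment to a sample and an append-only, hash-chained log as the record of lifecycle events. The names `commit_sample`, `TraceLog`, `was_used`, and the stage labels are hypothetical and chosen only for illustration.

```python
import hashlib
import json


def commit_sample(sample_bytes: bytes, salt: bytes) -> str:
    """Assumed commitment: SHA-256 over a salt followed by the raw sample bytes."""
    return hashlib.sha256(salt + sample_bytes).hexdigest()


class TraceLog:
    """Hypothetical append-only log; each entry is chained to the previous one."""

    GENESIS = "0" * 64  # placeholder hash that precedes the first entry

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(record: dict) -> str:
        # Hash every field except the entry's own hash, in a stable key order.
        payload = {k: v for k, v in record.items() if k != "entry_hash"}
        encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def append(self, sample_id: str, stage: str, commitment: str, timestamp: float) -> dict:
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else self.GENESIS
        record = {
            "sample_id": sample_id,
            "stage": stage,
            "commitment": commitment,
            "timestamp": timestamp,
            "prev_hash": prev_hash,
        }
        record["entry_hash"] = self._digest(record)
        self.entries.append(record)
        return record

    def verify(self) -> bool:
        """Return True only if every entry is unmodified and correctly chained."""
        prev_hash = self.GENESIS
        for record in self.entries:
            if record["prev_hash"] != prev_hash:
                return False
            if record["entry_hash"] != self._digest(record):
                return False
            prev_hash = record["entry_hash"]
        return True

    def was_used(self, sample_bytes: bytes, salt: bytes, stage: str) -> bool:
        """Check whether a matching commitment was logged at the given stage."""
        commitment = commit_sample(sample_bytes, salt)
        return any(
            e["commitment"] == commitment and e["stage"] == stage
            for e in self.entries
        )


if __name__ == "__main__":
    log = TraceLog()
    salt = b"demo-salt"  # illustrative only; a real salt would be random
    sample = b"example sample"
    c = commit_sample(sample, salt)
    log.append("sample-1", "preprocessing", c, 1.0)
    log.append("sample-1", "training", c, 2.0)
    print(log.verify(), log.was_used(sample, salt, "training"))
```

In this sketch, the three questions raised in the abstract map onto simple checks: `was_used` answers whether a specific sample was recorded at a stage, the `timestamp` field records when it was processed, and `verify` detects whether any stored record has been altered after the fact, since changing one entry breaks its own hash and the chain that follows.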

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Machine Learning and Data Classification · Ethics and Social Impacts of AI