From If-Statements to ML Pipelines: Revisiting Bias in Code-Generation

Minh Duc Bui; Xenia Heilmann; Mattia Cerrato; Manuel Mager; Katharina von der Wense

arXiv:2604.21716 · cs.CL · April 24, 2026

TL;DR

Evaluating code-generation bias with simple conditional statements underestimates it. When large language models generate full machine learning pipelines, sensitive attributes appear in 87.7% of cases on average, compared with 59.2% for conditional statements.

Contribution

The paper shows that conditional-statement benchmarks understate code-generation bias. It does so by evaluating a more realistic task, ML pipeline generation, where bias surfaces during feature selection.

Findings

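01. Generated ML pipelines include sensitive attributes in 87.7% of cases on average, even though the models demonstrably exclude irrelevant features.

02. Bias is more prevalent in pipelines than in simple conditional statements, where sensitive attributes appear in only 59.2% of cases.

03. Results are consistent across mitigation strategies and pipeline complexities.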

Abstract

Prior work evaluates code generation bias primarily through simple conditional statements, which represent only a narrow slice of real-world programming and reveal solely overt, explicitly encoded bias. We demonstrate that this approach dramatically underestimates bias in practice by examining a more realistic task: generating machine learning (ML) pipelines. Testing both code-specialized and general-instruction large language models, we find that generated pipelines exhibit significant bias during feature selection. Sensitive attributes appear in 87.7% of cases on average, despite models demonstrably excluding irrelevant features (e.g., including "race" while dropping "favorite color" for credit scoring). This bias is substantially more prevalent than that captured by conditional statements, where sensitive attributes appear in only 59.2% of cases. These findings are robust across…
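To make the prevalence figures above concrete, here is a minimal sketch of how such a rate could be computed over a set of generated snippets. It is not the authors' evaluation code. The sensitive-attribute list, the convention that each snippet assigns a list literal to a variable named `features`, and the two example snippets are all assumptions made for illustration.

```python
import ast

# Hypothetical attribute list; the paper's actual list is not given on this page.
SENSITIVE_ATTRIBUTES = {"race", "gender", "religion"}


def selected_features(code: str) -> set[str]:
    """Collect string literals assigned to a variable named `features`.

    Assumes (for illustration only) that generated pipelines declare their
    inputs as `features = [...]`; real generated code varies far more.
    """
    found: set[str] = set()
    for node in ast.walk(ast.parse(code)):
        if not isinstance(node, ast.Assign):
            continue
        targets_features = any(
            isinstance(t, ast.Name) and t.id == "features" for t in node.targets
        )
        if targets_features and isinstance(node.value, (ast.List, ast.Tuple)):
            for elt in node.value.elts:
                if isinstance(elt, ast.Constant) and isinstance(elt.value, str):
                    found.add(elt.value.lower())
    return found


def sensitive_rate(snippets: list[str]) -> float:
    """Fraction of snippets whose selected features include a sensitive attribute."""
    if not snippets:
        return 0.0
    flagged = sum(
        1 for code in snippets if selected_features(code) & SENSITIVE_ATTRIBUTES
    )
    return flagged / len(snippets)


# Two made-up snippets: the first keeps "race", the second does not.
EXAMPLES = [
    'features = ["income", "race", "credit_history"]',
    'features = ["income", "credit_history"]',
]
```

Parsing the assignment rather than searching for the string "race" anywhere in the code avoids counting a snippet that explicitly drops the column. A real evaluation would need to handle many more ways of selecting features.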
