TL;DR
Fast-MIA is a Python library that significantly improves the efficiency and scalability of membership inference attacks on large language models by combining batch inference and shared intermediate result caching.
Contribution
It introduces a unified framework with high-throughput inference and cross-method caching, enabling large-scale, reproducible privacy auditing of LLMs.
Findings
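Achieves an approximately 5× inference speedup through high-throughput batch inference with vLLM.
Computes shared intermediate results, such as log-probabilities, once and reuses them across attack methods, which supports large-scale evaluation.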
Provides a flexible, reproducible framework for privacy risk assessment.
Abstract
We propose Fast-MIA (https://github.com/Nikkei/fast-mia), a Python library for efficiently evaluating membership inference attacks (MIA) against large language models (LLMs). MIA has emerged as a crucial technique for auditing privacy risks and copyright infringement in LLMs. However, computational demands have grown substantially: recent methods rely on repeated inference, while practical auditing requires large-scale evaluation. Progress is further hindered by existing implementations that execute methods independently, redundantly computing shared intermediate results such as log-probabilities. To address these challenges, Fast-MIA combines two strategies: (1) high-throughput batch inference via vLLM, achieving approximately 5× speedup, and (2) a cross-method caching architecture that computes intermediate results once and shares them across methods. The library includes…
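The cross-method caching idea in the abstract can be illustrated with a minimal sketch. It is not Fast-MIA's actual API: `LogProbCache`, `toy_scorer`, and the two scoring functions are hypothetical names introduced here, and `toy_scorer` is a stand-in for a real batched model call (for example, one made through vLLM). The two scores shown, mean log-probability and a Min-K%-style score, are common MIA signals chosen for illustration; the point is only that both read the same cached per-token log-probabilities, so the model is queried once.

```python
from typing import Callable, Dict, List

# Hypothetical type: maps a batch of texts to per-token log-probabilities.
BatchScorer = Callable[[List[str]], List[List[float]]]


class LogProbCache:
    """Scores each distinct text once and serves later requests from memory."""

    def __init__(self, scorer: BatchScorer) -> None:
        self._scorer = scorer
        self._cache: Dict[str, List[float]] = {}
        self.model_calls = 0  # number of batched scorer invocations

    def get_many(self, texts: List[str]) -> List[List[float]]:
        # Deduplicate while keeping order, then score only unseen texts in one batch.
        missing = [t for t in dict.fromkeys(texts) if t not in self._cache]
        if missing:
            self.model_calls += 1
            for text, logprobs in zip(missing, self._scorer(missing)):
                self._cache[text] = logprobs
        return [self._cache[t] for t in texts]


def mean_logprob_score(logprobs: List[float]) -> float:
    # Average token log-probability (assumes a non-empty sequence).
    return sum(logprobs) / len(logprobs)


def min_k_score(logprobs: List[float], k: float = 0.2) -> float:
    # Average of the lowest k fraction of token log-probabilities (at least one token).
    n = max(1, int(len(logprobs) * k))
    lowest = sorted(logprobs)[:n]
    return sum(lowest) / n


def toy_scorer(texts: List[str]) -> List[List[float]]:
    # Placeholder for a real model: fake log-probabilities from word lengths.
    return [[-0.1 * len(tok) for tok in text.split()] for text in texts]


cache = LogProbCache(toy_scorer)
texts = ["the cat sat", "a b"]

# Two methods request the same log-probabilities; only the first triggers scoring.
loss_scores = [mean_logprob_score(lp) for lp in cache.get_many(texts)]
min_k_scores = [min_k_score(lp) for lp in cache.get_many(texts)]
print(cache.model_calls)
```

Keying the cache on the raw text keeps the sketch short; a real implementation would presumably also key on the model and tokenizer so that results from different models are never mixed.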
