Transparent NLP: Using RAG and LLM Alignment for Privacy Q&A

Anna Leschanowsky; Zahra Kolagar; Erion Çano; Ivan Habernal; Dara Hallinan; Emanuël A. P. Habets; Birgit Popp

arXiv:2502.06652 · cs.CL · February 11, 2025

Open Access

TL;DR

This paper explores how Retrieval Augmented Generation systems with alignment techniques can improve privacy-related question answering to meet GDPR transparency requirements, highlighting their strengths and limitations.

Contribution

The paper introduces MultiRAIN, a multidimensional extension of RAIN, and evaluates how effectively both alignment modules improve transparency and compliance in NLP-based privacy Q&A.

Findings

01

RAG systems with alignment outperform baseline models on most metrics.

02

None of the systems fully match human answer quality.

03

Complex interactions among the metrics suggest a need for better evaluation methods.

Abstract

The transparency principle of the General Data Protection Regulation (GDPR) requires data processing information to be clear, precise, and accessible. While language models show promise in this context, their probabilistic nature complicates truthfulness and comprehensibility. This paper examines state-of-the-art Retrieval Augmented Generation (RAG) systems enhanced with alignment techniques to fulfill GDPR obligations. We evaluate RAG systems incorporating an alignment module like Rewindable Auto-regressive Inference (RAIN) and our proposed multidimensional extension, MultiRAIN, using a Privacy Q&A dataset. Responses are optimized for preciseness and comprehensibility and are assessed through 21 metrics, including deterministic and large language model-based evaluations. Our results show that RAG systems with an alignment module outperform baseline RAG systems on most metrics,…

Taxonomy

Topics: Expert finding and Q&A systems · Authorship Attribution and Profiling · Spam and Phishing Detection