Transparent NLP: Using RAG and LLM Alignment for Privacy Q&A
Anna Leschanowsky, Zahra Kolagar, Erion Çano, Ivan Habernal, Dara Hallinan, Emanuël A. P. Habets, Birgit Popp

TL;DR
This paper explores how Retrieval Augmented Generation systems with alignment techniques can improve privacy-related question answering to meet GDPR transparency requirements, highlighting their strengths and limitations.
Contribution
It introduces MultiRAIN, a multidimensional extension of Rewindable Auto-regressive Inference (RAIN), and evaluates how well both alignment modules enhance transparency and compliance in NLP-based privacy Q&A.
Findings
RAG systems with alignment outperform baseline models on most metrics.
None of the systems fully match human answer quality.
Complex metric interactions suggest a need for better evaluation methods.
Abstract
The transparency principle of the General Data Protection Regulation (GDPR) requires data processing information to be clear, precise, and accessible. While language models show promise in this context, their probabilistic nature complicates truthfulness and comprehensibility. This paper examines state-of-the-art Retrieval Augmented Generation (RAG) systems enhanced with alignment techniques to fulfill GDPR obligations. We evaluate RAG systems incorporating an alignment module like Rewindable Auto-regressive Inference (RAIN) and our proposed multidimensional extension, MultiRAIN, using a Privacy Q&A dataset. Responses are optimized for preciseness and comprehensibility and are assessed through 21 metrics, including deterministic and large language model-based evaluations. Our results show that RAG systems with an alignment module outperform baseline RAG systems on most metrics,…
Taxonomy
Topics: Expert finding and Q&A systems · Authorship Attribution and Profiling · Spam and Phishing Detection
