From Transcripts to AI Agents: Knowledge Extraction, RAG Integration, and Robust Evaluation of Conversational AI Assistants

Krittin Pachtrachai; Petmongkon Pornpichitsuwan; Wachiravit Modecrua; Touchapon Kraisingkorn

arXiv:2602.15859 · cs.CL · February 19, 2026

TL;DR

This paper introduces an end-to-end framework for building and evaluating conversational AI assistants from call transcripts, emphasizing data quality, knowledge extraction, modular prompt design, and robust evaluation in real-world domains.

Contribution

It presents a novel pipeline integrating transcript filtering, LLM-based knowledge extraction, modular prompt tuning, and a transcript-grounded simulator for comprehensive evaluation.

Findings

01. Assistant handles ~30% of calls autonomously.
02. Achieves near-perfect factual accuracy.
03. Demonstrates robustness against adversarial prompts.

Abstract

Building reliable conversational AI assistants for customer-facing industries remains challenging due to noisy conversational data, fragmented knowledge, and the requirement for accurate human hand-off, particularly in domains that depend heavily on real-time information. This paper presents an end-to-end framework for constructing and evaluating a conversational AI assistant directly from historical call transcripts. Incoming transcripts are first graded using a simplified adaptation of the PIPA framework, focusing on observation alignment and appropriate response behavior, and are filtered to retain only high-quality interactions exhibiting coherent flow and effective human agent responses. Structured knowledge is then extracted from curated transcripts using large language models (LLMs) and deployed as the sole grounding source in a Retrieval-Augmented Generation (RAG) pipeline.…
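
The grade-then-filter step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the score fields (`observation_alignment`, `response_appropriateness`), the 0 to 1 scale, the 0.8 threshold, and the sample call IDs are all assumptions introduced here for illustration, since the grading rubric itself is not given in this summary.

```python
from dataclasses import dataclass


@dataclass
class GradedTranscript:
    """A transcript with two hypothetical quality scores in [0, 1]."""
    transcript_id: str
    observation_alignment: float      # assumed name and scale
    response_appropriateness: float   # assumed name and scale


def filter_transcripts(graded, threshold=0.8):
    """Keep only transcripts whose two scores both meet the threshold.

    The threshold value is an illustrative assumption, not a figure
    taken from the paper.
    """
    return [
        t for t in graded
        if t.observation_alignment >= threshold
        and t.response_appropriateness >= threshold
    ]


# Made-up example data, for demonstration only.
graded = [
    GradedTranscript("call-001", 0.95, 0.90),
    GradedTranscript("call-002", 0.60, 0.92),
    GradedTranscript("call-003", 0.85, 0.40),
]
kept = filter_transcripts(graded)
print([t.transcript_id for t in kept])
```

Requiring both scores to pass, rather than averaging them, means a transcript with a strong agent response but poor observation alignment is still excluded; whether the paper combines its criteria this way is not stated here.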

Taxonomy

Topics: Topic Modeling · Adversarial Robustness in Machine Learning · AI in Service Interactions