Agentic DraCor and the Art of Docstring Engineering: Evaluating MCP-empowered LLM Usage of the DraCor API
Peer Trilcke, Ingo Börner, Henny Sluyter-Gäthje, Daniil Skorinkin, Frank Fischer, Carsten Milling

TL;DR
This paper presents the development and evaluation of a Model Context Protocol (MCP) server for DraCor that lets LLMs interact autonomously with the DraCor API, and it argues that docstring engineering is central to effective tool use in digital humanities research.
Contribution
It introduces an MCP server for DraCor and shows how reflexive docstring engineering improves LLM-tool interaction and reliability in digital humanities applications.
Findings
Docstring engineering improves tool correctness
Systematic prompt observation reveals LLM behavior
Reliable LLM tool use is achievable with proper documentation
Abstract
This paper reports on the implementation and evaluation of a Model Context Protocol (MCP) server for DraCor, enabling Large Language Models (LLMs) to autonomously interact with the DraCor API. We conducted experiments focusing on tool selection and application by the LLM. Our approach is qualitative: we systematically observe prompts to understand how LLMs behave when using MCP tools, and we evaluate "Tool Correctness", "Tool-Calling Efficiency", and "Tool-Use Reliability". Our findings highlight the importance of "Docstring Engineering", defined as reflexively crafting tool documentation to optimize LLM-tool interaction. Our experiments demonstrate both the promise of agentic AI for research in Computational Literary Studies and the infrastructure development still needed to make Digital Humanities infrastructures reliable.
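To make the idea of "Docstring Engineering" concrete, the following is a minimal sketch and not the authors' server. It uses only the Python standard library as a stand-in for an MCP SDK: a hypothetical `tool` decorator registers a function, and `describe_tools` returns the docstring as the description an LLM would read when choosing a tool. The function name `get_play_metadata_url`, the registry, the base URL, the path layout, and the example identifiers are all assumptions made for illustration.

```python
import inspect
from typing import Callable, Dict, List

# Hypothetical in-memory registry standing in for an MCP server's tool list.
TOOLS: Dict[str, Callable] = {}

# Assumed base URL of the DraCor API; check the official API docs before use.
DRACOR_API = "https://dracor.org/api/v1"


def tool(func: Callable) -> Callable:
    """Register a function as a tool, keyed by its name."""
    TOOLS[func.__name__] = func
    return func


@tool
def get_play_metadata_url(corpus_name: str, play_name: str) -> str:
    """Build the URL for the metadata of a single play.

    Use this tool only when both identifiers are already known.
    corpus_name: short corpus identifier such as "ger", not a full corpus title.
    play_name: play slug such as "lessing-emilia-galotti", not the play's title.
    If the slug is unknown, list the plays of the corpus first instead of guessing.
    """
    # Assumed path layout; no network request is made in this sketch.
    return f"{DRACOR_API}/corpora/{corpus_name}/plays/{play_name}"


def describe_tools() -> List[dict]:
    """Return what a model would see: tool name, docstring, parameter names."""
    return [
        {
            "name": name,
            "description": inspect.getdoc(fn),
            "parameters": list(inspect.signature(fn).parameters),
        }
        for name, fn in TOOLS.items()
    ]


if __name__ == "__main__":
    for entry in describe_tools():
        print(entry["name"], entry["parameters"])
        print(entry["description"])
```

The docstring here does more than describe the return value: it states when to call the tool, what format each argument takes, and what to do when an argument is unknown. Revising such guidance in response to observed model behavior is the kind of reflexive crafting the abstract describes.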
