TableVault: Managing Dynamic Data Collections for LLM-Augmented Workflows
Jinjin Zhao, Sanjay Krishnan

TL;DR
TableVault is a data management system for dynamic data collections in LLM-augmented workflows. It supports concurrent execution, reproducibility, data versioning, and composable workflow design.
Contribution
It introduces a platform that applies established database techniques to the requirements of LLM-driven workflows, managing both structured data and associated data artifacts.
Findings
Supports concurrent execution of data tasks
Ensures reproducibility and robust versioning
Facilitates transparent, composable workflows
Abstract
Large Language Models (LLMs) have emerged as powerful tools for automating and executing complex data tasks. However, their integration into larger data workflows introduces significant management challenges. In response, we present TableVault, a data management system designed to handle dynamic data collections in LLM-augmented environments. TableVault meets the demands of these workflows by supporting concurrent execution, ensuring reproducibility, maintaining robust data versioning, and enabling composable workflow design. By merging established database methodologies with emerging LLM-driven requirements, TableVault offers a transparent platform that efficiently manages both structured data and associated data artifacts.
