LIDA: Lightweight Interactive Dialogue Annotator
Edward Collins, Nikolai Rozanov, Bingbing Zhang

TL;DR
LIDA is an open-source, comprehensive dialogue annotation tool that streamlines the entire process from raw text to structured data, supporting machine learning integration and disagreement resolution.
Contribution
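According to the authors, LIDA is, as far as they know, the first dialogue annotation system to handle the full pipeline from raw text to structured conversation data, with the aim of improving the speed and quality of dialogue data annotation.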
Findings
Supports raw text to structured data conversion
Integrates machine learning models for annotation recommendations
Provides tools for resolving inter-annotator disagreements
Abstract
Dialogue systems have the potential to change how people interact with machines but are highly dependent on the quality of the data used to train them. It is therefore important to develop good dialogue annotation tools which can improve the speed and quality of dialogue data annotation. With this in mind, we introduce LIDA, an annotation tool designed specifically for conversation data. As far as we know, LIDA is the first dialogue annotation system that handles the entire dialogue annotation pipeline from raw text, as may be the output of transcription services, to structured conversation data. Furthermore, it supports the integration of arbitrary machine learning models as annotation recommenders and also has a dedicated interface to resolve inter-annotator disagreements, for example after crowdsourcing annotations for a dataset. LIDA is fully open source, documented and publicly available…
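To make the inter-annotator disagreement step more concrete, here is a minimal sketch of how conflicting turn-level labels could be detected before a human resolves them. It is not LIDA's code or API: the function name `find_disagreements`, the annotator names, the label set, and the majority-vote suggestion are all assumptions made for illustration.

```python
from collections import Counter


def find_disagreements(annotations):
    """Return the turns on which annotators assigned different labels.

    `annotations` maps a (hypothetical) annotator name to a list of labels,
    one label per dialogue turn. This layout is an assumption for the sketch,
    not LIDA's actual data format.
    """
    if not annotations:
        return []

    lengths = {len(labels) for labels in annotations.values()}
    if len(lengths) != 1:
        raise ValueError("all annotators must label the same number of turns")
    n_turns = lengths.pop()

    disagreements = []
    for turn in range(n_turns):
        # Count how many annotators chose each label for this turn.
        votes = Counter(labels[turn] for labels in annotations.values())
        if len(votes) > 1:
            # Majority vote is only a suggestion; a human makes the final call.
            majority_label, _ = votes.most_common(1)[0]
            disagreements.append(
                {"turn": turn, "votes": dict(votes), "majority": majority_label}
            )
    return disagreements


# Hypothetical annotations from three crowd workers for a three-turn dialogue.
annotations = {
    "annotator_a": ["greet", "request", "inform"],
    "annotator_b": ["greet", "inform", "inform"],
    "annotator_c": ["greet", "request", "inform"],
}

conflicts = find_disagreements(annotations)
print(conflicts)
```

In this example only turn 1 is flagged, since the annotators agree on turns 0 and 2. A resolution interface would then only need to show the flagged turns to a reviewer.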
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
