From data to corpus: semiotic and documentary issues in audiovisual archives
Peter Stockinger (Inalco, PLIDAM EA 4514)

TL;DR
This paper explores the theoretical, methodological, and technical challenges of constructing and analyzing audiovisual corpora in digital humanities, emphasizing semiotic frameworks, data documentation, and semantic enrichment.
Contribution
It provides a comprehensive analysis of the foundational issues in audiovisual corpus research, integrating semiotic theory with digital humanities methodologies.
Findings
Highlights the importance of a transdisciplinary semiotic approach.
Clarifies distinctions between data collections, corpora, and archives.
Discusses semantic enrichment processes for meaningful data analysis.
Abstract
The article examines the theoretical, methodological, and technical foundations of research on audiovisual corpora within the field of digital humanities. It outlines the main transversal issues underlying the processes of constructing, exploiting, and interpreting such corpora, which are conceived as specific forms of textual data in the broad sense - that is, as sets of semiotic traces (written, visual, sound, or multimodal) that make it possible to document, analyze, and transmit domains of knowledge. The analysis is organized around five complementary themes. The first concerns the status and structure of textual data lato sensu: any data, regardless of its medium, participates in a meaningful representation of a domain and therefore requires a unified theoretical and methodological framework based on a transdisciplinary semiotic approach. The second theme addresses the documentary…
Taxonomy
Topics: Digital Humanities and Scholarship · Narrative Theory and Analysis · Radio, Podcasts, and Digital Media
