Automatically Segmenting Oral History Transcripts

Ryan Shaw

arXiv:1509.08842·cs.CL·September 30, 2015

TL;DR

This paper explores automated methods for segmenting oral history transcripts into topically coherent sections. It compares the BayesSeg and TextTiling algorithms and discusses why low inter-annotator agreement makes evaluation difficult.

Contribution

The paper evaluates how well the BayesSeg and TextTiling algorithms segment oral history transcripts and highlights the need for a clearer definition of the segmentation task.

Findings

01 BayesSeg performs slightly better than TextTiling.

02 TextTiling does not significantly outperform a uniform segmentation.

03 Inter-annotator agreement is low, complicating evaluation.
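
The uniform segmentation used as a point of comparison in the findings above can be illustrated with a minimal sketch. It assumes a transcript has already been split into units (for example sentences or speaker turns) and that the desired number of segments is given; the function names `uniform_boundaries` and `split_uniform` are hypothetical and not taken from the paper.

```python
def uniform_boundaries(num_units: int, num_segments: int) -> list[int]:
    """Return the indices of the units that start segments 2..num_segments.

    Segment lengths differ by at most one unit. This is an illustrative
    baseline, not the paper's own implementation.
    """
    if num_segments < 1 or num_segments > num_units:
        raise ValueError("need 1 <= num_segments <= num_units")
    # Integer division places each cut at an evenly spaced position.
    return [(i * num_units) // num_segments for i in range(1, num_segments)]


def split_uniform(units: list, num_segments: int) -> list[list]:
    """Split a list of transcript units into roughly equal-length segments."""
    cuts = [0] + uniform_boundaries(len(units), num_segments) + [len(units)]
    return [units[start:end] for start, end in zip(cuts, cuts[1:])]
```

A baseline like this ignores the content of the transcript entirely, which is why a content-based segmenter that cannot beat it offers little practical benefit.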

Abstract

Dividing oral histories into topically coherent segments can make them more accessible online. People regularly make judgments about where coherent segments can be extracted from oral histories. But making these judgments can be taxing, so automated assistance is potentially attractive to speed the task of extracting segments from open-ended interviews. When different people are asked to extract coherent segments from the same oral histories, they often do not agree about precisely where such segments begin and end. This low agreement makes the evaluation of algorithmic segmenters challenging, but there is reason to believe that for segmenting oral history transcripts, some approaches are more promising than others. The BayesSeg algorithm performs slightly better than TextTiling, while TextTiling does not perform significantly better than a uniform segmentation. BayesSeg might be used…

Taxonomy

Topics: Natural Language Processing Techniques · Oral History, Memory, Narrative Analysis