Measuring Similarity: Computationally Reproducing the Scholar's Interests

Ashley Lee; Jo Guldi; Andras Zsom

arXiv:1812.05984 · cs.CL · December 17, 2018

Open Access

TL;DR

This paper explores how computational text classification methods can be made transparent and understandable to humanists, enabling critique and improvement of personalized document grouping algorithms.

Contribution

It introduces a framework for translating computational classification procedures into human-readable terms for scholarly critique.

Findings

01 Proposes methods for translating algorithms into human-understandable language

02 Highlights opportunities for expert critique of automated classification

03 Suggests improvements for transparency in personalized search algorithms

Abstract

Computerized document classification already orders the news articles that Apple's "News" app or Google's "personalized search" feature groups together to match a reader's interests. The invisible and therefore illegible decisions that go into these tailored searches have been the subject of a critique by scholars who emphasize that our intelligence about documents is only as good as our ability to understand the criteria of search. This article will attempt to unpack the procedures used in computational classification of texts, translating them into terms legible to humanists, and examining opportunities to render the computational text classification process subject to expert critique and improvement.

Taxonomy

Topics: Text and Document Classification Technologies · Natural Language Processing Techniques · Topic Modeling