Semantic Technology-Assisted Review (STAR) Document analysis and monitoring using random vectors

Jean-François Delpech

arXiv:1711.10307 · cs.IR · November 30, 2017 · 1 citation

Open Access

TL;DR

This paper introduces STAR, a semantic technology that uses random vectors to embed words and documents into a low-dimensional space, enabling fast, accurate analysis and monitoring of large document collections with minimal expert input.

Contribution

The paper presents a novel application of random vector embeddings for semantic analysis, improving efficiency and accuracy in document review and monitoring tasks.

Findings

01. Fast computation of semantic similarities

02. High-quality document classification and summarization

03. Minimal expert involvement required

Abstract

The review and analysis of large collections of documents and the periodic monitoring of new additions thereto have greatly benefited from new developments in computer software. This paper demonstrates how using random vectors to construct a low-dimensional Euclidean space embedding words and documents enables fast and accurate computation of semantic similarities between them. With this technique of Semantic Technology-Assisted Review (STAR), documents can be selected, compared, classified, summarized and evaluated very quickly with minimal expert involvement and high-quality results.
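
To make the idea of embedding with random vectors concrete, here is a minimal Python sketch. It is not the paper's algorithm, whose details are not given in the abstract. It assumes the simplest possible scheme: each word gets a fixed pseudo-random Gaussian vector, a document is the normalized sum of its word vectors, and similarity is the cosine between document vectors. The dimension `DIM`, the function names, and the sample texts are all hypothetical choices made for illustration.

```python
import math
import random

DIM = 256  # hypothetical embedding dimension, chosen only for illustration


def word_vector(word, dim=DIM):
    """Return a reproducible pseudo-random Gaussian vector for a word."""
    rng = random.Random(word)  # seeding with the word keeps its vector stable
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def document_vector(text, dim=DIM):
    """Sum the random vectors of a document's words, then scale to unit length."""
    total = [0.0] * dim
    for word in text.lower().split():
        for i, x in enumerate(word_vector(word, dim)):
            total[i] += x
    norm = math.sqrt(sum(x * x for x in total))
    return [x / norm for x in total] if norm > 0 else total


def cosine(u, v):
    """Dot product of two unit vectors, which equals their cosine similarity."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical sample documents: "a" and "b" share most words, "c" shares none.
docs = {
    "a": "random vectors embed words and documents",
    "b": "random vectors embed documents and words quickly",
    "c": "the weather was sunny all weekend",
}
vecs = {name: document_vector(text) for name, text in docs.items()}
sim_ab = cosine(vecs["a"], vecs["b"])
sim_ac = cosine(vecs["a"], vecs["c"])
print(f"a vs b: {sim_ab:.3f}")
print(f"a vs c: {sim_ac:.3f}")
```

Because independent random vectors in a few hundred dimensions are nearly orthogonal, documents with overlapping vocabulary end up close together while unrelated documents score near zero. This toy version only captures word overlap; capturing semantic similarity between different words would require word vectors that reflect context, which this sketch does not attempt.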

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Scientific Computing and Data Management