To show or not to show: Redacting sensitive text from videos of electronic displays

Abhishek Mukhopadhyay; Shubham Agarwal; Patrick Dylan Zwick; Pradipta Biswas

arXiv:2208.10270·cs.CV·August 23, 2022

Open Access

TL;DR

This paper presents a method for redacting sensitive text from videos of electronic displays using OCR and NLP, comparing Tesseract and Google Cloud Vision OCR for effectiveness.

Contribution

It introduces a combined OCR and NLP approach for privacy-preserving video redaction and evaluates the performance of different OCR models in this context.

Findings

01. GCV OCR outperforms Tesseract in accuracy and speed
02. The approach effectively redacts personally identifiable information from videos
03. Trade-offs between OCR models are discussed for real-world use

Abstract

With the increasing prevalence of video recordings, there is a growing need for tools that can maintain the privacy of those recorded. In this paper, we define an approach for redacting personally identifiable text from videos using a combination of optical character recognition (OCR) and natural language processing (NLP) techniques. We examine the relative performance of this approach when used with different OCR models, specifically Tesseract and the OCR system from Google Cloud Vision (GCV). For the proposed approach, the performance of GCV, in both accuracy and speed, is significantly higher than that of Tesseract. Finally, we explore the advantages and disadvantages of both models in real-world applications.
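
The abstract describes the pipeline only at a high level: OCR locates text in each frame, an NLP step decides which text is personally identifiable, and the matching regions are hidden. The sketch below illustrates that flow for a single frame under strong assumptions and is not the authors' implementation. The OCR step is abstracted away, with word boxes supplied directly in the form an engine such as Tesseract or GCV might return them. Two simple regular expressions stand in for the NLP classifier. The frame is a plain grid of pixel values. The names `Word`, `is_sensitive`, and `redact_frame` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One OCR-detected word and its bounding box in pixel coordinates."""
    text: str
    x: int
    y: int
    w: int
    h: int


# Assumption: regex patterns stand in for the paper's NLP step.
EMAIL = re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$")
PHONE = re.compile(r"^\+?\d[\d\-]{6,}\d$")


def is_sensitive(text: str) -> bool:
    """Flag a word as sensitive if it looks like an email or phone number."""
    return bool(EMAIL.match(text) or PHONE.match(text))


def redact_frame(frame: List[List[int]], words: List[Word]) -> List[List[int]]:
    """Return a copy of the frame with sensitive word boxes set to 0 (black)."""
    out = [row[:] for row in frame]
    height = len(out)
    width = len(out[0]) if out else 0
    for word in words:
        if not is_sensitive(word.text):
            continue
        # Clamp the box to the frame so partial boxes do not raise errors.
        for r in range(max(0, word.y), min(height, word.y + word.h)):
            for c in range(max(0, word.x), min(width, word.x + word.w)):
                out[r][c] = 0
    return out


# Toy frame: 4 rows by 12 columns of white pixels, with two detected words.
frame = [[255] * 12 for _ in range(4)]
words = [
    Word("Contact:", 0, 0, 4, 2),
    Word("jane@example.com", 5, 0, 6, 2),
]
redacted = redact_frame(frame, words)
```

In a video setting, the same function would be applied per frame, with the word list refreshed by whichever OCR engine is in use. The choice of engine affects only how `words` is produced, which is where the accuracy and speed differences discussed in the abstract would appear.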

Taxonomy

Topics: Handwritten Text Recognition Techniques · Digital Media Forensic Detection · Advanced Steganography and Watermarking Techniques