Scout: Leveraging Large Language Models for Rapid Digital Evidence Discovery

Shariq Murtuza

arXiv:2507.18478 · cs.CR · July 25, 2025

Open Access

TL;DR

Scout leverages large language models to rapidly identify and prioritize relevant digital evidence across various file types, significantly reducing investigation time and false positives.

Contribution

This work introduces Scout, a novel framework that uses foundational language models for efficient preliminary evidence processing in digital forensics.

Findings

01. Effective identification of relevant evidence files

02. Reduction in false positives during investigation

03. Rapid processing of diverse digital evidence types

Abstract

Recent technological advancements and the prevalence of technology in day-to-day activities have made digital evidence increasingly likely to figure in legal investigations. Consumer-grade hardware is growing more powerful, with expanding memory and storage sizes and enhanced processor capabilities. Forensic investigators often have to sift through gigabytes of data during an ongoing investigation, which makes the process tedious. Memory forensics and disk analysis are well supported by state-of-the-art tools that significantly lower the effort required of a forensic investigator by providing string searches, image file analysis, and similar features. During the course of an investigation, many false positives are identified, and their number needs to be reduced. This work presents Scout, a digital forensics framework that performs preliminary evidence…

Taxonomy

Topics: Topic Modeling · Data Quality and Management · Natural Language Processing Techniques