# Efficient video indexing for monitoring disease activity and progression in the upper gastrointestinal tract

**Authors:** Sharib Ali, Jens Rittscher

arXiv: 1905.04384 · 2019-05-14

## TL;DR

This paper introduces a deep learning-based method combining autoencoders and Siamese networks to efficiently index and retrieve endoscopy videos, aiding clinicians in monitoring disease progression with improved accuracy and speed.

## Contribution

It presents a novel approach integrating autoencoders and Siamese networks for efficient video compression and retrieval in medical endoscopy, addressing data variability and enhancing clinical utility.

## Key findings

- The combined autoencoder and Siamese network approach yielded an overall improvement of 5% and 8% over classical and variational autoencoders, respectively.
- Demonstrated effective retrieval from large-scale endoscopy videos.
- Validated on 3 large-scale videos from 3 different patients, using query samples from their previous diagnosis.

## Abstract

Endoscopy is a routine imaging technique used for both diagnosis and minimally invasive surgical treatment. While the endoscopy video contains a wealth of information, tools to capture this information for the purpose of clinical reporting are rather poor. To date, endoscopists have no access to tools that enable them to browse the video data in an efficient and user-friendly manner. Fast and reliable video retrieval methods could, for example, allow them to review data from previous exams and therefore improve their ability to monitor disease progression. Deep learning provides new avenues of compressing and indexing video in an extremely efficient manner. In this study, we propose to use an autoencoder for efficient video compression and fast retrieval of video images. To boost the accuracy of video image retrieval and to address data variability such as multi-modality and viewpoint changes, we propose the integration of a Siamese network. We demonstrate that our approach is competitive in retrieving images from 3 large-scale videos of 3 different patients, using query samples from their previous diagnosis. Quantitative validation shows that the combined approach yields an overall improvement of 5% and 8% over classical and variational autoencoders, respectively.
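The retrieval idea in the abstract (compress each frame to a short code, then match a query image against those codes) can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_encoder` is a hypothetical stand-in that average-pools pixel values where the paper uses a trained autoencoder, cosine similarity is an assumed choice of matching score, and the Siamese refinement step is not shown. The function names `build_index` and `retrieve` are likewise illustrative.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def toy_encoder(frame: Sequence[float], code_size: int = 4) -> Vector:
    """Hypothetical stand-in for a trained encoder.

    Average-pools a flattened frame into `code_size` values. Trailing
    pixels that do not fill a whole chunk are ignored. A real system
    would use the learned encoder half of an autoencoder instead.
    """
    if len(frame) < code_size:
        raise ValueError("frame must have at least code_size values")
    chunk = len(frame) // code_size
    return [
        sum(frame[i:i + chunk]) / chunk
        for i in range(0, chunk * code_size, chunk)
    ]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two codes; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_index(
    frames: Sequence[Sequence[float]],
    encode: Callable[[Sequence[float]], Vector] = toy_encoder,
) -> List[Vector]:
    """Encode every video frame once so that queries only compare short codes."""
    return [encode(frame) for frame in frames]


def retrieve(
    query: Sequence[float],
    index: List[Vector],
    encode: Callable[[Sequence[float]], Vector] = toy_encoder,
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (frame_number, similarity) pairs for the best-matching frames."""
    query_code = encode(query)
    scored = [
        (i, cosine_similarity(query_code, code)) for i, code in enumerate(index)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Three tiny made-up "frames" of 8 pixels each.
    video = [
        [0, 0, 0, 0, 1, 1, 1, 1],
        [1, 1, 1, 1, 0, 0, 0, 0],
        [0.5] * 8,
    ]
    index = build_index(video)
    print(retrieve([0, 0, 0, 0, 0.9, 1, 1, 0.9], index, top_k=2))
```

Encoding the video once and comparing only the compact codes at query time is what makes this kind of lookup fast; the accuracy depends entirely on how well the encoder separates visually different frames, which is where a learned model would replace the pooling stand-in.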

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.04384/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1905.04384/full.md

## References

12 references — full list in the complete paper: https://tomesphere.com/paper/1905.04384/full.md

---
Source: https://tomesphere.com/paper/1905.04384