VideoSAM: A Large Vision Foundation Model for High-Speed Video Segmentation
Chika Maduabuchi, Ericmoore Jossou, Matteo Bucci

TL;DR
VideoSAM is a fine-tuned vision model that significantly improves high-speed video segmentation accuracy across various fluid environments, surpassing traditional models like U-Net.
Contribution
We introduce VideoSAM, a specialized adaptation of the Segment Anything Model (SAM) for high-speed video (HSV) segmentation, along with an open-source dataset, advancing the robustness and accuracy of phase detection in scientific videos.
Findings
VideoSAM outperforms U-Net in complex segmentation tasks.
The model generalizes well across different fluid environments.
The open-source dataset facilitates future research in HSV segmentation.
Abstract
High-speed video (HSV) segmentation is essential for analyzing dynamic physical processes in scientific and industrial applications, such as boiling heat transfer. Existing models like U-Net struggle with generalization and accurately segmenting complex bubble formations. We present VideoSAM, a specialized adaptation of the Segment Anything Model (SAM), fine-tuned on a diverse HSV dataset for phase detection. Through diverse experiments, VideoSAM demonstrates superior performance across four fluid environments -- Water, FC-72, Nitrogen, and Argon -- significantly outperforming U-Net in complex segmentation tasks. In addition to introducing VideoSAM, we contribute an open-source HSV segmentation dataset designed for phase detection, enabling future research in this domain. Our findings underscore VideoSAM's potential to set new standards in robust and accurate HSV segmentation. The code…
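As an illustration of how two phase-detection models could be compared on the same frames, the sketch below scores binary masks with intersection over union (IoU). This listing does not say which metric the authors report, so IoU is an assumption here, chosen because it is a common segmentation score. The names `mask_iou` and `mean_iou` and the toy masks are hypothetical and are not taken from the VideoSAM code.

```python
from typing import Sequence

# A 2D binary mask as nested sequences: 1 = vapor (bubble), 0 = liquid.
# This encoding is an assumption made for the example.
Mask = Sequence[Sequence[int]]


def mask_iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two same-shaped binary masks."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")
    inter = 0
    union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly, so IoU is defined as 1.0 in that case.
    return inter / union if union else 1.0


def mean_iou(preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average per-frame IoU over a sequence of frames."""
    if len(preds) != len(targets) or not preds:
        raise ValueError("need the same non-zero number of predictions and targets")
    return sum(mask_iou(p, t) for p, t in zip(preds, targets)) / len(preds)


# Toy 2x3 frame: a hand-made ground truth and two hypothetical model outputs.
ground_truth = [[0, 1, 1], [0, 1, 0]]
model_a = [[0, 1, 1], [0, 0, 0]]  # misses one bubble pixel
model_b = [[1, 1, 1], [0, 1, 0]]  # adds one false bubble pixel

iou_a = mask_iou(model_a, ground_truth)  # 2 shared pixels / 3 in the union
iou_b = mask_iou(model_b, ground_truth)  # 3 shared pixels / 4 in the union
```

Averaging `mask_iou` over every frame of a video with `mean_iou` gives one score per model and per fluid, which is one simple way to put two segmentation models side by side.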
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Medical Image Segmentation Techniques · Advanced Neural Network Applications
Methods: Concatenated Skip Connection · Convolution · Max Pooling · U-Net · Sparse Evolutionary Training
