MSEG-VCUQ: Multimodal SEGmentation with Enhanced Vision Foundation Models, Convolutional Neural Networks, and Uncertainty Quantification for High-Speed Video Phase Detection Data

Chika Maduabuchi; Ericmoore Jossou; Matteo Bucci

arXiv:2411.07463 · cs.CV · March 17, 2026

TL;DR

This paper introduces MSEG-VCUQ, a hybrid multimodal framework that combines CNNs and vision foundation models with uncertainty quantification to improve the segmentation of high-speed video phase detection data from complex industrial processes.

Contribution

It presents the first open-source multimodal HSV PD datasets and integrates CNNs with transformer-based models for enhanced segmentation and reliability assessment.

Findings

01. Outperforms baseline CNNs and VFMs in segmentation accuracy
02. Provides pixel-level uncertainty quantification for critical metrics
03. Enables scalable, reliable phase detection in boiling dynamics

Abstract

High-speed video (HSV) phase detection (PD) segmentation is crucial for monitoring vapor, liquid, and microlayer phases in industrial processes. While CNN-based models like U-Net have shown success in simplified shadowgraphy-based two-phase flow (TPF) analysis, their application to complex HSV PD tasks remains unexplored, and vision foundation models (VFMs) have yet to address the complexities of either shadowgraphy-based or PD TPF video segmentation. Existing uncertainty quantification (UQ) methods lack pixel-level reliability for critical metrics like contact line density and dry area fraction, and the absence of large-scale, multimodal experimental datasets tailored to PD segmentation further impedes progress. To address these gaps, we propose MSEG-VCUQ. This hybrid framework integrates U-Net CNNs with the transformer-based Segment Anything Model (SAM) to achieve enhanced…
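The two metrics named in the abstract, dry area fraction and contact line density, can be illustrated on a binary segmentation mask. The following is a minimal sketch, not the paper's implementation. It assumes that a mask value of 1 marks a dry (vapor-covered) pixel, that dry area fraction is the share of dry pixels in the frame, and that contact line density is the dry/wet boundary length per unit imaged area. The function names, the `pixel_size` parameter, and the edge-counting length estimate are all hypothetical choices made for illustration.

```python
import numpy as np


def dry_area_fraction(mask: np.ndarray) -> float:
    """Share of pixels labelled dry (assumed: 1 = dry, 0 = wet)."""
    mask = np.asarray(mask, dtype=bool)
    return float(mask.mean())


def contact_line_density(mask: np.ndarray, pixel_size: float = 1.0) -> float:
    """Dry/wet boundary length per unit imaged area.

    The length is a crude estimate: it counts pixel edges that separate
    a dry pixel from a wet neighbour (assumption, not the paper's method).
    """
    mask = np.asarray(mask, dtype=bool)
    # Edges between horizontally adjacent pixels with different labels.
    horizontal = np.count_nonzero(mask[:, 1:] != mask[:, :-1])
    # Edges between vertically adjacent pixels with different labels.
    vertical = np.count_nonzero(mask[1:, :] != mask[:-1, :])
    length = (horizontal + vertical) * pixel_size
    area = mask.size * pixel_size ** 2
    return float(length / area)


# Toy 4x4 frame with a 2x2 dry patch in the centre.
demo = np.zeros((4, 4), dtype=np.uint8)
demo[1:3, 1:3] = 1

print(dry_area_fraction(demo))     # 4 dry pixels out of 16
print(contact_line_density(demo))  # 8 boundary edges over an area of 16
```

Counting pixel edges overestimates the length of diagonal or curved contact lines, so a contour-based length estimate would be a natural refinement when accuracy matters.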

Code & Models

Repositories

chikap421/mseg_vcuq (Official)

Taxonomy

Topics: Advanced Vision and Imaging

Methods: Concatenated Skip Connection · Max Pooling · Convolution · U-Net · Segment Anything Model