MLLM-Fabric: Multimodal Large Language Model-Driven Robotic Framework for Fabric Sorting and Selection

Liman Wang, Hanyang Zhong, Tianyuan Wang, Shan Luo, and Jihong Zhu

arXiv:2507.04351 · cs.RO · October 14, 2025

TL;DR

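MLLM-Fabric is a robotic framework that uses multimodal large language models (MLLMs) to sort and select fabrics for textile manufacturing, apparel production, and smart retail. Its fine-tuned model ranks fabric attributes and selects fabrics more reliably than pretrained vision-language baselines.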

Contribution

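It introduces a robotic framework, built on a multimodal robotic platform, that is trained with supervised fine-tuning and explanation-guided distillation to rank fabric properties. It also releases a dataset of 220 diverse fabrics, each with RGB images and synchronized visuotactile and pressure data.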

Findings

01 Fabric-Llama-90B consistently outperforms pretrained vision-language baselines in attribute ranking.

02 It also achieves higher selection reliability than those baselines.

03 The results demonstrate effective multimodal fabric understanding.

Abstract

Choosing appropriate fabrics is critical for meeting functional and quality demands in robotic textile manufacturing, apparel production, and smart retail. We propose MLLM-Fabric, a robotic framework leveraging multimodal large language models (MLLMs) for fabric sorting and selection. Built on a multimodal robotic platform, the system is trained through supervised fine-tuning and explanation-guided distillation to rank fabric properties. We also release a dataset of 220 diverse fabrics, each with RGB images and synchronized visuotactile and pressure data. Experiments show that our Fabric-Llama-90B consistently outperforms pretrained vision-language baselines in both attribute ranking and selection reliability. Code and dataset are publicly available at https://github.com/limanwang/MLLM-Fabric.

Code & Models

Datasets

EuniceF/MLLM-Fabric (dataset, 137 downloads)
