AutoGluon-Multimodal (AutoMM): Supercharging Multimodal AutoML with Foundation Models
Zhiqiang Tang, Haoyang Fang, Su Zhou, Taojiannan Yang, Zihan Zhong, Tony Hu, Katrin Kirchhoff, George Karypis

TL;DR
AutoGluon-Multimodal (AutoMM) is an open-source AutoML library that simplifies multimodal foundation model fine-tuning across various data types and tasks, achieving superior or competitive performance with minimal code.
Contribution
AutoMM introduces a user-friendly framework for multimodal AutoML that supports multiple data modalities and tasks, streamlining the fine-tuning of foundation models with just three lines of code.
Findings
AutoMM outperforms existing AutoML tools in basic classification and regression tasks.
AutoMM achieves competitive results in advanced multimodal tasks.
The library demonstrates ease of use and versatility across diverse datasets.
Abstract
AutoGluon-Multimodal (AutoMM) is introduced as an open-source AutoML library designed specifically for multimodal learning. Distinguished by its exceptional ease of use, AutoMM enables fine-tuning of foundation models with just three lines of code. Supporting various modalities including image, text, and tabular data, both independently and in combination, the library offers a comprehensive suite of functionalities spanning classification, regression, object detection, semantic matching, and image segmentation. Experiments across diverse datasets and tasks showcase AutoMM's superior performance in basic classification and regression tasks compared to existing AutoML tools, while also demonstrating competitive results in advanced tasks, on par with specialized toolboxes designed for such purposes.
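The "three lines of code" workflow might look roughly like the sketch below. It assumes the `MultiModalPredictor` class from the `autogluon.multimodal` package, and that training and test data arrive as pandas DataFrames whose columns hold text, image paths, or tabular values. The wrapper function `fit_and_predict` and its parameter names are hypothetical, introduced only for illustration; they are not part of the library or taken from the paper.

```python
def fit_and_predict(train_data, test_data, label_column):
    """Fine-tune a model on train_data and return predictions for test_data.

    Hypothetical helper for illustration; assumes AutoGluon is installed
    (pip install autogluon) and that both inputs are pandas DataFrames.
    """
    # Imported lazily so this module can be loaded without AutoGluon present.
    from autogluon.multimodal import MultiModalPredictor

    # The three core lines: construct the predictor, fit it, predict.
    predictor = MultiModalPredictor(label=label_column)
    predictor.fit(train_data)
    return predictor.predict(test_data)
```

In this sketch the caller names only the label column; choosing the problem type and the backbone models is left to the library's defaults.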
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Lib
