Exploring a Multimodal Chatbot as a Facilitator in Therapeutic Art Activity

Le Lin; Zihao Zhu; Rainbow Tin Hung Ho; Jing Liao; Yuhan Luo

arXiv:2602.14183 · cs.HC · May 12, 2026

TL;DR

This work-in-progress paper presents a multimodal chatbot that interprets visual art in real time and engages users in reflective dialogue to support therapeutic art activities.

Contribution

It introduces an MLLM-powered chatbot designed for real-time visual analysis and therapeutic engagement in art activities, outlining future development directions.

Findings

01. Demonstrated potential to facilitate therapeutic engagement

02. Identified key areas for future development in AI-mediated art therapy

03. Evaluation with five experts in art therapy and related fields supports the chatbot's potential

Abstract

Therapeutic art activities, such as expressive drawing and painting, require synergy between creative visual production and interactive dialogue. Recent advancements in Multimodal Large Language Models (MLLMs) have expanded the capacity of computing systems to interpret both textual and visual data, offering a new frontier for AI-mediated therapeutic support. This work-in-progress paper introduces an MLLM-powered chatbot that analyzes visual creations in real time while engaging the creator in reflective conversations. We conducted an evaluation with five experts in art therapy and related fields, which demonstrated the chatbot's potential to facilitate therapeutic engagement and highlighted several areas for future development, including entryways and risk management, bespoke alignment of user profile and therapeutic style, balancing conversational depth and width, and enriching…
