Improving Computer Assisted Speech Therapy Through Speech Based Emotion Recognition
Ovidiu Schipor

TL;DR
This paper introduces PhonEM, an emotion recognition framework based on speech analysis, to enhance computer-assisted speech therapy by enabling more empathetic and responsive virtual therapists, especially for children.
Contribution
It presents a novel fuzzy model for emotion representation and a speech-based detection method integrated into a virtual speech therapy system, improving emotional responsiveness.
Findings
Successful integration of PhonEM into Logomon, a Computer Based Speech Therapy System (CBST).
Enhanced emotional responsiveness in speech therapy sessions.
Positive preliminary results encouraging further development.
Abstract
Speech therapy consists of a wide range of services whose aim is to prevent, diagnose and treat different types of speech impairments. One of the most important conditions for obtaining favourable and steady results is "immersing" the subject for as long as possible in the therapeutic context: at home, at school/work, on the street. Since portable computers are becoming habitual accessories, it seems a good idea to create virtual versions of human Speech and Language Therapists (SLTs) and to integrate them into these devices. However, one of the main distinctions between an SLT and a Computer Based Speech Therapy System (CBST) arises from the field of emotional intelligence. The inability of current CBSTs to detect the emotional state of human subjects leads to inadequate behavioural responses. Furthermore, this "unresponsive" behaviour is perceived as a lack of empathy and,…
