Automated Screening for Distress: A Perspective for the Future
Rajib Rana
University of Southern Queensland, Australia
Siddique Latif
University of Southern Queensland, Australia
Raj Gururajan
University of Southern Queensland, Australia
Anthony Gray
University of Southern Queensland, Australia
Geraldine Mackenzie
University of Southern Queensland, Australia
Gerald Humphris
University of St Andrews, United Kingdom
Jeff Dunn
University of Southern Queensland, Australia
Griffith University, Australia
University of Technology Sydney, Australia
Abstract
Distress is a complex condition which affects a significant percentage of cancer patients and may lead to depression, anxiety, sadness, suicide and other forms of psychological morbidity. Compelling evidence supports screening for distress as a means of facilitating early intervention and subsequent improvements in psychological well-being and overall quality of life. Nevertheless, despite the existence of evidence-based and easily administered screening tools, for example the Distress Thermometer, routine screening for distress is yet to achieve widespread implementation. Efforts are intensifying to utilise innovative, cost-effective methods now available through emerging technologies in the informatics and computational arenas.
I Introduction
Distress is described as emotional suffering with high global prevalence, which can result in disabling conditions and impairment for patients.
It is highly prevalent in cancer patients, affecting 25 to 60% of them [1], and can cause severe harm to this cohort. Diminished Quality of Life (QoL) is one of the key adverse effects of distress [1]. Other implications include shortened survival time [2] and negative outcomes for physical health through impaired immune functioning [3]. Distress may also negatively influence treatment adherence in cancer patients, and non-adherence substantially increases healthcare costs through an increased likelihood of recurrence and other disease complications [4]. Overall, distress is estimated to increase the cost of cancer care by as much as 20% [5, 6].
Early assessment and screening for distress will enable timely management of distress, leading to (1) improved adherence to treatment, (2) more effective communication between patient and clinician, (3) fewer visits to the hospital, and (4) early intervention for and prevention of severe anxiety or depression [7]. This is why international organisations have endorsed the quality care standard of whole-patient care that is achieved through routine comprehensive distress screening [8]. However, routine screening for distress has yet to be widely adopted [9]. Reasons for the limited system-wide uptake of screening for distress are reported to include time constraints, limitations in the skills of health care providers, cost, and attitudes of health care providers to the use of standardised tools [1, 5, 8, 9].
Automating screening for distress with the assistance of emerging technologies may serve to alleviate many of the challenges associated with its routine application. However, a number of challenges need to be addressed to develop a robust automated distress detection system. In this paper, we aim to critique distress detection research in terms of assessment tools, available datasets and existing methods for automatic distress screening. A number of studies [10, 11, 12] have reviewed the research on automatic depression detection, but none of them has highlighted the research gap for distress detection. This study attempts to highlight the core differences between distress and other emotions and mental disorders, discuss existing methodologies for inferring distress, and provide future directions for developing an automated distress detection system.
II What is Distress
Distress is considered a continuum of psychological symptoms with varying severity [13]. In the forensic sciences the term “distress” is specifically used for the affective states that arise in violent situations. The National Comprehensive Cancer Network provides the widely accepted definition of distress [7]: Distress is a multifactorial unpleasant emotional experience of a psychological (cognitive, behavioural, emotional), social, and/or spiritual nature. Distress encompasses a range of common feelings of vulnerability, sadness, and fear that can cause depression, anxiety, panic, social isolation, and existential and spiritual crisis. Distress is not always caused by unexpected external events; it can also be caused by internal states such as feelings, thoughts, and habitual behaviours [14, 15]. It is an uncomfortable feeling and can affect an individual’s capacity to work, social life, body, and mind.
It is a subjective experience, and different people manifest it differently, with a varied range of symptoms. However, the most common symptoms of distress [16] are: sleep disturbances, memory problems, anger management issues, obsessive thoughts, fatigue, sadness, weight gain, hallucinations, delusions, etc.
II-A How Distress is different from Emotions
Emotion is an essential component of human life and plays an important role in survival [17]. As human beings, we feel a whole range of emotions that may be comfortable or uncomfortable [18, 19]. Emotional discomfort is a universal human experience. In fact, negative emotions including sadness, anger and fear are important and useful in various situations. For instance, fear is helpful when there is a real threat to our safety (e.g., a gun pointed at us or a ferocious wild animal in the vicinity) and helps humans to withstand such threatening situations effectively. Similarly, sadness inadvertently helps to spotlight the things that we care about in our lives. It is important to reinforce that negative emotions, for example disgust [21], are not necessarily distress [20].
In our daily life, emotions are transient [22]: they fluctuate like waves as they plateau, subside and eventually pass, continually moving and changing. In contrast, distress is a prevailing condition that, if not addressed, escalates to the point of emotional combustion [20]. Emotions such as fear or anger are aroused to prevent, solve, cope with, or get away from a specific situation. Distress is different: it can be felt strongly, it compromises a person’s ability to cope, and if left untreated it may escalate to more serious conditions [23].
II-B How Distress is different from Stress
Stress and distress are often used interchangeably, which blurs the distinction between these concepts. It is, however, important to distinguish the two terms. Stress is an important element of life, and it has both positive and negative effects. As pointed out by Spielberger [24], “Stress is an integral part of the natural fabric of life, and coping with stress is an everyday requirement for normal human growth and development”. The body uses behavioural or physiological mechanisms to counter the perturbation caused by stress and return to normality. People usually adapt to stress, but when this adaptive process is compromised, stress may develop into distress. Stress may present as either chronic or acute [25], and any transition from stress to distress depends on various factors including duration, intensity, and controllability.
II-C How Distress is different from Depression
Other dimensions of psychopathology such as depression and anxiety are also closely related to distress. In particular, most assessment tools for distress, and the consequent treatment, are based on depression symptoms [26]. Patients with depression need to meet at least five of the DSM-5 (Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition) criteria for major depressive disorder nearly every day during a two-week period. However, distress has different symptoms such as poor self-management, feeling angry and scared, and feeling unsupported by family and friends [27], which are not included in DSM-5. This suggests the need to formulate an alternative screening approach for distressed people who are not clinically depressed.
III Assessment Scales
Distress remains undetected in most patients [9]; surprisingly, however, there are many scales available to gauge distress.
In this section we present the most popular scales used to screen distress. We also present (see Table I) the number of questions/items in each scale and time to conduct the screening to indicate the complexity of each scale.
The Disability Distress Assessment Tool (DisDAT) [36] was designed by a palliative care team to assess distress. It is not a scoring tool; rather, it documents a wide range of behaviours and signs related to distress. A distress scale based on ten symptoms was designed by McCorkle et al. [37]. This scale was tested on 53 patients, whose distress scores ranged from 10 to 41. The Distress Thermometer [38] is another scale, which enables patients to rate their distress level on a visual scale ranging from 0 (no distress) to 10 (extreme distress). The SCL-90 (Symptom Checklist-90) and BSI (Brief Symptom Inventory) have been widely used for screening of psychological distress in medical patients and have demonstrated high levels of specificity and sensitivity [39, 13]. The 12-item General Health Questionnaire (GHQ-12) is designed for the study of psychological disorders in general clinical settings and has been used in various studies [40, 41]. A recently proposed scale for distress assessment is the K10 [42]. It is a 10-item scale specifically designed to assess distress in population surveys. This scale evaluates individuals on anxio-depressive symptoms over the last 30 days and provides a total score as an index of distress. The Functional Assessment of Chronic Illness Therapy (FACIT) Measurement System [30] is used for the management of chronic illness using questionnaires on health-related quality of life. Its generic version, known as the Functional Assessment of Cancer Therapy-General (FACT-G), covers four primary quality of life domains: physical well-being, social/family well-being, emotional well-being, and functional well-being. A six-item sub-scale of the Somatic and Psychological Health Report (SPHERE-12) measures aspects of distress and related conditions [43]. This scale is based on the GHQ [29] and each item is scored on a three-point scale between 0 and 2, which gives a maximum score of 12.
A number of scales for depression are also used to measure distress. The Hospital Anxiety and Depression Scale (HADS) is a screening instrument used to assess anxiety and depression in physically ill patients [34]. It includes 14 items for anxiety and depression, each with four alternative answers, which are used to compute a total distress score.
Self-report scales including Beck’s Depression Inventory (BDI) [33] and the Patient Health Questionnaire - Anxiety and Depression Scale (PHQ-ADS) [44] have also been shown to have some relevance to distress for particular patient groups.
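The summed scoring used by several of the scales in this section can be illustrated with a short Python sketch. It assumes only what is stated above for the six-item SPHERE-12 sub-scale (six items, each scored 0 to 2, maximum 12) and for the Distress Thermometer (a single rating from 0 to 10). The function names are hypothetical, and no clinical cut-off is implied because none is given here.

```python
def score_summed_scale(responses, n_items, min_item, max_item):
    """Validate item responses and return their sum."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    for value in responses:
        if not isinstance(value, int) or not min_item <= value <= max_item:
            raise ValueError(f"item response {value!r} is outside {min_item}..{max_item}")
    return sum(responses)


def score_sphere_subscale(responses):
    """Six items, each scored 0-2, so the total lies between 0 and 12."""
    return score_summed_scale(responses, n_items=6, min_item=0, max_item=2)


def score_distress_thermometer(rating):
    """Single rating from 0 (no distress) to 10 (extreme distress)."""
    return score_summed_scale([rating], n_items=1, min_item=0, max_item=10)
```

For example, calling score_sphere_subscale on six hypothetical item responses returns their total, and an out-of-range response raises ValueError rather than being silently included in the score.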
IV Automatic Distress Assessment
Distress is highly prevalent in patients with chronic disease. Despite the fact that it can cause serious harm, clinicians are reluctant to use the existing distress screening tools for various reasons, most importantly cost and time requirements [45]. Emerging information technologies are playing a promising role in automating the screening of different health issues [46]. They also have great potential for the automated screening of distress, which may greatly alleviate these problems and facilitate widespread uptake. The potential benefit of automated screening for distress has encouraged research efforts, and we discuss progress in this section.
Besides its application in health, automated distress detection has also been studied in two other areas: aged care and forensics. In homes for elderly people, distress calls arise if there is a fall, a fire or another such event [47]. In the forensic scenario, automated distress detection helps the police prioritise the crime response based on the intensity of the caller’s distress [48]. Automated distress detection can also assist forensic phoneticians by providing them with an objective measure of the distress of victims in recorded attacks. In this section, we discuss the methodologies used in these three sectors.
IV-A Health
For automated distress detection in health, most studies have focused on distress-related conditions such as depression, anxiety, PTSD, and suicidal behaviour; very few studies [49, 50] have reported results on distress detection itself. For instance, an automated distress management system [51] was piloted in an outpatient medical oncology practice, using a tablet or computer to deliver tailored psychosocial coping recommendations or referrals to individuals after immediate analysis. The authors used the Distress Thermometer and the problem list proposed by the National Comprehensive Cancer Network as a screening tool. Their system matches patient-identified concerns with the problem list and proposes evidence-based treatment suggestions and referrals. The Verona coding definitions of emotional sequences (VR-CoDES) were developed for the detection and categorisation of patients’ emotions and the corresponding responses of their healthcare physicians [52]. Different studies have exploited VR-CoDES [53, 54, 55]; however, the need to train researchers in its usage and the skilled labour necessary for labelling consultation recordings are its major practical limitations. In this regard, Birkett et al. [56] developed computer-based tools to assist VR-CoDES in the labelling of patient-physician recordings. The authors tried different representations of patients’ utterances and evaluated well-known classifiers including naïve Bayes, logistic regression, support vector machines, and boosted ensemble decision trees for labelling recordings as an explicit concern, an emotional cue, or neither.
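To make the three-way labelling task concrete, the following is a minimal Python sketch of one of the classifier families named above, a multinomial naïve Bayes over bag-of-words counts with add-one smoothing. It is not the pipeline of Birkett et al. [56]: the whitespace tokeniser, the function names, and the six training utterances are all invented here purely for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy, invented utterances; a real system would be trained on coded consultations.
TOY_EXAMPLES = [
    ("i am so worried about the results", "concern"),
    ("i feel scared and worried", "concern"),
    ("it has been a rough few weeks", "cue"),
    ("things have been rough lately", "cue"),
    ("the appointment is on tuesday", "neither"),
    ("i parked near the clinic on tuesday", "neither"),
]


def tokenize(text):
    """Lower-case whitespace tokenisation (an assumption, chosen for brevity)."""
    return text.lower().split()


def train_naive_bayes(examples):
    """Count labels and per-label word frequencies from (utterance, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return {"label_counts": label_counts, "word_counts": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    total_docs = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["label_counts"].items():
        score = math.log(n_docs / total_docs)  # log prior
        denom = sum(model["word_counts"][label].values()) + vocab_size
        for token in tokenize(text):
            if token in model["vocab"]:  # tokens never seen in training are ignored
                score += math.log((model["word_counts"][label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

With the toy data, predict(train_naive_bayes(TOY_EXAMPLES), "i am worried") selects "concern" because "worried" occurs only in the concern utterances. The choice of utterance representation, which the text notes was varied by the authors, would replace the tokenize function in this sketch.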
Researchers who attempt to infer distress predominantly from its after-effects, such as depression, anxiety, PTSD, and suicidality, have developed various techniques. In [57], the authors analysed 33 individuals from a clinical trial of depression [58] and investigated the relationship between nonverbal behaviour and severity of depression using video recordings over the course of treatment. Scherer et al. [59] evaluated different visual features for psychological disorder analysis. They found that depressed individuals tend to gaze downwards more, smile less intensely and for shorter durations, and show longer self-touches and fidgeting. The inclusion of gender information with the visual features was found to be helpful in detecting distress-related conditions [60]. In addition to the visual indicators, Space-Time Interest Points (STIP) features have also been exploited to detect depression with significantly improved results [61, 62]. These features capture gestures related to head, face, shoulder, and hand movements.
Recent studies have shown the promise of using speech as an effective marker for the diagnosis and monitoring of depression. Speech provides a wide range of prosodic and spectral features that can be used effectively for human emotion [63, 64] and depression detection. Many researchers have used speech as an objective indicator for the detection of depression [65, 66, 67]. An interactive voice response (IVR) system was used to collect speech samples for automated HAM-D measures of depression severity [68, 69, 70, 71]. Acoustic features such as spectral, prosodic, cepstral, and glottal features, as well as features obtained from Teager energy operators (TEO), were investigated for clinical depression detection in adolescents [72]. TEO-based features produced more promising results than all other features and their combinations. Other studies [73, 74, 75, 76] also investigated different acoustic features and identified the more relevant indicators of depression. Ozdas et al. [77] studied excitation-related speech parameters including glottal flow spectrum and vocal jitter for the identification of major depressed, high-risk near-term suicidal, and non-suicidal patients. Vocal jitter was found to be a significant discriminator between the suicidal and non-depressed control groups, whereas glottal flow spectrum parameters discriminated between all three groups with significantly improved results. Scherer et al. [67] used prosody and voice quality related speech parameters for the identification of suicidal and non-suicidal adolescents. They found that suicidal adolescents tend to have breathier voice qualities than non-suicidal adolescents. A comparative study was performed in [78] using acoustic and prosodic features to detect depression in spontaneous speech. The authors found that voice features such as intensity, root mean square, and loudness performed best at detecting depression in the dataset.
Other studies (for example [79, 80, 81, 82, 83]) also exploited different machine learning techniques and suggested that speech can be effectively utilised to detect distress and related conditions.
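Two of the acoustic quantities mentioned above have compact textbook definitions, sketched below in Python. The discrete Teager energy operator is taken as psi[n] = x[n]^2 - x[n-1]*x[n+1], and local jitter as the mean absolute difference between consecutive pitch periods divided by the mean period. These are only the basic building blocks under those standard definitions; they are not the feature pipelines of [72] or [77], and the function names are hypothetical.

```python
import math


def teager_energy(signal):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1] * x[n+1] for interior samples."""
    return [
        signal[n] ** 2 - signal[n - 1] * signal[n + 1]
        for n in range(1, len(signal) - 1)
    ]


def local_jitter(periods):
    """Mean absolute difference of consecutive pitch periods over the mean period."""
    if len(periods) < 2:
        raise ValueError("need at least two pitch periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return mean_diff / mean_period


# Sanity check on a synthetic tone: for x[n] = A*cos(w*n) the operator is
# constant and equal to A^2 * sin(w)^2, so it tracks amplitude and frequency.
tone = [math.cos(0.3 * n) for n in range(50)]
tone_energy = teager_energy(tone)
```

A perfectly periodic voice has zero jitter under this definition, so any cycle-to-cycle irregularity in the extracted pitch periods raises the value.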
IV-B Aged Care
Life expectancy is increasing globally, leading to a higher number of older people in our society [84]. This increasing share of the elderly population is in part responsible for a shift in the cause of death from infectious and parasitic illnesses to chronic non-communicable diseases [85, 86]. Ageing can lead to physical limitations that need to be compensated by technical assistance and the help of aged care services. In aged care residential communities, feelings of isolation, fear, and a sense of helplessness, such as an inability to perform routine tasks, may lead to distress [87, 88].
Distress in elderly people often goes unrecognised for a range of reasons including confusing or unknown symptoms of distress [89], avoidance of checkups [90], and the lack of a systematic method or tool for distress detection [91]. The early detection and treatment of distress among elderly people is important because it can enhance recovery from illness and improve overall quality of life [89]. Different innovative products and solutions exist that promote independence and better quality of life among seniors with physical or cognitive diseases. For instance, the CIRDO project [92] aims to automatically detect situations of falls and distress in residential care to promote autonomy for elderly people. This system uses video and audio analysis to detect risky situations and make the necessary emergency call using the e-lio system (http://www.technosens.fr/). For distress detection, the CIRDO system was evaluated using Automatic Speech Recognition (ASR) to detect distress sentences in the AD80 [47] corpus and achieved promising results. The SweetHome project [93] used home-equipped, noise-robust, multisource ASR to detect vocal commands or distress sentences in the realistic noisy environment of a smart home. Twenty-three subjects, or “speakers”, participated in this experiment, where the closest distance between speakers and microphone was two metres. The authors performed voice order recognition of speech commands belonging to three classes: distress calls, neutral sentences, and home automation orders. Alternatively, a sound-based surveillance system [94] to detect alarming sounds in home situations has been described. This system performed real-time audio analysis for the detection of distress situations without compromising patients’ privacy.
Distress detection in elders using an ASR system is a very challenging task due to age-related degeneration of the vocal cords, problems of the laryngeal cartilages, and changes in the larynx muscles [95, 96]. Some studies have empirically shown that ASR models perform poorly on elderly voices when they are trained on young or middle-aged adult speech [97, 98]. In such situations, speaker adaptation techniques or training the ASR model on elderly voices can help to improve the recognition rate [99]. To explore the performance of ASR in distress situations, Aman et al. [47] compared the word error rate for aged and non-aged speech. They showed that the ASR system gives a higher word error rate of 43.5% for the aged group, compared with 9% for young speakers.
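The word error rate reported above is conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between the reference transcript and the ASR output, divided by the number of reference words. The following Python sketch implements that standard definition; the function name and the example sentences are hypothetical and are not taken from the corpus used in [47].

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions are counted, the rate can exceed 100% when the recogniser outputs many spurious words, which is worth keeping in mind when comparing figures across speaker groups.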
IV-C Forensic
Distress detection has an increasing presence in forensics, particularly in informing opinions about the authenticity of distress in criminal investigations. In forensic investigations, lies can arise from distortion, denial, evasion, concealment, and outright fabrication by people seeking to appear unaccountable for their actions [100], and distress surveillance systems are used to identify the presence of reliable emotional clues to detect malingering or deception. Forensic examinations are performed by psychologists using different techniques including interviews, observations, home or institutional visits, psychological tests and instruments, as well as other methods recognised by the Forensic Council [101]. Automatic distress detection can play a crucial role in assisting forensic examination practice by providing an objective measure that can assist the judicial authorities.
A comprehensive study was performed by Lisa [102] to investigate distress in speech using acoustic and perceptual cues and to empirically compare the results for real-life victims and actors in life-threatening situations. Based on the results of the acoustic analysis, it was concluded that acoustic parameters can be utilised to detect distress situations for actors and victims. In another study, Lisa [103] reported that two acoustic parameters, intensity and formant bandwidth, are helpful in differentiating between acted and genuine victims’ speech. Similarly, the mean and range of the fundamental frequency (F0) and the vowel formants can be used to distinguish between baseline and distress conditions for both victims and actors.
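As a small illustration of the F0 statistics mentioned above, the Python sketch below summarises an F0 contour by its mean and range over voiced frames. It assumes the contour has already been extracted by a pitch tracker and that unvoiced frames are encoded as values of zero or below; both are assumptions of this sketch, as is the function name, and it is not the analysis procedure of [102, 103].

```python
def f0_mean_and_range(f0_contour):
    """Return (mean, range) in Hz over voiced frames of an F0 contour.

    Frames with F0 <= 0 are treated as unvoiced and skipped (an assumed convention).
    """
    voiced = [f0 for f0 in f0_contour if f0 > 0]
    if not voiced:
        raise ValueError("contour contains no voiced frames")
    mean_f0 = sum(voiced) / len(voiced)
    f0_range = max(voiced) - min(voiced)
    return mean_f0, f0_range
```

Computing these two values separately for a speaker's baseline recording and for the recording under examination gives the kind of paired comparison described in the text.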
V Distress Datasets
The development of automated systems requires historical data that links physical properties such as speech and facial expressions to distress labels. In this section, we identify and discuss various datasets that have previously been used for distress identification. As depression is a possible after-effect of distress, most of these datasets were built to diagnose depression. This section concludes with a sector-wise summary of the studies (health, aged care, and forensic) and the datasets they use, in Table II.
V-A Distress Analysis Interview Corpus (DAIC)
The Distress Analysis Interview Corpus (DAIC) [168] includes semi-structured clinical interviews of participants to enable the diagnosis of psychological distress conditions such as depression, anxiety, and post-traumatic stress disorder. The interviews were conducted by humans, human-controlled agents and autonomous agents. Overall, the data consists of audio, video, and questionnaire responses of participants, and each interview is labelled with a depression score using PHQ-9. A portion of this dataset, which also contains transcriptions of the interviews, was released in the depression sub-challenge of the Audio/Visual Emotion Challenge (AVEC) 2016 [169].
V-B Aged and Non-Aged Corpus (AD80)
This corpus was recorded for the adaptation of a standard Automatic Speech Recognition (ASR) system to aged voices [47]. It contains recordings from 95 speakers who were asked to read distress and casual sentences. These sentences comprise home automation orders and distress calls that could be uttered by an elderly person in distress or fall situations.
V-C AVEC Corpus 2013
This dataset contains 340 video recordings of subjects performing Human-Computer Interaction tasks [148]. There were 292 speakers in total, and the length of each recorded video clip is between 20 and 50 minutes. The level of depression in the recordings was labelled using the Beck Depression Inventory (BDI-II) [170]. The AVEC 2014 corpus [171] is a portion of this dataset which contains 300 videos with durations from 6 seconds to 4 minutes.
V-D SDC (suicidal, depressed, and control subjects)
This database is a collection of different datasets. The suicidal corpus was collected from existing datasets [172] that were recorded from phone conversations, treatment sessions, and suicide notes. Depression-related samples were obtained from Vanderbilt II and the depression dataset used by Hollon et al. [173]. DSM-IV and ICD-9-CM (International Classification of Diseases, Ninth Revision, Clinical Modification) criteria were used for depressed patients. For the control group sample, the Vanderbilt II dataset was used.
V-E Pitt Depression Dataset
This is a clinically validated depression dataset collected during the treatment of depressed patients at the University of Pittsburgh (Pitt) [174]. All participants were from a clinical trial and met DSM-IV criteria for major depression. A total of 57 patients were assessed for depression severity using the HRSD clinical interview. Interviews were recorded in audio-video format and depression was evaluated by clinicians.
V-F Black Dog dataset
This audio-visual dataset was recorded by the Black Dog Institute, Australia [175]. Over 40 depressed individuals (both male and female) were interviewed and asked to read sentences. The audio-video recordings of subjects include self-directed speech, related facial expressions, and body language.
V-G Cincinnati Children’s Interview Corpus (CCIC)
This dataset [67] includes interviews of 60 child patients (average age 15.47 years) at the Emergency Department of Cincinnati Children’s Hospital Medical Center. These children came to the hospital due to suicidal ideation, gestures, and attempts. Data were collected by a professional social worker. Because the interviews of suicidal and non-suicidal patients were lengthy, only 60 seconds of speech for each participant is utilised for the analysis [176].
VI Discussions
A search of the literature for research on the application of automation in screening for distress found that most papers in the health area use post-distress conditions, including depression, anxiety, and Post Traumatic Stress Disorder (PTSD), as a proxy to determine the presence of distress, rather than screening for distress itself. More broadly, it is evident from Table II, which summarises 75 studies relating to automated approaches to screening for distress in the aged care, forensic and health care settings, that the focus is mostly on depression and PTSD. These studies have statistically analysed depression, anxiety, and PTSD against distress and reported that distress is highly correlated with these dimensions [177, 178, 179]. Only two studies (see Table II) [59, 122] were found that targeted distress specifically using automated methods. However, even these two studies statistically correlated depression, anxiety, and PTSD to categorise distress (high, low, unclear). In cancer care, this approach, in which other conditions such as depression are used as a proxy or signal for distress, has limitations, as patients with distress may not have depression when measured with the existing scales [180] and as such may not be detected. Future research needs to focus on automated approaches to screening for distress in cancer patients which are independent of other related conditions (such as anxiety and depression) and are designed specifically to identify symptoms associated with distress. Guidance can be found in promising work from the forensic [103] and health [52] areas, where it has been shown that speech independently carries latent properties for inferring distress.
Existing tools to screen for distress vary in complexity, have been criticised as being costly in terms of time and resources, and have failed to attract widespread or routine implementation (see Table I). Evidence-based automated approaches which efficiently and effectively screen for distress without adding to patient or staff burden may well be the future of screening for distress. Such an approach would triage high-distress individuals for the attention of professional staff for further assessment or referral, consistent with a tiered model of care [181].
Currently, a number of distress screening tools are available, but as we report in Table I, most of these tools have many questions and require a considerable amount of time to complete. More importantly, it has been found that screening patients with multiple scales can appreciably improve the accuracy of results compared to a single scale [182]. However, such a multi-scale approach will further increase the screening time. Due to busy practices, oncologists are already reluctant to use distress screening tools, so a further increase in screening time will not be welcomed by oncology practices. Moreover, in aged care and forensic scenarios, real-time distress inference is sought, so time-consuming screening techniques will not be very useful. Automated distress detection and screening is therefore inevitable.
Datasets will play a vital role in developing an independent and automated distress detection system. From Section V, we found that most of the available datasets were recorded for depression, anxiety, and the assessment of suicidal behaviour. Very few datasets, such as AD80, are designed for distress in the elderly population. However, AD80 only focuses on the detection of distress sentences using Automatic Speech Recognition, which focuses on the words (e.g., “help”) spoken by individuals. In addition, each dataset has been recorded in a different environment and validated using different scales, which makes them hard to use for developing an automated distress detection tool. Therefore, it is crucial to develop large-scale validated datasets for the automated detection of distress.