ABSTRACT
Introduction: Voice features could be a sensitive marker of affective state in bipolar disorder (BD). Smartphone apps offer an excellent opportunity to collect voice data in a natural setting and could become a useful tool for predicting BD phases.
Aims of the study: We investigate the relations between the symptoms of BD, evaluated by psychiatrists, and patients' voice characteristics. A smartphone app extracted acoustic parameters from the daily phone calls of n = 51 patients. We show how the prosodic, spectral, and voice quality features correlate with clinically assessed affective states and explore their usefulness in predicting the BD phase.
Methods: A smartphone app (BDmon) was developed to collect the voice signal and extract its physical features. BD patients used the application for 208 days on average. Psychiatrists assessed the severity of BD symptoms using the 17-item Hamilton Depression Rating Scale and the Young Mania Rating Scale. We analyze the relations between acoustic features of speech and patients' mental states using generalized linear mixed-effects models.
Results: The prosodic, spectral, and voice quality parameters are valid markers for assessing the severity of manic and depressive symptoms. The accuracy of the predictive generalized linear mixed-effects model is 70.9%-71.4%. Significant differences in effect sizes and directions are observed between the female and male subgroups. In males, the greater the severity of mania, the louder (β = 1.6) and higher the tone of voice (β = 0.71), the more clearly (β = 1.35) and more sharply (β = 0.95) they speak, and the longer their conversations are (β = 1.64). For females, the pattern is either exactly the opposite or absent: the greater the severity of mania, the quieter (β = -0.27) and lower the tone of voice (β = -0.21) and the less clearly they speak (β = -0.25), while no correlation is found for length of speech. On the other hand, the greater the severity of bipolar depression in males, the quieter (β = -1.07) and less clearly (β = -1.00) they speak. In females, no distinct correlations between the severity of depressive symptoms and the change in voice parameters are found.
Conclusions: Speech analysis provides physiological markers of affective symptoms in BD, and acoustic features extracted from speech are effective in predicting BD phases. This could personalize monitoring and care for BD patients, helping to decide whether a specialist should be consulted.
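As a rough illustration of the model family named in the Methods, a mixed-effects regression with a patient-level random intercept can be written as below. All symbols are assumptions introduced here for illustration: i indexes patients, j indexes repeated observations of patient i, g is a link function, u_i is the patient-specific random intercept, and β_1 is a fixed effect of the kind reported as β in the Results. The abstract does not state which variable is the response, which link is used, or whether random slopes are included, so this is a sketch of the general form only.

```latex
g\bigl(\mathbb{E}[\,y_{ij} \mid u_i\,]\bigr) = \beta_0 + \beta_1 x_{ij} + u_i,
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2)
```

The random intercept u_i accounts for the fact that many calls come from the same patient and are therefore not independent observations.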
ABSTRACT
An important step in classifying bipolar disorder episodes is feature selection, which identifies the most relevant factors in patients’ behavior. The features in this task are associated with costs. Besides basic (low-cost) information about patients’ phone calls and text messages, we study the impact of (high-cost) acoustic features on classifying patients’ states. Unlike previous papers, we take these costs into account and therefore apply cost-constrained methods. The purpose of this paper is to examine whether a cost-constrained feature selection procedure can improve the performance of the classification model while reducing the cost of making predictions. Moreover, we try to determine whether relatively high performance can be maintained with a reduced number of expensive features. We use a filter feature selection method based on information theory. In the cost-constrained modification, we add a cost factor parameter that controls the trade-off between a feature’s importance and its cost. The experiments were performed on a large medical database collected from patients with bipolar disorder during their daily mobile calls. The results indicate that the cost-constrained method achieves better results than traditional feature selection when the budget is limited.
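The trade-off controlled by the cost factor can be sketched as a greedy filter. This is a minimal illustration under stated assumptions, not the paper's procedure: the function name select_features, the penalized score relevance - cost_factor * cost, and all feature names and numbers are hypothetical, and the relevance scores are assumed to be precomputed (for example, as mutual information with the class label). Redundancy between features, which information-theoretic filters usually also consider, is ignored here.

```python
from typing import Dict, List


def select_features(relevance: Dict[str, float],
                    cost: Dict[str, float],
                    budget: float,
                    cost_factor: float) -> List[str]:
    """Greedily pick affordable features by the penalized score
    relevance - cost_factor * cost until no remaining feature fits the budget."""
    selected: List[str] = []
    remaining = budget
    candidates = set(relevance)
    while candidates:
        affordable = [f for f in candidates if cost[f] <= remaining]
        if not affordable:
            break
        # The feature name is a secondary key so that ties resolve deterministically.
        best = max(affordable,
                   key=lambda f: (relevance[f] - cost_factor * cost[f], f))
        selected.append(best)
        remaining -= cost[best]
        candidates.remove(best)
    return selected


# Hypothetical scores: acoustic features are more relevant but more expensive.
relevance = {"call_count": 0.10, "sms_count": 0.08, "pitch": 0.30, "loudness": 0.25}
cost = {"call_count": 1.0, "sms_count": 1.0, "pitch": 5.0, "loudness": 5.0}

no_penalty = select_features(relevance, cost, budget=7.0, cost_factor=0.0)
with_penalty = select_features(relevance, cost, budget=7.0, cost_factor=0.1)
```

With cost_factor set to 0 the sketch behaves like a traditional relevance ranking restricted to the budget; a larger cost factor moves cheap features ahead of expensive ones.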
ABSTRACT
Intelligent systems for the medical domain often require processing data streams that evolve over time and are only partially labeled. At the same time, explanations are of utmost importance, not only because of various regulations but also to increase trust among the systems’ users. In this work, we consider an online data-driven learning method that focuses on the explainability of evolving models equipped with incremental semi-supervised learning algorithms. The proposed method combines: (i) the Dynamic Incremental Semi-Supervised Fuzzy C-Means (DISSFCM) algorithm to incrementally classify subsets of data; with (ii) Linguistic Summarization, which provides explanations of the classification results in terms of short sentences in a natural language. The approach is illustrated on streaming data collected from voice calls of patients affected by Bipolar Disorder. The results show the effectiveness of the proposed method in classifying instances belonging to healthy and affective states, and in explaining the approximate reasoning behind the classification of new acoustic data related to patients.
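To make the idea of a linguistic summary concrete, the sketch below computes the degree of truth of a sentence such as "Most calls have high loudness" using the classic protoform "Q objects are P", where the truth value is the quantifier's membership of the average membership in P. It is an illustration under stated assumptions, not the method of the paper: the function names, the piecewise-linear membership shapes, their breakpoints, and the normalization of feature values to [0, 1] are all hypothetical.

```python
from typing import Sequence


def ramp(x: float, lo: float, hi: float) -> float:
    """Piecewise-linear membership: 0 at or below lo, 1 at or above hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def mu_high(value: float) -> float:
    """Membership of a normalized feature value in the fuzzy set 'high' (assumed shape)."""
    return ramp(value, 0.4, 0.8)


def mu_most(proportion: float) -> float:
    """Membership of a proportion in the fuzzy quantifier 'most' (assumed shape)."""
    return ramp(proportion, 0.3, 0.8)


def truth_of_summary(values: Sequence[float]) -> float:
    """Degree of truth of 'Most objects have a high value'."""
    if not values:
        raise ValueError("at least one value is required")
    proportion = sum(mu_high(v) for v in values) / len(values)
    return mu_most(proportion)


# Hypothetical normalized loudness of four calls assigned to one class.
truth = truth_of_summary([0.9, 0.9, 0.9, 0.1])
```

A summary would typically be reported to the user only when its degree of truth is high enough, which keeps the explanation to a few short sentences.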
ABSTRACT
We introduce an approach called PLENARY (exPlaining bLack-box modEls in Natural lAnguage thRough fuzzY linguistic summaries), which is an explainable classifier based on a data-driven predictive model. Neural learning is exploited to derive a predictive model based on two levels of labels associated with the data. Then, model explanations are derived through the popular SHapley Additive exPlanations (SHAP) tool and conveyed in a linguistic form via fuzzy linguistic summaries. The linguistic summarization allows translating the explanations of the model outputs provided by SHAP into statements expressed in natural language. PLENARY accounts for the imprecision related to model outputs by summarizing them into simple linguistic statements and for the imprecision related to the data labeling process by including additional domain knowledge in the form of middle-layer labels. PLENARY is validated on preprocessed speech signals collected from smartphones from patients with bipolar disorder and on publicly available mental health survey data. The experiments confirm that fuzzy linguistic summarization is an effective technique to support meta-analyses of the outputs of AI models. Also, PLENARY improves explainability by aggregating low-level attributes into high-level information granules, and by incorporating vague domain knowledge into a multi-task sequential and compositional multilayer perceptron. SHAP explanations translated into fuzzy linguistic summaries significantly improve understanding of the predictive modelling process and its outputs.
ABSTRACT
In this work, inspired by the interpretability and usefulness of statistical process control, we propose a novel procedure for simultaneous monitoring of multiple processes that is based on a neural network with learnable activation functions. The proposed procedure for learning control limits with a neural network (CONNF) is aimed at scenarios where labeled data are available and makes use of these labels. CONNF can be particularly useful in monitoring processes when the amount of run-in data is insufficient or the cost of obtaining such data is high. We illustrate the performance of the CONNF method with a simulation study and preliminary results for real-life data collected from smartphones of patients with diagnosed bipolar disorder. These results show the potential of CONNF and indicate further research directions.
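For context, the classical control chart that motivates this work estimates its limits from run-in data alone. The sketch below shows that baseline (center line plus or minus k standard deviations), not CONNF itself, which learns the limits with a neural network and uses labels. The function names, the choice k = 3, and the numbers are assumptions made for illustration.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def control_limits(run_in: Sequence[float], k: float = 3.0) -> Tuple[float, float]:
    """Lower and upper control limits estimated from in-control run-in data."""
    center = mean(run_in)
    spread = stdev(run_in)  # sample standard deviation; needs at least two points
    return center - k * spread, center + k * spread


def out_of_control(series: Sequence[float], limits: Tuple[float, float]) -> List[int]:
    """Indices of observations that fall outside the control limits."""
    lower, upper = limits
    return [i for i, x in enumerate(series) if x < lower or x > upper]


# Hypothetical run-in measurements of one monitored process.
limits = control_limits([10, 11, 9, 10, 10, 11, 9, 10])
alarms = out_of_control([10, 12, 13, 7, 9], limits)
```

The quality of such limits depends directly on how much run-in data is available, which is the situation the abstract describes as a motivation for learning the limits instead.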
ABSTRACT
Currently, it is possible to collect a large amount of data from sensors. At the same time, data are often only partially labeled. For example, in the context of smartphone-based monitoring of mental state, far more data are collected from smartphones than from psychiatrists’ assessments of the mental state. The approach presented in this paper is designed to examine whether unlabeled data can improve classification accuracy in the considered case study of classifying a patient’s state. First, unlabeled data are represented by cluster memberships, obtained with the Fuzzy C-means algorithm, which correspond to the uncertainty of the patient’s condition in this disease. Second, the classification is performed using two well-known algorithms, Random Forest and SVM. The obtained results indicate a minimal improvement in classification quality thanks to the use of cluster memberships. These results are promising in terms of both accuracy and interpretability.
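The first step can be sketched as follows: given cluster centers, the standard Fuzzy C-means membership formula turns each observation into a vector of membership degrees that sum to one. This is a minimal sketch under stated assumptions. The centers are assumed to be already fitted (full Fuzzy C-means alternates between updating memberships and centers), the helper names are hypothetical, and appending the memberships to the original features is one possible reading of the abstract rather than a confirmed detail.

```python
import math
from typing import List, Sequence


def fcm_memberships(x: Sequence[float],
                    centers: Sequence[Sequence[float]],
                    m: float = 2.0) -> List[float]:
    """Fuzzy C-means membership degrees of point x for fixed cluster centers.

    m > 1 is the fuzzifier; larger values give softer memberships.
    """
    dists = [math.dist(x, c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_k / d_j) ** exponent for d_j in dists) for d_k in dists]


def augment(x: Sequence[float],
            centers: Sequence[Sequence[float]],
            m: float = 2.0) -> List[float]:
    """Original features followed by the cluster membership degrees."""
    return list(x) + fcm_memberships(x, centers, m)


# Two hypothetical cluster centers in a two-dimensional feature space.
centers = [(0.0, 0.0), (2.0, 0.0)]
u = fcm_memberships((0.5, 0.0), centers)
row = augment((0.5, 0.0), centers)
```

The augmented rows could then be passed to any standard classifier, such as the Random Forest and SVM mentioned in the abstract.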
ABSTRACT
Semi-supervised learning has gained great interest because of its ability to combine unlabeled data with – potentially few – labeled observations in a training process. However, in some application contexts, one can question whether all available labels are equally valid. For example, in the context of bipolar disorder (BD) remote monitoring, a common practice is to extrapolate the psychiatrist’s assessment onto some fixed time window surrounding the visit, the so-called ground truth period. In consequence, all data from this period are labeled with the same category. Such an approach may result in misguided supervision that affects the model’s performance. In this paper, we consider the problem of label uncertainty, assuming that the labels are crisp but may be assigned to particular observations with varying confidence. We propose a novel method called Confidence Path Regularization (CPR) that incorporates this uncertainty into fuzzy c-means semi-supervised learning. CPR handles label uncertainty automatically and in a data-driven way by estimating a confidence factor for each labeled observation. In addition, CPR allows for the exploration of potential class-specific patterns in the adjusted confidence. The proposed method is illustrated with experiments on partially labeled data about speech characteristics collected from a smartphone application for BD monitoring. In this applied scenario, we also use additional contextual data to improve the construction of confidence paths. It is shown that the proposed CPR approach makes it possible to reflect the varying confidence in labels, in contrast to the nominal approach, which assigns the majority of observations to the same class associated with the relevant ground truth period.
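The nominal labeling practice that CPR is compared against can be sketched as follows: every day within a fixed window around a visit inherits the psychiatrist's assessment, and all other days stay unlabeled. This is an illustration only. The function name, the symmetric window, its 7-day half-width, and the rule of taking the nearest visit when windows overlap are assumptions, and the sketch does not implement CPR, which would additionally estimate a confidence for each of these labels.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional, Sequence, Tuple


def label_days(days: Sequence[date],
               visits: Sequence[Tuple[date, str]],
               half_window: int = 7) -> Dict[date, Optional[str]]:
    """Assign each day the state assessed at the nearest visit within the window.

    Days farther than half_window days from every visit remain unlabeled (None).
    """
    labels: Dict[date, Optional[str]] = {}
    for day in days:
        labels[day] = None
        nearest_gap: Optional[int] = None
        for visit_day, state in visits:
            gap = abs((day - visit_day).days)
            if gap <= half_window and (nearest_gap is None or gap < nearest_gap):
                nearest_gap = gap
                labels[day] = state
    return labels


# Hypothetical monitoring period with a single visit on 10 January.
days: List[date] = [date(2020, 1, 1) + timedelta(days=i) for i in range(20)]
labels = label_days(days, [(date(2020, 1, 10), "depression")])
```

In this scheme a call recorded a week before the visit receives the same crisp label, with the same weight, as a call recorded on the day of the visit, which is the kind of label uncertainty the abstract addresses.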
ABSTRACT
Clinical practice confirms that speech can support the diagnosis of several mental disorders. For example, reduced speech activity, changes in specific voice features, and pause-related measures were found to be sensitive markers of depressive symptoms. Considering the possibility of continuous speech data collection via a smartphone app, voice analysis has great potential for monitoring mental states. Nevertheless, there is still a need to select the most effective validation approaches for the task of predicting the mental state. These validation approaches should take into account that the data collected from sensors and the response variables considered in this bipolar disorder (BD) application are subject to various sources of uncertainty. The aim of the study is to perform an experimental evaluation of the accuracy of top-performing crisp and fuzzy methods, such as Naive Bayes Network, SOTA algorithm, Fuzzy Rule, Probabilistic Neural Network, Decision Tree, Gradient Boosted Tree, Random Forest, Tree Ensemble, and an ensemble approach that combines them. For each of these methods, various training and testing scenarios are considered, each using a given percentage of all observations. Additionally, the results from multiple methods are aggregated using the dominant function, that is, the most frequent rating is taken; a metric based on fuzzy numbers is also considered for comparison. The preliminary results of numerical experiments are promising. The sensitive point is the vicinity of the threshold of transition to a disease state. Because the differences inherent in such cases are minor, it seems intuitive to use fuzzy numbers to determine the patient’s assessment. The experiments also confirmed that the ranking of methods depends on the choice of the training set and evaluation metric.
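The aggregation by the dominant function amounts to taking the most frequent rating among the individual methods. A minimal sketch follows; the function name, the example ratings, and the tie-breaking rule (the rating that appears first in the input wins) are assumptions, since the abstract does not say how ties are resolved.

```python
from collections import Counter
from typing import Sequence


def dominant(ratings: Sequence[str]) -> str:
    """Return the most frequent rating; ties go to the rating seen first."""
    if not ratings:
        raise ValueError("at least one rating is required")
    counts = Counter(ratings)
    top = max(counts.values())
    for rating in ratings:
        if counts[rating] == top:
            return rating
    raise AssertionError("unreachable: some rating always has the top count")


# Hypothetical ratings of one observation by three different classifiers.
aggregated = dominant(["euthymia", "depression", "depression"])
```

Near the threshold between a healthy and a disease state, individual methods are more likely to disagree, which is where the choice of tie-breaking rule, or a fuzzy alternative to a crisp vote, matters most.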