We have just submitted a paper to the World Congress on Computational Intelligence introducing Confidence Path Regularization for handling label uncertainty in semi-supervised learning: use case in bipolar disorder monitoring.
Semi-supervised learning has gained great interest because of its ability to combine unlabeled data with a potentially small number of labeled observations during training. However, in some application contexts, one can question whether all available labels are equally valid. For example, in remote monitoring of bipolar disorder (BD), a common practice is to extend a psychiatrist’s diagnosis onto a fixed time window surrounding the visit, the so-called ground truth period. As a consequence, all data collected within this period is associated with the same psychiatric label. However, such an approach may result in misguided supervision that affects the model. In this paper, we consider the problem of label uncertainty, assuming that the labels are crisp but are assigned to particular observations with varying confidence. We propose a novel method called Confidence Path Regularization (CPR) that estimates confidence values associated with labels and incorporates them into semi-supervised fuzzy c-means learning. CPR handles label uncertainty automatically and in a data-driven way by estimating a confidence factor for each labeled observation. In addition, the method allows for exploration of potential class-specific patterns in the adjusted confidence. The proposed method is illustrated with experiments on partially labeled speech data collected from a smartphone application for BD monitoring. In this applied scenario, we also use additional contextual data to improve the construction of confidence paths. The results show that the proposed CPR approach reflects the varying confidence in labels, in contrast to the nominal approach, which assigns the majority of observations to the predefined classes.
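
To give a feel for how a per-label confidence can enter semi-supervised fuzzy c-means, here is a minimal sketch in Python. It is not the CPR procedure itself: the abstract does not specify how confidences are estimated, so they are simply passed in as inputs. The sketch assumes the classical semi-supervised fuzzy c-means objective with fuzzifier 2, one prototype per class, and a supervision weight scaled by each observation's confidence. The function name `confidence_weighted_memberships` and the parameter `alpha` are illustrative and do not come from the paper. Under these assumptions, the membership of a labeled observation is a convex combination of its unsupervised membership and its crisp label, so a confidence of 0 falls back to plain fuzzy c-means.

```python
import numpy as np


def confidence_weighted_memberships(X, centers, labels, confidence, alpha=1.0):
    """Membership update for semi-supervised fuzzy c-means (fuzzifier m = 2).

    X          : (n, d) observations
    centers    : (c, d) current prototypes, one per class (assumption)
    labels     : (n,) class index, or -1 for unlabeled observations
    confidence : (n,) values in [0, 1]; ignored where labels == -1
    alpha      : global weight of the supervised term (hypothetical parameter)
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    labels = np.asarray(labels, dtype=int)
    confidence = np.asarray(confidence, dtype=float)
    n, c = X.shape[0], centers.shape[0]

    # Squared Euclidean distances to every prototype, floored to avoid division by zero.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)

    # Unsupervised fuzzy c-means memberships for m = 2.
    inv = 1.0 / d2
    u_fcm = inv / inv.sum(axis=1, keepdims=True)

    # One-hot encoding of the crisp labels; rows stay all-zero for unlabeled data.
    F = np.zeros((n, c))
    labeled = labels >= 0
    F[labeled, labels[labeled]] = 1.0

    # Per-observation supervision strength; zero for unlabeled observations.
    a = np.where(labeled, alpha * confidence, 0.0)[:, None]

    # Convex combination of the unsupervised membership and the label.
    return (u_fcm + a * F) / (1.0 + a)


# Toy usage: the third point lies closer to prototype 0 but is labeled as class 1.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.4, 0.0]])
centers = np.array([[0.0, 0.0], [1.0, 0.0]])
labels = [-1, -1, 1]
U_low = confidence_weighted_memberships(X, centers, labels, confidence=[0.0, 0.0, 0.0])
U_high = confidence_weighted_memberships(X, centers, labels, confidence=[0.0, 0.0, 1.0])
```

In the toy usage, raising the confidence of the third observation pulls its membership toward the labeled class, while a confidence of zero leaves it where the geometry alone would place it. A full algorithm would alternate this step with a prototype update, which is omitted here.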