Data Uncertainty refers to the lack of precision or confidence in the accuracy of data. Uncertainty can arise due to various factors, including measurement errors, incomplete information, or variability in the data source. Managing and quantifying uncertainty is crucial for making informed decisions based on the available data.


UAC: An Uncertainty-Aware Face Clustering Algorithm

We investigate ways to leverage uncertainty in face images to improve the quality of the face clusters. We observe that popular clustering algorithms do not produce better quality clusters when clustering probabilistic face representations that implicitly model uncertainty – these algorithms predict up to 9.6X more clusters than the ground truth for the IJB-A benchmark. We empirically analyze the causes for this unexpected behavior and identify excessive false-positives and false-negatives (when comparing face-pairs) as the main reasons for poor quality clustering. Based on this insight, we propose an uncertainty-aware clustering algorithm, UAC, which explicitly leverages uncertainty information during clustering to decide when a pair of faces are similar or when a predicted cluster should be discarded. UAC considers (a) uncertainty of faces in face-pairs, (b) bins face-pairs into different categories based on an uncertainty threshold, (c) intelligently varies the similarity threshold during clustering to reduce false-negatives and false-positives, and (d) discards predicted clusters that exhibit a high measure of uncertainty. Extensive experimental results on several popular benchmarks and comparisons with state-of-the-art clustering methods show that UAC produces significantly better clusters by leveraging uncertainty in face images – predicted number of clusters is up to 0.18X more of the ground truth for the IJB-A benchmark.