Publication Date: 6/16/2020
Event: CVPR 2020
Reference: pp. 11854-11862, 2020
Authors: Yuqing Zhu, University of California, Santa Barbara, NEC Laboratories America, Inc.; Xiang Yu, NEC Laboratories America, Inc.; Manmohan Chandraker, NEC Laboratories America, Inc., University of California, San Diego; Yu-Xiang Wang, University of California, Santa Barbara
Abstract: With increasing ethical and legal concerns on privacy for deep models in visual recognition, differential privacy has emerged as a mechanism to disguise membership of sensitive data in training datasets. Recent methods like Private Aggregation of Teacher Ensembles (PATE) leverage a large ensemble of teacher models trained on disjoint subsets of private data, to transfer knowledge to a student model with privacy guarantees. However, labeled vision data is often expensive and datasets, when split into many disjoint training sets, lead to significantly sub-optimal accuracy and thus hardly sustain good privacy bounds. We propose a practically data-efficient scheme based on private release of k-nearest neighbor (kNN) queries, which altogether avoids splitting the training dataset. Our approach allows the use of privacy-amplification by subsampling and iterative refinement of the kNN feature embedding. We rigorously analyze the theoretical properties of our method and demonstrate strong experimental performance on practical computer vision datasets for face attribute recognition and person reidentification. In particular, we achieve comparable or better accuracy than PATE while reducing more than 90% of the privacy loss, thereby providing the “most practical method to-date” for private deep learning in computer vision.
Publication Link: https://ieeexplore.ieee.org/document/9156598
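Illustrative sketch: The abstract describes answering a query by privately releasing a kNN vote over the full private dataset, with privacy amplification by subsampling. The Python sketch below shows one plausible shape of a single such query; it is not the authors' implementation. The function name `private_knn_label`, the parameters `k`, `gamma`, and `sigma`, the use of Poisson subsampling, the Euclidean distance, and the Gaussian noise added to the vote counts are all assumptions made for illustration. The sketch performs no privacy accounting and omits the iterative refinement of the feature embedding mentioned in the abstract.

```python
import math
import random
from collections import Counter


def private_knn_label(query, features, labels, num_classes,
                      k=5, gamma=0.5, sigma=1.0, rng=None):
    """Return a noisy kNN label for `query` (hypothetical helper, illustration only).

    features: list of equal-length numeric tuples (private feature vectors)
    labels:   list of integer class labels in range(num_classes)
    gamma:    assumed per-record subsampling probability
    sigma:    assumed standard deviation of Gaussian noise on each vote count
    """
    if rng is None:
        rng = random.Random()

    # Subsampling step: keep each private record independently with probability gamma.
    kept = [i for i in range(len(features)) if rng.random() < gamma]

    # kNN step: order the kept records by Euclidean distance to the query.
    kept.sort(key=lambda i: math.dist(query, features[i]))

    # Each of the (at most) k nearest kept records casts one vote for its label.
    votes = Counter(labels[i] for i in kept[:k])

    # Noisy release: perturb every class count, then report only the argmax.
    noisy = [votes.get(c, 0) + rng.gauss(0.0, sigma) for c in range(num_classes)]
    return max(range(num_classes), key=noisy.__getitem__)


# Toy data: two well-separated clusters with labels 0 and 1.
toy_features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_labels = [0, 0, 0, 1, 1, 1]
label = private_knn_label((0.2, 0.2), toy_features, toy_labels,
                          num_classes=2, k=3, rng=random.Random(0))
```

In a PATE-style pipeline, labels released this way would be used to train a student model on public data. Note that, unlike PATE, no teacher ensemble is trained on disjoint splits here: every query votes over a subsample of the whole private dataset.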