PrivGene

This work focuses on analysis tasks that involve model fitting, i.e., finding the parameters of a statistical model that best fit the dataset. For such tasks, the quality of the differentially private results depends on both the effectiveness of the model fitting algorithm and the amount of perturbation required to satisfy the privacy guarantee. Most previous studies start from a state-of-the-art, non-private model fitting algorithm and develop a differentially private version of it. Unfortunately, many model fitting algorithms require intensive perturbation to satisfy ε-differential privacy, leading to poor overall result quality.

Motivated by this, we propose PrivGene, a general-purpose differentially private model fitting solution based on genetic algorithms (GA). PrivGene needs significantly less perturbation than previous methods and achieves higher overall result quality, even for model fitting tasks where GA is not the first choice without privacy considerations. Further, PrivGene performs the random perturbation using a novel technique called the enhanced exponential mechanism, which improves over the exponential mechanism by exploiting the special properties of model fitting tasks. As case studies, we apply PrivGene to three common analysis tasks involving model fitting: logistic regression, SVM classification, and k-means clustering. Extensive experiments using real data confirm the high result quality of PrivGene and its superiority over existing methods.
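
For readers unfamiliar with the baseline that PrivGene improves on, the following is a minimal sketch of the standard exponential mechanism used as a private selection step over candidate parameter vectors. It is not the enhanced exponential mechanism, whose details are not given here. The function name `exponential_mechanism`, its parameters, and the toy fitness function are assumptions made for illustration only.

```python
import math
import random


def exponential_mechanism(candidates, score, epsilon, sensitivity, rng=random):
    """Sample one candidate with probability proportional to
    exp(epsilon * score(c) / (2 * sensitivity)).

    This is the standard exponential mechanism, shown as a baseline;
    `sensitivity` is assumed to bound how much `score` can change when
    one record of the dataset changes.
    """
    scores = [score(c) for c in candidates]
    # Shift by the maximum score for numerical stability; a constant
    # shift leaves the sampling distribution unchanged.
    top = max(scores)
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical usage: privately pick one of a few candidate parameter values.
candidates = [0.0, 0.5, 1.0, 1.5]
fitness = lambda w: -abs(w - 1.0)  # toy fitness; sensitivity assumed to be 1
chosen = exponential_mechanism(candidates, fitness, epsilon=1.0, sensitivity=1.0)
```

In a genetic algorithm, a step like this could replace the non-private "keep the fittest candidates" selection: a larger `epsilon` concentrates the choice on high-fitness candidates, while a smaller one makes it closer to uniform.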


Publication