Hi. I have a question about the L1 bound for the mean function in the ADD/DROP ONE scenario. A tighter bound can be obtained on the local L1 sensitivity. Given a database X with n entries, with lower and upper bounds m and M on all possible entries, the bound = max( |mean(X) - M|, |mean(X) - m| ) / n should hold.
Since you're using Laplace noise for the mean function, wouldn't using this for the scale of the distribution be better? As the noise would be less but the guarantees would be the same. Moreover, it shouldn't be computationally expensive as (I think) the algorithm computes the mean anyway.
Progressing with my benchmark, I have also tested the libraries with real datasets. 73 epsilons, 500 runs per epsilon for statistical queries: count, sum, mean, var.
You may find up a histogram of the dataset used, here is the link: https://archive.ics.uci.edu/ml/datasets/Adult
We chose the numerical attributes age and hours-per-week for the queries.
As you may observe in the other two pictures below, there is a weird behavior with SmartNoise when epsilon equals 10 onwards. This problem is prevalent in sum, mean, and var.
However, we run another test with this dataset: https://archive.ics.uci.edu/ml/datasets/Student+Performance
We chose absence days and final grade, and the results were as expected, no weird behaviour.
smartnoise-samplesrepo and offered and issue and PR.
@amanjeev We would really appreciate your contributions to the OpenDP library! We are working on replacing SmartNoise-Core with the OpenDP library: https://github.com/opendp/opendp. A few things this library does better- there is a more elegant representation of differential privacy via relations, it removes the protobuf layer (a source of significant serialization overhead), and has generic implementations of components (now called transformations and measurements). There is an early contributor guide here: http://test-docs.opendp.org/contrib/python.html.
We're open to PRs for anything, but you may find implementing new transformations in Rust the easiest place to start. You probably have a better sense of what specific transformations/measurements you are needing. I'd be happy to provide mentoring/guidance here.