Differential Privacy
This page describes how FedDDL applies differential privacy (DP) to model updates. DP is off by default and must be explicitly enabled in your training session.
Learn by doing?
See an example workload of MNIST with differential privacy here!
What is Differential Privacy?
Differential Privacy (DP) is a technique that enhances the privacy of your final model by adding calibrated noise, drawn from a probability distribution, to the model. However, it is important to know exactly what differential privacy does and does not cover, so that you can keep your dataset and model safe.
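For context, here is the standard (ε, δ) definition from the DP literature; FedDDL's configuration exposes a clipping bound and a noise level rather than ε directly (see Configuration below). A randomized mechanism M is (ε, δ)-differentially private if, for every pair of datasets D and D′ that differ in a single record and every set of possible outputs S:

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S] + \delta
```

Here ε is the privacy budget mentioned below: the smaller it is, the less the model's output distribution can depend on any single record.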
What DP covers
Configured correctly, differential privacy anonymizes your final model so that an observer cannot reliably determine whether any given piece of training material was included or excluded. For example, phone keyboard suggestion models often use DP so that users can contribute to a global keyboard model while each user's privacy remains intact.
What DP does not cover
Differential privacy does not:
Make your dataset more secure. Treat all datasets that are sent to clients via the Obit Network as public data.
Offer guarantees during training. Early in training, the model is not yet differentially private; the DP guarantee applies only once a sufficient amount of the privacy budget has been used.
Completely prevent reverse-engineering of the model. There is always a small chance that the privacy guarantee will fail. If DP is applied correctly, this chance should be almost negligible (similar to how there is a small chance of brute-forcing a cryptographic hash function like SHA-256, but that chance is negligible enough that it is considered safe to use).
Configuration (session)
Enable DP on a session before starting training:
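The exact call shape depends on your FedDDL version, so treat the snippet below as a minimal sketch rather than the real API: the DpConfig interface name is illustrative, and only the field names (which match the key fields listed next) come from this page.

```ts
// Illustrative only: DpConfig is a hypothetical name for the session-level
// DP settings; the field names match the key fields documented on this page.
interface DpConfig {
  C: number;               // L2 clipping bound applied to the model delta
  noiseStd: number;        // std dev of Gaussian noise added to the clipped update
  useSvd?: boolean;        // experimental SVD-based noise; plain Gaussian if false
  truncationRank?: number; // SVD noise shape; ignored unless useSvd is true
  unfoldMode?: string;     // SVD noise shape; ignored unless useSvd is true
  seed?: number;           // optional RNG seed for reproducible noise
}

// A conservative starting configuration (see Tips below).
const dpConfig: DpConfig = {
  C: 1.0,
  noiseStd: 0.1,
  useSvd: false,
  seed: 42,
};
```

Pass this configuration to your session before training starts; since DP is off by default, omitting it leaves updates unprotected.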
Key fields:
C: L2 clipping bound applied to the model delta before noise is added.
noiseStd: standard deviation of the Gaussian noise added to the clipped update.
useSvd: toggles SVD-based noise (experimental); otherwise additive Gaussian noise is used.
truncationRank, unfoldMode: control the shape of the SVD noise; ignored if useSvd is false.
seed: optional RNG seed.
How it works
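In the default (non-SVD) path, the key fields above imply a standard clip-and-noise scheme: each model delta has its L2 norm clipped to C, then Gaussian noise with standard deviation noiseStd is added before the update is shared. The sketch below illustrates that scheme on a flat parameter vector; it is an inference from the configuration fields, not FedDDL's actual implementation.

```ts
// L2 norm of a flat parameter vector.
function l2Norm(v: number[]): number {
  return Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
}

// One sample from a standard normal distribution (Box-Muller transform).
function gaussianSample(): number {
  const u = Math.random() || Number.MIN_VALUE; // avoid log(0)
  const v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Clip the delta's L2 norm to C, then add Gaussian noise with std dev
// noiseStd. This is the additive-Gaussian path; the experimental SVD path
// (useSvd, truncationRank, unfoldMode) shapes the noise differently.
function privatizeDelta(delta: number[], C: number, noiseStd: number): number[] {
  const norm = l2Norm(delta);
  const scale = norm > C ? C / norm : 1; // clip only if the norm exceeds C
  return delta.map((x) => x * scale + noiseStd * gaussianSample());
}
```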
When to use
Use DP when training on sensitive data and sharing weight updates with a server or peers.
Expect some accuracy degradation; tune C and noiseStd to balance privacy against utility.
Tips
Start with small noise (noiseStd 0.1–0.5) and moderate clipping (C ~1.0); evaluate accuracy. A minimal tuning sweep is sketched after this list.
DP adds compute on the server during aggregation; measure the runtime impact if your model is large.
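For tuning, a simple sweep over noiseStd at a fixed clipping bound can show where accuracy starts to fall off. trainAndEvaluate below is a placeholder for your own train-then-evaluate round, not a FedDDL function.

```ts
// Placeholder: run one DP training session with the given settings and
// return test-set accuracy. Wire this to your own pipeline.
function trainAndEvaluate(cfg: { C: number; noiseStd: number }): number {
  throw new Error("not implemented: connect to your training pipeline");
}

const C = 1.0; // moderate clipping, per the tip above
for (const noiseStd of [0.1, 0.25, 0.5]) {
  const accuracy = trainAndEvaluate({ C, noiseStd });
  console.log(`C=${C}, noiseStd=${noiseStd} -> accuracy=${accuracy.toFixed(3)}`);
}
```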