Gradient Compression

This guide explains the built-in gradient compression options for the model segment exchange. Use these to reduce upload/download bandwidth per round.

Learn by doing?

See an example of MNIST running with gradient compression here!

Modes

  • none (default): no compression.

  • quantize8: per-tensor symmetric int8 quantization with a per-tensor scale; dequantized on receipt by the client/server (see the sketch after this list).

  • topk: sparsify to the top-K% largest-magnitude elements (keeps indices + values).

  • quantize8+topk or topk+quantize8: combines int8 quantization with Top-K sparsity.

  • auto: uses both int8 quantization and Top-K sparsity (if available).
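
To make these modes concrete, here is a minimal sketch of what quantize8 and topk conceptually do to a single tensor. It assumes only the descriptions above: the function names, field names, and plain-array representation are illustrative and are not the library's actual API or wire format.

// Sketch only: conceptual versions of quantize8 and topk on a plain array.

// quantize8: symmetric int8 quantization with one scale per tensor.
function quantize8(values) {
  const maxAbs = Math.max(...values.map(Math.abs), 1e-12);
  const scale = maxAbs / 127;                        // one scale for the whole tensor
  const q = Int8Array.from(values, v => Math.round(v / scale));
  return { scale, q };                               // what would be transmitted
}

function dequantize8({ scale, q }) {
  return Array.from(q, v => v * scale);              // applied on receipt
}

// topk: keep only the K% largest-magnitude entries (indices + values).
function topK(values, keepPercent) {
  const k = Math.max(1, Math.round(values.length * keepPercent / 100));
  const indices = values
    .map((_, i) => i)
    .sort((a, b) => Math.abs(values[b]) - Math.abs(values[a]))
    .slice(0, k);
  return { indices, values: indices.map(i => values[i]) };
}

const weights = [0.9, -0.02, 0.5, 0.001, -0.7, 0.3];
console.log(dequantize8(quantize8(weights))); // ≈ original, with small rounding error
console.log(topK(weights, 50));               // the 3 largest-magnitude entries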

How to set it (workloads)

In your training session configuration (supervised/unsupervised/RL), set the compression mode string after linking the session. For example:

session.js
const trainingSession = new SupervisedTrainingSession(
    ...
);
linkTrainingSession(trainingSession);

trainingSession.setCompressionMode('quantize8+topk'); // or 'none', 'topk', 'quantize8', 'auto'
trainingSession.setMetricsFunction(...);

If you don’t set it, none is used.
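
Because the mode is just a string passed to setCompressionMode, one convenient pattern is to drive it from configuration so you can compare bandwidth and accuracy without code changes. The environment variable name below is only an example:

// Hypothetical pattern: read the mode from an environment variable,
// falling back to the default of 'none' when it is not set.
const compressionMode = process.env.COMPRESSION_MODE || 'none';
trainingSession.setCompressionMode(compressionMode);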

What gets compressed

  • Model weight segments sent from the server to clients, and client updates sent back (per round).

  • Biases follow the same compression path as weights.

Caveats

  • Quantization is per-tensor; very small-magnitude tensors may lose fidelity. Compare accuracy against a run of your workflow without compression to decide whether this loss is acceptable; a quick offline check is sketched below.
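
If you want a rough feel for this error before committing to a mode, you can round-trip a layer's weights through the same per-tensor int8 scheme sketched earlier. The helper below is illustrative only and does not call the library; the sample values are placeholders.

// Rough offline fidelity check: quantize, dequantize, and report the
// worst-case absolute error for one tensor's values.
function roundTripError(values) {
  const scale = Math.max(...values.map(Math.abs), 1e-12) / 127;
  const restored = values.map(v => Math.round(v / scale) * scale);
  return Math.max(...values.map((v, i) => Math.abs(v - restored[i])));
}

const layerWeights = [0.012, -0.003, 0.0005, 0.02, -0.015];
console.log(roundTripError(layerWeights)); // bounded by scale / 2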

Recommendations

  • Start with quantize8 for bandwidth savings with minimal accuracy impact.

  • Leave the mode at none for very small models or when debugging convergence.

  • Use topk for more aggressive bandwidth reduction, at the cost of some accuracy.
