Unsupervised Learning with FedDDL
This guide shows you how to run an unsupervised (or self-supervised) workload on FedDDL. You supply the model, data arrays, and training hyperparameters; the platform handles client partitioning, rounds, and aggregation.
Learn by doing?
See a full example of training an autoencoder on the MNIST dataset here!
What you need to provide
Model architecture: a compiled tf.Sequential (or similar) model. Optimizers are supported; custom losses/metrics are not (use built-in TensorFlow.js losses/metrics like meanSquaredError, cosineProximity, etc.); see the sketch after this list.
Flattened dataset: an input array sized to TOTAL_SAMPLES * INPUT_SIZE. For autoencoders, the target is usually the same as the input.
Test/validation split (optional): testInput (and testOutput if applicable) for evaluation.
Training config: TOTAL_SAMPLES, INPUT_SIZE, OUTPUT_SIZE (often equals INPUT_SIZE), TOTAL_ROUNDS, BATCH_SIZE, EPOCHS_PER_ROUND, MIN_CLIENTS_TO_START.
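As a sketch of the model requirement above (the 784-dimensional MNIST-style input and the layer sizes are illustrative assumptions, not requirements):
// Illustrative autoencoder: encode to 64 units, decode back to INPUT_SIZE.
const INPUT_SIZE = 784; // 28 x 28 flattened
const model = tf.sequential();
model.add(tf.layers.dense({ units: 64, activation: 'relu', inputShape: [INPUT_SIZE] }));
model.add(tf.layers.dense({ units: INPUT_SIZE, activation: 'sigmoid' }));
model.compile({ optimizer: 'adam', loss: 'meanSquaredError' }); // built-in loss only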
Core class: UnsupervisedTrainingSession
Create one session per workload. It mirrors the supervised flow but uses the input as its own target.
Constructor signature:
new UnsupervisedTrainingSession(
FINAL_OUTPUT_UNITS, // matches last layer units
TOTAL_SAMPLES, // number of training examples
INPUT_SIZE, // flattened input length
OUTPUT_SIZE, // flattened target length (often same as input)
TOTAL_ROUNDS, // federated rounds
BATCH_SIZE, // per-client batch size
EPOCHS_PER_ROUND, // local epochs per round
MIN_CLIENTS_TO_START, // quorum to begin training
input, output, // flattened training data (output optional if same as input)
model, // compiled tf model
testInput, testOutput // optional eval data
)
Example workload (autoencoder-style)
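A sketch of how the pieces fit together, assuming the compiled model from the sketch above and training data already flattened into an input array (see the next section); all configuration values below are illustrative:
// Illustrative configuration for an MNIST-sized autoencoder workload.
// INPUT_SIZE (784) and model come from the earlier sketch.
const TOTAL_SAMPLES = 60000;            // number of training examples
const OUTPUT_SIZE = INPUT_SIZE;         // autoencoder: target equals input
const FINAL_OUTPUT_UNITS = OUTPUT_SIZE; // matches the last layer's units
const TOTAL_ROUNDS = 10;                // federated rounds
const BATCH_SIZE = 32;
const EPOCHS_PER_ROUND = 1;
const MIN_CLIENTS_TO_START = 1;         // single-client demo

// input is a flat array of length TOTAL_SAMPLES * INPUT_SIZE.
const session = new UnsupervisedTrainingSession(
  FINAL_OUTPUT_UNITS,
  TOTAL_SAMPLES,
  INPUT_SIZE,
  OUTPUT_SIZE,
  TOTAL_ROUNDS,
  BATCH_SIZE,
  EPOCHS_PER_ROUND,
  MIN_CLIENTS_TO_START,
  input, input,        // target is the input itself
  model,
  testInput, testInput // optional; reuse inputs to track reconstruction error
);
// See the linked MNIST autoencoder example for how the session is launched.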
Data shape and partitioning
Data is partitioned per client by slicing the flat arrays; ensure arrays are contiguous and sized correctly.
input length must equal TOTAL_SAMPLES * INPUT_SIZE.
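As an illustration of that layout (the rawImages name is a hypothetical array of per-sample pixel arrays, each already of length INPUT_SIZE):
// Flatten per-sample arrays into one contiguous flat array.
const input = [];
for (const pixels of rawImages) {
  input.push(...pixels); // appends INPUT_SIZE values per sample
}
// Sanity check before creating the session.
console.assert(input.length === TOTAL_SAMPLES * INPUT_SIZE, 'input has the wrong length');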
Constraints and tips
Avoid custom losses/metrics; use built-in tf losses/metrics.
Metrics callback is optional and runs on the server; keep it lightweight.
Choose BATCH_SIZE and EPOCHS_PER_ROUND based on client capability; larger values increase on-device compute (see the rough estimate after this list).
MIN_CLIENTS_TO_START gates the first round; set to 1 for single-client demos.
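A back-of-the-envelope way to estimate that on-device load (not a FedDDL API; numClients is a hypothetical count of connected clients, and an even data split is assumed):
// Approximate local training steps per client per round.
const samplesPerClient = Math.floor(TOTAL_SAMPLES / numClients);
const stepsPerRound = EPOCHS_PER_ROUND * Math.ceil(samplesPerClient / BATCH_SIZE);
console.log(`~${stepsPerRound} local batches per client per round`);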
What runs where
Model training and metrics run on clients.
The server hosts the session, partitions data, and aggregates weight segments.