Library API¶
The ReFedEz library provides a high-level API for implementing federated learning algorithms. It abstracts away the complexities of distributed communication, allowing you to focus on your machine learning models and training logic.
What the Library Does¶
The library enables you to write federated learning code that runs seamlessly across distributed machines. Instead of dealing with low-level networking, serialization, and coordination, you write standard ML code and the library handles:
- Model Distribution: Automatically sending model weights between server and clients
- Aggregation: Coordinating federated averaging and other aggregation strategies
- Lifecycle Management: Handling training rounds, validation, and synchronization
- Framework Integration: Supporting PyTorch and NumPy backends out of the box
Where to Use It¶
Use the library in your model.py or training script files. This is where you define your federated learning algorithm. The library is designed for:
- Research Prototyping: Quickly test federated algorithms without infrastructure setup
- Production ML Code: Write clean, framework-agnostic federated training code
- Algorithm Development: Focus on the ML logic while the library handles distribution
Basic Usage¶
1. Choose a Backend: Inherit from `FederatedTorch` for PyTorch models or `FederatedNumpy` for NumPy-based implementations.
2. Use the `@Federated` Decorator: Apply this decorator to your class to specify the server and client configurations. The decorator automatically reads from your `refedez.yaml` and connects to the running ReFedEz deployment.
3. Implement Required Methods:
    - `get_weights()`: Return current model parameters
    - `set_weights(weights)`: Load new model parameters
    - `train_step()`: Perform one round of local training
    - `validate(weights)`: Evaluate model performance
Example¶
```python
from refedez.lib import Federated, Server, Client, FederatedTorch

@Federated(
    server=Server("server.localhost", save_model_path="/models/final_model.pt"),
    clients=[Client("site1"), Client("site2")],
    refedez_config="./refedez.yaml"
)
class MyFederatedModel(FederatedTorch):
    def __init__(self):
        super().__init__()
        # Your model initialization here

    def get_weights(self):
        # Return model weights
        pass

    def set_weights(self, weights):
        # Load model weights
        pass

    def train_step(self):
        # Local training logic
        pass

    def validate(self, weights):
        # Validation logic
        pass
```
When you run this script, the library automatically:

- Connects to the ReFedEz-deployed server and clients
- Coordinates federated training rounds
- Handles model synchronization and aggregation
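To make these rounds concrete, here is a minimal, library-independent sketch of what one federated-averaging round conceptually does, in plain NumPy. The helper names `local_train` and `fed_avg`, and the synthetic per-client data, are illustrative only and are not part of the ReFedEz API:

```python
import numpy as np

def local_train(weights, X, y, lr=0.1):
    """One client's local update: a single gradient-descent
    step on a least-squares loss over its private data."""
    w = weights.copy()
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def fed_avg(client_weights, client_sizes):
    """Federated averaging: weight each client's parameters
    by the size of its local dataset."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Hypothetical per-client data; in a real deployment each
# client holds its own shard and never shares it.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])
datasets = []
for _ in range(2):
    X = rng.normal(size=(64, 2))
    datasets.append((X, X @ true_w))

global_w = np.zeros(2)
for _round in range(50):  # corresponds to num_rounds
    # Server broadcasts global_w; each client trains locally.
    updates = [local_train(global_w, X, y) for X, y in datasets]
    # Server aggregates the updates into a new global model.
    global_w = fed_avg(updates, [len(y) for _, y in datasets])
```

With ReFedEz, the broadcast, local training, and aggregation steps shown inline here are performed across machines on your behalf.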
Library Definitions¶
FederatedNumpy¶
Bases: ABC
Abstract base class for federated learning models using NumPy.
This class defines the interface for models that participate in federated learning workflows using NumPy arrays for weight representation.
Subclasses must implement the abstract methods to define the model's forward pass, weight retrieval and setting, and training step logic.
Source code in refedez/lib/backends/numpy.py
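A subclass might look like the following sketch. Because the exact abstract signatures are not reproduced on this page, the base class is mocked with a minimal stand-in that mirrors the required methods from Basic Usage, so the snippet runs without the library installed; the subclass body is the part of interest. The model, data, and hyperparameters are all hypothetical:

```python
from abc import ABC, abstractmethod
import numpy as np

class FederatedNumpy(ABC):
    """Stand-in for refedez.lib.backends.numpy.FederatedNumpy;
    the real base class defines a comparable abstract interface."""
    @abstractmethod
    def get_weights(self): ...
    @abstractmethod
    def set_weights(self, weights): ...
    @abstractmethod
    def train_step(self): ...
    @abstractmethod
    def validate(self, weights): ...

class LinearModel(FederatedNumpy):
    """Least-squares linear model trained by full-batch gradient descent."""

    def __init__(self, n_features=3, lr=0.1):
        self.w = np.zeros(n_features)
        self.lr = lr
        # Hypothetical local dataset; in practice each client loads its own.
        rng = np.random.default_rng(0)
        self.X = rng.normal(size=(32, n_features))
        self.y = self.X @ np.array([1.0, -2.0, 0.5])

    def get_weights(self):
        return self.w.copy()

    def set_weights(self, weights):
        self.w = np.asarray(weights, dtype=float).copy()

    def train_step(self):
        # One local training round: a single gradient-descent step.
        grad = 2 * self.X.T @ (self.X @ self.w - self.y) / len(self.y)
        self.w -= self.lr * grad

    def validate(self, weights):
        # Mean squared error of the given weights on local data.
        return float(np.mean((self.X @ np.asarray(weights) - self.y) ** 2))
```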
FederatedTorch¶
Bases: Module
Base class for federated learning models using PyTorch.
This class extends PyTorch's nn.Module and provides the interface for models in federated learning scenarios. It handles the initialization and defines abstract methods for weight management and training.
Subclasses should implement the forward pass and training logic specific to their model architecture.
Source code in refedez/lib/backends/torch.py
Server dataclass¶
Configuration for the federated learning server.
Represents the central server in the federated learning system, responsible for coordinating the training process and aggregating model updates.
Attributes:

| Name | Type | Description |
|---|---|---|
| `name` | `str` | The unique name identifier for the server. |
| `save_model_path` | `str \| None` | Optional path where the trained model should be saved. |
Source code in refedez/lib/config.py
Client dataclass¶
Configuration for a federated learning client.
Represents a client participant in the federated learning system, including its name and any environment variables required for execution.
Attributes:

| Name | Type | Description |
|---|---|---|
| `name` | `str` | The unique name identifier for the client. |
| `env_vars` | `Dict[str, str]` | Dictionary of environment variables to set for the client. |
Source code in refedez/lib/config.py
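As a rough illustration of how these two configuration objects are shaped, the attribute tables above can be mirrored with standard dataclasses. This is a stand-in sketch, not the library source, and field defaults are assumptions:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Server:
    """Mirror of the Server config: a name plus an optional save path."""
    name: str
    save_model_path: Optional[str] = None

@dataclass
class Client:
    """Mirror of the Client config: a name plus per-client env vars."""
    name: str
    env_vars: Dict[str, str] = field(default_factory=dict)
```

Used the same way as in the example above, e.g. `Server("server.localhost")` and `Client("site1", env_vars={"DATA_DIR": "/data"})`.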
Federated(server, clients, refedez_config, num_rounds=1, backend=Backend.PYTORCH, algorithm=FedAlgorithm.FED_AVG)¶
Runs a federated learning experiment using the specified backend and configuration.
This function orchestrates the communication and training rounds between a central
server and multiple clients according to the provided refedez_config file.
The backend determines the machine learning framework used (e.g., PyTorch or NumPy).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `server` | `Server` | The federated server responsible for aggregating model updates and coordinating the training process. | *required* |
| `clients` | `List[Client]` | A list of participating clients, each responsible for local model training and update submission. | *required* |
| `refedez_config` | `str` | Path to the ReFedEz configuration file (YAML or JSON) that defines the experiment setup, data paths, and hyperparameters. | *required* |
| `num_rounds` | `int` | Number of federated training rounds to execute. Defaults to 1. | `1` |
| `backend` | `Backend` | Backend to use for computation. Can be one of the supported frameworks in `Backend`. | `PYTORCH` |
| `algorithm` | `FedAlgorithm` | Federated aggregation algorithm used to combine client updates. | `FED_AVG` |
Returns:

| Name | Type | Description |
|---|---|---|
| `None` | | This function does not return a value directly, but performs training, logging, and potentially saves model artifacts or metrics as side effects. |
Raises:

| Type | Description |
|---|---|
| `ValueError` | If the configuration file path is invalid or unreadable. |
| `RuntimeError` | If the backend fails to initialize or training fails mid-process. |
| `ConnectionError` | If communication with clients fails during aggregation rounds. |
Example:
Source code in refedez/lib/refedez.py
backends¶
numpy¶
FederatedNumpy¶
Bases: ABC
Abstract base class for federated learning models using NumPy.
This class defines the interface for models that participate in federated learning workflows using NumPy arrays for weight representation.
Subclasses must implement the abstract methods to define the model's forward pass, weight retrieval and setting, and training step logic.
Source code in refedez/lib/backends/numpy.py
tensorflow¶
FederatedTensorFlow¶
Bases: Model
Base class for federated learning models using TensorFlow (Keras).
This class extends tf.keras.Model and provides an interface for models in federated learning scenarios. It handles initialization and defines abstract methods for weight management and local training.
Source code in refedez/lib/backends/tensorflow.py
torch¶
FederatedTorch¶
Bases: Module
Base class for federated learning models using PyTorch.
This class extends PyTorch's nn.Module and provides the interface for models in federated learning scenarios. It handles the initialization and defines abstract methods for weight management and training.
Subclasses should implement the forward pass and training logic specific to their model architecture.
Source code in refedez/lib/backends/torch.py
config¶
Client dataclass¶
Configuration for a federated learning client.
Represents a client participant in the federated learning system, including its name and any environment variables required for execution.
Attributes:

| Name | Type | Description |
|---|---|---|
| `name` | `str` | The unique name identifier for the client. |
| `env_vars` | `Dict[str, str]` | Dictionary of environment variables to set for the client. |
Source code in refedez/lib/config.py
Server dataclass¶
Configuration for the federated learning server.
Represents the central server in the federated learning system, responsible for coordinating the training process and aggregating model updates.
Attributes:

| Name | Type | Description |
|---|---|---|
| `name` | `str` | The unique name identifier for the server. |
| `save_model_path` | `str \| None` | Optional path where the trained model should be saved. |
Source code in refedez/lib/config.py
refedez¶
Federated(server, clients, refedez_config, num_rounds=1, backend=Backend.PYTORCH, algorithm=FedAlgorithm.FED_AVG)¶
Runs a federated learning experiment using the specified backend and configuration.
This function orchestrates the communication and training rounds between a central
server and multiple clients according to the provided refedez_config file.
The backend determines the machine learning framework used (e.g., PyTorch or NumPy).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `server` | `Server` | The federated server responsible for aggregating model updates and coordinating the training process. | *required* |
| `clients` | `List[Client]` | A list of participating clients, each responsible for local model training and update submission. | *required* |
| `refedez_config` | `str` | Path to the ReFedEz configuration file (YAML or JSON) that defines the experiment setup, data paths, and hyperparameters. | *required* |
| `num_rounds` | `int` | Number of federated training rounds to execute. Defaults to 1. | `1` |
| `backend` | `Backend` | Backend to use for computation. Can be one of the supported frameworks in `Backend`. | `PYTORCH` |
| `algorithm` | `FedAlgorithm` | Federated aggregation algorithm used to combine client updates. | `FED_AVG` |
Returns:

| Name | Type | Description |
|---|---|---|
| `None` | | This function does not return a value directly, but performs training, logging, and potentially saves model artifacts or metrics as side effects. |
Raises:

| Type | Description |
|---|---|
| `ValueError` | If the configuration file path is invalid or unreadable. |
| `RuntimeError` | If the backend fails to initialize or training fails mid-process. |
| `ConnectionError` | If communication with clients fails during aggregation rounds. |
Example: