diff --git a/docs/Makefile b/docs/Makefile index 86f504687..20668cf5b 100644 --- a/docs/Makefile +++ b/docs/Makefile @@ -5,8 +5,5 @@ clean: rm -rf fedn.rst rm -rf _build/ -apidoc: - sphinx-apidoc --ext-autodoc --module-first -o . ../fedn ../*tests* ../fedn/cli* ../fedn/common* ../fedn/network/api/v1* ../fedn/network/grpc/fedn_pb2.py ../fedn/network/grpc/fedn_pb2_grpc.py ../fedn/network/api/server.py ../fedn/network/controller/controlbase.py - html: clean apidoc sphinx-build . _build \ No newline at end of file diff --git a/docs/README.md b/docs/README.md index 2e9595f58..0fb3bd3e5 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,5 +1,7 @@ -FEDn is using sphinx with reStructuredText. +Scaleout Edge is using sphinx with reStructuredText. +# Install sphinx +pip install -r requirements.txt # Updated build Script cd docs/ diff --git a/docs/aggregators.rst b/docs/aggregators.rst index 5c0e30aa7..d567197df 100644 --- a/docs/aggregators.rst +++ b/docs/aggregators.rst @@ -14,17 +14,15 @@ During a training session, the combiners will instantiate an Aggregator and use :align: center The figure above illustrates the overall workflow. When a client completes a model update, the model parameters are streamed to the combiner, -and a model update message is sent. The parameters are saved to a file on disk, and the update message is passed to a callback function, ``on_model_update``. -This function validates the model update and, if successful, places the update message in an aggregation queue. -The model parameters are saved to disk at a configurable storage location within the combiner to prevent exhausting RAM. -As multiple clients submit updates, the aggregation queue accumulates. Once specific criteria are met, another method, ``combine_models``, -begins processing the queue, aggregating models according to the specifics of the scheme (e.g., FedAvg, FedAdam). +and a model update message is sent. The model parameters are saved to disk at a configurable storage location within the combiner to prevent exhausting RAM. +As multiple clients submit updates, the aggregation queue accumulates. Once specific criteria are met, the combiner begins processing +the queue, aggregating models according to the specifics of the scheme (e.g., FedAvg, FedAdam). Using built-in Aggregators -------------------------- -FEDn supports the following aggregation algorithms: +Scaleout Edge supports the following aggregation algorithms: - FedAvg (default) - FedAdam (FedOpt) @@ -55,7 +53,7 @@ Training sessions can be configured to use a given aggregator. For example, to u .. note:: - The FedOpt family of methods use server-side momentum. FEDn resets the aggregator for each new session. + The FedOpt family of methods use server-side momentum. Scaleout Edge resets the aggregator for each new session. This means that the history will will also be reset, i.e. the momentum terms will be forgotten. When using FedAdam, FedYogi and FedAdaGrad, the user needs to strike a balance between the number of rounds in the session from a convergence and utility perspective. 
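For orientation, a session configuration that selects the FedAdam aggregator could look like the sketch below. It assumes an already connected ``APIClient`` instance named ``client`` (see the API client guide); the ``aggregator_kwargs`` key and the individual hyperparameter names are illustrative assumptions and may differ between releases.

.. code:: python

    # Sketch only: select FedAdam and pass server-side hyperparameters.
    # "aggregator_kwargs" and the keys inside it are assumed names.
    session_config = {
        "helper": "numpyhelper",
        "id": "experiment_fedadam",
        "aggregator": "fedadam",
        "aggregator_kwargs": {
            "serveropt": "adam",
            "learning_rate": 1e-2,
            "beta1": 0.9,
            "beta2": 0.99,
        },
        "rounds": 10,
    }

    result = client.start_session(**session_config)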
@@ -74,47 +72,16 @@ Several additional parameters that guide general behavior of the aggregation flo - Whether to retain or delete model update files after they have been processed (default is to delete them) -Extending FEDn with new Aggregators ------------------------------------ +Implementing your own Aggregators +--------------------------------- -A developer can extend FEDn with his/her own Aggregator(s) by implementing the interface specified in -:py:mod:`fedn.network.combiner.aggregators.aggregatorbase.AggregatorBase`. This involes implementing the two methods: +Scaleout Edge supports a flexible architecture that allows developers to implement custom aggregation logic beyond the built-in options. +To define and register your own aggregator, you should use the server functions interface, where server-side behavior can be customized to suit specific needs. -- ``on_model_update`` (perform model update validation before update is placed on queue, optional) -- ``combine_models`` (process the queue and aggregate updates) - -**on_model_update** - -The ``on_model_update`` callback recieves the model update messages from clients (including all metadata) and can be used to perform validation and -potential transformation of the model update before it is placed on the aggregation queue (see image above). -The base class implements a default callback that checks that all metadata assumed by the aggregation algorithms FedAvg and FedOpt is available. The callback could also be used to implement custom pre-processing and additional checks including strategies -to filter out updates that are suspected to be corrupted or malicious. - -**combine_models** - -When a certain criteria is met, e.g. if all clients have sent updates, or the round has times out, the ``combine_model_update`` method -processes the model update queue, producing an aggregated model. This is the main extension point where the -numerical details of the aggregation scheme is implemented. The best way to understand how to implement this method is to study the built-in aggregation algorithms: - -- :py:mod:`fedn.network.combiner.aggregators.fedavg` (weighted average of parameters) -- :py:mod:`fedn.network.combiner.aggregators.fedopt` (compute pseudo-gradients and apply a server-side optmizer) - -To add an aggregator plugin ``myaggregator``, the developer implements the interface and places a file called ‘myaggregator.py’ in the folder ‘fedn.network.combiner.aggregators’. -This extension can then simply be called as such: - -.. code:: python - - session_config = { - "helper": "numpyhelper", - "id": "experiment_myaggregator", - "aggregator": "myaggregator", - "rounds": 10 - } - - result_myaggregator = client.start_session(**session_config) +For detailed instructions and examples on how to implement new aggregators, see the section on :ref:`server-functions`. .. meta:: :description lang=en: Aggregators are responsible for combining client model updates into a combiner-level global model. - :keywords: Federated Learning, Aggregators, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems + :keywords: Federated Learning, Aggregators, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems, Scaleout Edge diff --git a/docs/apiclient.rst b/docs/apiclient.rst index 984af72a3..f2014278e 100644 --- a/docs/apiclient.rst +++ b/docs/apiclient.rst @@ -3,9 +3,9 @@ Using the API Client ==================== -FEDn comes with an *APIClient* - a Python3 library that is used to interact with FEDn programmatically.
+Scaleout Edge comes with an *APIClient* - a Python3 library that is used to interact with your project programmatically. -This guide assumes that the user has aleady taken the :ref:`quickstart-label` tutorial. If this is not the case, please start there to learn how to set up a FEDn Studio project and learn +This guide assumes that the user has aleady taken the :ref:`quickstart-label` tutorial. If this is not the case, please start there to learn how to set up a Scaleout Edge project and learn to connect clients. In this guide we will build on that same PyTorch example (MNIST), showing how to use the APIClient to control training sessions, use different aggregators, and to retrieve models and metrics. **Installation** @@ -14,12 +14,12 @@ The APIClient is available as a Python package on PyPI, and can be installed usi .. code-block:: bash - $ pip install fedn + $ pip install scaleout -**Connect the APIClient to the FEDn project** +**Connect the APIClient to the Scaleout Edge project** To access the API you need the URL to the controller-host, as well as an admin API token. You -obtain these from your Studio project. Navigate to your "Project settings" and copy the "Project url", this is the controller host address: +obtain these from your Scaleout Edge project. Navigate to your "Project settings" and copy the "Project url", this is the controller host address: .. image:: img/find_controller_url.png @@ -27,34 +27,34 @@ To obtain an admin API token press "Generate" in the "Generate Admin token" sect .. image:: img/generate_admin_token.png -To initalize the connection to the FEDn REST API: +To initalize the connection to the Scaleout REST API: .. code-block:: python - >>> from fedn import APIClient + >>> from scaleout import APIClient >>> client = APIClient(host="", token="", secure=True, verify=True) Alternatively, the access token can be sourced from an environment variable. .. code-block:: bash - $ export FEDN_AUTH_TOKEN= + $ export SCALEOUT_AUTH_TOKEN= Then passing a token as an argument is not required. .. code-block:: python - >>> from fedn import APIClient + >>> from scaleout import APIClient >>> client = APIClient(host="", secure=True, verify=True) We are now ready to work with the API. We here assume that you have worked through steps 1-2 in the quisktart tutorial, i.e. that you have created the compute package and seed model on your local machine. -In the next step, we will use the API to upload these objects to the Studio project (corresponding to step 3 in the quickstart tutorial). +In the next step, we will use the API to upload these objects to the Scaleout Edge project (corresponding to step 3 in the quickstart tutorial). **Set the active compute package and seed model** -To set the active compute package in the FEDn Studio Project: +To set the active compute package in the Scaleout Edge Project: .. code:: python @@ -78,7 +78,7 @@ using the default aggregator (FedAvg): >>> model_id = models[-1]['model'] >>> validations = client.get_validations(model_id=model_id) -You can follow the progress of the training in the Studio UI. +You can follow the progress of the training in the Scaleout Edge UI. To run a session using the FedAdam aggregator using custom hyperparamters: @@ -143,14 +143,14 @@ To get a specific session: >>> session = client.get_session(id="session_id") -For more information on how to use the APIClient, see the :py:mod:`fedn.network.api.client`. +For more information on how to use the APIClient, see the :py:mod:`scaleout-client.scaleout.network.api.client`. 
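If you prefer to follow a session programmatically rather than in the web UI, a simple polling loop over ``get_session`` is usually enough. The sketch below assumes the returned session object is a dict exposing a ``status`` field; the exact field name and status values are assumptions and may differ in your version.

.. code:: python

    import time

    def wait_for_session(client, session_id, poll_interval=10):
        """Poll a session until it is no longer active (sketch only)."""
        while True:
            session = client.get_session(id=session_id)
            # "status" and its possible values are assumed names.
            status = str(session.get("status", "")).lower()
            if status in ("finished", "failed", "stopped"):
                return session
            time.sleep(poll_interval)

Combined with ``get_models`` and ``get_validations`` shown above, this makes it straightforward to script end-to-end experiments.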
There is also a collection of Jupyter Notebooks showcasing more advanced use of the API, including how to work with other built-in aggregators and how to automate hyperparameter tuning: -- `API Example `_ . +- `API Example `_ . .. meta:: :description lang=en: - FEDn comes with an APIClient - a Python3 library that can be used to interact with FEDn programmatically. - :keywords: Federated Learning, APIClient, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems + Scaleout Edge comes with an APIClient - a Python3 library that can be used to interact with Scaleout Edge programmatically. + :keywords: Federated Learning, APIClient, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems, Scaleout Edge diff --git a/docs/apiref/cli.rst b/docs/apiref/cli.rst new file mode 100644 index 000000000..a4b6bddf1 --- /dev/null +++ b/docs/apiref/cli.rst @@ -0,0 +1,48 @@ +Scaleout Edge CLI +================= + +The Scaleout Edge Command-Line Interface (CLI) provides tools for managing +projects, clients, compute packages, and training sessions from the terminal. +It is designed to simplify common operations such as starting clients, uploading +models, monitoring training progress, and interacting with the Scaleout Edge +control plane. + +.. click:: scaleout.cli:client_cmd + :prog: client-cmd + :show-nested: + +.. click:: scaleout.cli:combiner_cmd + :prog: combiner-cmd + :show-nested: + +.. click:: scaleout.cli:model_cmd + :prog: model-cmd + :show-nested: + +.. click:: scaleout.cli:package_cmd + :prog: package-cmd + :show-nested: + +.. click:: scaleout.cli:round_cmd + :prog: round-cmd + :show-nested: + +.. click:: scaleout.cli:run_cmd + :prog: run-cmd + :show-nested: + +.. click:: scaleout.cli:session_cmd + :prog: session-cmd + :show-nested: + +.. click:: scaleout.cli:status_cmd + :prog: status-cmd + :show-nested: + +.. click:: scaleout.cli:validation_cmd + :prog: validation-cmd + :show-nested: + +.. click:: scaleout.cli:main + :prog: main + :show-nested: diff --git a/docs/apiref/clients.rst b/docs/apiref/clients.rst new file mode 100644 index 000000000..f3871a009 --- /dev/null +++ b/docs/apiref/clients.rst @@ -0,0 +1,26 @@ +Scaleout Edge Clients +===================== + +This section documents the client implementations available for Scaleout Edge. +Clients run on edge devices and communicate with the Scaleout Edge network to +perform local training, evaluation, and model exchange. Multiple client +implementations exist to support different environments and programming +languages. + +Python Client +------------- + +The Python client provides a high-level API for integrating local training code +with the Scaleout Edge network. It is suitable for servers, development +machines, notebooks, and lightweight edge devices. + +.. automodule:: scaleout.client.fedn_client + :members: + :undoc-members: + :show-inheritance: + +Additional Clients +------------------ + +Support for additional client implementations (e.g., C++ and Kotlin) will be +included here in future versions of the documentation. diff --git a/docs/modules.rst b/docs/apiref/index.rst similarity index 50% rename from docs/modules.rst rename to docs/apiref/index.rst index b23bd1e9a..bf7c1a270 100644 --- a/docs/modules.rst +++ b/docs/apiref/index.rst @@ -1,13 +1,16 @@ API Reference -============ +============= .. toctree:: - :maxdepth: 4 + :maxdepth: 2 - fedn + clients + utils + cli + serverapi .. 
meta:: :description lang=en: - API reference for FEDn, a federated learning platform that is secure, scalable and easy-to-use. - :keywords: Federated Learning, API reference, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems + API reference for Scaleout, a federated learning platform that is secure, scalable and easy-to-use. + :keywords: Federated Learning, API reference, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems, Scaleout Edge \ No newline at end of file diff --git a/docs/apiref/serverapi.rst b/docs/apiref/serverapi.rst new file mode 100644 index 000000000..71493fef0 --- /dev/null +++ b/docs/apiref/serverapi.rst @@ -0,0 +1,121 @@ +Server API +========== + +This section documents the server-side API exposed by the Scaleout Edge control +plane. These endpoints are used by clients, combiners, and external integrations +to interact with the Scaleout Edge network. The API includes operations for +project management, training orchestration, model handling, metrics, telemetry, +and system-level metadata. + +Authentication and Control +-------------------------- + +Endpoints related to authentication, authorization, and high-level control of +the federated network. + +.. automodule:: auth_routes + :members: + :undoc-members: + :show-inheritance: + +.. automodule:: control_routes + :members: + :undoc-members: + :show-inheritance: + +.. automodule:: attribute_routes + :members: + :undoc-members: + :show-inheritance: + + +Clients and Combiners +--------------------- + +Endpoints for managing clients, combiners, and their runtime state in the +federated network. + +.. automodule:: client_routes + :members: + :undoc-members: + :show-inheritance: + +.. automodule:: combiner_routes + :members: + :undoc-members: + :show-inheritance: + +.. automodule:: status_routes + :members: + :undoc-members: + :show-inheritance: + + +Sessions, Rounds, and Runs +-------------------------- + +Endpoints that control training sessions, orchestration rounds, and execution +runs. + +.. automodule:: session_routes + :members: + :undoc-members: + :show-inheritance: + +.. automodule:: round_routes + :members: + :undoc-members: + :show-inheritance: + +.. automodule:: run_routes + :members: + :undoc-members: + :show-inheritance: + + +Models, Packages, and Predictions +--------------------------------- + +Endpoints for handling model artifacts, compute packages, predictions, and +validation. + +.. automodule:: model_routes + :members: + :undoc-members: + :show-inheritance: + +.. automodule:: package_routes + :members: + :undoc-members: + :show-inheritance: + +.. automodule:: prediction_routes + :members: + :undoc-members: + :show-inheritance: + +.. automodule:: validation_routes + :members: + :undoc-members: + :show-inheritance: + + +Metrics, Telemetry, and Helpers +------------------------------- + +Endpoints for metrics, telemetry, and supporting helper operations. + +.. automodule:: metric_routes + :members: + :undoc-members: + :show-inheritance: + +.. automodule:: telemetry_routes + :members: + :undoc-members: + :show-inheritance: + +.. 
automodule:: helper_routes + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/apiref/utils.rst b/docs/apiref/utils.rst new file mode 100644 index 000000000..4c7a087ee --- /dev/null +++ b/docs/apiref/utils.rst @@ -0,0 +1,20 @@ +Scaleout Edge API Client +======================== + +This section documents the **Scaleout Edge API Client**, the high-level +interface used by external applications, orchestration layers, and tools to +interact with a running Scaleout Edge network. The API Client provides +convenient methods for: + +- querying network state +- managing clients, attributes, and sessions +- submitting models and retrieving results +- interacting with controllers, combiners, and server functions + +It acts as the primary Python entry point for programmatic communication with +the Scaleout Edge backend. + +.. automodule:: scaleoututil.api.client + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/architecture.rst b/docs/architecture.rst index df6f4a2fc..07892e527 100644 --- a/docs/architecture.rst +++ b/docs/architecture.rst @@ -3,57 +3,104 @@ Architecture overview ===================== -Constructing a federated model with FEDn amounts to a) specifying the details of the client-side training code and data integrations, and b) deploying the federated network. A FEDn network, as illustrated in the picture below, is made up of components into three different tiers: the *Controller* tier (3), one or more *Combiners* in second tier (2), and a number of *Clients* in tier (1). -The combiners forms the backbone of the federated ML orchestration mechanism, while the Controller tier provides discovery services and controls to coordinate training over the federated network. -By horizontally scaling the number of combiners, one can meet the needs of a growing number of clients. +This page provides an overview of the **Scaleout Edge architecture**. What +follows is a conceptual description of the components that make up a Scaleout +Edge network and how they interact during a federated training session. + +A Scaleout Edge network consists of three tiers: + +- **Tier 1: Clients** +- **Tier 2: Combiners** +- **Tier 3: Controller and supporting services** -.. image:: img/FEDn_network.png - :alt: FEDn network +.. image:: img/Scaleout_Edge_network.png + :alt: Scaleout Edge network :width: 100% :align: center +Tier 1 — Clients +---------------- + +A **Client** (gRPC client) is a data node holding private data and connecting to +a Combiner (gRPC server) to receive training tasks and validation requests during +federated sessions. + +Key characteristics: + +- Clients communicate **outbound only** using RPC. + No inbound or publicly exposed ports are required. +- Upon connecting to the network, a client receives a **compute package** from the + Controller or uses one that is locally available for the client. This package + contains training and validation code to execute locally. +- The compute package is defined by entry points in the client code and can be + customized to support various model types, frameworks, and even programming + languages. + +Python, C++ and Kotlin client implementations are provided out-of-the-box, but clients may be +implemented in any language to suit specific hardware or software environments. + +Tier 2 — Combiners +------------------ + +A **Combiner** orchestrates and aggregates model updates coming from its +group of clients. It is responsible for the mid-level federated learning workflow. 
+ +Key responsibilities: +- Running a dedicated gRPC server for interacting with clients and the Controller. +- Executing the orchestration plan defined in the global **compute plan** + provided by the Controller. +- Reducing client model updates into a single **combiner-level model**. +Because each Combiner operates independently, the total number of clients that +can be supported scales with the number of deployed Combiners. Combiners may be +placed in the cloud, on fog/edge nodes, or in any environment suited for running +the aggregation service. -**The clients: tier 1** +Tier 3 — Controller and base services +------------------------------------- -A Client (gRPC client) is a data node, holding private data and connecting to a Combiner (gRPC server) to receive model update requests and model validation requests during training sessions. -Importantly, clients uses remote procedure calls (RPC) to ask for model updates tasks, thus the clients not require any open ingress ports! A client receives the code (called package or compute package) to be executed from the *Controller* -upon connecting to the network, and thus they only need to be configured prior to connection to read the local datasets during training and validation. The package is based on entry points in the client code, and can be customized to fit the needs of the user. -This allows for a high degree of flexibility in terms of what kind of training and validation tasks that can be performed on the client side. Such as different types of machine learning models and framework, and even programming languages. -A python3 client implementation is provided out of the box, and it is possible to write clients in a variety of languages to target different software and hardware requirements. +Tier 3 contains several services, with the **Controller** being the central +component coordinating global training. The Controller has three primary roles: -**The combiners: tier 2** +1. **Global orchestration** + It defines the overall training strategy, distributes the compute plan, and + specifies how combiner-level models should be combined into a global model. -A combiner is an actor whose main role is to orchestrate and aggregate model updates from a number of clients during a training session. -When and how to trigger such orchestration are specified in the overall *compute plan* laid out by the *Controller*. -Each combiner in the network runs an independent gRPC server, providing RPCs for interacting with the federated network it controls. -Hence, the total number of clients that can be accommodated in a FEDn network is proportional to the number of active combiners in the FEDn network. -Combiners can be deployed anywhere, e.g. in a cloud or on a fog node to provide aggregation services near the cloud edge. +2. **Global state management** + The Controller maintains the **model trail**—an immutable record of global + model updates forming the training timeline. -**The controller: tier 3** +3. **Discovery and connectivity** + It provides discovery services and mediates connections between clients and + combiners. For this purpose, the Controller exposes a standard REST API used + by RPC clients/servers and by user interfaces. -Tier 3 does actually contain several components and services, but we tend to associate it with the *Controller* the most. The *Controller* fills three main roles in the FEDn network: +Additional Tier 3 services include: -1. it lays out the overall, global training strategy and communicates that to the combiner network. 
-It also dictates the strategy to aggregate model updates from individual combiners into a single global model, -2. it handles global state and maintains the *model trail* - an immutable trail of global model updates uniquely defining the federated ML training timeline, and -3. it provides discovery services, mediating connections between clients and combiners. For this purpose, the *Controller* exposes a standard REST API both for RPC clients and servers, but also for user interfaces and other services. +- **Reducer** + Aggregates the combiner-level models into a single global model. -Tier 3 also contain a *Reducer* component, which is responsible for aggregating combiner-level models into a single global model. Further, it contains a *StateStore* database, -which is responsible for storing various states of the network and training sessions. The final global model trail from a traning session is stored in the *ModelRegistry* database. +- **StateStore** + Stores the state of the network, training sessions, and metadata. +- **ModelRegistry** + Stores the final global model trail after a completed training session. -**Notes on aggregating algorithms** +Notes on aggregation algorithms +------------------------------- -FEDn is designed to allow customization of the FedML algorithm, following a specified pattern, or programming model. -Model aggregation happens on two levels in the network. First, each Combiner can be configured with a custom orchestration and aggregation implementation, that reduces model updates from Clients into a single, *combiner level* model. -Then, a configurable aggregation protocol on the *Controller* level is responsible for combining the combiner-level models into a global model. By varying the aggregation schemes on the two levels in the system, -many different possible outcomes can be achieved. Good starting configurations are provided out-of-the-box to help the user get started. See :ref:`agg-label` and API reference for more details. +Scaleout Edge includes several **built-in aggregators** for common FL workflows +(see :ref:`agg-label`). For advanced scenarios, users may override the +Combiner-level behavior using **server functions** (:ref:`server-functions`), +allowing custom orchestration or aggregation logic. +Aggregation happens in two stages: +1) each Combiner reduces client updates into a *combiner-level model*, and +2) the Controller (Reducer) combines these into the final global model. .. meta:: :description lang=en: - Architecture overview - An overview of the FEDn federated learning platform architecture. - :keywords: Federated Learning, Architecture, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems + Architecture overview - An overview of the Scaleout Edge federated learning platform architecture. + :keywords: Federated Learning, Architecture, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems, Scaleout Edge diff --git a/docs/cli.rst b/docs/cli.rst index 47ae39af1..234f806c3 100644 --- a/docs/cli.rst +++ b/docs/cli.rst @@ -3,189 +3,159 @@ CLI ================================= -The FEDN Command-Line Interface (CLI) is a powerful tool that allows users to interact with the FEDN platform. It provides a comprehensive set of commands to manage and operate various components of the FEDN network, including starting services, managing sessions, and retrieving data. 
+The Scaleout Edge Command-Line Interface (CLI) is designed to streamline management of the Scaleout Edge platform, making it easier for users to deploy, monitor and interact with their federated learning networks. -With the FEDN CLI, users can: +With the Scaleout Edge CLI, users can: -- Start and manage FEDN services such as the **combiner**, **controller**, and **clients**. +- Start and manage Scaleout Edge services such as the **combiner**, **controller**, and **clients**. - Interact with the **controller** to: - Manage sessions, including starting, stopping, and monitoring their progress. - Retrieve data and results related to sessions, such as aggregated models and validation metrics. - Query the state of the network, including the status of connected combiners and clients. -- Test entry points in a FEDN package: - - For example, use the CLI to test the script defined in the `train` entry point of a FEDN package. This allows users to validate and debug their training scripts in isolation before deploying them in a federated learning session. +- Test entry points in a Scaleout Edge package: + - For example, use the CLI to test the script defined in the `train` entry point of a Scaleout Edge package. This allows users to validate and debug their training scripts in isolation before deploying them in a federated learning session. -The FEDN CLI is designed to streamline the management of the FEDN platform, making it easier for users to deploy, monitor, and interact with their federated learning networks. +The Scaleout Edge CLI is designed to streamline the management of the Scaleout Edge platform, making it easier for users to deploy, monitor, and interact with their federated learning networks. For detailed usage and examples, refer to the sections below. -Client +Login ------ -The `fedn client` commands allow users to start and manage FEDN clients. Clients are the entities that participate in federated learning sessions and contribute their local data and models to the network. +The `scaleout` commands allow users to log in to Scaleout Edge and interact with the platform. **Commands:** -- **fedn client start** - Start a FEDN client using a specified configuration file or package. Example: - -.. code-block:: bash - - fedn client start --init client_config.yaml --local-package - -- **fedn client list** - List all active FEDN clients in the network. Example: +- **scaleout login** - Log in to the Scaleout Edge using a username, password, and host. Example: .. code-block:: bash - - fedn client list -- **fedn client get-config** - Get the configuration of a specific FEDN client from Studio, including the client's token and other details. Example: + scaleout login -u username -P password -H host -.. code-block:: bash - - fedn client get-config --name test-client - -Combiner --------- +Client +------ -The `fedn combiner` commands allow users to start and manage combiners, which aggregate models from clients in the network. +The `scaleout client` commands allow users to start and manage Scaleout Edge clients. Clients are the entities that participate in federated learning sessions and contribute their local data and models to the network. **Commands:** -- **fedn combiner start** - Start a FEDN combiner using a specified configuration file. Example: - +- **scaleout client start** - Start a Scaleout Edge client using a specified configuration file or package. Example: + .. 
code-block:: bash - fedn combiner start --config combiner_config.yaml - -Controller ----------- - -The `fedn controller` commands allow users to start and manage the FEDN controller, which orchestrates the entire federated learning process. - -**Commands:** + scaleout client start --init client_config.yaml --local-package -- **fedn controller start** - Start the FEDN controller using a specified configuration file. Example: +- **scaleout client list** - List all active Scaleout Edge clients in the network. Example: .. code-block:: bash + + scaleout client list - fedn controller start --config controller_config.yaml - -Studio ------- - -The `fedn studio` commands allow users to log in to the FEDN Studio and interact with the platform. - -**Commands:** - -- **fedn studio login** - Log in to the FEDN Studio using a username, password, and host. Example: +- **scaleout client get-config** - Get the configuration of a specific client from Scaleout Edge, including the client's token and other details. Example: .. code-block:: bash + + scaleout client get-config --name test-client - fedn studio login -u username -P password -H studio_host -Project -------- +Combiner +-------- -The `fedn project` commands allow users to create, delete, list, and set the context for projects in the FEDN Studio. +The `scaleout combiner` commands allow users to start and manage combiners, which aggregate models from clients in the network. **Commands:** -- **fedn project create** - Create a new project in the FEDN Studio. Example: - -.. code-block:: bash - - fedn project create -n project_name -H studio_host - -- **fedn project delete** - Delete an existing project. Example: +- **scaleout combiner start** - Start a Scaleout Edge combiner using a specified configuration file. Example: .. code-block:: bash - fedn project delete -id project_id -H studio_host + scaleout combiner start --config combiner_config.yaml -- **fedn project list** - List all projects in the FEDN Studio. Example: +Controller +---------- -.. code-block:: bash +The `scaleout controller` commands allow users to start and manage the Scaleout Edge controller, which orchestrates the entire federated learning process. - fedn project list -H studio_host +**Commands:** -- **fedn project set-context** - Set the context for a specific project. Example: +- **scaleout controller start** - Start the Scaleout Edge controller using a specified configuration file. Example: .. code-block:: bash - fedn project set-context -id project_id -H studio_host + scaleout controller start --config controller_config.yaml Model ----- -The `fedn model` commands allow users to manage models in the FEDN Studio. +The `scaleout model` commands allow users to manage models in your Scaleout Edge project. **Commands:** -- **fedn model set-active** - Set a specific model as the active model for a project. Example: +- **scaleout model set-active** - Set a specific model as the active model for a project. Example: .. code-block:: bash - fedn model set-active -f model_file.npz -H studio_host + scaleout model set-active -f model_file.npz -H host -- **fedn model list** - List all models in the FEDN Studio. Example: +- **scaleout model list** - List all models in your Scaleout Edge project. Example: .. code-block:: bash - fedn model list -H studio_host + scaleout model list -H host Package ------- -The `fedn package` commands allow users to create and list packages in the FEDN Studio. +The `scaleout package` commands allow users to create and list packages in Scaleout Edge. 
**Commands:** -- **fedn package create** - Create a new package for a project. Example: +- **scaleout package create** - Create a new package. Example: .. code-block:: bash - fedn package create -n package_name -H studio_host + scaleout package create -n package_name -H host -- **fedn package list** - List all packages in the FEDN Studio. Example: +- **scaleout package list** - List all packages in your Scaleout Edge project. Example: .. code-block:: bash - fedn package list -H studio_host + scaleout package list -H host Session ------- -The `fedn session` commands allow users to start and list sessions in the FEDN Studio. +The `scaleout session` commands allow users to start and list sessions in Scaleout Edge. **Commands:** -- **fedn session start** - Start a new session for a project. Example: +- **scaleout session start** - Start a new session for a project. Example: .. code-block:: bash - fedn session start -n session_name -H studio_host + scaleout session start -n session_name -H host -- **fedn session list** - List all sessions in the FEDN Studio. Example: +- **scaleout session list** - List all sessions in your Scaleout Edge project. Example: .. code-block:: bash - fedn session list -H studio_host + scaleout session list -H host Validation ---------- -The `fedn validation` commands allow users to retrieve and list validation results. +The `scaleout validation` commands allow users to retrieve and list validation results. **Commands:** -- **fedn validation get** - Retrieve validation results for a specific round. Example: +- **scaleout validation get** - Retrieve validation results for a specific round. Example: .. code-block:: bash - fedn validation get -r round_number -H studio_host + scaleout validation get -r round_number -H host -- **fedn validation list** - List all validation results for a project. Example: +- **scaleout validation list** - List all validation results for a project. Example: .. code-block:: bash - fedn validation list -H studio_host + scaleout validation list -H host diff --git a/docs/conf.py b/docs/conf.py index f791a79e3..a3036bd0e 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -1,13 +1,16 @@ import os import sys -import sphinx_rtd_theme # noqa: F401 -# Insert path -sys.path.insert(0, os.path.abspath("..")) +# Insert paths +# sys.path.insert(0, os.path.abspath("..")) # repo root +sys.path.insert(0, os.path.abspath("../scaleout-core")) +sys.path.insert(0, os.path.abspath("../scaleout-core/scaleoutcore/network/api/v1")) +sys.path.insert(0, os.path.abspath("../scaleout-client-python")) +sys.path.insert(0, os.path.abspath("../scaleout-util")) # Project info -project = "FEDn" +project = "Scaleout Edge" author = "Scaleout Systems AB" # The full version, including alpha/beta/rc tags @@ -15,6 +18,7 @@ # Add any Sphinx extension module names here, as strings extensions = [ + "sphinx_click.ext", "sphinx.ext.autodoc", "sphinx.ext.napoleon", "sphinx.ext.doctest", @@ -28,9 +32,11 @@ "sphinx_copybutton", ] +autosummary_generate = True + # SEO configuration -html_title = "FEDn Documentation - Scalable Federated Learning Framework" -html_short_title = "FEDn Docs" +html_title = "Scaleout Edge Documentation - Scalable Federated Learning Framework" +html_short_title = "Scaleout Edge Docs" # The master toctree document. master_doc = "index" @@ -41,7 +47,7 @@ # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. # This pattern also affects html_static_path and html_extra_path. 
-exclude_patterns = [] +exclude_patterns = ["_build", ".venv", "venv", "Thumbs.db", ".DS_Store"] # The theme to use for HTML and HTML Help pages. html_theme = "sphinx_rtd_theme" @@ -55,6 +61,39 @@ html_use_index = True html_split_index = False +# mock imports +autodoc_mock_imports = [ + "click", + "psutil", + "grpc", + "flask", + "numpy", + "pymongo", + "jwt", + "pydantic", + "sqlalchemy", + "psycopg2", + "requests", + "boto3", + "minio", + "redis", + "yaml", + "werkzeug", + "fastapi", + "uvicorn", + "google", + "alembic", + "alembic.config", + "opentelemetry", + "opentelemetry.trace", + "opentelemetry.instrumentation", + "opentelemetry.sdk", + "scaleoututil.grpc.scaleout_pb2", + "scaleoututil.grpc.scaleout_pb2_grpc", + "scaleoutcore.network.grpc.server_pb2", + "scaleoutcore.network.grpc.server_pb2_grpc", +] + # Allow search engines to index the documentation # Remove any robots restrictions html_extra_path = ["robots.txt"] @@ -65,11 +104,11 @@ html_static_path = ["_static"] # Output file base name for HTML help builder. -htmlhelp_basename = "fedndocs" +htmlhelp_basename = "scaleoutdocs" # If defined shows an image instead of project name on page top-left (link to index page) html_logo = "_static/images/scaleout_logo_flat_dark.svg" -# FEDn logo looks ugly on rtd theme +# Scaleout Edge logo looks ugly on rtd theme html_favicon = "favicon.png" @@ -114,18 +153,18 @@ # (source start file, target name, title, # author, documentclass [howto, manual, or own class]). latex_documents = [ - (master_doc, "fedn.tex", "FEDn Documentation", "Scaleout Systems AB", "manual"), + (master_doc, "scaleout.tex", "Scaleout Edge Documentation", "Scaleout Systems AB", "manual"), ] # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [(master_doc, "fedn", "FEDn Documentation", [author], 1)] +man_pages = [(master_doc, "scaleout", "Scaleout Edge Documentation", [author], 1)] # Grouping the document tree into Texinfo files. List of tuples # (source start file, target name, title, author, # dir menu entry, description, category) texinfo_documents = [ - (master_doc, "fedn", "FEDn Documentation", author, "fedn", "One line description of project.", "Miscellaneous"), + (master_doc, "scaleout", "Scaleout Edge Documentation", author, "scaleout", "One line description of project.", "Miscellaneous"), ] # Bibliographic Dublin Core info. diff --git a/docs/developer.rst b/docs/developer.rst deleted file mode 100644 index 31aa191b6..000000000 --- a/docs/developer.rst +++ /dev/null @@ -1,294 +0,0 @@ -.. _developer-label: - -================ -Developer guide -================ - - -Pseudo-distributed sandbox -=========================== - -.. note:: - These instructions are for users wanting to set up a bare-minimum local deployment of FEDn (without FEDn Studio). - We here assume practical knowledge of Docker and docker-compose. We recommend all new users of FEDn to start - by taking the Getting Started tutorial: :ref:`quickstart-label` - -During development on FEDn, and when working on own extentions including aggregators and helpers, it is -useful to have a local development setup of the core FEDn server-side services (controller, combiner, database, object store). -We provide Dockerfiles and docker-compose template for an all-in-one local sandbox: - -.. code-block:: - - git clone https://github.com/scaleoutsystems/fedn.git - cd fedn - docker compose up -d - -This starts up local services for MongoDB, Minio, the API Server (Controller), one Combiner. 
-You can verify the deployment on localhost using these urls: - -- API Server: http://localhost:8092/get_controller_status -- Minio: http://localhost:9000 -- Mongo Express: http://localhost:8081 - -To run a client in this setup, you can use the CLI to connect to the API Server. - -.. code-block:: - - pip install -e . - cd examples/mnist-pytorch - fedn run client --api-url http://localhost:8092 --local-package - -The --local-package flag is used to indicate that the package is available locally in the current directory. -This will enable you to modify the machine learning scripts in the client folder while the client is running. -In otrher words, you don't need to rebuild and upload the compute package every time you make a change. -Obs that this feature is also available in FEDn Studio. - -You can also connect directly to the Combiner (gRPC) without using the API Server (REST-API): - -.. code-block:: - - fedn run client --combiner=localhost --combiner-port=12080 --local-package - -Observe that you need to create an initial model seed.npz in the current directory before starting any new session: - -.. code-block:: - - fedn run build --path client - -Please observe that this local sandbox deployment does not include any of the security and authentication features available in a Studio Project, -so we will not require authentication of clients (insecure mode) when using the APIClient: - -.. code-block:: - - from fedn import APIClient - client = APIClient(host="localhost", port=8092) - client.set_active_model("seed.npz") - client.start_session(rounds=10, timeout=60) - - - -Access message logs and validation data from MongoDB ------------------------------------------------------- -You can access and download event logs and validation data via the API, and you can also as a developer obtain -the MongoDB backend data using pymongo or via the MongoExpress interface: - -- http://localhost:8081/db/fedn-network/ - -Username and password are found in 'docker-compose.yaml'. - -Access global models ------------------------------------------------------- - -You can obtain global model updates from the 'fedn-models' bucket in Minio: - -- http://localhost:9000 - -Username and password are found in 'docker-compose.yaml'. - -Reset the FEDn deployment ------------------------------------------------------- - -To purge all data from a deployment incuding all session and round data, access the MongoExpress UI interface and -delete the entire ``fedn-network`` collection. Then restart all services. - -Clean up ------------------------------------------------------- -You can clean up by running - -.. code-block:: - - docker compose down -v - -Connecting clients using Docker: ------------------------------------------------------- - -If you like to run the client in docker as well we have added an extra docker-compose file in the examples folders for this purpose. -This will allow you to run the client in a separate container and connect to the API server using the service name `api-server`: - -.. code-block:: - - docker compose \ - -f ../../docker-compose.yaml \ - -f docker-compose.override.yaml \ - up - - - -Distributed deployment on a local network -========================================= - -You can use different hosts for the various FEDn services. These instructions shows how to set up FEDn on a **local network** using a single workstation or laptop as -the host for the servier-side components, and other hosts or devices as clients. - -.. 
note:: - For a secure and production-grade deployment solution over **public networks**, explore the FEDn Studio service at - **fedn.scaleoutsystems.com**. - - Alternatively follow this tutorial substituting the hosts local IP with your public IP, open the neccesary - ports (see which ports are used in docker-compose.yaml), and ensure you have taken additional neccesary security - precautions. - -**Prerequisites** -- `One host workstation and atleast one client device` -- `Python 3.9, 3.10, 3.11 or 3.12 `__ -- `Docker `__ -- `Docker Compose `__ - -Launch a distributed FEDn Network ---------------------------------- - -Start by noting your host's local IP address, used within your network. Discover it by running ifconfig on UNIX or -ipconfig on Windows, typically listed under inet for Unix and IPv4 for Windows. - -Continue by following the standard procedure to initiate a FEDn network, for example using the provided docker-compose template. -Once the network is active, upload your compute package and seed (for comprehensive details, see the quickstart tutorials). - -.. note:: - This guide covers general local networks where server and client may be on different hosts but able to communicate on their private IPs. - A common scenario is also to run fedn and the clients on **localhost** on a single machine. In that case, you can replace - by "127.0.0.1" below. - -Configuring and Attaching Clients ---------------------------------- - -On your client device, continue with initializing your client. To connect to the host machine we need to ensure we are -routing the correct DNS to our hosts local IP address. We can do this using the standard FEDn `client.yaml`: - -.. code-block:: - - network_id: fedn-network - discover_host: api-server - discover_port: 8092 - - -We can then run a client using docker by adding the hostname:ip mapping in the docker run command: - -.. code-block:: - - docker run \ - -v $PWD/client.yaml: \ - - —add-host=api-server: \ - —add-host=combiner: \ - client start -in client.yaml --name client1 - - -Alternatively updating the `/etc/hosts` file, appending the following lines for running naitively: - -.. code-block:: - - api-server - combiner - - -.. _auth-label: - -Authentication and Authorization (RBAC) -======================================== - -.. warning:: The FEDn RBAC system is an experimental feature and may change in the future. - -FEDn supports Role-Based Access Control (RBAC) for controlling access to the FEDn API and gRPC endpoints. The RBAC system is based on JSON Web Tokens (JWT) and is implemented using the `jwt` package. The JWT tokens are used to authenticate users and to control access to the FEDn API. -There are two types of JWT tokens used in the FEDn RBAC system: -- Access tokens: Used to authenticate access to the FEDn API. -- Refresh tokens: Used to obtain new access tokens when the old ones expire. - -.. note:: Please note that the FEDn RBAC system is not enabled by default and does not issue JWT tokens. It is used to integrate with external authentication and authorization systems such as FEDn Studio. - -FEDn RBAC system is by default configured with four types of roles: -- `admin`: Has full access to the FEDn API. This role is used to manage the FEDn network using the API client or the FEDn CLI. -- `combiner`: Has access to the /add_combiner endpoint in the API. -- `client`: Has access to the /add_client endpoint in the API and various gRPC endpoint to participate in federated learning sessions. 
- -A full list of the "roles to endpoint" mappings for gRPC can be found in the `fedn/network/grpc/auth.py`. For the API, the mappings are defined using custom decorators defined in `fedn/network/api/auth.py`. - -.. note:: The roles are handled by a custom claim in the JWT token called `role`. The claim is used to control access to the FEDn API and gRPC endpoints. - -To enable the FEDn RBAC system, you need to set the following environment variables in the controller and combiner: - -Authentication Environment Variables -------------------------------------- - -.. line-block:: - - **FEDN_JWT_SECRET_KEY** - - **Type:** str - - **Required:** yes - - **Default:** None - - **Description:** The secret key used for JWT token encryption. - - **FEDN_JWT_ALGORITHM** - - **Type:** str - - **Required:** no - - **Default:** "HS256" - - **Description:** The algorithm used for JWT token encryption. - - **FEDN_AUTH_SCHEME** - - **Type:** str - - **Required:** no - - **Default:** "Token" - - **Description:** The authentication scheme used in the FEDn API and gRPC interceptors. - -Additional Environment Variables --------------------------------- - -For further flexibility, you can also set the following environment variables: - -.. line-block:: - - **FEDN_CUSTOM_URL_PREFIX** - - **Type:** str - - **Required:** no - - **Default:** None - - **Description:** Add a custom URL prefix used in the FEDn API, such as /internal or /v1. - - **FEDN_AUTH_WHITELIST_URL** - - **Type:** str - - **Required:** no - - **Default:** None - - **Description:** A URL pattern to the API that should be excluded from the FEDn RBAC system. For example, /internal (to enable internal API calls). - - **FEDN_JWT_CUSTOM_CLAIM_KEY** - - **Type:** str - - **Required:** no - - **Default:** None - - **Description:** The custom claim key used in the JWT token. - - **FEDN_JWT_CUSTOM_CLAIM_VALUE** - - **Type:** str - - **Required:** no - - **Default:** None - - **Description:** The custom claim value used in the JWT token. - -Client Environment Variables ------------------------------ - -For the client, you need to set the following environment variables: - -.. line-block:: - - **FEDN_AUTH_REFRESH_TOKEN_URI** - - **Type:** str - - **Required:** no - - **Default:** None - - **Description:** The URI used to obtain new access tokens when the old ones expire. - - **FEDN_AUTH_REFRESH_TOKEN** - - **Type:** str - - **Required:** no - - **Default:** None - - **Description:** The refresh token used to obtain new access tokens when the old ones expire. - - **FEDN_AUTH_SCHEME** - - **Type:** str - - **Required:** no - - **Default:** "Token" - - **Description:** The authentication scheme used in the FEDn API and gRPC interceptors. - -You can use `--token` flags in the FEDn CLI to set the access token. - -.. meta:: - :description lang=en: - During development on FEDn, and when working on own extentions including aggregators and helpers, it is useful to have a local development setup. - :keywords: Federated Learning, Developer guide, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems - \ No newline at end of file diff --git a/docs/faq.rst b/docs/faq.rst index 0381aa1a3..9972c4972 100644 --- a/docs/faq.rst +++ b/docs/faq.rst @@ -12,8 +12,8 @@ However, during development of a new model it will be necessary to reinitialize. .. 
code:: python - >>> from fedn import APIClient - >>> client = APIClient(host="localhost", port=8092) + >>> from scaleout import APIClient + >>> client = APIClient(host="", token="", secure=True, verify=True) >>> client.set_package("package.tgz", helper="numpyhelper") >>> client.set_initial_model("seed.npz") @@ -26,47 +26,40 @@ Yes, to facilitate interactive development of the compute package you can start .. code-block:: bash - fedn client start --remote=False -in client.yaml + scaleout client start --local-package -Note that in production federations this options should in most cases be disallowed. +Note that in production federations the remote compute package option should in most cases be disallowed. -Q: How can other aggregation algorithms can be defined? -------------------------------------------------------- +Q: How can I define custom aggregation algorithms? +-------------------------------------------------- -There is a plugin interface for extending the framework with new aggregators. See +Scaleout Edge provides several built-in aggregators, but custom aggregation or +server-side behavior can be implemented through the **server functions** +interface. This allows you to override or extend the Combiner-level logic as +needed. -:ref:`agg-label` +See :ref:`agg-label` and :ref:`server-functions` for details. -Q: What is needed to include additional ML frameworks in FEDn? +Q: What is needed to include additional ML frameworks in Scaleout Edge? ------------------------------------------------------------------------------------- -You need to make sure that FEDn knows how to serialize and deserialize the model object. If you can +You need to make sure that Scaleout Edge knows how to serialize and deserialize the model object. If you can serialize to a list of numpy ndarrays in your compute package entrypoint (see the Quickstart Tutorial code), you can use the built in "numpyhelper". If this is not possible, you can extend the framework with a custom helper, see the section about model marshaling: :ref:`helper-label` -Q: Can I start a client listening only to training requests or only on validation requests?: --------------------------------------------------------------------------------------------- - -Yes! You can toggle which message streams a client subscribes to when starting the client. For example, to start a pure validation client: - -.. code-block:: bash - - fedn client start --trainer=False -in client.yaml - - Q: How do you approach the question of output privacy? ---------------------------------------------------------------------------------- We take security in (federated) machine learning seriously. Federated learning is a foundational technology that improves input privacy -in machine learning by allowing datasets to stay local and private, and not copied to a server. FEDn is designed to provide an industry grade +in machine learning by allowing datasets to stay local and private, and not copied to a server. Scaleout Edge is designed to provide an industry grade implementation of the core communication and aggregation layers of federated learning, as well as configurable modules for traceability, logging -etc, to allow the developer balance between privacy and auditability. With `FEDn Studio `__ we add -functionality for user authentication, authorization, and federated client identity management. As such, The FEDn Framework provides +etc, to allow the developer balance between privacy and auditability. 
With `Scaleout Edge `__ we add +functionality for user authentication, authorization, and federated client identity management. As such, The Scaleout Edge Framework provides a comprehensive software suite for implementing secure federated learning following industry best-practices. Going beyond input privacy, there are several additional considerations relating to output privacy and potential attacks on (federated) machine learning systems. @@ -90,5 +83,5 @@ with the Scaleout team. .. meta:: :description lang=en: How do you approach the question of output privacy? We take security in (federated) machine learning seriously. - :keywords: Federated Learning, FAQ, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems + :keywords: Federated Learning, FAQ, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems, Scaleout Edge diff --git a/docs/helpers.rst b/docs/helpers.rst index 1d718c576..9c6a88d2c 100644 --- a/docs/helpers.rst +++ b/docs/helpers.rst @@ -9,23 +9,20 @@ to/from disk, for example to transiently store updates during training rounds. Furthermore, aggregation algorithms need to perform a range of numerical operations on the model updates (addition, multiplication, etc). Since different ML frameworks (TF, Torch, etc) have different internal ways to represent model parameters, there is a need to inform the -framework how to handle models of a given type. In FEDn, this compatibility layer is the +framework how to handle models of a given type. In Scaleout Edge, this compatibility layer is the task of Helpers. -A helper is defined by the interface in :py:mod:`fedn.utils.helpers.helperbase.HelperBase`. +A helper is defined by the interface in :py:mod:`scaleout-util.scaleoututils.helpers.helperbase.HelperBase`. By implementing a helper plugin, a developer can extend the framework with support for new ML frameworks and numerical operations. -FEDn ships with a default helper implementation, ``numpyhelper``. +Scaleout Edge ships with a default helper implementation, ``numpyhelper``. This helper relies on the assumption that the model update is made up of parameters represented by a list of :py:class:`numpy.ndarray` arrays. Since most ML frameworks have good numpy support it should in most cases be sufficient to use this helper. Both TF/Keras and PyTorch models can be readily serialized in this way. -To add a helper plugin “myhelper” you implement the interface and place a -file called ‘myhelper.py’ in the folder fedn.utils.helpers.plugins. - -See the Keras and PyTorch quickstart examples and :py:mod:`fedn.utils.helpers.plugins.numpyhelper` +See the Keras and PyTorch quickstart examples and :py:mod:`scaleout-util.scaleoututil.helpers.plugins.numpyhelper` for further details. .. meta:: diff --git a/docs/img/FEDn_network.png b/docs/img/Scaleout_Edge_network.png similarity index 100% rename from docs/img/FEDn_network.png rename to docs/img/Scaleout_Edge_network.png diff --git a/docs/index.rst b/docs/index.rst index 397634145..295db9213 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -1,40 +1,44 @@ -Welcome to FEDn Documentation -============================= +Welcome to Scaleout Edge Documentation +====================================== -FEDn is an open-source framework for scalable federated learning. This documentation covers architecture, setup, deployment, API references, and troubleshooting guidance. 
Quickly locate configuration examples, technical concepts, or operational details you need to deploy federated models efficiently in production environments. +Scaleout Edge is a framework for scalable federated learning. This documentation covers architecture, setup, deployment, API references, and troubleshooting guidance. Quickly locate configuration examples, technical concepts, or operational details you need to deploy federated models efficiently in production environments. .. toctree:: :maxdepth: 1 - :caption: Getting Started + :caption: Get Started - introduction - My First FEDn Project + overview + My First Scaleout Edge Project .. toctree:: :maxdepth: 1 - :caption: Tutorials and Examples + :caption: Core Concepts - + introduction + Architecture Overview .. toctree:: :maxdepth: 1 - :caption: Developer Resources + :caption: Developers Guides - A Guide to the FEDn Project Structure - Local Development Guide - API Reference + A Guide to the Scaleout Edge Project Structure + cli + apiclient + localcompute .. toctree:: :maxdepth: 1 - :caption: Usage Guides + :caption: Advanced Configuration - Architecture Overview aggregators - cli - helpers - apiclient serverfunctions - localcompute + helpers + +.. toctree:: + :maxdepth: 1 + :caption: API Reference + + apiref/index .. toctree:: :maxdepth: 1 @@ -51,13 +55,13 @@ Indices and tables .. meta:: :description lang=en: - FEDn is a framework for scalable federated learning. Deploy secure, distributed machine learning models efficiently in production environments with comprehensive documentation, tutorials, and API references. - :keywords: Federated Learning, Machine Learning, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems, Distributed Learning, Privacy-Preserving ML + Scaleout Edge is a framework for scalable federated learning. Deploy secure, distributed machine learning models efficiently in production environments with comprehensive documentation, tutorials, and API references. + :keywords: Federated Learning, Machine Learning, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems, Distributed Learning, Privacy-Preserving ML, Scaleout Edge :robots: index, follow :author: Scaleout Systems AB - :og:title: FEDn Documentation - Scalable Federated Learning Framework - :og:description: Complete documentation for FEDn, the federated learning platform. Learn architecture, setup, deployment, and API usage. + :og:title: Scaleout Edge Documentation - Scalable Federated Learning Framework + :og:description: Complete documentation for Scaleout Edge, the federated learning platform. Learn architecture, setup, deployment, and API usage. :og:type: website :twitter:card: summary_large_image - :twitter:title: FEDn Documentation - Scalable Federated Learning - :twitter:description: Framework for scalable federated learning with comprehensive docs and examples. + :twitter:title: Scaleout Edge Documentation - Scalable Federated Learning + :twitter:description: Framework for scalable federated learning with comprehensive docs and examples. \ No newline at end of file diff --git a/docs/introduction.rst b/docs/introduction.rst index bdbb1d171..a90e77f9f 100644 --- a/docs/introduction.rst +++ b/docs/introduction.rst @@ -8,7 +8,7 @@ Traditional machine learning Traditional machine learning is centralized. Data from various sources is collected into a single location - typically a cloud platform or data center — and training models on that combined dataset. 
-This method works well in many cases, but it’s increasingly limited. The rapid growth of connected devices, sensors and distributed data sources has led to an exponential increase in data volume and complexity. Meanwhile, privacy regulations and security concerns make centralizing this data difficult and expensive. +This method works well in many cases, but it's increasingly limited. The rapid growth of connected devices, sensors and distributed data sources has led to an exponential increase in data volume and complexity. Meanwhile, privacy regulations and security concerns make centralizing this data difficult and expensive. Often, the data needed for training exists across many devices, organizations, or locations. Centralizing it is challenging due to privacy risks and high transfer costs. @@ -20,18 +20,18 @@ How federated learning works In federated learning, models are trained across multiple devices or servers (called client nodes) without moving the data. Here's how it works: 1. **Initialize the global model -** A central server starts with an initial global model—like a neural network or decision tree. -2. **Sending to clients -** The model's parameters are sent to selected clients. Each client keeps its local dataset private. +2. **Model retrieval -** Selected clients download the current model parameters from the server. Their local datasets remain private. 3. **Local training -** Each client updates the model using its local data. This training is repeated in several rounds — not to completion. 4. **Combining the updates -** The updated models from each client are sent back to the central server, where they are combined. This cycle repeats until the global model reaches the desired accuracy. -The FEDn framework --------------------- +The Scaleout Edge framework +--------------------------- -FEDn is a federated learning framework focused on security, scalability, and ease of use. It supports the full development lifecycle—from early experiments to production deployments—with minimal code changes. Key design goals include: +Scaleout Edge focuses on security, scalability, and ease of use. It supports the full development lifecycle—from early experiments to production deployments—with minimal code changes. Key design goals include: -- **Minimal server-side complexity for the end-user**. FEDn Studio handles orchestration, providing a UI, REST API, and Python interface for managing experiments and tracking metrics in real time. +- **Minimal server-side complexity for the end-user**. Scaleout Edge handles orchestration, providing a UI, REST API, and Python interface for managing experiments and tracking metrics in real time. - **Secure by design.** Clients never need to open inbound ports. gRPC, token-based authentication (JWT) and RBAC provides flexible and secure integration. @@ -39,7 +39,7 @@ FEDn is a federated learning framework focused on security, scalability, and eas - **Cloud native.** Deploy on public cloud, private cloud, or on-prem infrastructure. -- **Scalability and resilience.** Multiple combiners can balance load. FEDn handles failures in all critical components and manages intermittent client-connections. +- **Scalability and resilience.** Multiple combiners can balance load. Scaleout Edge handles failures in all critical components and manages intermittent client-connections. - **Developer and DevOps friendly.** Logging, tracing, and plugin architecture simplify monitoring, debugging, and extending the system. 
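To make the round structure described above under *How federated learning works* concrete, here is a minimal, framework-agnostic sketch of federated averaging (FedAvg). All names are hypothetical and the local "training" step is a stand-in; this illustrates the four steps only and is not the Scaleout Edge API.

.. code-block:: python

    import numpy as np

    def local_update(global_params, local_data, epochs=1, lr=0.1):
        # Steps 2-3: the client starts from the current global model and trains on
        # its private data (here: nudging parameters toward the local data mean).
        params = global_params.copy()
        for _ in range(epochs):
            params -= lr * (params - local_data.mean(axis=0))
        return params, len(local_data)

    def federated_round(global_params, client_datasets):
        # Step 4: the server combines client updates with an average weighted by
        # the number of local examples.
        updates = [local_update(global_params, data) for data in client_datasets]
        total_examples = sum(n for _, n in updates)
        return sum(params * (n / total_examples) for params, n in updates)

    rng = np.random.default_rng(42)
    global_params = np.zeros(3)  # Step 1: initialize the global model.
    clients = [rng.normal(loc=i, size=(100, 3)) for i in range(3)]  # private datasets
    for _ in range(5):  # repeat rounds until the model is good enough
        global_params = federated_round(global_params, clients)
    print(global_params)  # drifts toward the example-weighted mean of the client data

In a real deployment the local update is an actual training run (for example a few epochs of SGD), and only the resulting parameters, never the raw data, are sent back for aggregation.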
@@ -57,29 +57,65 @@ Federated learning: - No inbound ports required on client devices -From development to FL in production: - -- Secure deployment of server-side / control-plane on Kubernetes. -- UI with dashboards for orchestrating FL experiments and for visualizing results -- Team features - collaborate with other users in shared project workspaces. -- Features for the trusted-third party: Manage access to the FL network, FL clients and training progress. -- REST API for handling experiments/jobs. -- View and export logging and tracing information. -- Public cloud, dedicated cloud and on-premise deployment options. +From development to FL in production +------------------------------------ + +Scaleout Edge provides a complete operational toolkit for moving federated +learning from early prototypes to production deployments. The platform’s +capabilities can be grouped into the following categories: + +ModelOps / FL Ops +~~~~~~~~~~~~~~~~~ +- UI and dashboards for orchestrating FL experiments and monitoring training + progress. +- REST API for managing experiments and jobs. +- Support for multi-round orchestration and model lifecycle management. +- Plug-in architecture for extending aggregators, storage backends, load + balancers, and orchestration components. + +Observability & Telemetry +~~~~~~~~~~~~~~~~~~~~~~~~~ +- Built-in logging, tracing, and experiment metrics. +- Export and integration options for external observability systems. +- Visual dashboards showing experiment status, model performance, client + activity, and system health. + +Security & Trust +~~~~~~~~~~~~~~~~ +- Secure, cloud-native control plane deployed on Kubernetes. +- Token-based authentication (JWT) and role-based access control (RBAC). +- Outbound-only connectivity for clients (no inbound ports required). +- Trusted third-party features: manage access to the FL network, clients, + and training progress. + +Collaboration & Governance +~~~~~~~~~~~~~~~~~~~~~~~~~~ +- Shared project workspaces for collaborative experimentation. +- User and role management for multi-team or multi-organization setups. +- Clear separation of responsibilities between data owners, model owners, + and infrastructure operators. + +Deployment & Infrastructure +~~~~~~~~~~~~~~~~~~~~~~~~~~~ +- Flexible deployment options: public cloud, private cloud, dedicated cloud, + or fully on-premise. +- Horizontal scalability through multiple combiners. +- Resilience to intermittent client availability and failures across + critical components. Available client APIs: -- Python client (`FEDn C++ client `__) -- C++ client (`FEDn C++ client `__) -- Android Kotlin client (`FEDn Kotlin client `__) +- Python client (`Scaleout Edge C++ client `__) +- C++ client (`Scaleout Edge C++ client `__) +- Android Kotlin client (`Scaleout Edge Kotlin client `__) Support -------- -Community support in available in our `Discord +Community support is available in our `Discord server `__. -For professionals / Enteprise, we offer `Dedicated support `__. +For professionals / Enterprise, we offer `Dedicated support `__. .. meta:: :description lang=en: @@ -87,6 +123,5 @@ For professionals / Enteprise, we offer `Dedicated support `_ -to get started. +**Local compute** is the alternative execution mode where the client **decides its +own training code locally**, without downloading or executing server-provided +packages. 
Instead, the code remains fully controlled by the client owner, and +the federated workflow (training rounds, orchestration, and model exchange) +still operates through the Scaleout Edge network. +Local compute is enabled using the ``--local-package`` flag when starting a +client, for example: + +.. code-block:: bash + + scaleout client start --init client_config.yaml --local-package -Steps involved --------------- -- Create a seed model and upload it to FEDn Studio -- In your script, define all code needed for client-side training (and validation) -- Connect the client to the FEDn project by running it locally in your script -- Start a training session through FEDn Studio +Advantages of local compute include: +- Full control and auditability of executed code +- Improved security posture for production or regulated environments +- No need to package, upload, or manage compute code on the server +- Easy to prototype in notebooks or standalone Python files (e.g. Google Colab) +Follow the tutorial in `this notebook `_ +to get started. + +Steps involved +-------------- +1. Create and upload a seed model to Scaleout Edge +2. Define all client-side training (and optional validation) logic locally +3. Start the client with ``--local-package`` and connect it to your project +4. Start a training session in Scaleout Edge and let your local code run diff --git a/docs/overview.rst b/docs/overview.rst new file mode 100644 index 000000000..c7e937f70 --- /dev/null +++ b/docs/overview.rst @@ -0,0 +1,69 @@ +.. _scaleout_edge_overview: + +Scaleout Edge Overview +====================== + +Scaleout Edge is a platform for **distributed MLOps and DataOps**. It enables organizations to train, deploy, and govern machine learning models across decentralized infrastructure—from on-premise data centers to edge devices—without ever moving the raw training data. + +Traditional machine learning pipelines are centralized: data is collected, moved to a central lake, and processed on a cluster. Scaleout Edge **inverts this workflow**. It allows you to **bring the model to the data**. + +By managing a "Global Model" that travels to your devices, learns from local data, and sends back only mathematical updates, Scaleout Edge solves the fundamental challenges of data gravity, privacy, and network constraints. + + +The Scaleout Platform +--------------------- + +Scaleout Edge provides the orchestration, security, and aggregation layers needed to run distributed AI at scale. It is designed to manage the full lifecycle of a decentralized project through three core functions: + +* **Orchestrate:** Coordinate thousands of devices to participate in training rounds automatically. +* **Aggregate:** Securely combine model updates into a global model using hierarchical aggregation. +* **Govern:** Track lineage, versioning, and security across the entire network. + + +What Can I Use Scaleout Edge For? +--------------------------------- + +Scaleout Edge addresses critical modern ML deployment challenges: + +* **Data Sovereignty and Privacy** + Train models on sensitive data (healthcare records, financial transactions, proprietary IP) that strictly cannot leave the device or premise due to **GDPR**, **HIPAA**, or internal compliance. The raw data never crosses the network; only the model weights do. + +* **Bandwidth-Efficient Operations** + In edge environments (factories, satellites, mobile fleets), uploading terabytes of raw video or sensor data is cost-prohibitive or technically impossible. 
Scaleout Edge processes data locally and transmits only small model updates, **reducing network load by orders of magnitude**. + +* **Resilient, Continuous Learning** + Models degrade over time. Instead of manually collecting new datasets to retrain, Scaleout Edge enables a **continuous loop** where devices constantly refine the model based on fresh, real-world data they encounter. + + +Scaleout Architecture +--------------------- + +Scaleout Edge uses a unique **three-tier architecture** designed for scalability and resilience in unstable network conditions. Unlike simple client-server setups, Scaleout introduces an aggregation layer to handle the complexity of the edge. + +The architecture consists of: + +* **The Controller (The Brain)** + The Controller is the central management service. It manages the **Global Model**, coordinates training rounds, and handles authentication. It acts as the **control plane** for the entire network. You interact with the Controller via the Web UI, CLI, or API. + +* **The Combiner (The Aggregator)** + The Combiner is the **scalability engine** of the platform. It sits between the Controller and the Clients. Combiners can be deployed in the cloud or on edge gateways (near-edge). Their job is to receive model updates from devices, **aggregate them**, and send a single update up the stack. This hierarchical approach allows the system to scale to thousands of clients without bottling-necking the central server. + +* **The Client (The Worker)** + The Client acts as the interface between the Scaleout platform and your local data. It runs on the edge device (IoT device, server, laptop). The Client executes the training code locally, manages on-device data access, and communicates with the Combiner. + + +Key Concepts +------------ + +* **The Project:** A Project is the workspace for a specific machine learning objective. It defines the network of clients, the machine learning framework being used, and the configuration for how training should proceed. + +* **The Compute Package:** To train a model, you upload a Compute Package. This is a code bundle (typically Python) that contains your model definition and training logic. Scaleout Edge distributes this package to selected clients automatically at the start of a session. + +* **The Round:** Training happens in rounds. In a single round: + 1. The Controller instructs clients to train. + 2. Clients download the latest Global Model and the Compute Package. + 3. Clients train on their local data and upload a model update. + 4. Combiners aggregate these updates. + 5. A new Global Model is committed. + +* **The Global Model:** The Global Model is the shared intelligence of the network. It is the result of aggregating updates from all participating clients. It serves as the "**master**" version that is versioned, tracked, and eventually deployed for inference. diff --git a/docs/projects.rst b/docs/projects.rst index 01a480b6e..e8d666080 100644 --- a/docs/projects.rst +++ b/docs/projects.rst @@ -1,31 +1,31 @@ .. _projects-label: ================================================ -Develop a FEDn project +Develop a Scaleout Edge project ================================================ -This guide explains how a FEDn project is structured, and details how to develop your own -project. We assume knowledge of how to run a federated learning project with FEDn, corresponding to +This guide explains how a Scaleout Edge project is structured, and details how to develop your own +project. 
We assume knowledge of how to run a federated learning project with Scaleout Edge, corresponding to the tutorial: :ref:`quickstart-label`. Overview ========== -A FEDn project is a convention for packaging/wrapping machine learning code to be used for federated learning with FEDn. At the core, -a project is a directory of files (often a Git repository), containing your machine learning code, FEDn entry points, and a specification -of the runtime environment for the client (python environment or a Docker image). The FEDn API and command-line tools provide functionality +A Scaleout Edge project is a convention for packaging/wrapping machine learning code to be used for federated learning with Scaleout Edge. At the core, +a project is a directory of files (often a Git repository), containing your machine learning code, Scaleout Edge entry points, and a specification +of the runtime environment for the client (python environment or a Docker image). The Scaleout Edge API and command-line tools provide functionality to help a user automate deployment and management of a project that follows the conventions. -The structure of a FEDn project -================================ +The structure of a Scaleout Edge project +======================================== We recommend that projects have the following folder and file structure, here illustrated by the 'mnist-pytorch' example from the Getting Started Guide: | project | ├ client -| │ ├ fedn.yaml +| │ ├ scaleout.yaml | │ ├ python_env.yaml | │ ├ model.py | │ ├ data.py @@ -40,24 +40,24 @@ the Getting Started Guide: | The content of the ``client`` folder is what we commonly refer to as the *compute package*. It contains modules and files specifying the logic of a single client. -The file ``fedn.yaml`` is the FEDn Project File. It is used by FEDn to get information about the specific commands to run when building the initial 'seed model', +The file ``scaleout.yaml`` is the Scaleout Edge Project File. It is used by Scaleout Edge to get information about the specific commands to run when building the initial 'seed model', and when a client recieves a training request or a validation request from the server. These commmands are referred to as the ``entry points``. The compute package (client folder) ==================================== -**The Project File (fedn.yaml)** +**The Project File (scaleout.yaml)** -FEDn uses a project file 'fedn.yaml' to specify which entry points to execute when the client recieves a training or validation request, +Scaleout Edge uses a project file 'scaleout.yaml' to specify which entry points to execute when the client recieves a training or validation request, and (optionally) what runtime environment to execute those entry points in. There are up to four entry points: - **build** - used for any kind of setup that needs to be done before the client starts up, such as initializing the global seed model. - **startup** - invoked immediately after the client starts up and the environment has been initalized. -- **train** - invoked by the FEDn client to perform a model update. -- **validate** - invoked by the FEDn client to perform a model validation. +- **train** - invoked by the Scaleout Edge client to perform a model update. +- **validate** - invoked by the Scaleout Edge client to perform a model validation. 
-To illustrate this, we look the ``fedn.yaml`` from the 'mnist-pytorch' project used in the Getting Started Guide: +To illustrate this, we look the ``scaleout.yaml`` from the 'mnist-pytorch' project used in the Getting Started Guide: .. code-block:: yaml @@ -74,21 +74,21 @@ To illustrate this, we look the ``fedn.yaml`` from the 'mnist-pytorch' project u command: python validate.py In this example, all entrypoints are python scripts (model.py, data.py, train.py and validate.py). -They are executed by FEDn using the system default python interpreter 'python', in an environment with dependencies specified by "python_env.yaml". +They are executed by Scaleout Edge using the system default python interpreter 'python', in an environment with dependencies specified by "python_env.yaml". Next, we look at the environment specification and each entry point in more detail. **Environment (python_env.yaml)** -FEDn assumes that all entry points (build, startup, train, validate) are executable within the client's runtime environment. You have two main options +Scaleout Edge assumes that all entry points (build, startup, train, validate) are executable within the client's runtime environment. You have two main options to handle the environment: - 1. Let FEDn create and initalize the environment automatically by specifying ``python_env``. FEDn will then create an isolated virtual environment and install the dependencies specified in ``python_env.yaml`` into it before starting up the client. FEDn currently supports Virtualenv environments, with packages on PyPI. - 2. Manage the environment manually. Here you have several options, such as managing your own virtualenv, running in a Docker container, etc. Remove the ``python_env`` tag from ``fedn.yaml`` to handle the environment manually. + 1. Let Scaleout Edge create and initalize the environment automatically by specifying ``python_env``. Scaleout Edge will then create an isolated virtual environment and install the dependencies specified in ``python_env.yaml`` into it before starting up the client. Scaleout Edge currently supports Virtualenv environments, with packages on PyPI. + 2. Manage the environment manually. Here you have several options, such as managing your own virtualenv, running in a Docker container, etc. Remove the ``python_env`` tag from ``scaleout.yaml`` to handle the environment manually. **build (optional):** -This entry point is used for any kind of setup that **needs to be done to initialize FEDn prior to federated training**. +This entry point is used for any kind of setup that **needs to be done to initialize Scaleout Edge prior to federated training**. This is the only entrypoint not used by the client during global training rounds - rather it is used by the project initator. Most often it is used to build the seed model. @@ -102,7 +102,7 @@ that instantiates a model object (with random weights), exctracts its parameters import torch - from fedn.utils.helpers.helpers import get_helper + from scaleoututil.helpers.helpers import get_helper HELPER_MODULE = "numpyhelper" helper = get_helper(HELPER_MODULE) @@ -188,14 +188,14 @@ a publicly available dataset. However, in real-world settings with truly private **train (mandatory):** This entry point is invoked when the client recieves a new model update (training) request from the server. The training entry point must be a single-input single-output (SISO) program. 
-Upon recipt of a traing request, the FEDn client will download the latest version of the global model, write it to a (temporary) file and execute the command specified in the entrypoint: +Upon recipt of a traing request, the Scaleout Edge client will download the latest version of the global model, write it to a (temporary) file and execute the command specified in the entrypoint: .. code-block:: python python train.py model_in model_out -where 'model_in' is the **file** containing the current global model (parameters) to be updated, and 'model_out' is a **path** to write the new model update to (FEDn substitutes this path for tempfile location). -When a traing update is complete, FEDn reads the updated paramters from 'model_out' and streams them back to the server for aggregation. +where 'model_in' is the **file** containing the current global model (parameters) to be updated, and 'model_out' is a **path** to write the new model update to (Scaleout Edge substitutes this path for tempfile location). +When a traing update is complete, Scaleout Edge reads the updated paramters from 'model_out' and streams them back to the server for aggregation. .. note:: The training entrypoint must also write metadata to a json-file. The entry ``num_example`` is mandatory - it is used by the aggregators to compute a weighted average. The user can in addition choose to log other variables such as hyperparamters. These will then be stored in the backend database and accessible via the API and UI. @@ -203,7 +203,7 @@ When a traing update is complete, FEDn reads the updated paramters from 'model_o In our 'mnist-pytorch' example, upon startup a client downloads the MNIST image dataset and creates partitions (one for each client). This partition is in turn divided into a train/test split. The file 'train.py' (shown below) reads the train split, runs an epoch of training and writes the updated paramters to file. -To learn more about how model serialization and model marshalling works in FEDn, see :ref:`helper-label` and :ref:`agg-label`. +To learn more about how model serialization and model marshalling works in Scaleout Edge, see :ref:`helper-label` and :ref:`agg-label`. .. code-block:: python @@ -215,7 +215,7 @@ To learn more about how model serialization and model marshalling works in FEDn, from model import load_parameters, save_parameters from data import load_data - from fedn.utils.helpers.helpers import save_metadata + from scaleoututil.helpers.helpers import save_metadata dir_path = os.path.dirname(os.path.realpath(__file__)) sys.path.append(os.path.abspath(dir_path)) @@ -224,9 +224,9 @@ To learn more about how model serialization and model marshalling works in FEDn, def train(in_model_path, out_model_path, data_path=None, batch_size=32, epochs=1, lr=0.01): """Complete a model update. - Load model paramters from in_model_path (managed by the FEDn client), + Load model paramters from in_model_path (managed by the Scaleout Edge client), perform a model update, and write updated paramters - to out_model_path (picked up by the FEDn client). + to out_model_path (picked up by the Scaleout Edge client). :param in_model_path: The path to the input model. 
:type in_model_path: str @@ -288,7 +288,7 @@ To learn more about how model serialization and model marshalling works in FEDn, **validate (optional):** -When training a global model with FEDn, the data scientist can choose to ask clients to perform local model validation of each new global model version +When training a global model with Scaleout Edge, the data scientist can choose to ask clients to perform local model validation of each new global model version by specifying an entry point called 'validate'. Similar to the training entrypoint, the validation entry point must be a SISO program. It should reads a model update from file, validate it (in any way suitable to the user), and write a **json file** containing validation data: @@ -297,13 +297,13 @@ Similar to the training entrypoint, the validation entry point must be a SISO pr python validate.py model_in validations.json -The content of the file 'validations.json' is captured by FEDn, passed on to the server and then stored in the database backend. The validate entry point is optional. +The content of the file 'validations.json' is captured by Scaleout Edge, passed on to the server and then stored in the database backend. The validate entry point is optional. In our 'mnist-pytorch' example, upon startup a client downloads the MNIST image dataset and creates partitions (one for each client). This partition is in turn divided into a train/test split. The file 'validate.py' (shown below) reads both the train and test splits and computes accuracy scores and the loss. -It is a requirement that the output of validate.py is valid json. Furthermore, the FEDn Studio UI will be able to capture and visualize all **scalar metrics** -specified in this file. The entire conent of the json file will be retrievable programatically using the FEDn APIClient, and can be downloaded from the Studio UI. +It is a requirement that the output of validate.py is valid json. Furthermore, the Scaleout Edge UI will be able to capture and visualize all **scalar metrics** +specified in this file. The entire conent of the json file will be retrievable programatically using the Scaleout Edge APIClient, and can be downloaded from the Scaleout Edge UI. .. code-block:: python @@ -314,7 +314,7 @@ specified in this file. The entire conent of the json file will be retrievable p from model import load_parameters from data import load_data - from fedn.utils.helpers.helpers import save_metrics + from scaleoututil.helpers.helpers import save_metrics dir_path = os.path.dirname(os.path.realpath(__file__)) sys.path.append(os.path.abspath(dir_path)) @@ -366,18 +366,18 @@ specified in this file. The entire conent of the json file will be retrievable p Testing the entrypoints ======================= -We recommend you to test your training and validation entry points locally before creating the compute package and uploading it to Studio. +We recommend you to test your training and validation entry points locally before creating the compute package and uploading it to the Scaleout Edge server. To run the 'build' entrypoint and create the seed model (deafult filename 'seed.npz'): .. code-block:: python - fedn run build --path client + scaleout run build --path client Run the 'startup' entrypoint to download the dataset: .. 
code-block:: python - fedn run startup --path client + scaleout run startup --path client Then, standing inside the 'client folder', you can test *train* and *validate* by: @@ -388,35 +388,35 @@ Then, standing inside the 'client folder', you can test *train* and *validate* b You can also test *train* and *validate* entrypoint using CLI command: -.. note:: Before running the fedn run train or fedn run validate commands, make sure to download the training and test data. The downloads are usually handled by the "fedn run startup" command in the examples provided by FEDn. +.. note:: Before running the scaleout run train or scaleout run validate commands, make sure to download the training and test data. The downloads are usually handled by the "scaleout run startup" command in the examples provided by Scaleout Edge. .. code-block:: bash - fedn run train --path client --input --output - fedn run validate --path client --input --output + scaleout run train --path client --input --output + scaleout run validate --path client --input --output -Packaging for training on FEDn -=============================== +Packaging for training on Scaleout Edge +======================================= -To run a project on FEDn we compress the entire client folder as a .tgz file. There is a utility command in the FEDn CLI to do this: +To run a project on Scaleout Edge we compress the entire client folder as a .tgz file. There is a utility command in the Scaleout Edge CLI to do this: .. code-block:: bash - fedn package create --path client + scaleout package create --path client You can include a .ignore file in the client folder to exclude files from the package. This is useful for excluding large data files, temporary files, etc. -To learn how to initialize FEDn with the package seed model, see :ref:`quickstart-label`. +To learn how to initialize Scaleout Edge with the package seed model, see :ref:`quickstart-label`. -How is FEDn using the project? -=============================== +How is Scaleout Edge using the project? +======================================= -With an understanding of the FEDn project, the compute package (entrypoints), we can take a closer look at how FEDn +With an understanding of the Scaleout Edge project, the compute package (entrypoints), we can take a closer look at how Scaleout Edge is using the project during federated training. The figure below shows the logical view of how a training request is handled. A training round is initiated by the controller. It asks a Combiner for a model update. The model in turn asks clients to compute a model update, by publishing a training request -to its request stream. The FEDn Client, :py:mod:`fedn.network.client`, subscribes to the stream and picks up the request. It then calls upon the Dispatcher, :py:mod:`fedn.utils.Dispatcher`. -The dispatcher reads the Project File, 'fedn.yaml', looking up the entry point definition and executes that command. Upon successful execution, the FEDn Client reads the +to its request stream. The Scaleout Edge Client, :py:mod:`scaleout-client.python.client.scaleout_client`, subscribes to the stream and picks up the request. It then calls upon the Dispatcher, :py:mod:`scaleout-util.scaleoututil.utils.dispatcher.Dispatcher`. +The dispatcher reads the Project File, 'scaleout.yaml', looking up the entry point definition and executes that command. Upon successful execution, the Scaleout Edge Client reads the model update and metadata from file, and streams the content back to the combiner for aggregration. .. 
image:: img/ComputePackageOverview.png @@ -428,19 +428,19 @@ model update and metadata from file, and streams the content back to the combine Where to go from here? ====================== -With an understanding of how FEDn Projects are structured and created, you can explore our library of example projects. They demonstrate different use case scenarios of FEDn +With an understanding of how Scaleout Edge Projects are structured and created, you can explore our library of example projects. They demonstrate different use case scenarios of Scaleout Edge and its integration with popular machine learning frameworks like PyTorch and TensorFlow. -- `FEDn + PyTorch `__ -- `FEDn + Tensforflow/Keras `__ -- `FEDn + MONAI `__ -- `FEDn + Hugging Face `__ -- `FEDn + Flower `__ -- `FEDN + Self-supervised learning `__ +- `Scaleout Edge + PyTorch `__ +- `Scaleout Edge + Tensforflow/Keras `__ +- `Scaleout Edge + MONAI `__ +- `Scaleout Edge + Hugging Face `__ +- `Scaleout Edge + Flower `__ +- `Scaleout Edge + Self-supervised learning `__ .. meta:: :description lang=en: - A FEDn project is a convention for packaging/wrapping machine learning code to be used for federated learning with FEDn. - :keywords: Federated Learning, Machine Learning, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems + A Scaleout Edge project is a convention for packaging/wrapping machine learning code to be used for federated learning with Scaleout Edge. + :keywords: Federated Learning, Machine Learning, Federated Learning Framework, Federated Learning Platform, FEDn, Scaleout Systems, Scaleout Edge \ No newline at end of file diff --git a/docs/quickstart.rst b/docs/quickstart.rst index b57d95524..57a2fb269 100644 --- a/docs/quickstart.rst +++ b/docs/quickstart.rst @@ -1,40 +1,34 @@ .. _quickstart-label: -Getting started with FEDn -========================= +Getting started with Scaleout Edge +================================== .. note:: - This tutorial is a quickstart guide to FEDn based on a pre-made FEDn Project. It is designed to serve as a starting point for new developers. - To learn how to develop your own project from scratch, see :ref:`projects-label`. - + This quickstart guide will help you get started with the Scaleout Edge platform using an existing or pre-provisioned deployment. + If you don't yet have access to a project, follow the steps below to request one. + To learn how to develop and configure your own project from scratch, see :ref:`projects-label`. + **Prerequisites** - `Python >=3.9, <=3.12 `__ -1. Set up project +1. Get a project ----------------- -#. Create a FEDn account. Sign up at `fedn.scaleoutsystems.com/signup `_. -#. Verify your email. Check your inbox for a verification email and click the link to activate your account. -#. Log in and create a project. Once your account is activated, log in to the Studio and create a new project. -#. Manage your projects. If you have multiple projects, you can view and manage them here: `fedn.scaleoutsystems.com/projects `_. - -.. tip:: - - You can also create a project using our CLI tool. Run the following command: - For more details, see :doc:`cli`. - - .. code-block:: bash - - fedn project create --name "My Project" - - Replace `"My Project"` with your desired project name. +Before you can start using Scaleout Edge, you’ll need access to a project. +Projects define the environment where your federated applications run and are hosted and managed by Scaleout Systems, unless you choose an on-prem deployment. 
You can request a new project through our online request form. +#. **Request a project.** Fill out the form at `scaleoutsystems.com/request-project `_ to tell us about your use case and preferred deployment option. +#. **Choose your deployment.** We offer several hosting options: + - **Academic (free)** — for research and educational collaborations, hosted by Scaleout Systems. + - **Enterprise** — for organizations that require dedicated infrastructure. Enterprise projects can be **on-prem** (self-hosted) or **fully managed** by Scaleout Systems. +#. **Wait for confirmation.** Our team will review your request and contact you with details on setup and access. +#. **Start collaborating.** Once your project is approved and provisioned, you’ll receive credentials and connection details to begin integrating your clients and nodes. 1.5 Set up a Virtual environment (Recommended) ---------------------------------------------- -Before installing FEDn using pip, we recommend creating a virtual environment. This helps isolate dependencies and avoids conflicts with other Python projects on your machine. +Before installing Scaleout Edge using pip, we recommend creating a virtual environment. This helps isolate dependencies and avoids conflicts with other Python projects on your machine. You can set up and activate a virtual environment using the following steps: @@ -45,21 +39,21 @@ You can set up and activate a virtual environment using the following steps: .. code-tab:: bash :caption: Unix/MacOS - python3 -m venv fedn_env - source fedn_env/bin/activate + python3 -m venv scaleout_env + source scaleout_env/bin/activate .. code-tab:: bash :caption: Windows (PowerShell) - python -m venv fedn_env + python -m venv scaleout_env Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser - fedn_env\Scripts\Activate.ps1 + scaleout_env\Scripts\Activate.ps1 .. code-tab:: bash :caption: Windows (CMD.exe) - python -m venv fedn_env - fedn_env\Scripts\activate.bat + python -m venv scaleout_env + scaleout_env\Scripts\activate.bat For additional information visit the `Python venv documentation `_. @@ -70,26 +64,28 @@ After activating the virtual environment, you can proceed with the next steps. --------------------------------------------------- Next, we will prepare and package the ML code to be executed by each client and create a first version of the global model (seed model). -We will work with one of the pre-defined projects in the FEDn repository, ``mnist-pytorch``. +We will work with one of the pre-defined projects in the Scaleout client repository, ``mnist-pytorch``. -First install the FEDn API on your local machine (client): +First install the Scaleout Edge API on your local machine (client): **Using pip** -On you local machine/client, install the FEDn package using pip: +On you local machine/client, install the Scaleout Edge package using pip: .. code-block:: bash - pip install fedn + pip install scaleout **From source** -Clone the FEDn repository and install the package: +Clone the Scaleout Client repository and install the util package followd by the client package: .. code-block:: bash - git clone https://github.com/scaleoutsystems/fedn.git - cd fedn + git clone https://github.com/scaleoutsystems/scaleout-client.git + cd scaleout/scaleout-util + pip install . + cd ../scaleout-client pip install . 
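If you want to sanity-check the installation before moving on, the following uses only standard ``pip`` and Python; ``scaleout`` is the package name from the install commands above.

.. code-block:: bash

    # Show metadata for the installed package (fails if it is not installed).
    pip show scaleout

    # Minimal import check in the active environment; silent on success.
    python -c "import scaleout"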
@@ -97,45 +93,47 @@ Clone the FEDn repository and install the package: **Create the compute package and seed model** -In order to train a federated model using FEDn, your Studio project needs to be initialized with a ``compute package`` and a ``seed model``. The compute package is a code bundle containing the +In order to train a federated model using Scaleout Edge, your project needs to be initialized with a ``compute package`` and a ``seed model``. The compute package is a code bundle containing the code used by the client to execute local training and local validation. The seed model is a first version of the global model. For a detailed explaination of the compute package and seed model, see this guide: :ref:`projects-label` -To work through this quick start you need a local copy of the ``mnist-pytorch`` example project contained in the main FEDn Git repository. +To work through this quick start you need a local copy of the ``mnist-pytorch`` example project contained in the main Scaleout Edge Git repository. Clone the repository using the following command, if you didn't already do it in the previous step: .. code-block:: bash - git clone https://github.com/scaleoutsystems/fedn.git + git clone https://github.com/scaleoutsystems/scaleout-client.git -Navigate to the ``fedn/examples/mnist-pytorch`` folder. The compute package is located in the folder ``client``. +Navigate to the ``scaleout-client/python/examples/mnist-pytorch`` folder. The compute package is located in the folder ``client``. Create a compute package: .. code-block:: - fedn package create --path client + scaleout package create --path client This will create a file called ``package.tgz`` in the root of the project. -Next, create the seed model: +Next, create the seed model. For this to work we need to install the dependencies required by the client code. These dependencies are listed in ``python_env.yaml`` located in the ``client`` folder. +Install the dependencies into the current python environement using the following command and then create the seed model: .. code-block:: - - fedn run build --path client + + scaleout run install --path client + scaleout run build --path client This will create a file called ``seed.npz`` in the root of the project. .. note:: This example automatically creates the runtime environment for the compute package using Virtualenv. - When you first exectue the above commands, FEDn will build a venv, and this takes + When you first exectue the above commands, Scaleout Edge will build a venv, and this takes a bit of time. For more information on the various options to manage the environement, see :ref:`projects-label`. -Next will now upload these files to your Studio project. +Next you will now upload these files to your Scaleout Edge project. 3. Initialize the server-side ------------------------------ -The next step is to initialize the server side with the client code and the initial global model. In the Studio UI, +The next step is to initialize the server side with the client code and the initial global model. In the deployment UI, **Upload the compute package** @@ -150,7 +148,7 @@ The next step is to initialize the server side with the client code and the init **Upload the seed model** -#. Navigate to your project from Step 1 and click Models in the sidebar. +#. Click on the Models tab in the sidebar. #. Click Add Model. #. In the form that appears, upload the generated seed model file. 
@@ -167,7 +165,7 @@ Before starting the clients, we need to configure what data partition the client **Manage Data Splits for MNIST-PyTorch** The default training and test data for this particular example (mnist-pytorch) is for convenience downloaded and split automatically by the client when it starts up. -The number of splits and which split to use by a client can be controlled via the environment variables ``FEDN_NUM_DATA_SPLITS`` and ``FEDN_DATA_PATH``. +The number of splits and which split to use by a client can be controlled via the environment variables ``SCALEOUT_NUM_DATA_SPLITS`` and ``SCALEOUT_DATA_PATH``. Setup the environement for a client (using a 10-split and the 1st partition) by running the following commands: @@ -176,34 +174,34 @@ Setup the environement for a client (using a 10-split and the 1st partition) by .. code-tab:: bash :caption: Unix/MacOS - export FEDN_PACKAGE_EXTRACT_DIR=package - export FEDN_NUM_DATA_SPLITS=10 - export FEDN_DATA_PATH=./data/clients/1/mnist.pt + export SCALEOUT_PACKAGE_EXTRACT_DIR=package + export SCALEOUT_NUM_DATA_SPLITS=10 + export SCALEOUT_DATA_PATH=./data/clients/1/mnist.pt .. code-tab:: bash :caption: Windows (PowerShell) - $env:FEDN_PACKAGE_EXTRACT_DIR=".\package" - $env:FEDN_NUM_DATA_SPLITS=10 - $env:FEDN_DATA_PATH=".\data\clients\1\mnist.pt" + $env:SCALEOUT_PACKAGE_EXTRACT_DIR=".\package" + $env:SCALEOUT_NUM_DATA_SPLITS=10 + $env:SCALEOUT_DATA_PATH=".\data\clients\1\mnist.pt" .. code-tab:: bash :caption: Windows (CMD.exe) - set FEDN_PACKAGE_EXTRACT_DIR=.\package\\ - set FEDN_NUM_DATA_SPLITS=10 - set FEDN_DATA_PATH=.\data\\clients\\1\\mnist.pt + set SCALEOUT_PACKAGE_EXTRACT_DIR=.\package\\ + set SCALEOUT_NUM_DATA_SPLITS=10 + set SCALEOUT_DATA_PATH=.\data\\clients\\1\\mnist.pt **Start the client (on your local machine)** -Each local client requires an access token to connect securely to the FEDn server. These tokens are issued from your FEDn Project. +Each local client requires an access token to connect securely to the Scaleout Edge server. These tokens are issued from your Scaleout Edge Project. #. Navigate to the Clients page and click Connect Client. #. Follow the instructions in the dialog to generate a new token. #. Copy and paste the provided command into your terminal to start the client. Repeat these two steps for the number of clients you want to use. -A normal laptop should be able to handle several clients for this example. Remember to use different partitions for each client, by changing the number in the ``FEDN_DATA_PATH`` variable. +A normal laptop should be able to handle several clients for this example. Remember to use different partitions for each client, by changing the number in the ``SCALEOUT_DATA_PATH`` variable. 5. Train the global model ----------------------------- @@ -212,14 +210,14 @@ With clients connected, we are now ready to train the global model. .. tip:: - You can use the FEDn API Client to start a session and monitor the progress. For more details, see :ref:`apiclient-label`. + You can use the Scaleout Edge API Client to start a session and monitor the progress. For more details, see :ref:`apiclient-label`. .. code-block:: python client.start_session(name="My Session", rounds=5) -In the FEDn UI, +In the Scaleout Edge UI, #. Navigate to the Sessions page and click on "Create session". Fill in the form with the desired settings. #. When the session is created, click "Start training" and select the number of rounds to run. 
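For reference, a second local client (as suggested in step 4) uses the same environment setup as above, with only the partition index in ``SCALEOUT_DATA_PATH`` changed. A Unix/MacOS sketch:

.. code-block:: bash

    # Hypothetical second client on the same machine: same 10-way split, partition 2.
    export SCALEOUT_PACKAGE_EXTRACT_DIR=package
    export SCALEOUT_NUM_DATA_SPLITS=10
    export SCALEOUT_DATA_PATH=./data/clients/2/mnist.pt
    # Then start the client with the command copied from the Connect Client dialog.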
@@ -227,18 +225,16 @@ In the FEDn UI, In the terminal where your are running your client you should now see some activity. When a round is completed, you can see the results on the "Models" page. -.. _studio-api: - -Congratulations, you have now completed your first federated training session with FEDn! Below you find additional information that can +Congratulations, you have now completed your first federated training session with Scaleout Edge! Below you find additional information that can be useful as you progress in your federated learning journey. .. note:: - In FEDn Studio, you can access global model updates by going to the 'Models' or 'Sessions' tab. Here you can download model updates, metrics (as csv) and view the model trail. + In the Scaleout Edge UI, you can access global model updates by going to the 'Models' or 'Sessions' tab. Here you can download model updates, metrics (as csv) and view the model trail. **Where to go from here?** -With you first FEDn federated project set up, we suggest that you take a closer look at how a FEDn project is structured -to learn how to develop your own FEDn projects: +With you first Scaleout Edge federated project set up, we suggest that you take a closer look at how a Scaleout Edge project is structured +to learn how to develop your own Scaleout Edge projects: :ref:`projects-label` @@ -248,20 +244,14 @@ including the use of different aggregators. Learn how to use the APIClient here: :ref:`apiclient-label` -Study the architecture overview to learn more about how FEDn is designed and works under the hood: +Study the architecture overview to learn more about how Scaleout Edge is designed and works under the hood: :ref:`architecture-label` -For developers looking to customize FEDn and develop own aggregators, check out the local development guide -to learn how to set up an all-in-one development environment using Docker and docker-compose: - -:ref:`developer-label` - .. meta:: - :description lang=en: This tutorial is a quickstart guide to FEDn based on a pre-made FEDn Project. It is designed to serve as a starting point for new developers. + :description lang=en: This tutorial is a quickstart guide to Scaleout Edge based on a pre-made Scaleout Edge Project. It is designed to serve as a starting point for new developers. :keywords: Getting started with Federated Learning, Federated Learning, Federated Learning Framework, Federated Learning Platform - :og:title: Getting started with FEDn - :og:description: This tutorial is a quickstart guide to FEDn based on a pre-made FEDn Project. It is designed to serve as a starting point for new developers. - :og:image: https://fedn.scaleoutsystems.com/static/images/scaleout_black.png - :og:url: https://fedn.scaleoutsystems.com/docs/quickstart.html + :og:title: Getting started with Scaleout Edge + :og:description: This tutorial is a quickstart guide to Scaleout Edge based on a pre-made Scaleout Edge Project. It is designed to serve as a starting point for new developers. 
+ :og:url: https://docs.scaleoutsystems.com/en/stable/quickstart.html :og:type: website diff --git a/docs/requirements.txt b/docs/requirements.txt index 102f07e7e..b52412f2d 100644 --- a/docs/requirements.txt +++ b/docs/requirements.txt @@ -1,3 +1,4 @@ sphinx-rtd-theme sphinx_code_tabs -sphinx-copybutton \ No newline at end of file +sphinx-copybutton +sphinx-click \ No newline at end of file diff --git a/docs/serverfunctions.rst b/docs/serverfunctions.rst index 15ae1c536..ac447d240 100644 --- a/docs/serverfunctions.rst +++ b/docs/serverfunctions.rst @@ -1,7 +1,9 @@ -Modifying Server Functionality -============================== +.. _server-functions: -FEDn provides an interface where you can implement your own server-side logic directly into FEDn Studio by utilizing the ``ServerFunctions`` class. This enables advanced customization of the server's behavior while working with FEDn. +Server Functions +================ + +Scaleout Edge provides an interface where you can implement your own server-side logic directly into your server by utilizing the ``ServerFunctions`` class. This enables advanced customization of the server's behavior while working with Scaleout Edge. You can for example implement custom client selection logic, adjust hyperparameters, or implement a custom aggregation algorithm. See https://www.youtube.com/watch?v=Rnfhfqy_Tts for information in video format. Requirements for ``ServerFunctions`` Implementation @@ -12,16 +14,16 @@ The ``ServerFunctions`` class has specific requirements for proper instantiation 1. **Class Name**: The implemented class must be named ``ServerFunctions``. 2. **Allowed Imports**: Only a pre-defined list of Python packages is available for use within a ``ServerFunctions`` implementation for compatibility and security reasons. You can find the allowed packages at: - :py:mod:`fedn.network.combiner.hooks.allowed_imports`. + :py:mod:`scaleout-client.scaleout.network.combiner.hooks.allowed_imports`. Overridable Methods ------------------- -The ``ServerFunctions`` class provides three methods that can optionally be overridden. If you choose not to override one or several of these, FEDn will execute its default behavior for that functionality. +The ``ServerFunctions`` class provides three methods that can optionally be overridden. If you choose not to override one or several of these, Scaleout Edge will execute its default behavior for that functionality. The base class defining these methods and their types is: -:py:mod:`fedn.network.combiner.hooks.serverfunctionsbase.ServerFunctionsBase`. +:py:mod:`scaleout-util.scaleoututil.serverfunctions.serverfunctionsbase.ServerFunctionsBase`. The methods available for customization are: @@ -72,26 +74,26 @@ Below is an example of how to implement custom server functionality in a ``Serve logger.info("Models aggregated") return [param / total_weight for param in weighted_sum] -Using ``ServerFunctions`` in FEDn Studio ----------------------------------------- +Using ``ServerFunctions`` in Scaleout Edge +------------------------------------------ -To use your custom ``ServerFunctions`` code in FEDn Studio, follow these steps: +To use your custom ``ServerFunctions`` code in Scaleout Edge, follow these steps: 1. **Prepare Your Environment**: - Ensure you have an API token for your project. Retrieve it from the "Settings" page in FEDn Studio and add it to your environment: + Ensure you have an API token for your project. Retrieve it from the "Settings" page in your Scaleout Edge UI and add it to your environment: .. 
code-block:: bash

-      export FEDN_AUTH_TOKEN=
+      export SCALEOUT_AUTH_TOKEN=

2. **Connect Using the API Client**:

-   Connect to your FEDn project using the ``APIClient``. Replace ```` with the address found on the Studio dashboard.
+   Connect to your Scaleout Edge project using the ``APIClient``. Replace ```` with the address found on the Scaleout Edge dashboard.

   .. code-block:: python

-      from fedn import APIClient
+      from scaleout import APIClient
       client = APIClient(host="", secure=True, verify=True)

3. **Start a Session with ``ServerFunctions``**:
@@ -105,14 +107,13 @@ To use your custom ``ServerFunctions`` code in FEDn Studio, follow these steps:

4. **Monitor Logs**:

-   Logs from your ``ServerFunctions`` implementation can be viewed on the Studio dashboard under the "Logs" section.
+   Logs from your ``ServerFunctions`` implementation can be viewed on the Scaleout Edge dashboard under the "Logs" section.

Notes
-----
-- **Beta Usage**: Custom server functionality is available in beta starting from FEDn 0.20.0.
- **Documentation**: Refer to the full APIClient documentation for more details on connecting to your project: https://docs.scaleoutsystems.com/en/stable/apiclient.html

-This modular interface enables you to integrate your specific server-side logic into your FEDn federated learning pipeline.
+This modular interface enables you to integrate your specific server-side logic into your Scaleout Edge federated learning pipeline.
diff --git a/fedn/network/combiner/hooks/grpc_wrappers.py b/fedn/network/combiner/hooks/grpc_wrappers.py
index 0aff6b371..b288ff8cb 100644
--- a/fedn/network/combiner/hooks/grpc_wrappers.py
+++ b/fedn/network/combiner/hooks/grpc_wrappers.py
@@ -26,6 +26,7 @@ def wrapper(self, request, context):
         try:
             yield from fn(self, request, context)
         except Exception as e:
+            self.client_updates = {}
             self._retire_and_log(func_name, e)
             # Option B for streaming: signal an RPC error the client understands
             context.set_code(grpc.StatusCode.FAILED_PRECONDITION)