
Releases: DecLearn/declearn

declearn v2.6.0


Released: 26/07/2024

Release Highlights

Group-Fairness capabilities

This new version of DecLearn brings a whole new family of federated optimization algorithms to the party, introducing an API and various algorithms to measure and optimize the group fairness of the trained model over the union of clients' training datasets.

This is the result of a year-long collaboration with Michaël Perrot and Brahim Erraji to design and evaluate algorithms to learn models under group-fairness constraints in a federated learning setting, using either newly-introduced algorithms or existing ones from the literature.

A dedicated guide on fairness features was added to the documentation, which is the advised entry point for anyone interested in these new features. The guide explains what (group-)fairness in machine learning is, what the design choices (and limits) of our new API are, how the API works, which algorithms are available, and how to write custom fairness definitions or fairness-enforcing algorithms.

As noted in the guide, end-users with an interest in fairness-aware federated learning are very welcome to get in touch if they have feedback, questions or requests about the current capabilities and possible future ones.

To sum it up:

  • The newly-introduced declearn.fairness submodule provides an API and concrete algorithms to enforce fairness constraints in a federated learning process.
  • When such an algorithm is to be used, the only required modifications to an existing user-defined process are to (see the sketch after this list):
    • Plug a declearn.fairness.api.FairnessControllerServer subclass instance (or its configuration) into the declearn.main.config.FLOptimConfig that is defined by the server.
    • Wrap each and every client's training dataset as a declearn.fairness.api.FairnessDataset; for instance using declearn.fairness.core.FairnessInMemoryDataset, which is an extension of the base declearn.dataset.InMemoryDataset.
  • There are currently three available algorithms to enforce fairness:
    • Fed-FairGrad, defined under declearn.fairness.fairgrad
    • Fed-FairBatch/FedFB, defined under declearn.fairness.fairbatch
    • FairFed, defined under declearn.fairness.fairfed
  • In addition, declearn.fairness.monitor provides an algorithm to merely measure fairness throughout training, typically to evaluate baselines when conducting experiments on fairness-enforcing algorithms.
  • There are currently four available group-fairness criteria that can be used with the previous algorithms:
    • Accuracy Parity
    • Demographic Parity
    • Equalized Odds
    • Equality of Opportunity
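
As a minimal sketch, the two modifications listed above might look as follows. This is not verbatim API: the fairness field name of FLOptimConfig, the controller specification format and the FairnessInMemoryDataset arguments are assumptions for illustration.

```python
import numpy as np

from declearn.fairness.core import FairnessInMemoryDataset
from declearn.main.config import FLOptimConfig

# Server side: plug a fairness controller (here, Fed-FairGrad) into the
# federated optimization config ('fairness' field and spec format assumed).
optim_config = FLOptimConfig.from_params(
    aggregator="averaging",
    client_opt={"lrate": 0.01},
    fairness={"algorithm": "fairgrad", "f_type": "demographic_parity"},
)

# Client side: wrap the training data as a FairnessDataset, exposing the
# sensitive attribute ('s_attr' argument name assumed).
dataset = FairnessInMemoryDataset(
    data=np.random.normal(size=(100, 8)),   # placeholder features
    target=np.random.randint(2, size=100),  # placeholder labels
    s_attr=np.random.randint(2, size=100),  # placeholder sensitive groups
)
```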

Scheduler API for learning rates

DecLearn 2.6.0 also introduces a long-awaited feature: scheduling rules for the learning rate (and/or weight decay factor), which adjust the scheduled value throughout training based on the number of training steps and/or rounds already taken.

This takes the form of a new (and extensible) Scheduler API, implemented under the new declearn.optimizer.schedulers submodule. Instances of Scheduler subclasses (or their JSON-serializable specs) may be passed to Optimizer.__init__ instead of float values to specify the lrate and/or w_decay parameters, resulting in time-varying values being computed and used rather than a constant one.

Scheduler is easily extensible, enabling end-users to write their own rules.
At the moment, DecLearn natively provides the following (see the usage sketch after this list):

  • Various kinds of decay (step, multi-steps or round based; linear, exponential, polynomial...);
  • Cyclic learning rates (based on a couple of papers from the literature);
  • Linear warmup (steps or round based; combinable with another scheduler to use after the warmup period).
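
As a minimal usage sketch, a scheduler replaces the constant learning rate of an Optimizer as follows; the ExponentialDecay class name and its arguments are assumptions standing in for any of the provided rules:

```python
from declearn.optimizer import Optimizer
from declearn.optimizer.schedulers import ExponentialDecay  # name assumed

# Learning rate decaying by 5% after each training round (arguments assumed).
lrate = ExponentialDecay(base=0.1, rate=0.95, step_level=False)

# Pass the scheduler in place of a float 'lrate' value.
optim = Optimizer(lrate=lrate, modules=["adam"])
```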

The user guide on the Optimizer API was updated to cover this new feature, and remains the preferred entry point for new users wanting to get a grasp of the overall design and specific features offered by this API. Users already familiar with Optimizer may simply check out the API docs for the new Scheduler API.

declearn.training submodule reorganization

DecLearn 2.6.0 introduces the declearn.training submodule, which merely refactors some unchanged classes previously made available under declearn.main.utils and declearn.main.privacy. The mapping of changes is as follows:

  • declearn.main.TrainingManager -> declearn.training.TrainingManager
  • declearn.main.privacy -> declearn.training.dp (which remains a manual-import submodule relying on the availability of the optional opacus third-party dependency)

The former declearn.main.privacy is deprecated and will be removed in DecLearn 2.8 and/or 3.0. It is kept for now as an alias re-export of declearn.training.dp that raises a DeprecationWarning upon manual import.

The declearn.main.utils submodule is kept, but importing TrainingManager from it is deprecated and will also be removed in version 2.8 and/or 3.0. For now, the class is merely re-exported from it.
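
In practice, migrating merely means updating import statements:

```python
# Deprecated imports (to be removed in DecLearn 2.8 and/or 3.0):
from declearn.main.utils import TrainingManager
from declearn.main import privacy  # raises a DeprecationWarning upon import

# Their new counterparts:
from declearn.training import TrainingManager
from declearn.training import dp  # still requires the optional 'opacus' dependency
```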

Evaluation rounds can now be skipped

Prior to this release, FederatedServer always deterministically ran training and evaluation rounds in alternation as part of a Federated Learning process. This can now be configured, using the new frequency parameter of declearn.main.config.EvaluateConfig (i.e. the "evaluate" field of the declearn.main.config.FLRunConfig instance, dict or TOML file provided as input to FederatedServer.run).

By default, frequency=1, meaning an evaluation round is run after each and every training round. Setting frequency=N makes evaluation occur only after the N-th, 2*N-th, ... training rounds. Note that if the server is checkpointing results, an evaluation round will always be run after the last training round.

Note that a similar parameter is available for FairnessConfig, albeit working slightly differently, because fairness evaluation rounds occur before training. Hence, with frequency=N, fairness evaluation and constraints update will occur before the 1st, N+1-th, 2*N+1-th, ... training rounds. Note that if the server is checkpointing results, a fairness round will always be run after the last training round, for the sake of measuring the fairness levels of the final model.
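
As an illustration, the following sketch configures an evaluation round every third training round; it mirrors the from_params usage found in the package's examples, and the other values are mere placeholders:

```python
from declearn.main.config import FLRunConfig

# Run 12 training rounds, evaluating only after rounds 3, 6, 9 and 12.
run_config = FLRunConfig.from_params(
    rounds=12,
    register={"min_clients": 2},
    training={"batch_size": 32, "n_epoch": 1},
    evaluate={"batch_size": 128, "frequency": 3},
)
```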

Other changes

Removal of deprecated features

A number of features were deprecated in DecLearn 2.4.0 (whether legacy API methods, submodules or methods that were re-organized or renamed, parameters that were no longer used, or a plainly-removed function). As of this new release, those features that had been kept back with a deprecation warning are now removed from the code.

As a reminder, the removed features include:

  • Legacy aggregation methods:
    • declearn.aggregator.Aggregator.aggregate
    • declearn.metrics.Metric.agg_states
    • declearn.metrics.MetricSet.agg_states
  • Legacy instantiation parameters:
    • declearn.aggregator.AveragingAggregator parameter client_weights
    • declearn.aggregator.GradientMaskedAveraging parameter client_weights
    • declearn.optimizer.modules.ScaffoldServerModule parameter clients
  • Legacy names that were aliasing new locations:
    • declearn.communication.messaging (moved to declearn.messaging)
    • declearn.communication.NetworkClient.check_message (renamed recv_message)
  • declearn.dataset.load_dataset_from_json

New developer-oriented changes

A few minor changes are shipped with this new release, that are mostly of interest to developers - including end-users writing custom algorithms or bridging DecLearn APIs within their own orchestration code.

  • The declearn.secagg.messaging.aggregate_secagg_messages function was introduced as a refactoring of previous backend code to combine and decrypt an ensemble of client-emitted SecaggMessage instances into a single aggregated cleartext Message (see the sketch after this list).
  • The declearn.utils.TomlConfig class, from which all TOML-parsing config dataclasses of DecLearn inherit, now has a new autofill_fields class attribute to indicate fields that may be left unspecified by users and will then be dynamically filled when parsing all fields. For instance, this enables omitting the evaluate section from a TOML file parsed into an FLRunConfig instance.
  • New unit tests were added, most notably for FederatedServer, which now benefits from proper coverage by mere unit tests that verify its high-level logic and the coherence of its actions with inputs and documentation, while the overall working keeps being assessed using functional tests.
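
As a rough sketch of the first item above (argument names and their order are assumptions based on the description, not the verbatim signature):

```python
from declearn.secagg.messaging import aggregate_secagg_messages

# 'messages' maps client names to received SecaggMessage instances;
# 'decrypter' is the server-side SecAgg controller able to decrypt
# their aggregate into a cleartext Message.
cleartext = aggregate_secagg_messages(messages, decrypter)
```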

Note: the PyPI release is flagged as 2.6.0.post1, due to the initial 2.6.0 including some buggy code files from a working branch, improperly added to the build files. The initial 2.6.0 release was deleted from PyPI; 2.6.0.post1 is strictly equivalent to the Git version 2.6.0.

declearn v2.5.0


Released: 13/05/2024

These release notes can also be found on our website and on the source GitLab repo.

Release Highlight: Secure Aggregation

Overview

This new version of DecLearn is mostly about enabling the use of Secure Aggregation (also known as SecAgg), i.e. methods that enable aggregating client-emitted information without revealing said information to the server in charge of this aggregation.

DecLearn now implements both a generic API for SecAgg and a couple of practical solutions that are ready for use. This makes SecAgg easily usable as part of the existing federated learning process, and extensible by advanced users or researchers who would like to use or test their own method and/or setup.

New features are mostly implemented under the new declearn.secagg submodule, with further changes to declearn.main.FederatedClient and FederatedServer integrating these features into the main process.

Usage and scope

Setting up SecAgg is meant to be straightforward:

  • The server and clients agree on a SecAgg method and, when required, some hyper-parameters in advance.
  • Clients need to hold a private Ed25519 identity key and share the associated public key with all other clients in advance, so that the SecAgg setup can include the verification that ephemeral-key-agreement information comes from trusted peers. (This is mandatory in current implementations, but may be removed in custom setup implementations.)
  • The server and clients must simply pass an additional secagg keyword argument when instantiating their FederatedServer or FederatedClient, which can take the form of a class, a dict or the path to a TOML file (see the sketch after this list).
  • Voilà!
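
A minimal sketch of that last step, assuming a dict-based specification of masking-based SecAgg (the exact specification format is an assumption, and other constructor arguments are elided):

```python
from declearn.main import FederatedClient, FederatedServer

secagg = {"secagg_type": "masking"}  # assumed dict format; a TOML path also works

server = FederatedServer(model, network, optim, secagg=secagg)  # other args elided
client = FederatedClient(network, train_data, secagg=secagg)    # other args elided
```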

At the moment, SecAgg is used to secure the aggregation of model parameter updates, optimizer auxiliary variables and evaluation metrics, as well as some metadata from training and evaluation rounds. In the future, we plan to cover metadata queries and (yet-to-be-implemented) federated analytics.

Available algorithms

At the moment, DecLearn provides the following SecAgg algorithms:

  • Masking-based SecAgg (declearn.secagg.masking), that uses pseudo-random number generators (PRNG) to generate masks over a finite integer field so that the sum of clients' masks is known to be zero.

    • This is based on Bonawitz et al., 2016.
    • The setup that produces pairwise PRNG seeds is conducted using the X3DH protocol.
    • This solution has very limited computation and communication overhead and should be considered the default SecAgg solution with DecLearn.
  • Joye-Libert sum-homomorphic encryption (declearn.secagg.joye_libert), that uses actual encryption, a modified summation operator, and aggregate-decryption primitives that operate on a large biprime-defined integer field.

    • This is based on Joye & Libert, 2013.
    • The setup that computes the public key as a sum of arbitrary private keys involves the X3DH protocol as well as Shamir Secret Sharing.
    • This solution has a high computation and communication overhead. It is not really suitable for models with many parameters (including artificial neural networks with merely a few layers).

Documentation

In addition to the extensive in-code documentation, a guide on the SecAgg features was added to the user documentation. If anything is not as clear as you would hope, do let us know by opening a GitLab or GitHub issue, or by dropping an e-mail to the package maintainers!

Other changes

A few minor changes are shipped with this new release in addition to the new SecAgg features:

  • InMemoryDataset backend code was refactored and made more robust, and its documentation was improved for readability purposes.
  • declearn.typing.DataArray was added as an alias.
  • declearn.dataset.utils.split_multi_classif_dataset was enhanced to support a new scheme based on Dirichlet allocation. Unit tests were also added for both this scheme and existing ones.
  • Unit tests for both FederatedClient and FederatedServer were added.
  • Deprecated keyword arguments of Vector.sum were removed, as planned.

declearn 2.4.0


Released: 18/03/2024

These release notes can also be found on our website and on the source GitLab repo.

Important notice:

DecLearn 2.4 derogates from SemVer by revising some of the major DecLearn component APIs.

This is mitigated in two ways:

  • No changes relate to the main process setup code, meaning that end-users that do not use custom components (aggregator, optimodule, metric, etc.) should not see any difference, and their code will work as before (as an illustration, our examples' code remains unchanged).
  • Key methods that were deprecated in favor of new API ones are kept for two more minor versions, and are still tested to work as before.

Any end-user encountering issues due to the released or planned evolution of DecLearn is invited to contact us via GitLab, GitHub or e-mail, so that we can provide assistance and/or update our roadmap so that changes do not hinder the usability of DecLearn 2.x for research and applications.

New version policy (and future roadmap)

As noted above, v2.4 does not fully abide by SemVer rules. In the future, more partially-breaking changes and API revisions may be introduced, incrementally paving the way towards the next major release, while trying as much as possible not to break end-user code.

To prevent unforeseen incompatibilities and cryptic bugs from arising, from this version onward the server and clients are expected and verified to use the same major.minor version of DecLearn. This policy may be updated in the future, e.g. to specify that clients may have a newer minor version than the server (and most probably not the other way around).

To avoid unhappy surprises, we are starting to maintain a public roadmap on our GitLab. Although it may change, it should provide interested users (notably those interested in developing custom components or processes on top of DecLearn) with a way to anticipate changes, and voice any concerns or advice they might have.

Revise all aggregation APIs

Revise the overall design for aggregation and introduce Aggregate API

This release introduces the Aggregate API, which is based on an abstract base dataclass acting as a template for data structures that require sharing across peers and aggregation.

The declearn.utils.Aggregate ABC acts as a shared ancestor providing a base API and shared backend code to define data structures that:

  • are serializable to and deserializable from JSON, and may therefore be preserved across network communications
  • are aggregatable into an instance of the same structure
  • use summation as the default aggregation rule for fields, which is overridable by redefining the default_aggregate method
  • can implement custom aggregate_<field.name> methods to override the default summation rule
  • implement a prepare_for_secagg method that
    • enables defining which fields merely require sum-aggregation and need encryption when using SecAgg, and which fields are to be preserved in cleartext (and therefore go through the usual default or custom aggregation methods)
    • can be made to raise a NotImplementedError when SecAgg cannot be achieved on a data structure

This new ABC currently has three main children:

  • AuxVar: replaces plain dict for Optimizer auxiliary variables
  • MetricState: replaces plain dict for Metric intermediate states
  • ModelUpdates: replaces sharing of updates as Vector and n_steps

Each of these is defined jointly with another (pre-existing, revised) API for components that (a) produce Aggregate data structures based on some input data and/or computations; (b) produce some output results based on a received Aggregate structure, meant to result from the aggregation of multiple peers' produced data.
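
As an illustrative sketch, a custom Aggregate-like structure could look as follows; the subclassing details (dataclass decorator, registration) are assumptions, and only the field-wise summation behavior is documented above:

```python
import dataclasses

from declearn.utils import Aggregate

@dataclasses.dataclass
class MeanState(Aggregate):
    """Sum-aggregatable partial states to compute a global mean."""

    total: float  # aggregated via the default summation rule
    count: int    # a custom 'aggregate_count' method could override it

# Aggregating two MeanState instances yields a MeanState with field-wise
# sums; 'prepare_for_secagg' would flag both fields as requiring encrypted
# sum-aggregation when SecAgg is in use.
```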

Revise Aggregator API, introducing ModelUpdates

The Aggregator API was revised to make use of the new ModelUpdates data structure (inheriting Aggregate).

  • Aggregator.prepare_for_sharing pre-processes an input Vector containing raw model updates and an integer indicating the number of local SGD steps into a ModelUpdates structure.
  • Aggregator.finalize_updates receives a ModelUpdates resulting from the aggregation of peers' instances, and performs final computations to produce a Vector of aggregated model updates (see the sketch after this list).
  • The legacy Aggregator.aggregate method is deprecated (but still works).
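
A sketch of this round-trip, with placeholder variables and assuming that ModelUpdates instances aggregate via the '+' operator:

```python
from functools import reduce

from declearn.aggregator import AveragingAggregator

aggregator = AveragingAggregator()

# Client side: wrap raw model updates (a Vector) and the number of local
# SGD steps into a ModelUpdates structure.
updates = aggregator.prepare_for_sharing(grads, n_steps=10)

# Server side: aggregate client-wise ModelUpdates, then finalize them.
aggregated = reduce(lambda a, b: a + b, client_updates)  # '+' support assumed
final_grads = aggregator.finalize_updates(aggregated)
```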

Revise auxiliary variables for Optimizer, introducing AuxVar

The OptiModule API (and, consequently, Optimizer) was revised as to the design and signature of auxiliary variables related methods, to make use of the new AuxVar data structure (inheriting Aggregate).

  • OptiModule.collect_aux_var now emits either None or an AuxVar instance (the precise type of which is module-dependent), instead of a mere dict.
  • OptiModule.process_aux_var now expects a proper-type AuxVar instance that already aggregates clients' data, externalizing the aggregation rules to the AuxVar subtypes, while keeping the finalization logic part of the OptiModule subclasses.
  • Optimizer.collect_aux_var therefore emits a {name: aux_var} dict.
  • Optimizer.process_aux_var therefore expects a {name: aux_var} dict, rather than having distinct signatures on the client and server sides.
  • It is now expected that server-side components will send the same data to all clients, rather than allow sending client-wise values.

The backend code of ScaffoldClientModule and ScaffoldServerModule was heavily revised to alter the distribution of information and computations:

  • Client-side modules are now the sole owners of their local state, and send sum-aggregatable updates to the server, which are therefore SecAgg-compatible.
  • The server consequently shares the same information with all clients, namely the current global state.
  • To keep track of the (possibly growing with time) number of unique clients, clients generate a random uuid that is sent with their state updates and preserved in cleartext when SecAgg is used.
  • As a consequence, the server component knows which clients contributed to a given round, but receives an aggregate of local updates rather than the client-wise state values.

Revise Metric API, introducing MetricState

The Metric API was revised to make use of the new MetricState data structure (inheriting Aggregate).

  • Metric.build_initial_states generates a "zero-state" MetricState instance (it replaces the previously-private _build_states method that returned a dict).
  • Metric.get_states returns a (Metric-type-dependent) MetricState instance, instead of a mere dict.
  • Metric.set_states assigns an incoming MetricState into the instance, which may then be finalized into results using the unchanged get_result method (see the sketch after this list).
  • The legacy Metric.agg_states is deprecated, in favor of set_states (but it still works).
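
A sketch of the resulting workflow, using one of the package's existing metrics (variable contents are placeholders):

```python
from declearn.metrics import BinaryAccuracyPrecisionRecall

metric = BinaryAccuracyPrecisionRecall()

# Client side: update local states from predictions, then collect them.
metric.update(y_true, y_pred)  # y_true, y_pred: placeholder arrays
states = metric.get_states()   # a MetricState instance

# Server side: assign the aggregate of clients' states, then finalize.
metric.set_states(aggregated_states)  # an aggregated MetricState
results = metric.get_result()
```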

Revise backend communications and messaging APIs

This release introduces some important backend changes to the communication and messaging APIs of DecLearn, resulting in more robust code (that is also easier to test and maintain), more efficient message parsing (possibly-costly de-serialization is now delayed to a time posterior to validity verification) and the extensibility of application messages, enabling to easily define and use custom message structures in downstream applications.

The most important API change is that network communication endpoints now return SerializedMessage instances rather than Message ones.

New declearn.communication.api.backend submodule

  • Introduce a new ActionMessage minimal API under its actions submodule, that defines hard-coded, lightweight and easy-to-parse data structures designed to convey information and content across network communications, agnostically to the content's nature.
  • Revise and expose the MessagesHandler util, that now builds on the ActionMessage API to model remote calls and answer them.
  • Move the declearn.communication.messaging.flags submodule to declearn.communication.api.backend.flags.

New declearn.messaging submodule

  • Revise the Message API to make it extendable, with automated type-registration of subclasses by default.
  • Introduce SerializedMessage as a wrapper for received messages, that parses the exact Message subtype (enabling logic tests and message filtering) but delays actual content de-serialization and Message object recovery (enabling to prevent undue resource use for unwanted messages that end up being discarded); see the sketch after this list.
  • Move most existing Message subclasses to the new submodule, for retro-compatibility purposes. In DecLearn 3.0 these will probably be re-dispatched to make it clear that concrete messages only make sense in the context of specific multi-agent processes.
  • Drop backend-oriented Message subclasses that are replaced with the new ActionMessage backbone structures.
  • Deprecate the declearn.communication.messaging submodule, that is temporarily maintained, re-exporting moved contents as well as deprecated message types (which are bound to be rejected if sent).
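
A sketch of the intended usage pattern; the message_cls attribute and deserialize method names are assumptions based on the description above:

```python
from declearn.messaging import TrainRequest  # an application message type

# 'received' is a SerializedMessage returned by a network endpoint.
if received.message_cls is TrainRequest:  # type filtering, without parsing
    message = received.deserialize()      # actual content de-serialization
else:
    pass  # discard without paying any de-serialization cost
```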

Revise NetworkClient and NetworkServer

  • Have message-receiving methods return SerializedMessage instances rather than finalized de-serialized Message ones.
  • Quit sending and expecting 'data_info' with registration requests.
  • Rename NetworkClient.check_message into recv_message (keep the former as an alias, with a DeprecationWarning).
  • Improve the use of (optional) timeouts when sending or expecting messages and overall exceptions handling:
    • NetworkClient.recv_message may either raise a TimeoutError (in case...

declearn v2.3.2


Released: 05/10/2023

This is a subminor release that patches extra dependency specifiers for Torch.

Revise Torch extra dependency specifiers and version support

The rules for the "torch", "torch1" and "torch2" extra dependency specifiers were updated in order to be simpler and to support the newest Torch 2.1 version. As a trade-off, support for older Torch versions 1.10 to 1.12 was dropped: since functorch ships together with versions >=1.13, this removes the burden of having to clumsily specify it (in spite of its no longer being used starting with version 2.0).

The opacus dependency is back to being specified merely by the "dp" extra specifier, which may be updated either independently from or together with "torch" in the future - and may also be specified freely by declearn-depending packages, starting with Fed-BioMed.

declearn v2.3.1


Released: 06/09/2023

This is a subminor release that patches a regression introduced in v2.2.0 that had gone undetected until now. It also introduces a revision of the non-core-package util for generating SSL certificates.

Hotfix: fix DP-SGD with TorchModel

Since version 2.2.0, the functorch-based backend for computing and clipping sample-wise gradients of a TorchModel has attempted to benefit from the experimental functorch.compile API. However, this change, which has been tested to yield performance gains on fixed-size batches of inputs, turns out not to be compatible with variable-size batches - which are mandatory as part of DP-SGD due to the use of Poisson sampling.

As a consequence, this version drops the use of functorch.compile, restoring DP-SGD features with TorchModel.

Rewrite 'declearn.test_utils.generate_ssl_certificates'

The declearn.test_utils.generate_ssl_certificates util, which is still excluded from the actual package API but is useful for setting up examples, tests and even some real-life applications, was rewritten entirely to make use of the python cryptography third-party library rather than rely on subprocess calls to openssl. This makes it more robust and avoids incompatibilities with OpenSSL 1.1 as to specifying multiple DNS names and/or IPs for the server certificate. The util's API remains unchanged, save for the addition of a duration parameter that controls the validity duration of the generated CA and certificate files.

declearn v2.2.2


Released: 06/09/2023

This is a subminor release that patches a regression introduced in v2.2.0 that had gone undetected until now.

Hotfix: fix DP-SGD with TorchModel

Since version 2.2.0, the functorch-based backend for computing and clipping sample-wise gradients of a TorchModel has attempted to benefit from the experimental functorch.compile API. However, this change, which has been tested to yield performance gains on fixed-size batches of inputs, turns out not to be compatible with variable-size batches - which are mandatory as part of DP-SGD due to the use of Poisson sampling.

As a consequence, this version drops the use of functorch.compile, restoring DP-SGD features with TorchModel.

declearn v2.3.0


Release date: 30/08/2023

Release Highlights

New Dataset subclasses to interface TensorFlow and Torch dataset APIs

The most visible additions of v2.3 are the new TensorflowDataset and TorchDataset classes, that respectively enable wrapping tensorflow.data.Dataset and torch.utils.data.Dataset objects into declearn Dataset instances that can be used for training and evaluating models in a federative way.

Both of these classes are implemented under manual-import submodules of declearn.dataset: declearn.dataset.tensorflow and declearn.dataset.torch. While applications that rely on memory-fitting tabular data can still use the good old InMemoryDataset, these new interfaces are designed to enable users to re-use existing code for interfacing any kind of data, including images or text (that may require framework-provided pre-processing), that may be loaded on-demand from a database or distributed files, or even generated procedurally.
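
As a minimal sketch, wrapping an existing Torch dataset might look as follows; the TorchDataset constructor is assumed here to take the wrapped dataset as its sole required argument:

```python
import torch

from declearn.dataset.torch import TorchDataset

class RandomData(torch.utils.data.Dataset):
    """Placeholder dataset yielding (features, label) samples."""

    def __len__(self) -> int:
        return 100

    def __getitem__(self, idx):
        return torch.randn(8), torch.randint(2, size=(1,))

dataset = TorchDataset(RandomData())  # usable by declearn training routines
```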

Our effort has been put into keeping the declearn-side code minimal and leaving the door open for as many framework-provided features as possible, but it is possible that we have missed some things; if you run into issues or limits when using these new classes, feel free to drop us a message, using either the historical Inria-GitLab repository or the newly-created mirroring GitHub one!

Support for Torch 2.0

Another less-visible but possibly high-impact update is the addition of support for Torch 2.0. It took us a bit of time to adjust the backend code for this new release of Torch, as all of the DP-oriented functorch-based code was made incompatible, but we are now able to provide end-users with compatibility for both the newest 2.0 version and the previously-supported 1.10-1.13 versions. Cherry on top, it should even be possible to have the server and clients use different Torch major versions!

The main interest of this new support (apart from not losing pace with the framework and its backend improvements) is to enable end-users to use the new torch.compile feature to optimize their model's runtime. There is however a major caveat to this: at the moment, options to torch.compile are lost, which means that they cannot yet be properly propagated to clients, making this new feature usable only with default arguments. However, the Torch team is working on improving that, and we will hopefully be able to forward model-compilation instructions as part of declearn in the near future!

In the meanwhile, if you encounter any issues with Torch support, notably as to 2.0-introduced features, please let us know, as we are eager to build on user feedback to improve the package's backend as well as its APIs.

Numerous test-driven backend fixes

Finally, a lot of effort has been put into making declearn more robust, by adding more unit and integration tests, improving our CI/CD setup to cover our code more extensively (notably by systematically testing it on both CPU and GPU) and efficiently, and adding a custom script to launch groups of tests in a verbose and compact way. In doing so, we conducted a number of test-driven backend patches.

Some bugs were pretty awful and well-hidden (we recently backported a couple of fixes to hopefully-unused operations' formulas to all previous versions via sub-minor releases); some were visible but harmful (some metrics' computations were just plain wrong under certain input-shape conditions, which showed as uncanny values, but made results' analysis and use a burden); some were minor and/or edge-case, but still worth fixing.

We hope that this effort enabled catching most if not all current potential bugs, but we will keep on improving unit test coverage in the near future, and are adopting a stricter policy as to testing new features as they are being implemented.

Full Release Notes

Release notes are available on our website and on the source GitLab repo.

declearn v2.2.0


Release date: 11/05/2023

Release highlights

Declearn Quickrun Mode & Dataset-splitting utils

The two most-visible additions of v2.2 are the declearn-quickrun and declearn-split entry-point scripts, that are installed as CLI tools together with the package when running pip install declearn (or installing from source).

declearn-quickrun introduces an alternative way to use declearn so as to run a simulated Federated Learning experiment on a single computer, using localhost communications, and any model, dataset and optimization / training / evaluation configuration.

declearn-quickrun relies on:

  • a python code file to specify the model;
  • a standard (but partly modular) data storage structure;
  • a TOML config file to specify everything else.

It is thought of as:

  • a simple entry-point to newcomers, demonstrating what declearn can do with zero to minimal knowledge of the actual Python API;
  • a nice way to run experiments for research purposes, with minimal setup (and the possibility to maintain multiple experiment configurations in parallel via named and/or versioned TOML config files) and standardized outputs (including model weights, full process logs and evaluation metrics).

declearn-split is a CLI tool that wraps up some otherwise-public data utils that enable splitting and preparing a supervised learning dataset for its use in a Federated Learning experiment. It is thought of as a helper to prepare data for its use with declearn-quickrun.

Support for Jax / Haiku

Another visible addition of declearn v2.2 is the support for models implemented in Jax, specifically via the neural network library Haiku.

This takes the shape of the new (optional) declearn.model.haiku submodule, that provides dedicated JaxNumpyVector and HaikuModel classes (subclassing the base Vector and Model ones). Existing unit and integration tests have been extended to cover this new framework (when available), which is therefore usable on par with Scikit-Learn, TensorFlow and Torch - up to a few framework specificities in the setup of the model, notably when it is desired to freeze some layers (which has to happen after instantiating and initializing the model, contrary to what can be done in other neural network frameworks).

Improved Documentation and Examples

Finally, this new version comes with an effort on improving the usability of the package, notably via the readability of its documentation and examples.

The documentation has been heavily revised (which has already been partially back-ported to previous version releases upon making the documentation website public).

The legacy Heart UCI example has been improved to enable real-life execution (i.e. using multiple agents / computers communicating over the internet). More importantly, the classic MNIST dataset has been used to implement simpler and more-diverse introductory examples, that demonstrate the various flavors of declearn one can look for (including the new Quickrun mode).

The declearn.dataset.examples submodule has been introduced, so that example data loaders can be added (and maintained / tested) as part of the package. For now these utils only cover the MNIST and Heart UCI datasets, but more reference datasets are expected to be added in the future, enabling end-users to make up their own experiments and toy around with the package's functionality in no time.

Full Release Notes

Release notes are available on our website and on the source GitLab repo.

declearn v2.1.0


Release date: 02/03/2023

Release Highlights

Add proper GPU support and device-placement policy utils.

  • Add device-placement policy utils: declearn.utils.DevicePolicy, declearn.utils.get_policy and declearn.utils.set_policy (see the sketch after this list).

  • Implement device-placement support in TorchModel, TorchVector, TensorflowModel and TensorflowVector, according to shared API principles (some of which are abstracted into Model).

  • Add tests for these features, and automatic running of unit tests on both CPU and GPU when possible (otherwise, run on CPU only).
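
A sketch of the policy utils in action; the exact set_policy parameters are assumptions:

```python
from declearn.utils import get_policy, set_policy

set_policy(gpu=True, idx=0)  # request placement on the first GPU (args assumed)
policy = get_policy()        # a DevicePolicy instance reflecting the above
```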

Add framework-specific TensorflowOptiModule and TorchOptiModule.

  • Enable wrapping framework-specific optimizer objects into a plug-in that may be used within a declearn Optimizer and jointly with any combination of framework-agnostic plug-ins.

  • Add functional tests that verify our implementations of the Adam, Adagrad and RMSprop optimizers are equivalent to those of the TensorFlow and Torch frameworks.

Add declearn.metrics.RSquared metric to compute a regression's R^2.

Fix handling of frozen weights in TensorflowModel and TorchModel.

  • Add a trainable: bool=False parameter to Model.get_weights and Model.set_weights to enable excluding frozen weights from I/O (see the sketch after this list).

  • Use Model.get_weights(trainable=True) in Optimizer methods, enabling to use loss-regularization Regularizer plug-ins and weight decay with models that have some frozen weights.

  • Use Model.set_weights(trainable=True) and its counterpart to remove some unrequired communications and server-side aggregator and optimizer computations.
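
A sketch of the new flag in action, given any declearn Model instance:

```python
# Collect and re-assign only trainable (non-frozen) weights.
weights = model.get_weights(trainable=True)
model.set_weights(weights, trainable=True)
```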

Fix handling of tf.IndexedSlices structures in TensorflowVector.

  • Avoid the (mostly silent, depending on the tensorflow version) conversion of tf.IndexedSlices row-sparse gradients to a dense tensor whenever possible.

  • Warn about that conversion when it happens (unless the context is known to require it, e.g. as part of noise-addition optimodules).

Full Release Notes

Release notes are available on our website and on the source GitLab repo.

declearn v2.0.0


Release date: 06/02/2023

This is the first stable and public release of declearn.