An extension of HtFLlib for on-device HtFL deployment using real devices, based on Flower and CoLExT.
`dataset`: Realistically and naturally distributed datasets generated by the code from PFLlib. You can find more raw data here:
- HAR (Human Activity Recognition) (30 clients, 6 labels)
- PAMAP2 (9 clients, 12 labels)
- iWildCam (194 camera traps, 158 labels)
`system`: Flower servers and clients. The supported HtFL frameworks are:
- Partial-heterogeneity-based HtFL
  - LG-FedAvg — Think Locally, Act Globally: Federated Learning with Local and Global Representations (2020)
  - FedGen — Data-Free Knowledge Distillation for Heterogeneous Federated Learning (ICML 2021)
  - FedGH — FedGH: Heterogeneous Federated Learning with Generalized Global Header (ACM MM 2023)
- Full-heterogeneity-based HtFL
  - FD — Communication-Efficient On-Device Machine Learning: Federated Distillation and Augmentation under Non-IID Private Data (2018)
  - FML — Federated Mutual Learning (2020)
  - FedKD — Communication-efficient federated learning via knowledge distillation (Nature Communications 2022)
  - FedProto — FedProto: Federated Prototype Learning across Heterogeneous Clients (AAAI 2022)
  - FedTGP — FedTGP: Trainable Global Prototypes with Adaptive-Margin-Enhanced Contrastive Learning for Data and Model Heterogeneity in Federated Learning (AAAI 2024)
- Distribute and store realistic datasets (`.npz` files) on all devices using a designated strategy (to be determined). Please read the `config.json` file for the details of each dataset. Each `.npz` file can be used directly by PyTorch's `DataLoader`.
- Run `./remap_labels.py` to re-map the labels into consecutive integers and to get the total number of labels for setting `--num_classes`.
- For each HtFL framework, deploy `./system/servers/serverNAME.py` and `./system/clients/clientNAME.py` (together with `./system/utils`) to the workstation and devices, respectively. The server and client models will be saved to their respective `checkpoints` folders.
- Execute the server file with the appropriate configuration (`argparse`), which varies by HtFL framework.
- Execute the client file with the appropriate configuration (`argparse`), which varies by HtFL framework. Real data loading is not implemented yet.
- Checkpoints are stored locally on clients in `args.save_folder_path`, with timestamps used by default.
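The label re-mapping step can be sketched as follows. This is a minimal illustration, not the actual `remap_labels.py`; the idea of wrapping remapped arrays back into `.npz` partitions is an assumption about the data layout.

```python
import numpy as np

def remap_labels(y):
    """Map arbitrary label values to consecutive integers 0..K-1.

    Returns the remapped labels and the total number of classes
    (the value to pass as --num_classes).
    """
    classes = np.unique(y)                      # sorted unique label values
    mapping = {c: i for i, c in enumerate(classes)}
    y_new = np.array([mapping[v] for v in y], dtype=np.int64)
    return y_new, len(classes)

if __name__ == "__main__":
    # Example with non-consecutive labels, as can occur in iWildCam.
    y = np.array([7, 42, 7, 105, 42])
    y_new, num_classes = remap_labels(y)
    print(y_new.tolist(), num_classes)  # [0, 1, 0, 2, 1] 3
    # The remapped arrays can then be saved back into the .npz partitions
    # and wrapped in a torch.utils.data.TensorDataset for the DataLoader.
```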
Deploying experiments on CoLExT requires specifying a CoLExT config file. An example is shown below. Additional examples will be placed inside `colext_experiments/`.
Important notes:
- Before running the client Python code, client devices self-assign their own data using the `./config_device_data.sh` script. The script assigns data partitions based on the client ID provided by CoLExT. For example, the "identity" strategy assigns partition x to client x.
- Additional arguments can be specified for a particular client group, allowing different client groups to receive different arguments.
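As an illustration, the identity strategy could look roughly like this. This is a hypothetical sketch: the actual script contents, the partition file layout, and the `COLEXT_CLIENT_ID` variable name are assumptions.

```shell
# Hypothetical core of config_device_data.sh: self-assign a data partition.
# Assumes partitions are stored as <dataset_dir>/<client_id>.npz and that
# CoLExT exposes the client ID in COLEXT_CLIENT_ID (an assumption).
assign_partition() {
    dataset_dir="$1"
    strategy="$2"
    case "$strategy" in
        identity)
            # "identity" strategy: client i takes partition i.
            cp "${dataset_dir}/${COLEXT_CLIENT_ID}.npz" ./train.npz
            ;;
        *)
            echo "unknown strategy: ${strategy}" >&2
            return 1
            ;;
    esac
}
```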
Example config:
```yaml
# colext_experiments/example_config.yaml
project: htfl-ondevice
code:
  # Path to the code root dir, relative to the config file
  # Defaults to the config file dir if omitted
  # The working directory is set to this path
  path: "../"
  client:
    command: >-
      ./config_device_data.sh ${COLEXT_DATASETS}/iWildCam identity &&
      python3 -m system.clients.clientLG
      --server_address=${COLEXT_SERVER_ADDRESS}
      --num_classes=158
  server:
    command: >-
      python3 -m system.servers.serverLG
      --min_fit_clients=${COLEXT_N_CLIENTS}
      --min_available_clients=${COLEXT_N_CLIENTS}
      --num_rounds=10
clients:
  - dev_type: JetsonAGXOrin
    count: 2
    # Add additional arguments for this client group
    add_args: "--model=ResNet101"
  - dev_type: OrangePi5B
    count: 2
    add_args: "--model=ResNet18"
  - dev_type: OrangePi5B
    # If `count` is not specified, it's assumed to be 1
    add_args: "--model=ResNet34"
```
For an up-to-date reference on the CoLExT config file, refer to CoLExT's README.
Interacting with CoLExT:
```sh
# Launch an experiment
colext_launch_job -c colext_experiments/example_config.yaml

# Collect metrics
colext_get_metrics -j <job_id>
```
To debug an experiment, the CoLExT config can be run locally on the CoLExT server using CoLExT's local Python deployer.
```sh
colext_launch_job -c colext_experiments/example_config.yaml --deployer=local_py
# Experiment logs are collected in the current working directory
```
Easy-to-run benchmarks are available in `colext_experiments/benchmarks`:
- `resnet_models/`: benchmark several ResNet models across all device types
- `devs_w_same_energy_efficiency/`: benchmark all HtFL frameworks using one model per device, chosen to ensure a similar energy budget for each device. The energy threshold was selected as the median energy spent in a round by the fastest device in the `resnet_models` benchmark.
```sh
$ cd colext_experiments/benchmarks
$ ./run_benchmark.sh <path_to_benchmark_folder>
# Calls <benchmark_folder>/gen_configs.py to generate CoLExT configs
# Outputs configs to `<benchmark_folder>/output/colext_configs`
# Launches a job for each config and records the job id
# Writes job ids to `<benchmark_folder>/output/output_job_id_maps.txt`

# After the benchmark is finished, plot the results
$ python3 plot_benchmark.py <path_to_benchmark_folder>
# Creates plots based on the job ids from `<benchmark_folder>/output/output_job_id_maps.txt`
# Plots are written to `<benchmark_folder>/output/plots`
```
Benchmarks consist of running multiple `colext_config.yaml` files. To avoid generating these files manually, each benchmark folder contains a `gen_configs.py` script, which holds all the logic required to create the benchmark's configuration files and outputs them to `<benchmark_folder>/output/colext_configs`. Refer to existing benchmark folders for examples.
The `gen_configs.py` file is all that's needed to create a benchmark. With the file in place, the benchmark can be run as described in How to run a benchmark.
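As an illustration, a minimal `gen_configs.py` could be structured as below. This is a hypothetical sketch, not one of the actual benchmark scripts: the template fields, variant list, and output file naming are assumptions.

```python
import os

# Hypothetical template for a generated CoLExT config; the client/server
# commands and field values are illustrative assumptions.
CONFIG_TEMPLATE = """\
project: htfl-ondevice
code:
  path: "../../"
  client:
    command: >-
      python3 -m system.clients.clientLG --num_classes=158
  server:
    command: >-
      python3 -m system.servers.serverLG --num_rounds=10
clients:
  - dev_type: {dev_type}
    count: 2
    add_args: "--model={model}"
"""

def gen_configs(out_dir, variants):
    """Write one CoLExT config per (dev_type, model) variant."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for dev_type, model in variants:
        path = os.path.join(out_dir, f"{dev_type}_{model}.yaml")
        with open(path, "w") as f:
            f.write(CONFIG_TEMPLATE.format(dev_type=dev_type, model=model))
        paths.append(path)
    return paths

if __name__ == "__main__":
    # Generate one config per device/model pair, as in the resnet_models idea.
    paths = gen_configs(
        "output/colext_configs",
        [("JetsonAGXOrin", "ResNet101"), ("OrangePi5B", "ResNet18")],
    )
    print(paths)
```

`run_benchmark.sh` can then launch one job per generated file.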