Federated-Learning-Research

An implementation of Federated Learning research baseline methods based on FedML-core. Rather than a stand-alone simulation, this is a distributed system that can be deployed on multiple real devices (or on several Docker containers on the same server), helping researchers explore problems that arise in real FL systems.

Here is a list of publications based on this repository:

  • [ICPP 2022] Our work "Eco-FL: Adaptive Federated Learning with Efficient Edge Collaborative Pipeline Training" has been accepted by the International Conference on Parallel Processing (ICPP), 2022.

Quick Start

Here is a demo deploying our framework on a cluster with three devices. A (ip: 172.17.0.4) is the server worker; B (ip: 172.17.0.13) and C (ip: 172.17.0.12) are two client workers in the FL system. All devices must have a Python 3 environment with PyTorch and gRPC installed. The environment requirements are listed in the requirement.txt file.

Startup scripts for all methods are under the experiment/ directory. First, modify the grpc_ipconfig.csv file as below:

receiver_id,ip
0,172.17.0.4
1,172.17.0.13
2,172.17.0.12

receiver_id is the worker_id of each worker in the FL system. Conventionally, the server worker has worker_id 0 and client workers' ids start from 1.
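For illustration, the id-to-ip mapping above could be loaded with a few lines of Python (this is a sketch, not the framework's actual config loader; the function name is hypothetical):

```python
import csv

def load_ipconfig(path):
    """Read grpc_ipconfig.csv into a {worker_id: ip} mapping.

    Expects a header row `receiver_id,ip` followed by one row per worker,
    as in the example above.
    """
    with open(path, newline="") as f:
        return {int(row["receiver_id"]): row["ip"] for row in csv.DictReader(f)}

# With the example file above this yields:
# {0: '172.17.0.4', 1: '172.17.0.13', 2: '172.17.0.12'}
```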

After that, start the worker process on each device. All client workers must be started before the server worker. Remember to execute all commands below from the root directory of this project:

bash experiment/fedprox/run_fedprox_distributed.sh $worker_id

Then the training process will begin, and checkpoints of the global model will be saved under checkpoint/fedprox/. You can change the experiment settings by editing the preset arguments in run_fedprox_distributed.sh.

Reproduced FL Algorithms

Method Reference
FedAvg McMahan et al., 2017
FedProx Li et al., 2020
FedAsync Wang et al., 2021
FedAT Zheng Chai et al., 2021
Ongoing ...
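As a reference point for the methods above, the core aggregation step of FedAvg (McMahan et al., 2017) can be sketched in a few lines. This is an illustrative sketch with hypothetical names, not this repository's API; parameters are represented as plain lists of floats for clarity:

```python
def fedavg_aggregate(client_weights, client_sizes):
    """Sample-count-weighted average of client model parameters.

    client_weights: one parameter vector (list of floats) per client
    client_sizes:   number of local training samples on each client
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, n_k in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            # Each client contributes proportionally to its data size
            global_weights[i] += (n_k / total) * w
    return global_weights

# Two clients holding 100 and 300 samples respectively
print(fedavg_aggregate([[1.0, 2.0], [5.0, 6.0]], [100, 300]))  # [4.0, 5.0]
```

FedProx, FedAsync, and FedAT modify this basic scheme (with a proximal term, staleness-weighted asynchronous updates, and tiered aggregation, respectively), but the weighted-average step remains the common core.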

Contacts
