By Shubham Chaudhary

Federated Learning - Decentralized Deep Learning Technology

What is Federated Learning?

Federated learning, sometimes referred to as collaborative learning, is a machine learning technique that trains an algorithm across several distributed edge devices or servers holding local data samples, without transferring the data itself. This strategy differs from more established centralized machine learning methods in which all local datasets are uploaded to a single server. By separating the ability to do machine learning from the requirement to store the training data in the cloud, federated learning enables mobile devices to cooperatively develop a shared prediction model while keeping all the training data on the device. The algorithm functions as follows: your smartphone downloads the most up-to-date model, refines it using data from your phone, and then compiles the changes into a brief, focused update. Only this model update is transmitted via encrypted communication to the cloud, where it is quickly averaged with updates from other users to enhance the shared model. No individual update is stored in the cloud, and all training data remains on your device.
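To make this client-side flow concrete, here is a minimal sketch assuming a simple linear model held as a NumPy weight vector; the helpers download_global_weights and send_encrypted_update are hypothetical placeholders for the transport layer, not a specific framework's API.

import numpy as np

def local_update(global_weights, local_X, local_y, lr=0.01, epochs=1):
    # Refine the downloaded model on-device and return only a focused weight delta.
    w = global_weights.copy()
    for _ in range(epochs):
        preds = local_X @ w                          # simple linear model as a stand-in
        grad = local_X.T @ (preds - local_y) / len(local_y)
        w -= lr * grad
    return w - global_weights                        # the raw data never leaves the phone

# Hypothetical usage on the device:
# delta = local_update(download_global_weights(), phone_X, phone_y)
# send_encrypted_update(delta)                       # only the update travels to the cloud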

History of Federated Learning

Google first used the phrase "federated learning" in a paper published in 2016 that sought to address the question of how to train a centralized machine learning model while the data is spread across millions of clients. The "brute-force" approach to this problem is to send all the data to Google for processing, but that strategy has several issues, including heavy use of client data and a loss of data privacy. The paper's suggested fix was to send the model to each device, train it locally toward a minimum of the loss, and then send the resulting weights back to the central, or federated, server. The central node repeatedly cycles through the clients' parameters and averages them into a global model. As a result, a good shared model can be produced by exchanging only small files (models and weights) rather than the raw data.


Algorithms used in Federated Learning

  • Federated stochastic gradient descent (FedSGD)

Gradient descent is an optimization approach that helps locate the local minimum of a function and is frequently used to train ML models and neural networks. In FedSGD, each client computes gradients on a random subset of (or all of) its local data. The server then forms an average gradient, weighted by the number of training samples on each client, and applies it as one step of the descent on the global model.
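As a rough illustration of this aggregation step (not a specific library's implementation), suppose each client reports the gradient computed on its local data along with its sample count; the server then takes one descent step with their weighted average:

import numpy as np

def fedsgd_step(global_weights, client_grads, client_sizes, lr=0.01):
    # Weight each client's gradient by its share of the total training samples,
    # then apply a single gradient-descent step to the global model.
    total = sum(client_sizes)
    avg_grad = sum((n / total) * g for g, n in zip(client_grads, client_sizes))
    return global_weights - lr * avg_grad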

  • Federated Averaging

In contrast to FedSGD, federated averaging (FedAvg) lets clients share locally updated weights rather than gradients. Since all clients start a round from the same model, averaging the weights is nearly equivalent to averaging the gradients, which is why FedAvg can be seen as a generalization of FedSGD; it also allows clients to take several local training steps before averaging.
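A corresponding sketch of the FedAvg aggregation, again illustrative rather than any framework's API: clients send back locally trained weights (as NumPy arrays), and the server forms their sample-count-weighted average.

def fedavg_aggregate(client_weights, client_sizes):
    # The new global model is the weighted average of the clients' updated weights,
    # with each client weighted by how many training samples it used.
    total = sum(client_sizes)
    return sum((n / total) * w for w, n in zip(client_weights, client_sizes))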

Applications of Federated Learning

  • Smartphone

Statistical models power features like next-word prediction, facial recognition, and voice recognition by learning from user behaviour across a large pool of mobile phones. Users may choose not to share their data in order to maintain their privacy or to save data or battery life on their phones. Without disclosing personal information or jeopardizing the user experience, federated learning can still produce precise smartphone predictions.

  • Organization

In federated learning, entire organizations or institutions may also be treated as "devices." For instance, hospitals store enormous amounts of patient data that predictive healthcare programs could use. At the same time, hospitals adhere to strict privacy laws and may be constrained by administrative, legal, or ethical restrictions that require data to stay local. Because it lessens the network burden and enables private learning across several devices or organizations, federated learning is a promising solution for these applications.

  • IoT (Internet of Things)

Sensors are utilized in contemporary IoT networks, such as wearable technology, autonomous vehicles, and smart homes, to collect and process data in real time. A fleet of autonomous vehicles, for instance, might need a current simulation of pedestrian, construction, or traffic behaviour to function properly. However, because of privacy concerns and the constrained connectivity of each device, creating aggregate models in these situations may be challenging. Federated learning techniques make it possible to develop models that quickly adapt to these systems' changes while protecting user privacy.

  • Healthcare

Healthcare is one of the areas that can most benefit from federated learning because sensitive health information cannot be shared readily due to HIPAA and other constraints. This method allows for the construction of AI models while adhering to the regulations, using a sizable amount of data from various healthcare databases and devices.

  • E-Commerce

As you are aware, advertising personalization depends heavily on the information provided by each individual user. However, users of social networks, e-commerce platforms, and similar venues increasingly worry about how much of that information they would prefer not to share. Through federated learning, advertising can continue to draw on private customer data without that data leaving users' devices, which helps the industry adapt while reducing people's concerns.

  • Autonomous Automobiles

Federated learning is being applied to self-driving cars because it supports real-time prediction. According to one study, federated learning may reduce training time for predicting the steering angle of self-driving cars. The data may contain real-time updates on road and traffic conditions, enabling continual learning and quicker decision-making, which could lead to a safer and more enjoyable self-driving experience. The automotive industry is therefore a promising field for federated machine learning, although work in this area currently remains at the research stage.


The Architecture of Federated Learning

  1. Client – The client is the user device that has its own local data.

  2. Server – The server holds an initial machine learning model, which is shared with the clients.

  3. Locally trained model – When a client receives the initial model from the server, it trains that model locally on its own data; the locally trained model, containing the updated weights, is then shared with the server.

  4. This cycle of downloading, local training, and updating happens on many devices and is repeated several times until the model reaches good accuracy. Only then is the model distributed to all users and use cases (a rough sketch of what is exchanged in this cycle is shown below).
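As a rough sketch of what the client and server exchange during this cycle (the class and field names are hypothetical, not taken from any particular framework):

from dataclasses import dataclass
import numpy as np

@dataclass
class GlobalModel:
    # What the server shares with clients at the start of a round.
    weights: np.ndarray
    round_id: int

@dataclass
class ClientUpdate:
    # What a client sends back after local training: updated weights plus
    # its sample count, so the server can weight the average; no raw data.
    weights: np.ndarray
    num_samples: int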

How Federated Learning Works

Federated learning relies on an iterative process divided into client-server interactions known as federated learning rounds, which ensure good task performance of the final global model. During each round, the current state of the global model is transmitted to the participating nodes, local models are trained on these nodes to produce a set of potential model updates, and finally the local updates are combined into a single global update and applied to the global model. Assuming only one iteration of the learning process per federated round, the process can be summed up in the following steps, with a minimal code sketch after them:


1. Initialization: A machine learning model is selected to be trained on local nodes and is initialized based on the server's inputs. Nodes are then activated and await instructions from the central server to begin computing.

2. Client Selection: A subset of local nodes is chosen to begin training on local data. While the others wait for the subsequent federated round, the chosen nodes obtain the current statistical model.

3. Configuration: The central server directs the chosen nodes to train the model on their local data in accordance with predefined rules (e.g., a fixed number of mini-batch gradient-descent updates).

4. Reporting: For aggregation, each chosen node sends its local model to the server. The central server combines the received models and transmits the updated global model back to the nodes, handling failures due to lost model updates or disconnected nodes along the way. The next federated round then starts again from the client-selection phase.

5. Termination: The central server compiles the updates and completes the global model after a pre-defined termination criterion is fulfilled (e.g., the maximum number of iterations is completed, or the model accuracy exceeds a threshold).
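Putting the five phases together, here is a minimal, self-contained simulation of the round loop on a linear model with synthetic per-client data; the function name and the fixed-round termination criterion are illustrative assumptions, not part of any standard API.

import random
import numpy as np

def run_federated_training(clients_data, dim, rounds=20, clients_per_round=3, lr=0.01):
    # 1. Initialization: the server selects a model and initializes its weights.
    global_w = np.zeros(dim)
    for _ in range(rounds):
        # 2. Client selection: a random subset of nodes receives the current model.
        chosen = random.sample(list(clients_data), k=min(clients_per_round, len(clients_data)))
        local_weights, sizes = [], []
        for cid in chosen:
            X, y = clients_data[cid]
            # 3. Configuration: each chosen node trains locally per the server's rules
            #    (here, one full-batch gradient step on its own data).
            w = global_w - lr * (X.T @ (X @ global_w - y)) / len(y)
            # 4. Reporting: only the updated weights and sample count go to the server.
            local_weights.append(w)
            sizes.append(len(y))
        total = sum(sizes)
        global_w = sum((n / total) * w for w, n in zip(local_weights, sizes))
    # 5. Termination: here the criterion is simply a fixed round budget.
    return global_w

# Hypothetical usage with synthetic data spread across ten clients:
# clients_data = {i: (np.random.randn(50, 5), np.random.randn(50)) for i in range(10)}
# final_weights = run_federated_training(clients_data, dim=5)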


Patent Analysis

Nearly one-third of all AI journal papers and citations worldwide in 2021 came from China. China attracted $17 billion in investment for AI start-ups in 2021, accounting for more than one-fifth of all private investment funding worldwide. In China, AI businesses often fall into five major categories. Hyperscalers develop end-to-end technological expertise in AI and collaborate within the ecosystem to assist both business-to-business and business-to-consumer firms. Companies in conventional industries develop and use AI for internal transformation, new product launches, and customer service, serving consumers directly. Businesses that specialize in AI for a particular industry develop programs and solutions for specific use cases. AI core tech providers supply computer vision, natural language processing, voice recognition, and machine learning technologies that developers use to build AI systems. Hardware suppliers provide the processing and storage infrastructure required to meet the demand for AI.


Artificial intelligence (AI) has been widely used in recent years, which holds fresh promise and implications for the finance industry. A symposium on the most recent developments in AI in finance was held on March 1 by the Nanyang Business School (NBS) and the Artificial Intelligence Research Institute (AI.R) of Nanyang Technological University Singapore (NTU Singapore). The session, which was a part of the NBS Knowledge Lab Webinar series, was supported by the Joint WeBank-NTU Research Centre on FinTech and the NBS Information Management Research Centre (IMARC).

Facial recognition, natural language processing (NLP), federated learning, and other artificial intelligence (AI) technologies from WeBank have aided in the advancement of back-office activities such as anti-money laundering, identity theft prevention, credit risk management, and intelligent equity pricing, and have driven the sophisticated transformation of customer service. Federated learning was also used in the cooperative modelling of consumer loan data for microloans. Seventy percent of the issues raised by small and medium-sized enterprises (SMEs) have been handled, and collaborative modelling could enable $1 billion in business loans. The AI team at WeBank, a leader in federated learning technology, developed the FATE ("Federated AI Technology Enabler") federated learning framework, and the idea has gained support from more than 800 companies and 300 organizations.


Advantages of Federated Learning

  1. No Central Data Collection: With the aid of federated learning, mobile devices learn a shared prediction model and retain the training data on-device rather than uploading and storing it on a central server.

  2. Security: You no longer need to worry as much about security when your personal information stays local on your own device. Thanks to federated learning, all the data needed to train the model remains protected on the device. Institutions like hospitals that place a high value on data protection can use federated learning for this reason.

  3. Real-Time Predictions: Because the model and data are available on the device, FL enables real-time predictions on your smartphone without a round trip to a centralized server. This reduces lag, since inference happens directly on the local device.

  4. Internet is not Required: The model's predictive capabilities do not require an internet connection, because the model and the data are stored on your device. This means you can get predictions quickly regardless of where you are.

  5. Minimum Hardware: Because the data is already on your mobile device and models are trained locally, federated learning does not require substantial central hardware infrastructure; the modest hardware in the devices themselves is enough.

Conclusion

By allowing edge devices to train the model using their own data, FL has emerged as a cutting-edge learning platform that addresses data privacy concerns while also offering a transfer learning paradigm. Modern machine learning has undergone a radical change thanks to the increasing storage and computing power of edge nodes such as autonomous vehicles, smartphones, tablets, and 5G mobile networks, so FL applications now span multiple domains. However, there are several areas where FL must be developed further. For instance, FedAvg, its default aggregation algorithm, has application-dependent convergence, indicating the need to investigate more sophisticated aggregation techniques. Similarly, resource management can be crucial when dealing with the complex computation FL requires, so the communication, computation, and storage costs incurred by edge devices during model training need to be optimized. Additionally, most research currently focuses on IoT, healthcare, and similar fields, but other application areas, such as food delivery systems, virtual reality applications, finance, public safety, hazard identification, traffic management, and monitoring, can also profit from this learning paradigm.

