
Federated Learning For Nonconvex Problems

Summary

The increasing availability and diversity of quantitative data, together with advances in the theory and practice of computational science, are allowing once-obscure information and insights to emerge, giving users more accurate and personalized product recommendations and decision support. Given this increasingly important role, it is crucial to have well-understood computational tools that are fair, correct, and robust. The most exciting recent developments in data-driven computation involve optimizing complicated black-box models with enormous numbers of parameters. However, global recovery guarantees are available only when the optimization problem is convex, which imposes strong technical assumptions and covers only a small portion of practical problems.

Figure 9: Federated Learning

Despite the lack of theoretical justification, researchers and practitioners have developed natural, efficient, and effective nonconvex formulations and algorithms over years of research and engineering practice. For example, iterative and greedy methods such as Lloyd's algorithm and the EM algorithm have long been popular for clustering in bioinformatics, image analysis, computer graphics, and beyond. For these nonconvex formulations, only local solutions can be recovered, and they depend heavily on the initialization and the distribution of the training data.
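To make the dependence on initialization concrete, here is a minimal Python sketch of Lloyd's algorithm (a bare-bones illustration, not tied to any particular library implementation); running it with two different random initializations on the same data can return different sets of cluster centers, i.e., different local solutions.

```python
import numpy as np

def lloyds_algorithm(X, k, init_centers, n_iters=100):
    """Plain Lloyd's algorithm; the solution it returns is only a
    local optimum and depends on init_centers."""
    centers = init_centers.copy()
    for _ in range(n_iters):
        # Assignment step: attach each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its cluster.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

# Two different initializations on the same data can converge
# to different local solutions.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(loc=m, scale=0.5, size=(50, 2))
                    for m in ([0, 0], [4, 0], [2, 3])])
for seed in (1, 2):
    init = X[np.random.default_rng(seed).choice(len(X), 3, replace=False)]
    centers, _ = lloyds_algorithm(X, 3, init)
    print(np.sort(centers, axis=0))
```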

This challenge is further amplified by several modern computing trends. Classic machine learning approaches require centralizing large volumes of training data on one machine or in a data center. Federated learning (FL) is a new edge-computing approach that instead trains models directly on remote devices by leveraging the enhanced local resources of each device. In a standard FL setting, there are N clients (N is typically very large in practice), each holding its own training data, and a central server whose role is to manage the training of a centralized model using the clients' data.
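A minimal sketch of this client–server loop follows, in the style of federated averaging (FedAvg); the function names, the least-squares loss, and the plain SGD local update are illustrative assumptions, not a reference implementation. The key point is that the server only ever sees model parameters, never the clients' raw data.

```python
import numpy as np

def local_update(w, data, lr=0.01, local_steps=5):
    # Hypothetical client-side routine: a few gradient steps on the
    # client's private data for a least-squares loss.
    X, y = data
    for _ in range(local_steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def fedavg_round(w_global, client_datasets):
    # Server broadcasts w_global; each client trains locally and
    # returns its model; the server averages, weighted by data size.
    local_models, sizes = [], []
    for data in client_datasets:
        local_models.append(local_update(w_global.copy(), data))
        sizes.append(len(data[1]))
    weights = np.array(sizes) / sum(sizes)
    return sum(wi * mi for wi, mi in zip(weights, local_models))

# Toy run: 3 clients with private data; the server never sees it.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(40, 2))
    clients.append((X, X @ w_true + 0.1 * rng.normal(size=40)))
w = np.zeros(2)
for _ in range(50):
    w = fedavg_round(w, clients)
print(w)  # approaches w_true on this convex toy problem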

Federated learning enables multiple clients to build a common, robust machine learning model without sharing data, thus addressing issues such as data privacy, data security, data access rights, and access to heterogeneous data. In FL, data is distributed across millions of clients in a highly uneven fashion, and the underlying machine learning objective f is usually nonconvex. The absence of guarantees about local solutions is exacerbated in this setting: when computation and data are decentralized unevenly, each client is very likely to return a different local solution to the same underlying nonconvex problem, so how the central server synchronizes those local solutions stands out as a central challenge for federated learning.
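To see why naively averaging local solutions can fail for a nonconvex f, consider the following toy Python sketch (an illustrative assumption, not drawn from any specific FL system): two clients run gradient descent on the same double-well objective from different initializations, and the server's average of their two solutions lands on a stationary point that is worse than either.

```python
import numpy as np

# Toy nonconvex objective with two local minima, near w = -1 and w = +1.
def f(w):
    return (w**2 - 1)**2

def grad_f(w):
    return 4 * w * (w**2 - 1)

def local_solve(w0, lr=0.05, steps=200):
    # Gradient descent from w0; converges to whichever basin w0 lies in.
    w = w0
    for _ in range(steps):
        w = w - lr * grad_f(w)
    return w

# Two clients start from different points and reach different
# local minima of the *same* objective.
w_a = local_solve(-0.5)    # converges to about -1.0
w_b = local_solve(+0.5)    # converges to about +1.0
w_avg = 0.5 * (w_a + w_b)  # about 0.0, a local *maximum* of f
print(w_a, w_b, w_avg, f(w_avg))  # f(0) = 1, far above f(+-1) = 0
```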