Scaling Probabilistic Circuits via Data Partitioning
By: Jonas Seng, Florian Peter Busch, Pooja Prasad, and more
Potential Business Impact:
Trains smart computer models on many computers at once.
Probabilistic circuits (PCs) enable us to learn joint distributions over a set of random variables and to perform various probabilistic queries in a tractable fashion. Although tractability allows PCs to scale beyond non-tractable models such as Bayesian networks, scaling the training and inference of PCs to larger, real-world datasets remains challenging. To remedy this, we show how PCs can be learned across multiple machines by recursively partitioning a distributed dataset, thereby unveiling a deep connection between PCs and federated learning (FL). This leads to federated circuits (FCs), a novel and flexible FL framework that (1) allows one to scale PCs on distributed learning environments, (2) trains PCs faster, and (3) unifies, for the first time, horizontal, vertical, and hybrid FL in one framework by re-framing FL as a density estimation problem over distributed datasets. We demonstrate FC's capability to scale PCs on various large-scale datasets and its versatility in handling horizontal, vertical, and hybrid FL within a unified framework on multiple classification tasks.
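A rough intuition for the PC-FL connection, under standard PC semantics (sum nodes mix over subsets of data rows, product nodes factorize over disjoint feature sets): a horizontal data partition across clients can be read as a mixture over client-local models, and a vertical partition as a product over feature-wise models. The sketch below is a minimal toy illustration of this reading, not the authors' FC implementation; the class names, the Gaussian leaves, and the count-based mixture weights are illustrative assumptions.

```python
import numpy as np

# Toy sketch (not the paper's code): combine horizontally partitioned clients
# with a sum node (mixture) and vertically split features with a product node.

class GaussianLeaf:
    """Univariate Gaussian leaf fitted on one client's local feature column."""
    def __init__(self, values):
        self.mu = float(np.mean(values))
        self.sigma = float(np.std(values) + 1e-6)

    def log_prob(self, x):
        return -0.5 * (((x - self.mu) / self.sigma) ** 2
                       + np.log(2 * np.pi * self.sigma ** 2))

class ProductNode:
    """Vertical split: children model disjoint feature subsets (assumed independent)."""
    def __init__(self, children, scopes):
        self.children, self.scopes = children, scopes

    def log_prob(self, x):
        return sum(c.log_prob(x[..., s]) for c, s in zip(self.children, self.scopes))

class SumNode:
    """Horizontal split: mixture over children, weighted by local sample counts."""
    def __init__(self, children, counts):
        self.children = children
        self.log_w = np.log(np.asarray(counts, dtype=float) / np.sum(counts))

    def log_prob(self, x):
        stacked = np.stack([lw + c.log_prob(x)
                            for lw, c in zip(self.log_w, self.children)])
        return np.logaddexp.reduce(stacked, axis=0)

# Hypothetical usage: two clients hold different rows (horizontal partition);
# each client factorizes its two local features (vertical-style split).
rng = np.random.default_rng(0)
client_data = [rng.normal(0.0, 1.0, size=(100, 2)),
               rng.normal(3.0, 0.5, size=(80, 2))]

local_models = [
    ProductNode([GaussianLeaf(d[:, 0]), GaussianLeaf(d[:, 1])], scopes=[0, 1])
    for d in client_data
]
fc = SumNode(local_models, counts=[len(d) for d in client_data])

print(fc.log_prob(np.array([[0.0, 0.0], [3.0, 3.0]])))  # joint log-densities
```

In a federated deployment one would expect each client to fit its local sub-circuit on its own machine, with only the aggregating sum/product structure coordinated across clients; how FC does this in general, including hybrid partitions, is detailed in the full paper.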
Similar Papers
Federated Learning Framework for Scalable AI in Heterogeneous HPC and Cloud Environments
Distributed, Parallel, and Cluster Computing
Trains AI on many computers without sharing private data.
Emerging Paradigms for Securing Federated Learning Systems
Cryptography and Security
Makes AI learn from data without seeing it.
A new type of federated clustering: A non-model-sharing approach
Machine Learning (CS)
Lets groups learn from private data together.