Date Thesis Awarded

5-2024

Access Type

Honors Thesis -- Access Restricted On-Campus Only

Degree Name

Bachelor of Science (BS)

Department

Computer Science

Advisor

Huajie Shao

Committee Members

Haipeng Chen

Ashley Gao

Abstract

In modern machine learning research, data privacy and scarcity are two pressing challenges for deploying data-driven models in domains with sensitive data, including healthcare, finance, and government. Under data privacy regulations, restrictions on data sharing make it almost impossible to aggregate sufficient data from each party to train a robust machine learning model collectively. While federated learning (FL) frameworks enable collaborative training of a shared model without direct data exchange, preserving data privacy, their performance generally degrades significantly when clients have only a limited amount of local data. To address these issues, we propose a Deep Generative Federated Learning Model (FedDeepGen) built on the Federated Averaging framework. Our model trains a Variational Autoencoder (VAE) and a Multilayer Perceptron (MLP) classifier concurrently on local clients and aggregates their parameters on a central server, enabling both image classification and image generation. We verify its efficacy in image classification by comparing it against two baselines across different numbers of clients on multiple benchmark datasets. Additionally, we demonstrate its capability to generate new images for effective data augmentation, mitigating data scarcity in federated learning and improving the global model's generalization ability.
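The training scheme the abstract describes, where each client trains a local VAE and MLP classifier and a central server averages their parameters in Federated Averaging style, can be sketched as below. This is a minimal illustration under assumed settings (layer sizes, equal loss weighting, an unweighted parameter average, and a hypothetical `client_loaders` input), not the thesis' actual FedDeepGen implementation.

```python
# Sketch only: assumed architecture and hyperparameters, not the thesis' code.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    """Small VAE over flattened images (e.g., 28x28 -> 784); sizes are illustrative."""
    def __init__(self, in_dim=784, z_dim=20):
        super().__init__()
        self.enc = nn.Linear(in_dim, 400)
        self.mu, self.logvar = nn.Linear(400, z_dim), nn.Linear(400, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, 400), nn.ReLU(), nn.Linear(400, in_dim))

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.dec(z), mu, logvar

def local_update(vae, clf, loader, epochs=1, lr=1e-3):
    """Client step: jointly train the local VAE (reconstruction + KL) and MLP classifier."""
    opt = torch.optim.Adam(list(vae.parameters()) + list(clf.parameters()), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            x = x.view(x.size(0), -1)
            recon, mu, logvar = vae(x)
            kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / x.size(0)
            loss = F.mse_loss(recon, x) + kl + F.cross_entropy(clf(x), y)
            opt.zero_grad(); loss.backward(); opt.step()
    return vae.state_dict(), clf.state_dict()

def fed_avg(states):
    """Server step: element-wise average of client parameters.
    (Unweighted mean for brevity; FedAvg proper weights by client sample count.)"""
    avg = copy.deepcopy(states[0])
    for k in avg:
        avg[k] = torch.stack([s[k].float() for s in states]).mean(dim=0)
    return avg

def train(client_loaders, rounds=10, num_classes=10):
    global_vae = VAE()
    global_clf = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, num_classes))
    for _ in range(rounds):
        vae_states, clf_states = [], []
        for loader in client_loaders:  # each client starts from the current global weights
            vae, clf = copy.deepcopy(global_vae), copy.deepcopy(global_clf)
            v_sd, c_sd = local_update(vae, clf, loader)
            vae_states.append(v_sd); clf_states.append(c_sd)
        global_vae.load_state_dict(fed_avg(vae_states))  # aggregate on the central server
        global_clf.load_state_dict(fed_avg(clf_states))
    return global_vae, global_clf
```

After training, the global VAE's decoder can be sampled to generate synthetic images for data augmentation, which is the role the abstract assigns to the generative half of the model.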
