• Starts: 10:00 am on Monday, August 5, 2024
  • Ends: 12:00 pm on Monday, August 5, 2024

ECE PhD Dissertation Defense: Ruizhao Zhu

Title: Auxiliaries for Training Deep Neural Networks

Presenter: Ruizhao Zhu

Advisor: Professor Venkatesh Saligrama

Chair: TBD

Committee: Professor Venkatesh Saligrama, Professor Brian Kulis, Professor Eshed Ohn-Bar, Professor Wenchao Li

Google Scholar Link: https://scholar.google.com/citations?hl=en&user=otVAkGkAAAAJ&view_op=list_works&sortby=pubdate

Abstract: The generalization of deep neural networks (DNNs) remains a critical challenge in machine learning, where the goal is to ensure that models perform well on unseen test data. Regularization techniques play a pivotal role in this process by constraining the model's capacity to prevent overfitting. Methods such as dropout, data augmentation, normalization, and penalty functions introduce beneficial biases that guide the optimization process towards better generalization. Traditional regularization is typically data-agnostic. In this dissertation, we propose leveraging data-dependent auxiliary sources for regularization, focusing on image classification across various settings including supervised learning, knowledge distillation, federated learning, meta-learning, and few-shot learning, as well as imitation learning for autonomous driving.

We introduce four main types of auxiliary sources: models, gradients, features, and data. Auxiliary models approximate the mean of historical predictions, providing data-dependent regularization by aligning the deployed model's output with the auxiliary model's output. This method, which we extend to knowledge distillation, demonstrates improved generalization and draws connections to applications in large language models. Auxiliary gradients address bias in neural networks handling multiple tasks, particularly in personalized federated learning and online meta-learning. We propose algorithms that dynamically modify gradients to reduce task-specific biases, enhancing model performance across diverse users and tasks. Auxiliary features target tasks such as few-shot learning and fine-grained image recognition, where specialized modeling is required to capture detailed visual characteristics; we propose Deep Object Parsing to generate auxiliary features that enhance performance in these settings. Finally, for imitation learning in autonomous driving, we introduce frameworks that utilize vast amounts of auxiliary data. The SelfD framework leverages unlabeled dash camera data through semi-supervised learning, while the AnyD framework utilizes cross-domain driving data to improve policy learning. Both frameworks demonstrate significant improvements in driving performance by efficiently using auxiliary data.
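To illustrate the auxiliary-model idea described above (regularizing a deployed model by aligning its output with a running mean of its historical predictions), the following is a minimal NumPy sketch. The class name, the incremental-mean update, and the KL penalty are illustrative assumptions, not the dissertation's actual implementation:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class PredictionMeanAuxiliary:
    """Illustrative auxiliary model: a per-example running mean of the
    deployed model's historical softmax predictions. A training loop
    could add the KL term below to its loss as a data-dependent
    regularizer (hypothetical sketch, not the thesis's exact method)."""

    def __init__(self, n_examples, n_classes):
        # Start each example's history at the uniform distribution.
        self.mean = np.full((n_examples, n_classes), 1.0 / n_classes)
        self.count = np.zeros(n_examples)

    def update(self, idx, probs):
        """Incrementally fold a batch of predictions into the running mean
        (idx: unique example indices; probs: current softmax outputs)."""
        self.count[idx] += 1
        self.mean[idx] += (probs - self.mean[idx]) / self.count[idx][:, None]

    def kl_regularizer(self, idx, probs, eps=1e-8):
        """KL(historical mean || current prediction): penalizes the model
        for drifting away from its own prediction history."""
        p, q = self.mean[idx], probs
        return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1).mean()
```

In use, the regularizer would be scaled by a coefficient and added to the task loss each step, after which `update` records the fresh predictions; after the first update for an example, the mean equals that prediction, so the penalty is near zero and grows only as outputs drift.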

Location:
PHO 339