This event has passed.
Monday, October 23, 2023
10:30 am – 11:30 am
Presenter: Babak Hassibi
The NSF AI Institute for Edge Computing (Athena) presents a Distinguished Seminar by Babak Hassibi, titled “Implicit and Explicit Regularization in Deep Networks: Generalization, Pruning, and Compression,” on Monday, October 23, 2023, from 10:30 to 11:30 a.m. EST. The seminar will be held via Zoom (Meeting ID 954 5846 0559, Passcode 102323) and in person at Duke University. Refreshments will be provided at 10:00 a.m.
Abstract:
Deep learning has been the main driver behind the tremendous recent achievements in machine learning and AI. One of the seeming paradoxes of deep networks is their uncanny ability to generalize to unseen data, even though they have orders of magnitude more parameters than training data points and can perfectly "interpolate" the training set. It is now being recognized that this generalization ability is due to the implicit regularization inherent to the stochastic gradient descent (SGD) algorithms used to train them, which allows one to find "good" interpolating solutions. In this talk, we shall review these results and further introduce a family of algorithms called stochastic mirror descent (SMD), which allows one to choose arbitrary (implicit and explicit) convex regularizers. In particular, we show that explicit regularization yields far superior generalization performance over SGD on noisy data sets. We further show that, with appropriate explicit regularizers, it is possible to significantly prune networks, or to quantize each parameter to a small number of bits, without sacrificing appreciable performance.
Bio:
Babak Hassibi is the inaugural Mose and Lillian S. Bohn Professor of Electrical Engineering and Computational and Mathematical Sciences at the California Institute of Technology, where he has been since 2001. From 2011 to 2016 he was the Gordon M. Binder/Amgen Professor of Electrical Engineering, and from 2008 to 2015 he was Executive Officer of Electrical Engineering, as well as Associate Director of Information Science and Technology. Prior to Caltech, he was a Member of the Technical Staff in the Mathematical Sciences Research Center at Bell Laboratories, Murray Hill, NJ. He obtained his PhD degree from Stanford University in 1996 and his BS degree from the University of Tehran in 1989. His research interests span various aspects of information theory, communications, signal processing, control, and machine learning. He is an ISI highly cited author in Computer Science and, among other awards, is the recipient of the US Presidential Early Career Award for Scientists and Engineers (PECASE) and the David and Lucile Packard Fellowship in Science and Engineering.