Special Seminar: Aurelien Lucchi "An Optimization Perspective on Deep Learning"
Monday, 15 March at 13:00
via Zoom: request the link by email from [email protected]
Note: the link will be sent to CS staff separately each day.
Abstract: Deep neural networks have become ubiquitous in many areas of machine learning over the past decade. However, learning the parameters of these networks is a challenging task due to, among other factors, the non-convex nature and the high-dimensionality of the objective function. Training deep neural networks is in fact nearly impossible without various normalization techniques and proper initialization strategies.
Abstract (cont.): This talk will start with a review of some of the important problems that make training deep neural networks a challenging task, including the problem of vanishing gradients. I will discuss a new result demonstrating that this phenomenon extends to curvature, which has important consequences that can severely affect the performance of commonly used optimization methods. Then, I will show how batch normalization -- a technique that stabilizes the distribution of the inputs across layers -- has provable benefits for the optimization process. The talk will conclude with a discussion of several research directions aiming to improve the optimization of deep neural networks.
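To make the two phenomena in the abstract concrete, the following sketch (a hypothetical toy setup, not from the talk) shows numerically how gradients vanish in a deep chain of sigmoid layers, and gives a minimal forward pass of batch normalization as described above; the function names and constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Vanishing gradients (toy illustration) ---
# Backpropagated gradients through a chain of sigmoid layers are
# products of per-layer derivatives. Since sigmoid'(z) <= 0.25,
# the product shrinks geometrically with depth.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

depth = 50
z = rng.standard_normal(depth)                 # one pre-activation per layer
grad = np.prod(sigmoid(z) * (1.0 - sigmoid(z)))
print(f"gradient magnitude after {depth} sigmoid layers: {grad:.3e}")

# --- Batch normalization (minimal forward-pass sketch) ---
def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch dimension,
    then apply a learnable scale (gamma) and shift (beta)."""
    x_hat = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)
    return gamma * x_hat + beta

# A batch of 256 pre-activations with a shifted, stretched distribution.
x = 5.0 + 3.0 * rng.standard_normal((256, 4))
y = batch_norm(x)
# After normalization, each feature has ~zero mean and ~unit variance,
# which is the stabilization effect the abstract refers to.
```

The first part prints a gradient magnitude far below machine-relevant scales even at modest depth, and the second shows the per-feature statistics that batch normalization keeps stable across layers.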
Bio: Aurelien Lucchi has been a researcher at the Institute for Machine Learning at ETH Zurich since July 2018. He earned his PhD in Machine Learning and Computer Vision from EPFL in 2013, and an MSc in Computer Science from INSA Lyon, France. From January 2014 to June 2018, he was a postdoctoral fellow at ETH Zurich in the group of Prof. Thomas Hofmann. His research interests are in optimization and large-scale learning, as well as machine learning applications in computer vision, cosmology, quantum computing, and biology. He regularly serves as an area chair or program committee member at major conferences (NeurIPS, ICML, ICLR, UAI, IJCAI) and as a reviewer for major machine learning conferences and journals (JMLR, Mathematical Programming, etc.). In addition to his academic career, Aurelien has industrial research experience, including internships at Google Research and Microsoft Research.