Principled Solutions for Efficient Artificial Neural Networks

Speaker

Bio: Adrian Bulat is currently a Senior Research Scientist at Samsung AI Cambridge. Previously, he received his PhD from the University of Nottingham as part of the Computer Vision Laboratory group. His research lies at the intersection of Computer Vision and Machine Learning, covering topics such as efficient neural networks (via bit quantization, network binarization, and compression), representation learning, and human analysis, including face alignment, face super-resolution, and human pose estimation.

Homepage: https://www.adrianbulat.com/

Abstract

The ever-growing demand for efficient models, driven by the rapid adoption of mobile and low-powered devices, has attracted a recent influx of methods that attempt to tackle the problem from different angles. In this presentation, we will focus primarily on strategies for designing efficient architectures and on quantization-based methods. On the architectural side, we will present our recent work on efficient image and video transformers that explore hardware-friendly attention modelling solutions. In the second part, we will briefly cover extreme quantization in the form of binarization, a general approach that can enable up to 64x faster inference.
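As a rough illustration of where the speedup in binarization comes from (a minimal sketch, not the specific method presented in the talk): once weights and activations are constrained to {-1, +1}, a dot product can be computed with bitwise XNOR and popcount instead of floating-point multiply-accumulates, so many positions can be processed per machine word. All function names below are hypothetical.

```python
# Illustrative sketch of binarized inference: weights and activations in
# {-1, +1}, dot products via XNOR + popcount. Hypothetical example code,
# not the speaker's implementation.

import numpy as np

def binarize(x):
    """Map real values to {-1, +1} via the sign function."""
    return np.where(x >= 0, 1, -1).astype(np.int8)

def float_dot(a, b):
    """Reference dot product in floating point."""
    return float(np.dot(a.astype(np.float32), b.astype(np.float32)))

def xnor_popcount_dot(a_bin, b_bin):
    """Dot product of two {-1, +1} vectors using bit operations.

    Encode -1 as bit 0 and +1 as bit 1. XNOR is 1 where the signs agree
    (product +1) and 0 where they disagree (product -1), so
    dot = 2 * popcount(xnor) - n.
    """
    n = len(a_bin)
    a_bits = (a_bin > 0).astype(np.uint8)
    b_bits = (b_bin > 0).astype(np.uint8)
    xnor = 1 - (a_bits ^ b_bits)   # 1 where the signs agree
    popcount = int(xnor.sum())
    return 2 * popcount - n

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal(64)    # full-precision weights
    x = rng.standard_normal(64)    # full-precision activations

    w_bin, x_bin = binarize(w), binarize(x)

    # Both computations agree exactly on the binarized vectors; the speedup
    # in practice comes from packing many such positions into one word and
    # replacing multiply-accumulates with XNOR/popcount instructions.
    assert float_dot(w_bin, x_bin) == xnor_popcount_dot(w_bin, x_bin)
    print("binary dot product:", xnor_popcount_dot(w_bin, x_bin))
```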

Video