Multi-Head Attention has become ubiquitous in modern machine learning architectures, but how much efficiency can still be gained? This question was the focus of Dr. Maximilian Baust’s talk, "Beyond Transformers: Why Beating Multi-Head Attention is Hard."
In his presentation, Dr. Baust surveyed approaches to improving efficiency, ranging from implementation-level optimizations and algorithmic modifications of the attention mechanism itself to entirely new architectures such as spiking neural networks.
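As background for readers less familiar with the mechanism at the heart of the talk, here is a minimal sketch of standard multi-head self-attention in PyTorch. The function and weight names (multi_head_attention, w_q, w_k, w_v, w_o) are our own illustrative choices, not code from the presentation; the (seq_len × seq_len) score matrix computed inside is the source of the quadratic cost in sequence length that most efficiency proposals target.

```python
import torch
import torch.nn.functional as F

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Plain multi-head self-attention (Vaswani et al., 2017).

    x: (batch, seq_len, d_model); each weight is a (d_model, d_model)
    projection matrix. Illustrative sketch only.
    """
    batch, seq_len, d_model = x.shape
    d_head = d_model // num_heads

    # Project the input and split into heads: (batch, heads, seq, d_head)
    def split(t):
        return t.view(batch, seq_len, num_heads, d_head).transpose(1, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)

    # Scaled dot-product attention; the score matrix is
    # (batch, heads, seq_len, seq_len) -- quadratic in sequence length.
    scores = (q @ k.transpose(-2, -1)) / d_head**0.5
    out = F.softmax(scores, dim=-1) @ v

    # Merge the heads back together and apply the output projection.
    out = out.transpose(1, 2).reshape(batch, seq_len, d_model)
    return out @ w_o

# Usage example with made-up shapes: batch of 2, 10 tokens, d_model=64.
d = 64
x = torch.randn(2, 10, d)
w = lambda: torch.randn(d, d) / d**0.5
print(multi_head_attention(x, w(), w(), w(), w(), num_heads=4).shape)
# torch.Size([2, 10, 64])
```

Both the implementation strategies and the algorithmic modifications discussed in the talk can be read as attacks on different parts of this computation, from how the score matrix is materialized in memory to whether it is computed exactly at all.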
Dr. Maximilian Baust serves as Director of Solution Architecture Industries EMEA at NVIDIA and is also an industry mentor for one of relAI’s PhD students.
We extend our gratitude to Dr. Baust for sharing his insights and to our director, Gitta Kutyniok, for inviting him to relAI.