The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Quantizing Transformers by Helping Attention Heads Do Nothing with Markus Nagel - #663
Today we’re joined by Markus Nagel, research scienti...
Dec 26 2023 46m
Chapter 1 (15 mins): Transformer Efficiency With Qualcomm AI Research
Chapter 2 (12 mins): Model Efficiency Through Quantization and Pruning
Chapter 3 (10 mins): Equivariance, Transformers, and LLMs
Chapter 4 (8 mins): On-Device AI and Full Stack Optimization