The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) - Teaching Large Language Models to Reason with Reinforcement Learning with Alex Havrilla

Model Diversity and Output Performance

Today we're joined by Alex Havrilla, a PhD student a...

Apr 16, 2024, 46 min

Chapter 1 (14 mins): Reinforcement Learning for Language Models
Chapter 2 (10 mins): Exploring Model Output Diversity in RL
Chapter 3 (7 mins): Comparison of RL Fine-Tuning Algorithms
Chapter 4 (10 mins): Reasoning and Noise Impact in Training
Chapter 5 (2 mins): Enhancing Transformer Model Generalization
