The Real Python Podcast
Measuring Bias, Toxicity, and Truthfulness in LLMs With Python
How can you measure the quality of a large language ...
Jan 19 2024 1h 15m
Chapter 1 – Introduction (2 mins)
Chapter 2 – Testing characteristics of LLMs with Python (1 min)
Chapter 3 – Background on LLMs (4 mins)
Chapter 4 – Training of models (5 mins)
Chapter 5 – Uncurated sources of training (1 min)
Chapter 6 – Safeguards and prompt engineering (5 mins)
Chapter 7 – TruthfulQA and creating a more strict prompt (2 mins)
Chapter 8 – Information that is out of date (2 mins)
Chapter 9 – WinoBias for evaluating gender stereotypes (2 mins)
Chapter 10 – BOLD dataset for evaluating bias (1 min)
Chapter 11 – Sponsor: Intel (49 sec)
Chapter 12 – Using Hugging Face to start testing with Python (4 mins)
Chapter 13 – Using the transformers package (2 mins)
Chapter 14 – Using langchain for proprietary models (5 mins)
Chapter 15 – Putting the tools together and evaluating (4 mins)
Chapter 16 – Video Course Spotlight (1 min)
Chapter 17 – Assessing toxicity (1 min)
Chapter 18 – Measuring bias (4 mins)
Chapter 19 – Checking the hallucination rate (1 min)
Chapter 20 – LLM leaderboards (1 min)
Chapter 21 – What helped ChatGPT leap forward? (7 mins)
Chapter 22 – Improvements of what is being crawled (1 min)
Chapter 23 – Revisiting agents and RAG (3 mins)
Chapter 24 – ChatGPT plugins and Wolfram-Alpha (2 mins)
Chapter 25 – How can people follow your work online? (1 min)
Chapter 26 – Thanks and goodbye (1 min)