How AI learned to feel | 75 years of reinforcement learning

I follow the history of model-free reinforcement learning, from learning tic-tac-toe, checkers, and backgammon to physical problems: cart-and-pole balancing, walking, and grasping (OpenAI's dexterous robot hand). I explain value functions, Q-functions, and policy functions, how they work together, and how TD learning was used.
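To make the TD-learning idea concrete, here is a minimal sketch of tabular TD(0) value updates on a toy three-state chain. The environment, states, and hyperparameters are illustrative inventions, not taken from the video; the update rule itself is the standard one: V(s) ← V(s) + α·(r + γ·V(s') − V(s)).

```python
# Minimal sketch of tabular TD(0) value learning on a made-up chain A -> B -> C -> terminal.
# The agent learns a value estimate V(s) for each state by bootstrapping from V(s').

states = ["A", "B", "C", "terminal"]
V = {s: 0.0 for s in states}       # value estimates, initialized to zero
alpha, gamma = 0.1, 0.9            # learning rate, discount factor (illustrative choices)

def step(s):
    """Toy environment: deterministically move right; reward 1 on reaching terminal."""
    nxt = states[states.index(s) + 1]
    reward = 1.0 if nxt == "terminal" else 0.0
    return nxt, reward

for _ in range(200):               # episodes
    s = "A"
    while s != "terminal":
        s2, r = step(s)
        # TD error: how much better or worse the outcome was than predicted
        td_error = r + gamma * V[s2] - V[s]
        V[s] += alpha * td_error   # nudge the estimate toward the bootstrapped target
        s = s2

print({k: round(v, 2) for k, v in V.items() if k != "terminal"})
```

After training, V(C) approaches 1.0 (the reward one step away), V(B) approaches γ·V(C) = 0.9, and V(A) approaches 0.81, i.e. each state's value reflects the discounted reward to come.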

Thanks to Jane Street for sponsoring this video. They are hiring people interested in ML! To learn more about their work and open positions (and support me), visit their website: https://www.janestreet.com/machine-learning/?utm_source=yt&utm_medium=video&utm_campaign=AoP

Along the way, we encounter the challenge of transferring simulated skills to the real world (domain randomization) and witness the emergence of uncanny, human-like behavior in AI agents. It leaves us with a provocative question: where is the line between actions and words? What is the role of a GPT for actions?
With insights from:
Claude Shannon
Arthur Samuel
Gerald Tesauro
Richard Sutton
David Silver
DeepMind, OpenAI, and others

00:00 – Introduction
00:32 – Learning Tic Tac Toe
02:00 – Teaching cart and pole
04:20 – Shannon and chess
06:50 – Samuel's checkers
09:25 – TD-Gammon (Gerald Tesauro)
11:00 – TD Learning
14:30 – Learning Atari (DQN)
17:28 – Direct policy progress
19:40 – Domain randomization

If you found this video helpful, please share it with your friends and family.