How AI learned to feel | 75 years of reinforcement learning

I follow the history of model-free reinforcement learning, from learning tic-tac-toe, checkers, and backgammon to physical problems: cart-and-pole balancing, walking, and grasping (OpenAI's dexterous robot hand). I explain value functions, Q-functions, and policy functions, how they work together, and how TD learning was used.
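To make the TD-learning idea concrete, here is a minimal sketch of tabular TD(0) value updates on a toy three-state chain. The environment, states, and hyperparameters are illustrative inventions, not taken from the video; the update rule itself is the standard one: V(s) ← V(s) + α·(r + γ·V(s') − V(s)).

```python
# Minimal sketch of tabular TD(0) value learning on a made-up chain A -> B -> C -> terminal.
# The agent learns a value estimate V(s) for each state by bootstrapping from V(s').

states = ["A", "B", "C", "terminal"]
V = {s: 0.0 for s in states}       # value estimates, initialized to zero
alpha, gamma = 0.1, 0.9            # learning rate, discount factor (illustrative choices)

def step(s):
    """Toy environment: deterministically move right; reward 1 on reaching terminal."""
    nxt = states[states.index(s) + 1]
    reward = 1.0 if nxt == "terminal" else 0.0
    return nxt, reward

for _ in range(200):               # episodes
    s = "A"
    while s != "terminal":
        s2, r = step(s)
        # TD error: how much better or worse the outcome was than predicted
        td_error = r + gamma * V[s2] - V[s]
        V[s] += alpha * td_error   # nudge the estimate toward the bootstrapped target
        s = s2

print({k: round(v, 2) for k, v in V.items() if k != "terminal"})
```

After training, V(C) approaches 1.0 (the reward one step away), V(B) approaches γ·V(C) = 0.9, and V(A) approaches 0.81, i.e. each state's value reflects the discounted reward to come.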

Thanks to Jane Street for sponsoring this video. They are hiring people interested in ML! To learn more about their work and open positions (and support me), visit their website: https://www.janestreet.com/machine-learning/?utm_source=yt&utm_medium=video&utm_campaign=AoP

Along the way, we encounter the challenge of transferring simulated skills to the real world (domain randomization) and witness the emergence of uncanny, human-like behavior in AI agents. It leaves us with a provocative question: where is the line between actions and words? What is the role of a GPT for actions?
With insights from:
Claude Shannon
Arthur Samuel
Gerald Tesauro
Richard Sutton
David Silver
DeepMind, OpenAI, and others

00:00 – Introduction
00:32 – Learning Tic Tac Toe
02:00 – Teaching cart and pole
04:20 – Shannon and chess
06:50 – Samuel's checkers
09:25 – TD-Gammon (Gerald Tesauro)
11:00 – TD Learning
14:30 – Learning Atari (DQN)
17:28 – Direct policy progress
19:40 – Domain randomization

If you found this video helpful, please share it with your friends and family.