
Content

A quick one: The last video about the AI Safety Gridworlds paper. How does an agent detect and adapt to friendly and adversarial intentions in the environment?

https://youtu.be/WM2THPzFSNk


Files

Friend or Foe? AI Safety Gridworlds extra bit

The last video about the AI Safety Gridworlds paper. How does an agent detect and adapt to friendly and adversarial intentions in the environment?

The previous video: https://youtu.be/CGTkoUidQ8I
The Computerphile video: https://www.youtube.com/watch?v=eElfR_BnL5k
The EXTRA BITS video, with more detail: https://www.youtube.com/watch?v=py5VRagG6t8
The paper: https://arxiv.org/pdf/1711.09883.pdf
The GitHub repos: https://github.com/deepmind/ai-safety-gridworlds

With thanks to my excellent Patreon supporters: https://www.patreon.com/robertskmiles

Jason Hise Steef Cooper Lawton Jason Strack Chad Jones Stefan Skiles Jordan Medina Manuel Weichselbaum Scott Worley JJ Hepboin Alex Flint Justin Courtright Pedro A Ortega James McCuen Richárd Nagyfi Ville Ahlgren Alec Johnson Clement Chiris Simon Strandgaard Joshua Richardson Jonatan R Michael Greve The Guru Of Vision Alexander Hartvig Nielsen Volodymyr David Tjäder Julius Brash Tom O'Connor Gunnar Guðvarðarson Shevis Johnson Erik de Bruijn Robin Green Alexei Vasilkov Maksym Taran Laura Olds Jon Halliday Robert Werner Paul Hobbs Jeroen De Dauw Enrico Ros Tim Neilson Eric Scammell christopher dasenbrock Igor Keller Morten Jelle Ben Glanton Robert Sokolowski Vlad D William Hendley DGJono robertvanduursen Scott Stevens Emilio Alvarez Dmitri Afanasjev Brian Sandberg Einar Ueland Marcel Ward Andrew Weir Taylor Smith Ben Archer Scott McCarthy Kabs Phil Tendayi Mawushe Anne Kohlbrenner Jake Fish Bjorn Nyblad Jussi Männistö Mr Fantastic Matanya Loewenthal Wr4thon Dave Tapley Archy de Berker Kevin Marc Pauly Joshua Pratt Andy Kobre Brian Gillespie Martin Wind Peggy Youell Poker Chen Kees Darko Sperac Paul Moffat Jelle Langen Lars Scholz Anders Öhrt Lupuleasa Ionuț Marco Tiraboschi Michael Kuhinica Fraser Cain Robin Scharf Oren Milman John Rees Shawn Hartsock Seth Brothwell

Comments

Jason Hise

I think there's an interesting sub-problem here. Can an AI learn to behave unpredictably? Randomness is notorious for being a thing deterministic machines have a bit of trouble with, and I wonder how well training (especially with a neural net) can work when the goal is not to match similar situations to similar answers, but instead to match particular situations to highly variable answers.

robertskmiles

It would be easy enough to just give the agent a side channel that has a random value, so it could learn to ignore that input in the green room and use it as an entropy source in the red room. I'm not actually sure how they handled that here though, maybe I'll ask Pedro :)
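
To make the side-channel idea concrete, here is a minimal Python sketch. It is only an illustration of the suggestion above, not how the paper's agents were built (that is left open in the comment). Everything in it is a hypothetical stand-in: the names `with_noise` and `policy`, the two-option choice, and the "green"/"red" room labels used as observations. The policy is hand-written rather than learned; the point is only that a fully deterministic function can act unpredictably once one of its inputs is a random value.

```python
import random


def with_noise(observation, rng):
    """Append one uniform random value in [0, 1) to the observation (the side channel)."""
    return tuple(observation) + (rng.random(),)


def policy(augmented_obs, n_choices=2):
    """A deterministic function of its input (hypothetical, hand-written, not learned).

    In the green room it ignores the noise input entirely.
    In any other room it uses the noise input to pick an option uniformly.
    """
    *obs, noise = augmented_obs
    room = obs[0]
    if room == "green":
        return 0  # noise is ignored: the same option every time
    # Map noise in [0, 1) onto an option index; min() guards the upper edge.
    return min(int(noise * n_choices), n_choices - 1)


rng = random.Random(0)  # seeded so the sketch is reproducible
green_choices = [policy(with_noise(("green",), rng)) for _ in range(1000)]
red_choices = [policy(with_noise(("red",), rng)) for _ in range(1000)]

print("green room choices:", sorted(set(green_choices)))
print("red room choices:", sorted(set(red_choices)))
```

A trained network would have to discover this split on its own: learn that the extra input carries no useful signal in the green room, and learn to let it drive the choice in the red room.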