AI safety is in the spotlight - what is it all about, and should you worry?

When I first started working on AI policy at Google, I was surprised by how tetchy the researchers I was working with got about the term ‘AI safety’. To me it was a commonsense phrase that captured the essence of the goal — to make sure AI systems were not harmful to those who used them.

The problem, it turned out, was that in research circles ‘AI safety’ had come to be shorthand for talking about longer-term sci-fi-esque fears over whether humanity could retain control of superintelligent AI and whether misaligned systems might turn us all into paperclips. Given the limitations of AI’s performance at the time (6+ years ago), superintelligent AI seemed a very distant prospect — and thus discussion of AI-related existential risk was seen by many as an unhelpful distraction from tackling critical problems of the here-and-now, namely improving AI fairness, accountability and transparency (FAT-ML).

But then in the past year, ChatGPT blew everyone’s expectations out of the water. As Bill Gates put it:

“I knew I had just seen the most important advance in technology since the graphical user interface”. 

OpenAI spearheaded a massive industry wave of Generative AI development, with ever-more impressive performance, and suddenly super-smart AI — and the safety risks it may pose — didn’t seem so far-fetched. Cue various open letters signed by the AI literati — “Pause giant AI experiments”, “Mitigating the risk of extinction from AI should be a global priority” — and even a TIME article suggesting that we should “shut it all down”.

It’s safe to say that governments got the message! On October 30th, President Biden issued an Executive Order on Safe, Secure, and Trustworthy AI, which in its breadth and depth leapfrogs the US from laggard to a seat at the top table of AI-related regulation, including around safety. Not coincidentally, the day after, the UK staged its AI Safety Summit, pulling off the diplomatic coup of getting 28 countries including the US, Europe AND China to sign the Bletchley Declaration, and kickstarting what seems to be genuine momentum towards developing international governance mechanisms to address AI safety concerns.

Given this recent flurry of AI safety-related activity, I thought it would be helpful to provide some context for anyone new to the topic. For ease of formatting, it’s written on my Substack blog. Follow the links to jump ahead, or start reading here.

  1. Three things you should know about the state of AI today

    It’s not unreasonable to assume rapid, unforeseen leaps in performance

    The pace of AI advance appears unstoppable

    Modern-day AI is so complex that no one fully understands how it works

  2. Reasons why some people fear that AI could be catastrophic for human society

  3. Round-up of what’s being done to address AI safety risks - by industry as well as by governments

  4. My personal perspective on AI safety (spoiler: I’m much more worried than I was two years ago, but I’m still cautiously optimistic)
