AI Safety
In the wake of the AI Summit in Seoul, Katherine and Anna discuss AI safety—how to ensure that AI systems don’t pose unacceptable risks, including catastrophic risks—in this week’s episode of “Waking Up With AI.”
Katherine Forrest: All right. Good morning, everyone, and welcome to another episode of “Waking Up With AI,” a Paul, Weiss podcast. I'm Katherine Forrest.
Anna Gressel: And I'm Anna Gressel.
Katherine Forrest: And Anna, before we get started today, I wanted to confess to the audience that this is our third shot at recording this podcast, and that's literally because the dog ate my microphone. I mean, yes, he took a big chunk out of the microphone.
Anna Gressel: You do sound great now, Katherine, for what it's worth.
Katherine Forrest: All right. Well, we had to wait two days until I could get access to a real mic.
Anna Gressel: It's like your version of the dog ate your homework.
Katherine Forrest: Yeah, but it's grown-up style.
Anna Gressel: Well, on the topic of keeping all of our important technologies safe, from microphones to highly capable AI models, let's turn to the topic for today's podcast, AI safety.
Katherine Forrest: Yeah, well AI safety, Anna, is an incredibly important topic and it's going to be one that we're going to come back to again and again. So, let's just go ahead and dive in by starting with a definition of what we mean by AI safety.
Anna Gressel: Totally. AI safety is a phrase that's used to cover a really broad array of topics, all having to do with the ways in which the use of AI can result in a lack of safety, and the ways in which developers, governments and interested parties have been trying to understand and address these issues.
Katherine Forrest: So, what we're saying is that the phrase “AI safety” really means the ways that the AI can be unsafe.
Anna Gressel: I think that's right, or I would say it's the concept of how to make AI safe in the face of what we're learning about the dangers that AI can pose.
Katherine Forrest: So, let's talk a little bit about those and then give an overview as to what the government and developers are doing in this regard.
Anna Gressel: For sure. So, AI safety acknowledges that AI can present powerful dangers in a number of forms. So, for example, highly capable models, the frontier models we were talking about before, can allow users to develop chemical, biological, radiological, nuclear and explosive weapons.
Katherine Forrest: And those risks go by the acronym CBRNE. And there's the related risk that, because these models are so highly capable, a person can interact with them using natural language prompts and, without even a lot of technical know-how, can actually get instructions on how to build weapons, how to take down a power grid or how to cause other kinds of, you know, biological mishaps. I mean, there's a lot that can be done just using natural language.
Anna Gressel: Yeah, and these are not just risks that people are worried about with, you know, individuals like the kid in your neighbor's basement. I mean, these are really concerns that people are worried about from the perspective of rogue state and non-state actors.
Katherine Forrest: Right, and another part of AI safety has to do with the spread of misinformation: misinformation about political candidates or events that could really endanger the democratic process, or the use of a sophisticated LLM to create realistic-sounding accounts of political events that could even precipitate military action.
Anna Gressel: I mean, and from the misinformation perspective, one type of misinformation we've seen a lot of already is the spread of false statements about health-related issues such as vaccine efficacy or how diseases spread.
Katherine Forrest: Right, and apart from this, the AI safety area now also covers concerns that the increasing capabilities of AI could pose other significant dangers, really existential risks.
Anna Gressel: Yeah, and that's the issue, Katherine. I think we've seen a lot of individuals like Stephen Hawking, Geoffrey Hinton, Stuart Russell, Elon Musk, Bill Gates and others, people involved in AI from a scientific or business perspective, who have recently been discussing the possibility that AI can get so smart it no longer obeys the instructions of humans.
Katherine Forrest: Becoming what's called misaligned.
Anna Gressel: Yeah, like misaligned with human values.
Katherine Forrest: And this misalignment can result in AI, again, causing a potentially catastrophic event for humanity. You know, AI developers have a few different ways of aligning their systems today. One includes something called reinforcement learning from human feedback, which goes by the acronym RLHF, where models can actually learn from humans responding to their outputs.
Anna Gressel: I mean, misalignment has become a really important part of the conversation today because folks are interested in solving it so that AI can't wipe out humans through an explosive event, for example. And some people in the AI space, and this is very debated, think that AI safety should really be our number one priority.
Katherine Forrest: Well, I'm actually one of those people and I think that either human use of highly capable AI or AI itself gone rogue can pose really unacceptable and existential risks.
Anna Gressel: Yeah, and I think it's important to recognize that these are not completely hypothetical concerns. I mean, we're really beginning to see concrete examples of this happening. So recently, the Tokyo police arrested a man with absolutely no IT experience for having used generative AI to write a ransomware program that was designed to encrypt data on targeted systems and demand crypto as ransom. And that guy actually told investigators he “wanted to make money” and thought he could do anything if he asked AI, and he actually was able to create this program.
Katherine Forrest: Yeah, and for folks who are interested in getting up to speed quickly in the AI safety area, we'd recommend taking a look at the recent report that was released in connection with an international summit on AI safety that was called the Seoul Summit. And the report's called “International Scientific Report on the Safety of Advanced AI.”
Anna Gressel: I mean, there are other interesting papers out there too, including one from May 17th called “Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems.”
Katherine Forrest: Right. And one practical pointer for our audience is to remember that the EU AI Act and the U.S. Executive Order on AI both speak to AI safety and to the obligations that companies with highly capable models have to really directly confront those AI safety risks.
And I think that's all we've got time for today.
Anna Gressel: Yep, you bet. And with that, I'm Anna Gressel.
Katherine Forrest: And I'm Katherine Forrest. Don't forget to like our podcast and we'll see you again next week.