Nearly half of OpenAI’s AGI safety staff have left the company, says former researcher

Nearly half of the OpenAI staffers who once focused on the long-term risks of superpowerful AI have left the company in recent months, says Daniel Kokotajlo, a former OpenAI governance researcher.

OpenAI, the maker of the AI assistant ChatGPT, is widely regarded as one of the few companies at the forefront of AI development. According to its founding charter, the company’s mission is to develop a technology called artificial general intelligence (AGI) in a way that “benefits all of humanity.” OpenAI defines AGI as highly autonomous systems that outperform humans at most economically valuable work.

Because such systems could pose significant risks, including, according to some AI researchers, the possibility that they slip beyond human control and threaten humanity’s very existence, OpenAI has since its inception employed a large number of researchers focused on what it calls “AGI safety”: techniques designed to ensure that a future AGI system does not pose a catastrophic or even existential threat.

According to Kokotajlo, that group has been decimated by recent resignations. The departures include Jan Hendrik Kirchner, Collin Burns, Jeffrey Wu, Jonathan Uesato, Steven Bills, Yuri Burda, Todor Markov, and co-founder John Schulman. They followed the high-profile resignations in May of chief scientist Ilya Sutskever and another researcher, Jan Leike, who co-led the company’s so-called superalignment team. When Leike announced his resignation on the social media platform X, he said that safety was increasingly “taking a back seat to shiny products” at the San Francisco-based AI company. (The superalignment team was supposed to be working on ways to control “artificial superintelligence,” an even more speculative technology than AGI that would involve autonomous systems more powerful than the collective intelligence of all humans combined.)

Kokotajlo, who joined OpenAI in 2022 and left in April 2024, told Fortune in an exclusive interview that there had been a slow but steady exodus in 2024. Of the approximately 30 employees who had worked on issues related to AGI safety, only around 16 remain.

“It wasn’t a coordinated thing. I think it’s just people giving up one by one,” Kokotajlo said, as OpenAI shifts its focus toward products and commercial offerings and places less emphasis on research into how to ensure the safe development of AGI. In recent months, OpenAI hired Sarah Friar as CFO and Kevin Weil as chief product officer, and last week the company brought in former Meta executive Irina Kofman to lead strategic initiatives.

The departures matter because they may indicate how cautious OpenAI is being about the potential risks of the technology it is developing, and whether profit motives are leading the company to take actions that could prove dangerous. Kokotajlo has previously described the race among Big Tech companies to develop AGI as “reckless.”

An OpenAI spokesperson said the company was “proud of our track record of delivering the most powerful and safe AI systems and believes in our scientific approach to managing risk.” The spokesperson also said the company agreed that a “rigorous debate” about potential AI risks was “critical” and that it would “continue to engage with governments, civil society and other communities around the world.”

Although researchers in the field of AGI safety are primarily concerned with controlling future AGI systems, some of the most effective methods for steering the large language models that underpin today’s AI software (ensuring, for example, that they do not use racist or harmful language, write malware, or give users instructions for creating bioweapons) have come from researchers working on AGI safety.

While Kokotajlo could not comment on the reasons for all the resignations, he suggested they were consistent with his belief that OpenAI was “pretty close” to developing AGI, but not yet ready to “handle everything that comes with it.” This has led to what he described as a “chilling effect” within the company on those trying to publish research on the risks of AGI, and “increasing influence from OpenAI’s communications and lobbying departments” on what can be published.

OpenAI did not respond to a request for comment.

Kokotajlo said his own concerns about the changing culture at OpenAI began even before the board drama of November 2023, when CEO Sam Altman was fired and then quickly rehired. In the aftermath, the three OpenAI board members who had focused on AGI safety were removed from the board. “That sort of sealed the deal. There was no turning back after that,” he said, adding that while he didn’t have access to what was going on behind the scenes, it felt like Altman and President Greg Brockman (who recently took an extended sabbatical) had “consolidated their power” since then.

“People who are primarily concerned with AGI safety and preparedness are increasingly being marginalized,” he said.

However, many leading minds in AI research, including former Google Brain co-founder Andrew Ng, Stanford University professor Fei-Fei Li, and Meta chief scientist Yann LeCun, believe the AI safety community’s focus on the alleged threat AI poses to humanity is overblown. AGI is still decades away, they argue, and AI can help solve the real existential risks facing humanity, including climate change and future pandemics. They also claim that the outsized focus on AGI risk, often promoted by researchers and organizations funded by groups tied to the controversial effective altruism movement, which concentrates heavily on AI’s “existential risk” to humanity, will lead to laws that stifle innovation and punish model developers rather than focusing on how AI models are applied.

These critics cite California’s SB 1047 as an example. The bill, which aims to set limits on the development and use of the most powerful AI models, was supported by groups with ties to effective altruism and has sparked heated debate ahead of a final vote in the California State Assembly expected this week.

Kokotajlo said he was disappointed but not surprised that OpenAI opposed SB 1047. He and former colleague William Saunders wrote a letter last week to the bill’s sponsor, State Senator Scott Wiener, saying OpenAI’s complaints about the bill “do not appear to be in good faith.”

“In a way, this is a betrayal of the plan we had from 2022,” Kokotajlo told Fortune. He pointed to efforts by OpenAI and the wider AI industry to assess the long-term risks of AGI and to secure voluntary commitments on how to respond when dangerous thresholds are crossed, results that were then supposed to serve as “inspiration and template for legislation and regulation.”

Still, Kokotajlo doesn’t regret joining OpenAI in the first place. “I think I learned a lot of useful things there. I feel like I probably made a positive difference,” he said, although he regrets not leaving sooner. “I had started thinking about it before the board crisis, and in hindsight, I just should have done it.”

He has friends who continue to work on AGI safety at OpenAI, he added. Although OpenAI’s superalignment team was disbanded after Leike’s departure, some of those who remained have moved to other teams where they are allowed to work on similar projects, and there are certainly other people at the company focused on the safety of OpenAI’s current AI models. Following the dissolution of the superalignment team, OpenAI announced in May a new safety and security committee “responsible for making recommendations on critical safety and security decisions for all OpenAI projects.” This month, OpenAI appointed Carnegie Mellon University professor Zico Kolter, whose work focuses on AI safety, to its board.

But Kokotajlo warned those who remain at the company about groupthink amid the race among the biggest AI companies to develop AGI first. “I think part of what’s going on is that, of course, what is considered the common-sense view in the company is shaped by what the majority thinks and also by the incentives that people face,” he said. “So it’s not surprising that companies end up concluding that winning the race to AGI is good for humanity – that’s the conclusion they’re incentivized to reach.”

Update 27 August: This story has been updated with statements from an OpenAI spokesperson.

This story originally appeared on Fortune.com
