Todd Underwood has joined generative AI company Anthropic as the head of reliability.

There, the former Google and OpenAI employee will build a new team called AI Reliability Engineering (AIRE).

"I know that collectively we have a wide range of opinions about AI," Underwood said on LinkedIn.

"Some of us are profoundly skeptical, annoyed, or actively concerned about the effects of these technologies on our society. But there are also those who cannot get more AI fast enough, and who cannot wait to see the improvements that are coming. We all have to take AI technologies seriously while still acknowledging the limitations and problems.

"Everyone that I’ve talked to at Anthropic also seems to be grappling with these questions in a way that impressed me with honesty and seriousness. They are an impressive bunch and I'm honored to get the chance to work with them."

Underwood spent nearly 15 years at Google, where he created the Machine Learning Site Reliability Engineering (ML SRE) organization. He also co-authored the O'Reilly book Reliable Machine Learning.

SREs are tasked with building and maintaining highly reliable and scalable software systems.

He left for OpenAI last year, joining amid the chaos of CEO Sam Altman's firing, and was among those who signed a letter threatening to quit and join Microsoft if Altman wasn't rehired. Altman was back at OpenAI after five days.

At OpenAI he was tasked with setting up an SRE team focused on research and training workloads. The company already had an SRE team for the applied side working on inference and API products.

However, this September, OpenAI's head of research platform, Tal Broda, scrapped the new SRE team and laid Underwood off.

"Stay tuned for updates on reliability engineering efforts once I better understand the excellent work that is already in flight and look for ways to extend it," Underwood said of his new role.

"And, if you have reliability problems (availability, quality, latency) with Claude web, mobile app, desktop app, or API, I’ll be keen to learn more."

This week, Anthropic's CEO said that AI training data center clusters will cost $10 billion in 2026, and $100 billion from 2027.

Anthropic has raised $2.3 billion from Google and $4 billion from Amazon Web Services, and primarily uses the latter company's cloud.