A Yandex data center experienced a major outage on March 30, impacting the operation of Yandex Cloud and other services.

The Russian company explained the cause of the outage in an April 7 post, noting that it is the first of its kind experienced at the data center in its 15-year history.

The location of the impacted data center has not been shared, but the affected cloud region is located around Moscow.

Yandex - which has been referred to colloquially as "Russia's Google" - said that the outage was caused by an incident with the support substation.

According to the company, the data center in question is connected to the national power grid through the nearest 220kV support substation by two independent 110kV lines. Only one power line needs to work for the data center to continue running as normal.

Yandex described the outage as "an unprecedented case," noting that nothing like it has happened in the data center's 15-year history, nor has an issue of this scale occurred at the substation since its inception in 1960.

The substation was impacted for just over three hours - from 12:25 to 15:30 local time - during which the data center had to run on diesel generators. While critical DC infrastructure elements, including the network control centers and security services, were able to keep running, other services, such as Yandex Cloud, were impacted.

Specifically, as the data center housed the ru-central1-b availability zone of the Yandex Cloud platform - one of three - Yandex Cloud client applications deployed only in ru-central1-b remained unavailable during the entire recovery period. Some applications deployed in multiple availability zones may also have been impacted.

Yandex distributed the load between its other data centers.

According to a detailed analysis of the outage posted to Yandex's blog on Habr, the data center in question was developed in the 2010s on a site "that previously belonged to a plant and already had a favorable position: it is located as close as possible to a generator and a reliable energy supplier."

The blog notes that once the issue at the substation was discovered, a second problem was encountered. "With failures of this scale, which affect several regional subsystems at once, you can’t just turn everything back on; you need to make sure that this does not worsen the situation."

Power resumed at 15:30. Yandex's main services were restored by 22:22, and by midnight, the zone was fully functional.
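For a sense of scale, the timestamps reported above can be turned into durations. The short Python sketch below is illustrative only: it uses just the times quoted in this article, treats "by midnight" as an upper bound of 24:00, and the helper names are hypothetical rather than anything published by Yandex.

```python
# Illustrative arithmetic on the timeline reported in the article.
# Helper names are hypothetical; times are local time on the day of the outage.

def to_minutes(hhmm: str) -> int:
    """Convert an 'HH:MM' string to minutes since the start of the day."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def fmt(total_minutes: int) -> str:
    """Format a duration in minutes as 'Hh MMm'."""
    return f"{total_minutes // 60}h {total_minutes % 60:02d}m"

power_lost = to_minutes("12:25")         # substation incident begins
power_restored = to_minutes("15:30")     # grid power resumes
main_services_up = to_minutes("22:22")   # main services restored
zone_fully_up = to_minutes("24:00")      # assumption: "by midnight" as an upper bound

durations = {
    "without grid power": power_restored - power_lost,
    "grid power back to main services restored": main_services_up - power_restored,
    "start of incident to main services restored": main_services_up - power_lost,
    "start of incident to full recovery (at most)": zone_fully_up - power_lost,
}

for label, minutes in durations.items():
    print(f"{label}: {fmt(minutes)}")
```

On these figures, most of the downtime came after grid power returned, which fits the blog's point that systems could not simply be switched back on all at once.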

The company said that it will include double failures in its regular "exercises," in which the operations team practices actions in emergencies, and it will also make improvements to its data center management systems to speed up cold starts.

Yandex operates five data centers in Russia, located in Vladimir, Sasovo, Ivanteevka, Mytishchi, and the newest - a 63MW data center in Kaluga Oblast around 200 miles south of Moscow.

Yandex's holding company was previously based in the Netherlands, but separated its European and Russian operations, fully divesting Yandex from the company in February 2024 as part of a strategic move following the outbreak of the Ukraine war. The European operations are now known as Nebius.
