Farid Dana joined AMD at the end of October 2008. The man responsible for the chip company’s infrastructure came from a background in data center networking.
His brief was to look at the infrastructure to “see where we could gain efficiencies”.
AMD’s global fleet had grown to 18 data centers – mostly through acquisition. It was grossly inefficient on several counts. Areas that needed addressing included the cost of power and the efficiency of the collective services delivered from each data center. “It was pretty chopped up,” Dana says. “My review took the 18 data centers through a journey to gain the required efficiency.”
AMD examined everything from software licensing to the cost of electricity in the data center, all the while asking: “What can we do to make the private cloud more efficient?”
Dana says the company had been trying this approach for four or five years but could not even get the ball rolling. It had nine data centers in North America, with more across Asia – mostly in China and including two in India.
Most were on lease and from a facility point of view were maxed out in terms of power.
Among the questions to address were: When would the lease run out? Where were the engineers in relation to the location of the data center? How efficiently were the data centers being run? Dana found that some were vulnerable to outages because they lacked the required power redundancy.
Another factor for consideration was which data centers contained equipment that was reaching its end of life.
“When it came to placing new equipment we asked if it made sense to put it in an existing data center. Or did it get moved to a new data center and have the applications moved to that facility?” Dana says. “This required a phased approach to the migration.
“We proposed a journey which received board approval to build two data centers – one in Kuala Lumpur, Malaysia, which was online in 2010 and one in Georgia, US, which came on board in summer 2012. The target is to have three sites. We have moved from 18 and have six more to go. We expect to be completed by 2014.”
Dana says the main consideration for AMD was the impact on engineers. AMD is a design and manufacturing company, and engineers traditionally like to have IT close to them. This meant that bringing the engineers on the journey was key to AMD’s success. (“Corporate applications you can run from anywhere,” he says.)
Dana says engineers want to work in grid-type environments. “You have to be careful when attempting to overlay design automation. It cannot be done over the grid unless it is done carefully,” he says. He believes corporate applications can be run from anywhere – including the Cloud – but engineers like to have their applications close by. Performance is key to engineering and “you can’t just add latency”.
So getting engineers to operate across even a private cloud meant addressing their concerns. As an IT designer and manufacturer, AMD knows about the energy efficiency to be extracted from the latest multi-core processors and has great insight into optimized deployments.
The combination of the new sites, the latest IT and new power and cooling equipment means running fewer facilities and running them at least 40% more efficiently. The facility at Suwanee, Georgia, will have a 10MW capacity once complete, but in raw processing power it is far beyond anything that currently exists.
“In nine data centers in the US we were running 8.3MW and nine in Asia were running close to 3.5MW. In Asia, after the consolidation, we’ll end up at 2MW but can expand to beyond 5MW,” Dana says. “We will have to go to 2MW over the next two years and ultimately, if needed, go to 10MW over 10 years. The difference is that we’re not powering up or having UPS (Uninterruptible Power Supplies) and generators on board today. That allows us to pursue new methods in efficiency. So as new technologies come along, we have the facility and we have the capability to adopt them but we don’t have to over-provision to meet existing needs.”
At a rack level, the sites are designed to handle up to 32kW per rack in a dense environment – averages are closer to 14kW to 18kW. “With the new cores we can extend the life of the server and get more horsepower for less energy used,” Dana says. “We can replace the chips and the DIMM and this gives us a green, more efficient environment.”
AMD deployed HP ProLiant BL465c G7 servers with AMD Opteron 6200 Series processors to help improve the compute power of the engineering cloud, which performs up to 40 million engineering simulations per month.
On the networking side AMD will run the HP 12500 Switch Series and HP 5820 Switch Series. “We were experiencing sub-optimal business operations due to infrastructure sprawl,” Dana says. “One of the reasons was the top-of-rack form which makes it easy to manage. HP uses pretty much the same OS in the access, distribution and core layers, which gives us efficiency. We stayed with layer three MPLS for the carriers and have gained significant cost savings.”
FROM NON-BELIEVERS TO CRUSADERS
AMD’s goal is to have three data centers by 2014. Their life expectancy is 10 years and, depending on equipment refresh, this could be extended by five more years.
For Dana, however, the project is a lasting legacy for AMD. “I love data center consolidation projects. I love seeing nonbelievers become believers,” he says.
“The biggest challenge in consolidation is changing the attitudes of people who are used to having equipment close by. They didn’t believe it could be done over distance. Now we’re doing it across the Pacific.”
Dana says all it takes is a little belief. “It is amazing to see how engineering views it (consolidation) differently when they see it can be done,” he says. “The challenge was the engineers and the belief system. We engaged early with the engineers. They were part of the team. They helped us do it. We’re a chip maker and design manufacturer – we had to have the help of the engineers to make it work.
“Today they are happier in terms of not having to run over fragmented clouds and data centers. Now they have a giant private cloud which has 200,000 cores and the visibility and understanding of how it works.”