Thursday, 13 June 2013
Facebook Data Centre "Rainstorm"
Back in 2011, Facebook experienced a rather unexpected phenomenon inside its Prineville, Oregon data centre. Apparently, the right set of conditions converged inside the building to create a rain cloud that soaked the company's servers. Workers on site could actually hear power supplies blowing, one pop after another.
The problem began when the building's cooling system malfunctioned. At that facility, Facebook uses only outdoor air to cool the data centre, primarily because it requires significantly less power than traditional cooling methods. This type of cooling is fine as long as air temperatures and humidity levels are kept in check. However, the malfunctioning system failed to do so.
The system allowed air temperatures to exceed 26°C and the humidity to go above 95%. If you are guessing condensation was the result, you're absolutely correct. The condensation was great enough that it actually formed a visible rain cloud inside the data centre. When the building's cooling system eventually failed entirely and began circulating low-humidity, high-temperature air, that cloud turned to liquid.
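To see why those two figures spell trouble, it helps to estimate the dew point, the temperature at which moisture starts condensing out of the air. The snippet below is only a rough sketch using the Magnus approximation; the constants are one commonly quoted set, and the function name is ours, not anything Facebook has published.

```python
import math

# Magnus approximation constants (one commonly quoted set; an assumption here)
MAGNUS_B = 17.62
MAGNUS_C = 243.12  # degrees Celsius


def dew_point_c(air_temp_c: float, relative_humidity_pct: float) -> float:
    """Estimate the dew point in degrees Celsius.

    Illustrative helper: takes the air temperature in Celsius and the
    relative humidity as a percentage between 0 (exclusive) and 100.
    """
    if not 0 < relative_humidity_pct <= 100:
        raise ValueError("relative humidity must be in the range (0, 100]")
    gamma = math.log(relative_humidity_pct / 100.0) + (
        MAGNUS_B * air_temp_c / (MAGNUS_C + air_temp_c)
    )
    return MAGNUS_C * gamma / (MAGNUS_B - gamma)


if __name__ == "__main__":
    # The figures quoted for the Prineville incident
    dew = dew_point_c(26.0, 95.0)
    print(f"Estimated dew point: {dew:.1f} degrees C")
```

With the incident's figures, the estimate lands only about a degree below the air temperature itself, so any surface even slightly cooler than the surrounding air would start collecting water.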
Facebook VP of infrastructure and engineering Jay Parikh acknowledged the problem at the time, saying, "this is one of those things." He went on to explain that when a data centre uses 100% external air for cooling purposes, the risk of such problems is always present. Unfortunately, Facebook did not anticipate issues serious enough to warrant protecting its servers. That has since changed.
The company is still using the outdoor air cooling system, but now all of its servers and power supplies are sealed in a layer of protective rubber. Should it ever rain at Facebook again, no damage will be done.
We are sure the Facebook episode put a smile on the faces of IT professionals everywhere. It just goes to show what a day in the life of a data centre manager can truly be like. No matter how advanced a power and cooling system is, no matter how robust the architecture, freak events do happen - serious events that can completely disable a server.
Although such events are rare in relation to total uptime, the possible risks are always in the minds of managers at hosting companies, colocation facilities, and data centres. As we reported earlier regarding Sears Holdings Corp., what appears to be a minor equipment breakdown can end up as a full-blown disaster capable of costing a company millions of pounds.
The other important aspect of this story is its demonstration of just how difficult cooling a data centre can be. When you are talking about hundreds of servers in an individual location, you're also talking about tremendous amounts of heat being generated around the clock. Without a good cooling system, it simply wouldn't be possible for us to maintain the Internet as it is.
It is fortunate that Facebook's rain cloud incident isn't something that's happening to the world's biggest servers every day. Otherwise, we'd all be very wet!