Twitter blames outage on dual data centre crashes
Parallel, redundant servers failed at the same time, company claims
Tech4Biz | 27 Jul 2012
A Twitter outage on Thursday that lasted up to two hours for some users was caused by separate data centres failing at almost the same time, the company said in an apologetic blog post.
Twitter went down between about 08:20 and 09:00 US Pacific Time on 26 July and was back in action by about 10:25, wrote Mazen Rawashdeh, vice president of engineering.
The outage affected many users in both Ireland and the UK. Though some users suspected an overload of Tweets related to the Olympic Games, which open on Friday in London, that was not the cause of the outage. Instead, two data centres that operate in parallel for redundancy both failed, in what Rawashdeh called an "infrastructural double whammy."
"What was noteworthy about today's outage was the coincidental failure of two parallel systems at nearly the same time," Rawashdeh wrote. "We are investing aggressively in our systems to avoid this situation in the future."
It was Twitter's second outage in about six weeks. On 21 June, the microblogging service went down at about 09:00 Pacific Time and started to come back just after 10:00, only to fail again before full recovery began after 11:00. The company blamed that outage on a cascading bug, a type of problem that spreads from one software element to others.
IDG News Service