Sunday, April 22, 2007

Blackberry blackout

Last week's Blackberry blackout reinforces two important lessons for IT operations.

The first, of course, is never to deploy without adequate testing. RIM went on record to state that the outage on Tuesday/Wednesday, which affected millions of users, was caused by an untested (or at least 'not tested sufficiently') system update. Talk about the explanation being worse than the crime ...
http://www.topix.net/business/telecom/2007/04/system-update-led-to-blackberry-outage and
http://news.com.com/2100-1039_3-6177829.html?part=rss&tag=2547-1_3-0-20&subj=news

Anyone who feels that testing is a waste of time and resources (imagine setting up a separate test environment ... and what about all those deadlines) should take a look at what RIM customers are thinking right now: Blackberry Dependency Re-evaluated After RIM Service Outage ...

The second lesson comes from the following: "Blackberry said in a statement that the failure was triggered by 'the introduction of a new, non-critical system routine' designed to increase the system's e-mail holding space." -- From a BBC News report

Consider the 'non-critical'. If something that affects millions of users is 'non-critical', then there is something seriously wrong with how criticality is being classified.
How is a technology asset classified? Is it based on the value of the hardware or the number of lines of code in the software patch? Or is it based on the potential impact to customers, or the loss of business value that a failure could cause?
Was there a business owner for this asset who signed off on or approved the change/update? Was he informed of the "non-critical" classification with reliable data? Was he involved at all in the classification process?

So many questions. RIM is a great company and I am sure they have already thought about these questions and many more. They will only grow stronger and come back with better processes to avoid such incidents in the future.

The point is that if such a great company can make a mistake with such a high impact, the rest of us need to be doubly careful ... So, are your technology assets classified adequately?

Have a good time ...
