Outage Outrage...

Now that the bits are flowing in the right direction again at Spark, it's probably a good time to make some observations.

The first one is that this is the second major Spark/Telecom 'outage' media storm that I have covered from the TUANZ perspective (the first was the XT meltdown) and there have been a number of similarities and some major differences. 

I think Spark have trouble knowing when an issue has arisen, and I think this is down to a couple of things: the first is the sheer size of the beast and the second is the distance between outsourced, call-centre-based customer service and senior management.

I know it's unfair, but I have an image in my mind of a lumbering 'brontosaurus' with its tail on fire, and it taking a while for the distant head to identify the smell, then look back to see the flames, and then finally alter course to find some water to put it out.

Once they knew they had a problem, Spark were faced with three challenges: solving the problem, dealing with customers and dealing with the media.

The first problem has been aptly described as 'whack a mole' at the same time as trying to find a needle in a haystack. I have the utmost respect for the systems administrators and cyber-security team who did the actual digital firefighting. I hope they're enjoying some well-deserved rest and are in line for performance bonuses.

The next challenge is really hard these days: customers now rely on the internet being always on and complain loudly when it isn't. If this had happened ten years ago it wouldn't have been as big a deal, but now it ranks just below a power cut in terms of disruption and inconvenience.

What's even more difficult is that it becomes hard to keep your customers up to date when the phones are running red hot and they can't get on-line to find out why they're not on-line. Everyone assumes that the problem is isolated in the first instance and just relates to them.

The only disinfectant for confusion and hearsay is the truth, or at least as much truth as you know, and to Spark's credit they had their comms team on the case on Saturday. I know this because I got caught up in phase three, when the media were looking for explanations as to what was going on.

It was at about this time that the 'fog of war' descended and people were coming up with explanations about 'malware' and 'cyber crime', and asking 'why just Spark?'. The media were looking for newsworthy angles and good stories, and by Monday it had turned juicy: the malware had apparently been introduced by customers downloading a 'viewer' allowing them to see the infamous 'hacked' Jennifer Lawrence intimate 'selfies'.

The story had it all now but still didn't feel right, so it was with relief that I read this piece on Stuff this morning. 

Mundane as it seems, that does make a lot more sense. There is still a 'cyber warfare' angle, but it seems Spark were mere pawns in a much bigger game.

So what lessons have we learnt? 

Spark need to run a 24/7 NOC that is proactive in getting alerts out to customers

Spark need a non-internet based method of alerting customers to outages

Spark need to get the media on side fast

Spark have done well sharing details once they are known

Spark need to pay more attention to customers' 'edge' devices and possibly manage them remotely

I hope this helps.