EBay (EBAY) this morning clarified what triggered the software glitch that crashed its Skype Internet phone service for two days last week: A Microsoft (MSFT) Windows security download that required users to restart their computers.
On the surface, it seems strange that such a routine event – restarting a computer – could disrupt a global voice communications network like Skype. But two of the features that make Skype so convenient also proved to be its Achilles heel in this instance:
(1) When a Skype user’s computer boots up, it typically tries to log into the Skype service immediately. When the Windows update prompted millions of computers to restart at roughly the same time, the Skype system was overwhelmed.
(2) Because Skype is based on peer-to-peer technology, Skype users rely somewhat on each other’s computers to make the service work. Each user computer that’s part of the Skype network can play a role in routing voice traffic and completing other tasks. How? Picture a giant safety net with lots of tiny mesh connections. Normally, the net can still work even if a few of those connections are torn. But last week so many of those connections tore that the net itself broke apart.
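The safety-net picture in point (2) can be made concrete with a toy simulation. The sketch below is only an illustration under stated assumptions: it arranges nodes in a square grid where each node links to its four neighbors, knocks a random share of them offline, and measures how much of the surviving network still hangs together in one piece. The grid layout and the name `largest_fragment_share` are hypothetical and say nothing about Skype's actual topology or software.

```python
import random
from collections import deque


def largest_fragment_share(side, offline_fraction, seed=0):
    """Share of surviving nodes that sit in the largest connected fragment.

    Nodes form a side x side grid; each node links to its up/down/left/right
    neighbors. This layout is an illustrative assumption, not Skype's topology.
    """
    rng = random.Random(seed)
    nodes = [(r, c) for r in range(side) for c in range(side)]
    offline_count = int(len(nodes) * offline_fraction)
    offline = set(rng.sample(nodes, offline_count))
    alive = set(nodes) - offline
    if not alive:
        return 0.0

    seen = set()
    largest = 0
    for start in alive:
        if start in seen:
            continue
        # Breadth-first search over one connected fragment of the mesh.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            r, c = queue.popleft()
            size += 1
            for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nxt in alive and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        largest = max(largest, size)
    return largest / len(alive)


if __name__ == "__main__":
    for fraction in (0.05, 0.30, 0.60):
        share = largest_fragment_share(30, fraction)
        print(f"{fraction:.0%} offline -> largest fragment holds {share:.0%} of survivors")
```

In this toy model, losing a few nodes leaves almost every survivor in one connected piece, while losing most of them shatters the grid into small islands that cannot reach one another. That is the "net broke apart" effect in miniature.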
Worst-case scenario
On normal days, Skype’s peer-to-peer voice-over-IP structure gives it advantages over the centralized structure used by traditional telecom providers like AT&T (T), Verizon (VZ) and Sprint (S). In Skype’s own words:
Decentralized P2P networks have several advantages over traditional client-server networks. These networks scale indefinitely without increasing search time and without the need for costly centralized resources. They utilize the processing and networking power of the end-users' machines since these resources always grow in direct proportion to the network itself. Each new node added to the network adds potential processing power and bandwidth to the network. Thus, by decentralizing resources, second generation (2G) P2P networks have been able to virtually eliminate costs associated with a large centralized infrastructure.
But on days like the ones Skype experienced last week, its peer-to-peer system doesn’t seem like an advantage at all. Mainstream telecom service providers pride themselves on having what they call “five nines” availability: service that’s working properly 99.999 percent of the time. A two-day disruption of traditional phone service is pretty much unthinkable, given that society relies on it not only for business and social communication, but also for emergency services.
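To put "five nines" in perspective, the arithmetic can be checked in a few lines. This is a rough sketch that assumes a 365-day year and treats the outage as exactly 48 hours (the reported figure is only "approximately two days"). The function names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability_percent):
    """Minutes of downtime per year permitted at a given availability level."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def availability_percent(downtime_minutes):
    """Availability over one year, given total downtime in minutes."""
    return 100 * (1 - downtime_minutes / MINUTES_PER_YEAR)


if __name__ == "__main__":
    five_nines_budget = allowed_downtime_minutes(99.999)
    two_day_outage = 2 * 24 * 60  # assumed: exactly 48 hours = 2,880 minutes
    print(f"Five nines allows about {five_nines_budget:.2f} minutes of downtime per year")
    print(f"A two-day outage caps the year at {availability_percent(two_day_outage):.2f}% availability")
    print(f"That is roughly {two_day_outage / five_nines_budget:.0f} times the five-nines budget")
```

Under those assumptions, five nines leaves room for a little over five minutes of downtime in a year. A single two-day outage uses up that allowance several hundred times over and holds the year's availability to about 99.45 percent.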
Skype offered its own account of why the network failed to recover quickly:
Normally Skype’s peer-to-peer network has an inbuilt ability to self-heal, however, this event revealed a previously unseen software bug within the network resource allocation algorithm which prevented the self-healing function from working quickly. Regrettably, as a result of this disruption, Skype was unavailable to the majority of its users for approximately two days.
This Skype outage is sure to provide ammunition to rivals who wish to convince customers that they would be foolish to try to run their business on Skype alone. So although eBay is seeking to reassure its users that Skype is now fixed and ready to handle similar problems in the future, the company might have to work hard to win back the confidence of businesses.