WorldCom Outage Only the Start
But some say it's only the beginning.
"I think that we haven’t even seen the tip of the iceberg yet," says Alex Yuriev, an independent technology consultant.
“It will happen again,” agrees Tom Ohlsson, the vice president of business development at Matrix NetSystems, Inc., an independent network monitoring firm. “It may not be WorldCom, but it will happen again.”
Massive layoffs and the dramatic drop in capital and operations spending are showing up as service degradation, analysts say. While some folks were almost expecting a large-scale breakdown of the UUNet backbone following WorldCom’s gigantic accounting and bankruptcy woes over the summer, other networks have been experiencing similar problems, they warn -- problems that are getting worse (see WorldCom Workers Get the Shaft, Carrier Spending Hopes Dim, Whither WorldCom's Network?, AT&T: WorldCom Shutdown No Problem and WorldCom's at $7.1 Billion and Counting).
“This was very similar to the AT&T outage about a month ago. I think that the cash crunch has a significant effect on this,” Yuriev says.
According to Ohlsson, carriers have continued to take care of their core networks despite the capex crunch. But they have let their peering relationships, whereby network links are shared with other operators, slip a bit, and it’s showing in the performance of the Internet. Peering relationships take daily maintenance, he says, and they're not always getting it in today's environment.
Not true, says WorldCom spokeswoman Jennifer Baker. “At WorldCom, obviously the core of our network is very important, but [peering] is equally important,” she asserts.
Meanwhile, debate continues over what exactly caused WorldCom's outage. Initial reports indicated that the outages, which started at 8 a.m. on October 3 and continued through most of the day, were caused by faulty software loaded onto UUNet edge routers during a routine software upgrade. But Baker says the problem occurred when a technician who was repairing a gateway or border router in St. Louis made a configuration change that caused a routing instability in the network. This caused intermittent outages that lasted between 15 and 30 minutes each, she says, claiming that the problems were resolved by 2 p.m.
There's been much speculation about the supplier whose equipment was at fault. WorldCom buys most of its routers from Cisco Systems Inc. (Nasdaq: CSCO) and Juniper Networks Inc. (Nasdaq: JNPR), but the carrier won't say which of the vendors provided the routers in question. “We’re working with that vendor to make sure the product is more stable,” Baker says.
She also concedes that WorldCom has changed its operating procedures.
That’s a good thing, says Yuriev, who argues that WorldCom’s procedures, or lack thereof, should get most of the heat for what happened. The carrier was trying to do two things at the same time, he says, likely with inadequate communication among the departments doing the work. He points out that while some of his clients that use WorldCom local loop services on the East Coast received notices of maintenance on transport nodes, others were told at the same time that UUNet would be performing a software upgrade.
Baker says the software upgrade happened within the normal maintenance window, which is very late at night or very early in the morning, and she says it did not overlap with the router repair. She wouldn’t comment on Yuriev’s speculations.
WorldCom’s troubles over the outage are probably far from over. Cutting service to a large number of customers for up to six hours is likely to have violated a slew of service-level agreements (SLAs), and customers will be asking for compensation. A service breach on this scale might prompt some to terminate their contracts altogether.
“We will honor any SLA agreements we have with customers,” Baker says, admitting that 20 percent of WorldCom’s U.S.-based customers were affected by the outage.
Bottom line? According to Ohlsson, an outage that deprived thousands of Internet users of a fast and flawless connection to the Web drives home the need to diversify and host with multiple providers. “It’s not like the sky is falling… but we do feel that it’s going to get worse before it gets better.”
— Eugénie Larson, Reporter, Light Reading