re: Verizon Biz Searches for Core Competency
> Am I wrong that the Internet was designed to withstand a nuclear blast?
> Why can VZ not architect a network that can route around failures?
Yeah, nuclear blasts. But this was before dynamic routing protocols. These were also designed to recover, but after a few seconds (though in practice it sometimes took longer).
VZ is looking to run voice/video with POTS/CATV quality, which means 50 ms detection & recovery, and < 3 minutes of downtime per year including software/firmware updates. This is MUCH more challenging than anything ARPA ever strove for.
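A quick back-of-the-envelope check on those targets (a sketch in Python; the "five nines" comparison is my own addition, not from the post):

```python
# Minutes in an average year (accounting for leap years).
MIN_PER_YEAR = 365.25 * 24 * 60

def downtime_minutes(availability: float) -> float:
    """Expected downtime per year for a given availability fraction."""
    return (1 - availability) * MIN_PER_YEAR

five_nines = downtime_minutes(0.99999)      # ~5.26 minutes/year
# The "< 3 minutes/year" target quoted above is tighter than five nines:
target_availability = 1 - 3 / MIN_PER_YEAR  # ~99.99943%
```

So the target described here is actually stricter than the classic "five nines" telco benchmark.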
> Perhaps VZ is just trying to re-create the centralized TDM network with IP routers.

That is certainly coloring their expectations.
re: Verizon Biz Searches for Core Competency
Dear spelurker:
Thanks. You are right, ARPA was not in real time. However, I keep hearing that mesh networks add capacity and redundancy at the same time. Can't VZ design a meshed network that supplies that redundancy while also ensuring sufficient capacity for most needs? If they get swamped, they can perhaps degrade those video streams a bit. It just seems to me that they are trying to rebuild those expensive single points of failure that stat-muxing and routing were designed to avoid.
re: Verizon Biz Searches for Core Competency
To follow on to Geoff's comments, one thing that is often forgotten is that the RBOCs have a mindset of 5 9's for financial reasons.
The FCC and State PUCs require reporting on downtimes. If these exceed standards, the RBOC is fined. That is on top of service SLAs that business customers might have.
The RBOCs took that mentality and turned it into a point to compete on.
re: Verizon Biz Searches for Core Competency
Comrades,
Original ARPA idea of resilience: "If half of my network gets vapourised, I guess I'm doing pretty well if I can reroute in a minute or so."
Telco idea of resilience: "If some dickhead puts a backhoe through one of my cables then somebody who is making a phone call through that cable should not have their call disconnected, and ideally shouldn't even notice any problem."
At the risk of sparking off a huge debate, it's REALLY hard to achieve typical telco resilience levels (ie. sub 50ms) using IP protocols alone. Remember that SONET/SDH only achieves sub 50ms on an individual span or ring, not end to end. So in the first place let's make sure we're comparing like with like.
I wrote a big report that included this issue a couple of years ago for Heavy Reading, and at that time the situation was that if an ISP wanted 50ms rerouting then they were using SONET/SDH, or they trusted in a vendor-specific "extension" to IP.
Rerouting has a number of key stages that (depending on which text book you read, and forgive my paraphrasing the correct terms) include link failure, failure detection, notification, rerouting, path setup, path testing and path restoration.
From the application perspective, the total outage time starts when the link fails, and ends when the path is restored (plus some delay before the application actually notices the link is back). So which combination, if any, of these events are we thinking of when people quote 50ms? That's actually open to debate, but it's interesting to understand the magnitudes of each step. Note that for SONET/SDH, the 50ms is based on the delay between loss of timing signal and regaining the timing signal again. It's a Physical Layer thing, and not measured at the application level.
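The point that the application-visible outage is the sum of every stage can be sketched numerically; every latency value below is purely illustrative, not a measurement:

```python
# Hypothetical per-stage latencies (seconds) for the rerouting stages
# listed above. The values are illustrative only.
stages = {
    "failure_detection": 0.009,  # e.g. missed hellos or a link alarm
    "notification":      0.010,  # flooding the failure to other nodes
    "rerouting":         0.050,  # selecting or recomputing a new route
    "path_setup":        0.020,  # signalling the new path (if needed)
    "path_testing":      0.010,
}

# The application sees the sum of all stages, which is why a single
# quoted "50ms" figure is ambiguous: 50ms of *which* stage?
total_outage = sum(stages.values())
```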
For example, one of the unspoken secrets of the IP world is that it may take IP devices quite a while to figure out that a link is actually bad. The reason for this is that IP networks are inherently asynchronous. In other words, if a given path has no data on it, is it because there's a fault in this path, or is it because there's no data to send at the moment?
Most of the fast rerouting demonstrations (note I'm not implying MPLS Fast Reroute here, but it would be included as a fast rerouting technology) actually rely on a link failure alarm being signalled by an underlying SONET/SDH interface.
You see, SONET/SDH is a synchronous technology. In other words, a frame has to be sent every 125 microseconds, even if it's an "empty" frame with no data in it. So if a receiver doesn't see a frame within a given time period, it can assume the link has a fault and notify higher-level rerouting systems extremely quickly.
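The 125-microsecond framing above implies 8000 frames per second, so even a multi-frame loss threshold detects a fault in well under a millisecond. The threshold below is illustrative, not from any standard:

```python
FRAME_INTERVAL_US = 125  # one SONET/SDH frame every 125 microseconds
frames_per_second = 1_000_000 // FRAME_INTERVAL_US  # 8000 frames/s

# If a receiver misses several consecutive frames it can declare a fault.
# Even a generous threshold keeps detection well under a millisecond.
MISSED_FRAME_THRESHOLD = 3  # illustrative value, not from any standard
detection_time_us = MISSED_FRAME_THRESHOLD * FRAME_INTERVAL_US
```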
But there are obvious disadvantages to synchronous technologies in terms of bandwidth efficiency, technical complexity etc. So IP networks were designed to operate without any assumptions about the underlying transmission technology. Routing protocols like OSPF, for example, make use of a keepalive (breath of life) protocol that, by default, is sent every 3 seconds to see if the other router is still there. In addition, OSPF uses a "hold down" technique (initially used to prevent false alarms from link failure notifications if links were "bouncing" up and down). So by default, an OSPF router should wait for 3 BoFLs to be missed (ie. 9 seconds), plus a 1 second delay, and then it will start rerouting. These timers can be turned down (to a minimum of 3 seconds total). Or the router can be configured to use a SONET/SDH alarm to trigger an immediate reroute.
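The timer arithmetic described above can be sketched as follows (using the post's quoted numbers, which may differ from any particular vendor's or RFC's defaults):

```python
def ospf_detection_time(hello_interval: float, missed_hellos: int,
                        extra_delay: float = 0.0) -> float:
    """Worst-case time before a router declares a neighbor down,
    using the missed-hello model described in the post."""
    return hello_interval * missed_hellos + extra_delay

# The post's quoted defaults: 3s hellos, 3 missed, plus a 1s delay.
default_detection = ospf_detection_time(3.0, 3, 1.0)  # 10 seconds
# Turned down to the quoted floor of ~3 seconds total:
tuned_detection = ospf_detection_time(1.0, 3)
```

Either way, hello-based detection is orders of magnitude slower than a physical-layer alarm.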
Once the router knows the link has failed, it then has a few options. Ideally it should keep a set of alternate routes available so that it can reroute traffic immediately. This was a feature of ATM's PNNI, for example. Only if none of the precalculated routes are available should the router go through the whole OSPF/ISIS routing algorithm.
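The precomputed-alternate idea can be sketched as a routing table that keeps a backup next-hop alongside the primary, so failover is a table swap rather than a full SPF run. All names here are hypothetical:

```python
# destination prefix -> (primary_next_hop, backup_next_hop)
# Both entries are computed in advance, during normal operation.
routes = {
    "10.0.0.0/8":     ("linkA", "linkB"),
    "172.16.0.0/12":  ("linkA", "linkC"),
}

def handle_link_failure(failed_link: str) -> None:
    """Promote the precomputed backup for every route using the failed link.
    A full routing recomputation can then happen later, off the fast path."""
    for dest, (primary, backup) in routes.items():
        if primary == failed_link and backup != failed_link:
            routes[dest] = (backup, None)

handle_link_failure("linkA")
```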
The next step involves connection-oriented networks like ATM and MPLS. These technologies have to set up paths before they can send traffic on them. Path signalling can take a "long time" (relatively speaking) and so for MPLS Fast Reroute demos, the alternate paths are always set up in advance. This certainly speeds up rerouting, but if we care about QoS and link loading, we also need to reserve capacity on the backup links.
Ironically at this point, MPLS Fast Reroute starts to take on all the complexity and disadvantages that MPLS proponents (of which I was one) used to complain about in SONET/SDH. Basically the guys who designed SONET/SDH put all that stuff in for a reason, and around about 2002 the MPLS folks began to understand what those reasons were, and started to build them into the MPLS specs.
Anyway, that's enough gasoline on this particular fire, hope it's given some useful background :-)
re: Verizon Biz Searches for Core Competency
Hi fiber_r_us,
Thanks for the compliment - I think ;-)
I do accept, and apologise for, mixing up links and paths, but my excuse is I was just trying to keep things simple, and yet still managed to splurge out a page and a half. Go figure. And I think that was before my second cup of coffee too :-)
I'm puzzled about a couple of things you said, but I could certainly be wrong...
You mentioned Ethernet IDLE frames. Are these sent switch to switch, not just switch to node? I'm not an Ethernet expert, but I know that vendors like Extreme have some fast reroute thing (I'm racking my brain for it, I know it has a funny acronym like SHLURP or NEAP or something). Wasn't that supposed to fix the switch to switch issue, but then it makes the network proprietary? As I understand it, it was basically a fast hello between the switches.
If link failure is detected on pure ATM OAM I guess that relies on SSCOP timeouts, which are several seconds. However, if ATM is operating on a SONET/SDH interface then loss of light can be used as a sub-50ms trigger.
I don't think we're disagreeing on the rest of your post. I tried to highlight the apples-for-oranges comparison of SONET/SDH section (I guess this is a better term than link) "rerouting".
It's also appropriate to point out that the decision that SONET/SDH makes is vastly simpler than IP. With SONET/SDH we're talking an A or B decision, but IP could be looking at a full mesh.
I suppose one question I would ask is that if IP really is able to reroute so quickly, why are there so many extended network outages?
Candidate answers:
A: ISPs are operating on such thin margins they can't afford to design fully resilient network topologies.
B: A lot of networks are just badly designed.
C: Many resilience features are proprietary, and most ISPs have at least two vendors (ie. they buy their BRAS from Juniper and their routers from Cisco).
D: So much complexity has been added to IP over the years, and the system level specifications for IP are so vague and poorly documented (eg. where is the RFC for BGP that would actually allow a vendor to create a stable, interoperable implementation?) that a lot of stuff just goes wrong.
re: Verizon Biz Searches for Core Competency
Gbennett, you are usually pretty good, but this is way off the mark and mixing up a lot of stuff:
> For example, one of the unspoken secrets of the IP world is that it may take IP devices quite a while to figure out that a link is actually bad. The reason for this is that IP networks are inherently asynchronous. In other words, if a given path has no data on it, is it because there's a fault in this path, or is it because there's no data to send at the moment?
In this statement, you are mixing "IP networks" with "links" and "paths". Let's try to be a little more specific:
- "IP" doesn't know anything about "paths" (i.e. end-to-end connections through a network). TCP has that concept at the next higher layer between the clients on the TCP session. But, the routers and the "network" are unaware of them in general. That is, IP routers do not calculate new "paths" through the network; routers only calculate next-hop routes.
- The vast majority of IP networks are built upon links DIRECTLY between routers based on:
>> SONET/SDH (or DS-x) links (POS)
>> Ethernet
>> ATM PVCs (in the DSL access world)
In each of these cases, there is a mechanism for the hardware to determine if the link is up or not:
>> SONET/SDH uses frame arrival (as you mention)
>> Ethernet uses detection of the symbols in the line code. While Ethernet is "asynchronous" at the packet level, it is "synchronous" at the bit-stream level where there are always bits arriving (even when no packets are there, IDLE symbols are being sent on the link).
>> ATM uses its OAM features to detect the loss of PVCs.
In all of the cases (even Ethernet), faults are detectable in sub-millisecond time-frames. If a router does not detect link faults in this manner almost instantly, then the router is simply BROKEN (and not following standards), and you should be contacting your vendor for a replacement. I know of no major mainstream routers that haven't performed this way for almost a decade.
Since the link faults are detectable at the link-layer, there is no dependency on the routing protocols, and thus the OSPF link timers are almost never used. They are there as a fall-back mechanism if some, more complex, failure mode occurs (like a routing protocol failing, but the link is still "up").
One can certainly build a poorly designed network that can defeat this behavior. For example, instead of using point-to-point Ethernet links between routers, the designer could place a layer-2 Ethernet switch between router ports. In this case, if an Ethernet link fails (between a router port and the Ethernet switch), the router on which the link fails will be aware of the failure, as will the port on the Ethernet switch. But, the other routers on that same Ethernet switch will not know (at the link level) that the one link has failed. In this case the routers must use the OSPF link timers. Recent router implementations allow for very fast OSPF link timers (ms) to accommodate this (poor) design.
Once a link-fault has been detected, the router must "converge" on a new topology (one that is missing the failed link). Again, much improvement has occurred here over the years and VERY large networks can converge in 100ms. Note, that "convergence" in the IP network is the time it takes for all of the routers (for OSPF, all of the routers in an OSPF area) to complete a new topological calculation and update the route table. This is not directly comparable to SONET's 50ms "switching" time, which is a link-layer operation and doesn't include detection times and re-synchronization time.
Neither measurement addresses application performance as a result of the transient switching/convergence event. Application performance is totally dependent on the specific application and its behavior associated with the packet loss that occurred during the SONET switch or IP convergence time.
re: Verizon Biz Searches for Core Competency
Dear Geoff and Fiber: Great posts! I have an "F" for your last list: perhaps these vendor-specific restoration solutions are just not very good.
The question boils down to what cost we collectively are willing to pay for 5-9s. Do we want an open internet like we have had, or expensive, closed walled gardens that are supposed to stay up ALL the time?
Clearly, the answer varies by application, since my e-mail is not the same as my stock trade. If people are just using this to watch movies, what is a minute of downtime once a year? Perhaps we just cache this stuff all over to cover up network failures. It's a question that at least needs to be asked.
re: Verizon Biz Searches for Core Competency
GREAT posts!!!!!!!
I would add that it is usually not as simple as fiber_r_us explained, but in principle (or is it principal?) he was correct.
Most (mid to large) networks now consist of DSLAM/OLT to aggregation Ethernet switches to routers to MPLS devices, and recovery encompasses the mechanisms of all these devices (at each of their levels), from link recovery (SONET/SDH) to network recovery (MPLS, or maybe IP).
re: Verizon Biz Searches for Core Competency
They are not IDLE "frames", they are idle *symbols*. For each Ethernet implementation, a line code is used to encode the data and control information into a series of symbols:
- For 100Mb/s Ethernet, this is the 4b/5b line coding derived from the FDDI standard.
- For 1000Mb/s Ethernet, this is the 8b/10b line coding derived from the Fibre Channel standard.
- For 10Gb/s Ethernet, this is the 64b/66b line coding scheme.
Each of these mechanisms encodes the data (the octets) from upper-layer protocols into a series of symbols that include the data and control information. When no data is presented to the link, the coding schemes typically transmit idle symbols.
Thus, the receiver on an Ethernet link is in synchronization with the transmitter because it is always receiving a stream of symbols that can be decoded into frame data and control information (IDLE symbols are a form of control information).
Each line coding scheme contains a set of valid symbols that can be decoded into data and control information, and a set of invalid symbols that would not appear on a link during normal operation. If the receiver ceases to receive a stream of valid symbols, then the link is marked as failed. For GigE, this is done within 3 symbol times (3 *micro* seconds).
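The coding overhead of each scheme listed above falls out of simple division, which also shows why 64b/66b was adopted for 10G (25% extra line rate for 4b/5b and 8b/10b versus about 3% for 64b/66b):

```python
def coding_overhead(data_bits: int, code_bits: int) -> float:
    """Fraction of extra line bits an mBnB line code adds on top of the data."""
    return (code_bits - data_bits) / data_bits

oh_4b5b   = coding_overhead(4, 5)    # 100Mb/s Ethernet: 25% overhead
oh_8b10b  = coding_overhead(8, 10)   # 1000Mb/s Ethernet: 25% overhead
oh_64b66b = coding_overhead(64, 66)  # 10Gb/s Ethernet: ~3.1% overhead
```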
This mechanism is part of the Ethernet standard and must be implemented on all Ethernet gear. It is not a vendor specific thing.
Once a link failure has been detected, what happens next, and how fast it happens, is specific to the device, the network, and how it is all configured.
Also, as you and I both point out, the convergence mechanism that IP uses is far more complicated than protection switching in SONET. This allows IP routing protocols to handle issues in a far more sophisticated and robust manner than a layer 1 protection switch. For example:
How would a SONET protection mechanism deal with:
- A router port failure?
- An entire router failure?
- A failure of a routing protocol?
- A city failure (think New Orleans)?
It can't, because SONET is a link layer thing, and to recover from these problems you need a network layer thing. Yet robust networks must handle problems like these, and that is exactly what routing protocols do. Handling the failure of one link is the simplest thing a routing protocol has to deal with.
On the topic of extended network outages, you have certainly hit on some of the issues. I think you can classify network outages partially by the scope of what was affected:
1) The case of a single customer being down: This is almost always because there has been no redundancy designed into the customer's connection to the provider's network. In this case it is likely that the last-mile link has failed (link cut, port failed, router failed, etc) and there was simply no redundancy. I think Light Reading got bit by this a while back. As has been stated many times on this board and elsewhere, if you are serious about uptime, you need to have redundant routers, links, and providers. "Single-homing" to one provider is asking for trouble.
2) The case of all of the customers on a local ISP's network losing connectivity: This is most likely because the provider is running on a "shoe-string" budget, has built little redundancy into his network, and something critical has failed. This describes my situation currently, as my Internet connection is through a wireless provider who has no redundant links in the network (it's all I can get!). There have been several extended outages while replacement equipment was brought in. While this is a bad design, it was because of budgetary constraints. For major backbone providers, in general, the network is built well by competent people and they do not suffer from insufficient redundancy. Naturally, even with the big guys, there have been exceptions!
3) The case of a specific major provider's entire (or large part of) backbone spectacularly failing: This is almost always due to operator error (mis-configuring a router) or a bug in router code. In both cases, the end result is that the routing protocols either completely fail or get so "overloaded" that they are no longer effective. No routing protocol, no network! Fixing these can take a long time and can include re-loading all of the routers in extreme cases. These outages usually make the headlines.
4) Outages in the Internet at large: The problems here are wide and varied, but usually are *not* link failures, router failures, or routing protocol failures. Mostly these outages have to do with overloaded peering links, overloaded servers, or mis-configured routing protocols. All of these things take time to fix, which explains the perception that the "network" has been down a long time. In reality, the "Network" isn't down; something else is wrong that is affecting a large group of users. I know of no instance in the modern Internet where the "entire Internet" has been down.
In none of these cases was the outage due to a link failure, router failure, or router port failure. These types of failures occur regularly on large networks and are recovered by the routing protocols; in most instances without any customer or application noticing. In fact, most of the large IP backbones do not use SONET protection mechanisms on their backbone links. Yet, I would guess, that there are failures on these major backbones (links, ports, routers, etc) somewhere in the world every few hours or so. Yet no one really notices! That is because the routing protocols work very well.
re: Verizon Biz Searches for Core Competency
I was mostly talking about the backbone networks (national/international nx10G stuff).
In the case of access networks (DSLAM/OLT/etc), there is almost certainly no redundancy in the last mile (DSLAM to CPE). In many cases, there is no redundancy between the DSLAM and the CO either (that is, the DSLAM sits on an un-protected fiber link back to the CO).
Once you are in the CO, and for interconnectivity within the CO and to other metro sites, redundant and diverse equipment and links are pretty much the norm (though there are exceptions).