<<   <   Page 7 / 9   >   >>
Mark Seery 12/5/2012 | 12:14:10 AM
re: Caspian Comes Out I cannot speak specifically about Caspian, but let me say the following generic things about some of the issues that have been raised in this thread.

A multi-stage switching fabric is a network of fabric nodes within a switch or router. But it is not a network that is like the Internet. It is a network topology that is much smaller, and by design usually conforms to some well characterized topology the knowledge of which may allow for certain types of optimizations.

A routing algorithm may be used to direct traffic within the fabric network, but that is not anywhere near the complexity of building a "router", and is not at all like putting a "router" within a router except at a very high abstract level. And unlike a real network, routing decisions might be made with real-time understanding of actual resource usage within the network (this is in fact simple to implement in a well bounded network such as this).
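As a rough illustration of routing on real-time resource usage inside a bounded fabric, here is a minimal Python sketch. The link names, the load counters, and the `pick_link` and `forward` helpers are all hypothetical and not taken from any vendor's design. The sketch only assumes that each ingress can see a current load figure for every candidate link.

```python
def pick_link(link_load, candidates):
    """Return the least-loaded candidate link (ties broken by link name)."""
    return min(candidates, key=lambda link: (link_load[link], link))


def forward(link_load, candidates, cell_size):
    """Choose a link for one cell and account for the load it adds."""
    link = pick_link(link_load, candidates)
    link_load[link] += cell_size
    return link


# Hypothetical load snapshot for three parallel fabric links.
load = {"a": 3, "b": 1, "c": 2}
chosen = forward(load, ["a", "b", "c"], cell_size=1)
```

In a fabric the load snapshot can be kept nearly current because the feedback path is short. Across a wide-area network the same counters would be stale by the time they arrived.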

There are various ways to go about the process of utilizing the links within the fabric network. A true datagram approach provides a great deal of efficiency, but then there needs to be a mechanism to maintain packet ordering, as this is by conventional wisdom a desirable thing.
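One possible mechanism for restoring order after a true datagram spray is a sequence-number resequencer at the fabric egress. The sketch below is an assumption about how that could look, not a description of any particular product. The `Resequencer` class and its per-ingress sequence numbers are hypothetical, and the sketch ignores loss and timeouts, which a real design would have to handle.

```python
class Resequencer:
    """Hold out-of-order packets until the missing ones arrive."""

    def __init__(self):
        self.next_seq = 0   # next sequence number to release
        self.pending = {}   # seq -> payload, for packets that arrived early

    def receive(self, seq, payload):
        """Accept one packet and return every packet now releasable, in order."""
        self.pending[seq] = payload
        released = []
        while self.next_seq in self.pending:
            released.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return released


egress = Resequencer()
early = egress.receive(1, "second")   # arrives first, must wait
in_order = egress.receive(0, "first") # fills the gap, both are released
```

The cost of the datagram approach shows up here as buffer memory and added latency at the egress, which is the trade-off against its link efficiency.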

A flow-based approach might be another way. Let's take a look at Ethernet link aggregation for example. Test results I have seen indicate that it works extremely well in the face of millions of short-lived flows. Would it be possible to apply a similar concept to a fabric network? Would it be worthwhile adding QoS reservations to such a concept? These are interesting questions, but suffice to say that all complex fabrics do a lot more than just shunt bits around, there is often some internal control plane even if it is as simple as feedback on resource usage.
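To make the link aggregation idea concrete: such schemes commonly keep every packet of a flow on one member link by hashing header fields, which preserves ordering without any resequencing. The following is a minimal sketch under stated assumptions. CRC32 over a 5-tuple stands in for whatever hash real hardware uses, and `link_for_flow` and the example addresses are hypothetical.

```python
import zlib


def link_for_flow(flow, num_links):
    """Map a flow identifier (e.g. a 5-tuple) to one of num_links links."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_links


# Hypothetical 5-tuple: source, destination, protocol, source port, destination port.
flow = ("10.0.0.1", "10.0.0.2", "tcp", 40000, 80)
first = link_for_flow(flow, 4)
again = link_for_flow(flow, 4)   # same flow, same link, so no reordering
```

With millions of short-lived flows the per-link load tends to even out statistically, which is consistent with the good test results mentioned above. Nothing in the hash itself guarantees that balance.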

Some other miscellaneous thoughts. The Internet is not in my opinion a datagram network in the same sense as was envisioned in the 1960s. Many flows will follow the same path, most of the time, until the topology changes. The efficiency of the Internet would probably be enhanced significantly by the introduction of a true datagram paradigm, if the resultant issues could be dealt with.

We do not make reservations on the road system because there is no incentive to provide better service, i.e. it is a best effort service, with some quite good differential service for those that travel in the "CBR" lane by traveling with a companion; but like many differential service algorithms, the system is broken for everyone else some of the time, and the system works exceptionally well for everyone some of the time.

There are some activities in life that are well suited to a reservation paradigm, and then there are many that are not. The notion that one networking mode, and perhaps even one network, will solve all problems may be a more broken thought than anything that is wrong with the individual network modes that we as an industry already understand.
arch_1 12/5/2012 | 12:14:10 AM
re: Caspian Comes Out [I asked]
So here is my question: how in the heck can a single router in the core implement flow-based routing and have any effect on the QoS of the flow?

[rjmcmahon responds socratically]
If the flow traversed (at least initially) the router, couldn't that have some effect?

[to which I reply]
In theory it might have some effect, but in practice the effect has exactly zero impact on the flow's QoS. Essentially all traffic in the internet has an individual at one end and a server at the other end. The "individual" end has fairly low continuous bandwidth (say 10Mbps or lower.) The core internet is running at 10Gbps, so to a first approximation, contention for the user's access line will have 1000 times the impact on QoS that contention on a core line will have. The actual average core-to-edge ratio is higher than this.
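One way to read the 1000x figure is as the ratio of time a single competing packet occupies each link. The short Python calculation below uses the 10Mbps and 10Gbps rates from the post; the 1500-byte packet size is an assumption added for illustration, and `serialization_delay` is a hypothetical helper.

```python
ACCESS_BPS = 10e6    # access line rate from the post: 10 Mbps
CORE_BPS = 10e9      # core line rate from the post: 10 Gbps
PACKET_BYTES = 1500  # assumed packet size, not from the post


def serialization_delay(packet_bytes, link_bps):
    """Seconds needed to clock one packet onto a link of the given rate."""
    return packet_bytes * 8 / link_bps


access_delay = serialization_delay(PACKET_BYTES, ACCESS_BPS)  # about 1.2 ms
core_delay = serialization_delay(PACKET_BYTES, CORE_BPS)      # about 1.2 microseconds
ratio = CORE_BPS / ACCESS_BPS                                 # 1000.0
```

Waiting behind one packet on the access line costs about as much time as waiting behind a thousand packets on the core link, under these assumptions.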

The above is a gross oversimplification, but a rigorous analysis will show that it understates the true difference.

[I asserted]
Flow QoS is an end-to-end issue, not a per-router issue

[rjmcmahon replied]
Not sure how flow QoS could be realized without router (and forwarder) involvement. Is that possible? (My engineering is rusty -- been spending too much time worrying about politics ;-)

[I reply]
It is almost certainly cheaper to simply overprovision the core bandwidth. Long-haul costs have dropped to nothing, as have all other link costs except the local loop. Therefore, QoS is only needed on the local loop. So, only the CPE and the access equipment (i.e., the two ends of each local loop) need to worry about QoS.


[I asserted]
The place where it has the biggest impact (by orders of magnitude) is on a customer's access line, not on core trunks.


[rjmcmahon replied]
Agreed. Maybe we could fix the access lines and get rid of that congestion (and the lobbyist) rather than throwing expensive flow based technology at that problem? (I know Wall St. prefers picks and shovels over public infrastructure, but heck, we don't all live by the rules of Wall St. quite yet.)

[I reply]
This is actually backwards, believe it or not. If we can drive the cost of the local loop down, it will no longer be the chokepoint. At that point we will be back to optimizing the use of the long-haul, and QoS in the core will again make sense.

I have given up on the "political" process to drive down the local loop cost. It's too slow. We will see lower costs when the structural monopolies (telco and cable) are rendered irrelevant by technology advances, and specifically by advances that do not require a truck roll to each home. Current contenders include 802.11g.
joe_average 12/5/2012 | 12:14:09 AM
re: Caspian Comes Out As Bobby so badly stated:

Caspian wants to launch its products based on reliability based on reliability and scalability. None of these objectives have been by achieved by Caspian. I sincerely hope no carrier in the US, Europe and Asia believes the claims made by Caspian.

-----------------------

It doesn't appear that he knows much about the subject (or sentence construction) but he did hit on something seemingly by accident.

It is interesting to note the continuing use of reliability as a competitive differentiator by Caspian. Especially when you consider they laid off their reliability engineers 20 months ago. Doh!
skeptic 12/5/2012 | 12:14:08 AM
re: Caspian Comes Out So here is my question: how in the heck can a single router in the core implement flow-based routing and have any effect on the QoS of the flow? Flow QoS is an end-to-end issue, not a per-router issue, and the place where it has the biggest impact (by orders of magnitude) is on a customer's access line, not on core trunks.
=================
The answer is that you are supposed to buy
lots of routers from caspian and re-build your
entire network as a caspian network.

I don't know how many people remember it, but
during one of caspian's earlier launches, their
theme was "the death of the router" (with
picture ads to go with it). What
Caspian was building was supposed to replace
the entire interior of the internet and
make routers obsolete.

The problem with what they were saying then is
that Caspian seemed to be suggesting that the
"future" of IP routing in Robert's vision
was replacement of open-standards routers with
proprietary standards routers (DMPLS) from
Caspian.

They don't seem to be saying that as much anymore.
But that's where they started.
skeptic 12/5/2012 | 12:14:08 AM
re: Caspian Comes Out A routing algorithm may be used to direct traffic within the fabric network, but that is not anywhere near the complexity of building a "router", and is not at all like putting a "router" within a router except at a very high abstract level. And unlike a real network, routing decisions might be made with real-time understanding of actual resource usage within the network (this is in fact simple to implement in a well bounded network such as this).
===============

If you put nodes together and connect them with
links, you have a routed network. And you will
run into all the classic problems that routers
and networks have always dealt with. There is
no set of simplifying assumptions that removes
the complexity (in spite of endless efforts to
find them).

A mesh is a mesh. And figuring out routing to
optimize random traffic across a mesh is a hard
problem. If the traffic were more predictable
or at least didn't change its characteristics
over time, it might be more possible.

----------
A flow-based approach might be another way. Let's take a look at Ethernet link aggregation for example. Test results I have seen indicate that it works extremely well in the face of millions of short-lived flows. Would it be possible to apply a similar concept to a fabric network?
------------------
I don't know the test results you are looking at,
but often the aggregation schemes are based on
simple hashing of packet contents. The problem
(as some have already found out) is that you
can easily fool yourself in testing these things
and then suddenly get into a real-traffic
situation at one particular place and time that
totally breaks the model.
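The failure mode can be shown with made-up numbers. In this minimal Python sketch, `flow_id % num_links` is a stand-in for a real packet hash, and the flow sizes are invented for illustration. The only point is that pinning flows by hash balances the number of flows per link, not the bytes.

```python
def link_loads(flow_sizes, num_links):
    """Sum flow sizes per link when each flow is pinned by a simple hash."""
    loads = [0] * num_links
    for flow_id, size in flow_sizes.items():
        loads[flow_id % num_links] += size  # stand-in for a header hash
    return loads


def imbalance(loads):
    """Ratio of the busiest link to the average link (1.0 is perfect)."""
    return max(loads) / (sum(loads) / len(loads))


# Lab-style traffic: eight equal flows spread evenly over four links.
uniform = {flow_id: 10 for flow_id in range(8)}

# Same flows, except one of them turns out to be very large.
skewed = dict(uniform)
skewed[0] = 900

uniform_loads = link_loads(uniform, 4)  # [20, 20, 20, 20]
skewed_loads = link_loads(skewed, 4)    # [910, 20, 20, 20]
```

A test bed full of similar flows looks like the first case. A single large flow in production produces the second, and the hash has no way to move it.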

Now you can go to adaptive models, but the
adaptive models always raise questions about
how a designer is supposed to figure out "one"
scheme that can account for every possible use
of the internet. What are the criteria used for
changing the traffic assignments? Should it
be reactive or predictive? Given that we can
mess up a flow every time the assignments are
changed, when should they be changed?
--------------
There are some activities in life that are well suited to a reservation paradigm, and then there are many that are not. The notion that one networking mode, and perhaps even one network, will solve all problems may be a more broken thought than anything that is wrong with the individual network modes that we as an industry already understand.
--------------

We have always had multiple modes. A great
example is the double-core. There were/are
networks that have an ATM "core" and an IP router
"core" at the same time. The way this worked
was that the ATM provided a reservation-oriented
system for transport of packets between the
core IP routers.

And, for odd reasons, both "cores" (ATM and IP)
thought of themselves as the "true core" and
thought that the other gear was not necessary.
Few people understood that both technologies
(reservation-oriented and IP datagram) played
a complementary role to each other.

And MPLS can play the same role that ATM did.


arch_1 12/5/2012 | 12:14:07 AM
re: Caspian Comes Out OK, thanks indianajones and skeptic.
I thought this might be the case. It matches the little history lesson I posted at the beginning of the thread.
Larry Roberts really did invent the ARPANET. He then really did successfully start over with a completely new X.25 net. So, his vision for Caspian was apparently to re-invent the internet once again, and in the same "new" way: this datagram stuff will never work correctly, so let's replace it with SVCs!

[indianajones replying to skeptic said]
I think what happened was that Caspian figured out that the customers were not stupid enough to fall for a proprietary solution and they changed their "proprietary protocol" to an MPLS-like scheme.

skeptic, I remember those ads, "Death of the router". While it is always gratifying to see startups challenge incumbents - I still believe startups have the best innovation and drive (recall Juniper's horrible M160 platform or should I call it M106) - why do startups like Caspian shoot themselves in the foot by making such stupid statements and hyping themselves to glory and falling flat on their face. Such a pity that over $270 or so odd venture money went into funding such a dog. What a misallocation of capital - I am sure there are a lot of startups with innovative and implementable product ideas looking for money.
arch_1 12/5/2012 | 12:14:07 AM
re: Caspian Comes Out [skeptic, responding to a comment on fabrics, said]
A mesh is a mesh. And figuring out routing to
optimize random traffic across a mesh is a hard
problem. If the traffic were more predictable
or at least didn't change its characteristics
over time, it might be more possible.

[I reply]
As the original poster said, a fabric differs greatly from an arbitrary mesh. The two most fundamental differences are often overlooked:
1) Fabrics are explicitly designed with overspeed.
2) The fabric physical diameter makes feedback times short, so feedback is cost-effective.

Yes, you can also do fancy stuff with QoS in a fabric, especially with short feedback times. But since fabric bandwidth is far cheaper than the external bandwidth it supports, it's cheaper and easier just to grossly overprovision. In essence, if your fabric links are at 50% peak usage, QoS algorithms will never kick in. You only have problems when there is a gross imbalance (e.g., presented load for an egress is twice the egress speed), but the short feedback time allows you to drop on ingress for that egress, which is exactly the behavior of a centralized or scheduled switch.
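A minimal Python sketch of the drop-on-ingress behavior follows. The post only says that feedback lets the ingress drop traffic bound for an overloaded egress; the proportional sharing rule, the `admit_at_ingress` helper, and the port names are assumptions made for illustration.

```python
def admit_at_ingress(offered, egress_capacity):
    """Return the load each ingress may send toward one egress.

    offered maps ingress port -> load presented for this egress.
    If the total fits, everything is admitted. Otherwise each ingress
    is scaled back proportionally and drops the remainder itself.
    """
    total = sum(offered.values())
    if total <= egress_capacity:
        return dict(offered)
    scale = egress_capacity / total
    return {port: load * scale for port, load in offered.items()}


# Presented load is twice the egress speed, as in the example above.
offered = {"in0": 10.0, "in1": 10.0}
admitted = admit_at_ingress(offered, egress_capacity=10.0)
```

Because the excess is dropped before it enters the fabric, the internal links never carry traffic the egress cannot deliver, which is why overprovisioned fabric links rarely need QoS of their own.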
Mark Seery 12/5/2012 | 12:14:07 AM
re: Caspian Comes Out Skeptic,

>> ...will run into all the classic problems that routers and networks have always dealt with... <<

Well I guess we will have to agree to disagree. A multistage fabric is a self-contained network. You do not have to worry about huge topologies, you do not have to worry about waiting for some algorithm to be standardized - and for everyone else to implement it, and you can implement a true datagram mode. All these things make the problem much simpler - especially in a topology that is regular and predetermined. And as stated before, feedback mechanisms play a much larger role as the distances within the network are so small. I understand that for people who are used to thinking about crossbars this is confusing territory at first but elegant and simple solutions are possible. Of course this is probably all OT if Caspian did not pursue this route.

>> We have always had multiple modes. A great
example is the double-core <<

Well I was more thinking along the lines of different networks in parallel providing different services. The double-core you refer to elicits a wide range of reactions as I am sure you are aware ;-)

As for ATM networks providing a reservation-oriented transport for IP cores, I think you will find that most of the time they simply provided a connectivity mesh (as will be the case with some MPLS TE cores as well). The offline traffic engineering tools allowed people to understand and characterize that mesh, but it does not mean reservations are made or necessary, especially if you trust the core traffic not to be bursty because of the level of aggregation it has already gone through.

>> And MPLS can play the same role that ATM did <<

Well of course you'd first have to state which MPLS mode and whose implementation ;-) Seriously, a subject only for the brave of heart - the general notion that MPLS acts like ATM is debatable. In fact, MPLS was, and proudly in some cases, created by people who did not want it to act like ATM.
indianajones 12/5/2012 | 12:14:07 AM
re: Caspian Comes Out I think what happened was that Caspian figured out that the customers were not stupid enough to fall for a proprietary solution and they changed their "proprietary protocol" to an MPLS-like scheme.

skeptic, I remember those ads, "Death of the router". While it is always gratifying to see startups challenge incumbents - I still believe startups have the best innovation and drive (recall Juniper's horrible M160 platform or should I call it M106) - why do startups like Caspian shoot themselves in the foot by making such stupid statements and hyping themselves to glory and falling flat on their face. Such a pity that over $270 or so odd venture money went into funding such a dog. What a misallocation of capital - I am sure there are a lot of startups with innovative and implementable product ideas looking for money.
Mark Seery 12/5/2012 | 12:14:06 AM
re: Caspian Comes Out arch_1,

I generally agree with your response,....

>> But since fabric bandwidth is far cheaper than the external bandwidth it supports, it's cheaper and easier just to grossly overprovision. <<

Waste not want not applies here as well IMO. There are good incentives for silicon efficiency still as overall system power is still something to be concerned about.

Mark