
802.11: Fatally Flawed?

Recent throughput tests on wireless LANs based on the 802.11 networking standard have turned up a surprising fact: The standard, which is the basis for thousands of enterprise LANs, public WiFi hotspots, and municipal wireless networks, is inherently flawed in a way that causes packet losses in a small but significant number of transmissions. The loss, say officials at wireless-LAN testing firm VeriWave Inc., is both predictable and unavoidable.

The flaw turned up in a series of tests of wireless LANs, looking at network scalability, carried out by VeriWave in conjunction with Network World magazine. Members of the Institute of Electrical and Electronics Engineers Inc. (IEEE), which ratified the 802.11 standard and is in the process of bringing out extensions including the much-anticipated 802.11n standard for wireless broadband, were previously unaware of the glitch, says VeriWave founder and CTO Tom Alexander.

"What we discovered is that as the transmitter retries sending dropped packets, some packets are being lost," explains Alexander. "It's extremely small, around .001 percent, but it's never zero. That's not what the protocol says; the loss should be zero."

Some degree of packet attrition is inherent in any transmission over an 802.11 network; that's why the standard builds in robust error-checking and retransmission mechanisms. In theory, even high rates of signal errors should be corrected as dropped packets get re-sent. These retries occur in such minute time periods -- a few microseconds, in typical 802.11g networks -- that they're invisible to users.

The problem turned up by VeriWave, in fact, happens during these retries. The flaw is not in the actual content of the transmission itself but in the physical layer convergence procedure (PLCP) header, which carries crucial information about each packet including its length and the speed at which it's transmitted. The PLCP header can be thought of as a label on each packet that gives the receiver a heads-up, as it were, for that specific packet -- "This is a 100-byte packet transmitting at 54 Mbit/s," for instance.

The main packet has a 32-bit "cyclic redundancy check," or CRC, which generates a checksum -- a sort of comparison function that in turn is used to detect errors after transmission. The CRC virtually guarantees that the packet itself can't arrive corrupted. The PLCP header, however, has only a single bit for error checking. That means it's more easily corrupted, and it means that the receiver sometimes gets garbled information about the size and transmission rate of the accompanying packet.
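A toy sketch in Python (with a simplified field layout, not the exact 802.11 SIGNAL field) shows why a single parity bit is such weak protection: any even number of flipped bits sails through the check, while a CRC catches the same corruption.

```python
# Illustrative sketch: one parity bit vs. a CRC.
# (The bit layout here is made up for illustration, not the real
# 802.11 PLCP header format.)
import zlib

def even_parity(bits):
    """Return the even-parity check bit for a sequence of 0/1 bits."""
    return sum(bits) % 2

header = [1, 0, 1, 1, 0, 0, 1, 0]   # pretend header bits
p = even_parity(header)             # transmitted parity bit

# Flip TWO bits in transit: parity still matches, so the
# corruption goes completely undetected.
corrupted = header[:]
corrupted[0] ^= 1
corrupted[3] ^= 1
assert even_parity(corrupted) == p  # still passes the parity check

# A CRC-32 over the same data immediately catches the change.
assert zlib.crc32(bytes(corrupted)) != zlib.crc32(bytes(header))
```

Any odd number of flips is caught by the parity bit; any even number is not, which is why the receiver can occasionally act on a garbled length or rate value.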

The 100-byte packet at 54 Mbit/s cited above, for instance, might "get converted into what looks like a very long byte stream, at a low bit-rate," says Alexander. Thus a packet might normally take a few dozen microseconds to transmit, but in this case the corrupted PLCP header will cause the receiver to keep listening for the packet for several milliseconds -- a huge difference in the infinitesimal time scales of high-speed networking.
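The arithmetic behind that blind window is straightforward. The sketch below uses the article's 100-byte, 54-Mbit/s example and, as an assumed worst case, a corrupted header read as the 12-bit LENGTH field's maximum of 4,095 bytes at the lowest OFDM rate of 6 Mbit/s.

```python
# Receiver listening time implied by the PLCP header's LENGTH and
# RATE fields (payload airtime only; preamble overhead ignored).

def airtime_us(length_bytes, rate_mbps):
    """Microseconds needed to send length_bytes at rate_mbps."""
    return length_bytes * 8 / rate_mbps

# Correct header: 100 bytes at 54 Mbit/s.
print(round(airtime_us(100, 54), 1))         # ~14.8 microseconds

# Corrupted header misread as a maximum-length frame at the lowest
# OFDM rate: the receiver stays deaf for milliseconds, not microseconds.
print(round(airtime_us(4095, 6) / 1000, 2))  # ~5.46 milliseconds
```

That is a difference of more than two orders of magnitude, which is what Alexander means by the receiver being blind "for several milliseconds."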

"The time period is wrong, and during that time the receiver can only listen for that specific packet," says Alexander. "All the retries that occur during the blind period don't get seen. The net result is that packets are lost."

The loss issue, which VeriWave attributes to a basic design flaw, has implications not just in the esoteric world of network testing but in real-world situations as well: If the loss occurs during a handoff between access points on a network -- a procedure known as an "Extensible Authentication Protocol handshake" -- it can cause dropped connections for laptops or handsets as the system takes 30 seconds or so to reset itself.

VeriWave and Aruba Networks Inc. (Nasdaq: ARUN) presented their diagnosis of the problem to IEEE members at the Sept. 19 meeting in Melbourne, and it's thought that the 802.11n standard -- on which many service providers and vendors are banking to take wireless networking to a new level -- can be revised to add more robust parity check mechanisms to the PLCP header and avoid the packet loss. Existing .11a, b, and g systems, however, are out of luck.

Users, says Alexander, "will just have to live with it."

— Richard Martin, Senior Editor, Unstrung

IPobserver 12/5/2012 | 3:34:56 AM
re: 802.11: Fatally Flawed? Thanks.

Another area could be how well vendors integrate and align with various application providers.
wimaxfan 12/5/2012 | 3:35:01 AM
re: 802.11: Fatally Flawed? I think it's time to think of the following:

1. System scalability (APs, Users, Density, etc.)
2. Voice + Data mobility
3. Security architecture
4. Integration with existing infrastructure
5. Solution extension beyond Wi-Fi (outdoor mesh, cellular, WiMAX, etc.)

There may be more here. But these are a lot more interesting to me personally than how fast the Wi-Fi link can go.

Except for 802.11n, there is very little interesting left in the Wi-Fi link and the IEEE standards body is beginning to smell like the ATM Forum these days. Too many people, too few results. Time to move on.
IPobserver 12/5/2012 | 3:35:02 AM
re: 802.11: Fatally Flawed? Interesting comments.

Following up on this one…
I think it's time we said the Wi-Fi link is a commodity, like Ethernet has become, and there is no real differentiation there. Time to move on to other system-related issues.

I'm curious what folks think constitutes a differentiator in this market?

Disclaimer: I'm close to finishing a paid research report on enterprise WLAN, so I'm looking to cross-check my findings.
wirelessfreak 12/5/2012 | 3:35:05 AM
re: 802.11: Fatally Flawed? I think everyone is missing the point - probably deliberately on some people's part.

Aruba and Meru showed up for this test, Cisco did not. Even though they were given ample time (2 months according to the article) they refused to respond.

I guess Aruba and the testers had some spare time on their hands since the big gorilla was a no show so played with things like PCF and talking about flaws in the 802.11 spec.

What the test results show to me is that Aruba was capable of creating a mid-sized network in a lab and then getting reasonable performance out of it. 25 APs doesn't seem like a big deal for boxes that claim to support 128 or 256 APs, or in Cisco's case 300 APs per WiSM blade.

I'd hope they are all deploying customers everyday with this or larger numbers of APs and that they must be functional. So why didn't all the other vendors show up?
wimaxfan 12/5/2012 | 3:35:06 AM
re: 802.11: Fatally Flawed? I agree with your overall point. Every vendor these days is looking for some differentiator and unfortunately focusing on the wrong thing.

Airespace did downlink PCF, Aruba does adaptive PCF and Meru bastardizes the 802.11 MAC by setting NAVs, etc.

As the press and analyst community continues to encourage these dribs and drabs as real differentiation, we will continue to see vendors choosing to push the envelope.

My own bias is with 802.11e. With some aspects of 802.11e implemented correctly, such as WMM and WMM-PS, I think we will be past these silly antics in short order.

I think it's time we said the Wi-Fi link is a commodity, like Ethernet has become, and there is no real differentiation there. Time to move on to other system-related issues and of course new wireless links such as WiMAX, etc. (which I happen to be a fan of, in case you can't tell :-)).
wimaxfan 12/5/2012 | 3:35:07 AM
re: 802.11: Fatally Flawed? Funny how you continue to pick on Aruba. In their defense, Airespace used this PCF technology to show that Wi-Fi throughput can be higher than the theoretical max. See here for their previous results.

Aruba seems to have improved on this with their adaptive radio management algorithms.

In defense of the tests, it's to show what's possible in a lab, and while a lab test is not a real-world test, it is an indicator of system performance in the real world. Listen to the author speak; he makes this point.

Here is the old reference of the prior review.

http://www.networkworld.com/re...

"Airespace's maximum rates are well above 7M bit/sec, which is commonly understood to be 802.11b's theoretical top end. Airespace attributes its high rates to delivery-only point coordination function (PCF), a little-used mechanism in the 802.11 standard that allows for shorter gaps between frames than those in the more widely used distributed coordination function. With smaller gaps, Airespace's forwarding rates are higher.

Ethernet gap size has a somewhat notorious history. Some early switch makers used small gaps to get good scores in performance tests. That is not the case here - PCF is perfectly legal. The downside with PCF is other WLAN stations have less access to the wireless medium. Airespace says that traffic for most WLAN users is mainly downstream (from the access point to clients), and that clients associated to its access points always have some time in which they can send traffic upstream.

No system tripled forwarding rates in tests with three or four access points, even though three access points theoretically should not interfere with one another. Aruba's access points came closest, with aggregate rates that averaged about 95% of triple the single-access-point numbers. Trapeze was next (91%), followed by Airespace (90%) and Symbol (81%). This raises a key issue with WLANs: Capacity will decrease as contention for spectrum grows."


peanutoat 12/5/2012 | 3:35:07 AM
re: 802.11: Fatally Flawed? Mornin',

Normally, I wouldn't find myself spending a Saturday on my computer, but something brought me to it today.

I'll admit you got some good points, but I think you're missing the overall picture. Sure, it's mighty clever of Aruba and Airespace to have figured out that you could use PCF in a benchmark and outpace your competition. And, you can't fault them for doing that, I suppose, if you are the sort of fella who sees the world as what can you get away with without getting caught.

But, no one uses PCF in the real world. There are many good reasons for it. I've been tracking PCF for a while, and its children HCCA and WMM SA, because I always thought that a scheduled access scheme makes some amount of sense. But what everyone is missing is that the existing scheduled access schemes assume that no two schedulers are on the same channel. Cable modems work just fine because there is only one cable head end. Cellular works fine, also, because they can guarantee to avoid two base stations on the same channel.

But WiFi just doesn't work that way. PCF never took off because it was so ridiculously hard to implement in real life. That's why I actually doubt Aruba and Airespace's (past) claims. See, we all know that no real client supports it. And we also all have heard that the chipsets that Aruba and now Cisco use were not designed to act as a point coordinator (PC), and so don't do the timings right. They can fool clients, but they can't actually follow the standard, or so, that's what everyone tells me.

So, I have to ask, what's the point of talking about companies who have figured out how to pass benchmarks but can't actually deploy it? Reminds me of this thing about how gas stations down where I live cheat by knowing that the county measures gas in 5-gallon units, so they make the pumps run fast until every 4 gallons, at which point they catch up to reality. Designing your product for the test isn't the sort of thing that is good for everyone.

And so, I still wait until someone figures out how to really make PCF, or WMM SA, or whatever you want to call it, real. But I know we aren't seeing it with these tests. And since PCF makes things run so much faster than what the real world would have, I think you know that this isn't the real thing, too.
lrmobile_GTHill 12/5/2012 | 3:35:15 AM
re: 802.11: Fatally Flawed? .001% of all packets are affected. That is 1 in 100,000. Whoa... I would have never thought that 1 in 100,000 Wi-Fi packets would have an issue. :)

I was discussing this on the CWNP forums as well. In the PLCP header, there are other checks, like Criss was saying. For example, there are 4 rate bits, 16 combinations. However, there are only 8 OFDM rates. So, if a bit is flipped and doesn't match a valid rate, then the frame is bad (discarded). If the tail bits (supposed to be 6 0's) are not correct, bad frame.
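That sparse code space is easy to quantify. The sketch below uses the 4-bit OFDM RATE encodings from the 802.11a SIGNAL field (reproduced from memory; treat them as illustrative) to count how many single-bit corruptions of a valid rate code still decode as some valid rate.

```python
# Sketch: how the 4-bit RATE field's sparse code space adds a sanity
# check on top of the parity bit. Encodings follow the 802.11a SIGNAL
# field as I recall it; treat them as illustrative.

VALID_RATES = {                      # 4-bit code -> rate in Mbit/s
    0b1101: 6,  0b1111: 9,  0b0101: 12, 0b0111: 18,
    0b1001: 24, 0b1011: 36, 0b0001: 48, 0b0011: 54,
}

def single_bit_flips(code):
    """Return the four codes reachable by flipping one bit of code."""
    return [code ^ (1 << i) for i in range(4)]

# Count single-bit corruptions of valid codes that still decode as
# SOME valid rate, i.e. slip past this sanity check.
total = undetected = 0
for code in VALID_RATES:
    for flipped in single_bit_flips(code):
        total += 1
        if flipped in VALID_RATES:
            undetected += 1

print(f"{undetected}/{total} single-bit rate flips still look valid")
# -> 24/32 single-bit rate flips still look valid
```

As it happens, every valid code ends in a 1 bit, so a flip of that bit is always caught, while a flip of any of the other three bits always lands on another valid code: the rate field's sanity check catches only a quarter of single-bit hits, which is why the parity bit and the other field checks all matter together.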

I wonder what the experts at the IEEE conference said. "Ok, thank you." :)
meshsecurity 12/5/2012 | 3:35:16 AM
re: 802.11: Fatally Flawed? Back, just can't help myself. Oh well, who cares?

Same old people, same old silly tactics...boring.

http://www.unstrung.com/docume...


Ignore...
peanutoat 12/5/2012 | 3:35:17 AM
re: 802.11: Fatally Flawed? I normally don't do this posting thing, but I have to say, this one story got me going. VeriWave and Aruba's comments here remind me of an old advisory I once saw about security, stating that CSMA/CA is a denial of service vulnerability in and of itself.

It's clearly an attempt by someone with limited familiarity with the art of wireless networks to "rush to the head of the class." In this case, Criss, you are dead right that this one bit is a trivial thing. Come on: it's easy to show, with simple math, that this problem matters less to the operation of a real wireless network (as opposed to VeriWave test equipment) than an antenna positioned non-optimally or a person blocking the way.

Anyways, it isn't 802.11a or 802.11g networks that are out of luck. Just 802.11b networks. Oh well. You might as well just think of a successful 802.11b transmission as interference, because they take so much time.

And, whoa, about this Network World test. I just looked it up on Aruba's website. Isn't this the one where Aruba turns on PCF? Gosh, they don't have PCF in their real products. And no client supports them, at least, no client in any of the schools that I consult for. Why should they, considering that it's so old, unsupported, and generally a bad thing for wireless. I feel bad for the author of the article, cause he seems like a nice fella from the words, but I don't get this.
Criss_Hyde 12/5/2012 | 3:35:17 AM
re: 802.11: Fatally Flawed? The only IEEE 802.11 PLCP headers that use a single parity bit are the ones used with the OFDM PHY and the ERP PHY with OFDM data rates. All the other PLCP headers use CRC-16.

Since the OFDM PLCP parity bit is used in conjunction with other sanity checking of PLCP header values the opportunity for all the right bits to flip and not produce a PLCP format error gets even slimmer.

I think we can live with it.