802.11: Fatally Flawed?
The flaw turned up in a series of wireless LAN scalability tests carried out by VeriWave in conjunction with Network World magazine. Members of the Institute of Electrical and Electronics Engineers Inc. (IEEE), which ratified the 802.11 standard and is in the process of bringing out extensions including the much-anticipated 802.11n standard for wireless broadband, were previously unaware of the glitch, says VeriWave founder and CTO Tom Alexander.
"What we discovered is that as the transmitter retries sending dropped packets, some packets are being lost," explains Alexander. "It's extremely small, around .001 percent, but it's never zero. That's not what the protocol says; the loss should be zero."
Some degree of packet attrition is inherent in any transmission over an 802.11 network; that's why the standard builds in robust error-checking and retransmission mechanisms. In theory, even high rates of signal errors should be corrected as dropped packets get re-sent. These retries occur in such minute time periods -- a few microseconds, in typical 802.11g networks -- that they're invisible to users.
The problem turned up by VeriWave, in fact, happens during these retries. The flaw is not in the actual content of the transmission itself but in the physical layer convergence procedure (PLCP) header, which carries crucial information about each packet including its length and the speed at which it's transmitted. The PLCP header can be thought of as a label on each packet that gives the receiver a heads-up, as it were, for that specific packet -- "This is a 100-byte packet transmitting at 54 Mbit/s," for instance.
The main packet has a 32-bit "cyclic redundancy check," or CRC, which generates a checksum -- a value computed from the packet's contents that the receiver recomputes and compares to detect errors after transmission. The CRC virtually guarantees that a corrupted packet won't be accepted as good. The PLCP header, however, has only a single bit for error checking. That means corruption in the header is far more likely to slip through undetected, and it means that the receiver sometimes gets garbled information about the size and transmission rate of the accompanying packet.
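To see why a single check bit is so much weaker than a 32-bit CRC, consider a rough sketch in Python. It assumes the single bit is an even-parity bit over the header's rate and length bits; the 17-bit header pattern and the all-zero 100-byte payload are made up for illustration, and zlib's CRC-32 stands in for the packet's CRC.

```python
import zlib

def even_parity_bit(bits):
    """Return the parity bit that makes the total number of 1s even."""
    return sum(bits) % 2

def flip(bits, positions):
    """Return a copy of bits with the given positions inverted."""
    out = list(bits)
    for p in positions:
        out[p] ^= 1
    return out

# Illustrative 17-bit header (rate and length bits); the values are invented.
header = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0]
sent_parity = even_parity_bit(header)

# One flipped bit changes the parity, so the receiver notices.
single_error_detected = even_parity_bit(flip(header, [3])) != sent_parity

# Two flipped bits cancel out, so the garbled header looks valid.
double_error_detected = even_parity_bit(flip(header, [3, 9])) != sent_parity

# The same kind of two-bit damage in the payload changes the 32-bit CRC.
payload = bytes(100)                 # assumed 100-byte payload of zeros
sent_crc = zlib.crc32(payload)
damaged = bytearray(payload)
damaged[10] ^= 0b00000101            # flip two bits in one byte
crc_detects_double = zlib.crc32(bytes(damaged)) != sent_crc

print(single_error_detected, double_error_detected, crc_detects_double)
```

In this sketch any even number of flipped header bits passes the parity check, which is how a header with the wrong length or rate can be accepted as genuine.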
The 100-byte packet at 54 Mbit/s cited above, for instance, might "get converted into what looks like a very long byte stream, at a low bit-rate," says Alexander. Thus a packet might normally take a few dozen microseconds to transmit, but in this case the corrupted PLCP header will cause the receiver to keep listening for the packet for several milliseconds -- a huge difference in the infinitesimal time scales of high-speed networking.
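The arithmetic behind that gap can be sketched in a few lines of Python. The garbled values here are assumptions chosen for illustration: 4,095 bytes is the largest length a 12-bit length field can hold, and 6 Mbit/s is the lowest 802.11a/g OFDM rate. The calculation covers payload bits only and ignores preamble and header overhead, so the real figures would be somewhat higher.

```python
def payload_airtime_us(length_bytes, rate_mbps):
    """Time to send the payload bits, in microseconds (overhead ignored)."""
    return length_bytes * 8 / rate_mbps

# What was actually sent: 100 bytes at 54 Mbit/s.
real_us = payload_airtime_us(100, 54)

# What a corrupted header might claim (assumed worst case).
garbled_us = payload_airtime_us(4095, 6)

# Extra time the receiver spends waiting for a packet that is not there.
blind_us = garbled_us - real_us

print(f"real: {real_us:.1f} us, claimed: {garbled_us:.1f} us, "
      f"blind: {blind_us / 1000:.2f} ms")
```

Under these assumptions the receiver waits about 5.4 milliseconds for a transmission that really finished in roughly 15 microseconds.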
"The time period is wrong, and during that time the receiver can only listen for that specific packet," says Alexander. "All the retries that occur during the blind period don't get seen. The net result is that packets are lost."
The loss issue, which VeriWave attributes to a basic design flaw, has implications not just in the esoteric world of network testing but in real-world situations as well: If the loss occurs during a handoff between access points on a network -- a procedure known as an "Extensible Authentication Protocol handshake" -- it can cause dropped connections for laptops or handsets as the system takes 30 seconds or so to reset itself.
VeriWave and Aruba Networks Inc. (Nasdaq: ARUN) presented their diagnosis of the problem to IEEE members at the Sept. 19 meeting in Melbourne, and it's thought that the 802.11n standard -- on which many service providers and vendors are banking to take wireless networking to a new level -- can be revised to add more robust parity check mechanisms to the PLCP header and avoid the packet loss. Existing .11a, b, and g systems, however, are out of luck.
Users, says Alexander, "will just have to live with it."
— Richard Martin, Senior Editor, Unstrung