Tuesday, August 14, 2012

Traffic Shaping, part 2

A few days ago I bragged about my beautiful flow control on my home network.  Things were much better than before, but they weren't as good as I thought.

Backups hummed along at 95% of max line speed and interactive traffic usually responded in a half second or so.  But not everything was well in the Federalist household.  You see, sometimes in the evenings if the kids are good we have "screen time."  When this happens my wife usually watches something on Hulu on her laptop, some of the kids watch Netflix on the Wii, and others might watch YouTube on a desktop.  When that happens everything pauses and has to wait to buffer, and interactive latency shoots up to an unacceptable 2-3 seconds.

This vexed me, so I went into my router and looked around.  Problems, but no obvious solutions.  The backup traffic was way over its allocated bandwidth and the normal traffic was nowhere close to its allocation.  Traffic shaping is supposed to fix this, and in my testing it did.  So I did what any geek would do: I started noodling with stuff.  Raise txqueuelen on the vlan.  Lower txqueuelen on the vlan.  Raise it on the physical device.  Lower it on the physical device.  Change burst lengths on the classes.  Nothing helped.

Then I started Googling and found the answer.  I had consulted probably a dozen sites on Linux traffic shaping before I wrote the first script, but they all missed something critical.  They said to measure your bandwidth against a few different sites, figure out what your actual upstream bandwidth really is, and use that as your cap.

Your DSL company provisions exactly the bandwidth they said they would.  I know, you've never gotten within 90% of the advertised bandwidth.  I haven't either.  That's their fault, but it's not because they're lying; it's because the link is inefficient.  And only when you understand exactly how it's inefficient can you traffic shape DSL properly.

The maximum length of a TCP/IP packet in an ethernet frame is 1500 bytes (excluding jumbo frames, which don't apply here).  Ethernet sticks a 14-byte header plus some padding on that, but Linux's traffic shaping modules are clever enough to account for that, so you don't have to worry about it (which is also why, if you watch your stats, you can never sustain the max you set even when it's well below the capacity of your line: the headers count against the rate).  But DSL is actually PPP (PPPoE), so it sticks another 8-byte header inside the ethernet frame, lowering the max per packet to 1492 (but not the size of the transfer).  It's potentially even worse than that, because there could be other information stuck either inside or outside the ethernet packet, but that isn't really the cause of the problem, and it's virtually impossible to get your DSL provider to tell you what the DSL packet really looks like, so I'll pretend it's just 8 bytes.

So you have 1492 bytes being transmitted from your router and 1500 bytes leaving the modem.  But DSL isn't ethernet.  It's carried over the same line that carries the voice traffic, and that uses ATM.  So that the small-packet voice traffic doesn't have to compete with huge data packets, ATM uses fixed 53-byte cells with 48 bytes of payload.

So we take our 1514 bytes (1500 plus the ethernet header) and divide them into 31 cells of 48 bytes (each with a 5-byte header) and one cell of 26 bytes (padded to 48 bytes, with a 5-byte header).  Now our router sent 1492 bytes of data (which it counted as 1492+14), but that takes up 32*53 = 1696 bytes on the line.  Meaning we get to use about 88% of the bandwidth outbound from the modem.  This is where those numbers from our speed tests came from.

But that's the maximum length of an ethernet frame (which also happens to be the easiest thing to speed test with).  What about the minimum?  The minimum TCP/IP packet is 20 bytes of IP header plus 20 bytes of TCP header and no payload, for 40 bytes.  That happens to be what an ACK looks like, which happens to be pretty much the only thing you send back to a streaming video provider while you're watching a video.  When we packetize that for DSL/ATM we take 40 bytes and add an 8-byte PPP header and a 14-byte ethernet header, for 62 bytes.  Then we divide that into one cell carrying 48 bytes and one carrying 14 bytes (padded to 48), each with a 5-byte header.  So our router counted our 40-byte packet as 54 bytes, but it really took 106.  That means every single ACK we send back to Netflix, Hulu, YouTube, etc. takes twice as much DSL bandwidth as the router accounted for.

You don't notice this normally because ACKs are small and they're only sent roughly once per round-trip time to the other side (on DSL, over 100ms), so on a single connection we're talking maybe 10kbits per second.  With multiple continuous downloads (which is what streaming video looks like when observed as raw bandwidth) we're adding 40k, but counting it as 20k.  Again, this wouldn't normally be a problem, but we were letting the low priority traffic use all the available bandwidth, so now suddenly we're asking the DSL modem to send 400k per second on a 384k link and it's throwing stuff away randomly, causing retransmits and latency and all that stuff we were trying to avoid.
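The arithmetic in the last two paragraphs is easy to check.  A few lines of shell (just the math, not anything from my shaping script) reproduce both numbers:

    #!/bin/sh
    # Recompute the numbers above: the router counts the ethernet frame
    # (IP packet + 14 bytes), but the line carries 8 more bytes of PPP
    # chopped into 53-byte ATM cells holding 48 bytes of payload each.
    for ip_len in 1492 40; do
        counted=$(( ip_len + 14 ))          # what the router accounts for
        wrapped=$(( ip_len + 8 + 14 ))      # after the PPP + ethernet headers
        cells=$(( (wrapped + 47) / 48 ))    # round up to whole cells
        on_wire=$(( cells * 53 ))           # plus a 5-byte header per cell
        echo "$ip_len-byte packet: counted as $counted bytes, sent as $on_wire"
    done

That prints 1506 versus 1696 for a full-sized packet and 54 versus 106 for a bare ACK.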

So we could fix this by lowering the bandwidth cap on the router to half our provisioned bandwidth.  That would be obtuse beyond belief, though, because the large packets that make up most of our bandwidth by volume would then be wasting half of an already small pipe.  It turns out Linux comes to our rescue again.  The htb qdisc (which I was already using) and the stab option in tc (which isn't available on the version of OpenWRT I'm using) both provide a way to add a fixed per-packet overhead and even to account for the waste at different packet sizes.
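On the htb side that comes down to a couple of extra keywords on the class's rate definition; something like the line below, where the interface and class ids are placeholders rather than the ones from my script:

    # Tell htb's rate table about the 8 bytes of PPP overhead and the
    # 48-byte ATM cell quantization when it costs out each packet.
    tc class add dev eth0.2 parent 1: classid 1:1 \
        htb rate 384kbit overhead 8 linklayer atm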

So now I have a new script that shapes to the full 384kbit outbound but sets "overhead 8" and "linklayer atm" to tell Linux how the DSL modem is going to mangle the traffic.  I've also gotten rid of the Wii rules and replaced them with a rule that just prioritizes all ACKs, which I suspect will give me high-priority streaming video without having to actually identify streaming video (and, as a bonus, keep downloads bound by the downstream instead of the upstream).  I'm sure I'll find faults with this, but in my testing it performs beautifully.  Even with uploads running at 97% of capacity I'm seeing latency numbers that look like an idle pipe.
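For reference, the usual way to pick out bare ACKs with tc is a u32 filter that matches small TCP packets (20-byte IP header, total length under 64 bytes) with only the ACK flag set and steers them into the interactive class.  Something like this, with example device and class ids rather than the exact ones from my script:

    # Match small pure-ACK TCP packets and put them in the interactive class.
    tc filter add dev eth0.2 parent 1: protocol ip prio 10 u32 \
        match ip protocol 6 0xff \
        match u8 0x05 0x0f at 0 \
        match u16 0x0000 0xffc0 at 2 \
        match u8 0x10 0xff at 33 \
        flowid 1:10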

EDIT: This ended up not working out as well as I wanted, so I upgraded the router to kernel 2.6 and used the stab function to recompute packet sizes on enqueue, and it's worked out fantastically.  In my test last night I ran Netflix, Hulu, and YouTube simultaneously on three different computers while running an unrestricted upload with the bulk traffic flag set.  None of the videos paused at all, and latency on ssh traffic was about 5% above an idle link.  I've updated the script above to the new 2.6 one.
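For the curious, the stab version amounts to hanging the size table off the root qdisc so every class underneath sees the recomputed packet size; roughly like this, again with placeholder names rather than the exact ones from my script:

    # stab rewrites the accounted packet size at enqueue: add 8 bytes of
    # PPP overhead, then round up to whole 53-byte ATM cells.
    tc qdisc add dev eth0.2 root handle 1: stab overhead 8 linklayer atm \
        htb default 20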
