Wednesday, August 15, 2012

More debate debates

A story came out late yesterday that a group of Democrats had written to the "Commission on Presidential Debates" requesting that it not bring up the Simpson-Bowles recommendations in the debates.  This evidently comes after a group of Republicans had written requesting that the candidates be asked specifically which parts of Simpson-Bowles they agree with.

I have thoughts on whether or not the National Commission on Fiscal Responsibility and Reform (the formal name of the Commission Obama put Simpson and Bowles in charge of) is actually relevant in the Presidential debates, but that's not my real issue here.  My real issue is that a private corporation initially established by the heads of the parties is accepting suggestions from individual congressmen on what the American people should or should not hear about during the Presidential debates.

Can we not get back to a format where candidates lay out their own cases and try to rebut the other side without the circus of 3 debates plus 1 vp debate, one of which is a "town hall" where questions are selected by a biased, but supposedly impartial, selector from the general public, all moderated by biased, but supposedly impartial, journalists?

Mr. Lincoln, do you prefer boxers or briefs?
Mr. Douglas, you have set up land grants to favor railroad expansion in Chicago.  As President, will you continue to support the railroads?


Tuesday, August 14, 2012

Traffic Shaping, part 2

A few days ago I bragged about my beautiful flow control on my home network.  Things were much better than before, but they weren't as good as I thought.

Backups hummed along at 95% of max line speed and interactive traffic usually responded in a half second or so.  But not everything was well in the Federalist household.  You see, sometimes in the evenings, if the kids are good, we have "screen time."  When this happens my wife usually watches something on Hulu on her laptop, some of the kids watch Netflix on the Wii, and others might watch YouTube on a desktop.  When that happens everything pauses and has to wait to buffer, and interactive latency shoots up to an unacceptable 2-3 seconds.  This vexed me, so I went into my router and looked around.  I found problems, but no obvious solutions.  The backup traffic was way over its allocated bandwidth and the normal traffic was nowhere close to its allocation.  Traffic shaping is supposed to fix this, and in my testing it did.  So I did what any geek would do: I started noodling with stuff.  Raise txqueuelen on the vlan.  Lower txqueuelen on the vlan.  Raise it on the physical device.  Lower it on the physical device.  Change burst lengths on the classes.  Nothing helped.

Then I started Googling and found the answer.  I consulted probably a dozen sites on using Linux traffic shaping before I wrote the first script, but they all missed something critical.  They said to measure your bandwidth with different sites and figure out what your actual upstream bandwidth is and use that as your cap.

Your DSL company provisions exactly the bandwidth they said.  I know, you've never gotten within 90% of the advertised bandwidth.  I haven't either.  That's their fault, but it's not because they're lying, it's because they're inefficient.  And only when you understand exactly how can you traffic shape DSL properly.

The maximum length of a TCP/IP packet on an ethernet frame is 1500 bytes (excluding jumbo frames, because they don't apply here.)  Ethernet sticks a 14 byte header plus some padding on that, but Linux's traffic shaping modules are clever enough to figure that out, so you don't have to worry about it (which is why if you watch your stats, even if you set your max well below the capacity of your line you can never sustain it).  But DSL is actually PPP, so it sticks another 8 byte header inside the ethernet frame, lowering the max per packet to 1492 (but not the size of the transfer).  It's actually potentially worse than that because there could be other information stuck either inside or outside the ethernet packet but this isn't really the cause of the problem, and it's virtually impossible to get your DSL provider to tell you what the DSL packet really looks like, so I'll pretend it's just 8 bytes.

So you have 1492 bytes being transmitted from your router and 1500 bytes leaving the modem.  But DSL isn't ethernet.  It's being carried over the same line that carries the voice traffic, and that uses ATM.  So that the small-packet voice traffic doesn't have to compete with huge data packets, ATM uses fixed 53-byte cells with 48 bytes of payload each.

So we take our 1514 bytes (1500 plus the ethernet header) and divide it into 31 cells of 48 bytes (with 5 byte headers) and one with 26 bytes (padded to 48 bytes, with a 5 byte header).  Now our router sent 1492 bytes in data (which it counted as 1492+14), which takes up 32*53=1696 bytes.  Meaning we get to use about 88% of the bandwidth outbound from the modem.  This is where those numbers from our speed test came from.

But that's the maximum length of an ethernet frame (which also happens to be the easiest thing to speed test with).  What about the minimum?  The minimum TCP/IP packet is 20 bytes for the IP header plus 20 bytes for the TCP header and no payload, for 40 bytes.  This happens to be what an ACK looks like, which happens to be pretty much the only thing you send back to a streaming video provider while you're watching a video.  When we packetize that for DSL/ATM we take 40 bytes, add an 8 byte PPP header and a 14 byte ethernet header for 62 bytes.  Then we divide that up into one cell of 48 bytes and one cell of 14 bytes (padded to 48), each with a 5 byte header.  So our router counted our 40 byte packet as 54 bytes, but it really took 106.  That means every single ACK we send back to Netflix, Hulu, YouTube, etc. takes twice as much DSL bandwidth as the router accounted for.  You don't notice this normally because ACKs are small and they're only sent roughly once per round-trip-time to the other side (on DSL, over 100ms) so on a single connection we're talking maybe 10kbits per second.  With multiple continuous downloads (which is what streaming video looks like when observed as raw bandwidth) we're adding 40k, but counting it as 20k.  Again, this wouldn't normally be a problem, but we were letting the low priority traffic use all the available bandwidth so now suddenly we're asking the DSL modem to send 400k per second on a 384k link and it's throwing stuff away randomly, causing retransmits and latency and all that stuff we were trying to avoid.
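
The cell arithmetic above is easy to check.  Here's a minimal Python sketch of it, assuming (as I did above) exactly 8 bytes of PPP overhead and a 14 byte ethernet header and nothing else.  The function and constant names are made up for illustration; they aren't part of any tool.

```python
import math

ETHERNET_HEADER = 14   # bytes the router adds and accounts for
PPP_HEADER = 8         # assumed PPP overhead; the real figure may be larger
CELL_PAYLOAD = 48      # payload bytes per ATM cell
CELL_SIZE = 53         # payload plus the 5-byte cell header


def router_bytes(ip_bytes):
    """Size the router accounts for: the IP packet plus the ethernet header."""
    return ip_bytes + ETHERNET_HEADER


def wire_bytes(ip_bytes):
    """Bytes actually sent on the DSL line once the frame is cut into cells."""
    framed = ip_bytes + PPP_HEADER + ETHERNET_HEADER
    cells = math.ceil(framed / CELL_PAYLOAD)  # the last cell is padded to 48
    return cells * CELL_SIZE


# Full-size packet and bare ACK: IP size, what the router counts, what the line carries.
for ip_bytes in (1492, 40):
    print(ip_bytes, router_bytes(ip_bytes), wire_bytes(ip_bytes))
```

For the full-size packet that's 1506 bytes counted against 1696 on the wire; for the bare ACK it's 54 counted against 106.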

So we could fix this by lowering the bandwidth cap on the router to half our provisioned bandwidth.  It would be obtuse beyond belief, though, because then on the large packets that make up most of our bandwidth by volume we're wasting half the already small pipe.  It turns out Linux comes to our rescue again.  The htb qdisc (which I was already using) and the stab function on traffic control (which isn't available on the version of OpenWRT I'm using) both provide a way to add extra per-packet overhead and even to account for the padding waste at different frame sizes.
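
To put rough numbers on that tradeoff, here's a Python sketch comparing a blanket halved cap with per-packet cell accounting.  It assumes the same 8 byte PPP and 14 byte ethernet overheads used in this post, and the helper names are hypothetical.  It only models the arithmetic; it is not what htb or stab do internally.

```python
import math

LINK_BITS_PER_SEC = 384_000  # provisioned upstream rate from this post


def wire_bytes(ip_bytes, ppp=8, ethernet=14):
    """Bytes on the DSL line: whole 53-byte cells carrying 48 bytes each."""
    return math.ceil((ip_bytes + ppp + ethernet) / 48) * 53


def link_utilization(cap_bits_per_sec, ip_bytes, counted_bytes):
    """Fraction of the real link used when the shaper charges counted_bytes
    per packet against cap_bits_per_sec."""
    packets_per_sec = cap_bits_per_sec / 8 / counted_bytes
    return packets_per_sec * wire_bytes(ip_bytes) * 8 / LINK_BITS_PER_SEC


# Blanket fix: halve the cap, keep charging IP + ethernet header per packet.
halved = link_utilization(LINK_BITS_PER_SEC / 2, 1492, 1492 + 14)
# Per-packet fix: keep the full cap, charge what the packet really costs.
aware = link_utilization(LINK_BITS_PER_SEC, 1492, wire_bytes(1492))
print(f"halved cap: {halved:.0%}, cell-aware: {aware:.0%}")
```

Under these assumptions the halved cap keeps an all-ACK load just under the line rate (about 98%), which is why it would work, but it leaves full-size packets using only about 56% of the link.  Charging each packet its real cell cost lets both kinds of traffic fill the link without overrunning it.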

So now I have a new script that provides a full 384kbit outbound but sets "overhead 8" and "linklay atm" to tell linux how the DSL modem is going to mangle the traffic.  I've also gotten rid of the Wii rules and replaced them with a rule that just prioritizes all ACKs, which I suspect will give me high priority streaming video without having to actually identify streaming video (and as a bonus keep downloads downstream bound instead of upstream).  I'm sure I'll find faults with this, but in my testing it performs beautifully.  Even with uploads running at 97% of capacity I'm seeing latency numbers that look like an idle pipe.

EDIT: This ended up not working out as well as I wanted, so I upgraded the router to kernel 2.6 and used the stab function to recompute packet size on enqueue, and it's worked out fantastically.  In my test last night I was running Netflix, Hulu, and YouTube simultaneously on three different computers while running an unrestricted upload with the bulk traffic flag set.  None of the videos paused at all and latency on ssh traffic was about 5% above an idle link.  I've updated the script above to the new 2.6 one.

Tuesday, August 7, 2012

Passwords or "You're not paranoid if they're really out to get you"

After writing my first two posts on backups I was wondering if I was overly paranoid having not just a primary and backup storage, but primary and three separate backups.  This story convinced me I'm not. The writeup is excellent and you should read all of it.  I'll wait.

This is an excellent example of why you shouldn't trust somebody else's security model even to be what they claim it is.  If you have either the account and password for a CrashPlan account or access to the system itself, you can delete all its CrashPlan backups.  These particular hackers don't appear to have cared what was actually on the laptop they were deleting, but I feel a whole lot more comfortable having a semi-recent copy of my data offline and recoverable even if CrashPlan's security is compromised.

I'll note, this is not a screed against CrashPlan's security.  As near as I can tell it's about as good as anyone else's.  I would have required a separate authentication on the system to delete the backups from the cloud (and no, setting "require account password to run CrashPlan desktop" is not sufficient.  I've set that and turned the network off, it's authenticating against a locally stored hash which means it can be bypassed using the locally stored credentials), but even if it were implemented exactly how I want, I still wouldn't trust it.

The author concludes "My experience leads me to believe that cloud-based systems need fundamentally different security measures. Password-based security mechanisms — which can be cracked, reset, and socially engineered — no longer suffice in the era of cloud computing."  I disagree. The problem isn't password-based security mechanisms.  Brute forcing a password is a terrible way to hack an account.  Even a weak password would take days on a badly configured service.  A horrible password (say a dictionary word, a single digit, and a single special character from the top row of the keyboard) would take 8 hours at 10 attempts per second, which ought to prompt any reasonable service to go into lockdown.  The problem is with the non-password methods we've developed to make resetting your too-secure password easier.
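
For a sense of scale, here's a back-of-the-envelope Python sketch of online guessing time.  The 10 attempts per second is the figure I used above; the search-space sizes are illustrative assumptions, not numbers from any real service.

```python
def hours_to_exhaust(search_space, attempts_per_sec=10):
    """Worst-case hours to try every candidate at a fixed online guess rate."""
    return search_space / attempts_per_sec / 3600


# At 10 guesses a second, 8 hours covers 288,000 candidates.
print(hours_to_exhaust(288_000))

# Assumed for illustration: 8 random characters drawn from 92 printable keys.
print(hours_to_exhaust(92 ** 8) / 24 / 365, "years")
```

Anything that runs for hours at a steady 10 failed logins a second is exactly the kind of pattern a lockout should catch long before the search finishes.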

Apple's security failings are unforgivable.  Maybe ten years ago I could forgive Apple for using the last four digits of a credit card as some sort of secure PIN.  If they used the full number I would think it unacceptable, but forgivable.  Pretty much every modern system prints the last four as an insecure verification.  The reason Apple used the last four digits is that Payment Card security standards don't let them publish the entire number outside the secure area.  In other words, the full number is important security information, but you can publish the last four digits to your customer service personnel because they're not sufficiently identifying to pose a risk to the customer's identity.  But if they're not identifying enough to compromise the customer's identity by publishing them, why does Apple think they're identifying enough to give away the user's account?  Apple's posture is that you have a password for your account, but if you don't have that they'll take your less secure "security questions", and if you don't have those they'll take a matter of public record and a number that's probably printed on a dozen receipts you threw in the trash or left at the gas pump.  Amazon's is worse from an authentication point of view, but not as comprehensive.  You don't need to authenticate yourself at all to add information to the account, and that information can be used to authenticate yourself afterwards.  These are both absolutely boneheaded setups that should have been caught immediately.

I currently consider Google the Gold Standard for current internet security.  ING actually has more security and I want to consider it first.  A quick perusal of the mint.com forums will show you all sorts of people trying to bypass ING's security system.  ING requires security questions just to get to the password entry dialog if you're at an IP they don't recognize, they only accept numeric passwords (because they're less likely to be your wife's name) and they have a custom interface for entering them that makes it basically impossible for a browser to cache it.  I don't mind this for my bank account, but it would be extremely annoying for the Photoshop user forum.  (I, in fact, think autocomplete disabling is vastly overused)  I've forgotten my ING password before; it doesn't matter if you know your security questions, they won't even ask.  They snail mail you a new password to your registered mailing address in a completely nondescript envelope that doesn't even say ING on it.  This takes days, but you have to admit it's a lot harder to surreptitiously sort through a victim's USPS mail than it is to guess their first car (and it has the side benefit that going through somebody else's mail is a felony even if they fail to actually take over your account).

The problem with this, like most information security problems, is that people are willing to trade security in the abstract for convenience in the immediate.  Only the most computer savvy are going to be as forgiving as Mr. Honan and say "shame on me for poor security" when they lose all pictures of their kid, but they're not going to use an email service that locks them out for a week when they forget their password, either.  My problem with Amazon and Apple isn't that they weren't up to ING's level of annoyingness, it's that they made it impossible to be secure.

As I said before, Google is probably the best at this.  The interesting thing about Google is that they know almost nothing about you (yeah, I know, Google knows everything, but when you set up your account they didn't ask for a mailing address, a credit card, or even your real name) but they realize that you probably use your Google account for a lot of stuff and you might use your Gmail address for password recovery on various things, so it's important that they not give away your account.  It's sort of ironic that the goal of the Amazon and Apple hacks was to get to a Google account.  Amazon and Apple both knew vastly more about him than Google.  They could, like ING, have paper mailed reset credentials to his billing address, but they were the entry point because they were far easier nuts to crack than Google.  And if he had had two-factor on Google they would have been insufficient.  The vandals could have ordered thousands of dollars of merchandise, but they couldn't have gotten into his email.

Google basically has two levels of security.  With the default level of security you log in with only a password.  With two-factor security there is additionally a smart-phone app that generates time-based tickets and a sheet of paper with backup tickets in case your phone dies.  When you set up your account they ask you for up to three ways to retrieve a lost password:  a cell phone, an email, and a security question (which you can choose).  If you have two-factor authentication you have to enter a code from either your phone or that piece of paper to execute a retrieval (and the phone doesn't count as a retrieval option).  If you can't do this then they make you go through a drawn out process of answering questions based on the contents of your account, preferably from an IP from which you have used Gmail in the past.  Despite all this, they should still be better.  When I got back from a business trip to RedHat's headquarters recently I had notices in my inbox that Google had noticed suspicious source IPs logging in while I was away (from RedHat's headquarters).  If Google had been suspicious of the fact that a reset request was coming for Mr. Honan's account from an IP he had never used, they could have prevented his Google account from being taken over (though the most important damage, at Apple, would already have been done).
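
For the curious, the time-based tickets aren't magic.  The sketch below assumes the app implements the standard TOTP scheme from RFC 6238 (HMAC-SHA1, 30 second steps, 6 digits), which is what Google Authenticator uses.  The function name and the shared secret are just for illustration.

```python
import hashlib
import hmac
import struct
import time


def totp(secret, now=None, step=30, digits=6):
    """Derive a time-based one-time code from a shared secret (RFC 6238)."""
    if now is None:
        now = time.time()
    counter = int(now) // step            # number of whole steps since the epoch
    msg = struct.pack(">Q", counter)      # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F            # dynamic truncation from RFC 4226
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# Test secret from the RFC 6238 appendix, not a real credential.
print(totp(b"12345678901234567890"))
```

The phone and the server each compute the code from the shared secret and the clock, so the secret itself never has to be typed in at login.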

So what lessons can we draw from this?

For companies:
  1. State why you use security questions.  I care whether you're like ING and you might need a security question to access the account from an unknown location or if you're just using it for password resets.  Offer reasonable suggestions for them ("What year and model was your first car?" versus something more likely to be on the internet like "What's your pet's name?") but allow the user to type in his own.  I usually make this random gibberish, because I'd usually rather destroy an account than have it compromised, but if I set one I want it to be extremely complicated.  "What's your 10th grade math teacher's last name and the name of the street where you lived in 1995?"  Don't require them, but make it clear that it's going to be a pain to reset a password without them.
  2. Preferably require both the security question and a retrieval email to reset the password.  eBay does this.  I'm much more comfortable with you sending a password reset to my registered email after I've entered the model of my first car than with only one of them.
  3. If somebody doesn't know their password and can't access the standard retrieval mechanisms, be very suspicious.  They've already gone a long way to proving that they're not who they're claiming to be; don't trust them just because they know a billing address or a matter of public record (cough, Apple).  There was a comment on the original article that somebody was really happy with Amazon that they reset his AWS password with only his billing address after he forgot his password and entered gibberish as his security code, but in retrospect he's pissed.  He should be.  It should be hard to recover a password if you don't have the recovery options.  My preference would be to send them a password reset via USPS to their registered mailing address.  If you don't have a credit card on file, use a human to process it (Apple and Amazon both did this) and require something that only the account owner would know (neither Apple nor Amazon did this): the list of folders in your email, for instance.
  4. Track where your users log in from.  Treat logins from unusual locations differently.  This doesn't necessarily mean deny them, but certainly be suspicious.  If somebody is trying to read their email from Nevada when they're usually in South Carolina, they might be on a business trip.  If they're trying to change the shipping address on a package and reset the password on the account, maybe you should require more authentication from them.
  5. Don't disable credential caching.  This is controversial and I'm somewhat torn about it.  I realize the number of browser based attacks out there, but let's face it, the options for your average user aren't a super-secret password they cache in their browser or the same password they remember.  If you're lucky, it's a decent random password that gets cached or the same password they use for their schnoodle owner's forum (which happens to be "schnoodle07" because they got their schnoodle in 2007).
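
Items 2 through 4 boil down to asking for more proof as a request gets stranger.  Here's a minimal Python sketch of that idea.  Every name in it is hypothetical and the policy is my illustration, not how any of the companies mentioned actually work.

```python
def extra_proof_needed(usual_locations, location, knows_password, sensitive):
    """Return the extra checks to demand before honoring a request.

    All check names are hypothetical labels for illustration.
    """
    checks = []
    if not knows_password:
        # Item 2: a reset takes both the security answer and the registered email.
        checks += ["security_question", "registered_email"]
    if location not in usual_locations and (sensitive or not knows_password):
        # Items 3 and 4: an unusual location plus a risky request gets the slow path.
        checks.append("postal_mail_or_human_review")
    return checks


usual = {"South Carolina"}
print(extra_proof_needed(usual, "Nevada", True, False))   # reading email on a trip
print(extra_proof_needed(usual, "Nevada", False, True))   # reset from a strange place
```

The business traveler who knows the password gets through untouched; the password reset from a place the owner has never been gets everything thrown at it.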
For users:
  1. All the normal stuff about good passwords and bad passwords.  A good password is complicated, random, and only used once.  "d1pU{x,0D.2," is a great password if you're going to store it in your browser's credential cache anyway; "gawkier729'acted" is almost as good and easier to remember if you're going to be typing it in. (Well, actually they're both horrible, because I already used them, but you get the idea.)
  2. If you're asked for a security question, understand how they're going to be used, preferably by testing it, and set it to random gibberish if it's sufficient to reset your password.  As I said above in #2, I'm good with a security question being required to send a retrieval email.  I'm not okay with me having to enter a 30 character password every time I login when the actual security of the account is limited to around 300 models of cars I could possibly have owned as my first car (assuming a Ferrari is really a valid first car).  To get a feel for just how insecure that is, there are 46 normal keys on a keyboard (26 letters, ten numbers, ten punctuation marks) which shifted gives you 92 possibilities.  Which means a 2 character password (92**2) is about 30 times more secure than all models of car ever made.  Last names fare a bit better, but you're still well under 3 characters worth of entropy.  Luckily most sites use email recovery instead of questions.
  3. Your bank account password is not the most important password you have, the email account for your bank account's reset is.  The author of the original article recommends this be a distinct email.  I'm not sold on that: if you do that and don't check it then you don't get reset notices, which is just as problematic, plus since places like Amazon use the primary account email for resets you also don't see notices that something has been purchased on your accounts.  What is completely necessary is making sure that the account to which your recovery passwords are sent is completely secure.  That means it needs a hard to guess password that's not used anywhere else, a recovery email that's just as secure, recovery questions that are impossible to figure out from public record, and be somewhere where they're not going to give it away without cracking that.
  4. The login for your various accounts should not be identical to your primary email address.  There's really no reason for you to use your primary email address for your Amazon account.  Gmail (and some other email providers) gives you the option of appending random strings to your email address and still having it delivered just like normal email (in fact, it's easier to filter this way).  If your email is joebob@gmail.com, your Amazon account can be joebob+mnhq@gmail.com and it will go to your Gmail account just like normal, your browser will most likely just cache it, and it's much harder for somebody to get Amazon to reset your account because they now have half a million email addresses to try (26 letters to the fourth power).  This is not true if your Amazon account is joebob+amazon@gmail.com.  That's better than joebob@gmail.com, but only marginally.  This has the side benefit that when somebody sells your address to spammers they likely aren't smart enough to figure this out, so you can figure out who it is by what suffix they used.
  5. Don't trust anybody else with something you can't recover if they screw up.  That's how I started this.  I read an article a while ago about some hacker who was supposedly just a system or so away from hacking computers with nuclear launch capabilities.  I was horrified that a system with nuclear launch capabilities was internet connected.  I would never willingly allow a company to remotely take down my desktop and I do my best to secure it, but I'm smart enough to know that if it's connected to the internet, it's open to attack.  The copy of it sitting in a drawer is a great deal harder.
  6. Don't trust Apple, at all.  This may seem unfair but this isn't a normal hack.  It's a major, fundamental flaw in their entire user security posture.  You might think I'm being unfair in not giving Amazon the same treatment, but I'm not.  I went into Amazon after this and tried to ship to another address using my current credit card.  You can't do it.  Amazon was boneheaded, and they should fix it, but the extent of the compromise is that they gave out the contents of his Kindle and what every gas station prints on your receipt.  Apple gave away the contents of an email account and allowed a hacker to erase a laptop using only information printed on that gas station receipt.
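
The arithmetic in items 2 and 4 of the user list is easy to check.  A quick Python sketch, using the 92 keyboard characters and roughly 300 car models quoted above (both are my estimates, not hard data):

```python
import math

KEYBOARD_CHARS = 92   # 46 keys, shifted and unshifted
CAR_MODELS = 300      # rough guess at possible "first car" answers


def equivalent_chars(choices):
    """Length of a random keyboard password with the same number of guesses."""
    return math.log(choices) / math.log(KEYBOARD_CHARS)


print(KEYBOARD_CHARS ** 2 / CAR_MODELS)   # 2-character password vs. car models
print(equivalent_chars(CAR_MODELS))       # the car question, in characters
print(26 ** 4)                            # four-letter suffixes after the "+"
```

So the first-car question is worth a bit over one random character, and four random lowercase letters give 456,976 possible addresses, which is the "half a million" above.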