Open Source Quality Challenge Redux
I don’t have time to do a long blog post on Heartbleed, the new flaw in OpenSSL, but there’s one notion going around that needs to be squashed. Specifically, some people are claiming that open source software is inherently more secure:
Because so many people are working on the software, that makes it so it’s less susceptible to problems. For security it’s more important in many ways, because often security is really hard to implement correctly. By having an open source movement around cryptography and SSL, people were able to ensure a lot of basic errors wouldn’t creep into the products.
Not so. What matters is that people really look, and not just with their eyes, but with a variety of automated static and dynamic analysis tools.
Secure systems require more than that, though. They require a careful design process, careful coding, and careful review and testing. All of these need to be done by people who know how to build secure systems, and not just write code. Secure programming is different and harder; most programmers, however brilliant they may be, have never been taught security. And again, it’s not just programming and it’s not just debugging; design—humble design—matters a great deal.
I wrote about this problem in the open source community five years ago. I haven’t seen nearly enough change. We need formal, structured processes, starting at the very beginning, before we’ll see dramatic improvement.
Heartbleed: Don't Panic
There’s been a lot of ink and pixels spilled of late over the Heartbleed bug. Yes, it’s serious. Yes, it potentially affects almost everyone. Yes, there are some precautions you should take. But there’s good news, too: for many people, it’s a non-event.
Heartbleed allows an attacker to recover a random memory area from a web or email server running certain versions of OpenSSL. The question is what’s in that memory. It may be nothing, or it may contain user passwords (this has reportedly been seen on Yahoo’s mail service), cryptographic keys, etc. From a theoretical perspective, the latter is the most serious; an attacker can impersonate the site, read old traffic that’s been recorded, etc. (Besides, cryptographers take key leakage very personally; that keys won’t leak is one of our core assumptions.) Is this a real risk, though? For many people, the answer is no.
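To make the failure concrete, here is a minimal sketch in C of the kind of length check whose absence produces this sort of leak. It is not OpenSSL's code: the function name `echo_heartbeat` and all of its parameters are hypothetical. The assumption is a simplified heartbeat exchange in which the peer sends a payload together with a length field it claims for that payload, and the server echoes that many bytes back. If the claimed length is trusted without being compared to the number of bytes that actually arrived, the reply gets padded with whatever sits in adjacent memory.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified heartbeat echo; not OpenSSL's actual code.
 * claimed_len  is the payload length field supplied by the peer.
 * received_len is how many payload bytes actually arrived.
 * Returns the number of bytes written to reply, or 0 if the request
 * is rejected. */
size_t echo_heartbeat(const unsigned char *payload, size_t received_len,
                      size_t claimed_len,
                      unsigned char *reply, size_t reply_cap)
{
    /* The essential check: never trust the peer's length field. */
    if (claimed_len > received_len)
        return 0;
    /* Also refuse to overrun our own output buffer. */
    if (claimed_len > reply_cap)
        return 0;
    memcpy(reply, payload, claimed_len);
    return claimed_len;
}
```

A request that claims 64 bytes but delivers only 4 is dropped rather than answered with 60 bytes of someone else's data.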
In order to impersonate a site, an attacker has to redirect traffic you’re sending towards that site. If you only use the Internet via well-controlled networks, you’re probably safe. Yes, it’s possible to redirect traffic on the Internet backbone, but it’s rare and difficult. If a major intelligence agency is after you or that site, you’re at risk; most of us aren’t in that category. Cellular data networks are like the backbone in this respect: redirection can be done, but it’s hard.
For most people, the weak link is their access network: their home, their workplace, the public or semi-public networks they use. It’s much easier to redirect traffic on a WiFi network or an Ethernet, and well within the capabilities of ordinary cybercriminals. If untrusted individuals or hacked machines use the same networks as you do, you’re at much more risk. Your residence is probably safe if there are no hacked machines on it and if you observe good security precautions on your WiFi network (WPA2 and a strong password). A small office might be safe; a large one is rather more dangerous. All public hotspots are quite exposed.
The other risk of Heartbleed is someone decrypting old traffic. That sounds serious, though again it’s hard to capture traffic if you’re not law enforcement or an intelligence agency. On exposed nets, hackers can certainly do it, but they’re not likely to record traffic they’ll never be able to decrypt. Law enforcement might do that, if they thought they could get assistance from the local spooks to break the crypto. They could also redirect traffic, with cooperation from the ISP. The question, though, is whether or not they would; most police forces don’t have that kind of technical expertise.
It’s important to realize that exposure isn’t all or nothing. If you regularly use a public hotspot to visit a social networking site but only do your banking at home, your banking password is probably safe. That’s also why your home network gear is probably safe: you don’t access it over the Internet. (One caveat there: you should configure it so that you can’t access it remotely, only from your home. Too much gear is shipped with that set incorrectly. If you have a router, make sure remote access to it is turned off.)
One more threat is worth mentioning: client software, such as browsers and mail programs, uses SSL; some of these programs use OpenSSL and hence are vulnerable if you use them to connect to a hacked site. Fortunately, most major browsers and mailers are not affected, but to be safe, make sure you’ve installed all patches.
There’s one password you should change nevertheless: your email password. It’s generally used to reset all of your other accounts. "Probably safe" is not the same as "definitely". Accordingly, as soon as you know that your mail provider has patched its system (Google and Yahoo have, and Microsoft was never vulnerable), change it—and change it to something strong and use a password manager to save you from having to use the same new password everywhere.
Oh yes—if Martian Intelligence is after you (you know who you are), indeed you should be worried.
Doing Crypto
The recent discovery of the goto fail and Heartbleed bugs has prompted some public discussion on a very important topic: what advice should cryptologists give to implementors who need to use crypto? What should they do? There are three parts to the answer: don’t invent things; use ordinary care; and take special care around crypto code.
- Don’t invent things
- This oldest piece of advice on the subject is still sound; everyone who teaches or writes on the subject will repeat it. Never invent your own primitives or protocols. Cryptographic protocols are fiendishly difficult to get right; even pros often get them wrong. Encryption algorithms are even harder to design. It’s certainly true that there have been very few known attacks on bad crypto by hackers not working for a major government. But "few" is not the same as "none" (think of WEP), and many commercial sites have been targeted by governments. Besides, many crypto attacks are silent; the victims may never know what happened. Custom crypto: just say no.
- Use ordinary care
- Crypto code is code, and is therefore susceptible to all the natural shocks that code is heir to. All the usual advice—watch out for buffer overflows, avoid integer overflows, and so on—applies here; crypto code is almost by definition security-sensitive. There’s no shortage of advice and tools; most of this is even correct.
- Special care
- Crypto code, though, is special; there are precautions that need to be taken that are irrelevant anywhere else. Consider things like timing attacks: if you’re using RSA but haven’t implemented it with all due paranoia, an attacker can recover your private key just by seeing how long it takes you to respond to certain messages. There are cache timing attacks: if the attacker can run programs on the same computer as your crypto code (and this isn’t a preposterous notion in a world of cloud computing), it’s possible to figure out an AES key by watching what cache lines are busy during an encryption or decryption operation. Alternatively, consider how hard it is to implement obvious advice like zeroing out keys after they’re used: if you write code to assign zeros to a variable you’ll never use again, a modern compiler can optimize that code out of existence. The problems are subtle and there aren’t widely-known sources of advice.
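Two of these precautions can be sketched in a few lines of C. This is an illustration under stated assumptions, not a vetted implementation, and the names `secure_zero` and `ct_equal` are made up for this example. The first function writes zeros through a `volatile` pointer, a common idiom that compilers in practice do not elide; where the platform offers `explicit_bzero`, `memset_s`, or `SecureZeroMemory`, those are the better choice. The second shows a much simpler timing leak than the RSA one: comparing a secret (say, a MAC) with an ordinary early-exit loop reveals how many leading bytes matched, so the comparison below always touches every byte.

```c
#include <stddef.h>

/* Illustrative only: overwrite a buffer with zeros in a way the
 * optimizer is not expected to remove. Each store goes through a
 * volatile-qualified pointer, so it counts as an observable access. */
void secure_zero(void *buf, size_t len)
{
    volatile unsigned char *p = (volatile unsigned char *)buf;
    while (len--)
        *p++ = 0;
}

/* Illustrative only: compare two equal-length byte strings in time
 * that depends on len but not on where the first difference is.
 * Returns 1 if the buffers are identical, 0 otherwise. */
int ct_equal(const unsigned char *a, const unsigned char *b, size_t len)
{
    unsigned char diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= (unsigned char)(a[i] ^ b[i]);  /* accumulate, never branch */
    return diff == 0;
}
```

Neither function addresses cache timing attacks or the RSA case; those need care inside the cipher implementation itself, which is one more reason to rely on a well-maintained library.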
Nine years ago, Dan Boneh commented to me that "It’s amazing how tricky it is to implement Crypto correctly these days. I constantly tell my undergrad students that they should not try to implement existing systems themselves." It’s even more true today.
Some of these issues can be dealt with by sticking with a well-implemented, standards-adhering crypto library. Yes, there have been problems of late with several different such libraries—but most programmers will make at least as many mistakes on their own. Are we putting all of our eggs in one basket and exacerbating the monoculture problem? In this case, I prefer to fall back on Pudd’nhead Wilson’s advice here:
Behold, the fool saith, "Put not all thine eggs in the one basket"—which is but a manner of saying, "Scatter your money and your attention"; but the wise man saith, "Put all your eggs in the one basket and—WATCH THAT BASKET!"
Crypto is hard.
What Does "Network Neutrality" Mean?
A lot of ink and pixels have been spilled about the FCC’s new rules for network neutrality. It’s impossible to comment sensibly yet about the actual proposal, since as far as I know it’s not been published anywhere, but the various news reports have left me confused about just what is being addressed. There are a number of different sorts of behavior that can result in performance differences to the end user; it isn’t always clear which are covered by the phrase "network neutrality". Furthermore, some of these items involve direct, out-of-pocket expenses for improvements in connectivity; this cost has to come out of someone’s wallet. The purpose of this post is to give a simplified (with luck, not too horribly oversimplified) explanation of the different issues here.
Note carefully that in this discussion, I am not trying to imply that network neutrality is good or bad. I simply don’t know of a less loaded term.
Before I start, I need to define a few terms. Roughly speaking, there are three types of users of the Internet: content, eyeballs, and content distribution networks. "Content" is what we read, watch, or listen to: news sites, movies, audio, etc. "Eyeballs" are consumers, the people who read, watch, or listen to content. (Yes, eyeballs can listen to MP3s. I never said that this post was anatomically correct.) Content distribution networks (CDNs) are a special form of content: to avoid chokepoints when sending out many copies of large files, such as images and movies, content providers send out single copies to CDNs and let the CDNs redistribute them. A CDN is just a set of data centers inside multiple networks.
Some networks, such as most cable ISPs, are mostly eyeballs. (Again, I’m oversimplifying; some cable ISPs do provide other services.) There are others that run hosting data centers; these are mostly content. Consumers wouldn’t sign up for Internet service if there were no content to watch; content providers wouldn’t be online if there were no consumers to profit from. This raises the first interesting question: who benefits more? That is, who should pay what percentage of the cost for an interconnection between a content network and an eyeball network?
The really big ISPs (sometimes known as "Tier 1" ISPs, though that phrase has become overly beloved by marketers), like AT&T and Verizon, have both content and eyeballs. When they interconnect, it’s called peering: they’re equals, peers, so they split the cost of the interconnection. Well, we think they split the cost; peering contracts are bilateral and confidential, so no one outside really knows. (For those of you who expect—or fear—discussion of BGP, autonomous systems, and the default-free zone, you can relax; that’s a level of detail that’s not really relevant here, so I won’t be going into those issues.)
The situation is different for smaller players. They’re not the equal of the Tier 1s, so they have to buy transit. That is, for a customer of a small ISP to reach a customer of a Tier 1, the ISP has to pay some Tier 1 for transit services to the rest of the Internet. (Yes, I said I was simplifying things; I know it’s more complicated.)
So what does all this have to do with network neutrality? Let’s look at several cases. The simplest is intranetwork, where the eyeballs and two or more content providers are directly connected to the same ISP. Will the consumer get the same experience when viewing both content providers? If not, that’s generally considered to be a violation of network neutrality. That is, if the ISP is making decisions on how much of its bandwidth the two content providers can use, there’s a violation.
Note that I’ve assumed that all parties pay enough for their own bandwidth needs. An HD video stream takes about 6 Mbps; if I’m a content provider trying to send different video streams to 100 customers simultaneously, I’d better have (and pay for) at least 600 Mbps of bandwidth to the ISP. If I don’t, my customers will get lousy service, but that’s not discrimination, it’s physics. Sometimes, though, it’s hard to provide enough bandwidth. Wireless ISPs (who have sometimes claimed that network neutrality rules would be a hardship for them) are limited by spectrum; their internal networks may have ample capacity, but there’s only a certain amount they can push out to people’s phones. This isn’t discrimination, either—unless the ISP makes a choice about which content to favor.
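The arithmetic in that example is simple enough to write down. The sketch below is illustrative only: the function names are made up, the 6 Mbps figure is the rough number used above, and there is no overflow checking because the quantities involved are small.

```c
#include <stdbool.h>

/* Aggregate bandwidth, in Mbps, needed to send `streams` different
 * video streams at `mbps_per_stream` each. No overflow check; the
 * numbers in this illustration are small. */
unsigned long required_mbps(unsigned long streams,
                            unsigned long mbps_per_stream)
{
    return streams * mbps_per_stream;
}

/* True if a link of `link_mbps` can carry that load. */
bool link_is_sufficient(unsigned long link_mbps,
                        unsigned long streams,
                        unsigned long mbps_per_stream)
{
    return link_mbps >= required_mbps(streams, mbps_per_stream);
}
```

With 100 streams at 6 Mbps each, `required_mbps` returns 600, so a 500 Mbps link falls short no matter how the ISP treats the traffic.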
Life gets more complicated when traffic is entering the eyeball network from outside. Eyeball networks are generally not Tier 1s; they buy transit. If there’s more demand for bandwidth than their current links can handle, they have three choices: they can increase their bandwidth to their transit providers (that is, pay more for transit to connect their eyeballs to the content they want); they can selectively favor certain external content providers (which will give good service to some and poor service to others); or they can let everyone suffer equally. This last strategy leads to unhappy customers; the first strategy, though, implies higher costs that someone will have to pay. And the middle choice? That’s where network neutrality comes into play: who makes the decision?
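The difference between the last two choices can be shown with a toy model. This is a deliberately crude sketch: two content providers, one congested transit link, and made-up function names. Real networks share capacity per flow and per packet rather than by a single division, and "suffer equally" is interpreted here as scaling every demand by the same factor.

```c
/* Bandwidth, in Mbps, granted to two content providers (A and B)
 * whose traffic shares one congested transit link. Illustrative. */
struct alloc { double a; double b; };

/* "Everyone suffers equally": if demand exceeds capacity, scale
 * both demands down by the same factor. */
struct alloc share_proportionally(double capacity,
                                  double demand_a, double demand_b)
{
    struct alloc r = { demand_a, demand_b };
    double total = demand_a + demand_b;
    if (total > capacity) {
        double scale = capacity / total;
        r.a = demand_a * scale;
        r.b = demand_b * scale;
    }
    return r;
}

/* "Selectively favor": provider A is served first and provider B
 * gets whatever capacity is left over. */
struct alloc share_favoring_a(double capacity,
                              double demand_a, double demand_b)
{
    struct alloc r;
    r.a = demand_a < capacity ? demand_a : capacity;
    double left = capacity - r.a;
    r.b = demand_b < left ? demand_b : left;
    return r;
}
```

On a 600 Mbps link with both providers wanting 600 Mbps, the first policy gives each of them 300; the second gives A all 600 and B nothing. The total delivered is the same in both cases. What changes is who decided how to divide it.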
Some content providers are willing to connect directly to eyeball ISPs. Must the eyeball ISP accept connection requests from all content sources? Who pays for the interconnection? Note that at a minimum, the eyeball ISP will need a dedicated router port, and someone has to pay for this port and the physical link to connect it to the content source.
The other choice, for a content source facing congested links, is to deal with a CDN. This raises its own questions, though: CDNs need a presence (or more than one) on every eyeball ISP. Who pays for this presence? Does the eyeball ISP offer different terms to different CDNs, or turn some down?
If you’re still with me, you can see that "network neutrality" can cover many different sorts of behavior.
- Intra-network behavior, especially by wireless ISPs
- Transit links purchased by eyeball ISPs
- Direct connections between eyeball and content ISPs
- Connections by CDNs
These are not idle questions. In one case that’s been drawing a great deal of attention, Netflix has blamed "ISP trolls" like Comcast for discriminatory policies. Others, though, have claimed that Netflix deliberately used an ISP (Cogent) with a congested link to Comcast because it was cheaper for them. Did Netflix make this choice? If so, why wasn’t the link between Comcast and the ISP upgraded? Was one party refusing to pay its proportionate share? (This is the sort of fact question that can’t be answered in the abstract, and I’m not trying to say who’s right or even what the facts are.)
If eyeballs are to get the content they want, some interconnection facilities will have to be upgraded, either by new links or by higher-speed links; this in turn means that someone will have to pay for the upgrades, possibly beyond the routine continual upgrades that all players on the Internet have to do. The network neutrality debate is about who makes the decision and under what criteria. Furthermore, the answers may be different for each of the cases I’ve outlined above. When the FCC’s rules come out, pay attention to all of these points and not just the buzzwords: it isn’t nearly as simple as "fast lanes for big companies" versus "government meddling in the free market".