How to Abolish the DNS Hierarchy --- But it's a Bad Idea
There’s been a fair amount of controversy of late about ICANN’s decision to dramatically increase the number of top-level domains. With a bit of effort, though, and with little disruption to the infrastructure, we could abolish the issue entirely. Any string whatsoever could be used, and it would all Just Work. That is, it would Just Work in a narrow technical sense; it would hurt innovation and it would likely have serious economic failure modes.
The trick is to use a cryptographic hash function to convert a string of bytes into a sequence of hexadecimal digits. For example, if one applies the SHA-1 function to the string Amazon.com the result is a46af6931d9dace2200617548fab3274549e308f. Add a dot after every pair of hex digits, tack on a suffix like .arb (for "arbitrary", since .hash might be seen as having other connotations), and you get
a4.6a.f6.93.1d.9d.ac.e2.20.06.17.54.8f.ab.32.74.54.9e.30.8f.arb
which looks like a domain name, albeit a weird one. It not only looks like one, it is; that string could be added to the DNS today with no changes in code. We could even distribute the servers; at every level, there are 256 easy-to-split subtrees. So what’s wrong?
The technical limitation is that every end point would have to be upgraded to do the hashing. Yes, that’s a problem, but we’ve been through it before; supporting internationalized domain names required the same thing. And it works.
But how do endpoints know to do the hashing in this scheme? Something in the URL bar of a web browser? There are lots of things on the net that aren’t web browsers; how will they know what to do? You can’t necessarily tell from a string if it should be used literally or via this hashing scheme; "Amazon.com" appears to be the legal name of the corporation.
There’s another problem: canonicalization. Similar strings will produce very different hash values. Here’s an example:
String | SHA-1 hash |
New York Times | 7e145e463809ea5e7c28f2ddf103499f942c9ea3 |
The New York Times | 1950c50c10f288dd6e9190361c968e1b8c4a3775 |
N.Y. Times | e69011929d6d30347ddca11c7955a07df8390984 |
NY Times | 48b6b7d57f0ed2885816f1df96da1ffa86f09dda |
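The hash-to-name mapping, and the way near-identical strings scatter across the tree, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name arb_name is hypothetical, the input is assumed to be encoded as UTF-8 before hashing, and no canonicalization of any kind is applied.

```python
import hashlib


def arb_name(text: str, suffix: str = "arb") -> str:
    """Map an arbitrary string to a DNS-style name under a hash-based suffix.

    Assumptions (not from any standard): UTF-8 encoding of the input,
    SHA-1 as the hash, and one label per pair of hex digits.
    """
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()  # 40 hex digits
    # Split into 20 two-digit labels, each with 256 possible values.
    labels = [digest[i:i + 2] for i in range(0, len(digest), 2)]
    return ".".join(labels + [suffix])


if __name__ == "__main__":
    # Variants of one organization's name land in unrelated subtrees.
    for variant in ["New York Times", "The New York Times",
                    "N.Y. Times", "NY Times"]:
        print(f"{variant!r:24} -> {arb_name(variant)}")
```

Because the very first label already differs between variants, nothing in the tree structure groups related names together; any grouping would have to come from canonicalization rules applied before hashing.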
We could no doubt define some set of rules that would handle many common cases. Equally certainly, we’d miss many more. Companies could think of their own rules, but if they missed some we’d be back to cybersquatting and typosquatting. This would be worse, though, because the variant names would be scattered across unrelated parts of the tree.
The real issue, though, is economic: who would run the different pieces of .arb? There are currently about 100M names in .com. Let’s allow for growth and assume 1,000,000,000 names. To handle canonicalization, assume another factor of 10, for about 10B names. Does that work? To a first approximation, sure; we can delegate at each period in the name, and there are 256 values at each level. That means that going down just two levels, we could have 65,536 different registries, each handling about 150K names. That’s easy to do, but a given registry could handle more than one zone. Let’s assume that 1.5M names is a good size (which is somewhat challenging, though it’s clearly possible since it works today). That means we’d need about 6,600 registries. But they have no way to do marketing; there’s no way to target any particular business segment, since names are mapped to more or less random parts of the name tree. If a registry failed, an unpredictable portion of the net would suddenly be unreachable.
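The back-of-the-envelope arithmetic above is easy to check. The sketch below simply restates the paragraph’s assumptions (10B names, delegation two levels down, 1.5M names per registry); the function and variable names are made up for illustration.

```python
def registry_estimate(total_names: int = 10_000_000_000,
                      fanout: int = 256,
                      levels: int = 2,
                      names_per_registry: int = 1_500_000) -> dict:
    """Rough sizing of a hash-based namespace, using the article's figures."""
    zones = fanout ** levels                      # 256 * 256 delegated zones
    names_per_zone = total_names / zones          # average load per zone
    registries = total_names / names_per_registry
    zones_per_registry = zones / registries       # zones each registry would run
    return {
        "zones": zones,
        "names_per_zone": names_per_zone,
        "registries": registries,
        "zones_per_registry": zones_per_registry,
    }


if __name__ == "__main__":
    for key, value in registry_estimate().items():
        print(f"{key:20} {value:,.0f}")
```

With these inputs each zone averages roughly 150K names and each registry ends up running about ten zones, which puts the registry count in the neighborhood of the figure quoted above. Since the hash spreads names uniformly, those ten zones would hold an essentially random slice of the namespace.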
Most of us never see registries; when we want to create a new domain, we do business with a registrar. But every registrar would need to do business with every registry! The number of relationships would get ungainly, and again, there’s no way to do targeted marketing. The registrars for, say, .museum can target museums, while ignoring, say, banks. With this scheme, everyone is doing business with everyone. It’s great to have a global market; it’s also very expensive.