At dinner with friends the other night, one asked why he had never seen emoticons used in domain names, considering their popularity in messaging. Think about it… Sacred Heart Hospital could use a heart symbol, Seattle Mariners could use a sad-face, etc. After losing them in an explanation of ASCII-Punycode translation and IDNs, I decided the quick way out of the conversation was to say that they could not be used very easily by most users, and left it at that. It sparked my curiosity, though: which “symbols” were actually registered as domain names? I set out to find out.
What is an IDN Domain Name?
An Internationalized Domain Name is an Internet domain name or web address that is represented in local language characters. Most domain names are written using the 26 letters of the Latin/English alphabet, digits, and the hyphen, all of which belong to the ASCII character set. IDNs allow non-ASCII characters in domain names as well. For example, here is the IDN for Starbucks Korea, shown in its ASCII-encoded form: http://xn--oy2b35ckwhba574atvuzkc.com/. By supporting non-ASCII characters, IDNs help improve the functionality and accessibility of the Internet: companies can maintain a single brand identity in many scripts, and more Web users can navigate the Internet in their preferred script.
What Happens When an IDN Domain Name is Registered?
Most domain name registrars have a special page for Internationalized Domain Name (IDN) registration.
When an IDN is registered, the non-ASCII characters are encoded with Punycode, an algorithm that represents Unicode text using only ASCII letters, digits, and hyphens. The result is an ASCII version of the IDN, which allows it to resolve with the current Internet system. Punycode domains can be identified by the “xn--” prefix (see the section below with the symbol IDNs we listed).
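To make the encoding step concrete, here is a minimal sketch using the `punycode` codec that ships with Python. The helper names (`to_ace`, `from_ace`, `domain_to_ace`) are made up for this illustration, and the sketch assumes each label is already in its final form: it skips the normalization and validity checks that a real registrar or browser applies before encoding.

```python
ACE_PREFIX = "xn--"  # marks a label as Punycode-encoded


def to_ace(label: str) -> str:
    """Encode one domain label to its ASCII form (hypothetical helper)."""
    if label.isascii():
        return label  # plain ASCII labels are left untouched
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def from_ace(label: str) -> str:
    """Decode one "xn--" label back to Unicode (hypothetical helper)."""
    if not label.lower().startswith(ACE_PREFIX):
        return label
    return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


def domain_to_ace(domain: str) -> str:
    """Encode every dot-separated label of a domain name."""
    return ".".join(to_ace(part) for part in domain.split("."))


if __name__ == "__main__":
    print(domain_to_ace("☂.com"))  # xn--m3h.com
    print(from_ace("xn--m3h"))     # ☂
```

Each label is encoded on its own, which is why only the umbrella gets the “xn--” prefix while “com” stays as it is.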
One important thing to note is that you have to choose the language of the domain name and you cannot mix languages/scripts within a domain name.
Mixing scripts in a single domain is not allowed because of security problems that arise when the letters in one script look very much like the letters from another. For example, if someone were able to register exampl℮.com (that final “℮” is the estimated symbol, U+212E, not “e”), they could trick people into visiting their site rather than example.com, which can lead to a number of problems, including brand and trademark issues. For more detail on the security implications, see Unicode Security Mechanisms: http://www.unicode.org/reports/tr39.
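As a rough illustration of the kind of check involved, the sketch below flags a label that combines ASCII letters or digits with any non-ASCII character, and names the odd characters out. This is a crude stand-in, not the mixed-script detection described in the Unicode report: Python's standard library has no script property, so the function names and the ASCII/non-ASCII rule are assumptions made for this example only.

```python
import unicodedata


def mixes_ascii_and_non_ascii(label: str) -> bool:
    """Crude heuristic: True if ASCII letters/digits sit next to non-ASCII characters."""
    has_ascii_alnum = any(ch.isascii() and ch.isalnum() for ch in label)
    has_non_ascii = any(not ch.isascii() for ch in label)
    return has_ascii_alnum and has_non_ascii


def describe_non_ascii(label: str) -> list:
    """List each non-ASCII character together with its Unicode name."""
    return [
        (ch, unicodedata.name(ch, "UNKNOWN"))
        for ch in label
        if not ch.isascii()
    ]


if __name__ == "__main__":
    for label in ("example", "exampl℮", "☂"):
        print(label, mixes_ascii_and_non_ascii(label), describe_non_ascii(label))
```

A label made entirely of one non-ASCII symbol, such as ☂, passes this check, while the look-alike exampl℮ is flagged and its last character is reported as ESTIMATED SYMBOL.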
What are Symbol IDN Domain Names?
These are real domains that someone could use for a website, and they’d be very cool to feature on business cards and collateral (though you’d have to instruct people how to get to the domains!).
We looked at the IDNs for COM/NET/ORG/INFO/BIZ/US and found 10,386 domains that consist of a single ‘character’. Here is a sampling of some of the most fun symbol IDNs we came across:
☂.com “xn--m3h.com” (this one is SO Seattle!)