Keith Devens .com

Tuesday, March 9, 2004

Internationalized Domain Names

MozillaZine has info on Internationalized Domain Names (IDNs). I'm interested in learning how they work. Here's a page on the standards involved, including a list of relevant RFCs, and here's a list of "technical documents".

What's interesting is the following:

Encoding Scheme
The encoding scheme for IDNs will be an ASCII Compatible Encoding (ACE) that will encode the local language characters of an IDN into ASCII characters such that DNS can accurately answer a request for an address record. There are several types of ACE. In order to select an ACE as the standard, IETF must consider the difficult balance between compression and implementation. The preferred ACE will allow the greatest number of characters (code points) to be represented and will not be difficult to deploy. The IETF has chosen an ACE known as Punycode to be the standard.

So it seems they aren't using UTF-8? Here's the RFC for Punycode (RFC 3492). One of the most obvious questions I can think of is "How does Punycode compare to UTF-16 and UTF-8?", yet they don't answer that in any of their FAQs.

OK, after reading a little of the RFC for Punycode:

Punycode is a simple and efficient transfer encoding syntax designed
for use with Internationalized Domain Names in Applications (IDNA).
It uniquely and reversibly transforms a Unicode string into an ASCII
string. ASCII characters in the Unicode string are represented
literally, and non-ASCII characters are represented by ASCII
characters that are allowed in host name labels (letters, digits, and
hyphens). This document defines a general algorithm called
Bootstring that allows a string of basic code points to uniquely
represent any string of code points drawn from a larger set.
Punycode is an instance of Bootstring that uses particular parameter
values specified by this document, appropriate for IDNA.

So, Punycode seems to be sort of a Base64-style transfer encoding for Unicode strings: it turns them into strings made up only of ASCII letters, digits, and hyphens. It appears that the canonical form of an internationalized domain name will be in Punycode, so now my only question is how those are distinguished from ordinary domain names (how are the namespaces separate?).
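To make the comparison with UTF-8 concrete, here is a minimal sketch using the `punycode` codec that ships with Python's standard library. The label `müller` is only an illustrative choice, and the `LDH` set is a hypothetical helper for this example, not something taken from the RFC text quoted above.

```python
# Sketch: compare Punycode (RFC 3492) with UTF-8 for a single label.
# "punycode" and "utf-8" are both built-in Python codecs.
label = "müller"  # illustrative label, chosen for this example

puny = label.encode("punycode")  # ASCII letters, digits, and hyphens only
utf8 = label.encode("utf-8")     # contains bytes outside the ASCII range

print(puny)  # b'mller-kva'
print(utf8)  # b'm\xc3\xbcller'

# The transformation is reversible, as the RFC abstract says.
round_trip = puny.decode("punycode")
print(round_trip == label)  # True

# Hypothetical helper set: the characters allowed in host name labels.
LDH = set(b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-")
puny_is_ldh = all(b in LDH for b in puny)
utf8_is_ldh = all(b in LDH for b in utf8)
print(puny_is_ldh, utf8_is_ldh)  # True False
```

The ASCII letters of the label come through literally (`mller`), and the position and identity of the `ü` are packed into the suffix after the last hyphen. The UTF-8 form, by contrast, needs two bytes that are not legal in a host name label, which is presumably why an ASCII Compatible Encoding was chosen for DNS.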

Well, here are some datasheets and whitepapers. I'll have to figure out the rest later.


Comments

Gary krall (http://www.verisign.com/nds/naming/idn/) wrote:

Keith:

By way of introduction, I am the technical director for the i-Nav family of plug-ins here at VeriSign. Our plug-ins essentially provide resolution and display capabilities to IE, Outlook and Outlook Express. This is required until native support, as in the case of Mozilla, is built into the application based on the standards you reference above.

You raised the question at the end:

"...so now my only question is how those are distinguished from ordinary domain names (how are the namespaces separate?)"

The point is that they are not. The encoding of the domain is an ASCII representation of that domain based upon the IDNA standards (which you reference above). As far as DNS is concerned, this is in the same namespace as any other ASCII domain. This is true for all domains irrespective of the TLD.

In other words, let's take the domain müller.de. The encoded form of this domain is xn--mller-kva.de. If you ran a DNS dig on the name, it would return the following:


; <<>> DiG 2.1 <<>> @dns1.menandmice.is xn--mller-kva.de A
; (1 server found)
;; res options: init recurs defnam dnsrch
;; got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10
;; flags: qr rd ra; Ques: 1, Ans: 1, Auth: 2, Addit: 0
;; QUESTIONS:
;; xn--mller-kva.de, type = A, class = IN

;; ANSWERS:
xn--mller-kva.de. 3600 A 81.2.176.59

;; AUTHORITY RECORDS:
xn--mller-kva.de. 3600 NS ns1.nameservice.de.
xn--mller-kva.de. 3600 NS ns6.nameservice.de.

;; Total query time: 166 msec
;; FROM: us.mirror.menandmice.com to SERVER: default -- 0.0.0.0
;; WHEN: Wed Apr 14 18:24:01 2004
;; MSG SIZE sent: 34 rcvd: 98

As you can see, the name is not in a separate namespace but sits directly in the dotDE zone.
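For readers who want to reproduce the encoded name from the dig query above, here is a small sketch using Python's built-in `idna` codec. That codec implements the original IDNA standard (RFC 3490), which matches the era of this discussion; the helper names `to_ace` and `is_ace_label` are hypothetical and exist only for this example.

```python
# Sketch: convert a Unicode domain to the ASCII form that DNS actually sees.
# Uses Python's built-in "idna" codec (IDNA as defined in RFC 3490).

def to_ace(domain: str) -> str:
    """Return the ASCII Compatible Encoding of a Unicode domain name."""
    return domain.encode("idna").decode("ascii")

def is_ace_label(label: str) -> bool:
    """True if a label carries the 'xn--' prefix used for encoded labels."""
    return label.lower().startswith("xn--")

ace = to_ace("müller.de")
print(ace)  # xn--mller-kva.de

# Only the label that contained non-ASCII characters gets the prefix.
flags = [is_ace_label(label) for label in ace.split(".")]
print(flags)  # [True, False]

# Decoding recovers the original name for display.
display = ace.encode("ascii").decode("idna")
print(display)  # müller.de
```

The `xn--` prefix is the only thing that marks the label as encoded. To a DNS server it is an ordinary ASCII name, which is why it lives in the same zone as every other .de domain.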

Hope this helps.

Gary.

∴ Gary krall | 14-Apr-2004 8:25pm est | http://www.verisign.com/nds/naming/idn/ | #4367
