IRIs and IDNs: Testing, Implementations, and Specification Evolvement

IUC 31, San Jose, 2007

Martin J. DÜRST

Aoyama Gakuin University


Internet/Web internationalization in waves:

What are IRIs and IDNs?

IRIs: Internationalized Resource Identifiers, internationalization of URIs (/URLs)

IDNs: Internationalized Domain Names, internationalization of domain names

Internationalization here means:

Why IRIs and IDNs?

Native script is easier to:

Because of higher familiarity and no need for transcription

URI/IRI structure


Example of an actual URI containing all four parts:

hierarchical-part often includes a domain name ( in the above example)
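A minimal sketch of the structure, assuming the four parts are the ones named in RFC 3986 (scheme, hierarchical part, query, fragment). The URI below and the helper name split_uri are hypothetical and chosen only for illustration; they are not the example from the original slide.

```python
from urllib.parse import urlsplit


def split_uri(uri: str) -> dict:
    """Split a URI into scheme, hierarchical part, query, and fragment."""
    parts = urlsplit(uri)
    # The hierarchical part is "//" + authority + path when an authority
    # (usually a domain name) is present, and just the path otherwise.
    hier_part = ("//" + parts.netloc if parts.netloc else "") + parts.path
    return {
        "scheme": parts.scheme,
        "hier_part": hier_part,
        "query": parts.query,
        "fragment": parts.fragment,
        "domain": parts.hostname,  # None if there is no authority
    }


# Hypothetical URI used only to show the four parts.
example = split_uri("http://www.example.org/path/page?lang=ja#intro")
for name, value in example.items():
    print(f"{name}: {value}")
```

The domain name sits inside the hierarchical part, which is why IDNs and IRIs have to be handled together.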

Encoding of IRIs



Examples: Dürst → D%C3%BCrst, 渋谷駅 → %E6%B8%8B%E8%B0%B7%E9%A7%85
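The examples above can be reproduced with a short sketch: encode the characters as UTF-8, then percent-encode each byte. The helper name percent_encode_utf8 is an assumption for illustration; urllib.parse.quote is the standard library call that does the work.

```python
from urllib.parse import quote, unquote


def percent_encode_utf8(text: str) -> str:
    """Percent-encode text via UTF-8, leaving only unreserved ASCII as-is."""
    # quote() encodes to UTF-8 by default and writes each non-safe byte
    # as %XX with uppercase hex digits.
    return quote(text, safe="")


print(percent_encode_utf8("Dürst"))   # D%C3%BCrst
print(percent_encode_utf8("渋谷駅"))   # %E6%B8%8B%E8%B0%B7%E9%A7%85

# The mapping is reversible as long as the bytes are read back as UTF-8.
print(unquote("D%C3%BCrst"))
```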

Encoding of IDNs



Examples: 渋谷駅.jp →, www.résumé.jp →
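A hedged sketch of the conversion, using Python's built-in "idna" codec. That codec implements IDNA as specified in RFC 3490 (Nameprep plus Punycode, with the "xn--" prefix), which may differ from what a given browser or registry does. The helper names idn_to_ascii and idn_to_unicode are assumptions for illustration.

```python
def idn_to_ascii(domain: str) -> str:
    """Convert each non-ASCII label of a domain name to its xn-- form."""
    return domain.encode("idna").decode("ascii")


def idn_to_unicode(domain: str) -> str:
    """Convert xn-- labels back to their Unicode form."""
    return domain.encode("ascii").decode("idna")


for name in ("渋谷駅.jp", "www.résumé.jp"):
    ascii_name = idn_to_ascii(name)
    # ASCII-only labels such as "www" and "jp" pass through unchanged.
    print(name, "→", ascii_name, "→", idn_to_unicode(ascii_name))
```

Unlike percent-encoding, this works label by label and produces names that fit the existing DNS letter-digit-hyphen syntax.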


Foreign script not usable due to:



Why Testing?

Why testing first: Test-driven development

Charmod Testing Requirements

Axes listed in W3C Character Model 1.0: Resource Identifiers:

  1. IRIs in several document formats (HTML, CSS, SVG, Atom,...)
  2. IRIs in several locations in the same document format
  3. non-ASCII characters in different parts of an IRI (e.g. domain name part, path part)
  4. IRIs in documents with various widely used character encodings and with characters from various scripts
  5. Document-specific escapes in IRIs
  6. IRIs in various URI schemes
  7. Setup of various servers for IRIs
  8. Translation of IRIs into URIs (needed for all the above)
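Item 8 underlies all the other axes, so a rough sketch may help. It assumes the common approach of converting the domain name with IDNA and percent-encoding everything else via UTF-8. The function name iri_to_uri is hypothetical, userinfo is ignored, and this is not a complete implementation of the mapping in RFC 3987.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters kept as-is outside the domain name. "%" is included so that
# escapes already present in the IRI are not encoded a second time.
_SAFE = "/?%:@!$&'()*+,;=-._~"


def iri_to_uri(iri: str) -> str:
    """Translate an IRI to a URI (simplified; userinfo is dropped)."""
    parts = urlsplit(iri)
    host = parts.hostname or ""
    # Domain name part: IDNA (xn-- labels) rather than percent-encoding.
    netloc = host.encode("idna").decode("ascii") if host else ""
    if parts.port is not None:
        netloc += f":{parts.port}"
    # Path, query, fragment: UTF-8 bytes, percent-encoded.
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE)
    fragment = quote(parts.fragment, safe=_SAFE)
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


print(iri_to_uri("http://example.org/Dürst"))
print(iri_to_uri("http://résumé.jp/渋谷駅"))
```

A test for axis 3 would then compare the result for non-ASCII characters in the domain name part against the result for the same characters in the path part, since the two are encoded differently.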

Over the years, IRI tests for various purposes have been created and made available at various locations. An overview is given at; if some tests are not listed there, please inform the author.

Testing Framework

Framework Idea and History

Test Types


Abstraction conveniently combining:

Human-oriented vs. Machine-oriented Tests


Version 0.10 of tests published today:

Next Steps

Can you ever have enough tests?

Other Tests

Overview page with pointers at

If you know about some test that is not linked, please tell me!

More tests are needed for aspects other than resolution:

Browser Implementations

Coverage is reasonably good:

Other Implementations

Implementing IRIs/IDNs in cURL

Specification Update

The IETF Standards Track

IETF: Internet Engineering Task Force

Three standards levels:

RFC: Request for Comments (also: Experimental, Informational, Historic, Obsolete)

For implementers, Proposed Standard is good enough; even just an RFC is fine

Very few things make it to full Standard (currently 67, in: URIs, out: SMTP)

Current Issues in draft-duerst-iri-bis

Issues list at

Mailing list is (archives at

Open Issues

Email Address Internationalization

Top-Level Domain Names

Internationalization of URI Schemes

Conclusions & Outlook

