January 21st, 2010
From the W3C spec: The name “ruby” originated from the name of the 5.5 point font size in British printing, which is about half the 10 point font size commonly used for normal text.
A ruby annotation is a short piece of text in smaller font, written directly above or below or – with vertical text – to either side of the base text. It is most often used in East Asian typography in order to provide further information. Most commonly it shows the pronunciation of Chinese characters. Another use case is in textbooks, to give the foreign spelling of a native word, or vice versa. In literary texts or manga, ruby is also sometimes used to specify a variant pronunciation of the underlying characters, to add some depth or twist to the normal understanding. However, there is no reason to limit ruby to East Asian text. It can also be applied to other languages for much the same purpose: to specify annotations or give further context.
In this regard, note that ruby is a typographic tag rather than a semantic tag like <abbr> or <dfn>, which may seem to fill a similar role – by using <ruby> one specifies how the annotation is to be displayed rather than what its relation to the base text is.
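To make the distinction concrete, here is a small sketch contrasting the two kinds of markup. The sample text ("WWW" and its expansion) is purely illustrative and not taken from any spec.

```html
<!-- Semantic: states that "WWW" is an abbreviation and gives its expansion.
     How (or whether) the expansion is shown is left to the browser,
     typically as a tooltip. -->
<abbr title="World Wide Web">WWW</abbr>

<!-- Typographic: asks for the annotation to be rendered in small print
     alongside the base text. The <rp> parentheses are the fallback for
     browsers without ruby support. -->
<ruby>WWW<rp> (</rp><rt>World Wide Web</rt><rp>) </rp></ruby>
```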
The WebKit ruby implementation follows the HTML5 spec for the tags used. The tags themselves are rather simple:
<ruby> some base text <rp> ( </rp><rt> annotation </rt><rp> ) </rp></ruby>
|<ruby>|Encloses the whole annotated part of the text.|
|<rt>|Encloses the ruby text that is to appear in smaller font above the base text.|
|<rp>|Contains optional parentheses that appear if the user agent does not, in fact, render <ruby>.|
Regarding the <rp> tags: browsers that don’t “understand” ruby will just output the whole content as is, disregarding the unknown tags. This includes the opening and closing parentheses provided within the <rp> tags, as shown in the following image:
Browsers with ruby support, on the other hand, will ignore the contents of the <rp> tags and place what is contained within the <rt> tags in small print over the base text:
To give a different example, here is a sample text in Japanese annotated with English translations of the main words in Kanji characters:
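The markup behind such an annotation might look like the following sketch. The sentence is an illustrative one chosen for this example ("I study Japanese"), not necessarily the one used in the sample above.

```html
<!-- Only the kanji words carry an English gloss; the kana in between are
     left unannotated. Everything stays on one line so that no stray
     whitespace ends up between the Japanese characters. -->
<p><ruby>日本語<rp> (</rp><rt>Japanese</rt><rp>) </rp></ruby>を<ruby>勉強<rp> (</rp><rt>study</rt><rp>) </rp></ruby>します。</p>
```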
Current State and Future Improvements
The current implementation follows the outline described above and should satisfy the basic use cases for ruby. This does not mean that the implementation can be considered complete – two main areas deserve future attention: character spacing and positioning.
Character spacing: Traditionally, in East Asian typography the characters that make up the ruby text annotation or (less commonly) the base text are spaced such that the two line up cleanly.
Positioning: It is sometimes desirable to have the ruby text positioned below (i.e., ‘after’ in HTML5 parlance) rather than above (‘before’), or even in both places. One use case for the latter is in textbooks, e.g. to give a Japanese student both the Furigana reading of a particular Kanji word (usually displayed above the word) and the translation (usually below).
Both are addressed by the CSS3 spec, which is based on the XHTML ruby module. However, there are some small but significant differences between the HTML5 and CSS3 ruby specs. Most notably, HTML5 allows several spans of base text and annotations within a single <ruby> element. This is useful for annotating several Chinese characters in a row, e.g. (example taken from the HTML5 spec):
<ruby> 漢<rt> ㄏㄢˋ</rt> 字 <rt> ㄗˋ </rt> </ruby>
The CSS3 ruby module, on the other hand, specifies a version of ruby called complex ruby, which separates ruby base text from annotations in a way similar to <table>. In the future we hope to bring these two conventions together cleanly.
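For comparison, complex ruby markup as defined by the XHTML ruby module looks roughly like the sketch below. The element names <rbc>, <rb> and <rtc> and the rbspan attribute come from that module, not from the HTML5 tag set described earlier, so this should not be read as markup the current implementation is expected to handle. The sample content is illustrative.

```html
<ruby>
  <!-- Ruby base container: one <rb> per base unit. -->
  <rbc>
    <rb>漢</rb><rb>字</rb>
  </rbc>
  <!-- First ruby text container: one <rt> per base, much like the cells
       of a table row. -->
  <rtc>
    <rt>ㄏㄢˋ</rt><rt>ㄗˋ</rt>
  </rtc>
  <!-- Second ruby text container: a single annotation spanning both bases. -->
  <rtc>
    <rt rbspan="2">Chinese character</rt>
  </rtc>
</ruby>
```

Having two text containers is also what makes it possible to annotate the same base on both sides, e.g. a reading above and a translation below.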
For reference, CSS3 contains a few modules that more or less directly pertain to ruby rendering. Note that, as indicated above, the current implementation does not (yet) address these.
- Ruby module
  - This is the main module concerning ruby.
- Requirements on Japanese Text Layout
  - Has sections on ruby (probably more than you will ever want to know about ruby rendering), as well as general rendering guidelines for Japanese.
- Line module
  - This module specifies the line-stacking-ruby property.
- Text module
  - Although only marginally related to ruby, this contains additional rules for CJK line breaking.
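As a rough sketch of how the Ruby module approaches the two areas mentioned earlier, spacing and positioning, the following stylesheet uses property names and values as we understand them from the CSS3 ruby drafts (ruby-position and ruby-align). Treat them as assumptions: the drafts are subject to change, and, as noted, the current implementation does not act on them. The class names are hypothetical.

```html
<style>
  /* Hypothetical class names, for illustration only. */

  /* Positioning: place the annotation below ("after") the base text. */
  ruby.gloss { ruby-position: after; }

  /* Character spacing: distribute the characters of the shorter text so
     that base and annotation line up. */
  ruby.reading { ruby-align: distribute-space; }
</style>

<ruby class="reading">東京<rp> (</rp><rt>とうきょう</rt><rp>) </rp></ruby>
<ruby class="gloss">東京<rp> (</rp><rt>Tokyo</rt><rp>) </rp></ruby>
```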
Many East Asian scripts are traditionally written vertically rather than horizontally. Some of the conventions of ruby text stem from this tradition. Furthermore, with Chinese Bopomofo/Zhuyin-Fuhao, the ruby annotation may traditionally run vertically next to each character even for horizontal text (example from the CSS3 ruby spec):
Note that, as shown in the above example, this is complicated by the fact that tone markers are not to be rendered in the normal vertical text stream, but to the side of it.
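In markup terms, such a Bopomofo annotation could be requested along the lines of the sketch below. The ruby-position value used here is an assumption based on our reading of the CSS3 ruby drafts, the class name is hypothetical, and the property says nothing about where the tone markers go; that part remains a rendering problem.

```html
<style>
  /* Assumed from the CSS3 ruby drafts: put the annotation to the right of
     each base character, independent of the text flow direction. */
  ruby.zhuyin { ruby-position: right; }
</style>

<ruby class="zhuyin">漢<rt>ㄏㄢˋ</rt>字<rt>ㄗˋ</rt></ruby>
```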
Unfortunately, vertical text does not lend itself easily to seamless integration with the way web pages are rendered today. Addressing this shortcoming would not only help further improve the visual quality of ruby, but also open up ways to give East Asian web pages a more natural look. However, implementation of this is a challenge that still lies ahead.