Mapping of some composite Unicode characters to Latin



1. Indic characters with diacritics for other languages

The pattern is shown in the next table.

(a) Recommendation for original P-A. characters not already covered, occurring in a scheme as in Table B, and using any of the following:

NUKTA
COMBINING COMMA ABOVE RIGHT U+0315
FULL STOP U+002E
DOT ABOVE U+02D9 (follows a character)
LEFT SINGLE QUOTATION MARK U+2018
COMBINING DIAERESIS U+0308 (with a spacing character if necessary)
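
As a quick check of the marks listed above, the following minimal sketch looks up each listed code point in Python's standard unicodedata module and reports whether it is a combining or a spacing character. The names MARKS and describe are illustrative only and are not part of the recommendation; NUKTA is left out because the list gives no code point for it.

```python
import unicodedata

# Code points exactly as given in the list above. NUKTA is omitted because
# the list gives no code point for it (a nukta sign is encoded per script).
MARKS = [0x0315, 0x002E, 0x02D9, 0x2018, 0x0308]

def describe(cp: int) -> tuple[str, str, bool]:
    """Return (U+XXXX label, Unicode name, True if the character is combining)."""
    ch = chr(cp)
    # A non-zero canonical combining class means the mark attaches to the
    # preceding character; zero here means an ordinary spacing character.
    return (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.combining(ch) != 0)

for cp in MARKS:
    code, name, is_combining = describe(cp)
    kind = "combining" if is_combining else "spacing"
    print(f"{code}  {name}  ({kind})")
```

Only the two COMBINING marks attach to the preceding character; the other three are spacing characters that simply follow it in the text.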

(b) Some Indic characters with diacritics used for other scripts.

2. Other composite characters

Eyelash repha/urpha

3. Pure Indic consonants

There are special glyphs for the pure (vowelless) consonants Ben. t and Mal. k, n_dot-b, n, y, r, l, l_dot-b. These behave as ordinary characters. Unicode v.3.0 gives no rendering rule for them, but where the generic Indic character [XA] has the pure form [X] with glyph X, the following sequences may be suggested:

[XA] + [ZWJ] + [VIRAMA] --> X in final position,

[XA] + [ZWJ] + [VIRAMA] + [ZWNJ] --> X in medial position

where [ZWJ] is U+200D and [ZWNJ] is U+200C.
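
The two suggested sequences can be sketched in a few lines of Python. This is only an illustration of the ordering proposed above, not a statement of what any version of the Unicode Standard or any rendering engine actually implements. The helper names pure_consonant and code_points are hypothetical, and Malayalam KA (U+0D15) with the Malayalam virama (U+0D4D) is assumed as the example for [XA] and [VIRAMA].

```python
ZWJ = "\u200D"   # ZERO WIDTH JOINER
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER

def pure_consonant(base: str, virama: str, medial: bool = False) -> str:
    """Build the sequence suggested in the text for the pure form of `base`.

    final:  [XA] + [ZWJ] + [VIRAMA]
    medial: [XA] + [ZWJ] + [VIRAMA] + [ZWNJ]
    """
    seq = base + ZWJ + virama
    return seq + ZWNJ if medial else seq

def code_points(s: str) -> str:
    """Show a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

# Assumed example: Malayalam KA and the Malayalam virama.
MAL_KA, MAL_VIRAMA = "\u0D15", "\u0D4D"

print(code_points(pure_consonant(MAL_KA, MAL_VIRAMA)))
# U+0D15 U+200D U+0D4D
print(code_points(pure_consonant(MAL_KA, MAL_VIRAMA, medial=True)))
# U+0D15 U+200D U+0D4D U+200C
```

Whether a given font or shaping engine displays the special glyph for these sequences is a separate question that the sketch does not address.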

Last updated: 2 July 2002