Perso-Arabic/Indic interface

Summary of the transliteration problem

Abbreviations: BEN=Bengali script, DEV=Devanagari, P-A=Perso-Arabic, -a=above, -b=below, dia=diaeresis, und=underbar

The situation
In DEV and BEN, various schemes have been used to represent Perso-Arabic characters. These schemes involve varying numbers of modified Indic characters (where 'modified' means that a diacritic is added). Current Hindi uses at most the "basic five" modified DEV characters nuqta-ka, -kha, -ga, -ja, -pha, all of which are already allowed for in the draft transliteration standard. More elaborate schemes of modified characters in DEV and BEN are shown in Fig.1, where col.1 shows the original P-A characters and col.9 the proposed transliteration of the Indic characters. References etc. are given in the Appendix.

Fig.1. Modified Indic characters in seven schemes, and their proposed transliteration
[schemes]

The suggested transliterations apply to all the Indic characters in a given row (which therefore count as 'equivalent orthography' when, and only when, they represent Perso-Arabic characters).

In order to represent all Perso-Arabic signs, the older DEV schemes may call for two underdots to appear below DEV aa-maatraa, i, and DEV na-underbar may be used to represent tanwin. These elaborations probably do not need to be covered by the proposed international standard.

Pattern
Fig.2 simplifies Fig.1 to show its underlying pattern. The first column shows the Perso-Arabic character corresponding to an Indic character used in some particular scheme. The second column enables one to deduce what the transliteration of the associated Indic consonant should be. An Indic consonant being 'unique' means that it is used for only one Perso-Arabic consonant. A transliteration being 'dominant' means that this transliteration is to be used when an Indic character is used for more than one Perso-Arabic character. Only two consonants have been found to be dominant. (We are still only thinking of modified Indic characters, as in Fig.1.)

Fig.2. Proposed general method of transliteration (Names given here are for convenience of reference only)
[pattern]
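
The 'unique' test used in Fig.2 can be illustrated with a short Python sketch. The data below are not taken from Fig.1: they are an assumed example based on common Hindi usage of the "basic five" nuqta characters, and the names SCHEME and classify are hypothetical. The sketch only separates unique characters from shared ones; it does not say which transliteration is dominant, since that depends on Fig.2.

```python
from collections import defaultdict

# Assumed illustrative pairs (Perso-Arabic character, modified DEV character).
# These are NOT the contents of Fig.1; they reflect common Hindi practice only.
SCHEME = [
    ("\u0642", "\u0915\u093C"),  # qaf   -> nuqta-ka
    ("\u062E", "\u0916\u093C"),  # khe   -> nuqta-kha
    ("\u063A", "\u0917\u093C"),  # ghain -> nuqta-ga
    ("\u0641", "\u092B\u093C"),  # fe    -> nuqta-pha
    ("\u0632", "\u091C\u093C"),  # ze    -> nuqta-ja
    ("\u0630", "\u091C\u093C"),  # zal   -> nuqta-ja
    ("\u0636", "\u091C\u093C"),  # zad   -> nuqta-ja
    ("\u0638", "\u091C\u093C"),  # za    -> nuqta-ja
]


def classify(scheme):
    """Label each Indic character 'unique' or 'shared' within one scheme."""
    sources = defaultdict(set)
    for pa_char, indic_char in scheme:
        sources[indic_char].add(pa_char)
    # 'unique' = used for only one Perso-Arabic character in this scheme.
    return {
        indic: ("unique" if len(pa_chars) == 1 else "shared")
        for indic, pa_chars in sources.items()
    }
```

For a shared character, the dominant transliteration from Fig.2 would then have to be supplied separately.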

The procedure for transliterating a set of Indic characters representing Perso-Arabic characters is then:

For reverse transliteration, one uses the Indic characters of the original Perso-Arabic to Indic scheme. In this way a large part of many schemes may be accommodated.

Electronic transmission and storage requiring 7-bit character set
The new Latin characters with diacritics may be included by a scheme such as:

s_und   -> _s
z_und   -> _z
z_dot-a -> ;z
z_dot-b -> .z
z_caron -> ^z
s_dia-b -> ^s or ~s or ,s
h_dia-b -> ^h or ~h or ,h
t_dia-b -> ^t or ~t or ,t
opening inverted comma [before vowel] -> .
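
As a rough illustration, the 7-bit scheme above might be applied in Python as follows. This is a sketch under several assumptions that are not part of the original text: the Unicode combining marks chosen for each diacritic, the use of "~" (one of the three permitted alternatives) for diaeresis below, U+2018 as the opening inverted comma, and the plain Latin vowel set. The names SEVEN_BIT and to_seven_bit are hypothetical.

```python
import unicodedata

# Keys are base letter + combining mark (NFD form); the code points are assumed.
SEVEN_BIT = {
    "s\u0331": "_s",  # s_und   (combining macron below)
    "z\u0331": "_z",  # z_und
    "z\u0307": ";z",  # z_dot-a (combining dot above)
    "z\u0323": ".z",  # z_dot-b (combining dot below)
    "z\u030C": "^z",  # z_caron
    "s\u0324": "~s",  # s_dia-b (combining diaeresis below); "~" chosen here
    "h\u0324": "~h",  # h_dia-b
    "t\u0324": "~t",  # t_dia-b
}
OPENING_COMMA = "\u2018"  # assumed code point for the opening inverted comma
VOWELS = set("aeiou")     # assumed vowel set


def to_seven_bit(text):
    """Replace the listed letters with their prefixed 7-bit forms."""
    # Decompose so that precomposed letters such as U+017E become z + caron.
    chars = unicodedata.normalize("NFD", text)
    out = []
    i = 0
    while i < len(chars):
        pair = chars[i:i + 2]
        if pair in SEVEN_BIT:
            out.append(SEVEN_BIT[pair])
            i += 2
        elif (chars[i] == OPENING_COMMA and i + 1 < len(chars)
              and chars[i + 1].lower() in VOWELS):
            out.append(".")  # inverted comma before a vowel becomes "."
            i += 1
        else:
            out.append(chars[i])  # anything not in the table is left as it is
            i += 1
    return "".join(out)
```

Characters outside the table pass through unchanged (in decomposed form), so the result is pure 7-bit text only if the input contains no other non-ASCII characters.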

Appendix: References, etc. for Fig.1

Acknowledgments
I gratefully acknowledge the help kindly given me by various correspondents, including Abu Jar M. Akkas who told me about cols. 2 - 6 of Fig.1.


Copyright (C) Anthony P. Stone 1999. This material may be freely used, provided the author is acknowledged.

This version dated: Dec 3, 1999