text encoding, cont. (119)

Thu, 20 Apr 89 19:56:29 EDT

Humanist Mailing List, Vol. 2, No. 865. Thursday, 20 Apr 1989.

(1) Date: Thu, 20 Apr 89 12:47 (30 lines)
From: Wujastyk (on GEC 4190 Rim-D at UCL) <UCGADKW@EUCLID.UCL.AC.UK>
Subject: Devanagari, Sanskrit primers

(2) Date: 20 April 1989 09:56:10 CDT (55 lines)
From: "Michael Sperberg-McQueen 312 996-2477 -2981" <U35395@UICVM>
Subject: Sanskrit and the TEI

(3) Date: 20 Apr 89 17:26:36 bst (9 lines)
Subject: Tagged Hebrew Texts

(1) --------------------------------------------------------------------
Date: Thu, 20 Apr 89 12:47
From: Wujastyk (on GEC 4190 Rim-D at UCL) <UCGADKW@EUCLID.UCL.AC.UK>
Subject: Devanagari, Sanskrit primers

Patrick W. Conner <U47C2@WVNVM.bitnet> noted that he has studied
Sanskrit long enough to know that you cannot really handle Devanagari in
ASCII unless you ignore the ligatures.

Absolutely--that's not under discussion. What is being debated is
a coding scheme for Sanskrit in *roman transliteration*.

And yes, I believe Perry's Primer is still in print. I keep seeing it in
bookshops, although I cannot give you a specific shop from memory. Get
the full book reference and send it to the biggest shop you know, or try

South Asia Books,
P. O. Box 502,
Missouri 65205,
Phone (314) 449-1359

You never really get away from your first primer, I know. But many
people feel that Perry has been superseded by M. Coulson's Teach Yourself
Sanskrit, which is also still in print.


(2) --------------------------------------------------------------59----
Date: 20 April 1989 09:56:10 CDT
From: "Michael Sperberg-McQueen 312 996-2477 -2981" <U35395@UICVM>
Subject: Sanskrit and the TEI

During the recent discussion of Mathieu Boisvert's proposal for the
encoding of Sanskrit and Pali, Charles Faulhaber asked what the Text
Encoding Initiative and the International Organization for
Standardization are doing about the subject. I apologize for the
belated reply, but I was out of town when the question was asked.

He is right, of course. The Text Encoding Initiative should and will
look at Sanskrit and Pali as well as other languages, both in their
conventional alphabets and in Latin transcription. We hope to be able
to include guidance for encoding texts in such languages which (a)
reflects consensus among the practitioners of a field (so agreement
among Sanskritists is important to us), (b) accords as well as possible
with relevant national and international standards, and (c) does not
conflict with other schemes and alphabets (since the mixing of alphabets
is a clear desideratum for many of us).

For obvious economic reasons, ISO and its member organizations focus
more on living languages and their alphabets than on dead ones. I don't
know of any ISO recommendations for Devanagari character sets, and don't
expect to see any soon. ISO character sets for all alphabets
consistently omit characters needed for older texts but not now in use.
So there probably won't be much resistance from ISO if academics develop
their own practice for such older characters.

There are some standards, however, which are worth looking at, since
libraries have been struggling with standardization and automation for
some time, and face an extreme version of the challenge which faces
us all, since their printed catalogs must handle languages of all
descriptions. ANSI Z39.47, for example ("Extended Latin Alphabet
Coded Character Set for Bibliographic Use"), has all the characters
needed to print romanized Pali and Sanskrit, as well as eighty or
so other languages. Its primary drawback is that so few industry
people have heard of it or implement it.

It seems unlikely that one method will work for all environments,
so the TEI might well conclude by documenting several practices,
aimed severally at interchange, use on 7-bit devices, use on 8-bit
devices with overstrike, and use on 8-bit devices without overstrike.
Can we do that without complicating the matter unreasonably? We
are eager to hear your opinions, and I invite you to share them with us.

In order not to overburden Humanist with such technical discussion,
however, I suggest we move the topic over to the new public list
for discussion of the Text Encoding Initiative and text encoding
problems generally, namely TEI-L@UICVM (see separate announcement).

-Michael Sperberg-McQueen
ACH / ACL / ALLC Text Encoding Initiative
University of Illinois at Chicago

(3) --------------------------------------------------------------13----
Date: 20 Apr 89 17:26:36 bst
Subject: Tagged Hebrew Texts

There are about 7 or 8 different morphologically tagged texts
of the Hebrew Bible completed or in production in some 6 different
countries. Some are available at very different costs. Details
are in J. J. Hughes, Bits, Bytes and Biblical Studies, Zondervan
1987 or 1988. David M.