10.0590 Unicode

WILLARD MCCARTY (willard.mccarty@kcl.ac.uk)
Tue, 14 Jan 1997 22:25:42 +0000 (GMT)

Humanist Discussion Group, Vol. 10, No. 590.
Center for Electronic Texts in the Humanities (Princeton/Rutgers)
Centre for Computing in the Humanities, King's College London
Information at http://www.princeton.edu/~mccarty/humanist/

[1] From: Patrick Durusau <pdurusau@emory.edu> (44)
Subject: Re: 10.0585 progress & Unicode

[2] From: Jim Marchand <marchand@ux1.cso.uiuc.edu> (21)
Subject: Unicode, etc.

[3] From: John Unsworth <jmu2m@virginia.edu> (45)
Subject: Re: 10.0585 progress & Unicode

--[1]----------------------------------------------------------------
Date: Tue, 14 Jan 1997 08:07:29 -0500
From: Patrick Durusau <pdurusau@emory.edu>
Subject: Re: 10.0585 progress & Unicode

George Murphy's response on progress in the area of fonts that:

> There is a universal character encoding scheme which includes
> ancient Greek (as well as just about every letter or glyph for all known
> languages, alive and dead); it's called Unicode, it is an international
> standard (ISO 10646-1), and there is some impetus, at least in Europe, to
> implement it.

is correct, but fails to indicate what he means by "just about every
letter or glyph for all known languages, alive and dead...." Interested
humanists might want to visit:
http://www.cm.spyglass.com/unicode/standard/unsupported.html
for a listing of presently unsupported modern and archaic scripts. Under
"archaic and obsolete scripts" the following forty-five scripts appear:

Ahom, Akkadian Cuneiform, Aramaic, Babylonian Cuneiform, Balinese,
Balti, Batak, Brahmi, Buginese, Chola, Cypro-Minoan, Etruscan, Glagolitic,
Hieroglyphic Egyptian, Hieroglyphic Hittite, Javanese, Kaithi, Kawi,
Khamti, Kharoshthi, Kirat (Limbu), Lahnda, Linear B, Mandaic, Mangyan,
Manipuri (Meithei), Meroitic (Kush), Modi, Numidian, Ogham (proposal
pending), Pahlavi (Avestan), Phags-pa, Pyu, Old Persian Cuneiform,
Phoenician, Northern Runes, Satavahana, Siddham, South Arabian, Sumerian
Cuneiform, Syriac, Tagalog, Tagbanuwa, Tircul, and Ugaritic Cuneiform.

While the usefulness of the Unicode standard versus ASCII cannot be
disputed, it is also a uniform display standard that may not reflect the
nuances of the script in which an ancient text was written. When encoding
non-modern texts, I think the better practice is to use defined entity
sets (for definition guidelines see the TEI Guidelines implementation
of SGML) which are then mapped to Unicode code points where they exist.
This preserves the information represented in the original text while
using the convenience of Unicode for display.
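
A minimal sketch of the entity-to-code-point mapping described above, in
Python. The entity names are modelled on the ISO 8879 Greek entity set; the
table ENTITY_TO_CODEPOINT, the made-up entity "x-sign-a", and the helper
resolve_entities are assumptions for illustration only and are not taken
from the TEI Guidelines or from any existing library.

```python
import re

# Hypothetical entity set: entity name -> Unicode code point.
# None marks a character that has no Unicode code point (yet).
ENTITY_TO_CODEPOINT = {
    "agr": 0x03B1,      # GREEK SMALL LETTER ALPHA
    "bgr": 0x03B2,      # GREEK SMALL LETTER BETA
    "ggr": 0x03B3,      # GREEK SMALL LETTER GAMMA
    "x-sign-a": None,   # made-up entity for a sign Unicode does not cover
}

# Matches SGML-style entity references such as &agr;
ENTITY_PATTERN = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")


def resolve_entities(text):
    """Replace entity references with Unicode characters where a code
    point exists; leave every other reference exactly as encoded."""
    def replace(match):
        codepoint = ENTITY_TO_CODEPOINT.get(match.group(1))
        if codepoint is None:
            # Unmapped or unknown entity: keep the original reference so
            # the information in the encoded text is not lost.
            return match.group(0)
        return chr(codepoint)

    return ENTITY_PATTERN.sub(replace, text)


print(resolve_entities("&agr;&bgr;&ggr; &x-sign-a;"))
```

The encoded text keeps its entity references as the authoritative form;
only the display copy is converted, and references without a code point
pass through unchanged.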

Patrick

Patrick Durusau
Information Technology
Scholars Press
pdurusau@emory.edu

--[2]----------------------------------------------------------------
Date: Tue, 14 Jan 97 10:02:20 CST
From: Jim Marchand <marchand@ux1.cso.uiuc.edu>
Subject: Unicode, etc.

As Pogo put it, "We have met the enemy and he is us." There is absolutely no
reason why you cannot type Greek on your computer and see it on your screen;
I do it all the time, using WordPerfect 5.1 on an old 386. If you use
Windows NT (not Windows 95), you will find that it is Unicode compatible.
Gamma Productions, 12625 High Bluff Drive, Suite 218, San Diego, CA 92130,
USA, Tel. 619-794-6399, Fax: 619-794-7294, will render your Win95 Unicode
compatible (they call their product UniType). With Unicode, you can type anything
(even Dingbats) on your screen and even send to compatible machines. If you
want to know more, see _The Unicode Standard_ 2 vols. The Unicode Consortium
(Addison-Wesley, 1991) [I have misplaced vol. 2 temporarily]. There are
some good references out there: Peter Kahrel, _Working with Foreign
Languages and Characters in WordPerfect (5.1 and WP for Windows)_
(Philadelphia: Benjamins, 1992); Nadine Kano, _Developing International
Software for Windows 95 and Windows NT_ (Microsoft Press, 1995).

The problem is that there are so many platforms, programs, etc. which
are incompatible each with the other, and there is no standard. If we all
adopted Unicode, we could write any writing system in the world; if one is
missing, we can add it. This would not be difficult to accomplish, since
Unicode is in principle an addressing system. As a bunch, we are too lazy
and often indifferent to the needs of other members of our community.
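
To illustrate what "an addressing system" means here, a small Python
sketch, assuming only the standard library: each character is identified by
a number (its code point), and that number works the same way whatever the
script. The helper name code_point_label is made up for this example.

```python
import unicodedata


def code_point_label(ch):
    """Return the conventional U+XXXX label for a single character."""
    return f"U+{ord(ch):04X}"


# One Latin letter, one Greek letter, one Hebrew letter, one Dingbat.
for ch in ["A", "\u03b1", "\u05d0", "\u2708"]:
    # ord() gives the address; unicodedata.name() gives the standard's name.
    print(code_point_label(ch), unicodedata.name(ch))

# The address alone is enough to get the character back.
assert chr(0x03B1) == "\u03b1"
```
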
Jim Marchand.

--[3]----------------------------------------------------------------
Date: Mon, 13 Jan 1997 20:42:31 -0500
From: John Unsworth <jmu2m@virginia.edu>
Subject: Re: 10.0585 progress & Unicode

Some relevant news: IATH has (for some time now) been working
on software tentatively called Babble, to display and manipulate Unicode
text. Recent developments in the Java programming language have made it
possible to take what had been a Unix prototype and turn it into working Java
software. Babble will *not* be a Unicode editor: instead, its purpose will
be to display, search, and manipulate texts which have already been created
in Unicode. Babble will provide linked scrolling, linked searching,
multiple text display, and some SGML awareness (basically, it will know the
difference between what's inside a tag and what's between tags, and it will
be able to hide or display tags). As a Java application, Babble should run
on Macs, Windows95, and Unix platforms. It will use system fonts, so if you
don't have a particular font installed, you'll need to get it--but at least
for texts distributed from IATH, we will make the necessary fonts available.
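
The kind of "SGML awareness" described above (telling what is inside a tag
from what is between tags, and hiding or showing tags) can be sketched
roughly as follows. This is not Babble's code; it is a simplified Python
illustration that assumes tags never contain a literal ">" and ignores
comments and marked sections. The names split_markup, render, and
search_text are made up for the example.

```python
import re

# A tag is taken to be anything from "<" up to the next ">".
TAG_PATTERN = re.compile(r"(<[^>]*>)")


def split_markup(text):
    """Return a list of (is_tag, segment) pairs in document order."""
    parts = TAG_PATTERN.split(text)
    # With one capturing group, re.split alternates text, tag, text, ...
    # so the odd-numbered items are the tags.
    return [(i % 2 == 1, part) for i, part in enumerate(parts) if part]


def render(text, show_tags=True):
    """Return the text with tags either displayed or hidden."""
    return "".join(
        segment
        for is_tag, segment in split_markup(text)
        if show_tags or not is_tag
    )


def search_text(text, term):
    """Search only the content between tags, never inside a tag."""
    return term in render(text, show_tags=False)


line = '<l n="1">arma virumque cano</l>'
print(render(line, show_tags=False))
print(search_text(line, "cano"), search_text(line, "n="))
```

Searching the tag-free rendering is what keeps a query from matching
attribute names or values by accident.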

If you're wondering how to create Unicode texts in the first place, I can
offer a couple of pointers. The Duke foreign language computing folks
provide a software package called Wincalis, that runs on Windows machines
and is a Unicode text editor. More info is at
http://www.lang.duke.edu/nogfx/index.htm
--but if it's ancient Greek that you're interested in, you'll
be frustrated by its apparent inability to deal with accented Greek.
Wincalis does handle lots of languages, though, and it is reasonably
priced.

Another software company with Unicode software is Gamma Productions (more
info at http://www.gammapro.com/). Right now, the only relevant software
they ship is Unitype, which is essentially an overlay for typing alternate
character sets into existing Windows word-processing software. Due out soon
(watch the web site), though, is a stand-alone Unicode editor called Universe.

If you're interested in being apprised of developments in Babble, and/or
acting as a beta tester, send email (to me, not to Humanist) and I'll put
you on the list. I expect a testable version of Java Babble some time later
this semester. IATH has nothing to do with Wincalis or any GammaPro
software, so I can't answer questions about Wincalis or Uni*, and the
information above is purely that--not an endorsement, just information.

John Unsworth / Director, IATH / Dept. of English
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
http://jefferson.village.virginia.edu/~jmu2m/