5.0366 Multilingual WP -- Larger Reflections (2/154)

Elaine Brennan & Allen Renear (EDITORS@BROWNVM.BITNET)
Tue, 8 Oct 1991 20:52:43 EDT

Humanist Discussion Group, Vol. 5, No. 0366. Tuesday, 8 Oct 1991.

(1) Date: Mon, 7 Oct 91 17:07:50 MET (61 lines)
From: Harry Gaylord <galiard@let.rug.nl>
Subject: multilingual text processing

(2) Date: Mon, 7 Oct 91 10:03:17 PDT (93 lines)
From: "John J Hughes" <XB.J24@STANFORD.BITNET>
Subject: Multilingual Word Processing, Continued

(1) --------------------------------------------------------------------
Date: Mon, 7 Oct 91 17:07:50 MET
From: Harry Gaylord <galiard@let.rug.nl>
Subject: multilingual text processing

I am surprised that there is such a dearth of multilingual users who need
several applications working on the same texts and who exchange these
files with each other. I am sure we are out here. I did not react to
this discussion while it was about word processing, but that is only one
field of application, i.e. how I get something onto paper. I spend much
more time analysing texts in machine-readable form and extracting the
material I want to include in my work.

The best solution to the problem is for manufacturers to produce products
which incorporate internationally recognised standards for non-English
alphabets. Instead of doing this, manufacturers seem to opt for
reinventing the wheel each time they put another language module into
their software or operating system. If they followed the ISO standards,
we could import and export files to a large extent, at least in modern
languages. The ISO 8859 series covers the repertoires of nearly every
European language, plus Hebrew and Arabic. Parts of it have been
officially available from ECMA since 1985. DEC is the only company
which has taken this seriously as far as I know; their 330 model
terminals can communicate with the 8859 series. The KERMIT project has
taken the problem of character sets very seriously and provides good
facilities for translating between them. X Windows has 8859-1 as
standard, and there has been some discussion of adding other character
sets from the 8859 series in later versions.
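
To make the interchange point concrete, here is a small sketch (in
Python, purely by way of illustration; the sample text is my own) of
what a shared coded character set buys you:

  # Under ISO 8859-1 the byte 0xE9 simply *is* e-acute, no matter which
  # application wrote the file, so conforming programs can exchange
  # text directly.
  text = "café Müller"                  # text with non-English characters
  data = text.encode("iso-8859-1")      # one byte per character, per the standard
  print([hex(b) for b in data])         # 0xe9 and 0xfc appear at fixed positions
  print(data.decode("iso-8859-1"))      # any conforming reader recovers the text
  # With a proprietary code page the same bytes would mean something
  # else, and interchange would require ad hoc conversion tables for
  # every pair of applications.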

This situation will be improved when the Universal Coded Character Set
(UCS) becomes available as an international standard next year. Final
decisions on the draft will be made next week at an ISO subcommittee
meeting in Rennes, France. This new 10646 will be the result of the
merger of ISO and Unicode work in this field. Then it is up to the
hardware, software and networking manufacturers to put away their
proprietary solutions and implement a common solution once and for all.
Organisations such as IBM, Microsoft, and Apple have not shown much
inclination to do this so far, but they do have representatives on the
ISO committee.

Yet even without this, much is already in place. Not only are the coded
character sets available, but within the SGML standard there are
extensive sets of character entities and the possibility of creating new
character entities. One can export from one application into entities
through tables and import from the entities into another application.
Each side (human or computer) need only have reversible tables for a
specific application. The TEI Guidelines provide useful information on
these sorts of issues as well, and when software which can exploit the
SGML facilities for different character sets and entities is readily
available, this will all be a lot easier. In the meantime, I use the
NotaBene system internally and can export multilingual files with the
other characters as entities for import elsewhere.
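
To make the reversible-table idea concrete, here is a small sketch (in
Python, purely for illustration; the internal codes in the table are
invented, not NotaBene's or any other application's actual ones):

  # Each application needs only a table mapping its own internal
  # character codes to SGML character entities, plus its reverse.
  to_entity = {
      "\xe9": "&eacute;",   # this application's code for e-acute (invented)
      "\xfc": "&uuml;",     # its code for u-umlaut (invented)
  }
  from_entity = {ent: code for code, ent in to_entity.items()}

  def export_text(text):
      # Replace internal codes with entities for interchange.
      for code, ent in to_entity.items():
          text = text.replace(code, ent)
      return text

  def import_text(text):
      # Replace entities with this application's internal codes.
      for ent, code in from_entity.items():
          text = text.replace(ent, code)
      return text

  exported = export_text("r\xe9sum\xe9")       # -> "r&eacute;sum&eacute;"
  assert import_text(exported) == "r\xe9sum\xe9"

Because the entity names come from the published ISO entity sets, the
file in the middle is application-independent; only the two tables are
private to each side.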

A word on Megawriter. Bob Kraft can say more on this. I think there
were troubles with its license on ChiWriter, produced by Horstmann
Software in California. ChiWriter was first designed for producing
texts with mathematical formulas. There are a number of character sets
available from them, and a font designer is included. There are some
difficulties in converting files in its format into other formats,
though it can be done; this was especially true of the Greek and Hebrew
of Megawriter. A more recent version of ChiWriter is available.

(2) --------------------------------------------------------------------
Date: Mon, 7 Oct 91 10:03:17 PDT
From: "John J Hughes" <XB.J24@STANFORD.BITNET>
Subject: Multilingual Word Processing, Continued

In a recent HUMANIST note, Richard Goerwitz invites me to respond to his
response to my response to his previous note. Here goes!

As I've reread the correspondence, it seems that Richard and I are
approaching the topic of multilingual word processing and text
manipulation from significantly different perspectives. Richard is
coming at the topic more from the direction of a programmer and
developer. I'm approaching it more from the direction of an "end user."
Richard addresses a host of problems in the DOS world that make it
difficult for programmers to program multilingual applications. My
responses have primarily tried to mention available DOS applications for
multilingual uses. Richard bashes DOS and PCs as dead ends for the
development of multilingual programs. I've tried to argue that from an
end-user's perspective, PCs still have a lot of life left in them.
Richard has argued for the need for portability and indicated how
difficult this is on DOS-based machines. I agree that portability is an
important concept and that in an ideal world (see below) it is a
desideratum. But I've responded by asking how many multilingual PC users
really care about this issue if they have solved their own multilingual
needs.

In previous messages, I've agreed that "in the best of all possible
computing worlds, all programs on all platforms running under any number
of operating systems should be able to read one another's files in a way
that recognizes and preserves all characters--roman and
nonroman--scripts, formatting and mark up." Ideally, as Richard
indicates, fonts, scripts, and other language-related items should be
handled by the operating system and interface. The Macintosh does this
(more or less). But look at the practical side--the financial and
business side--of the sort of operating system and interface we are
asking for. Writing an operating system is a very expensive process
that takes far longer and is far more complex than writing an
application. Practically speaking, would Microsoft or Apple or NeXT
recoup their costs if they decided to provide HUMANISTS with the kind of
operating system and interface we want? At the risk of sounding really
provincial, will most of the PC-buying world--the business
community--really care if Microsoft develops a new operating system that
supports nonroman fonts and right-to-left scripts and all the other
multilingual-related things it would be lovely to be able to have and
use? I suspect that if Microsoft thought there was enough of a need to
make such an operating system financially viable, they would develop
one. As Judy Koren (from Haifa) said in her recent HUMANIST note:
"Again, if you want vowels and cantillation marks, you have no choice
but a software solution [as opposed to Hebrew DOS]; the Israeli market
is business-oriented and has no patience with academics, Bible scholars
and similar weirdos :-!!" I'm afraid all major software development is
business-driven, including the development of any future operating
systems. All I'm saying is that we can wish for the moon, but getting
there costs a lot of money. Who is going to pay for the trip, and how
will they justify the expense? Is this whole discussion of the kind of
operating system and interface it would be nice to have just so much
wishful thinking? If there is little or no hope of getting Microsoft or
Digital or Apple or NeXT or someone to develop it, is there any
practical point to continuing to dream about it and to continuing to
bash current operating systems?

Richard seems to think I use computers primarily or only as typesetters!
Yes, I use computers for typesetting, but I use them in many other ways
as well. For example, I use multilingual text retrieval programs on my
PC and Macintosh to search Greek and Hebrew texts. True, if I wish to
take the results of a search done with the PC application and import
them into Sprint or WordPerfect or what have you, I'll have to do some
work to make the files compatible and to translate the codes used to
represent Greek and Hebrew in the source program into those used by the
recipient program. This may be seen as an inelegant and "clunky"
procedure, but it is a workable one.
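
For what it's worth, the translation step amounts to nothing more than a
table-driven substitution along these lines (a Python sketch; both
coding schemes shown are invented for the example):

  # Map the source program's transliteration codes for Greek onto the
  # recipient program's codes.  Both schemes here are made up.
  source_to_target = {
      "a)": "<alpha-smooth>",   # alpha with smooth breathing
      "a":  "<alpha>",
      "l":  "<lambda>",
  }

  def translate(text, table):
      # Scan left to right, matching the longest code at each position,
      # so output already produced is never rewritten.
      codes = sorted(table, key=len, reverse=True)
      out, i = [], 0
      while i < len(text):
          for code in codes:
              if text.startswith(code, i):
                  out.append(table[code])
                  i += len(code)
                  break
          else:                  # nothing matched: pass the character through
              out.append(text[i])
              i += 1
      return "".join(out)

  print(translate("a)lla", source_to_target))

Building the table is the tedious part; the substitution itself is
mechanical, which is why I call the procedure workable, if clunky.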

Finally, I'd like to clarify one statement in my original HUMANIST note
that I believe Richard and John Baima have significantly misunderstood.
I said: "More importantly, the ability to preserve font and formatting
information from application to application is a function of the
applications, not of the operating system or platform." This statement
is a reference to how DOS applications currently work, not a statement
about what should be the case in an ideal computing environment. I think
my statement has been misunderstood as a formula for future development,
instead of as a simple description of what is currently the case in the
DOS world.

So, Richard, have I missed the point about what _should_ be the case in
an _ideal_ computing world, or have we been speaking at cross
purposes--you with an eye on programming and development and better
operating systems and me with an eye on present solutions for end users
under existing operating systems?

Your turn!

John