4.0802 Genelex Project (1/99)

Elaine Brennan & Allen Renear (EDITORS@BROWNVM.BITNET)
Mon, 3 Dec 90 22:14:14 EST

Humanist Discussion Group, Vol. 4, No. 0802. Monday, 3 Dec 1990.

Date: Mon, 3 Dec 90 16:14 EST
From: Jean Veronis <VERONIS@VASSAR>
Subject: GENELEX Project

From: Marc Nossin <mn@gsierli.uucp>


Bernard Normier
Marc Nossin

1, place des Marseillais
94227 Charenton-le-Pont Cedex, FRANCE

[The original version of this text contained charts. If you want to
receive them, send your fax number to Marc Nossin <mn@gsierli.uucp>]

For many years now, computational linguistics has been not only an
academic pursuit but also an industrial activity. At the meeting of
European ministers held in Rome in June, the importance and maturity of
this discipline were acknowledged by the accreditation of one of the
largest EUREKA projects of the year, the GENELEX project.

The project has a budget of 250 million francs over a period of four
years and brings together:

France : Bull, Gsi-ERLI, Hachette, IBM, LADL (Paris VII) and Sema-

Italy : Lexicon, Research Consortium of Pisa, Servedi (joint company
of two Italian publishers, Utet and Paravia);

Spain : Salvat (publisher), Tecsidel, University of Barcelona.

Computational linguistics : The era of implementation

Since language is the main vehicle for information, applications that
require complex processing of information in typed (textual) form are
legion. Many of the following have been implemented at ERLI:

- Automatic Indexing. A program analyses a document and associates with
it the relevant concepts, which are then used during the search phase
over the base that gathers all the documents. With these tools, a
question in natural language can also be analysed automatically and,
via the concepts that index the documents, the answer to the question
can be found (information retrieval).

- Telematic Interfaces : Particularly in France, the development of
telematics creates a need for natural dialogue (i.e. dialogue that does
not use a computer language) directly between the general public and
the various services available on the French Minitel network.

- Automatic Translation or Computer Assisted Translation (CAT). A
program analyses a sentence in a given language, builds a more or less
abstract representation of that sentence, and then generates the target
sentence from this representation. We can also mention the querying of
relational databases in natural language, the automatic generation of
correspondence, etc. Applications of this type may be needed wherever
two main factors of modern society come together, namely computers and
language.
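
The automatic indexing item above can be illustrated with a very small
sketch. It is not ERLI's implementation: the word-to-concept table, the
sample documents and the function names are all hypothetical, and real
linguistic analysis is reduced here to a plain word lookup. It only
shows the two phases the text describes: documents are indexed by
concept, and a natural-language question is analysed with the same
table to retrieve the documents that share its concepts.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon: surface words mapped to concept labels.
# A real system would use morphological and syntactic analysis instead.
LEXICON = {
    "translation": "TRANSLATION",
    "translate": "TRANSLATION",
    "dictionary": "LEXICON",
    "lexicon": "LEXICON",
    "minitel": "TELEMATICS",
    "network": "TELEMATICS",
}


def concepts_of(text):
    """Return the set of concepts whose trigger words occur in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return {LEXICON[w] for w in words if w in LEXICON}


def build_index(documents):
    """Indexing phase: map each concept to the ids of documents under it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for concept in concepts_of(text):
            index[concept].add(doc_id)
    return index


def search(index, question):
    """Search phase: ids of documents sharing a concept with the question."""
    hits = set()
    for concept in concepts_of(question):
        hits |= index.get(concept, set())
    return sorted(hits)


# Hypothetical sample documents, invented for this illustration.
DOCUMENTS = {
    "doc1": "A program can translate a sentence into another language.",
    "doc2": "Services are available on the Minitel network.",
    "doc3": "A lexicon is a generic dictionary component.",
}

INDEX = build_index(DOCUMENTS)
```

Calling search(INDEX, "Which documents talk about translation?") looks
up the concepts of the question and returns the matching document ids.
Because the question and the documents pass through the same concept
table, a question can match a document that uses a different word for
the same concept.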

Natural Language Processing : the tools.

Compared with Expert Systems, this field has remained on the margins of
real public attention. This is because, for Expert Systems, it was
possible to design generic commercial tools (expert system generators)
while leaving the real problem (the creation of the rules) to the user.
Meanwhile Natural Language, still in its infancy, preferred to develop
applications for identified clients rather than take the risk of
investing in product development. To summarize, the expert systems
market was driven by supply and the Natural Language market by demand.
At Gsi-Erli, 90% of the work done so far has consisted of developing
customized applications rather than products.

Experience gained from implementing various Natural Language
applications has led to the possibility of developing tools that are of
general value rather than specific to each application. The aim is to
develop generic tools so as to reduce substantially the cost of Natural
Language applications and to bring products to market. Let us examine
in detail the genericity problems of each component that forms the
heart of a Natural Language application.


[A complete version of this announcement is now available through the
fileserver, s.v. GENELEX PROJECT. You may obtain a copy by issuing
the command -- GET GENELEX PROJECT HUMANIST -- either interactively or
as a batch-job, addressed to ListServ@Brownvm. Thus on a VM/CMS system,
you say interactively: TELL LISTSERV AT BROWNVM GET filename filetype
HUMANIST; if you are not on a VM/CMS system, send mail to
ListServ@Brownvm with the GET command as the first and only line. For
more details see the "Guide to Humanist". Problems should be reported
to David Sitman, A79@TAUNIVM, after you have consulted the Guide and
tried all appropriate alternatives.]