Humanist Discussion Group, Vol. 14, No. 792.
Centre for Computing in the Humanities, King's College London
Date: Tue, 10 Apr 2001 08:46:51 +0100
From: email@example.com (Francois Lachance)
Subject: Webs & lexicography
I was wondering if any of the subscribers to Humanist might know of any
methodological reflections in lexicography about the status of headwords
(lemmas) that parallel the discussions in the area of retrieval/searching
about full-text versus indexed searching.
I know there has been much work done on the question of automatic lemma
generation with the use case of linking a given segment of text to a
given dictionary entry. I was wondering about linking a given
segment of text to several entries from different dictionaries. Of
course the question of aligning variant spellings or even multilingual
entries becomes interesting when dictionaries themselves come to be
considered as linkable texts.
There are simple markup solutions to this type of situation. The solutions
that come to mind involve a "web" of cross-referencing that would allow
users to query a mixed dictionary database and receive results that are
returned with their own query string as the "headword" to the entries. I
was wondering how such mechanisms might affect notions of the "lemma" as a
discrete word serving an index function. Would a "lemma" come to be
considered as any pointer to a "bundle" of grammatical paradigm,
lexical definition and other forms?
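To make the "web" idea a little more concrete, here is a minimal sketch in
Python, offered as an illustration under stated assumptions and not as a
description of any existing system. The names Entry, LexiconWeb, normalize
and the dictionary labels "Dictionary A/B/C" are hypothetical, as are the
sample glosses. Variant spellings are aligned by folding case and stripping
diacritics, and the result bundle carries the user's own query string as
its "headword".

```python
from dataclasses import dataclass
import unicodedata


@dataclass(frozen=True)
class Entry:
    dictionary: str  # hypothetical label for the source dictionary
    headword: str    # the headword as that dictionary prints it
    gloss: str       # illustrative definition text


def normalize(form: str) -> str:
    """Fold case and strip diacritics so variant spellings share one key."""
    decomposed = unicodedata.normalize("NFD", form.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


class LexiconWeb:
    """A cross-reference index over entries drawn from several dictionaries."""

    def __init__(self):
        self._index = {}  # normalized form -> list of Entry

    def add(self, entry, variants=()):
        # Index the entry under its own headword and any declared variants.
        for form in (entry.headword, *variants):
            bucket = self._index.setdefault(normalize(form), [])
            if entry not in bucket:
                bucket.append(entry)

    def lookup(self, query):
        # The query string itself serves as the "headword" of the bundle.
        entries = list(self._index.get(normalize(query), []))
        return {"headword": query, "entries": entries}


# Hypothetical sample data: three dictionaries, variant and multilingual forms.
web = LexiconWeb()
web.add(Entry("Dictionary A", "naïve", "lacking experience"))
web.add(Entry("Dictionary B", "naive", "showing a lack of judgement"))
web.add(Entry("Dictionary C (French)", "naïf", "ingenuous"), variants=("naïve",))

result = web.lookup("Naive")
for entry in result["entries"]:
    print(result["headword"], "->", entry.dictionary, ":", entry.headword)
```

In this sketch the "lemma" is no longer a fixed headword in one dictionary
but whatever string points to the bundle, which is the shift in meaning the
question above is asking about.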
It seems to me that "to lemmatize" has come to mean to match a given
occurrence with a headword (lemma) in a print dictionary entry (even if
this dictionary has been converted to an electronic form). I wonder if
"to lemmatize" will not come to also mean, in general terms, "to point
towards linguistic resources".
The challenge of course in the electronic medium is to create pointers and
resources that do not overwhelm the user and to create interfaces that
allow users to easily access more of the richly encoded information should
they so desire. If we are to avoid the type of interface that replicates
the badgering assistant which is either on (and excessively intrusive) or
off (and removed from the user's mind), we might do well to ponder how
mildly sophisticated users read print, which does not appear to rely on the
"servant" model. Of course, there are many scholarly productions in print
filled with abbreviations well known to the expert and a mystery to the
novice and only unravelled with the help of a patient librarian.
Will the promise of the "shapable" electronic text that is gearable
to both novice and expert be worth the investment? The larger question is
of course "scholarship for whom?"
Francois Lachance, Scholar-at-large
some threads tangle in tassels, others form the weft