13.0119 ELRA news; AMICO & OhioLINK

Humanist Discussion Group (humanist@kcl.ac.uk)
Tue, 3 Aug 1999 07:12:21 +0100 (BST)

Humanist Discussion Group, Vol. 13, No. 119.
Centre for Computing in the Humanities, King's College London

[1] From: Willard McCarty <willard.mccarty@kcl.ac.uk> (138)
Subject: ELRA news

[2] From: NINCH-ANNOUNCE <david@ninch.org> (141)
Subject: AMICO & OhioLINK Reach Agreement: Museum multimedia on
campus networks

Date: Tue, 03 Aug 1999 07:11:33 +0100
From: Willard McCarty <willard.mccarty@kcl.ac.uk>
Subject: ELRA news

>> From: Valerie Mapelli <mapelli@elda.fr>

European Language Resources Association

*** Dutch PAROLE Corpus and Lexicon ***

We are happy to announce the availability of the Dutch PAROLE resources via


The LE-PAROLE project (MLAP/LE2-4017) aims to offer a large-scale harmonised
set of "core" corpora and lexica for all European Union languages.

Language corpora and lexica were built according to the same design and
composition principles, in the period 1996-1998.

More details on the PAROLE project at:
(on the Dutch PAROLE corpus and lexicon, see: http://www.inl.nl)

2) ELRA-W0019 Dutch PAROLE Distributable Corpus

The Dutch PAROLE Distributable Corpus is a 3-million-word selection
from the 20-million-word Dutch PAROLE Reference Corpus.

The annotation and checking of the Dutch corpus were carried out according
to the common core PAROLE tagset. The Dutch data were also checked for type.

The Dutch PAROLE Distributable Corpus contains the following texts:

Text                            Period          Words

Van Sterkenburg:
  Wdlijst tot wdboek            1984           65,344
  Taal vt Journaal              1989           56,215
  WNT-portret                   1992           60,133

Short Newspaper texts:
  MN_Collection                 1986-1988      19,537
  CVNP(S)-Collection            1983-1990     179,220

PERIODICAL Short texts from
  - Local Papers                1985-1988      47,019
  - Magazines                   1985-1989     164,589

Texts to be read out in
TV-news broadcasts for:
  - General audience            1992-1995   1,285,824
  - Youth                       1991-1995   1,008,658

Short texts from
  Ephemera                      1985-1986     131,692

TOTAL                                       3,018,231
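
As a quick arithmetic check, the per-text word counts listed above can be
summed and compared with the stated total. The Python sketch below does
this; the dictionary keys are shortened labels introduced here for
illustration, while the counts are copied from the list.

```python
# Word counts copied from the corpus composition list above.
# The keys are shortened labels chosen for this sketch, not official names.
word_counts = {
    "Wdlijst tot wdboek (1984)": 65_344,
    "Taal vt Journaal (1989)": 56_215,
    "WNT-portret (1992)": 60_133,
    "MN_Collection (1986-1988)": 19_537,
    "CVNP(S)-Collection (1983-1990)": 179_220,
    "Local Papers (1985-1988)": 47_019,
    "Magazines (1985-1989)": 164_589,
    "TV news, general audience (1992-1995)": 1_285_824,
    "TV news, youth (1991-1995)": 1_008_658,
    "Ephemera (1985-1986)": 131_692,
}

STATED_TOTAL = 3_018_231  # the TOTAL line of the announcement

computed_total = sum(word_counts.values())
tv_news = sum(n for label, n in word_counts.items() if label.startswith("TV news"))

print(f"computed total: {computed_total:,}")
print(f"TV-news share:  {tv_news / computed_total:.1%}")
assert computed_total == STATED_TOTAL
```

The counts add up to the stated total, and the two TV-news components
together account for roughly three quarters of the distributable corpus.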

Over 250,000 words of corpus texts have been PoS-tagged automatically. A
total of 59,798 running words has been manually corrected and checked at
least two times with respect to maximal granularity, according to a
lexicographer's manual. The extra 9,000 words over the required 50,000
words compensate for the occurrence of ca. 5,300 'keywords' in the
original texts. The fully corrected material has been subjected to an
automated post-control operation, checking the pertinence relations between
the various feature values, and instantiating default values in case a
mismatch (indicating a correction error) was found. Ca. 200,000 words have
been checked once for PoS and type. In addition to the required PoS, type
was checked for reasons of quality. This material has been subjected to an
automated correction procedure addressing the feature slots (positions)
beyond the first two for PoS and type so as to solve discrepancies between
the manually corrected PoS and type, and the possibly erroneous,
automatically assigned values of the remaining slots.

Special price for academic users from the Netherlands and Belgium: 150 EURO
(the data will be supplied directly by the Instituut voor Nederlandse
Lexicologie, http://www.inl.nl)

Price for ELRA members
For academic use: 270 EURO
For research use by a commercial organisation: 800 EURO
For commercial use: 1600 EURO

Price for non members
For academic use: 300 EURO
For research use by a commercial organisation: 1300 EURO
For commercial use: 2500 EURO


3) ELRA-L0031 Dutch PAROLE lexicon

The entry list of the lexicon consists of about 20,200 entries distributed
over 13 parts of speech (POS). The entries have been described along the
dimensions of morphosyntax and syntax. Morphosyntactic information consists
of various lexical properties, like gender, number, case, person,
inflection, etc. Syntactic descriptions consist of typical complementation
patterns associated with the various lemmata.

The composition of the entry list of the lexicon is based on 3 corpora from
the Instituut voor Nederlandse Lexicologie (INL) and 2 lexica. The corpora
contain a total of about 54 million words and have been automatically
annotated for part-of-speech and lemma. The lexica contain morphosyntactic
information of various kinds. For verbs, nouns, adjectives and adverbs,
lemmata that were covered by at least 2 corpora and the 2 lexica were
selected on the basis of cumulative frequency, coverage (distribution over
sources) and inflected forms. For the smaller parts of speech, these
selection requirements appeared to be too strict. Entry selection for these
parts of speech was based on ranked frequency.

The entries, uniquely defined by the combination of part of speech (e.g.
noun) and subtype (e.g. common vs. proper noun), are provided with
morphosyntactic information according to the Dutch set of PAROLE categories
and features, and, where available, with syntactic information.
Morphosyntactic information is automatically extracted from the INL lexica.
Syntactic data have been collected manually, by inspection of corpus data
and - where necessary - consultation of reference works. The corpus
consulted consists of the newspaper component and the varied component of
the 38 Million Words Corpus 1996.

Word forms in the Dutch PAROLE lexicon are not inflected according to
general paradigms, but are related to their lemma by a set of string
procedures. These procedures are not unique. They can be shared by many
other word forms. An example is suffixation with e for adjectives, which
produces 'goede'/good from 'goed'. Inflected forms can be derived
directly by applying the string procedures to the lemma they are connected
to.
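
The idea of shared string procedures can be illustrated with a minimal
Python sketch. It is not the lexicon's actual SGML encoding: the class
name StringProcedure, its strip/add fields and the small lexicon
dictionary are hypothetical, and the lemma 'klein' is an illustration
added here alongside the 'goed' example from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StringProcedure:
    """Hypothetical model of one string procedure: strip a suffix, then append one."""

    strip: str  # suffix removed from the lemma ("" removes nothing)
    add: str    # suffix appended afterwards

    def apply(self, lemma: str) -> str:
        if not lemma.endswith(self.strip):
            raise ValueError(f"{lemma!r} does not end in {self.strip!r}")
        stem = lemma[: len(lemma) - len(self.strip)]
        return stem + self.add


# One procedure object: suffixation with "e" for adjectives.
SUFFIX_E = StringProcedure(strip="", add="e")

# Each lemma points at the procedures that yield its inflected forms.
# Both lemmata share the same procedure object rather than owning a copy.
lexicon = {"goed": [SUFFIX_E], "klein": [SUFFIX_E]}

inflected = {
    lemma: [proc.apply(lemma) for proc in procs]
    for lemma, procs in lexicon.items()
}
print(inflected)  # {'goed': ['goede'], 'klein': ['kleine']}
```

Storing a reference to a shared procedure, instead of listing every
inflected form or assigning each lemma to a general paradigm, is one way
to read the statement that the procedures "are not unique".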

The lexicon is set up as an SGML file (over 30 MB of plain ASCII). Its
contents have been encoded in a distributed manner: all formative entities
(like lemmata, syntactic phrases, feature bundles) are SGML entities,
related by a pointer mechanism to other entities.

The lexicon contains the following categories: adjectives (3,298 entries),
adpositions (80 entries), adverbs (554 entries), articles (3 entries),
conjunctions (70 entries), determiners (59 entries), interjections (235
entries), nouns (12,279 entries), numerals (77 entries), pronouns (85
entries), residuals (186 entries), unique (1 entry), verb (3,274 entries).
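
The category counts above can be checked against the figures quoted
earlier ("about 20,200 entries distributed over 13 parts of speech").
The following Python sketch simply sums them; the dictionary is an
illustration built from the numbers in this announcement.

```python
# Entry counts per category, copied from the list above.
entries_per_category = {
    "adjectives": 3_298,
    "adpositions": 80,
    "adverbs": 554,
    "articles": 3,
    "conjunctions": 70,
    "determiners": 59,
    "interjections": 235,
    "nouns": 12_279,
    "numerals": 77,
    "pronouns": 85,
    "residuals": 186,
    "unique": 1,
    "verbs": 3_274,
}

total_entries = sum(entries_per_category.values())
noun_share = entries_per_category["nouns"] / total_entries

print(f"categories:    {len(entries_per_category)}")
print(f"total entries: {total_entries:,}")
print(f"noun share:    {noun_share:.1%}")
```

The sum is 20,201 entries over 13 categories, which agrees with the
rounded figure given in the description; nouns make up about three
fifths of the entry list.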

Special price for academic users from the Netherlands and Belgium: 200 EURO
(the data will be supplied directly by the Instituut voor Nederlandse
Lexicologie, http://www.inl.nl)

Price for ELRA members
For academic use: 300 EURO
For research use by a commercial organisation: 1600 EURO
For commercial use: 8000 EURO

Price for non members
For academic use: 400 EURO
For research use by a commercial organisation: 3000 EURO
For commercial use: 10000 EURO

In case of potential cooperation between a user and the Instituut voor
Nederlandse Lexicologie with mutual revenues, specific conditions will apply.

Note: The prices of the Dutch PAROLE corpus and lexicon have been amended since
their publication in the last ELRA Newsletter (Vol. 4, No. 2).

For further information, please contact:

ELRA/ELDA                        Tel:    +33 01 43 13 33 33
55-57 rue Brillat-Savarin        Fax:    +33 01 43 13 33 30
F-75013 Paris, France            E-mail: mapelli@elda.fr

or visit our Web site:


Date: Tue, 03 Aug 1999 07:12:14 +0100
From: NINCH-ANNOUNCE <david@ninch.org>
Subject: AMICO & OhioLINK Reach Agreement: Museum multimedia on campus networks

News on Networking Cultural Heritage Resources
from across the Community

August 2, 1999

>Date: Mon, 2 Aug 1999 16:12:13 -0400
>To: mcn-l@mcn.edu
>From: "J. Trant" <jtrant@archimuse.com>
AMICO Press Release
July, 1999

The Art Museum Image Consortium and OhioLINK Reach
An Agreement on Statewide Distribution of the AMICO Library

AMICO Headquarters; Pittsburgh, PA

The Art Museum Image Consortium (AMICO), a growing not-for-profit
consortium currently made up of 27 museum members in North America, has
reached a distribution agreement with the Ohio Library and Information
Network (OhioLINK), a consortium of Ohio's college and university libraries
and the State Library of Ohio. Through this agreement students,
professors, and staff at 17 public universities, 23 community/technical
colleges, and 35 private colleges in the state of Ohio will have access to
the AMICO Library through OhioLINK's Digital Media Center starting in the
fall of 1999. "OhioLINK already has an established expertise in delivering
library resources such as the AMICO Library, so the fit was really natural
for us," commented AMICO Executive Director, Jennifer Trant. "With this
Agreement the broad, diverse community of OhioLINK institutions have full
access to the AMICO Library through a familiar portal. Our hope is that
our relationship with OhioLINK will become a model for similar statewide
distribution agreements," stated Ms. Trant.

The 1999 edition of the AMICO Library documents over 50,000
different works of art, from prehistoric goddess figures to contemporary
installations. More than simply an image database, works in the AMICO
Library are fully documented and may also include curatorial text about the
artwork, detailed provenance information, multiple views of the work
itself, and other related multimedia. "The AMICO Library is a welcome
addition to our digital resources collection because it will expand the
Digital Media Center with a rich image and multimedia database focused on
art objects," states Charly Bauer, Assistant Director of Library Systems -
Digital Media. Additionally, the Cleveland Museum of Art (CMA), an AMICO
Member, looks forward to this agreement building bridges to Ohio professors
and students. "With this agreement providing AMICO Library access to so
many universities and colleges across Ohio it's akin to having a traveling
exhibition of our permanent collection visiting each school for an entire
year," observes Stephanie Stebich in the Director's Office of the CMA. She
goes on to say, "We hope this added exposure to the museum's fine works
will enhance users' knowledge and draw visitors in the museum itself."

AMICO envisions the Library functioning in innovative ways that
traditional resources cannot. For instance, students may curate online
exhibitions using AMICO images, professors could give "on the fly" lectures
searching the AMICO Library in real-time class settings, restricted-access
course web sites could be created for review purposes with AMICO images
incorporated in them, and much more. To investigate how the AMICO Library
may be used in educational institutions, AMICO has just completed a yearlong
University Testbed with 16 universities across the United States and
Canada. A summary of many Testbed projects may be found on the AMICO web
site. "The AMICO
Library should be quite useful to our member institutions and a great
complement to the prodigious resources that our establishment already
provides. We hope that educators and students from many disciplines will
see the creative possibilities of the Library and infuse their educational
efforts in a new way," said Charly Bauer.

The AMICO Library is accessible over secure networks on an
institutional subscriber basis. Images of artworks from museums such as The
Metropolitan Museum of Art, the National Gallery of Canada, the Art
Institute of Chicago, and the J. Paul Getty Museum are included in the
AMICO Library. A recent agreement with the Artists Rights Society provides
AMICO Library users unprecedented access to modern and contemporary works.
Interested users may preview a Thumbnail Catalog of the AMICO Library and
get further information at

The Ohio Library and Information Network, OhioLINK, is a consortium
of Ohio's college and university libraries and the State Library of Ohio.
Serving more than 500,000 students, faculty, and staff at 76 institutions,
OhioLINK offers access to more than 31 million library items statewide.
OhioLINK also provides access to 95 research databases, and many full-text
resources. Through OhioLINK's Electronic Journal Center, users have access
to more than 2,400 electronic journal subscriptions and over one million
journal articles. OhioLINK also offers user-initiated online borrowing,
the ability to electronically request items while searching the OhioLINK
central catalog, and a delivery service among member institutions to speed
the exchange of library items. To date, the OhioLINK central catalog
contains more than 7 million master records from its 76 institutions,
encompassing a spectrum of library material including law, medical, and
special collections.

OhioLINK's Digital Media Center will provide access to images,
audio, video, and other types of digital information in a variety of
disciplines such as art and architecture, medicine, and geography. The
Digital Media Center will serve as a publishing outlet for OhioLINK members
to contribute digital resources from their own unique collections.

The AMICO Library is a product of the Art Museum Image Consortium
(AMICO). Founded in October 1997 as a program of the Association of Art
Museum Directors (AAMD) Educational Foundation, Inc., AMICO was separately
incorporated as an independent non-profit corporation in June of 1998,
ending its direct connection with the AAMD. The Consortium is today made
up of 27 major museums in North America and is open to interested
institutions with a collection of art. Its innovative collaboration
shares, shapes, and standardizes information regarding visual data
collections and enables its educational use. A full list of members can be
found at <http://www.amico.org>.

Contact Information:

Jennifer Trant
Executive Director
Art Museum Image Consortium
2008 Murray Avenue, Suite D
Pittsburgh, PA 15217
Phone (412) 422 8533
Fax (412) 422 8594
Email: jtrant@amico.org

Charly Bauer
Assistant Director of Library Systems - Digital Media
Ohio Library & Information Network
2455 North Star Road, Suite
Columbus, OH 43221
Phone (614) 728 3600 ext. 338
Fax (614) 728-3610
Email: charly@ohiolink.edu

NINCH-Announce is an announcement listserv, produced by the National
Initiative for a Networked Cultural Heritage (NINCH), a diverse coalition
of arts, humanities and social science organizations created to assure
leadership from the cultural community in the evolution of the digital

The subjects of these announcements are not, unless otherwise noted, the
projects of NINCH; neither does NINCH necessarily endorse the subjects of

We attempt to credit all re-distributed news and announcements and
appreciate reciprocal credit.

For questions, comments or requests to un-subscribe, contact the editor:

David L. Green
Executive Director
21 Dupont Circle, NW
Washington DC 20036
202/296-5346 202/872-0886 fax

See and search back issues of NINCH-ANNOUNCE at

Humanist Discussion Group
Information at <http://www.kcl.ac.uk/humanities/cch/humanist/>