10.0607 anaphora in Madrid

WILLARD MCCARTY (willard.mccarty@kcl.ac.uk)
Fri, 17 Jan 1997 21:25:50 +0000 (GMT)

Humanist Discussion Group, Vol. 10, No. 607.
Center for Electronic Texts in the Humanities (Princeton/Rutgers)
Centre for Computing in the Humanities, King's College London
Information at http://www.princeton.edu/~mccarty/humanist/

[1] From: Ruslan Mitkov <r.mitkov@wlv.ac.uk> (149)
Subject: ACL'97 / EACL'97 workshop on anaphora

ACL'97 / EACL'97 Workshop
Madrid, Spain



After considerable initial research in algorithmic approaches to
anaphora resolution in the seventies and after years of relative silence
in the early eighties, this problem has again attracted the attention of
many researchers in the last 10 years, with much new and promising work
reported recently. Inspired by the increasing volume of such work, this
workshop calls for submissions describing recent advances in the field
and focusing on "robust", "parser-free", "corpus-driven",
"empirically-based", and/or other practical approaches to resolving
anaphora in unrestricted texts.

Strategies for algorithmic anaphora resolution---arguably among the
toughest problems in Computational Linguistics and Natural Language
Processing---so far have exploited predominantly traditional linguistic
approaches. A disadvantage, however, of implementing such approaches
stems from the need for representation and manipulation of the
variegated types of linguistic and domain knowledge, with the
concomitant expense of human input and computational processing. Even
so, effectiveness still tends to depend on imposing suitable
restrictions on the domain.

While various new alternatives have been proposed, e.g. making use of a
situation semantics framework or principles of reasoning with
uncertainty, there is still a strong need for the development of robust
and effective methods to meet the demand of practical NLP systems (with
tasks ranging from content analysis to machine translation to discourse
and dialogue processing), and to enhance further the automatic
processing of growing language resources (e.g. by automatically
annotating corpora with anaphor-antecedent links). This need for
inexpensive, practical and, possibly, corpus-related approaches suitable
for unrestricted texts has fuelled renewed research efforts in the
field. Several proposals have already addressed the anaphora resolution
problem by deliberately limiting the extent to which they rely on domain
and/or linguistic knowledge, and by moving away from the traditional
domain/sublanguage restriction. Given the very clear trend towards
inexpensive, knowledge-poor, corpus-based methods that remain robust
and scale well, there is scope for much more to be done in this
direction.

A core issue here is that of optimal use of a set of contributing
factors: these include, for instance, gender and number agreement,
c-command constraints, semantic consistency, syntactic parallelism,
semantic parallelism, salience, proximity and so forth. It is possible
to impose an ordering on such factors, with respect to both their
overall utility to the resolution process, and the expense associated
with their computation in a particular linguistic framework and
processing environment. The computational linguistics literature uses
diverse terminology for these, reflecting their different operational
status and, hence, contributing weight in the resolution process: for
instance, "constraints" tend to be absolute, and therefore
"eliminating"; "preferences", on the other hand, tend to be relative,
and therefore require the use of additional criteria. One of the major
difficulties with scaling up the strong, linguistically derived
procedures to real data stems from the lack of systematic understanding
of the interactions between, and limitations of, the plethora of factors
posited by the different methods under names such as "constraints",
"preferences", "attributes", "symptoms", and so forth.
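
As an illustration only, the distinction between eliminating
"constraints" and relative "preferences" can be sketched as a two-stage
procedure: constraints filter the candidate antecedents, and preferences
rank whatever survives. The sketch below is a minimal one under stated
assumptions and does not describe any particular published system. The
Mention record, the choice of factors (agreement as the constraint;
proximity and subjecthood as a crude stand-in for salience) and the
weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """Hypothetical record for an anaphor or a candidate antecedent."""
    text: str
    gender: str        # "m", "f" or "n"
    number: str        # "sg" or "pl"
    position: int      # token offset in the text
    is_subject: bool = False


# Illustrative weights only; they are assumptions, not empirical values.
WEIGHTS = {"proximity": 1.0, "salience": 2.0}


def agrees(anaphor, candidate):
    """Constraint: gender and number agreement (absolute, eliminating)."""
    return (anaphor.gender == candidate.gender
            and anaphor.number == candidate.number)


def preference_score(anaphor, candidate):
    """Preferences: relative factors combined into a weighted sum."""
    distance = anaphor.position - candidate.position  # > 0 for antecedents
    proximity = 1.0 / distance
    salience = 1.0 if candidate.is_subject else 0.0
    return (WEIGHTS["proximity"] * proximity
            + WEIGHTS["salience"] * salience)


def resolve(anaphor, candidates):
    """Return the best-scoring candidate that passes the constraint."""
    preceding = [c for c in candidates if c.position < anaphor.position]
    survivors = [c for c in preceding if agrees(anaphor, c)]
    if not survivors:
        return None
    return max(survivors, key=lambda c: preference_score(anaphor, c))


candidates = [
    Mention("the report", "n", "sg", 1, is_subject=True),
    Mention("the authors", "n", "pl", 4),
    Mention("the table", "n", "sg", 6),
]
it = Mention("it", "n", "sg", 9)
print(resolve(it, candidates).text)  # the report
```

Here "the authors" is eliminated outright by the agreement constraint,
while the choice between "the report" and "the table" is settled only by
the weighted preferences; with different weights the nearer candidate
could win instead.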

This workshop, therefore, has a dual focus. It solicits submissions
describing work which addresses the practical requirements of
operational and robust anaphora resolution components. It also seeks to
investigate the role of, and interactions among, the various factors in
anaphora resolution: in particular those that scale well, or that
translate easily to knowledge-poor environments. The following
questions are for illustrative purposes only:

= Is it possible to propose a core set of factors used in anaphora
resolution? Are there factors that we are not fully aware of? Which
of these are better suited for robust approaches, and what is their
dependence upon strategies?
= When dealing with real data, is it at all possible to posit
"constraints", or should all factors be regarded as "preferences"?
What is the case for languages other than English?
= What degree of preference (weight) should be given to "preferential"
factors? How should weights best be determined? What empirical data
can be brought to bear on this?
= What would be an optimal order for the application of multiple
factors? Would this affect the scoring strategies used in selecting
the antecedent?
= Is it realistic to expect high precision over unrestricted texts?
= Is it realistic to determine anaphoric links in corpora automatically?
= Are all CL applications 'equal' with respect to what they require
of an anaphora resolution module? What kind(s) of compromises
might be possible, depending on the NLP task, and how would
awareness of these affect the tuning of a resolution algorithm for
particular type(s) of input text?


Dr. Ruslan Mitkov
School of Languages and European Studies
University of Wolverhampton
Stafford St.
Wolverhampton WV1 1SB
United Kingdom
Tel: (44-1902) 322471
Email: r.mitkov@wlv.ac.uk

Dr. Branimir K. Boguraev
Apple Research Laboratories
Apple Computer, Inc.
One Infinite Loop, MS: 301-3S
Cupertino, CA 95014
USA
Tel: (1-408) 974 1048
Email: bkb@research.apple.com


Breck Baldwin (University of Pennsylvania)
Branimir Boguraev (Apple Computer, Cupertino)
David Carter (SRI, Cambridge)
Megumi Kameyama (SRI, Menlo Park)
Christopher Kennedy (University of California, Santa Cruz)
Shalom Lappin (University of London)
Tony McEnery (Lancaster University)
Ruslan Mitkov (University of Wolverhampton)
Celia Rico Perez (University Alfonso X el Sabio, Madrid)
Frederique Segond (Rank Xerox Research Centre, Grenoble)
Sandra Williams (BT Research Labs, Ipswich)


Authors are asked to submit previously unpublished papers; all
submissions should be sent to Ruslan Mitkov. A limited number of
position papers may also be considered. Each submission will undergo
multiple reviews. Papers should be full length (not exceeding 3200
words, exclusive of references) and should include a descriptive
abstract of about 200 words. Electronic submissions are strongly preferred,
either in self-contained LaTeX format (using the ACL-97 submission
style; see: ftp://ftp.cs.columbia.edu/acl-l/, as well as the submission
guidelines for the main conference, at http://www.ieec.uned.es/cl97/),
or as a PostScript file. In exceptional circumstances, Microsoft Word
files will also be accepted as electronic submissions, provided they
follow the same formatting guidelines. Hard copy submissions should
include eight copies of the paper. A separate title page should include
the title of the paper and the names, addresses (postal and e-mail), and
telephone and fax numbers of all authors. Any correspondence will be addressed to
the first author (unless otherwise specified). Authors will be
responsible for preparation of camera-ready copies of final versions of
accepted papers, conforming to a uniform format, with guidelines and a
style file to be supplied by the organisers.


Presentations will be allocated 30-minute slots each, distributed over
a morning and an afternoon session, which will also include an invited
talk and a (closing) general discussion.


Due to space constraints, workshop attendance will be limited to about
40 participants. Priority will be given to authors of submissions; the
rest of the participants will be registered on a first-come, first-served
basis. Details about registration will be included in the second
announcement. Please note that according to the ACL/EACL workshop
guidelines, all workshop participants must register for the ACL/EACL
main conference as well.


Submission deadline: 14 March 1997
Notification of acceptance: 14 April 1997
Camera-ready versions of accepted papers due: 05 May 1997
Workshop: 11 or 12 July 1997


For further information concerning the workshop, please contact the
organisers. For information about the main ACL'97/EACL'97 conference,
see http://horacio.ieec.uned.es/cl97/.