4.1048 TEI Progress Report, Feb 1991 (1/256)

Elaine Brennan & Allen Renear (EDITORS@BROWNVM.BITNET)
Sun, 17 Feb 91 21:20:23 EST

Humanist Discussion Group, Vol. 4, No. 1048. Sunday, 17 Feb 1991.

Date: Thu, 14 Feb 91 18:19:18 CST
From: Michael Sperberg-McQueen 312 996-2477 -2981 <U35395@UICVM.BITNET>
Subject: TEI progress report, February 1991
Original to: Text Encoding Initiative public discussion list <TEI-L@UICVM>
Forwarded by: Allen Renear <allen@brownvm.brown.edu>

[ This was posted to TEI-L (Entry #243). I think it will interest
many Humanists who are not subscribed to that list. -- Allen]

-----------------------------------------------------------------

... [this is] ... a brief summary of what has been happening
in the TEI since the distribution of the first Draft Guidelines last fall.
Sincere apologies to those who feel such a report is long overdue!

1. TEI Deliverables

1.a. Documents

First, a brief recap on the project's overall timescale and objectives.
What will the TEI deliver in June 1992, when the funding dries up? It
seems clear that a single massive report (a revised and extended version
of the current document TEI P1) will not be enough. The need for a
brief introductory guide, setting out the basic TEI framework and
philosophy, has been repeatedly pointed out to us, sometimes privately
and often publicly, as has the pressing need for tutorial material, and
for demonstrations of TEI encoded texts in action. No effort was put
into producing these in the first cycle, for the good reason that we did
not at that time know what exactly we would be providing an introductory
guide to! Now that the basic TEI framework is a little less nebulous,
it seems appropriate to address these problems.

Preparations for the forthcoming TEI Workshop at Tempe will provide one
important source of such materials, and input from the affiliated
projects another. It's possible that readers of this list may also have
prepared some summary or explanatory material which might be of use --
don't be shy about letting us know about it, if you have. (For
starters, we were recently delighted to receive a translation into
Hungarian of the four-page `executive summary' of P1.)

1.b. Software -- a non-deliverable

After tutorial and introductory materials the most frequently expressed
desire at present seems to be for TEI-conformant software: systems which
behave like the analytic packages we all know and love, but can also
take advantage of the new capabilities offered by SGML. As a first
step, we need programs (filters, as they are known in the trade) to
translate from the TEI encoding scheme to those required by the
application programs we use, and back in the other direction. For
rolling one's own software, the community needs generally available
routines which can read and understand TEI documents and which can be
built into software individuals or projects develop for themselves or
others (TEI parsers). Equally important for the usability of the
encoding scheme in the community at large will be TEI-aware data-entry
software -- editors and word processors which can exploit the rich text
structure provided by SGML, simple routines to allow TEI tags to be
entered into a text with a keystroke or two instead of ten or twenty (or
in extreme cases even more!), and other tools to help make new texts in
the form recommended by the TEI.
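
As a rough illustration of the simplest kind of filter meant here, the
sketch below (in Python, and not part of any TEI deliverable) strips
SGML-style tags from a fragment so that it can be fed to a package which
expects untagged text. The element names <p> and <emph>, the function
name strip_tags, and the regular-expression approach are all assumptions
made for the example; a real filter would need a proper SGML parser and
would have to decide what to do with the information carried by the tags.

```python
import re

# Hypothetical pattern: matches a start- or end-tag such as <p>, </p>, <emph>.
# It does not handle comments, entity references, or SGML tag minimization.
TAG = re.compile(r"</?([A-Za-z][A-Za-z0-9.-]*)[^>]*>")


def strip_tags(tagged: str) -> str:
    """Turn each </p> end-tag into a paragraph break and drop all other tags."""

    def replace(match: re.Match) -> str:
        is_end_tag = match.group(0).startswith("</")
        if is_end_tag and match.group(1).lower() == "p":
            return "\n\n"  # keep the paragraph boundary as a blank line
        return ""  # every other tag is simply discarded

    text = TAG.sub(replace, tagged)
    # Normalize whitespace inside each paragraph and drop empty paragraphs.
    paragraphs = [" ".join(p.split()) for p in text.split("\n\n")]
    return "\n\n".join(p for p in paragraphs if p)


sample = "<p>A <emph>fine</emph> day.</p><p>Next paragraph.</p>"
print(strip_tags(sample))
```

A filter in the other direction (from an application's own format into
TEI encoding) is the harder half of the job, since it has to supply
structure that the source format may record only implicitly.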

Approximations to some of these are already available, and we hope to be
demonstrating some of them at the Tempe Workshop. As we have often
said, the TEI is not in the business of software development:
nevertheless, it's clear that when any opportunity of steering software
developers into channels likely to benefit the TEI community presents
itself, we'd be foolish not to take it. So far, only encouraging noises
have been heard from most, but products like DynaText (from Electronic
Book Technologies) are a clear indication of the kinds of software we
should expect to be able to choose amongst by the time the project ends.
The Metalanguage Committee has accepted a `watching brief' to monitor
and report on the features of commercially available SGML software, and
has already produced a preliminary working paper (ML P28) which lists
several products of interest to the TEI community, as well as a revised
and expanded version of Robin Cover's monumental bibliography of
SGML-related information (ML W14). (These are not yet publicly available; ML
P28 is being revised to correct a slip or two, and ML W14 will be put on
the TEI-L file server just as soon as we can sweet-talk the UIC system
management into the necessary megabyte or so of disk space and move the
data to Chicago from Kingston.)

1.c. And more documents

Just as many people have asked for some description of TEI encoding less
technical and formal than TEI P1, so also some have asked for a more
formal treatment of the scheme, so that it would be easier to write the
TEI-conformant software they'd like to develop. In this connection,
some work is proceeding (slowly!) on a formal presentation of the subset
of SGML required by the TEI. The Metalanguage Committee is also working
on a more explicit definition of the notion 'TEI conformance'; this
concept was intentionally left vague in the first draft, but it appears
that such vagueness has less to recommend it than we thought.

2. TEI Workplans

If we're not producing any software, and only grudgingly getting round
to explaining the work done in the first cycle, what, you might
reasonably enquire, are we in fact doing? The major objective during
the second funding cycle will be to extend the scope and coverage of the
Guidelines. Those who have read P1 closely will be aware, as we are, of
the very large number of topics sketched out, adumbrated or downright
neglected therein. We remain confident that P1 provides a good general
framework for most forms of text-based scholarship, but we need to put
this claim to the test in more (and more different) areas of
specialisation than was possible during the first cycle.

How will this be done? One way, as we've already indicated, will be
through the testing of the Guidelines in a practical situation which the
Affiliated Projects will carry out. The other will be through the
setting-up of a number of small but tightly-focussed working groups to
make recommendations in specified areas, either directly where an area
is already well-defined, or indirectly by sketching out a problem domain
and proposing other work groups which need to be set up within it. Each
work group will be given a specific charge and will work to a specified
deadline. So far, about a dozen such groups have been set up, most of
which are due to report back by the end of March. The currently active
work groups and their heads are listed below:

TR1: Character sets (Harry Gaylord, University of Groningen)
TR2: Text criticism (Robert Kraft, University of Pennsylvania)
TR3: Hypertext and hypermedia (Steven DeRose, EBT)
TR4: Mathematical formulae and tables (Paul Ellison, University
of Exeter)
TR6: Language corpora (Douglas Biber, Northern Arizona University)
AI1: General linguistics (Terry Langendoen, University of Arizona)
AI2: Spoken texts (Stig Johansson, University of Oslo)
AI3: Literary studies (Paul Fortier, University of Manitoba)
AI4: Historical studies (Daniel Greenstein, University of Glasgow)
AI5: Machine-readable dictionaries (Robert Amsler, Mitre Corporation)
AI6: Computational lexica (Robert Ingria, BBN)

Each group is formally assigned to one of the two major working
committees of the TEI, depending on whether its work is primarily
concerned with Text Representation (TR) or Text Analysis and
Interpretation (AI). These two committees will then review and endorse
the findings of each work group, though we expect that for some areas we
will also seek expert outside reviewers, perhaps with the assistance of
the Advisory Board.

A number of other work group topics have already been identified and
are in the process of being set up. These include the following:

TR5: Newspapers
TR7: General reference works
TR8: Physical description of manuscripts and incunabula
TR9: Analytic bibliography
AI7: Terminological data

For some of these we have already identified suitably qualified members;
for others (in particular the first two)
* * * * * * * * * * * * * * * * * * * * * * * * *
* we are soliciting volunteers or nominations. *
* * * * * * * * * * * * * * * * * * * * * * * * *
If there is an area of textual scholarship which you feel has been
unjustly neglected by the current draft, please don't hesitate to let us
know about it! Among other areas already proposed for consideration
are

- version control and the gradual enrichment of
machine-readable texts
- ephemera (tickets, matchbooks, advertising)
- fragmentary ancient media (potshards, inscriptions etc.)
- emblems (both isolated and libri emblematum)

A meeting was held in Oxford in early December for the heads of all
then-constituted work groups, and some work groups are already well
advanced in their work. As reports become available, their existence
will be publicized on this list and elsewhere. (You have already seen
one working paper produced by the work group on literary studies.) In
addition, of course, we will be making a full TEI progress report at the
Tempe conference.

3. TEI Working Documents

We are in the process of revising and making more accessible the TEI
document register at Chicago, which holds information about all
TEI-related working papers, reports and publications. Wherever
possible, we will try to make sure that finalized reports of general
interest are posted on this ListServ in the usual way. To find out what
is currently available, send a note to LISTSERV@UICVM containing the
line GET TEI-L FILELIST. Specific documents can be requested in the
same way, or by contacting Wendy Plotkin (U49127@UICVM) who looks after
the register.
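
For readers who would rather script the request than type it, here is a
minimal sketch in Python of composing such a note. The command line GET
TEI-L FILELIST is taken from the paragraph above; the fully qualified
address LISTSERV@UICVM.BITNET, the sender address, and the function name
are assumptions made for the example, and actually sending the message
(which depends on your local mail system) is left out.

```python
from email.message import EmailMessage


def listserv_request(command: str, sender: str) -> EmailMessage:
    """Build a mail message whose body is a single LISTSERV command line."""
    msg = EmailMessage()
    msg["From"] = sender
    # Assumption: UICVM is reachable as UICVM.BITNET from the sender's site.
    msg["To"] = "LISTSERV@UICVM.BITNET"
    # The command goes in the body of the note, one command per line.
    msg.set_content(command)
    return msg


# reader@example.org is a placeholder sender address.
request = listserv_request("GET TEI-L FILELIST", "reader@example.org")
print(request)
```

The same function could build a request for a specific document by
passing a different GET command, using a file name taken from the
file list.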

The one document most requested (P1 itself) is still, we regret, not
available in electronic form -- we just haven't buckled down to the task
of recoding its current rather esoteric markup. Please bear with us!
However, the following documents are now or will soon be available (as
are others of ephemeral or less general interest -- contact Wendy
Plotkin for a full list), some tagged in TeX, some in (an extended form
of) Waterloo or IBM GML, some without explicit tags in a form designed
for reading onscreen or simple printing:

TEI PC P1 The Preparation of Text Encoding Guidelines
(closing statement of the planning meeting in Poughkeepsie, NY,
November 1987 -- often referred to in TEI documents as the
"Poughkeepsie Principles")
TEI AB P1 Closing Statement of the Text Encoding Initiative Advisory
Board Meeting, February 1989
(just what the title says)
TEI J6 Welcome to TEI-L
TEI J10 Guide to the Structure of the TEI
(September 1989 -- now slightly out of date, since this document
doesn't cover the work groups described above)
TEI PO A1 List of Participating Organizations
TEI ED P1 Design Principles for Text Encoding Guidelines
(a statement of basic design goals for the TEI)
TEI ED P3 Theoretical Stance and Resolution of Theory Conflict
(possible outcomes in fields with competing theoretical approaches)
TEI ED W5 Tags and Features
(a stab at a basic taxonomy of tags and textual features, with the
specification of a database record design for a database of tags;
rather technical, has been described as unreadable by some readers,
as fairly useful by others)
TEI ML W13 Guidelines for TEI Use of SGML
(virtually identical with section 2.2 of TEI P1; rather technical)
TEI ML W14 SGML Bibliography (Barnard and Cover)
(very large bibliography of work on SGML and text encoding; will be
available soon electronically from TEI-L and as a tech report from
Queen's University, Ontario)
TEI AI3 W4 Literature Needs Survey Results
(responses to a survey on needs of literary scholars conducted
by the work group for literary studies)
TEI AI3 W5 The TEI Guidelines (Version 1.1): A Critique by the
Literature Working Group
(a detailed commentary on TEI P1 from the point of view of literary
scholars)
TEI AI1 W2 List of Common Morphological Features for Inclusion in TEI
Starter Set of Grammatical-Annotation Tags
(list of grammatical features and the values they may take, for the
languages of the EEC and Russian; makes no concessions for the
non-linguist and does not discuss the mechanisms required for
abbreviating grammatical annotation)
TEI AI1 W3 Feature System Declarations and the Interpretation of
Feature Structures
(technical treatment of problems arising in use of feature structures
as defined in TEI P1 chapter 6, and proposal for a method of
solving them with a specialized SGML document declaring the feature
system in use. No concessions for lack of linguistic or SGML
knowledge.)

4. A plea for help

We've said it before and we'll say it again: the TEI will only succeed
with the active critical participation of the community it aims to
serve. If you have views on any of the topics addressed by the TEI we
want to hear them. Post a note to this bulletin board, or to us
directly: we may not respond as fully or as quickly as we might wish
to, but be sure that your comments will be taken note of and forwarded
to the appropriate technical committee or work group. We are committed
to responding to and summarizing all comments on our proposals, and it is a
commitment we take very seriously indeed. (A summary of comments
received through November is in progress, as are formal replies to
them.) At the very least, we want to hear from everyone who received a
copy of TEI P1 -- so please don't forget to complete and send in the
'User Response and Comment' form that came with your copy, if you have
one!


Lou Burnard (LOU@VAX.OXFORD.AC.UK)
Michael Sperberg-McQueen (U35395@UICVM.BITNET)