4.1114 Q: Database & Collation SW for Editions (1/63)

Elaine Brennan & Allen Renear (EDITORS@BROWNVM.BITNET)
Sat, 2 Mar 91 22:10:43 EST

Humanist Discussion Group, Vol. 4, No. 1114. Saturday, 2 Mar 1991.

Date: Fri, 1 Mar 91 02:42:25 EST
From: markt@umd5.umd.edu (Mark Turner)
Subject: computerizing an edition of Shelley

I am posting this query for my colleague Neil Fraistat and his
collaborator Donald H. Reiman, who are looking for help in
computerizing a critical edition of the complete poems of
Percy Bysshe Shelley. Please send responses to
Professor Fraistat (nf5@umail.umd.edu) and to me (markt@umd5.umd.edu).

We are planning a critical edition of the complete poems of Percy Bysshe
Shelley. This decade-long, multi-volume project involves the collation
of perhaps fifteen texts or more, including several manuscripts. The
collation of a half-dozen of these variants will appear in an apparatus
at the bottom of the page; the collation of these plus the remainder
will appear in an appendix at the end of each volume. The types of
variation we want to reflect in the apparatus will differ in each of
these instances; the bottom of the page should include additions,
deletions, and changes in word choice and punctuation. The more
comprehensive historical collation in the appendix will involve these
kinds of changes plus changes in line indentation and capitalization.

We would like to automate this process as far as possible. We are
looking for information on both a database in which to store our texts
and a collation program that, optimally, will produce a virtually
camera-ready master text and apparatus.

As we see it now, the database will have about a half-dozen fields, most
of them relatively brief--including such items as an ID number field, a
variant ID field (which may be a manuscript number or an edition date),
a composition date field, a release date field, and a field for the
date/ID of entry into the system. Our text field will need to be much larger,
perhaps 2000 lines of poetry.

We are not familiar with database technology, and any suggestions on
streamlining the above would be helpful. The rationale for the ID
number field is as follows: in order to link all versions of a poem as
the same object, we need an invariant identifier. All the facts about
each poem--dates, titles, etc.--are subject to controversy. The ID
number links all incarnations of a given poem. If there are other ways
to provide such a link, we'd like to know.
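
To make the linking idea concrete, here is a minimal sketch of one
possible layout, not a recommendation of any particular product. It
assumes Python's standard sqlite3 module; the table names, column
names, and helper functions (poem, version, open_store, add_version,
versions_of) are all hypothetical. The poem table holds nothing but the
invariant ID, so every disputed fact (dates, titles) lives on the
individual version rows and can change without breaking the link.

```python
import sqlite3

# Hypothetical layout: one row per poem (ID only), one row per version.
SCHEMA = """
CREATE TABLE poem (
    poem_id INTEGER PRIMARY KEY      -- invariant identifier, carries no facts
);
CREATE TABLE version (
    version_id       INTEGER PRIMARY KEY,
    poem_id          INTEGER NOT NULL REFERENCES poem(poem_id),
    variant_id       TEXT NOT NULL,  -- manuscript number or edition date
    composition_date TEXT,           -- disputed facts are stored per version
    release_date     TEXT,
    entered_on       TEXT,           -- date of entry into the system
    entered_by       TEXT,           -- ID of whoever entered it
    body             TEXT NOT NULL   -- full text, one line of verse per line
);
"""

def open_store(path=":memory:"):
    """Open (or create) the store and set up the two tables."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

def add_version(conn, poem_id, variant_id, body, **facts):
    """Store one version of a poem; unknown facts are left empty."""
    conn.execute("INSERT OR IGNORE INTO poem (poem_id) VALUES (?)", (poem_id,))
    conn.execute(
        "INSERT INTO version (poem_id, variant_id, composition_date,"
        " release_date, entered_on, entered_by, body)"
        " VALUES (?, ?, ?, ?, ?, ?, ?)",
        (poem_id, variant_id, facts.get("composition_date"),
         facts.get("release_date"), facts.get("entered_on"),
         facts.get("entered_by"), body),
    )
    conn.commit()

def versions_of(conn, poem_id):
    """Return the variant IDs of every stored version of one poem."""
    rows = conn.execute(
        "SELECT variant_id FROM version WHERE poem_id = ? ORDER BY version_id",
        (poem_id,),
    )
    return [row[0] for row in rows]
```

With this layout, retrieving every incarnation of a poem is a single
lookup on poem_id, whatever the titles or dates attached to the
individual versions.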

We'd like the collation program to be able to accept text from disk,
specifically, from our database text field, and we'd also like to have
the option of collating the text interactively, as we enter it, and
subsequently entering it into the database. The collation program will
deal with poetry, so printing and preserving line numbers is important.
There will be a number of messy problems to automate: variants with
different numbers of lines, with chunks of text missing or added;
variants which may differ radically for a series of lines.
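
As an illustration of the line-level problem only, the sketch below
collates one witness against a base text using Python's standard
difflib module. It is not a full collation program: the function name
collate and its key parameter are hypothetical, and it compares whole
lines rather than words. Entries are keyed to base-text line numbers,
and the key function lets the same routine ignore indentation and
capitalization for a foot-of-page apparatus or keep them for a fuller
historical collation.

```python
from difflib import SequenceMatcher

def collate(base, witness, key=lambda line: line):
    """Compare two texts line by line.

    Returns a list of (base_line_no, kind, base_lines, witness_lines),
    where kind is 'replace', 'delete', or 'insert'. Lines are compared
    after applying `key`, but reported exactly as they were entered.
    """
    a = base.splitlines()
    b = witness.splitlines()
    matcher = SequenceMatcher(
        None, [key(x) for x in a], [key(x) for x in b], autojunk=False
    )
    entries = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Line numbers are 1-based and refer to the base text. An insertion
        # is attached to the base line it follows (0 = before the first line).
        line_no = i1 if tag == "insert" else i1 + 1
        entries.append((line_no, tag, a[i1:i2], b[j1:j2]))
    return entries

def ignore_layout(line):
    """Hypothetical key: disregard indentation and capitalization."""
    return line.strip().lower()
```

Calling collate(base, witness) reports every difference, while
collate(base, witness, key=ignore_layout) suppresses differences that
consist only of indentation or capitalization. Added or missing chunks
of lines come back as single 'insert' or 'delete' entries, so the base
line numbering is preserved.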

We hope to be able to use scanning to convert our sources to
machine-readable copy. We'd appreciate any advice or comment on the
success of scanning. We have both Kurzweil and Opti-scan available on
campus.

We'd like to do this project on PCs. We currently employ IBM
compatibles, and, of course, it would be convenient to continue to use
them. However, if alternative hardware would greatly improve our
automation options, we will consider changing.

Thanks for any advice on automating our project. Since we expect to be
at this task for many years into the future, we would like our current
hard- and software choices to anticipate the technology to come. We'd
also like our database to be a resource for future scholarship in a
variety of areas we are not immediately involved in ourselves, such as
manufacturing concordances, doing linguistic studies, and so forth.