3.1282 preparation of e-texts (176)

Willard McCarty (MCCARTY@vm.epas.utoronto.ca)
Mon, 9 Apr 90 22:26:33 EDT

Humanist Discussion Group, Vol. 3, No. 1282. Monday, 9 Apr 1990.

(1) Date: Sat, 7 Apr 90 13:34:20 EDT (31 lines)
From: "Michael S. Hart" <HART@UIUCVMD>
Subject: Re: 3.1270 scanning and correcting (109)

(2) Date: Sat, 7 Apr 90 18:39:02 EDT (33 lines)
From: Robert Hollander <bobh@phoenix.Princeton.EDU>
Subject: Re: 3.1278 correcting and scanning (70)

(3) Date: Sat, 07 Apr 90 21:24 CST (66 lines)
From: Alvin Snider <ASNIDEPD@UIAMVS>
Subject: correcting the e-text

(4) Date: Mon, 9 Apr 90 15:29:07 EDT (20 lines)
From: "Michael S. Hart" <HART@UIUCVMD>
Subject: Re: 3.1278 correcting and scanning (70)

(1) --------------------------------------------------------------------
Date: Sat, 7 Apr 90 13:34:20 EDT
From: "Michael S. Hart" <HART@UIUCVMD>
Subject: Re: 3.1270 scanning and correcting (109)

re BobH

How long is the shortest of the six commentaries remaining to be proofread
in English? I will see if some of the Project Gutenberg people would read
it a few times each for you. (This is contrary to our stated policy: the
primary purpose of Project Gutenberg is the creation and distribution of a
set of etexts made from original works - not of commentaries).

re Amsler

I renew my comment that 99% of the work should not be undertaken to assist
1% of the user population, who will want works they can read and search in
Word, WordPerfect, WordStar, LIST, etc.

re the comment on scanning taking longer than typing.

I would suggest you contact the manufacturer of your hardware and software.
Cheap ($2,000) scanners and ($800) software have been available for several
years, which yield a time factor of 10% of what it took to do typing; while
we usually refrain from recommending hardware publicly, we have had a MAJOR
success with Apple scanners (we use a flatbed SCSI on a Mac), with OmniPage
and TextPert software for quite some time now. The time when keyboarding a
text could compete with scanning and proofreading passed several years ago
when these products first entered the market. As with all such innovations
it will take a little time before everyone can use them efficiently. Would
anyone who is unhappy with their scanning be willing to have the equipment,
and work, take a place at Project Gutenberg?

Michael S. Hart
(2) --------------------------------------------------------------43----
Date: Sat, 7 Apr 90 18:39:02 EDT
From: Robert Hollander <bobh@phoenix.Princeton.EDU>
Subject: Re: 3.1278 correcting and scanning (70)

Germaine Warkentin may have missed my posting in response to Michael
Hart's optimistic views on text-correction and, in her response to MH,
asks, "How about it, Hollander?" Since I've already had my say, I'll
merely repeat what I've already indicated: e-texts need to be as close
to totally accurate as we can make them. That would indicate that
someone has got to be responsible for their initial accuracy; and
someone (else?) has got to be in charge of the eventual further
corrections which will surely accrue. A further wrinkle. I will
bet that something like 50% of all eventual "corrections" of the
Dartmouth Dante database that will be offered by users will be
corrections of older spellings in the documents which in fact
should _not_ be changed. Someone has to have the originals available
for checking before entering these changes.

On scanning, I can report that the first document which Dartmouth
scanned, in order to test the capacity of the KDEM which D'mth had
purchased, came out gibberish after the first few pages. The folks
up there quickly learned that good scanning is labor-intensive,
requiring skilled operators (Bob Kraft knows a lot about all this).
On the other hand, if one has a text which is reasonably "legible,"
the result is generally impressive: ca. 99.99% accurate. That
does not get away from the need to proof-read, but it does greatly
speed that onerous process. There are no certain answers in all
this, even at any given moment. And since the situation of text
production is in flux, we simply need to be able to make the most
informed decisions we can.

Robert Hollander
(3) --------------------------------------------------------------73----
Date: Sat, 07 Apr 90 21:24 CST
From: Alvin Snider <ASNIDEPD@UIAMVS>
Subject: correcting the e-text

>I realize that the notion of an "established" text no longer
>has the validity (false, I agree) which it once had. But even those who
>are at work editing or studying what are now called "open texts" would,
>I believe, insist that responsibility to the original, whatever they
>determine that to be, is a scholarly necessity. Hart's concept of text
>(which I have been watching with increasing unease in his messages of
>recent days) seems to be a very different one, and surely requires
>fuller demonstration and careful argument.

For both consumers and producers of e-texts, Germaine Warkentin's
comments point the discussion in a useful direction. I doubt that
we can welcome the advent of the open text without cutting adrift
received notions of "definitive" editorial work. Breaking down the
division of textual/critical labor requires a major readjustment to
the whole concept of textual authority. Recent discussion, however,
suggests a rate of acceptable error (one per 2,000 characters) and a
dislocation of the editorial function that is bound to fuel controversy.

Of course, you can always redefine textual "error" to exclude whole
classes of anomalies. But modern editors perversely hold themselves
accountable to rigorous standards. They may differ over what constitutes
reasonable "accuracy," but once committed to a norm they don't abdicate
responsibility for its enforcement. An incorrect citation in a
scholarly article doesn't affect our sense of the article's value, as
long as we can locate cited materials. But readers trust in the
expertise of editors, bibliographers, and other "harmless drudges." In
fact, they _demand_ an obsessive care with details precisely because
they feel uneasy with their own competence to perform such work. If
Treadmill Pub. Co. produces an edition of Shakespeare that violates its
own editorial principles on nearly every page, readers who have plunked
down $50 to $100 for the book have every right to feel resentful.
E-text conversion projects face the same, if not higher, expectations
from end-users unmoved by their technical problems.

Now, you might say, if people think they want "accuracy," which is
chimerical anyway, they can always keyboard in their own "corrections"
(I can't resist pointing out again how regressive this ideal is). Yet
readers, like most people in our society, rely on professionalized
groups to perform tasks they'd rather not undertake: pitching fastballs,
programming in C, repairing auto transmissions, and editing scholarly
texts all fall into this category. Who, if anyone, finally arbitrates
the cumulative "corrections," especially in older texts where
determinations are often problematic? Since every correction requires
adjudication we haven't really disposed of authoritative editors, just
authoritative texts. What we've gained in their place is a principle of
editorial deniability, and a glimpse (by the way) of Willard McCarty's
nightmare future of semi-autonomous electronic agents devised to
circumvent human interests and values.

Even in a post-Gutenbergian cultural economy, you can expect to hear
howls of protest from readers gone soft from the luxury of using
reliable (i.e. stable) texts. Maybe what we need is a version of the
intellectual boot camp proposed on Humanist a week ago, in which
everyone showers before dawn and undergoes indoctrination in assembly
language, textual studies, and classical philology. This, however, is
no regimen for the fainthearted, among whom I number myself.

Alvin Snider <asnidepd@uiamvs>

(4) --------------------------------------------------------------28----
Date: Mon, 9 Apr 90 15:29:07 EDT
From: "Michael S. Hart" <HART@UIUCVMD>
Subject: Re: 3.1278 correcting and scanning (70)

re Germaine Warkentin <WARKENT@vm.epas.utoronto.ca>

Since the members of Humanist actively use texts, and correct them,
make commentaries, additions, deletions, etc., in their public lives,
it would appear obvious how etexts would benefit their work, as well
as how their work would benefit etexts.

Thank you for your interest,

Michael S. Hart, Director, Project Gutenberg
National Clearinghouse for Machine Readable Texts