11.0026 editing for whom/what?

Humanist Discussion Group (humanist@kcl.ac.uk)
Tue, 13 May 1997 19:34:48 +0100 (BST)

Humanist Discussion Group, Vol. 11, No. 26.
Centre for Computing in the Humanities, King's College London

[1] From: Ian Lancashire <ian@chass.utoronto.ca> (16)
Subject: Re: 11.0019 editing for computers

[2] From: Hope Greenberg <hope.greenberg@uvm.edu> (29)
Subject: Re: 11.0019 editing for whom/what?

[3] From: "Amsler, Robert" <Amsler@dyniet.com> (22)
Subject: RE: 11.0019 editing for whom/what?

Date: Mon, 12 May 1997 22:37:18 -0400 (EDT)
From: Ian Lancashire <ian@chass.utoronto.ca>
Subject: Re: 11.0019 editing for computers

Writing a program that can convert between roman and arabic numbers,
someone in language technology writes, is a typical first-year
problem for computer-science undergraduates. The task shouldn't be
beyond a browser, let alone a tagger. I wrote harder programs for my
first-year computer science course in 1980.
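
To make concrete how small such a program is, here is a minimal sketch in Python. The names to_roman and from_roman are assumptions made for this illustration, and the sketch accepts only the modern canonical forms for 1 to 3999; early printed texts often use variants such as "IIII", which it would reject.

```python
# Value/symbol pairs in descending order, including the subtractive forms.
_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n: int) -> str:
    """Convert an integer in 1..3999 to a canonical roman numeral."""
    if not 1 <= n <= 3999:
        raise ValueError("only 1..3999 can be written in this notation")
    parts = []
    for value, symbol in _PAIRS:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def from_roman(s: str) -> int:
    """Convert a canonical roman numeral (any letter case) to an integer."""
    s = s.upper()
    total, i = 0, 0
    for value, symbol in _PAIRS:
        # Consume each symbol greedily, largest value first.
        while s.startswith(symbol, i):
            total += value
            i += len(symbol)
    # Reject leftover characters and non-canonical spellings such as "IIII".
    if i != len(s) or total == 0 or to_roman(total) != s:
        raise ValueError(f"not a canonical roman numeral: {s!r}")
    return total
```

With these definitions, to_roman(1997) gives "MCMXCVII" and from_roman("xiv") gives 14.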

If we tag our texts semi-automatically with sed or perl scripts, why
can't a program manage the job? I'm not suggesting semantic disambiguation,
lemmatization, or literary interpretation. Most tagging handles simple
textual features that can be recognized by their format.
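
A sketch of what recognition by format can look like, written in Python rather than sed or perl. The conventions it assumes are hypothetical: a play text in which a speaker label is a run of capitals at the start of a line followed by a period, and a stage direction is anything in square brackets on one line. The tag names are likewise illustrative only.

```python
import re

# Assumed convention: "HAMLET. " at the start of a line marks a speaker.
SPEAKER = re.compile(r"^([A-Z][A-Z ]+)\.(?=\s)", re.MULTILINE)
# Assumed convention: "[Aside]" on a single line marks a stage direction.
STAGE = re.compile(r"\[([^\]\n]+)\]")


def tag(text: str) -> str:
    """Wrap speaker labels and stage directions in illustrative tags."""
    text = STAGE.sub(r"<stage>\1</stage>", text)
    text = SPEAKER.sub(r"<speaker>\1</speaker>", text)
    return text


sample = "HAMLET. To be, or not to be. [Aside]\nOPHELIA. My lord?"
tagged = tag(sample)
print(tagged)
```

A pattern this naive will also tag a line that begins "II. " as a speaker, which is the kind of case where a human or a smarter program still has to decide.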

Again, my question: why should we tag for computers as if they are in
need of help? One just whipped a chess grandmaster. Let the computers be
taught to recognize our texts. Let tools do jobs for us, rather
than vice versa.

I think you all could spend your time more wisely if the humanities
took advantage of 50 years of computer science and turned to building
better software.

Ian Lancashire

Date: Tue, 13 May 1997 10:13:01 -0400
From: Hope Greenberg <hope.greenberg@uvm.edu>
Subject: Re: 11.0019 editing for whom/what?

Call me greedy, but I want e-texts that are readable, no, useable, by ME!

That means, if such a thing exists, I want:

- to see a reasonable facsimile of the original as the author created it
- to see a reasonable facsimile of the published version, including all
the original technologies applied to that version--things like line
numbers and page numbers
- to see a transcription of it in a manipulatable form, by which I mean
something I can reformat, take apart, reassemble, read comfortably on
screen, and otherwise muck about with
- to toss it into my "text stewpot" with all the other texts and analyse
it in a variety of ways (rather more than just plain "machine readable")
- to hear and/or see a reading of the work by the author, especially if
it is poetry, or, if the author is no longer living, to hear a
historically informed rendition. (By the way, we have had audio
recordings for close on a century now. Try listening to speeches from
several decades. It is a fascinating look at how speaking changes.)
- to follow up on a text in a variety of ways, by connecting with
information about the author(s), the historical/cultural setting, the
critical setting
- and I want all of this from wherever I am: at the office, in the
classroom, in the library, out of town, or curled up in my favorite
rocking chair with a very large cup of tea nearby.

- Hope
(who is very grumpy this morning, having spent last night doing
something grandiosely called "research" that, if it had been online,
would have taken seconds instead of three wasted hours)

Hope Greenberg
University of Vermont

Date: Tue, 13 May 1997 10:27:12 -0400 (EDT)
From: "Amsler, Robert" <Amsler@dyniet.com>
Subject: RE: 11.0019 editing for whom/what?

What is interesting about this discussion is that e-texts by their very
nature require computer software to be read. You can stare at your disk
drive, floppy disk, or a CD-ROM very intently, but will see nothing
resembling "text" except on the packaging label. E-text editing assumes
the existence of an e-text editor that can figure out how to read the
bits and interpret them as characters. What else it does is entirely up to
the software. If you feed an e-text into a file dump utility, it might
portray the text as carefully arranged columns of 0's and 1's, or as
hexadecimal characters. If the e-text is the output of a commercial
software editor, it will have to be read and interpreted by a comparable
software package that can remove all the machine instructions, redundant
coding, etc. which the commercial software inserts.
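
The point about dump utilities can be shown in a few lines of Python. This is only a sketch: the function names as_bits and as_hex are made up for the example, and ASCII is assumed as the encoding, which is itself a choice the software makes rather than something visible in the bits.

```python
def as_bits(text: str, encoding: str = "ascii") -> str:
    """Render the stored bytes of a text as groups of 0's and 1's."""
    return " ".join(f"{byte:08b}" for byte in text.encode(encoding))


def as_hex(text: str, encoding: str = "ascii") -> str:
    """Render the same bytes as two-digit hexadecimal values."""
    return " ".join(f"{byte:02x}" for byte in text.encode(encoding))


word = "text"
print(as_bits(word))
print(as_hex(word))

# Only software that knows the encoding turns the bytes back into characters.
restored = bytes.fromhex(as_hex(word)).decode("ascii")
```

All three views are of the same four bytes; which one counts as "the text" depends on the program doing the reading.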

The moral here is that editing e-texts is ALWAYS the result of applying a
creatively designed e-text interpreting software package to the text. What
the software does or doesn't do is a matter of what is possible and what was
implemented by the software designer. You can filter OUT information, but
you cannot readily ADD information to the e-text. That is, if there are no
font codes or layout information in the text, they can only be guessed at by
clever software, or else the text will appear as plain characters. In this
regard you clearly want a LOT of information in an e-text, making possible a
very clever editing and presentation software package which can use that
information. If you have a lot of information in an e-text but no clever
software to use it, that is not the fault of the information in the e-text.
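
The asymmetry between filtering out and adding can be sketched in Python. The markup and tag names below are invented for the illustration, and the regular expression is a deliberately crude stand-in for a real parser.

```python
import re

# Crude pattern for anything that looks like a tag; an assumption of this
# sketch, not a full markup parser.
TAG = re.compile(r"<[^>]+>")


def strip_markup(text: str) -> str:
    """Filter OUT the encoded information, leaving plain characters."""
    return TAG.sub("", text)


# Two hypothetical encodings that record different facts about one word.
as_title = "I saw <title>Hamlet</title> last night."
as_person = "I saw <name>Hamlet</name> last night."

plain_from_title = strip_markup(as_title)
plain_from_person = strip_markup(as_person)
print(plain_from_title)
```

Both encodings reduce to the same plain sentence, so nothing in the plain characters says which one to restore; software going the other way can only guess.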