14.0299 primitives

From: Humanist Discussion Group (willard@lists.village.Virginia.EDU)
Date: 10/04/00

                  Humanist Discussion Group, Vol. 14, No. 299.
          Centre for Computing in the Humanities, King's College London
            Date: Tue, 03 Oct 2000 13:16:02 +0100
            From: Wendell Piez <wapiez@mulberrytech.com>
            Subject: Re: 14.0295 primitives
    Osher Doctorow writes:
    > Could I prevail
    >upon Wendell to possibly restate his thesis, if any, in one sentence
    >comparable to my political history-prehistory declaration that permutations
    >of A, B, and N in Shakespearean play contexts contain all the content of
    >political history-prehistory?
    >Yours Faithfully,
    >Osher Doctorow
    I'm afraid not: I really have no talent for such flights, at least in the
    context of an e-mail list. Rather, let Osher take the post for what it's
    worth to him -- if that's not much, that's perfectly fine; I don't expect
    any post I write to be on target for all readers.
    Instead (and as long as I'm being summoned back to the floor), I'd like to
    try and take the discussion a step further -- I accept Mr. Doctorow's
    challenge to be more abstract and far-reaching, even if I'm not more
    concise and conclusive. There are five points; please feel free to use your
    delete key (or the moral equivalent thereof).
    1. There is apparently a difference between "methodological primitives" in
    the sense that Ott, Bradley and I were taking them, which is to say
    core operations to be performed on a specified data set via an automated
    process, and in the sense that Prof. Unsworth means them, as
    irreducible operations performed by a scholar as he or she goes about the
    work of tracing, understanding, and presenting a thesis about a text or
    subject of research. (I'll let Willard speak for himself.) There is also,
    at least potentially, a relation between these two things, as many of us
    have experienced in our own work. The implication has been that if we have
    the first (paraphrase this as "if we can teach our computers to help us
    read, find, sort, filter and so forth") we can facilitate the second.
    2. A key difference between what a computer does in performing operations
    on a text, and what a human reader does, is that the data set (the "input")
    on which the computer operates is finite and bounded, whereas what the
    human reader brings is unknown and variable. It may be finite, although
    large, but since its bounds are unknown, and since no two human readers (or
    even readings) bring the same context to bear on a text, practically
    speaking, it is infinite and unknowable.
    (Caveat: the Internet and the web now make it possible for a computer's
    inputs themselves to be practically infinite and unbounded, because
    unknowable; nevertheless we have hardly begun to think about what this may
    mean for automated processing of texts.)
    3. One ancient technique for bridging this gap is to teach the computer
    something about what we know about a text, and to design its interfaces and
    its processes in such a way as to give us better access to the full range of
    this knowledge than we can ourselves achieve unaided. I say "ancient"
    because this work is far older than digital processing. Add a table of
    contents or an index to a text, or line and verse numbering, or lay out the
    text on the page with chapter titles in a larger type face, and you are
    beginning to "teach [the book] to help us read, find, sort, filter and so
    forth". With computers, examples of this practice would include text
    encoding, or markup, as well as the addition of external sources of
    information such as databases, dictionaries, "knowledge bases" etc.
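    A minimal sketch of the index idea in point 3, assuming Python and a
    hypothetical four-line sample with explicit line numbering. The names
    LINES and build_index are illustrative only; the point is simply that
    once line numbers are recorded, "find" and "sort" become mechanical.

```python
import re
from collections import defaultdict

# Hypothetical sample: the opening of Shakespeare's Sonnet 18,
# keyed by line number (the "line and verse numbering" of point 3).
LINES = {
    1: "Shall I compare thee to a summer's day?",
    2: "Thou art more lovely and more temperate:",
    3: "Rough winds do shake the darling buds of May,",
    4: "And summer's lease hath all too short a date:",
}


def build_index(lines):
    """Map each word to the sorted list of line numbers where it occurs."""
    index = defaultdict(set)
    for number, text in lines.items():
        # Lower-case the line and pull out runs of letters and apostrophes.
        for word in re.findall(r"[a-z']+", text.lower()):
            word = word.strip("'")
            if word:
                index[word].add(number)
    # Alphabetical order, as in a printed index.
    return {word: sorted(numbers) for word, numbers in sorted(index.items())}


if __name__ == "__main__":
    for word, numbers in build_index(LINES).items():
        print(f"{word}: {', '.join(str(n) for n in numbers)}")
```

    Such an index records only where words occur; it knows nothing of what
    a reader brings to them, which is the gap that point 2 describes.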
    4. Historically, one barrier to this work has been (as far as computers and
    automation have been concerned) that to design these interfaces and
    processes, we have had to invest in technologies and methods that mask the
    processes as much as they reveal them. This has largely been because of the
    design of our tools and the esoteric knowledge they have themselves
    required. It is as if we had created indexed commentaries on Classical
    Chinese poetry, but written them in English (finding that with our
    keyboards it is easier to compose an alphabetical index in English),
    thereby requiring our Chinese audience to learn English (on top of
    Classical Chinese) to get the benefit of the commentaries. (Not only that,
    but we have used a dialect of English that will be largely obsolete in five
    years.)
    This problem has been faced not only by "Computing Humanists" but also by
    the culture as a whole (or marketplace, if you like), that has invested
    untold millions in systems of computer-based automation that, whatever
    benefits they have delivered, have always fallen short of promises.
    Consequently, there have been waves of development working to ameliorate
    the problem in one way or another. The emergence of object-oriented
    programming methodologies, including the notion of "strong data typing", is
    one such wave; the emergence of standards-based markup languages is
    another. My earlier post tried to trace how these two developments should
    in theory complement one another, and how industry is now moving forward
    quickly on that basis to deal with its own analogous problems.
    Nevertheless, I argued, in the context of Humanities research we have a
    considerable way to go, even to match what has long been done with such
    structures as indexes and footnotes in the printed book -- at least, that
    is to say, if we want to do it on a basis that can reach beyond that
    five-year half-life that computer applications have faced.
    5. Even so, the gap remains between an automated process, working on known
    inputs, and a human process, working with who-knows-what "extraneous" but
    all-important -- all-pervasive and all-conditioning -- knowledge, memory,
    intuition, assumptions, imagination. Human readers perceive in a text (just
    for example) the implicit logics of narrative ordering; intertextual
    references; metaphorical correspondences; ironies. What would it take to
    teach a computer to perceive these on our behalf? Out of what
    methodological primitives, subject to automation, can such operations be
    Wendell Piez                            mailto:wapiez@mulberrytech.com
    Mulberry Technologies, Inc.                http://www.mulberrytech.com
    17 West Jefferson Street                    Direct Phone: 301/315-9635
    Suite 207                                          Phone: 301/315-9631
    Rockville, MD  20850                                 Fax: 301/315-8285
      Mulberry Technologies: A Consultancy Specializing in SGML and XML
                           Humanist Discussion Group
           Information at <http://www.kcl.ac.uk/humanities/cch/humanist/>
