Humanist Discussion Group

Humanist Archives: Feb. 13, 2019, 6:29 a.m. Humanist 32.451 - the McGann-Renear debate

                  Humanist Discussion Group, Vol. 32, No. 451.
            Department of Digital Humanities, King's College London
                   Hosted by King's Digital Lab
                       www.dhhumanist.org
                Submit to: humanist@dhhumanist.org


    [1]    From: Jim Rovira 
           Subject: Re: [Humanist] 32.446: the McGann-Renear debate (73)

    [2]    From: philomousos@gmail.com
           Subject: Re: [Humanist] 32.446: the McGann-Renear debate (146)

    [3]    From: Peter Stokes 
           Subject: Re: [Humanist] 32.446: the McGann-Renear debate (79)


--[1]------------------------------------------------------------------------
        Date: 2019-02-12 16:35:07+00:00
        From: Jim Rovira 
        Subject: Re: [Humanist] 32.446: the McGann-Renear debate

I'm kind of facing the situation Michael describes right at the moment, and
his response really speaks to me, especially the thought of trying to
integrate yet another witness of the text should one show up. My problem is
that I've never met an algorithm I could trust. Not just those dark alley
algorithms either; I'm talking about the ones you meet at company parties,
the convivial ones who know all the best jokes. Has anyone seen Michael's
suggestion work in practice?

Jim R

On Tue, Feb 12, 2019 at 1:14 AM Humanist (humanist@dhhumanist.org) wrote:

>
> In my own experience of textual edition in TEI - which I admit isn't
> enormous - dealing with these sorts of issues requires endless workarounds
> and incredible labour. And the result is not a 'true' representation of the
> complete text, because in reality the elements of the text are not nested.
> There is an elegant solution: store the different witnesses in different
> files, and perform the string alignment algorithmically. Rather than trying
> to encode the 'complete text' in a single hierarchical file, versions can
> be collated pairwise as required. Pairwise collation can be updated
> manually if the algorithm is found to be inadequate. This also has the
> heartening result that the edition will not be ruined by the discovery of a
> new witness of the text, which would shatter any carefully constructed
> system of element ids.
>
> Michael Falk, Western Sydney University

 --
Dr. James Rovira (http://www.jamesrovira.com/)
Bright Futures Educational Consulting
(http://www.brightfuturesedconsulting.com)

   - *Writing for College and Beyond* (A first year writing textbook. Lulu
   Press, forthcoming. .pdf files available for preview if you're interested
   in considering this text for your classroom. It is fully customizable for
   departmental orders.)
   - *Reading as Democracy in Crisis: Interpretation, Theory, History
   (https://interpretationtheoryhistory.wordpress.com/)*  (Lexington Books,
   in production)
   - *Rock and Romanticism: Post-Punk, Goth, and Metal as Dark Romanticisms*
   (https://www.palgrave.com/us/book/9783319726878) (Palgrave Macmillan,
   May 2018)
   - *Rock and Romanticism: Blake, Wordsworth, and Rock from Dylan to U2*
   (https://jamesrovira.com/rock-and-romanticism-blake-wordsworth-and-rock-from-dylan-to-u2/)
   (Lexington Books, February 2018)
   - *Assembling the Marvel Cinematic Universe: Essays on the Social,
   Cultural, and Geopolitical Domains*
   (https://mcfarlandbooks.com/product/assembling-the-marvel-cinematic-universe/),
   Chapter 8 (McFarland Books, 2018)
   - *Kierkegaard, Literature, and the Arts*
   (http://www.nupress.northwestern.edu/content/kierkegaard-literature-and-arts),
   Chapter 12 (Northwestern UP, 2018)
   - *Blake and Kierkegaard: Creation and Anxiety*
   (http://jamesrovira.com/blake-and-kierkegaard-creation-and-anxiety/)
   (Continuum, 2010)

Active CFPs

   - *Women in Rock/ Women in Romanticism*
   (https://rockandromanticism.wordpress.com/call-for-papers-rock-and-romanticism-women-in-rock-women-in-romanticism/),
   edited anthology
   - *David Bowie and Romanticism*
   (https://rockandromanticism.wordpress.com/call-for-papers-rock-and-romanticism/),
   edited anthology


--[2]------------------------------------------------------------------------
        Date: 2019-02-12 15:17:47+00:00
        From: philomousos@gmail.com
        Subject: Re: [Humanist] 32.446: the McGann-Renear debate

I want to spend some more time thinking about Peter Robinson's formulation
of text completeness. I don't think I agree with all of it, but it seems
useful -- I should add that Peter's scholarship has been very influential on
my own thinking about text and I am grateful for his work.

One small addition I might make is the axiom that if a representation of a
text can be transformed into a different representation and back again
without discarding information, then those two representations are
equivalent, whatever aspect of the text one or the other might privilege.
This might help us avoid much of the pointless bickering over the
(in)adequacy of different ways of modeling texts. (I believe James Cummings
was saying more or less exactly this several posts ago.)
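
To make the round-trip idea concrete, here is a minimal sketch in Python. It
assumes a toy inline markup of bare <name>...</name> tags (no attributes, no
literal "<" in the text, no empty or identically-placed spans); the function
names are hypothetical and not part of any TEI tooling. It converts inline
markup to standoff spans and back, and the two forms count as equivalent in
the sense above if the round trip returns the original string.

```python
import re

TAG = re.compile(r"<(/?)(\w+)>")

def inline_to_standoff(marked):
    """Split toy inline markup into plain text plus (start, end, name) spans."""
    pieces, spans, open_tags = [], [], []
    pos = length = 0
    for m in TAG.finditer(marked):
        chunk = marked[pos:m.start()]
        pieces.append(chunk)
        length += len(chunk)
        if m.group(1):                      # closing tag: finish the span
            name, start = open_tags.pop()
            spans.append((start, length, name))
        else:                               # opening tag: remember its offset
            open_tags.append((m.group(2), length))
        pos = m.end()
    pieces.append(marked[pos:])
    return "".join(pieces), sorted(spans)

def standoff_to_inline(text, spans):
    """Rebuild inline markup from plain text and well-nested spans."""
    events = []
    for start, end, name in spans:
        # At one offset: closes before opens, inner closes first, outer opens first.
        events.append((start, 1, -end, f"<{name}>"))
        events.append((end, 0, -start, f"</{name}>"))
    out, pos = [], 0
    for offset, _, _, tag in sorted(events):
        out.append(text[pos:offset])
        out.append(tag)
        pos = offset
    out.append(text[pos:])
    return "".join(out)

marked = "I <del>saw</del><add>heard</add> a <hi>bright <del>red</del></hi> bird"
text, spans = inline_to_standoff(marked)
print(standoff_to_inline(text, spans) == marked)
```

Where the assumptions fail (two spans with identical extents, say), the
round trip can lose the nesting order, which is exactly the kind of
discarded information the axiom is meant to detect.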

Michael Falk's contribution seems to me to exemplify many of the kinds of
error we see in these sorts of discussions:

>
> May I just reiterate the point Bill Pascoe made a few emails ago. It is not
> the case that: "3. Each aspect may be represented as a OHCO: an ordered
> hierarchy of content objects, a tree." This is the central weakness of XML
> as a universal markup language. It insists on an impossibly strict nesting
> of elements.
>

I'm afraid if you're going to make assertions about impossibility, you will
have a very hard time proving them.

>
> The essence of scholarly editing is the collation of different textual
> variants.

Surely the essence of scholarly editing is the argument the editor wishes
to make about the text? Collation is an important way of discerning what's
happening in the tradition of a text, but it may not always be practical or
even possible.


> Revisions frequently create overlapping elements that XML
> struggles to encode.


Depends what you mean by revision, and even here at the beginning of your
argument, I'm puzzled at what you're trying to do. What is this text? What
are you trying to do with it? Is this a single document, where the author
has made three passes through the poem, scratching out or erasing written
text and replacing it? Or are there actually three successive versions of
the poem? If the former, I would say TEI is well-equipped to represent it,
but that your encoding below misrepresents the source document. If the
latter, then I'm confused: why would you imagine that it's a good idea to
take three texts like this and attempt to mash them up into one? These
aren't different witnesses to a single work, so I don't know why you'd use
TEI's text critical elements, and they're not really alternatives either,
so I don't know why you'd use choice. As you've presented them, they're
three freestanding versions that show an author revising a work. I'd
probably encode them as three TEI documents with links, notes, etc.

But let's assume you want to present the final version and show how the
author got there, treating the prior versions as witnesses. That's probably
not an unreasonable thing to want to do. It may surprise you to know that
it's perfectly possible to model this sort of thing in TEI.

> ...To represent the complex flow of
> textual revision, each atom of each revision would need to be separately
> recorded and linked through a complex series of letter codes to indicate
> their spatial position in each line group, and their temporal position in
> the fluttering history of textual insertion and deletion.
>

You're exaggerating the difficulty, but yes. Text, and the interpretation
of text, is complex. Why would you assume models of texts would therefore
be simple? I don't think your proposed solution does this either.

>
> In my own experience of textual edition in TEI - which I admit isn't
> enormous - dealing with these sorts of issues requires endless workarounds
> and incredible labour.

Critical editions typically take years to produce. It's not uncommon for a
second editor to complete the work of a first who has died before
finishing. They do require incredible labor. And I'll certainly grant you
that trying to create a new model of a text without prior examples is a
large undertaking. But the magical thing about markup is its
transformability. If we can agree on what you're trying to do, we can take
a shorthand or standoff annotation and turn it into, say, TEI.


> And the result is not a 'true' representation of the
> complete text, because in reality the elements of the text are not nested.
>

Is *any* model true? I don't think that's a useful characterization at all.
Better to ask if it's useful. Does it help the editor present their
argument about the text?

> There is an elegant solution: store the different witnesses in different
> files, and perform the string alignment algorithmically. Rather than trying
> to encode the 'complete text' in a single hierarchical file, versions can
> be collated pairwise as required.


Sure, if what you're after is just to show how one of your versions above
differs from another, then diff is your friend. And that's perfectly fine.
I will point out that while your automated collation will show you what's
changed, it won't permit you to show the ways in which the change occurred.
It will also just fall flat on its face in certain circumstances.
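
For readers who have not tried it, a minimal sketch of such a pairwise,
diff-style collation follows, using Python's standard difflib. The two
witness strings are invented for illustration and the function name is
hypothetical; real collation tools do considerably more (normalisation,
tokenisation choices, more than two witnesses).

```python
from difflib import SequenceMatcher

def collate_pair(witness_a, witness_b):
    """Align two witnesses word by word and report what differs."""
    a, b = witness_a.split(), witness_b.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    # Each opcode is (tag, i1, i2, j1, j2): tag is 'equal', 'replace',
    # 'delete' or 'insert', and the indices delimit the word ranges.
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]

# Invented witnesses, not drawn from any real edition.
for row in collate_pair("the quick brown fox", "the slow brown fox jumps"):
    print(row)
```

The output records only that "quick" was replaced by "slow" and that "jumps"
was inserted. Nothing in it says whether the change was an erasure, an
interlinear addition, or a fresh copy, and a transposed phrase will typically
surface as an unrelated deletion plus insertion.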


> Pairwise collation can be updated
> manually if the algorithm is found to be inadequate.


If you're going to run an algorithm to produce a data structure and then
manually edit that structure to show what you want, then I worry you might
end up with something isomorphic to the TEI file you refused to produce!


> This also has the
> heartening result that the edition will not be ruined by the discovery of a
> new witness of the text, which would shatter any carefully constructed
> system of element ids.
>

The addition of a new witness might change the editor's interpretation of
the text in any number of ways. Surely if something like that were to
happen, a new edition would be warranted? I don't think you understand how
element ids work, if you think anything would be shattered.

As someone who mainly works in programming and dabbles in text editing, I
am continually astounded by the faith people have in algorithms. Are we
really at the point where there is no longer any need for editors, and we
can switch to simply transcribing sources and feeding them into the hopper
so that our editions can be mechanically generated? Nothing I've seen or
read leads me to think that is the case.

I apologize if this comes across as unnecessarily harsh. I get very
frustrated by dismissive, hand-waving arguments that attempt to assert that
I (and my many colleagues) are doing something wrong by producing editions
with TEI. Can we have critiques that address real problems? I promise you
there are many, and that we'll be grateful to hear them. If we're just
going to embark on another circle of the textodrome, I'm afraid I will have
to respectfully take my leave.

All the best,
Hugh


--[3]------------------------------------------------------------------------
        Date: 2019-02-12 07:30:36+00:00
        From: Peter Stokes 
        Subject: Re: [Humanist] 32.446: the McGann-Renear debate

Dear all,

(I sent this a few days ago but I think it was eaten by the Humanist gremlins,
so I'm trying again now. Apologies if you receive it twice.)

It seems to me this discussion is tending to stray into a debate over which
model of text is inherently "best": whether "text as a sequence of
characters", "text as OHCO", "text as leaves with multiple trees",
etc. But this, or arguing over which model is complete or even adequate without
specifying for what purpose, seems to me to miss points that Willard and
others on and off this list have been making for a long time. With apologies for
repeating what's probably familiar to us all, I do think these have to be
taken into account:

All models (whether digital, print or other; whether implicit, explicit, formal
or informal) are by definition incomplete. No model is adequate for all
purposes. This is their virtue: they make the intractable tractable, they allow
us to see our material differently. Of course OHCO is an incomplete view of the
text and is inadequate for some purposes, as is 'text as a sequence of
characters'. But all knowledge is incomplete, and nothing is universally
adequate. This is a problem 'only if we miss the lesson of modeling and mistake
the artificial for the real' (McCarty, 'Knowing' 2008).

All models present particular theories and views of that being modelled. None is
inherently better or worse, assuming they're all based on a more or less
rigorous analysis of that being modelled. (Cf. Armando Petrucci on
palaeographical terminology: 'Infatti ogni terminologia paleographica legata
ad una particolare visione storica del fenomeno scrittorio; ... ma
legittimamente utilisabili risulteranno comunque tutte quelle fondate su
premesse metodologiche valide e su rigorose analisi grafiche.' La
descrizione del manoscritto, 2001.) My insisting that you use a given model is
therefore to insist that you take the same theoretical viewpoint as I do, but
there are and must be many different views depending on discipline, research
questions, and so on.

The value of models is in their usefulness, not their verisimilitude (McCarty
again, I think, though I don't have the reference to hand), and so discussing
the adequacy of a model is meaningless without being clear about the intended
application. Both the OHCO/XML model and the "text as a sequence of
characters" model are used certainly by millions and I would guess probably
billions of people every day, many people clearly use TEI every day, etc., so to
deny the inherent usefulness or adequacy of these seems to me to be obtuse, and
criticising someone's use of it per se seems unhelpful. If I use a model
that's inadequate for my purpose when there's a practical and more
appropriate alternative then yes, I deserve criticism, but that's not what I
see in the current discussion. In fact, as a palaeographer none of these models
is particularly useful or adequate for me, but that's my problem, not the
models'.

The challenge of course is finding the right model for the right purpose. If I
want to do an NLP analysis of Elizabethan literature then I (probably) want
'text as a sequence of characters'. If I want to analyse references to place in
speeches delivered by Hamlet then I (probably) want TEI or similar. If I want an
analysis of speeches by Hamlet using NLP then I (probably) want both. Yes, I
could use NLP in an attempt to find references to place by Hamlet but my results
would be very inaccurate and why would I when I have an alternative that's
easier, much more suited to a small corpus like this, and is designed precisely
for this purpose?
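
As a rough illustration of the Hamlet case, the sketch below uses Python's
standard xml.etree.ElementTree on a tiny invented TEI fragment. The speaker
ids (#Hamlet and so on) and the encoding of the fragment are assumptions for
the example, not taken from any published edition; only the TEI namespace and
the sp, speaker, p and placeName elements are standard.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Invented fragment for illustration only.
SAMPLE = """<text xmlns="http://www.tei-c.org/ns/1.0"><body>
  <sp who="#Marcellus"><speaker>Mar.</speaker>
    <p>Something is rotten in the state of <placeName>Denmark</placeName>.</p></sp>
  <sp who="#Hamlet"><speaker>Ham.</speaker>
    <p><placeName>Denmark</placeName>'s a prison.</p></sp>
  <sp who="#Rosencrantz"><speaker>Ros.</speaker>
    <p>Then is the world one.</p></sp>
</body></text>"""

def places_spoken_by(tei_xml, who):
    """Return the text of every placeName inside speeches by one speaker."""
    root = ET.fromstring(tei_xml)
    places = []
    for sp in root.iter(f"{{{TEI_NS}}}sp"):
        if sp.get("who") != who:
            continue                        # speech belongs to someone else
        for place in sp.iterfind(".//tei:placeName", NS):
            places.append("".join(place.itertext()))
    return places

print(places_spoken_by(SAMPLE, "#Hamlet"))
```

The hierarchy does the work here: the speaker is an attribute of the
enclosing speech, so no guessing about who is talking is needed, which is
the sort of question a plain character sequence answers poorly.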

Of course this necessary multiplicity of models poses a huge challenge to
standards, interoperability and interchange, but that's nothing to do with any
specific model and is something the TEI folk know very well. It's also a
different question and is probably for another day.

All the best,

Peter


--
Peter Stokes
Directeur d'études
École Pratique des Hautes Études - Université PSL
Section des Sciences Historiques et Philologiques
Savoirs et Pratiques du Moyen Âge au XIXe siècle (EA 4116)
Patios Saint-Jacques
4-14, rue Ferrus - 75014 Paris
peter.stokes@ephe.psl.eu
https://www.ephe.fr


_______________________________________________
Unsubscribe at: http://dhhumanist.org/Restricted
List posts to: humanist@dhhumanist.org
List info and archives at: http://dhhumanist.org
Listmember interface at: http://dhhumanist.org/Restricted/
Subscribe at: http://dhhumanist.org/membership_form.php
Editor: Willard McCarty (King's College London, U.K.; Western Sydney University, Australia)
Software designer: Malgosia Askanas (Mind-Crafts)
This site is maintained under a service level agreement by King's Digital Lab.