Humanist Discussion Group, Vol. 32, No. 516.
Department of Digital Humanities, King's College London
Hosted by King's Digital Lab
www.dhhumanist.org
Submit to: humanist@dhhumanist.org

Date: 2019-03-05 17:21:33+00:00
From: raffaeleviglianti@gmail.com
Subject: Re: [Humanist] 32.505: standoff markup & the illusion of 'plain text'

Dear Desmond,

I'm joining this conversation late and perhaps a bit unprepared, not having followed it through completely; however, I would like to address some of the points in your latest message from the perspective of the Shelley-Godwin Archive (S-GA, the project discussed in the article by Muñoz and me that you cited) and some of my latest thinking on the subject.

You ask why we should seek to make a rough approximation of a manuscript page through encoding. I think the answer is the same reason that printed facsimile editions with transcriptions and images side by side are created (such as the Garland Shelley facsimile editions). At the very least, they provide a map to the facsimile page and a readable transcription by scholars who know these documents deeply. Encoding and the digital medium can be used to produce digital versions of these publications with new affordances. The goal is not to create a rough approximation of the manuscript page, but to formalize scholarship around the page and the manuscript and to provide a useful tool to others. I think S-GA, in its current form, mostly matches this ideal.

In our (Muñoz's and my) paper we discuss how it is not easy to use the approach to encoding that I just described to display a reading text, but that is quite different from saying that it cannot be used as such at all. In fact we provide an automatically generated reading text on the site that, while imperfect, is still useful. To return to my comparison with printed facsimile editions: you wouldn't use those as a reading text either; the goals are different. Text encoding reflects your editorial goals.
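(A small sketch of the kind of transformation involved in generating such a reading text: a document-order, spatially organized transcription is reordered into reading order. The element names and the reading-order attribute below are invented for illustration; this is not S-GA's actual schema or pipeline.)

```python
# Toy illustration: deriving a reading text from a spatially organized
# transcription. <surface>, <zone>, <line>, and the reading-order
# attribute @n are hypothetical, not the Shelley-Godwin Archive's schema.
import xml.etree.ElementTree as ET

doc = """<surface>
  <zone n="2"><line>over the page margin</line></zone>
  <zone n="1"><line>The main body text runs</line></zone>
</surface>"""

root = ET.fromstring(doc)
# Zones appear in document (spatial) order; sort them by the
# hypothetical reading-order attribute, then concatenate their lines.
zones = sorted(root.findall("zone"), key=lambda z: int(z.get("n")))
reading_text = " ".join(
    line.text for z in zones for line in z.findall("line")
)
print(reading_text)  # The main body text runs over the page margin
```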
This doesn't prevent anyone from wanting more than one goal for a given textual resource, and I agree that standoff is a viable solution for this. The TEI Guidelines document a number of standoff techniques, yet they are often used only at a small scale because they can be challenging to apply and validate. But standoff markup on plain text must encounter the same issues, with the further impediment of not being able to rely on markup for simple references to identifiers. In short, further hierarchies can be layered on top of (or adjacently to) an existing encoding through the use of pointers (I discuss some of this in a forthcoming JTEI article).

To give an example, Elisa Beshero-Bondar and I (et al.) have been working around this concept in the creation of a variorum edition of Frankenstein that incorporates S-GA TEI data without modification through the use of pointers. We have a central 'spine' (or stand-off collation) that uses pointers to other TEI-encoded documents to identify variants. See Beshero-Bondar and Viglianti 2018 for an overview.

The point I want to make is that I don't think standoff requires a 'plain text' to be targeted by a number of annotations and hierarchies. In fact, there is no such thing as a plain text: even non-XML encoded text contains markup, often imposed by convention. Computer representation of textual information was developed around the idea of text-as-string, and we must not mistake this representation for what text really is.

Best,
Raff Viglianti

Beshero-Bondar, Elisa E., and Raffaele Viglianti. "Stand-off Bridges in the Frankenstein Variorum Project: Interchange and Interoperability within TEI Markup Ecosystems." Presented at Balisage: The Markup Conference 2018, Washington, DC, July 31 - August 3, 2018. In Proceedings of Balisage: The Markup Conference 2018. Balisage Series on Markup Technologies, vol. 21 (2018). https://doi.org/10.4242/BalisageVol21.Beshero-Bondar01.
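[A sketch of the stand-off 'spine' idea described above: a collation record that points into separately encoded witness documents by identifier, leaving those documents untouched. The file names, IDs, and spine format here are invented for illustration and are not the Frankenstein Variorum's actual data.]

```python
# Sketch of a stand-off 'spine' (collation) that uses pointers into
# separately encoded TEI witnesses to identify variants. All names and
# the 'file#id' pointer convention are hypothetical illustrations.
import xml.etree.ElementTree as ET

# Two witness documents, encoded independently and never modified.
witnesses = {
    "sga.xml": "<text><seg xml:id='s1'>wretch</seg></text>",
    "1818.xml": "<text><seg xml:id='s1'>creature</seg></text>",
}

# Each spine entry targets the same locus in every witness by pointer.
spine = [{"sga": "sga.xml#s1", "1818": "1818.xml#s1"}]

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def resolve(pointer):
    """Follow a 'file#id' pointer into its witness document."""
    fname, frag = pointer.split("#")
    root = ET.fromstring(witnesses[fname])
    for el in root.iter():
        if el.get(XML_ID) == frag:
            return el.text
    raise KeyError(pointer)

for entry in spine:
    variants = {wit: resolve(ptr) for wit, ptr in entry.items()}
    print(variants)  # {'sga': 'wretch', '1818': 'creature'}
```

Because the spine only points at identifiers, new hierarchies or collations can be layered over existing encodings without editing them.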
--
Raffaele Viglianti, PhD
Research Programmer
Maryland Institute for Technology in the Humanities
University of Maryland

On Mon, Mar 4, 2019 at 2:58 AM Humanist wrote:

> Humanist Discussion Group, Vol. 32, No. 505.
> Department of Digital Humanities, King's College London
> Hosted by King's Digital Lab
> www.dhhumanist.org
> Submit to: humanist@dhhumanist.org
>
> Date: 2019-03-03 21:00:58+00:00
> From: Desmond Schmidt
> Subject: Re: [Humanist] 32.499: standoff markup & the illusion of 'plain text'
>
> Patrick,
>
> you touch upon an important point: that it has been the goal of
> XML-based editions for the past 15 years or so to get ever closer to
> recording the spatial relationships between pieces of text on a page.
> And bound up with this goal is the idea that a perfect capture of such
> information would unlock multiple ways to investigate the text which
> would then be a kind of blending of markup, annotation and "plain"
> text much as you describe.
>
> As you probably have already guessed, I don't share this idea. I have
> encountered in practice some serious problems with this approach to
> making a digital edition.
>
> The first question is why? Why should we seek to make a rough
> approximation of a manuscript page that can be precisely photographed
> (not without loss of information of course) but still vastly inferior
> to the page-facsimile image that already captures the spatial
> relationships between fragments of text?
>
> A text encoded for spatial information can't even be used as a reading
> text. This is something that Munoz and Viglianti (2015) pointed out
> recently. To produce a reading text we actually need another encoding
> that conflicts with the spatial perspective and requires a re-ordering
> of textual fragments.
>
> Also you can't compare two versions of a text that have been encoded
> in this way.
> Comparison tools can only compare linear transcriptions
> of one version or layer of a document, but not text mixed up with
> arbitrary alternatives and information about where text-blocks go on
> the page. This severely limits what we can do with our edition.
>
> It makes it very hard to edit. All that complex markup to record the
> position of textual fragments prevents ordinary human editors who are
> not technical experts from participating in the edition. They can't
> share their transcriptions with their peers because no one can agree
> on how particular features should be recorded, or they misunderstand
> the complex record of features made by someone else. Damage will
> result and collaboration will fail. You of all people should
> appreciate this because you wrote about it in 2006.
>
> I'm a strong believer in divide and conquer, and in the KISS
> principle. If we are going to make digital editions that are
> affordable and easy for everyone to do, and if we are going to
> collaborate in making them, we need a simple interface that anyone can
> use. And for a text to be fit for many purposes, we actually need a
> simple, not a complex, textual representation at its core.
>
> -------------------------------
> Trevor Muñoz and Raffaele Viglianti (2015). "Texts and Documents: New
> Challenges for TEI Interchange and Lessons from the Shelley-Godwin
> Archive." JTEI 8. http://jtei.revues.org/1270
>
> Patrick Durusau (2006). "Why and How to Document Your Markup Choices."
> In L. Burnard, K. O'Brien O'Keeffe and J. Unsworth (eds), Electronic
> Textual Editing, pp. 299-309.
> -----------------------
>
> Desmond Schmidt
> eResearch
> Queensland University of Technology
Editor: Willard McCarty (King's College London, U.K.; Western Sydney University, Australia)