Humanist Discussion Group, Vol. 32, No. 505.
Department of Digital Humanities, King's College London
Hosted by King's Digital Lab
www.dhhumanist.org
Submit to: firstname.lastname@example.org

From: Desmond Schmidt
Subject: Re: [Humanist] 32.499: standoff markup & the illusion of 'plain text' (94)

From: Wendell Piez
Subject: Re: [Humanist] 32.499: standoff markup & the illusion of 'plain text' (32)

From: Iian Neill
Subject: Re: [Humanist] 32.499: standoff markup & the illusion of 'plain text' (27)

--------------------------------------------------------------------------
Date: 2019-03-03 21:00:58+00:00
From: Desmond Schmidt
Subject: Re: [Humanist] 32.499: standoff markup & the illusion of 'plain text'

Patrick,

you touch on an important point: for the past fifteen years or so it has been the goal of XML-based editions to get ever closer to recording the spatial relationships between pieces of text on a page. Bound up with this goal is the idea that a perfect capture of such information would unlock multiple ways to investigate the text, a kind of blending of markup, annotation and "plain" text much as you describe.

As you have probably guessed, I don't share this idea. In practice I have encountered some serious problems with this approach to making a digital edition.

The first question is: why? Why should we seek to make a rough approximation of a manuscript page that can be precisely photographed (not without loss of information, of course), when the result is still vastly inferior to the page-facsimile image, which already captures the spatial relationships between fragments of text? A text encoded for spatial information cannot even be used as a reading text, as Muñoz and Viglianti (2015) pointed out. To produce a reading text we need another encoding that conflicts with the spatial perspective and requires a re-ordering of textual fragments. Nor can two versions of a text encoded in this way be compared: comparison tools can only compare linear transcriptions of one version or layer of a document, not text mixed up with arbitrary alternatives and information about where text-blocks go on the page.
This severely limits what we can do with our edition, and it makes the edition very hard to edit. All the complex markup needed to record the position of textual fragments prevents ordinary editors who are not technical experts from participating. They cannot share their transcriptions with their peers, because no one can agree on how particular features should be recorded, or they misunderstand the complex record of features made by someone else. Damage results and collaboration fails. You of all people should appreciate this, because you wrote about it in 2006.

I'm a strong believer in divide and conquer, and in the KISS principle. If we are going to make digital editions that are affordable and easy for everyone to produce, and if we are going to collaborate in making them, we need a simple interface that anyone can use. And for a text to be fit for many purposes, we need a simple, not a complex, textual representation at its core.

-------------------------------

Trevor Muñoz and Raffaele Viglianti (2015) "Texts and Documents: New Challenges for TEI Interchange and Lessons from the Shelley-Godwin Archive", Journal of the Text Encoding Initiative 8. http://jtei.revues.org/1270

Patrick Durusau (2006) "Why and How to Document Your Markup Choices", in L. Burnard, K. O'Brien O'Keeffe and J. Unsworth (eds), Electronic Textual Editing, pp. 299-309.

-----------------------
Desmond Schmidt
eResearch
Queensland University of Technology

> But "plain text" in an electronic system is an illusion. Why not abandon
> the distinction between text, markup and annotations, capturing all of
> them in a database, upon which queries then search and/or render a
> particular "view" of a "text" for your viewing?
>
> If you desire XML, for further processing, that is one rendering of a
> text, as is rendering in SVG, for example, such that readers can choose
> dynamic renditions of variant versions, with or without a base version
> being displayed.
>
> Or any annotation of a text, as well as annotations of annotations.
> If, as you say, we should stop clinging to the file metaphor for
> annotations, let's free ourselves of it with regard to texts.
>
> Granting that for many purposes I would prefer a rendering that mimics a
> handwritten manuscript, but that is only one possibility out of many.
>
> Gaps, spaces, margins, etc., can all have unique records in a database,
> or even records based on unique x-y coordinates on a physical witness.
>
> With that change, we can speak of renderings of texts, even renderings
> that we claim match physical witnesses. Some renderings carry
> annotations, some don't.
>
> Hope you are having a great weekend!
>
> Patrick

--------------------------------------------------------------------------
Date: 2019-03-03 18:12:22+00:00
From: Wendell Piez
Subject: Re: [Humanist] 32.499: standoff markup & the illusion of 'plain text'

Dear Willard,

Goodness, now we are recommending text bases instantiated as a graph model: both of these sound a lot like Luminescent's internal model, or for that matter the experimental system CMSMcQ mentioned way back at the dawn of this thread:

Haentjens Dekker, Ronald, and David J. Birnbaum. "It's more than just overlap: Text As Graph." Presented at Balisage: The Markup Conference 2017, Washington, DC, August 1-4, 2017. In Proceedings of Balisage: The Markup Conference 2017. Balisage Series on Markup Technologies, vol. 19 (2017). https://doi.org/10.4242/BalisageVol19.Dekker01.

Or for old-timers: https://github.com/wendellpiez/Luminescent ("I'm not dead yet, I think I'll go for a walk!")

Yet I fail to see why any of this once and future promising work invalidates XML in any way, whether XML is viewed as some sort of arguable abstraction or as a practical technology that none of us (even experts) can see in its entirety.
Regards,
Wendell

--
Wendell Piez | wendellpiez.com | wendell -at- nist -dot- gov
pellucidliterature.org | github.com/wendellpiez |
gitlab.coko.foundation/wendell | pausepress.org

--------------------------------------------------------------------------
Date: 2019-03-03 11:59:55+00:00
From: Iian Neill
Subject: Re: [Humanist] 32.499: standoff markup & the illusion of 'plain text'

Hi Patrick,

I'm not sure I entirely understand what you mean when you suggest abandoning the distinction between text, markup and annotations; at least, I can't visualise what data structure would represent this, nor how the user would edit it (e.g. adding and removing characters and annotations). I find the concept intriguing; I just don't follow its technical realisation.

In some ways, though, 'Codex' may fulfill some of the other requirements you suggest (although I may have misunderstood you). For example, a 'standoff property text' (SPT) in 'Codex' is represented by a (:Text) node for the raw text and a cluster of (:StandoffProperty) nodes for the annotations. These (:StandoffProperty) nodes can also be linked to other nodes in the graph (e.g. people, places, assertions). Further, an SPT can be thought of as analogous to the text content of an XML element, which means that (:Text) nodes can be (and are) related to other (:Text) nodes as required. For example, the default text type in the system is 'page body', which can be linked to other texts fulfilling the function of margin notes, footnotes, endnotes and even intertexts (like hypertexts). And because the editor supports zero-width annotations (between characters) you can inject references to other texts at any point without disturbing the 'text flow'.
Best regards,
Iian

_______________________________________________
Unsubscribe at: http://dhhumanist.org/Restricted
List posts to: email@example.com
List info and archives at: http://dhhumanist.org
Listmember interface at: http://dhhumanist.org/Restricted/
Subscribe at: http://dhhumanist.org/membership_form.php
Editor: Willard McCarty (King's College London, U.K.; Western Sydney University, Australia)
Software designer: Malgosia Askanas (Mind-Crafts)
This site is maintained under a service level agreement by King's Digital Lab.