Humanist Discussion Group

Humanist Archives: Feb. 25, 2019, 5:23 a.m. Humanist 32.489 - editions, in print or in bytes

                  Humanist Discussion Group, Vol. 32, No. 489.
            Department of Digital Humanities, King's College London
                   Hosted by King's Digital Lab
                Submit to: humanist@dhhumanist.org

    [1]    From: Willard McCarty 
           Subject: beyond the textual edition? (46)

    [2]    From: Desmond Schmidt 
           Subject: Re: [Humanist] 32.487: editions, in print or in bytes (160)

        Date: 2019-02-25 05:17:22+00:00
        From: Willard McCarty 
        Subject: beyond the textual edition?

I've just returned from a very worthwhile conference at Wuppertal,
"Annotation in Scholarly Editions and Research: Function,
Differentiation, Systematization"*. One of the questions raised there
concerned enlarging the subject from specifically textual editions to
critical annotative work on other kinds of objects, such as images,
sound and material things. One effect this would have would be to remind
us not to flee so rapidly from physical realia to immaterial schemata.

My particular concern was with human and artificial agencies in the act
of note-making and what I called knowing-by-doing -- not just Ryle's
'knowing how' but the knowing in enacting how. The psychology of art
would seem a place to go. I would appreciate not just specific pointers
into Rudolf Arnheim's work but also any other suggestions that you might
have.

To my mind the rather feeble attempts to represent the physicality of
reading a codex (e.g. the sense of where one is in the book) suggest
that we need to think a lot harder about giving up so much to be
digital. I'm not saying that everyone who speaks XML has abstracted
themselves away from the fully real world, rather that we need to be
careful about abandoning inherited and other non-digital ways of working
in order to take up these (relatively) new tools. I would be so bold as
to say that nothing replaces the face-to-parchment (skin-to-skin) work
with manuscripts, even though getting to see these rare items is
expensive and time-consuming. Similarly, I'd say that nothing replaces
the physical manipulation of notes. Perhaps a Minority Report device
could equal index cards on a table or floor, but who could afford that?

(You are not hearing the sound of a cane thumping the floor! :-)



Willard McCarty (www.mccarty.org.uk/),
Professor emeritus, Department of Digital Humanities, King's College London;
Adjunct Professor, Western Sydney University; Editor, Interdisciplinary
Science Reviews (www.tandfonline.com/loi/yisr20) and Humanist

        Date: 2019-02-24 19:54:23+00:00
        From: Desmond Schmidt 
        Subject: Re: [Humanist] 32.487: editions, in print or in bytes


I presume by these two long postings your intention is to swat the
annoying fly that I have become. I'm sorry to disappoint you.

> At least one
> or two participants in the discussion have argued that any attempt to
> represent the text of multiple textual witnesses in a single
> electronic document will necessarily cause painful difficulties in the
> electronic document, and further that the hierarchical structure of
> SGML and XML documents makes the difficulties even worse than they
> would otherwise be.

This is precisely what my model of text addresses and delivers on: Any
number of versions in one electronic document, each of which is simple
and easy to edit. If you agree with this summary, then the SGML/XML
model fails us in this essential task.

> (However, neither XML nor SGML are relevant for 1980, since neither
> existed then.)

OK, I got the date wrong from memory: SGML was published in 1986,
although it was under development at IBM from 1978 (Goldfarb 1986).

> In what sense is a distinction between elements and attributes a
> 'legacy of print'?  Do all systems for generation of print have such a
> distinction?  I don't see it in troff, or Runoff, or Script, or TeX,
> or LaTeX, or Scribe; am I missing it?

I didn't make any claim about those other programs, just
GML->SGML->XML. Why divide metadata about the text into elements and
attributes at all? You give no explanation. The reason was as I
suggested: that the attributes are the arguments to the functions
meant to format the text. In Goldfarb's description, GML's predecessor,
CMS script, took arguments to processing instructions (Goldfarb 1997).
GML first had actual attributes, e.g. :h0 id=part2 (GML 1991).

> The assumption that "elements and attributes" constitute "metadata" is
> also not one I think can be taken for granted.  The idea that "markup"
> is always and only "metadata" is not hard to find, and is often useful
> when teaching beginners the rudiments of markup, but it's hard to take
> seriously as a philosophical statement and -- like the concept of
> "metadata" itself -- does not (in my limited experience) withstand
> sustained scrutiny.

I used the term metadata because I just wanted a general term to hang
elements and attributes off. Both are "data about data" - the
definition in the dictionary - which is good enough for me. In
SGML/XML they both describe the text nodes, the content, in layman's
terms the stuff not in angle-brackets. They still do that even when we
philosophise about whether metadata are also data.

>> The deliberate decision to introduce explicit hierarchies
>> was another,

> Like the preceding sentence, this seems to assume a line of argument
> with which I am not familiar.  Why should a tree-structured
> organization of the input, or the ability to describe a document
> format with a context-free grammar, be a legacy of print?

Hierarchies were not present in 'generic' or 'generalized' markup
products that preceded GML (Goldfarb, 1973). They were not present in
COCOA, either, an early humanities markup system, or in many other
markup schemes for individual humanities projects before SGML.
Goldfarb (1973) explains that hierarchies were added to store the
structure envisaged by the typesetter for a particular block of text.
In 1986 he explicitly says the hierarchies were introduced into GML,
which was primarily aimed at and designed for print.

> (1) What leads DS to believe that
> processing instructions were originally intended specifically for
> printers and not for other processors, such as editors, stylesheet
> processors, plotters, or formatting engines (just to stay within a
> paper-oriented work flow)?

They were in GML. They were originally "Process Specific Controls" to
manage the printer directly (GML, 1991, Ch. 11). Sure, later on they
were expanded to include other functions, but originally they were for
printing instructions only.

> I think DS has succumbed here to the intentional fallacy.

>> JSON has done away with attributes and even though it is not a
>> document format, it shows that they were superfluous.

> Of course attributes are superfluous, in the sense that a version of
> SGML or XML which lacked them would lose no expressive power ...
> This has been known since ... gosh, I
> don't know when.

Of course I didn't mean that attributes were simply optional in SGML.
They are heavily used in TEI. Does XML have attributes in its
specification? Yes. Does JSON have attributes in its specification? No.
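
The contrast can be illustrated with a minimal sketch. It assumes a
hypothetical heading element whose name and id are borrowed from the
GML example ":h0 id=part2"; the "text" key on the JSON side is an
invented convention, not part of any standard. XML keeps element
name, attributes and content as three separate kinds of information,
whereas JSON has only members of objects.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical heading, loosely modelled on the GML example ":h0 id=part2".
xml_source = '<h0 id="part2">Part Two</h0>'
element = ET.fromstring(xml_source)

# XML distinguishes the element name, its attributes and its text content.
xml_view = {
    "name": element.tag,
    "attributes": dict(element.attrib),
    "content": element.text,
}

# JSON has no attribute construct: the same information becomes ordinary
# object members. The "text" key is an assumption made for this sketch.
json_source = '{"h0": {"id": "part2", "text": "Part Two"}}'
json_view = json.loads(json_source)

print(xml_view)
print(json_view)
```

In the JSON form nothing marks "id" as different in kind from "text";
any such distinction has to be supplied by convention.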

As for your long discussion about what can or cannot be done with
interlinked elements, I'm afraid my interest in XML waned after 2004,
when I saw that it couldn't do what I wanted. IDs and IDREFs are in
any case attributes and any *connections* between them are outside the
grammar of the language.
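
One reading of this point can be sketched as follows. The element
names (text, seg, note) and the "target" attribute are hypothetical,
and the parser used is Python's non-validating ElementTree; a
validating parser with a DTD that declares the attribute as IDREF
would report the dangling reference as a validity error, but that
check sits on top of the element grammar rather than within it.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the second <note> points at an id that does not exist.
source = """<text>
  <seg id="s1">first segment</seg>
  <note target="s1">resolves</note>
  <note target="s9">dangles</note>
</text>"""

# The non-validating parser accepts the document: it is well-formed.
root = ET.fromstring(source)

# Resolving the links is a separate step that the application must perform.
ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
dangling = [
    el.get("target")
    for el in root.iter("note")
    if el.get("target") not in ids
]

print(dangling)
```

The document parses whether or not the references resolve; finding the
dangling one requires a second pass over the attribute values.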

Nowhere in your extremely long postings do you provide any explanation
as to how elements, attributes and hierarchies arose. They did not
arrive magically one day on a cloud. They arose from print, and just
because the GML->SGML->XML textual model has been widely used for
digital text doesn't mean it was designed originally for that purpose.
When many digital humanists complain about problems of overlap or
interoperability in their texts you have no answer other than to turn
those texts into a kind of digital spaghetti of interlinked elements,
whose significance and function depend on the encoder and his or her
mood on a particular day. We need better.

GML. (1991). GML Starter Set User’s Guide, http://

Goldfarb, C. (1973). Design Considerations for Integrated
Text Processing Systems, IBM Cambridge Scientific
Center Technical Report No. 320-2094.

Goldfarb, C. (1997). SGML: the reason why and the first
published hint. Journal of the American Society for
Information Science, 48(7): 656–61. http://www.sgmlsource.com/history/jasis.htm

Goldfarb, C. (1996). The Roots of SGML – A Personal

Hockey, S. and Martin, J. (1988). The Oxford Concordance Program.
Oxford: Oxford University Computing Service. (for COCOA)


when you said:

> I'm less interested in the who's in and who's out discussion and more
> interested in how to expand the digital scholarly edition beyond the
> limitations of the codex without having to spend $1million+ to get it done.

I couldn't agree more. But the problem with existing standards is that
a cost of $1 million for a digital scholarly edition is no
exaggeration. Instead, I want to make such editions easy for anyone to
create and maintain without specialised technical training.

I'm not sure though that the MLA are in a position to assess the
technical or practical issues surrounding the creation of digital
scholarly editions (DSEs). The theoretical stuff sounds fine (I only
had time to read "Scholarly Edition in the Digital Age"). I'm all for
the ideals expressed there, but the interoperability of XML is just
assumed. It isn't true for DSEs.

Desmond Schmidt
Queensland University of Technology

Unsubscribe at: http://dhhumanist.org/Restricted
List posts to: humanist@dhhumanist.org
List info and archives at: http://dhhumanist.org
Listmember interface at: http://dhhumanist.org/Restricted/
Subscribe at: http://dhhumanist.org/membership_form.php

Editor: Willard McCarty (King's College London, U.K.; Western Sydney University, Australia)
Software designer: Malgosia Askanas (Mind-Crafts)

This site is maintained under a service level agreement by King's Digital Lab.