10.4 citing Web documents

Humanist (mccarty@phoenix.Princeton.EDU)
Tue, 7 May 1996 19:32:00 -0400 (EDT)

Humanist Discussion Group, Vol. 10, No. 4.
Center for Electronic Texts in the Humanities (Princeton/Rutgers)
Information at http://www.princeton.edu/~mccarty/humanist/

[1] From: Willard McCarty <Willard.McCarty@utoronto.ca> (31)
Subject: impermanence

[2] From: "Peter Graham, RUL" <psgraham@gandalf.rutgers.edu> (10)
Subject: Re: 9.774 citing Web documents

[3] From: Andrew Burday <andy@dep.philo.mcgill.ca> (46)
Subject: Re: 9.774 citing Web documents

Date: Mon, 6 May 1996 21:36:18 -0400
From: Willard McCarty <Willard.McCarty@utoronto.ca>
Subject: impermanence

Bob Amsler, in Humanist 9.774, puts his virtual finger on a stubborn reality
of electronic publication: "The basic philosophical problem of citing the
Web is that it is a fundamentally transitory reference." Are we trying our
best to ignore this problem because we STILL cannot quite see that the Web
is not paper and print? that it has its own intrinsic tendencies which we
are silly, or worse, to resist?

Let me put the matter another way. Consider all the scholarship we publish
that has value for the moment, for six months, a year, five years, but that
we would be well rid of. (I am assuming, for purposes of argument, that
everything which is published has value for some amount of time, however
small.) Would this kind not be better published in "a fundamentally
transitory" medium?

Perhaps a more interesting question concerns the effects rapid, transitory
publication might have on the humanities. What might happen if our research
were more conversational, as in the social sciences? How might the academic
professions be affected if paper-publication were reserved for material
chosen on the basis of its long-term interest?

Some years ago I was fortunate to hear a Stanford economist (whose name,
alas, I have forgotten) brilliantly address the crisis in scholarly
publishing. In essence what he pointed out was that scholarly publishing, as
we know it, is an integral part of the academic world, that it cannot simply
be changed without profound consequences, and that changes are not going to
be easy because the system within which this publishing is integral will
resist our efforts. Since our livelihoods and way of life depend on this
integral system, should we not be examining the big picture? It seems to me
that we must understand the sociology and philosophy of knowledge in order
to know what to do with the new medium. Or we can just let others, such as
our friends in the infotainment industry, make the decisions for us.



Willard McCarty, Univ. of Toronto || Willard.McCarty@utoronto.ca

Date: Tue, 7 May 96 10:44:04 EDT
From: "Peter Graham, RUL" <psgraham@gandalf.rutgers.edu>
Subject: Re: 9.774 citing Web documents

From: Peter Graham, Rutgers University Libraries
John Unsworth suggests avoiding the angle brackets on a citation because the
sgml- or html- aware program will not know what to do with it. But this is
undoubtedly the reason the form suggested by Berners-Lee includes the term
URL as the initial element, as in the following citation* to, say, his home
page. As has been pointed out, various software tools use the URL forms to
highlight them or make automatic links out of them. --pg
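
A minimal sketch of how a software tool might pick out citations written
in that form. The exact pattern is an assumption, the function name is
hypothetical, and the sample URL is borrowed from elsewhere in this
digest as a placeholder; it is not the citation referred to above.

```python
import re

# Assumed form: "<URL:" + address + ">". The "URL:" prefix is what
# distinguishes a citation from an ordinary tag such as <em>.
URL_FORM = re.compile(r"<URL:\s*([^>\s]+)\s*>")

def extract_url_citations(text):
    """Return the addresses found inside <URL:...> wrappers (hypothetical helper)."""
    return URL_FORM.findall(text)

sample = "See <URL:http://www.virginia.edu/> and note the <em> tag."
print(extract_url_citations(sample))  # ['http://www.virginia.edu/']
```

Because the wrapper opens with a fixed string, a tool can tell it apart
from markup without guessing at the contents.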


Peter Graham psgraham@gandalf.rutgers.edu Rutgers University Libraries
169 College Ave., New Brunswick, NJ 08903 (908)445-5908; fax (908)445-5888

Date: Tue, 7 May 1996 11:23:43 -0400 (EDT)
From: Andrew Burday <andy@dep.philo.mcgill.ca>
Subject: Re: 9.774 citing Web documents

On Mon, 6 May 1996, Humanist wrote:

> From: John Unsworth <jmu2m@virginia.edu>
>
> May I suggest that whatever one does with URLs in citation, putting them
> inside angle brackets (e.g. <http://www.virginia.edu/>) is a bad idea,
> since those angle brackets indicate, to any sgml- or html-aware program,
> that the contents of the brackets constitute a tag. An unrecognized
> "tag" such as a URL inside angle brackets will either be ignored (and not
> displayed) or rejected by almost any of these programs.

Um, sorry, but this just isn't right. First of all, I'm not sure what you
mean by an "html-aware" program. I guess the term is often used to refer
to programs with heuristics to pick out URLs, but that's not html. In any
case, two related points can be made. First, surely Netscape and other
web browsers count as "html-aware". But any properly designed web browser
will do what you describe *only* in an html context. A document's content
type may be identified by the header an http server sends with it, or by
the extension if it's a local file or if the header doesn't identify it.
Any of those mechanisms can create what I'm calling an "html context".
If, based on one of those mechanisms, the browser "thinks" it's displaying
a plain text file, it will happily display any markup that happens to
appear in the file, including angle brackets, and it will not highlight
anchors or do anything if you select them. Second, if the program you're
using -- mail reader, news reader, or whatever it is -- is ignoring text
in angle brackets *outside of an html context*, it is badly designed. All
kinds of strings get put in angle brackets, for all kinds of reasons. It
is just unreasonable to assume -- outside of an html context -- that every
string in angle brackets is an html tag.
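
The decision described above can be sketched roughly as follows. This is
an illustration under stated assumptions, not the logic of any actual
browser or mail reader: the function names are hypothetical, the header
is assumed to take precedence over the extension, and the tag handling
is deliberately crude.

```python
import os
import re

TAG_LIKE = re.compile(r"<[^>]*>")

def is_html_context(content_type=None, filename=None):
    """Decide whether a document should be treated as html (hypothetical helper)."""
    if content_type:
        # Assumption: the server header wins when present.
        # Drop parameters such as "; charset=..." before comparing.
        media_type = content_type.split(";")[0].strip().lower()
        return media_type == "text/html"
    if filename:
        # Fall back to the file extension when there is no header.
        ext = os.path.splitext(filename)[1].lower()
        return ext in (".html", ".htm")
    return False

def display(text, content_type=None, filename=None):
    """Hide tag-like strings only in an html context; otherwise show text as-is."""
    if is_html_context(content_type, filename):
        return TAG_LIKE.sub("", text)  # crude stand-in for real tag handling
    return text

line = "See <http://www.virginia.edu/> for details."
print(display(line, content_type="text/plain"))  # angle brackets shown
print(display(line, content_type="text/html"))   # bracketed string dropped
```

The point of the sketch is only that the angle brackets are harmless
unless something has already declared the document to be html.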

In other words, outside the context of a document which has (implicitly or
explicitly) been declared to be html, the most "html-aware" programs there
are will do nothing special with material in angle brackets. That is the
most reasonable behavior, outside of an html context. And the question I
was originally trying to address was how to identify URLs outside of html
contexts. Again, I don't think it's worth trying to adjust our practices
to fit whatever the writers of some particular mail or news reader have
chosen to code into their software. The point is to have a consistent,
standard way to refer to URLs outside the context of html. The software
authors should be supporting the standards, not the other way around.

> Since the URL
> string begins with one of a few predictable (and probably not randomly
> occurring) strings (http:// or ftp:// or gopher:// etc.), I suggest that
> no additional representation is necessary.

I appreciate your taking the effort to come up with three counterexamples
to your own thesis, so that I don't have to! ;*>
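
One reading of the point, sketched in code: a matcher that looks only
for the scheme prefix will flag the three bare strings in the quoted
sentence, none of which is a usable URL. Both patterns and both function
names are assumptions made for illustration, not taken from any actual
mail or news reader.

```python
import re

# Naive rule: anything that starts with a known scheme prefix is a URL.
NAIVE = re.compile(r"(?:http|ftp|gopher)://\S*")

# Slightly stricter rule: require at least one character after the
# prefix, and stop at whitespace or angle brackets.
STRICTER = re.compile(r"(?:http|ftp|gopher)://[^\s<>]+")

def naive_matches(text):
    return NAIVE.findall(text)

def stricter_matches(text):
    return STRICTER.findall(text)

quoted = "strings (http:// or ftp:// or gopher:// etc.), I suggest that"
print(naive_matches(quoted))     # ['http://', 'ftp://', 'gopher://']
print(stricter_matches(quoted))  # []
print(stricter_matches("e.g. <http://www.virginia.edu/> is"))
```

Even the stricter rule is a heuristic; an explicit delimiter removes the
guesswork about where a URL begins and ends.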


Andrew Burday