9.767 citing URLs

Humanist (mccarty@phoenix.Princeton.EDU)
Sun, 5 May 1996 09:09:33 -0400 (EDT)

Humanist Discussion Group, Vol. 9, No. 767.
Center for Electronic Texts in the Humanities (Princeton/Rutgers)
Information at http://www.princeton.edu/~mccarty/humanist/

[1] From: Andrew Burday <andy@dep.philo.mcgill.ca> (88)
Subject: URLs outside html; citations; RFCs [was: Re: 9.759
quoting URLs]

[2] From: "Peter Graham, RUL" (143)
Subject: Was: Re: 9.759 quoting URLs--electronic bib. citation

Date: Thu, 2 May 1996 17:35:10 -0400 (EDT)
From: Andrew Burday <andy@dep.philo.mcgill.ca>
Subject: URLs outside html; citations; RFCs [was: Re: 9.759 quoting URLs]

I was pleasantly surprised to see that my note on quoting URLs outside of
html documents has generated a little discussion. I wasn't sure it was
even worth posting when I sent it. I hope one more note will be helpful.

Unfortunately, the discussion I've seen (including private email to me)
seems to miss the point. Let me distinguish between three separate
issues:

(1) How to cite electronic sources in scholarly publications;
(2) How some particular software (e.g. Eudora) handles URLs;
(3) A standard for how URLs should be represented outside of html
(World Wide Web) documents.

(2) seems to me to be of little interest. Today Eudora does things one
way; tomorrow it may do them some other way. Eudora runs on some
platforms, but not on others. Some people use it, and some people don't.
I don't mean to pick on Eudora, which is a very nice program, but what we
need are standards that can apply consistently across platforms and
programs. The idea of the Internet is to communicate, and not just with
people who happen to be using your favorite software on your favorite
platform.

(3) is what I was trying to get at in my original note. Standards are not
just an issue for anal-retentive chipheads. Standards are what allow the
wide variety of enormously complex computer systems that make up the
Internet to work with each other.

(1) is also an important issue, and I thank Al Magary for pointing out
some sources that address it. However, it is not the issue I was trying
to address in my original note. The question there was: what is a
reliable way to indicate the presence of a URL in plain text documents
(especially electronic ones)? E-mail may mention a URL without needing to
provide a proper citation, and proper citations of electronic sources will
show up in paper documents; so these two issues cross-cut each other.

Incidentally, I am doubtful about the ultimate utility of citations based
on URLs. A URL is basically an instruction for retrieving a particular
object from the net. Giving a URL as a citation is like citing a book by
telling the reader where to find it in your local library, in the
following sense: if the book gets moved one shelf over, your citation will
no longer be of any use; and if one letter in the URL gets changed, it
will no longer be of any use either. (Yes, I realize there are all kinds
of other differences!) The information we give in standard citations
names a work and gives some crucial information about it, but allows the
reader to figure out the best way to retrieve the work (go to the library,
go to the bookstore, call the publisher, borrow it from a friend,
whatever). We need a similar level of abstraction for e-texts.

A couple of further points:

On Sun, 28 Apr 1996, Humanist wrote:

> From: Al Magary <almagary@cris.com>
> This citation style originates with Tim Berners-Lee, at the CERN high-energy
> physics lab in Switzerland. He may someday bear the title Father of the

Berners-Lee is now at the WWW Consortium (see URL below), in Cambridge,
Mass. I don't know about titles, but he single-handedly invented the WWW.
Others added images, forms, tables, and much other good stuff, but they
were all building on Berners-Lee's work.

> Mesdemoiselles Li and Crane do *not* suggest text coloring
> or underlining/italicizing of URLs or other electronic addresses in
> citations. Their Web URL:

And well they shouldn't, if I understand you correctly. Colors,
underlines, bolds, highlights, italics, and such effects are entirely
platform-dependent. Lynx (a Unix text mode browser), using the curses
libraries, can easily display highlights, but it can't do italics at all.
Netscape, using the facilities provided by MS Windows, the Mac OS, or X
Windows, can easily do italics, but highlighting would be a pain. Again,
the point is to have a standard -- in this case, for scholarly citations
-- that does not depend on the properties of any particular hardware or
software platform.

> From: Willard McCarty <Willard.McCarty@utoronto.ca>
> In Humanist 9.753 Andrew Burday referred to the RFC (which stands for...?)

Sorry. An RFC is a "Request For Comments", part of the traditional method
for establishing Internet standards. If a need for a standard on some
issue is generally perceived, somebody with relevant expertise writes an
RFC. Based on the response from others in the field, it may be adopted as
a standard by the Internet Engineering Task Force, or it may be modified,
or dropped altogether. If that sounds like a rather informal system, it
is. It dates to the days when the Net was purely a research project for
computer scientists and engineers. Whether it will persist, given the
very different nature of the net today, is a question I don't know how to
answer.

The authoritative list of RFCs and other Internet documentation may be
found at:


RFCs and other information related to the WWW may be found -- all over the
place! -- but the authoritative site, and the site run by people who are
more interested in establishing standards for communication than in
forcing the whole wide world to use their browser, is the WWW
Consortium, at:



Andrew Burday

Date: Fri, 3 May 96 12:57:35 EDT
From: "Peter Graham, RUL" <psgraham@gandalf.rutgers.edu>
Subject: Was: Re: 9.759 quoting URLs--electronic bib. citation forms.

From: Peter Graham, Rutgers University Libraries

Al Magary says, i.a., about citing electronic sources,
>Xia Li and Nancy B. Crane, reference librarians at the University of
Vermont, are two leaders in this field.< and he gives examples of their

They are no more leaders in the field than is Tim Berners-Lee, who, he says
ad hominem, is not entitled to speak on bibliographic style, but I don't
know why not; his only venture in the field is clear, unambiguous, and
helpful. The same cannot be said for Li and Crane, who are getting clearer
and more helpful but fundamentally, I'm sorry to say, don't display a sense of
what the network is about, which flaws their enthusiastic efforts.

My published critique of their first edition of
_Electronic_Styles_ may also be found
through my Web pages below. I've just looked at their new Web pages in
preparation for their next edition, which Al M. kindly pointed me to. I see
improvements in their approach, but I still see evidence of trying to map
electronics to the print environment and of unnecessary ambiguity and
redundancy. It may be unfair to say this until the book comes out, but the
Web pages show no evidence of thinking about the issues, i.e. of trying to
come up with generalizations and principles to follow instead of simply
following what's there. They adapt the APA and MLA styles without saying
why. (In their first edition they endorsed only the APA style; why
the change?) I am not familiar with the APA style. The recent MLA
edition is quite flawed when it comes to electronic citations; it's a good
start, but incomplete.

The examples on the Li/Crane Web page show several redundancies and
potentials for confusion. Once again, also, their prescription doesn't follow
from their examples, as in the following case (APA pages):

Author. (Year). Title. Journal Title [Type of medium], volume(issue),
paging or indicator of length. Available Protocol (e.g., HTTP):
Site/Path/File [Access date].

Carriveau, K. L., Jr. [Review of the book Environmental hazards:
Marine pollution]. Electronic Green Journal [Online], 2(1), 3
Available Gopher:
[1995, June 21].

One of two things is wrong with their prescription for how to cite the URL;
either they don't understand the URL structure (scheme:path) or they are
being careless. In all their examples they repeat the scheme name, as they
did with "gopher" here ("Available Gopher:[cr]gopher...."). But their
prescription speaks of the "Protocol" and then elides directly to the path,
which they call "Site/Path/File" (and of course the path may be much more
complex than that and is not normally, or ever, defined in these
terms in the networking community). In addition to these confusions,
their method leads to a redundancy in every citation as the scheme is
repeated.
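
The scheme:path structure discussed above can be checked mechanically.
What follows is only a minimal Python sketch, not anything published by
Li and Crane or by Berners-Lee. The function names split_scheme and
li_crane_availability are hypothetical, and the sample string is the
Gopher URL from the suggested format later in this message, with its
line-end hyphen removed.

```python
def split_scheme(url):
    """Split a URL into (scheme, rest) at the first colon."""
    scheme, sep, rest = url.partition(":")
    if not sep or not scheme:
        raise ValueError("no scheme found in %r" % url)
    return scheme.lower(), rest


def li_crane_availability(url):
    """Hypothetical rendering of the 'Available Protocol:' pattern
    as it appears in the examples (an assumption, not their rule)."""
    scheme, _rest = split_scheme(url)
    return "Available %s: %s" % (scheme.capitalize(), url)


url = "gopher://gopher.uidaho.edu/11/UI_gopher/library/egj03/carriv01.html"
print(split_scheme(url))
print(li_crane_availability(url))
```

The second line printed names the scheme twice, once as the "Protocol"
label and once inside the URL itself, which is the redundancy at issue.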

Al M. notes their punctuation and sees it as a potential confusion. I
agree with Li and Crane that punctuation should be part of what we
expect, e.g. a trailing period at the end of a sentence with a URL.
The way to do this properly is to set off the URL from normal text,
rather as we cite a journal using italics. Berners-Lee proposed the
angle brackets, a very reasonable proposal, which Magary for some
reason discards out of hand by dismissing him.

(Remember through all of this that citations are starting to appear
for electronic information most of all in other electronic texts; thus
the ability to cut and paste, to easily identify a character string,
and to be unconcerned with surrounding text becomes more important
than in the print environment. The Li/Crane web pages don't say
whether they are considering the medium of publication of the
citations or not; the book may.)

The Li/Crane use of "Available" continues from their first edition. I
don't know why; we don't bother with it for print. Possibly they see
the need to distinguish a URL from surrounding text; again, the angle
brackets would help, as would the notation URL which Berners-Lee
suggests (and which would assist Li/Crane in distinguishing
addressable citations from, say, email citations).

Thus my suggested format for what they have above would be
....paragraphs <URL:gopher://gopher.uidaho.edu/11/UI_gopher/libr-
ary/egj03/carriv01.html> (1995, June 21).

I admit to not being sure about what to do with the date, but you'll
notice the punctuation all works out here with no worries. (Why
did they use square brackets for the date instead of parentheses?)

Please note that the URL is hyphenated. Berners-Lee (who as Magary
tells us is no authority, though he invented URLs) has foreseen the
need to treat long URL lines as well, and his rule explicitly says
that the hyphen within a
URL at the end of a line is to be ignored (without a blank being
substituted for it). I await the Li/Crane book to see if they address
the long URL problem.
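
The angle-bracket convention and the line-end hyphen rule described
above lend themselves to mechanical extraction from plain text. The
following is a minimal Python sketch under stated assumptions: the
function name extract_urls is hypothetical, the pattern handles only
the <URL:...> form used in this message, and a URL that really ends a
line with a hyphen would be joined incorrectly, so this is an
illustration rather than a complete implementation of the rule.

```python
import re

# Matches the <URL:...> wrapper, possibly spanning several lines.
URL_PATTERN = re.compile(r"<URL:(.*?)>", re.DOTALL)


def extract_urls(text):
    """Return the URLs found inside <URL:...> wrappers in plain text."""
    urls = []
    for match in URL_PATTERN.finditer(text):
        body = match.group(1)
        # A hyphen at the end of a line is dropped; no blank replaces it.
        body = re.sub(r"-[ \t]*\r?\n[ \t]*", "", body)
        # Any remaining whitespace is assumed to come from line wrapping.
        body = re.sub(r"\s+", "", body)
        urls.append(body)
    return urls


sample = (
    "....paragraphs <URL:gopher://gopher.uidaho.edu/11/UI_gopher/libr-\n"
    "ary/egj03/carriv01.html> (1995, June 21)."
)
print(extract_urls(sample))
```

Because the closing bracket ends the match, a sentence such as
"See <URL:http://www.princeton.edu/~mccarty/humanist/>." gives back
the URL without the trailing period.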

Another example from the Li/Crane web page:

Discussion List Messages

Basic forms:

Author. (Year, Month day). Subject of message. Discussion List
[Type of medium]. Available E-mail: DISCUSSION LIST@e-mail address
[Access date].

Author. (Year, Month day). Subject of message. Discussion List
[Type of medium]. Available E-mail: LISTSERV@e-mail address/Get
[Access date].

Email: They again make the distinction between "discussion list" and
"listserv", which is rather like distinguishing between vegetables and
spinach. Their example for "discussion lists" is an email address, so
they don't mean a newsgroup (which is otherwise unmentioned, but
without the full book before me this would be an unfair criticism).
They give two basic forms for "discussion list" and "listserv" so they
seem to think they are different, but in fact a listserv is a specific
software implementation of what they call a discussion list (not a
familiar term in the networked world); others are of course listproc,
majordomo, and the like. There is no bibliographic need to
distinguish amongst them (or is there?).

I might add that the use of the term "Available" here begs a very big
question, for list messages are very unavailable except through
archives which only some maintain, and the means of getting at the
archive is not what is prescribed in their example.

"* Author's login name, in uppercase, is given as the first element."
Why? Few people do that. By "login name" they seem to mean the
userid (a more familiar term) which is only part of the author's
address, which would be far more helpful and a better identifier for a
citation. And in fact, in their email examples, this is what they
use; why the difference?

Thus in their first example instead of RRECOME for an
"author" I would suggest "rrecome@statecoll.edu". The question arises
how to distinguish this lower case from surrounding text; I'm sure Li
and Crane could work something out if they had some basic principles
to follow, but they don't seem to. (Second thought: they do address
it in the "email" examples.)

I find I've gone on further on this than I expected to. I know it
sounds like negative criticism. On the other hand, Li/Crane and I had
several useful exchanges of correspondence during and after my
previous review, and I think it's fair to say I displayed my interest
in the topic. I would have been pleased to comment on any further
thoughts they have had as they prepared their next edition, but I've
not been asked (nor have I seen any open call for comments);
so I don't mind commenting publicly. In their first
edition or in remarks to me they spoke of getting a "life preserver"
out for the "masses" to use at a time when there was no other
published help. I hope we're not going to get a second
edition of a life preserver.

To get back to where I started: I disagree with Magary that Li/Crane
are "leaders" in this field, except that they got out there first.
Many people, of course, will follow whoever shows up first.
Eventually it will all work out. --pg

Peter Graham psgraham@gandalf.rutgers.edu Rutgers University Libraries
169 College Ave., New Brunswick, NJ 08903 (908)445-5908; fax (908)445-5888