10.0728 serious content analysis

WILLARD MCCARTY (willard.mccarty@kcl.ac.uk)
Thu, 20 Feb 1997 22:22:40 +0000 (GMT)

Humanist Discussion Group, Vol. 10, No. 728.
Center for Electronic Texts in the Humanities (Princeton/Rutgers)
Centre for Computing in the Humanities, King's College London
Information at http://www.princeton.edu/~mccarty/humanist/

[1] From: David Hoover <hoover@is.nyu.edu> (25)
Subject: Re: 10.0723 a Serious Request

[2] From: Patricia Galloway <galloway@mdah.state.ms.us> (6)
Subject: Re: 10.0723 a Serious Request

[3] From: John Unsworth <jmu2m@virginia.edu> (21)
Subject: Re: 10.0723 a Serious Request

[4] From: Tom Landauer <landauer@psych.colorado.edu> (23)
Subject: An enquiry from Her Majesty's Govt

--[1]----------------------------------------------------------------
Date: Thu, 20 Feb 1997 15:27:20 -0500 (EST)
From: David Hoover <hoover@is.nyu.edu>
Subject: Re: 10.0723 a Serious Request

On Wed, 19 Feb 1997, WILLARD MCCARTY wrote:

> [1] From: Lou Burnard <lou.burnard@computing- (30)

> Well, if you had all the scripts on computer disks, you could run them
> through a program which would calculate some kind of content descriptor,
> some kind of rating. Then parents could program their TVs to block out any
> programmes with certain kinds of descriptor or rating, right? Some kind of
> automatic categorization, topic identification, that kind of thing. Is
> no-one doing research on that sort of text analysis? Specifically as applied
> to broadcasting? Has anyone tried to apply automatic content identification
> methods to this kind of domain?
>
> I'll be glad to digest any suggestions and pass them on to our man in
> Whitehall. Suggestions relating to the question asked, that is. Comments
> about the state of British politics in general, and the current government
> in particular, should be sent to a more appropriate forum. And if there's
> any research funding in this, just remember, I saw him first.
>

The only reasonable suggestion (that's polite enough to post) is "forget
it." More seriously, I'll undertake the following: when you find someone
who will produce such software, I will undertake to alter a script that
the software finds acceptable for a "G" rating so as to turn it into at
least an "R" without changing the software rating.

David L. Hoover, Assoc. Prof. of Engl. hoover@is.nyu.edu 212-998-8832
Webmaster, NYU English Dept. http://www.nyu.edu/gsas/dept/english/
"Outside of a dog, a book is man's best friend.
Inside of a dog, it's too dark to read."--Groucho Marx

--[2]----------------------------------------------------------------
Date: Thu, 20 Feb 1997 10:25:37 -0500
From: Patricia Galloway <galloway@mdah.state.ms.us>
Subject: Re: 10.0723 a Serious Request

Lou's "serious request" alas is Not Invented Here (i.e. Whitehall): My
failing memory offers up "Three Days of the Condor", in which some
obscure CIA outfit was processing tons of text to come up with
suspicious collocations. Am I right? Maybe the filmmakers know how it
was done...

Pat Galloway
Mississippi Department of Archives and History

--[3]----------------------------------------------------------------
Date: Wed, 19 Feb 1997 22:10:44 -0500
From: John Unsworth <jmu2m@virginia.edu>
Subject: Re: 10.0723 a Serious Request

I think the first thing I'd say is: can you imagine software that could
distinguish between, say, the evening news in LA and an excessively violent
TV drama, on the basis of content--or between an instance of pornography and
a show about the evils of pornography? Short of artificial intelligence,
and a pretty subtle one at that, I doubt it. I'd recommend that the
Whitehall optimist read Richard Powers, _Galatea 2.2_ to get a sense of what
he's asking for. Probably there will be others with
more and better reading suggestions...

John Unsworth / Director, IATH / Dept. of English
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
http://jefferson.village.virginia.edu/~jmu2m/

--[4]----------------------------------------------------------------
Date: Thu, 20 Feb 1997 11:38:13 -0700
From: Tom Landauer <landauer@psych.colorado.edu>
Subject: An enquiry from Her Majesty's Govt

Hey,

I'm pretty sure Latent Semantic Analysis can do that, or most of it, pretty
well and almost entirely automatically. (We've shown that it can measure how
much an essay or text has to say about how the heart works, the history of
the Panama Canal, or what causes aphasias, as accurately as expert human
judges.) It is largely independent of word choice (thus hard to fool by code
words; it doesn't just count keywords), so it should be able to measure sex
and violence and anti-Brit propaganda just fine. If British TV carries
closed captions, it could be done in real time; otherwise scripts or
transcripts would be needed. Is there really enough interest to make it
worthwhile pursuing?
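
For readers who have not met the technique, here is a toy sketch of the
idea behind "independent of word choice": build a term-by-document count
matrix, reduce it to a few latent dimensions, and compare documents there
instead of by shared words. Everything in it is an assumption made for
illustration (the six-sentence corpus, the function names, raw counts with
no term weighting, two dimensions, and a plain power iteration standing in
for a proper SVD routine); it is not the system described in the message.

```python
import math
import random

# Hypothetical toy corpus: two topics with disjoint vocabularies.
# Documents 0 and 2 share no words, but both overlap with document 1.
DOCS = [
    "heart pumps blood",
    "cardiac muscle pumps blood",
    "cardiac muscle contracts",
    "canal ships cross panama",
    "panama canal locks lift ships",
    "locks lift vessels",
]


def term_document_matrix(docs):
    """Rows are terms, columns are documents, entries are raw counts."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    row = {w: i for i, w in enumerate(vocab)}
    A = [[0.0] * len(docs) for _ in vocab]
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[row[w]][j] += 1.0
    return A


def document_columns(A):
    """Return each document as its raw term-count vector."""
    return [[r[j] for r in A] for j in range(len(A[0]))]


def cosine(x, y):
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny) if nx and ny else 0.0


def lsa_document_vectors(A, k, iters=500, seed=0):
    """Rank-k document coordinates, i.e. the columns of Sigma_k * V_k^T.

    Runs power iteration with deflation on the Gram matrix A^T A, whose
    eigenvectors are the right singular vectors of A and whose
    eigenvalues are the squared singular values.
    """
    n = len(A[0])
    G = [[sum(r[i] * r[j] for r in A) for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[] for _ in range(n)]
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(n)]
        for _ in range(iters):
            w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        Gv = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(sum(a * b for a, b in zip(v, Gv)), 0.0)
        sigma = math.sqrt(lam)
        for j in range(n):
            coords[j].append(sigma * v[j])
        # Deflate so the next pass finds the next-largest component.
        for i in range(n):
            for j in range(n):
                G[i][j] -= lam * v[i] * v[j]
    return coords


if __name__ == "__main__":
    A = term_document_matrix(DOCS)
    raw = document_columns(A)
    latent = lsa_document_vectors(A, k=2)
    # Documents 0 and 2 have no words in common, so their raw cosine is 0;
    # in the two-dimensional latent space they should land close together.
    print("raw    cos(d0, d2):", cosine(raw[0], raw[2]))
    print("latent cos(d0, d2):", cosine(latent[0], latent[2]))
    print("latent cos(d0, d5):", cosine(latent[0], latent[5]))
```

On this corpus the two heart sentences with no shared vocabulary end up
near each other in the reduced space, while the heart and canal sentences
stay apart. A real application would need a large training corpus, term
weighting, and many more dimensions, none of which this sketch attempts.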

Tom

Tom Landauer
Department of Psychology
and Institute of Cognitive Science
University of Colorado, Boulder

Postal and courier address:
Institute of Cognitive Science
Campus Box 344
U of Colorado
Boulder, CO 80309-0344

303 492 2875 (CU), 303 546 9401 (H)
FAX: 303 492 7177
email landauer@psych.colorado.edu