3.623 Connection Machine, parallel processing, etc. (107)

Willard McCarty (MCCARTY@vm.epas.utoronto.ca)
Mon, 23 Oct 89 20:50:52 EDT

Humanist Discussion Group, Vol. 3, No. 623. Monday, 23 Oct 1989.

Date: Mon, 23 Oct 89 19:44 N
Subject: supercomputing humanities; character sets

I do not want to beat a dead horse, but I am not satisfied w/ the
way the discussions of humanities supercomputing have been going. I felt
certain that some Connection Machine followers would pitch in.

Yes, analog machines do not exist in the literal sense of the word, but it is
possible to "create" them and their hybrid brethren (digital/analog). If one
uses the "Continuous Sigmoidal" as a connectionist node activation function,
one has achieved an analog representation. One can achieve the hybrid
digital/analog representation by using an activation function of "continuous
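
To make the "Continuous Sigmoidal" concrete, here is a minimal sketch in
Python. It assumes the common logistic form of the sigmoid; the function names,
the `gain` parameter, and the hard `threshold` function shown for contrast are
illustrative assumptions, not taken from any particular connectionist package.

```python
import math

def sigmoid(net_input, gain=1.0):
    """Logistic sigmoid: a smooth, graded output between 0 and 1."""
    z = gain * net_input
    # Two branches keep math.exp from overflowing on large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def threshold(net_input):
    """Hard step, for contrast only: the output is all-or-nothing (digital)."""
    return 1.0 if net_input >= 0.0 else 0.0

def node_activation(inputs, weights, bias=0.0, gain=1.0):
    """Activation of one node: weighted sum of its inputs, then the sigmoid."""
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(net, gain)
```

With the sigmoid, a small change in the net input gives a small change in the
output, which is the sense in which the representation is "analog"; the step
function can only jump between its two values.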

What does all this mean to the computing humanist? If one can find a use for
probabilistic representation schemes in one's "humanistic" processing, then this
would be a worthy area to explore. For example, I have been toying with the
idea of using such probability modelling as a way of addressing the problem
of identifying context in complex natural language understanding (NLU).

I cannot go into detail right now, but if anyone is willing and patient, I
can try responding to specific questions. It would be a great help to me and
other HUMANISTs if the connectionism cognoscenti spoke up and carried these
comments further.

For those thinking about real applicability and accessibility to the hardware
and technology, I have two comments.

1. Machines like the Connection Machine (CM) can now be purchased in smaller
configurations, down to 2,000 processors. Anything smaller is not
cost-justifiable (paraphrase from TMC spokesperson).
NB: I do not have stock interest in the CM. I am just more familiar w/ it.
Other similar SIMD machines exist, but I am not so close to them.

2. Powerful applications for text processing have been around. I have
a friend who developed something for lisp machines, but the project
was scrapped because the company could not find a market for the
product. Thinking Machines Corp. is aware of a potential use for
their machine in text processing and is keeping an ear open for
possibilities. As someone interested in developing a robust
text-processing system for such a machine, I have been keeping them
informed, and would like to inform HUMANIST of their possibilities
as well. It has been the lack of a good project proposal that has
inhibited the bridging of the gap.

FINALLY, as the subject header suggests, I would like to urge SOME HUMANISTs
(and their affiliates) to seriously concentrate on the issue of
an international coding scheme for characters. We all know that ASCII and
relatives are insufficient. I have heard of an SGML initiative to come up
with something, but have not heard of any further developments. If people
have been toying w/ the idea, but have failed to find the proper support
for such an initiative, please let me know. There is a very strong
commercial need for an ISO character coding schema, and the problem will not
go away.

Here's hoping for some very good future projects ...

-Joe Giampapa
Dida*Lab (research lab for DiDA*EL, education technologies developers)

[Subsequent note from the same person.... W.M.]

I forgot to clarify: parallel computation problems usually take two forms:
Multiple Instruction, Multiple Data (MIMD -- pronounced mim-dee); and
Single Instruction, Multiple Data (SIMD -- pronounced sim-dee).
PDP refers to the Parallel Distributed Processing (pronounced pee-dee-pee)
project at Carnegie Mellon (CMU) under Rumelhart and McClelland.
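
The SIMD/MIMD distinction can be sketched in a few lines of Python. This is
only a sequential simulation under stated assumptions: each list element
stands in for one processor's datum, and the helper names `simd_step` and
`mimd_step` are hypothetical, chosen for illustration.

```python
def simd_step(instruction, data):
    """SIMD: one instruction is broadcast to every (simulated) processor,
    and each processor applies it to its own datum."""
    return [instruction(x) for x in data]

def mimd_step(instructions, data):
    """MIMD: each (simulated) processor runs its own instruction
    on its own datum."""
    return [f(x) for f, x in zip(instructions, data)]

words = ["arma", "virumque", "cano"]
lengths = simd_step(len, words)                       # same operation everywhere
mixed = mimd_step([str.upper, len, str.title], words)  # a different operation each
```

On real SIMD hardware the processors execute the broadcast instruction at the
same time; the list comprehension here only mimics the pattern of one
operation applied across many data items.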

PDP and SIMD basically describe the same phenomena.

PDP and connectionist work has had very good success in character
recognition (cf. Hopfield nets) and speech recognition (cf. NETtalk -- query to
others: did I get this last one correct?). HUMANISTs can follow up on these
references in any Computer Science library, in the PDP books, vols 1 & 2 of
Rumelhart and McClelland.
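
For readers who want a feel for how a Hopfield net recalls a stored pattern,
here is a minimal sketch in Python. It assumes the textbook formulation (units
valued +1/-1, Hebbian weights, one-unit-at-a-time updates); the two 8-unit
patterns are toy stand-ins for character bitmaps, and all names are
illustrative rather than taken from the PDP software.

```python
def train(patterns):
    """Hebbian weights: w[i][j] sums p[i]*p[j] over the stored patterns,
    with no self-connections."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def recall(w, state, max_sweeps=10):
    """Update units one at a time until a full sweep changes nothing."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            net = sum(w[i][j] * s[j] for j in range(len(s)))
            new = 1 if net >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:
            break
    return s

# Two toy "characters" as +1/-1 patterns (assumed data, for illustration).
A = [1, 1, 1, 1, -1, -1, -1, -1]
B = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train([A, B])

noisy_A = [-1] + A[1:]          # A with its first unit flipped
restored = recall(weights, noisy_A)
```

In this small case `restored` equals `A`: the net settles back onto the stored
pattern nearest the corrupted input, which is the basic idea behind using such
nets for recognizing degraded characters.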

Additionally, if people want to play w/ these models, there are two references:
The PDP Workbook: contains two pc-compatible diskettes w/ demo software. The
programs, written in C, are developed to coincide w/ chapters in the PDP
books. They can be very difficult to comprehend and use, but are one of
the best examples for developing many flexible, extendable, and rigorous
test PDP models.
MacBrain: Yes, runs on the Mac. I believe the min. requirement is Mac+ w/
512k. Better versions for the MacII. The version I used (macBrain2.0) is
a bit buggy and difficult to expand, but is EXCELLENT for teaching people
SIMD basics. I gave a presentation to Humanities professors and their
students, and both groups loved it and learned much.

I invite anyone to correct me if I am wrong or did not clarify something, and
esp. to add to this area. There are some very good possibilities for computing
humanists in these areas.

-Joe Giampapa