11.0012 design of a scanning service?

Humanist Discussion Group (humanist@kcl.ac.uk)
Fri, 9 May 1997 21:00:20 +0100 (BST)

Humanist Discussion Group, Vol. 11, No. 12.
Centre for Computing in the Humanities, King's College London
<http://www.princeton.edu/~mccarty/humanist/>
<http://www.kcl.ac.uk/humanities/cch/humanist/>

Date: Fri, 9 May 1997 11:37:45 +0100 (BST)
From: John Bradley <John.Bradley@kcl.ac.uk>
Subject: Designing a scanning service

I have been asked to develop a technical specification for a
scanning service which would be principally used by the Humanities
community here at King's College, London. This service would supplement
a walk-in-and-use scanning facility which is developing here. I am
assuming that OCR would be the main focus, with perhaps some image
scanning services available.

Operator-supported image scanning is perhaps unnecessary except when
the highest quality of results is needed. For high-quality results
one needs a high-powered computer, high-quality image software, a
powerful scanner, and an operator trained in issues of image quality
and manipulation. For text OCR scanning I expect that (particularly
given the current state of the field) one has more modest goals and
requirements: a mainstream PC or Mac (there seems to be a larger range
of specialized OCR software products for the PC), a mainstream scanner
(perhaps with a sheet feeder), software, and a trained operator.

I'd be very pleased to hear the views of those of you who currently
run such a service. I'd be interested in your thoughts on:

(a) the nature of the materials that one actually is asked to scan.

Is there much need for operator-supported image scanning? For OCR,
what materials are currently practical?

(b) the suitability of current software to deal with these materials.

It seems to me that mainstream development in OCR software these
days is adding features that are not of much interest to humanities
scholarship, while new features or enhancements that would be useful
(such as improved training strategies to allow the scanning of a
range of writing systems) are not being developed.

(c) what type of hardware would most likely be appropriate.

There is some tension between the needs of operator-assisted image
scanning and those of OCR. Is my model of what machine and peripherals
are needed for OCR correct? How about operator skills?

My sense is that discussion of this needn't be carried out directly
on HUMANIST. I'd propose instead that anyone with comments send them
to me directly; if there is an expressed interest on HUMANIST, I can
post a summary back in a week or so.

Best wishes. ... john bradley

----------------------
John.Bradley@kcl.ac.uk