6.0405 Rs: Sanskrit and Hungarian E-Texts (2/216)

Fri, 11 Dec 1992 15:12:09 EST

Humanist Discussion Group, Vol. 6, No. 0405. Friday, 11 Dec 1992.

(1) Date: Thu, 10 Dec 92 12:00:10 WST (21 lines)
From: Thomas B. Ridgeway <ridgeway@blackbox.hacc.washington.edu>
Subject: Re: 6.0397 E-Text Query (1/10)

(2) Date: Wed, 9 Dec 1992 16:17 EST (195 lines)
From: Paul Mangiafico <PMANGIAFICO@guvax.acc.georgetown.edu>
Subject: Sanskrit and Hungarian E-texts

(1) --------------------------------------------------------------------
Date: Thu, 10 Dec 92 12:00:10 WST
From: Thomas B. Ridgeway <ridgeway@blackbox.hacc.washington.edu>
Subject: Re: 6.0397 E-Text Query (1/10)

John Haviland of Reed enquires re Sanskrit (or Hungarian) e-texts:
A small sample of Sanskrit e-texts is available for anonymous
ftp from blackbox.hacc.washington.edu in the directory pub/indic
(Brihatsamhita, Panini Sutras, Buddhacarita and Saundaryalahari
to be specific).
These are encoded in the proposed Classical Sanskrit Extended
standard for encoding romanized Indic languages. For more
discussion on this and related matters, I refer you to
the listserv group Indology-l, based at liverpool.ac.uk.
- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Thomas Ridgeway, Director,
Humanities and Arts Computing Center/NorthWest Computing Support Center
35 Thomson Hall, University of Washington, DR-10
Seattle, WA 98195 phone: (206)-543-4218 * Ask me about *
Internet: ridgeway@blackbox.hacc.washington.edu * Unix TeX *
- - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(2) --------------------------------------------------------------206---
Date: Wed, 9 Dec 1992 16:17 EST
From: Paul Mangiafico <PMANGIAFICO@guvax.acc.georgetown.edu>
Subject: Sanskrit and Hungarian E-texts

Regarding John Haviland's request for Sanskrit and Hungarian e-texts,
I was able to find a few with a quick search in Georgetown
University's CPET (Catalogue of Projects in Electronic Text). I have
included a list below, beginning with the ten categories under which
the data is classified, and followed by info on projects with Sanskrit
e-texts and one project with Hungarian e-texts. I hope this information is
useful to many HUMANISTs.

If you are in search of other e-texts, the Georgetown CPET may be of use
to you as well. The CPET database can be accessed via Telnet or modem,
or if you send me particulars on what you are looking for I can do a
quick search for you and email the results. In any case, if you would
like more information on this service, just send me a note.

Paul Mangiafico, project assistant
Center for Text & Technology
Georgetown University



0. Identifying acronym or short reference.
1. Name and affiliation of operation (with collaborators noted).
References to any published description.
2. Contact person and/or vendor with addresses (including
telephone and email if possible).
3. Primary disciplinary focus (and secondary interests) [e.g.
Literature, Language, Linguistics, Music, Art, etc.].
4. Focus: time period, location, individual, genre, or medium.
5. Language(s) encoded; [English, French, German, et al.].
6. Intended use(s) [e.g. textbank, database, bibliography] with
Goal (or statement of purpose) and Size [number of works, or
entries, or citations].
7. Format(s), including choice of sequential text or database
excerpts, file formats, analytical programs and programming
languages, text markup and encoding schemes, hardware and
operating systems, etc. To what extent are the formats
consistent throughout the archive?
8. Form(s) of access: if online, what policies? If tape, what
track, bpi, block size, labels, parity setting? If diskette,
what size and operating system or microcomputer? If CD-ROM,
what format? What software is needed for accessing? Is it
provided with the package? Availability and price.
9. Source(s) of the archival holdings: encoded in-house, or
obtained from elsewhere (where)? Textual authority used for
encoding? Titles of the works held, bibliographical
information on them.


Bamberg (Otto Friedrich Universität)/ Thesaurus of Texts in
Ancient Indo-European Languages
0. THESIETEXT (Thesaurus of Texts in Ancient Indo-European
Languages)
1. Thesaurus Indogermanischer Textcorpora; Universität
Bamberg, Germany. See the journal "Die Sprache," Vol. 32/2.
2. Dr. Jost Gippert
Universität Bamberg, Orientalistik
Postfach 1549
D-W-8600 Bamberg, Germany
3. Literature, language, linguistics, history
4. From beginning of literacy to 17th century; Eurasia
5. Old Indic (Sanskrit), Old Iranian (Avestan, Old Persian),
Hittite, Tokharian, Old Germanic, Greek (Ancient), Italic
languages, Armenian (Old), and several other I.-E. languages
6. Textbank
7. Sequential text; encoding scheme of DOS, WordCruncher,
and WordPerfect 5.1
8. Access on diskettes, CD-ROM (planned)
9. Encoded by various scholars in different parts of Europe.

Hamburg (Univ)/ Sanskrit medical encyclopaedias
1. Sanskrit medical encyclopaedias
2. Prof. R.E. Emmerick
Iranian Studies
University of Hamburg Germany
3. Medicine
4. Caraka, Susruta, Astangahrdaya, Astangasamgraha, and the
Siddhasara of Ravigupta
5. Sanskrit

Tübingen (Seminar für Indologie und Vergleichende
Religionswissenschaft)/ Tübingen Purana Project
1. Tübingen Purana Project. Peter Schreiner, Renate
Söhnen, Heinrich v. Stietencron. Publications:
Indices and Text of the Brahmapurana. Wiesbaden:
Harrassowitz [1987]
2. Professor Dr. Heinrich v. Stietencron
Seminar für Indologie und Vergleichende
Religionswissenschaft
Münzgasse 30
D-7400 Tübingen Germany
Tel. 0049-7071-292675
3. Indology (Indian studies), Sanskrit
4. Classical Hinduism; Puranas, Brahmapurana
5. Sanskrit
6. Published indices on microfiche; deposit of the input
with the Oxford Text Archive has been announced but not
yet carried out. The Brahmapurana is a single Sanskrit
text with ca. 14000 verses.
7. Straightforward transliteration with marking of sandhi,
nominal compounds, references; TUSTEP format (ASCII
format possible). TUSTEP programs for KWIC-index, reverse
index of word forms, etc.
9. Encoded in-house.

Zurich (Univ)/ Sanskrit texts
1. Sanskrit texts
2. Prof. Peter Schreiner
Abteilung für Indologie
Universität Zürich
Rämistr. 68
CH-8001 Zürich Switzerland
tel. 0041-1-2572036
3. Indology, Sanskrit, Hinduism, Indian philosophy
4. Visnupurana, Manu, Sakuntala, Asvaghosa, Buddhacarita,
Gaudapada-Karika, Adisesa, Paramarthasara, Bhagavadgita,
Narayaniyam, Mahabharata, Svetasvatara-Upanisad.
5. Sanskrit
6. Deposit with Oxford Text Archive intended
7. Straightforward transliteration with marking of sandhi,
nominal compounds, references; TUSTEP format (ASCII
format possible). TUSTEP programs for KWIC-index,
reverse index of word forms, etc.
8. Presently none
9. Encoded in-house

TX Austin (University of Texas)/ Thesaurus Linguae Sanskritae
1. Thesaurus Linguae Sanskritae, University of Texas
2. Prof. R. Lariviere
University of Texas
Austin, Texas 78712
tel. (512) 471-5811
5. Sanskrit
9. Texts include Mahabharata and Ramayana


PA Pittsburgh (Carnegie Mellon Univ)/ CHILDES Database
0. CHILDES (Child Language Data Exchange System)
1. CHILDES Database, Carnegie Mellon Univ
See "The Child Language Data Exchange System: An
Update," Journal of Child Language, [1990]. Snow,
Catherine. "The Child Language Data Exchange System",
ICAME Journal (No.14). Bergen, Norway: Norwegian
Computing Center [April 1990]. Carterette, E. & Jones,
M.H. Informal Speech. Berkeley: University of California
Press [1974]. MacWhinney, B. & Snow, C. "The Child
Language Data Exchange System", Journal of Child Language
(Vol.12, pages 271-296). [1985].
2. Brian MacWhinney
Department of Psychology
Carnegie Mellon University
Pittsburgh, PA 15213
BITNET: brian@andrew.bitnet
Internet: brian@andrew.cmu.edu
3. Linguistics; psycholinguistics
4. Transcripts of children's dialogue
5. English, Afrikaans, Danish, Dutch, French, German,
Hebrew, Hungarian, Italian, Polish, Slobin, Spanish,
6. Database of 40 sets of corpora of parent-child and child-
child interactions from children speaking 13 languages
in total; the corpora are divided into six major
directories: English, non-English, narratives, books,
language impairments, and second language acquisition;
includes three major tools for child language research:
(1) the CHILDES database of transcripts, (2) the CHAT
system for transcribing and coding data, and (3) the CLAN
programs for analyzing CHAT files; 140 million characters
(140 MB).
7. Database excerpts; available on floppies and tapes;
detailed coding scheme has been devised and the data are
put in that format
8. (Planned) CD-ROM
9. Obtained from researchers