15.238 book on word-frequency distributions

From: by way of Willard McCarty (willard@lists.village.Virginia.EDU)
Date: Thu Sep 13 2001 - 01:30:05 EDT


                   Humanist Discussion Group, Vol. 15, No. 238.
           Centre for Computing in the Humanities, King's College London
                   <http://www.princeton.edu/~mccarty/humanist/>
                  <http://www.kcl.ac.uk/humanities/cch/humanist/>

             Date: Thu, 13 Sep 2001 06:21:19 +0100
             From: "David L. Gants" <dgants@parallel.park.uga.edu>
             Subject: Book: Word frequency distributions

    >> From: Jean Veronis <Jean.Veronis@newsup.univ-mrs.fr>

                              KLUWER ACADEMIC PUBLISHERS
                         TEXT, SPEECH AND LANGUAGE TECHNOLOGY
                                     Volume 18
                    Series editors: Nancy Ide and Jean Véronis

                            WORD FREQUENCY DISTRIBUTIONS
                                        by
                               R. Harald Baayen
                       University of Nijmegen, The Netherlands

    This book is a comprehensive introduction to the statistical analysis
    of word frequency distributions, intended for computational
    linguists, corpus linguists, psycholinguists, and researchers in
    the field of quantitative stylistics. Word frequency distributions
    are characterized by very large numbers of rare words. This property
    leads to strange phenomena such as mean frequencies that
    systematically change as the number of observations is increased,
    relative frequencies that even in large samples are not fully
    reliable estimators of population probabilities, and model parameters
    that vary with text or corpus size. Special statistical techniques
    for the analysis of distributions with large numbers of rare events
    can be found in various technical journals. The aim of this book is
    to make these techniques more accessible for non-specialists, both
    theoretically, by means of a careful introduction to the underlying
    probabilistic and statistical concepts, and practically, by providing
    a program library implementing the main models for word frequency
    distributions (CD-ROM included).
    Kluwer Academic Publishers, Dordrecht
    Hardbound, ISBN 0-7923-7017-1
    June 2001, 356 pp.
    EUR 117.00 / USD 108.00 / GBP 74.00
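
    The effects named in the description above (a mean frequency that
    drifts with sample size, and relative frequencies that miss the
    probability of unseen words) can be reproduced in a few lines. The
    following is only a minimal sketch and is not taken from the book's
    program library: the function names, the synthetic Zipf-like
    vocabulary of 50,000 types, and the fixed seed are all assumptions
    made for illustration. The unseen-mass figure is the standard
    Good-Turing estimate V1/N.

```python
import random
from collections import Counter
from itertools import accumulate


def profile(tokens):
    """Summarise a token sample: size N, types V, hapaxes V1, mean frequency."""
    if not tokens:
        raise ValueError("need at least one token")
    counts = Counter(tokens)
    n = len(tokens)                                   # N: number of tokens
    v = len(counts)                                   # V: number of distinct types
    v1 = sum(1 for c in counts.values() if c == 1)    # V1: types seen exactly once
    return {
        "N": n,
        "V": v,
        "V1": v1,
        "mean_freq": n / v,       # mean token frequency per type
        "unseen_mass": v1 / n,    # Good-Turing estimate of unseen probability
    }


def zipf_sample(n_tokens, n_types=50_000, seed=0):
    """Draw tokens from a hypothetical vocabulary with P(rank r) proportional to 1/r."""
    rng = random.Random(seed)
    cum_weights = list(accumulate(1.0 / r for r in range(1, n_types + 1)))
    return rng.choices(range(n_types), cum_weights=cum_weights, k=n_tokens)


def growth_table(tokens, sizes):
    """Profile the first n tokens for each n in sizes."""
    return [profile(tokens[:n]) for n in sizes]


if __name__ == "__main__":
    sample = zipf_sample(100_000)
    for row in growth_table(sample, [1_000, 10_000, 100_000]):
        print(row)
```

    In such a sample the mean frequency N/V keeps rising as more tokens
    are read, and a noticeable share of types still occurs only once,
    which is why a fixed-size summary of a text cannot simply be
    compared with that of a longer or shorter one.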

    ---------------------------------------------------------------------

    CONTENTS

    1. Word Frequencies.
    2. Non-parametric models.
    3. Parametric models.
    4. Mixture distributions.
    5. The Randomness Assumption.
    6. Examples of Applications.

    A. List of Symbols.
    B. Solutions of the exercises.
    C. Software.
    D. Data sets.
    Bibliography.
    Index.

    CD-ROM Included

    ---------------------------------------------------------------------

                                  PREVIOUS VOLUMES

          Volume 1: Recent Advances in Parsing Technology
                     Harry Bunt, Masaru Tomita (Eds.)
                     Hardbound, ISBN 0-7923-4152-X, 1996

          Volume 2: Corpus-Based Methods in Language and Speech Processing
                     Steve Young, Gerrit Bloothooft (Eds.)
                     Hardbound, ISBN 0-7923-4463-4, 1997

          Volume 3: An introduction to text-to-speech synthesis
                     Thierry Dutoit
                     Hardbound, ISBN 0-7923-4498-7, 1997

          Volume 4: Exploring textual data
                     Ludovic Lebart, André Salem and Lisette Berry
                     Hardbound, ISBN 0-7923-4840-0, December 1997

          Volume 5: Time Map Phonology:
                     Finite State Models and Event Logics in Speech
                     Recognition
                     Julie Carson-Berndsen
                     Hardbound, ISBN 0-7923-4883-4, 1997

          Volume 6: Predicative Forms in Natural Language and in
                     Lexical Knowledge Bases
                     Patrick Saint-Dizier (Ed.)
                     Hardbound, ISBN 0-7923-5499-0, December 1998

          Volume 7: Natural Language Information Retrieval
                     Tomek Strzalkowski (Ed.)
                     Hardbound, ISBN 0-7923-5685-3, April 1999

          Volume 8: Techniques in Speech Acoustics
                     Jonathan Harrington, Steve Cassidy
                     Hardbound, ISBN 0-7923-5731-0, July 1999

          Volume 9: Syntactic Wordclass Tagging
                     Hans van Halteren (Ed.)
                     Hardbound, ISBN 0-7923-5896-1, August 1999

          Volume 10: Breadth and Depth of Semantic Lexicons
                     Viegas, E. (Ed.)
                     Hardbound, ISBN 0-7923-6039-7, November 1999

          Volume 11: Natural Language Processing Using Very Large Corpora
                     Armstrong, S., Church, K.W., Isabelle, P.,
                     Manzi, S., Tzoukermann, E., Yarowsky, D. (Eds.)
                     Hardbound, ISBN 0-7923-6055-9, November 1999

          Volume 12: Lexicon Development for Speech and Language Processing
                     Frank van Eynde & Dafydd Gibbon (Eds.)
                     Hardbound, ISBN 0-7923-6368-X, April 2000.

          Volume 13: Parallel text processing:
                     Alignment and use of translation corpora
                     Jean Véronis (Ed.)
                     Hardbound, ISBN 0-7923-6546-1, August 2000.

          Volume 14: Prosody: theory and experiment
                     Studies Presented to Gösta Bruce
                     Merle Horne (Ed.)
                     Hardbound, ISBN 0-7923-6579-8, August 2000.

          Volume 15: Intonation : Analysis, Modelling and Technology
                     Antonis Botinis (Ed.)
                     Hardbound, ISBN 0-7923-6605-0, October 2000.
                     Paperback, ISBN 0-7923-6723-5, October 2000.

          Volume 16: Advances in probabilistic and other parsing technologies
                     Harry Bunt, Anton Nijholt (Eds.)
                     Hardbound, ISBN 0-7923-6616-6, October 2000.

          Volume 17: Robustness in language and speech technology
                     Jean-Claude Junqua, Gertjan van Noord (Eds.)
                     Hardbound, ISBN 0-7923-6790-1, February 2001

    Check the series Web page for order information:

          http://www.wkap.nl/series.htm/TLTB


