13.0560 voice processing

From: Humanist Discussion Group (willard@lists.village.virginia.edu)
Date: Wed Apr 26 2000 - 20:02:13 CUT

                   Humanist Discussion Group, Vol. 13, No. 560.
           Centre for Computing in the Humanities, King's College London

             Date: Wed, 26 Apr 2000 20:47:46 +0100
             From: Thierry van Steenberghe <100342.254@compuserve.com>
             Subject: Re: 13.0549 voice; virtual reality

    In reply to Francois Lachance's question below:

    > Humanist Discussion Group, Vol. 13, No. 549.
    > Centre for Computing in the Humanities, King's College London
    > <http://www.princeton.edu/~mccarty/humanist/>
    > <http://www.kcl.ac.uk/humanities/cch/humanist/>
    > [1] From: lachance@chass.utoronto.ca (Francois Lachance) (17)
    > >
    > --[1]------------------------------------------------------------------
    > Date: Wed, 19 Apr 2000 21:11:58 +0100
    > From: lachance@chass.utoronto.ca (Francois Lachance)
    > Subject: Re: 13.0547 music: voice, instrument and song
    > Thierry,
    > > transcription of its dictionary, and asked them if they could transform our
    > > IPA-coded dictionary into the transcription used by their TTS engine. It
    > > turned out that the process worked astonishingly well, even though it also
    > > proved that a finer tweaking than just translating IPA into the TTS
    > > phonetic code would be required to obtain 'natural' sounding
    > The tweaking, did it occur? If it did, did it relate to some form of
    > transcription to indicate pauses and rhythm? I ask because this has
    > implications for the elements one would use in the encoding of a spoken
    > word corpus.
    > Thank you for taking the time to inform us all of these very interesting
    > developments.
    > --
    > Francois Lachance
    > Post-doctoral Fellow
    > projet HYPERLISTES project
    > http://www.humanities.mcmaster.ca/~hyplist/


    Yes, some initial tweaking was actually performed on a selected part of our
    corpus, with the intention of demonstrating its feasibility and evaluating
    its potential. Although this first tweaking did yield an improvement, the
    result was still not good enough for the utterances to sound 'natural', and
    further tweaking would have been desirable. However, we could not proceed:
    this part of the project was still under way when the whole project was
    suspended for external reasons, provisionally, we hope.

    Now, as far as the tweaking process itself is concerned, there was no
    modification of the transcriptions, as far as I know. The speech synthesis
    specialists at the TTS company carried out the operation using their own
    (graphical) tools, which make it possible to stretch or compress parts of
    the synthesised 'speech' (thus effectively modifying pauses and rhythm) and
    to modify the pitch of selected (di)phones (thus improving the syllabic
    stress and the intonation, for example).
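
    As an illustration only, the two kinds of adjustment described above
    (stretching or compressing durations, and shifting the pitch of a selected
    unit) can be sketched on a symbolic description of an utterance. This is
    not the TTS company's tool, which was graphical and is not documented
    here; the Segment type, its fields, the "_" pause label and both function
    names are assumptions made for the example, and no audio is processed.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Segment:
    """One unit of a synthesised utterance (hypothetical representation)."""
    label: str          # assumed (di)phone label; "_" stands for a pause
    duration_ms: float  # length of the unit in milliseconds
    pitch_hz: float     # target pitch; 0.0 for pauses and unvoiced units


def stretch(segments: List[Segment], start: int, end: int,
            factor: float) -> List[Segment]:
    """Scale the durations of segments[start:end] by `factor`.

    A factor above 1 lengthens the span (slower, longer pauses);
    a factor below 1 compresses it.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [
        replace(s, duration_ms=s.duration_ms * factor) if start <= i < end else s
        for i, s in enumerate(segments)
    ]


def shift_pitch(segments: List[Segment], index: int,
                semitones: float) -> List[Segment]:
    """Raise or lower the pitch of one segment by a number of semitones."""
    ratio = 2.0 ** (semitones / 12.0)
    return [
        replace(s, pitch_hz=s.pitch_hz * ratio) if i == index else s
        for i, s in enumerate(segments)
    ]


# Made-up utterance: two units separated by a short pause.
utterance = [
    Segment("b-o", 80.0, 120.0),
    Segment("_", 50.0, 0.0),
    Segment("o-n", 100.0, 110.0),
]
longer_pause = stretch(utterance, 1, 2, 2.0)      # double the pause only
stressed = shift_pitch(longer_pause, 2, 12.0)     # raise last unit one octave
```

    Both functions return new lists rather than modifying their input, so an
    untweaked version of each entry remains available for comparison.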

    In my opinion, the process seemed quite promising, and work should have
    continued on the selected corpus to understand exactly what had to be done
    and then perhaps to devise an automatable procedure that would allow the
    whole corpus to be first batch-tweaked (probably by effectively modifying
    the transcriptions) to a state where only minor hand-tweaking would need
    to be applied to a reduced number of entries.
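
    Such a procedure was never built, so what follows is only a rough sketch
    of the shape it might take: apply a list of rewriting rules to every
    transcription, then set aside the entries that still look doubtful for
    hand-tweaking. The function name, the sample rule (lengthening a final
    "a") and the review test (a "?" left in the transcription) are all
    invented for the example and do not come from the project.

```python
from typing import Callable, Dict, List, Tuple

Rule = Callable[[str], str]


def batch_tweak(lexicon: Dict[str, str],
                rules: List[Rule],
                needs_review: Callable[[str], bool]
                ) -> Tuple[Dict[str, str], List[str]]:
    """Apply every rule, in order, to every transcription.

    Returns the tweaked lexicon and the list of headwords whose
    transcription should still be checked by hand.
    """
    tweaked: Dict[str, str] = {}
    review: List[str] = []
    for word, transcription in lexicon.items():
        for rule in rules:
            transcription = rule(transcription)
        tweaked[word] = transcription
        if needs_review(transcription):
            review.append(word)
    return tweaked, review


# Hypothetical rule: mark a word-final "a" as long.
def lengthen_final_a(t: str) -> str:
    return t[:-1] + "a:" if t.endswith("a") else t


# Hypothetical review test: a "?" marks a symbol nobody could map.
def has_unknown_symbol(t: str) -> bool:
    return "?" in t


sample = {"papa": "papa", "x": "k?s"}
result, to_check = batch_tweak(sample, [lengthen_final_a], has_unknown_symbol)
```

    Keeping the rules as a plain list makes it easy to add or reorder them as
    work on the selected corpus shows which adjustments generalise.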


    Thierry van Steenberghe
    Bruxelles / Belgium
    mailto:100342.254@compuserve.com

    This archive was generated by hypermail 2b29 : Wed Apr 26 2000 - 20:11:41 CUT