Humanist Discussion Group, Vol. 13, No. 560.
Centre for Computing in the Humanities, King's College London
Date: Wed, 26 Apr 2000 20:47:46 +0100
From: Thierry van Steenberghe <email@example.com>
Subject: Re: 13.0549 voice; virtual reality
In reply to Francois Lachance's question below:
> Humanist Discussion Group, Vol. 13, No. 549.
> Centre for Computing in the Humanities, King's College London
> Date: Wed, 19 Apr 2000 21:11:58 +0100
> From: email@example.com (Francois Lachance)
> Subject: Re: 13.0547 music: voice, instrument and song
> > transcription of its dictionary, and asked them if they could
> > IPA-coded dictionary into the transcription used by their TTS
> > turned out that the process worked astonishingly well, even though
> > proved that a finer tweaking than just translating IPA into the TTS
> > phonetic code would be required to obtain 'natural' sounding
> The tweaking, did it occur? If it did, did it relate to some form of
> transcription to indicate pauses and rhythm? I ask because this has
> implications for the elements one would use in the encoding of a spoken
> word corpus.
> Thank you for taking the time to inform us all of these very interesting
> Francois Lachance
> Post-doctoral Fellow
> projet HYPERLISTES project
Yes, some initial tweaking was actually performed on a selected part of our
corpus, with the aim of demonstrating its feasibility and evaluating its
potential. Although this first round did bring an improvement, the
result was still not good enough for the utterances to sound
'natural', and further tweaking would have been desirable. However, we
could not proceed, as this part of the project was still under way when the
whole project was suspended for external reasons (provisionally, we hope).
As for the tweaking process itself, there was no
modification of the transcriptions, as far as I know. The speech synthesis
specialists at the TTS company carried out the operation using their own
(graphical) tools, which allow one to stretch or compress parts of the
synthesised 'speech' (thus effectively modifying pauses and rhythm) and to
modify the pitch of selected (di)phones (thus improving the syllabic stress
and the intonation, for example).
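
To make the two operations concrete, here is a minimal sketch in Python. It is not the TTS company's tool, which was graphical and whose internals I cannot describe; the Segment type, the phone labels, and all the numbers are invented for illustration. It only assumes that a synthesised utterance can be viewed as a sequence of (di)phones, each with a duration and a pitch.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Segment:
    phone: str          # phone or diphone label (invented notation)
    duration_ms: float  # how long the segment lasts
    pitch_hz: float     # 0.0 stands for a pause or unvoiced segment


def stretch(segments: List[Segment], start: int, end: int, factor: float) -> List[Segment]:
    """Scale the duration of segments[start:end]; factor > 1 stretches, < 1 compresses."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [
        replace(s, duration_ms=s.duration_ms * factor) if start <= i < end else s
        for i, s in enumerate(segments)
    ]


def shift_pitch(segments: List[Segment], start: int, end: int, semitones: float) -> List[Segment]:
    """Raise or lower the pitch of voiced segments in segments[start:end]."""
    ratio = 2.0 ** (semitones / 12.0)
    return [
        replace(s, pitch_hz=s.pitch_hz * ratio) if start <= i < end and s.pitch_hz > 0 else s
        for i, s in enumerate(segments)
    ]


# Invented example utterance: "_" marks a pause.
utterance = [
    Segment("b", 60, 110), Segment("o~", 140, 115), Segment("_", 80, 0),
    Segment("Z", 70, 105), Segment("u", 120, 100), Segment("R", 90, 95),
]

tweaked = stretch(utterance, 2, 3, 1.5)    # lengthen the pause (rhythm)
tweaked = shift_pitch(tweaked, 4, 5, 12)   # raise one vowel by an octave (stress)
```

Both functions return a new list and leave the input untouched, so a hand-tweaked version can always be compared against the untweaked synthesis.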
In my opinion, the process seemed quite promising, and work should have
continued on the selected corpus to understand exactly what had to be done,
and then perhaps to devise an automatable procedure that would allow
the whole corpus to be batch-tweaked first (probably by actually modifying
the transcriptions), to a state where only minor hand-tweaking would need
to be applied to a reduced number of entries.
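
Such a procedure was never written, so the following Python sketch only illustrates the shape it might take: a batch pass that applies rewrite rules to every transcription, followed by a check that sets aside the entries still needing attention by hand. The rules, the phonetic symbols, the headwords and the review criterion are all hypothetical placeholders, not findings from the project.

```python
import re
from typing import Callable, Dict, List, Pattern, Tuple

Rule = Tuple[Pattern[str], str]

# Hypothetical rewrite rules over the TTS phonetic code.
RULES: List[Rule] = [
    (re.compile(r"\|"), "_ "),   # placeholder: turn a boundary mark into a pause symbol
    (re.compile(r"\s+"), " "),   # normalise whitespace left over by earlier rules
]


def batch_tweak(
    lexicon: Dict[str, str],
    rules: List[Rule],
    needs_review: Callable[[str], bool],
) -> Tuple[Dict[str, str], List[str]]:
    """Apply every rule to every transcription; list headwords that still need hand-tweaking."""
    tweaked: Dict[str, str] = {}
    review: List[str] = []
    for headword, transcription in lexicon.items():
        result = transcription
        for pattern, replacement in rules:
            result = pattern.sub(replacement, result)
        result = result.strip()
        tweaked[headword] = result
        if needs_review(result):
            review.append(headword)
    return tweaked, review


# Invented entries and an invented criterion: more than one pause needs a human ear.
lexicon = {
    "entry1": "b o~ | Z u R",
    "entry2": "s a l y",
    "entry3": "k o m a~ | t a l e | v u",
}
tweaked, review = batch_tweak(lexicon, RULES, lambda t: t.count("_") > 1)
```

Keeping the review criterion as a separate function means it can be tightened or relaxed as one learns which automatic changes can be trusted.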
Thierry van Steenberghe
Bruxelles / Belgium
mailto:firstname.lastname@example.org