- 2. Speech synthesis: What is the task? Generating natural-sounding speech on the fly, usually from text
- 3. Input type: Concept-to-speech (CTS) vs text-to-speech (TTS). In CTS, the content of the message is determined from an internal representation, not
- 4. Text-to-speech: What to say: text-to-phoneme conversion is not straightforward. Example: Dr Smith lives on Marine Dr in
- 5. Architecture of TTS systems: Text-to-phoneme module; Grapheme-to-phoneme conversion; Prosodic modelling; Acoustic synthesis; Abbreviation lexicon; Exceptions lexicon
- 6. Text normalization: Any text that has a special pronunciation should be stored in a lexicon. Abbreviations
- 7. Grapheme-to-phoneme conversion: English spelling is complex but largely regular; other languages more (or less) so. Gross
- 8. Grapheme-to-phoneme conversion: Much easier for some languages (Spanish, Italian, Welsh, Czech, Korean). Much harder for others
- 9. Syntactic (etc.) analysis: Homograph disambiguation requires syntactic analysis. Example: "He makes a record of everything they record."
- 10. Architecture of TTS systems: Text-to-phoneme module; Grapheme-to-phoneme conversion; Prosodic modelling; Acoustic synthesis; Abbreviation lexicon; Exceptions lexicon
- 11. Prosody modelling: Pitch, length, loudness. Intonation (pitch) is essential to avoid a monotonous, robot-like voice; linked to basic
- 12. Acoustic synthesis: Alternative methods: articulatory synthesis, formant synthesis, concatenative synthesis, unit selection synthesis
- 13. Articulatory synthesis: Simulation of the physical processes of human articulation. Wolfgang von Kempelen (1734–1804) and others used
- 14. Formant synthesis: Reproduce the relevant characteristics of the acoustic signal, in particular the amplitude and frequency of
- 15. Formant synthesis demo: In the control panel, select the “Speech” icon, type in your text, and preview the voice
- 16. Concatenative synthesis: Concatenate segments of pre-recorded natural human speech. Requires a database of previously recorded human speech
- 17. Diphone synthesis: Most important for natural-sounding speech is to get the transitions right (allophonic variation,
- 18. Diphone synthesis: Most systems use diphones because they are manageable in number and can be automatically extracted
- 19. Concatenative synthesis: Input is a phonemic representation plus prosodic features. Diphone segments can be digitally manipulated for
- 20. Unit selection synthesis (USS): Same idea as concatenative synthesis, but the database contains a bigger variety of “units”
- 21. Speech synthesis demo
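
The text normalization step (slides 4 to 6) can be illustrated with the "Dr Smith lives on Marine Dr" example, where the same abbreviation needs two different expansions. The following is a minimal sketch, not the method of any particular TTS system: the `ABBREVIATIONS` table, the `is_name` helper, and the position-based rule (title before a capitalised name, street type after one) are all assumptions made for illustration, and the input is assumed to be whitespace-separated with no punctuation.

```python
# Hypothetical abbreviation lexicon: each abbreviation has one expansion
# when it precedes a capitalised name and another when it follows one.
ABBREVIATIONS = {
    "Dr": {"before_name": "Doctor", "after_name": "Drive"},
    "St": {"before_name": "Saint", "after_name": "Street"},
}


def is_name(token):
    """Crude stand-in for name detection: capitalised and not an abbreviation."""
    return token[:1].isupper() and token not in ABBREVIATIONS


def normalize(text):
    """Expand known abbreviations using their immediate neighbours as context."""
    tokens = text.split()
    out = []
    for i, tok in enumerate(tokens):
        entry = ABBREVIATIONS.get(tok)
        if entry is None:
            out.append(tok)  # ordinary word: pass through unchanged
            continue
        prev_tok = tokens[i - 1] if i > 0 else ""
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else ""
        if is_name(prev_tok):
            out.append(entry["after_name"])   # "Marine Dr" -> "Marine Drive"
        elif is_name(next_tok):
            out.append(entry["before_name"])  # "Dr Smith" -> "Doctor Smith"
        else:
            out.append(tok)  # no usable context: leave the token as written
    return " ".join(out)


print(normalize("Dr Smith lives on Marine Dr"))
```

The heuristic is deliberately naive: any capitalised word counts as a name, so a sentence such as "The Dr is in" would be expanded incorrectly. A real front end would combine a larger lexicon with the kind of syntactic analysis mentioned on slide 9.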
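
The diphone idea from slides 17 to 19 can be made concrete by showing how a phoneme string maps onto the units a diphone synthesiser would look up. This is a minimal sketch under stated assumptions: the function name `to_diphones`, the `_` silence symbol, and the `a-b` unit naming are hypothetical conventions, and the phoneme labels in the usage line are illustrative only. Each diphone spans the transition from one phoneme into the next, which is why the utterance is padded with silence at both ends.

```python
def to_diphones(phonemes, silence="_"):
    """Turn a phoneme sequence into the diphone units that cover it.

    Each unit names a transition between two adjacent phonemes, so a
    sequence of n phonemes padded with silence yields n + 1 diphones.
    """
    if not phonemes:
        return []
    padded = [silence, *phonemes, silence]
    # Pair every element with its successor to get the transitions.
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


# Illustrative phoneme labels for a three-phoneme word.
print(to_diphones(["k", "ae", "t"]))
```

For an inventory of N phonemes there are at most N × N possible diphones, which is the sense in which the unit set stays manageable compared with storing whole words or larger stretches of speech.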