Tonal Contour Generation for Isarn Speech Synthesis Using Deep Learning and Sampling-Based F0 Representation
Pongsathon Janyoi and Pusadee Seresangtakul
Speech samples to support the submission.
The synthetic speech samples are generated using the same spectral parameters but different F0 contours.
This page contains the following samples:
Natural speech.
Frame-based RNN: F0 values are generated frame by frame.
DCT-based RNN: F0 contours are represented by DCT coefficients and generated
syllable by syllable.
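
To illustrate the DCT-based representation, the following is a minimal sketch of how one syllable's F0 contour could be compressed into a few DCT coefficients and then reconstructed. It is not the authors' implementation. The function names, the use of log-F0, the orthonormal DCT-II variant, and the choice of five coefficients are all assumptions made for illustration.

```python
import numpy as np

def dct_basis(n_frames, n_coefs):
    """Return the first n_coefs rows of the orthonormal DCT-II basis.

    The result has shape (n_coefs, n_frames); n_coefs must not exceed n_frames.
    """
    n = np.arange(n_frames)
    k = np.arange(n_coefs)[:, None]
    basis = np.cos(np.pi * (n + 0.5) * k / n_frames)
    basis *= np.sqrt(2.0 / n_frames)
    basis[0] *= np.sqrt(0.5)  # scale the constant row so all rows have unit norm
    return basis

def encode_contour(log_f0, n_coefs=5):
    """Project a syllable's log-F0 frames onto the first n_coefs DCT bases."""
    log_f0 = np.asarray(log_f0, dtype=float)
    return dct_basis(len(log_f0), n_coefs) @ log_f0

def decode_contour(coefs, n_frames):
    """Rebuild an n_frames-long contour from (possibly truncated) coefficients."""
    coefs = np.asarray(coefs, dtype=float)
    return dct_basis(n_frames, len(coefs)).T @ coefs

# Hypothetical rising contour for one 20-frame syllable (log-Hz values are made up).
contour = np.linspace(5.0, 5.3, 20)
coefs = encode_contour(contour, n_coefs=5)
smoothed = decode_contour(coefs, n_frames=20)
```

In this sketch a model would predict the short vector `coefs` once per syllable instead of one F0 value per frame. Truncating to a few coefficients keeps the overall shape of the contour and discards fine frame-level detail.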