MIST Tacotron speech synthesis inference audio

 

 

Single speaker neutral
Prosody | Reference | Original GST | Original VAE | Robust GST | Robust VAE | MIST Tacotron 100 | MIST Tacotron 10
neutral 1
neutral 2

 

 

Single speaker emotion
Prosody | Reference | Original GST | Original VAE | Robust GST | Robust VAE | MIST Tacotron 100 | MIST Tacotron 10
angry
angry
disgusting
disgusting
fear
fear
happy
happy
sad
sad
surprise
surprise

 

 

Multi speaker neutral
Prosody | Reference | Original GST | Original VAE | Robust GST | Robust VAE | MIST Tacotron 100 | MIST Tacotron 10
Woman1
Woman2
Man1
Man2
Dialect1
Dialect2

 

Multi speaker emotion
Prosody | Reference | Original GST | Original VAE | Robust GST | Robust VAE | MIST Tacotron 100 | MIST Tacotron 10
Angry 1
Angry 2
Happy 1
Happy 2
Sad 1
Sad 2

 

Non-linear prosody transfer
Reference_sad nea_happy nea_sad nee_sad neo_neutral