Creating Synthetic Voices for Children by Adapting Adult Average Voice Using Stacked Transformations and VTLN

Reima Karhila, D.R. Sanand, Mikko Kurimo and Peter Smit
Speaker adaptation with 10 sentences, as used in the listening tests for ICASSP 2012.
(See the paper in IEEE Xplore.)

[Diagram of the adaptation pipelines. Box labels: 40 child speakers, 60 sentences each; VTLN normalisation; CSMAPLR group adaptation; target speaker; CSMAPLR speaker adaptation with 10 sentences; VTLN + CSMAPLR speaker adaptation with 10 sentences.]
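
The stacking named in the title can be pictured as composing two affine transforms, since CSMAPLR, like other constrained linear-regression adaptation methods, applies a transform of the form mean' = A * mean + b to the Gaussian means of the model. The sketch below is a toy illustration in plain Python, not the authors' implementation. The 2-dimensional numbers, the names `group` and `speaker`, and the helper functions are all hypothetical, and covariance transformation and VTLN are left out.

```python
# Toy illustration only: all matrices and vectors are made-up numbers,
# not transforms estimated from speech data.

def mat_vec(a, x):
    """Multiply matrix a (list of rows) by vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def mat_mul(a, b):
    """Multiply matrix a by matrix b (both lists of rows)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def apply_affine(transform, mean):
    """Apply mean' = A * mean + b to a mean vector."""
    a, b = transform
    return [m + b_i for m, b_i in zip(mat_vec(a, mean), b)]

def stack(first, second):
    """Compose two affine transforms; `second` is applied after `first`."""
    a1, b1 = first
    a2, b2 = second
    stacked_a = mat_mul(a2, a1)
    stacked_b = [x + y for x, y in zip(mat_vec(a2, b1), b2)]
    return stacked_a, stacked_b

# Hypothetical transform from the adult average voice to the child group.
group = ([[1.0, 0.5], [0.0, 2.0]], [1.0, -1.0])
# Hypothetical transform from the child group to one target speaker.
speaker = ([[2.0, 0.0], [1.0, 1.0]], [0.5, 0.0])

adult_mean = [2.0, 4.0]

# Route 1: apply the group transform, then the speaker transform.
step_by_step = apply_affine(speaker, apply_affine(group, adult_mean))
# Route 2: compose the two transforms first, then apply once.
stacked = apply_affine(stack(group, speaker), adult_mean)

print(step_by_step, stacked)
```

Both routes give the same adapted mean, which is the sense in which a speaker transform estimated from only 10 sentences can sit on top of a group transform estimated from many child speakers.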

Adapted from adult voice

Stack adapted voice

Stack adapted voice with VTLN

Adapted from child voice
