( gen_001 "sentence to synthesize." )
( gen_002 "another new sentence." )
( gen_003 "here is one more." )
....
Except that yours won't be in English, obviously.
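If your test sentences start out as a plain text file with one sentence per line, a file in the format above can be generated with a few lines of shell. This is only a minimal sketch: the function name make_done_data and the input file name sentences.txt are made up for illustration, the gen_NNN naming simply follows the example above, and it assumes the sentences contain no double quotes or backslashes (no escaping is done).

```shell
# Hypothetical helper (not part of the existing scripts): turn a plain-text
# file with one sentence per line into prompt entries gen_001, gen_002, ...
# Assumes the sentences contain no double quotes or backslashes.
make_done_data() {
    awk '
        NF == 0 { next }    # skip blank lines
        { printf "( gen_%03d \"%s\" )\n", ++n, $0 }
    ' "$1"
}
```

For example, make_done_data sentences.txt > your_text.done.data would write the entries to the file that gets copied into place in the next step.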
cd /proj/tts/tools/babel_scripts/ecooper/turkish_bn_clean
cp your_text.done.data etc/txt.done.data.test
First, move the training data files out of the way so they don't get clobbered when you run the scripts on your test utterances:
mv prompt-utt prompt-utt-train
mv lab lab-train
mv festival/utts festival/utts-train
mkdir prompt-utt
mkdir lab
mkdir festival/utts
Create prompts, fake alignment files, and test data utts:
./bin/do_build build_prompts etc/txt.done.data.test
for u in prompt-utt/*.utt; do
    b=$(basename "$u" .utt)
    grep "dur_factor" "$u" | sed 's:"::g' | awk '{print "0 0 " $6}' >> "lab/$b.lab"
done
./bin/do_build build_utts etc/txt.done.data.test
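Before renaming anything, it can be worth checking that every prompt utt ended up with a non-empty fake alignment file and a built utt. The function below is a hypothetical sanity check, not part of do_build; it only assumes the directory layout used in the steps above, and takes the voice directory as an optional argument (defaulting to the current directory).

```shell
# Hypothetical sanity check (not part of do_build): for every prompt utt,
# confirm that a non-empty lab file and a non-empty built utt exist.
# Returns 0 if everything is present, 1 otherwise.
check_test_utts() {
    dir=${1:-.}
    missing=0
    for u in "$dir"/prompt-utt/*.utt; do
        [ -e "$u" ] || continue    # glob matched nothing
        b=$(basename "$u" .utt)
        if [ ! -s "$dir/lab/$b.lab" ]; then
            echo "missing or empty: lab/$b.lab"
            missing=1
        fi
        if [ ! -s "$dir/festival/utts/$b.utt" ]; then
            echo "missing or empty: festival/utts/$b.utt"
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_test_utts from the voice directory prints one line per missing file and prints nothing if all the outputs are there.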
Just for clarity, rename the test output files:
mv prompt-utt prompt-utt-test
mv lab lab-test
mv festival/utts festival/utts-test
Your output .utt files are in festival/utts-test.
Follow the steps in the data Makefile (e.g. /proj/tts/hts-2.3/template_si_htsengine/data/Makefile) for the make label target. We typically just copy the Makefile over to wherever the utts are, edit it so that it only does the fullcontext step, and run it.
Make sure the DUMPFEATS command contains the pointer to the Festival voice you want to use for your language. If you are re-using the utt-to-lab setup that you used for creating training labels, then it should already be there. Recall that the Festival voice is only used to note which phonemes are vowels, so it does not have to be the exact same frontend that you used to create the utt files, as long as the phoneset contains properly named phones with properly identified vowels.
If you see warnings about phoneset Radio, these can be safely ignored. They just indicate that we are using phonemes outside of the default US English set.
The output will be in labels/full. Follow the next steps to remove the timestamps, and put these final labels into yourvoicedir/data/labels/gen.