To load a voice from that list:
festival> (voice_yourchosenvoicename)
To view the phoneset of the currently loaded voice:
festival> (PhoneSet.description nil)
To load a saved utterance file so you can inspect its structure:
festival> (define myutt (utt.load nil "cmu_us_arctic_slt_a0001.utt"))
Then the following commands can be used to look into the structure in different ways:
(utt.relationnames myutt)
(utt.relation.items myutt "Token")
(utt.relation.items myutt "Word")
(utt.relation_tree myutt "Token")
(utt.relation_tree myutt "Word")
Any of the names returned by utt.relationnames (e.g. "Phrase", "Syllable", "IntEvent") can be passed as the last argument instead of "Token" or "Word".
The "Segment" relation includes both pauses and phonemes, so that is the relation to use if you want the phonemes.
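Since the Segment relation mixes pauses with phonemes, you may want to drop the pauses. The following is only a sketch: `phones-only` is a name I made up, and it assumes the silence phone is called "pau" (as in the example output further down); other phonesets may use a different symbol.

```scheme
;; Sketch: return the names of all segments except pauses.
;; Assumption: the silence phone is named "pau" in the loaded phoneset.
(define (phones-only utt)
  (let ((result nil))
    (mapcar
     (lambda (seg)
       ;; keep every segment whose name is not the pause symbol
       (if (not (string-equal (item.name seg) "pau"))
           (set! result (cons (item.name seg) result))))
     (utt.relation.items utt "Segment"))
    ;; result was built back to front, so restore utterance order
    (reverse result)))

(phones-only myutt)
```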
Each relation's items expose different features, and the feature functions are described here: http://www.festvox.org/docs/manual-1.4.2/festival_32.html#SEC141 They are implemented as C++ functions, but they are also accessible from the Festival Scheme REPL through item.feat, e.g. for Syllable.syllable_duration:
festival> (define myutt (utt.load nil "f1a0001.utt"))
#<Utterance 0x7f993b6d0130>
festival> (define firstsyl (car (utt.relation.items myutt "Syllable")))
#<item 0x288bdc0>
festival> (item.feat firstsyl 'syllable_duration)
0.054999992
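The same call can be mapped over every item in a relation rather than just the first one. A minimal sketch, assuming `myutt` has been loaded as above:

```scheme
;; Sketch: collect syllable_duration for every syllable in the utterance.
(mapcar
 (lambda (syl) (item.feat syl 'syllable_duration))
 (utt.relation.items myutt "Syllable"))
```

This should return one duration per syllable, in utterance order.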
To list the phonemes grouped by syllable, first pair each segment with the start time of its parent syllable:
festival> (define myutt (utt.load nil "f1a0001.utt"))
festival> (define allsegs (utt.relation.items myutt "Segment"))
festival> (define pairsegs
            (mapcar
             (lambda (x)
               (list (item.feat x 'R:SylStructure.parent.syllable_start)
                     (item.name x)))
             allsegs))
((0 "pau")
(0.2 "ax")
(0.255 "k")
(0.255 "ey")
(0.255 "p")
.....
festival> (define (combine fthing lst final)
            (if (eq? '() lst)
                (append final (list fthing)) ; flush the last group too
                (if (eqv? (car fthing) (car (car lst)))
                    (combine (append fthing (cdr (car lst))) (cdr lst) final)
                    (combine (car lst) (cdr lst) (append final (list fthing))))))
festival> (combine '("x" "x") pairsegs '())
(("x" "x")
(0 "pau")
(0.2 "ax")
(0.255 "k" "ey" "p")
.....
(The '("x" "x") argument is just a dummy placeholder to start the recursion, which is why it shows up as the first element of the result.) As you can see, this groups the phonemes by syllable and also shows each syllable's start time.
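An alternative that avoids the dummy placeholder is to start from the Syllable relation and ask each syllable for its daughters in SylStructure. This is a sketch rather than a tested recipe: `syllable-phones` is a name I made up, and it assumes the voice builds the usual SylStructure relation. As far as I know, pauses do not belong to any syllable, so they would not appear in this output.

```scheme
;; Sketch: for each syllable, build (start-time phone1 phone2 ...).
;; Assumption: segments are daughters of syllables in SylStructure.
(define (syllable-phones utt)
  (mapcar
   (lambda (syl)
     (cons (item.feat syl 'syllable_start)
           (mapcar (lambda (seg) (item.name seg))
                   (item.relation.daughters syl 'SylStructure))))
   (utt.relation.items utt "Syllable")))

(syllable-phones myutt)
```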