Montreal Forced Aligner
The Montreal Forced Aligner (MFA) can be found here.
We have a speech lab installation
here: /proj/tts/tools/Montreal-Forced-Aligner
The full documentation can be found here,
but we have the short version on this page.
Note that I had to remove the g2p-seq2seq-related entries from
PYTHONPATH in my .bashrc to get this to work, because of a
python2/python3 conflict. If you have already used g2p-seq2seq, make
sure to remove the related lines from your .bashrc. If you plan to use
g2p-seq2seq again in the future, you will need to remember to put them
back.
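If you would rather not edit .bashrc back and forth, the following is a minimal sketch of filtering PYTHONPATH for the current shell session only. The function name is made up for illustration, and it assumes the conflicting entries have "g2p" somewhere in their path, which may not match your setup.

```shell
# Print a PYTHONPATH-style value with every entry mentioning "g2p" removed.
# Assumption: the conflicting entries have "g2p" somewhere in their path.
strip_g2p_from_pythonpath() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -iv 'g2p' | paste -sd ':' -
}

# Example use in the current shell only (leaves .bashrc untouched):
#   PYTHONPATH=$(strip_g2p_from_pythonpath "$PYTHONPATH")
#   export PYTHONPATH
```

Opening a new shell restores whatever .bashrc sets, so g2p-seq2seq keeps working elsewhere.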
About
We typically use Festival EHMM for alignment since it is a built-in
part of the utterance file creation process. However, EHMM alignment
sometimes fails, so we use the MFA as an alternative. The MFA has the
added benefit of optionally doing a round of speaker-adaptive alignment.
What you need
- A lexicon in the appropriate format: each line has a word followed
by its phonetic pronunciation, separated by spaces.
- A "corpus" directory containing .wav audio files and matching .lab
orthographic transcription files (with any special annotation
characters cleaned out).
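Before running the aligner, it can help to sanity-check these two inputs. The sketch below is not part of the MFA. Both function names are made up for illustration, and it assumes a flat corpus directory with the .wav and .lab files side by side.

```shell
# Print the .wav files in a corpus directory that have no matching .lab file.
# Assumption: the corpus directory is flat (no per-speaker subdirectories).
find_wavs_without_lab() {
    corpus_dir=$1
    for wav in "$corpus_dir"/*.wav; do
        [ -e "$wav" ] || continue          # the glob matched nothing
        lab="${wav%.wav}.lab"
        [ -f "$lab" ] || printf '%s\n' "$wav"
    done
}

# Print lexicon lines (prefixed with their line number) that contain a word
# but no pronunciation, i.e. exactly one whitespace-separated field.
find_bad_lexicon_lines() {
    awk 'NF == 1 { printf "%d: %s\n", NR, $0 }' "$1"
}
```

Calling find_wavs_without_lab /path/to/your/corpusdir and find_bad_lexicon_lines /path/to/your/lexicon.txt should both print nothing when the inputs are well-formed.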
How to run
cd /proj/tts/tools/Montreal-Forced-Aligner/dist/montreal-forced-aligner
bin/mfa_train_and_align -s NUM -t /path/to/your/tmpdir /path/to/your/corpusdir /path/to/your/lexicon.txt outdir
- -s NUM: optional; the number of characters at the beginning of each
.wav filename that indicate the speaker ID. If omitted, speaker
adaptation will not be used.
- -t tmpdir: optional; where temporary output will go. The default is
somewhere in your home directory.
- outdir: output directory.
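To check that a given NUM for -s splits your filenames the way you expect, you can preview the speaker IDs it implies. This is a minimal sketch, not an MFA feature. The function name is made up for illustration, and it assumes a flat corpus directory.

```shell
# List the distinct speaker IDs obtained by taking the first NUM characters
# of each .wav filename in the corpus directory.
# Usage: list_speaker_ids CORPUS_DIR NUM
list_speaker_ids() {
    corpus_dir=$1
    num=$2
    for wav in "$corpus_dir"/*.wav; do
        [ -e "$wav" ] || continue          # the glob matched nothing
        basename "$wav" | cut -c "1-$num"
    done | sort -u
}
```

If the output lists more or fewer speakers than the corpus actually contains, NUM is probably wrong or the filenames do not begin with a fixed-length speaker prefix.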
Output
[[TODO]] also info about OOV words.