Montreal Forced Aligner

The Montreal Forced Aligner (MFA) can be found here. The speech lab installation is at: /proj/tts/tools/Montreal-Forced-Aligner

The full documentation can be found here; this page is the short version.

Note that I had to remove the g2p-seq2seq entries from PYTHONPATH in my .bashrc to get this to work, because of a Python 2/Python 3 conflict. If you have already used g2p-seq2seq, make sure to remove its entries from your .bashrc first. If you plan to use g2p-seq2seq again in the future, remember to put them back.
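
To see which PYTHONPATH entries might be involved, something like the following sketch can help. The function name g2p_pythonpath_entries is made up for this page, and it assumes the conflicting entries have "g2p" somewhere in their path, which may not hold for every setup.

```shell
# Print each PYTHONPATH entry whose path mentions "g2p" (case-insensitive).
# Assumption: the conflicting entries contain "g2p" in their path.
# Usage: g2p_pythonpath_entries            (inspects the current PYTHONPATH)
#        g2p_pythonpath_entries "/a:/b"    (inspects the given list instead)
g2p_pythonpath_entries() {
    # $1: optional colon-separated path list; defaults to $PYTHONPATH
    printf '%s\n' "${1-${PYTHONPATH-}}" | tr ':' '\n' | grep -i 'g2p'
}
```

If it prints nothing, there is probably nothing g2p-related on PYTHONPATH in the current shell.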

About

We typically use Festival EHMM for alignment since it is a built-in part of the utterance file creation process. However, EHMM alignment sometimes fails, so we looked for an alternative and found the MFA. The MFA has the added benefit of optionally doing a round of speaker-adaptive alignment.

What you need

A lexicon in the appropriate format: space-separated words and phonetic pronunciations.
A "corpus" directory consisting of .wav audio files and matching .lab orthographic transcription files (the transcriptions should be cleaned of special annotation characters, etc.)
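
Before running the aligner, it can be worth checking both inputs. The sketch below is only a sanity check based on the requirements listed above. The function names missing_labs and bad_lexicon_lines are made up for this page, and it assumes all .wav and .lab files sit directly in the corpus directory (no subdirectories).

```shell
# List .wav files in a corpus directory that have no matching .lab file.
# Assumption: files are at the top level of the directory, not in subfolders.
missing_labs() {
    corpus_dir=$1
    for wav in "$corpus_dir"/*.wav; do
        [ -e "$wav" ] || continue            # glob matched nothing
        lab="${wav%.wav}.lab"                # same name, .lab extension
        [ -f "$lab" ] || printf '%s\n' "$wav"
    done
}

# Print lexicon lines with fewer than two whitespace-separated fields,
# i.e. lines lacking a word plus at least one phone (blank lines included).
bad_lexicon_lines() {
    awk 'NF < 2 { printf "%d: %s\n", NR, $0 }' "$1"
}
```

For example, `missing_labs /path/to/your/corpusdir` and `bad_lexicon_lines /path/to/your/lexicon.txt` should both print nothing if the inputs look consistent.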

How to run

cd /proj/tts/tools/Montreal-Forced-Aligner/dist/montreal-forced-aligner
bin/mfa_train_and_align -s NUM -t /path/to/your/tmpdir /path/to/your/corpusdir /path/to/your/lexicon.txt outdir

-s NUM: optional; specify how many characters at the beginning of each .wav filename indicate the speaker ID. If omitted, speaker adaptation will not be used.
-t tmpdir: optional; specify where temporary output will go. Default is somewhere in your home directory.
outdir: output directory.
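
To pick a value for -s, it may help to preview which speaker IDs a given NUM would produce from your filenames. The sketch below only mirrors the description of -s above and does not call the MFA. The function name speaker_prefixes is made up for this page, and it assumes the .wav files sit directly in the corpus directory.

```shell
# Show the distinct speaker IDs implied by taking the first NUM characters
# of each .wav filename, as a preview of what -s NUM would group together.
speaker_prefixes() {
    corpus_dir=$1
    num=$2
    for wav in "$corpus_dir"/*.wav; do
        [ -e "$wav" ] || continue            # glob matched nothing
        basename "$wav" | cut -c "1-$num"    # keep the first NUM characters
    done | sort -u
}
```

For example, `speaker_prefixes /path/to/your/corpusdir 4` lists the IDs you would get with `-s 4`. If the list does not match your actual speakers, adjust NUM or rename the files.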

Output

[[TODO]] also info about OOV words.