We use Matt Shannon's MCD library for Python to compute mel-cepstral distortion (MCD). Thanks to Kai-Zhan for setting it up and writing a parallelized version, which can be found here:
/proj/tts/examples/kl2792/bin/get_mcd_dtw
Please only run this on kucing, where all the dependencies are installed.
Usage:
./get_mcd_dtw --param_order 60 NATDIR SYNTHDIR UTTID1 UTTID2 ... UTTIDN
NATDIR is the directory containing the .mgc files for the original (natural) speech that you are comparing against.
SYNTHDIR is the directory containing the synthetic .mgc files. Ideally you saved these when you ran synthesis; if not, you can re-extract them from the synthesized audio in the same way that you extract features from natural audio for a new corpus.
UTTIDs are the IDs of the utterances for which you want to compute MCD, i.e. the file names without the .mgc extension.
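Typing out every utterance ID by hand gets tedious for a large test set. The sketch below is one possible way to collect the IDs that have a .mgc file in both directories and assemble the argument list shown under Usage. The helper names (common_utt_ids, build_command) are hypothetical and not part of the MCD library or of Kai-Zhan's script; only the script path and the --param_order flag come from this page.

```python
import os

# Path to the parallelized script, as given on this page.
SCRIPT = "/proj/tts/examples/kl2792/bin/get_mcd_dtw"


def common_utt_ids(natdir, synthdir, ext=".mgc"):
    """Return the sorted utterance IDs that have a file in BOTH directories.

    An utterance ID is the file name with the extension stripped.
    (Hypothetical helper, not part of the MCD library.)
    """
    def ids_in(directory):
        return {
            os.path.splitext(name)[0]
            for name in os.listdir(directory)
            if name.endswith(ext)
        }

    return sorted(ids_in(natdir) & ids_in(synthdir))


def build_command(natdir, synthdir, utt_ids, param_order=60, script=SCRIPT):
    """Build the argument list for the usage line shown above.

    (Hypothetical helper; it only assembles arguments and runs nothing.)
    """
    return [script, "--param_order", str(param_order), natdir, synthdir] + list(utt_ids)
```

On kucing, the resulting list could then be handed to subprocess.run(build_command(natdir, synthdir, common_utt_ids(natdir, synthdir))). Restricting to IDs present in both directories avoids asking the script for an utterance whose synthetic file is missing.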
It returns a tuple of (mincostperframe, totframes). The first value is the MCD that we care about; the second is the total number of frames. It returns a single tuple covering all of the utterances you give it, not one tuple per utterance.
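For intuition about what the first value means, here is a minimal sketch of the standard per-frame MCD formula and of how one overall figure can be pooled across utterances. It is not the library's code. It assumes that the energy coefficient has already been dropped from each frame, that frames are already aligned, and that the overall value is total cost divided by total frames (which is what a single (mincostperframe, totframes) tuple suggests). The function names are hypothetical.

```python
import math

# Standard constant that converts a Euclidean distance between
# mel-cepstral vectors into decibels: (10 / ln 10) * sqrt(2).
MCD_DB_CONST = 10.0 / math.log(10.0) * math.sqrt(2.0)


def frame_mcd(nat_frame, synth_frame):
    """MCD in dB between two already-aligned frames.

    Assumes both frames have the same length and that the energy
    coefficient has already been removed (an assumption of this sketch).
    """
    squared_diff = sum((a - b) ** 2 for a, b in zip(nat_frame, synth_frame))
    return MCD_DB_CONST * math.sqrt(squared_diff)


def pooled_mcd(per_utterance):
    """Pool (total_cost, num_frames) pairs into (cost_per_frame, total_frames).

    This is a frame-weighted average: long utterances count for more
    than short ones, unlike a plain mean of per-utterance MCDs.
    """
    total_cost = sum(cost for cost, _ in per_utterance)
    total_frames = sum(frames for _, frames in per_utterance)
    return total_cost / total_frames, total_frames
```

Under these assumptions, if you run the script separately on two subsets and want one combined number, weight each result by its frame count rather than averaging the two MCD values directly.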