Your Habanero account will have 10 GB of quota for your own use in your home directory, and the Speech Lab has 2 TB of shared space in /rigel/katt.
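To see how much of that space you are using, standard Linux tools work (Habanero may also provide its own quota command; check the cluster documentation):
du -sh $HOME          # your usage against the 10 GB home quota
df -h /rigel/katt     # usage of the shared Speech Lab space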
If pip is not already installed, bootstrap it first:
python get-pip.py --prefix=/usr/local/
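Note that on Habanero you will not normally have write access to /usr/local, so a user-local install is the safer option; pip then lands in ~/.local/bin, which needs to be on your PATH:
python get-pip.py --user
export PATH=$HOME/.local/bin:$PATH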
Python Libraries: Merlin requires numpy, theano, and bandmat, plus, new in the most recent version, regex, tensorflow, and keras. Run the following:
pip install numpy --user
pip install theano --user
bandmat compiles against the numpy C headers, so point CFLAGS at them before installing it:
export CFLAGS=-I/rigel/home/yourusername/.local/lib/python2.7/site-packages/numpy/core/include/
pip install bandmat --user
pip install regex --user
pip install tensorflow --user
pip install keras --user
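As a quick sanity check that all of the libraries installed correctly, you can try importing them in one shot; if this prints OK, you are ready to build Merlin:
python -c "import numpy, theano, bandmat, regex, tensorflow, keras; print('OK')"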
Next, download and compile Merlin:
git clone https://github.com/CSTR-Edinburgh/merlin.git
module load gcc/4.8.5
module load anaconda/2-4.2.0
module load cudnn/5.1
module load cuda80/blas
cd merlin/tools
./compile_tools.sh
You should see "All tools successfully installed!"
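You can also spot-check that the compiled vocoder and signal-processing binaries are where Merlin expects them (the exact layout may vary between Merlin versions):
ls bin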
Next, run the demo on the GPU cluster:
cd ../egs/slt_arctic/s1
Create a new file there called slurm_run_demo.sh (or whatever you want) and put the following in it:
#!/bin/sh
#
# Simple Merlin demo submit script for Slurm.
# Note: all #SBATCH directives must appear before the first executable command.
#
#SBATCH --account=katt # The account name for the job.
#SBATCH --job-name=MerlinDemo # The job name.
#SBATCH -c 1 # The number of cpu cores to use.
#SBATCH --gres=gpu:1 # request 1 GPU
#SBATCH --time=1:00:00 # The time the job will take to run
#SBATCH --mem-per-cpu=10gb # The memory the job will use per cpu core.
#SBATCH --mail-user=youremail@columbia.edu
#SBATCH --mail-type=END

module load gcc/4.8.5
module load anaconda/2-4.2.0
module load cudnn/5.1
module load cuda80/blas

./run_demo.sh
# End of script
Then, run it as follows:
sbatch slurm_run_demo.sh
You will get an email when the demo finishes (it takes about 7 minutes). The output .wav files are under experiments/slt_arctic_demo/test_synthesis/wav, and the log output is in slurm-######.out.
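While the job is running, you can monitor it with standard Slurm tools (sbatch prints the job ID when you submit), and list the synthesized files once it is done:
squeue -u $USER
tail -f slurm-######.out
ls experiments/slt_arctic_demo/test_synthesis/wav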
Note that you can simply omit the --time line if you don't want to specify a limit; this does not appear to have any negative consequences.
You can check which GPU cards are in use, and by whom, by running:
python yourmerlindir/src/gpu_lock.py
On Habanero, you can submit this as a Slurm script. By default, this locking mechanism frees a lock automatically if the program holding it crashes; Merlin, however, uses a manual lock, which must be freed manually. This means that if voice training crashes, or you kill it before it completes, the locks may not get released. If that happens, check whether your locks are still held by running the command above, then free them manually, like this:
python yourmerlindir/src/gpu_lock.py --free lockid
where lockid is the integer ID of the GPU card that you have locked.
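For example, if a crashed run has left you holding the lock on GPU card 0 (a made-up ID for illustration), the full check-and-free sequence would look like this:
python yourmerlindir/src/gpu_lock.py
python yourmerlindir/src/gpu_lock.py --free 0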
Because of locking, we can only train one voice per GPU card. Here are the numbers of cards we have on each machine:
GPU Drivers: This applies not to Habanero but to our GPU machines in the lab. These run Ubuntu 16.04, and apt installs CUDA version 7.5. This did not work immediately, but it did once the machines were rebooted. Hecate has CUDA version 8 on Ubuntu 14.04.
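After installing the drivers and rebooting, you can confirm the driver and CUDA toolkit versions with the standard NVIDIA tools (assuming they are on the PATH):
nvidia-smi
nvcc --version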