Generating Audio with WaveNet

Install the WaveNet dependencies.
Download and unzip this repo.
Download, unzip, and move the tensorflow-wavenet-master folder into this repo's folder under the name 'tensorflow-wavenet'.
Make a new folder for a sound in this repo's folder.
Make a folder in that folder called 'corpus'.
Put 16kHz mono wav files of the sound in the corpus folder.
Open a shell active in this repo's directory.
Run $ python gen.py sound-folder-name-here to train indefinitely. Every 10000 steps, it generates three 10-second examples in sound-folder-name-here/gen/training-step-here/.
Interrupt whenever you want to stop, then run the command again to pick up where you left off or repeat the steps for a different sound.
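
Before starting a long training run, it can help to confirm that every file in the corpus folder really is 16kHz mono. The following is a minimal sketch using only Python's standard wave module. The function name check_corpus is hypothetical and not part of gen.py, and the sketch only looks at files ending in .wav directly inside the folder.

```python
import wave
from pathlib import Path


def check_corpus(corpus_dir, expected_rate=16000, expected_channels=1):
    """Return (filename, problem) pairs for WAV files that are not 16kHz mono.

    Hypothetical helper for illustration; gen.py does not provide it.
    """
    problems = []
    for path in sorted(Path(corpus_dir).glob("*.wav")):
        try:
            # Read only the header fields; the audio data itself is not loaded.
            with wave.open(str(path), "rb") as wav:
                rate = wav.getframerate()
                channels = wav.getnchannels()
        except (wave.Error, EOFError) as err:
            problems.append((path.name, "unreadable: {}".format(err)))
            continue
        if rate != expected_rate:
            problems.append((path.name, "sample rate is {} Hz".format(rate)))
        if channels != expected_channels:
            problems.append((path.name, "has {} channels".format(channels)))
    return problems
```

Calling check_corpus("sound-folder-name-here/corpus") should return an empty list when every file matches. Any file it reports can be converted with Audacity or ffmpeg as described under "Making 16KHz Mono Wav Files".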

Installing WaveNet Dependencies

If you aren't using Windows with an NVIDIA GPU, see here, and then $ pip install librosa. Otherwise, follow the instructions below.

NVIDIA GPU Support on Windows

This guide worked for Windows 10 64-bit as of May 2017.

Make sure your GPU drivers are up to date.
Download and run the CUDA 8.0 installer.
Download cuDNN v5.1 for CUDA 8.0 (you have to make an account).
Move cuda\bin\cudnn64_5.dll to where-you-installed-CUDA\v8.0\bin\.
Move cuda\include\cudnn.h to where-you-installed-CUDA\v8.0\include\.
Move cuda\lib\x64\cudnn.lib to where-you-installed-CUDA\v8.0\lib\x64\.
Add where-you-installed-CUDA\v8.0\bin\ to the PATH environment variable.
Install Python 3.5.
Download the latest prebuilt numpy+mkl and scipy wheels for Windows 64-bit CPython 3.5.
With a shell open in the folder where you downloaded those wheels, install them with the following command:
$ pip install numpy-1.13.0rc2+mkl-cp35-cp35m-win_amd64.whl scipy-0.19.0-cp35-cp35m-win_amd64.whl
Install tensorflow-gpu.
$ pip install tensorflow-gpu
Make sure you have the Visual Studio Visual C++ Build Tools installed.
Install resampy from source following the instructions under "Advanced users and developers [...]".
Install librosa.
$ pip install librosa
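
Once the steps above are done, a quick check can show whether anything was missed. This is a rough sketch, not part of the repo: the helper names missing_modules and find_on_path are made up for illustration. It only tests whether Python can locate each package and whether a file is reachable through PATH. It does not confirm that TensorFlow can actually use the GPU.

```python
import importlib.util
import os


def missing_modules(names):
    """Return the module names from `names` that Python cannot locate.

    Uses find_spec, so nothing is imported or executed.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


def find_on_path(filename):
    """Return the first directory on PATH that contains `filename`, or None."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            return directory
    return None
```

For this guide, missing_modules(["numpy", "scipy", "tensorflow", "resampy", "librosa"]) should return an empty list (the tensorflow-gpu package is imported under the name tensorflow), and find_on_path("cudnn64_5.dll") should return the CUDA bin folder that was added to PATH.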

Making 16kHz Mono Wav Files

Audacity

File -> Import -> Audio..., then choose your original file.
Set Project Rate (Hz) (in the bottom left corner) to 16000.
If stereo, click on the down arrow by the name of the track, then Split Stereo to Mono.
File -> Export...
Format: WAV (Microsoft) signed 16 bit PCM

ffmpeg

$ ffmpeg -i input-file-here.mp3 -ar 16000 -ac 1 output-file-here.wav
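
To convert a whole folder of recordings rather than one file at a time, the same ffmpeg flags can be driven from Python. The sketch below assumes ffmpeg is on your PATH. The names ffmpeg_command and convert_folder are hypothetical helpers, not part of gen.py.

```python
import subprocess
from pathlib import Path


def ffmpeg_command(src, dst):
    """Build the ffmpeg call shown above: resample to 16000 Hz, mix down to 1 channel."""
    return ["ffmpeg", "-i", str(src), "-ar", "16000", "-ac", "1", str(dst)]


def convert_folder(src_dir, corpus_dir):
    """Convert every file in src_dir to a 16kHz mono WAV inside corpus_dir."""
    corpus = Path(corpus_dir)
    corpus.mkdir(parents=True, exist_ok=True)
    converted = []
    for src in sorted(Path(src_dir).iterdir()):
        if not src.is_file():
            continue
        dst = corpus / (src.stem + ".wav")
        # check_call raises CalledProcessError if ffmpeg exits with an error.
        subprocess.check_call(ffmpeg_command(src, dst))
        converted.append(dst)
    return converted
```

For example, convert_folder("my-recordings", "sound-folder-name-here/corpus") writes one WAV per input file. Keep the source folder separate from the corpus folder, since ffmpeg cannot write over the file it is reading, and note that ffmpeg asks before overwriting an existing output file.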
