Fast WaveNet generation using queues (NSynth) (CLA) #669

pkmital · 2017-05-23T18:21:12Z

Here is an implementation of using queues for the WaveNet decoder in NSynth as described in:
Ramachandran, P., Le Paine, T., Khorrami, P., Babaeizadeh, M., Chang, S., Zhang, Y., … Huang, T. (2017). Fast Generation For Convolutional Autoregressive Models, 1–5.
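
For readers unfamiliar with the queue idea from that paper, here is a minimal NumPy sketch, not the code in this PR. It assumes scalar channels, kernel size 2, and a tanh nonlinearity. The names `naive_stack`, `QueueLayer`, and `fast_stack` are hypothetical. Each layer keeps a FIFO queue as long as its dilation, so one time step costs one pop and one push per layer instead of a pass over the whole history.

```python
from collections import deque

import numpy as np


def naive_stack(x, weights, dilations):
    """Reference: run each dilated causal layer over the full sequence."""
    h = np.asarray(x, dtype=np.float64)
    for (w_past, w_cur), d in zip(weights, dilations):
        # Shift the layer input right by d steps, zero-padding the start.
        past = np.concatenate([np.zeros(d), h])[:len(h)]
        h = np.tanh(w_past * past + w_cur * h)
    return h


class QueueLayer(object):
    """One dilated causal layer (kernel size 2) with a cached input queue."""

    def __init__(self, w_past, w_cur, dilation):
        self.w_past = w_past
        self.w_cur = w_cur
        # The queue always holds the last `dilation` inputs (zeros at start).
        self.queue = deque([0.0] * dilation)

    def step(self, x_t):
        past = self.queue.popleft()  # layer input from `dilation` steps ago
        self.queue.append(x_t)       # remember the current input for later
        return np.tanh(self.w_past * past + self.w_cur * x_t)


def fast_stack(x, weights, dilations):
    """Process one sample at a time, touching each layer's queue once."""
    layers = [QueueLayer(wp, wc, d) for (wp, wc), d in zip(weights, dilations)]
    out = []
    for x_t in x:
        h = float(x_t)
        for layer in layers:
            h = layer.step(h)
        out.append(h)
    return np.array(out)
```

Both functions produce the same outputs for the same input. In real autoregressive generation the sample drawn at step t would be fed back in as the input at step t + 1, which is where the per-step cost matters.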

This should let you encode audio with the existing NSynth model and then synthesize from any encoding much faster than the current approach. I get about 100 samples per second with this method (not an accurate measurement), so a 4-second clip at 16 kHz (64,000 samples) can be synthesized in roughly 10 minutes, which isn't terrible. You can also use this to explore different encodings, whether from interpolation or from your own sounds, and hear their syntheses much more easily than before.
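
The timing estimate above is simple arithmetic, sketched here for other clip lengths. The helper name `synthesis_seconds` is hypothetical, and the 100 samples per second figure is the rough measurement quoted above, not a benchmark.

```python
def synthesis_seconds(clip_seconds, sample_rate_hz, samples_per_second):
    """Estimate wall-clock synthesis time from a measured generation rate."""
    total_samples = clip_seconds * sample_rate_hz
    return total_samples / float(samples_per_second)


# 4 s at 16 kHz is 64,000 samples; at ~100 samples/s that is 640 s.
estimate = synthesis_seconds(4, 16000, 100)
print("%.1f minutes" % (estimate / 60.0))
```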

There is no CLI tool, I'm afraid, but I'm hoping someone else can develop one to make this easier for others! This PR just includes a simple Python module, magenta.models.nsynth.wavenet.generate, with a function synthesize that shows how to use FastGenerationConfig to load an audio file, encode it, and then synthesize from the encoding.

Lastly, I wasn't familiar with the BUILD system, so please let me know if that looks okay.

jesseengel

Awesome submission! Just a first comment, I think a couple of the functions could be moved over to utils.py. I'm going to run this PR through our internal linters and let you know if anything needs to be changed.

You should probably add a py_binary to run the program from the command line. It can be super simple, something like...

# Copyright 2017 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
r"""DOC STRING HERE
"""
# internal imports
import tensorflow as tf

from magenta.models.nsynth.wavenet.generate import synthesize

FLAGS = tf.app.flags.FLAGS

tf.app.flags.DEFINE_string("wav_file", "", "Path to input file.")
tf.app.flags.DEFINE_string("out_file", "synthesis.wav", "Path to output file.")
tf.app.flags.DEFINE_string("ckpt_path", "model.ckpt-200000", "Path to checkpoint.")
tf.app.flags.DEFINE_integer("sample_length", 64000, "Input file size in samples.")
tf.app.flags.DEFINE_integer("synth_length", 64000, "Output file size in samples.")
tf.app.flags.DEFINE_string("log", "INFO",
                           "The threshold for what messages will be logged: "
                           "DEBUG, INFO, WARN, ERROR, or FATAL.")


def main(unused_argv=None):
  tf.logging.set_verbosity(FLAGS.log)
  # Pass the parsed flags through instead of hard-coded values.
  synthesize(wav_file=FLAGS.wav_file,
             ckpt_path=FLAGS.ckpt_path,
             out_file=FLAGS.out_file,
             sample_length=FLAGS.sample_length,
             synth_length=FLAGS.synth_length)


if __name__ == "__main__":
  tf.app.run()

jesseengel · 2017-05-23T22:15:16Z

magenta/models/nsynth/wavenet/generate.py

+import numpy as np
+
+
+def inv_mu_law(x, mu=255.0):


These functions should probably be loaded from / added to utils.py. You could just rename them as inv_mu_law_numpy() for example.
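
As a rough illustration of what a NumPy variant could look like, here is the textbook continuous mu-law pair. This is a sketch under assumptions, not the PR's implementation: it works on floats in [-1, 1] and leaves out any quantization to integer codes that the NSynth utilities may apply. `mu_law_numpy` is a hypothetical companion name.

```python
import numpy as np


def mu_law_numpy(x, mu=255.0):
    """Compress audio in [-1, 1] with the standard mu-law curve."""
    x = np.asarray(x, dtype=np.float64)
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)


def inv_mu_law_numpy(y, mu=255.0):
    """Invert mu_law_numpy: expand companded values back to [-1, 1]."""
    y = np.asarray(y, dtype=np.float64)
    # (1 + mu) ** |y| - 1, written with expm1 for accuracy near zero.
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu
```

Applying `inv_mu_law_numpy` to the output of `mu_law_numpy` should recover the original signal up to floating-point error.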

jesseengel · 2017-05-23T22:15:25Z

magenta/models/nsynth/wavenet/generate.py

+  return out
+
+
+def load_audio(wav_file, sample_length=64000):


These functions should probably be loaded from / added to utils.py. You could just rename them as inv_mu_law_numpy() for example.
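
Commit titles later in this thread mention handling sample_length > wav_data.size and rounding to a multiple of the hop length. A minimal sketch of that kind of trimming follows. The helper name `fit_to_hop_multiple` and the default `hop_length=512` are assumptions for illustration, not taken from the PR.

```python
import numpy as np


def fit_to_hop_multiple(audio, sample_length, hop_length=512):
    """Trim audio to at most sample_length, rounded down to a hop multiple."""
    audio = np.asarray(audio, dtype=np.float32)
    # Never ask for more samples than the file actually contains.
    length = min(sample_length, audio.size)
    # Drop the remainder so the encoder sees whole hops only.
    length -= length % hop_length
    return audio[:length]
```

With these assumed defaults, a 70,000-sample input and `sample_length=64000` yields 64,000 samples, while a 1,000-sample input yields 512.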

jesseengel

LGTM, after my commits ;).

committing fastgen code w/ proper email this time

3e26019

jesseengel self-requested a review May 23, 2017 22:09

jesseengel self-assigned this May 23, 2017

jesseengel reviewed May 23, 2017

jesseengel and others added 23 commits May 24, 2017 13:55

Merge branch 'fastwavenet-cla' of github.com:pkmital/magenta into fas…

1b53343

…twavenet-cla

moving functions to utils and adding binary for generation

5be6e40

Merge branch 'fastwavenet-cla' of github.com:pkmital/magenta into fas…

1e6d402

…twavenet-cla

fixed some bugs with sampling. automatic multiple of hop length. use …

ef33dc3

…librosa for wavfile loading

fix for sample_length > wav_data.size

b5e3202

fixed small bug in multinoulli sampler

ed5dd7f

shared functions between generate and save_embeddings

4bcd017

got generation working for folders of wavs or npys

964bfb0

name change to make consistent

f48fda3

added licenses

b54711b

save proper extension names

ec8d42c

Merge branch 'master' of github.com:tensorflow/magenta into fastwaven…

a9a5852

…et-cla

LINTING

32acd39

Merge branch 'master' of github.com:tensorflow/magenta into fastwaven…

aae16c9

…et-cla

docstring

5090739

docstring

7becdea

typos and BUILD deps

709b2bd

update readme

a5067ac

typos

5f741e2

added console_scripts, small typos

32f7b16

console entry points

699688c

change quote delimiter

5bad138

typo

e3b947a

jesseengel mentioned this pull request Jun 10, 2017

Please provide example command-line for calling nsynth/wavenet/train.py for training on multiple GPUs #625

Open

update README for pip installed scripts

8d074ec

jesseengel approved these changes Jun 12, 2017

jesseengel merged commit bd5f28b into magenta:master Jun 12, 2017
