

Working with the Web Audio API

Working with the Web Audio API is the definitive and instructive guide to
understanding and using the Web Audio API.
The Web Audio API provides a powerful and versatile system for con-
trolling audio on the web. It allows developers to generate sounds, select
sources, add effects, create visualizations and render audio scenes in an
immersive environment.
This book covers all essential features, with easy to implement code
examples for every aspect. All the theory behind it is explained, so that
one can understand the design choices as well as the core audio processing
concepts. Advanced concepts are also covered, so that the reader will gain
the skills to build complex audio applications running in the browser.
Aimed at a wide audience of potential students, researchers and coders,
this is a comprehensive guide to the functionality of this industry-​standard
tool for creating audio applications for the web.

Joshua Reiss is a Professor with the Centre for Digital Music at Queen Mary
University of London. He has published more than 200 scientific papers,
and co-​authored the book Intelligent Music Production and textbook Audio
Effects: Theory, Implementation and Application. At the time of writing, he is
the President of the Audio Engineering Society (AES). He co-​founded the
highly successful spin-out company LandR, and recently co-founded
the start-ups Tonz and Nemisindo. His primary focus of research is on
state-​of-​the-​art signal processing techniques for sound design and audio
production. He maintains a popular blog, YouTube channel and Twitter
feed for scientific education and research dissemination.

Audio Engineering Society Presents …


www.aes.org
Editorial Board
Chair: Francis Rumsey, Logophon Ltd.
Hyun Kook Lee, University of Huddersfield
Natanya Ford, University of West England
Kyle Snyder, University of Michigan

Intelligent Music Production


Brecht De Man, Joshua Reiss and Ryan Stables

The MIDI Manual 4e


A Practical Guide to MIDI within Modern Music Production
David Miles Huber

Digital Audio Forensics Fundamentals


From Capture to Courtroom
James Zjalic

Drum Sound and Drum Tuning


Bridging Science and Creativity
Rob Toulson

Sound and Recording, 8th Edition


Applications and Theory
Francis Rumsey with Tim McCormick

Performing Electronic Music Live


Kirsten Hermes

Working with the Web Audio API


Joshua Reiss

For more information about this series, please visit: www.routledge.com/Audio-Engineering-Society-Presents/book-series/AES

Working with the Web Audio API

Joshua Reiss

Cover image: Getty: naqiewei


First published 2022
by Routledge
4 Park Square, Milton Park, Abingdon, Oxon OX14 4RN
and by Routledge
605 Third Avenue, New York, NY 10158
Routledge is an imprint of the Taylor & Francis Group, an informa business
© 2022 Joshua Reiss
The right of Joshua Reiss to be identified as author of this work has been
asserted in accordance with sections 77 and 78 of the Copyright, Designs
and Patents Act 1988.
All rights reserved. No part of this book may be reprinted or reproduced or utilised
in any form or by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying and recording, or in any information
storage or retrieval system, without permission in writing from the publishers.
Trademark notice: Product or corporate names may be trademarks or registered trademarks,
and are used only for identification and explanation without intent to infringe.
British Library Cataloguing-​in-​Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Cataloging-​in-​Publication Data
Names: Reiss, Joshua D., author.
Title: Working with the Web Audio API / Joshua Reiss.
Description: Abingdon, Oxon ; New York, NY : Routledge, 2022. |
Includes bibliographical references and index.
Identifiers: LCCN 2021052201 (print) | LCCN 2021052202 (ebook) |
ISBN 9781032118680 (hardback) | ISBN 9781032118673 (paperback) |
ISBN 9781003221937 (ebook)
Subjects: LCSH: Computer sound processing–Handbooks, manuals, etc. |
Sound–Recording and reproducing–Digital techniques. | Application program
interfaces (Computer software)–Handbooks, manuals, etc. |
JavaScript (Computer program language)–Handbooks, manuals, etc. |
Web applications–Design and construction–Handbooks, manuals, etc.
Classification: LCC TK7881.4 .R455 2022 (print) |
LCC TK7881.4 (ebook) | DDC 621.389/3–dc23/eng/20220124
LC record available at https://lccn.loc.gov/2021052201
LC ebook record available at https://lccn.loc.gov/2021052202
ISBN: 978-1-032-11868-0 (hbk)
ISBN: 978-1-032-11867-3 (pbk)
ISBN: 978-1-003-22193-7 (ebk)
DOI: 10.4324/9781003221937
Typeset in Times New Roman
by Newgen Publishing UK
Access the companion website: https://github.com/joshreiss/Working-with-the-Web-Audio-API

Contents

List of figures  vii


List of code examples  xi
Resources  xv
Preface  xvii
Acknowledgments  xx

1 Introducing the Web Audio API  1

Interlude – Generating sound with scheduled sources  9

2 Oscillators  11

3 Audio buffer sources  23

4 The constant source node  38

Interlude – Audio parameters  45

5 Scheduling and setting parameters  47

6 Connecting audio parameters and modulation  63

Interlude – Destination and source nodes  73

7 Analysis and visualization  75

8 Loading, playing and recording  82

9 OfflineAudioContext  95
Interlude – Audio effects  101

10 Delay  103

11 Filtering  114

12 Waveshaper  130

13 Dynamic range compression  147

14 Reverberation  158

Interlude – Multichannel audio  173

15 Mixing audio  175

16 Stereo panning  185

17 Spatialized sound  194

Interlude – Audio worklets  209

18 Working with audio worklets  211

19 The wonders of audio worklets  229

Appendix – The Web Audio API interfaces  242
References  245
Index  246

Figures

1.1 The simplest audio context. 2
1.2 A complex audio routing graph. It applies an envelope,
filterbank, distortion and reverb to noise. 3
1.3 A simple audio graph to generate a tone with reduced volume. 7
2.1 Two sampled sinusoids, one with frequency 3 Hz and the
other with frequency 7 Hz. They appear identical when
sampled at a frequency 10 Hz, since 5−3 = 7−5. 12
2.2 The ideal forms of the types supported by an
OscillatorNode. In each case, two periods are shown of an
oscillator with 2 Hz frequency. 14
2.3 A pulse wave with frequency 2 Hz and duty cycle 25%,
generated from CreatePeriodicWave. Note that since the
DC offset term is set to zero, the average value is 0. 18
3.1 Playback of a buffer source node, illustrating the effect of
looping, playbackRate, SampleRate and offset. 30
5.1 How canceling operates when there is an ongoing scheduled
event on a parameter. Originally, the parameter value is
set to 0.5 at time 0.5, then ramped down to 0 over half a
second. cancelScheduledValues at time 0.75 will remove
this ramp so that the parameter remains fixed at 0.5.
cancelAndHoldAtTime at time 0.75 will keep the ramp
until time 0.75, then hold it at whatever value it has at that
point in time. 54
5.2 An example of clipping of an audio parameter automation
from the nominal range. 55
6.1 An audio node can connect to several different nodes or
parameters (a), several nodes can connect to an audio node
or parameter (b), an audio node can connect back to itself
or one of its own parameters (c), but an audio node
cannot have several connections to the same audio node or
parameter (d). 65
6.2 Depiction of FM synthesis, AM synthesis and ring
modulation. 68
7.1 The output of Code ­example 7.1, showing either the
time domain waveform (top) from time domain data or
magnitude spectrum (bottom) from frequency domain data. 80
10.1 Magnitude response of a comb filter using 1 millisecond
delay. 106
10.2 Audio graph for Code ­example 10.3. 110
10.3 Audio graph for a simple implementation of the
Karplus-​Strong algorithm. 112
11.1 Magnitude response for various ideal filters. 117
11.2 Magnitude responses for the Web Audio API’s lowpass
filter, PureData’s lowpass filter and the standard
first-​order Butterworth lowpass filter. 122
11.3 Magnitude responses for the Web Audio API’s bandpass
filter, PureData’s bandpass filter and the standard
Butterworth bandpass filter design. 123
11.4 Phase responses for the Web Audio API’s allpass filter and
the standard Butterworth allpass filter design. 124
12.1 The characteristic input/​output curve for a
quadratic distortion. 131
12.2 Comparison of hard and soft clipping. 132
12.3 Soft clipping of a sine wave with four different input gains. 134
12.4 Half-​wave and full-​wave rectification. 135
12.5 The spectrum of a single sine wave before and after
asymmetric distortion has been applied. 136
12.6 The output spectrum after distortion has been applied
for sinusoidal input, comparing two values of distortion
level for soft clipping (left), and comparing soft and hard
clipping (right). 136
12.7 The spectrum of two sinusoids before and after distortion
has been applied. 136
12.8 Output spectrum with aliasing due to distortion, and the
output spectrum after oversampling, lowpass filtering and
downsampling. 140
12.9 The effect of soft clipping on a decaying sinusoid. 141
12.10 How the waveshaping curve is used. For an original
curve over the interval −1 to +​1, equally spaced values
are mapped to an array with indices 0 to N−1, and
waveshaping outputs are interpolated between the
array values. 144
12.11 The input/​output curve for a bit crusher with bit
depth = 3 (eight levels). 146
13.1 Static compression characteristic with make-​up gain and
hard or soft knee. When the input signal level is above a
threshold, a further increase in the input level will produce
a smaller change in the output level. 150
13.2 Graph of internal AudioNodes used as part of the
DynamicsCompressorNode processing algorithm. It
implements pre-​delay and application of gain reduction. 152
14.1 Reverb is the result of sound waves traveling many
different paths from a source to a listener. 159
14.2 Impulse response of a room. 159
14.3 One block convolution, as implemented in block-​based
convolutional reverb. Each block is convolved with the
impulse response h of length N. 162
14.4 Partitioned convolution for real-​time artificial reverberation. 163
14.5 Supported input and output channel count possibilities
for mono inputs (left) and stereo inputs (right) with one,
two or four channels in the buffer. 165
15.1 The ChannelMergerNode (left) and ChannelSplitterNode
(right). 180
15.2 Block diagram of the flipper when set to flip the left and
right channels. 182
15.3 Block diagram of ping-​pong delay. 183
16.1 Listener and loudspeaker configuration for placing a
sound source using level difference. 186
16.2 Constant-​power panning for two channels. On the left
is the gain for each channel, and on the right is the total
power and total gain. 187
16.3 Perceived azimuth angle as a function of level difference. 188
16.4 How panning of a stereo source affects its panning
position. Final panning position versus the panning
applied is plotted for five stereo sources with original
positions between left (p = −1) and right speakers (p = +1). 190
16.5 The effect of stereo enhancement on the panning position
of a source. Width less than 0 moves a source towards the
center, width greater than 0 moves it away from the center.
Panning positions above 1 or below −1 indicate a change
of sign between left and right channels. 193
17.1 Right hand coordinate system. When holding out your
right hand in front of you as shown, the thumb points in
the X direction, the index finger in the Y direction and the
middle finger in the Z direction. 196
17.2 Listener and source in space. 197
17.3 Cone angles for a source in relation to the source
orientation and the listener’s position and orientation. 198
17.4 Calculation of azimuth angle α from source position S,
listener position P, and listener forward direction
F. V is the normalized S-​L vector, and α is calculated
by projecting V onto the forward direction, e.g. adjacent
(F·V) over hypotenuse (1). 201
17.5 Diagram showing the process of panning a
source using HRTF. 203
17.6 The spatialization procedure used in the PannerNode. 205
18.1 Interaction between an audio worklet node and audio
worklet processor, along with the syntax for how these
components can be created. 212
19.1 Rewriting a buffer to store previous inputs. 237

Code examples

1.1 Hello World Application, generating sound.  5
1.2 Hello World, version 2.  6
1.3 Hello World, version 3.  6
1.4 UserInteraction.html and UserInteraction.js.  8
2.1 Oscillator.html and Oscillator.js.  15
2.2 CustomSquareWave.html, a square wave generator using
the PeriodicWave.  17
2.3 PulseWave.html, a pulse wave generator.  19
2.4 Detune.html and Detune.js.  20
3.1 BufferedNoise.html. Use of an AudioBufferSourceNode to
generate noise.  25
3.2 Playback.html and Playback.js, allowing the user to interact
with AudioBufferSourceNode parameters for a chirp signal
as the buffer source.  31
3.3 BufferedSquareWave.html, showing how a square wave can
be reconstructed using wave table synthesis, similar to a
Fourier series expansion.  33
3.4 Pause.html, showing how playback of an audio buffer can
be paused and resumed by changing the playback rate.  35
3.5 Backwards.html, showing how to simulate playing a buffer
backwards by playing the reverse of that buffer forwards.  37
4.1 DCoffset.html, showing use of the ConstantSourceNode to
add a value to a signal.  39
4.2 ConstantSourceSquareWave.html, which uses a
ConstantSourceNode to change the frequency of a square
wave constructed by summing weighted sinusoids.  41
4.3 NoConstantSourceSquareWave.html, which has the same
functionality as Code ­example 4.2, but without use of a
ConstantSourceNode.  41
4.4 Grouping.html and Grouping.js.  43
5.1 Beep.html and Beep.js, which demonstrate audio parameter
automation.  55


5.2 SetValueCurve.html, which creates a beep sound using a
custom automation curve.  56
5.3 repeatBeep.html, showing how parameter automation can
be looped using JavaScript’s setInterval method.  57
5.4 Crossfade.html and Crossfade.js, which use
setValueCurveAtTime to show the difference between a
linear crossfade and an equal power crossfade of two signals.  59
5.5 Bells.html and Bells.js, showing how to synthesize a
bell-​like sound using parameter automation to create a
sum of decaying harmonics.  61
6.1 FMSynthesis.html and FMSynthesis.js, showing how FM
synthesis is performed by connecting a low-​frequency
oscillator to the frequency parameter of another oscillator.  68
6.2 AMSynthesis.html and AMSynthesis.js, for implementing
AM synthesis by connecting a low-​frequency oscillator to
the gain parameter of a gain node.  70
7.1 Analyser.html and Analyser.js, showing example use of the
AnalyserNode.  79
8.1 SimpleMediaElement.html, showing basic use of an
<audio> media element, without the Web Audio API.  83
8.2 MediaElement.html, showing use of a media element in an
audio context.  85
8.3 MediaElement2.html, showing use of the
MediaElementSourceNode with the audio() constructor.  86
8.4 decodeWithRequest.html, showing use of
decodeAudioData and XMLHttpRequest to load an audio
file, and playbackRate to play it back at a variable rate.  87
8.5 decodeWithFetch.html, showing use of decodeAudioData
and fetch() to load an audio file, and playbackRate to play
it back at a variable rate.  88
8.6 LevelMeter.html, using MediaStreamAudioSourceNode to
create a level meter from microphone input.  90
8.7 MediaStreamToAudioBuffer.html, showing how to take an
excerpt of a media stream and use it as the audio buffer for
an AudioBufferSourceNode.  91
8.8 MediaRecorderExample.html, showing how to record
audio from an audio node using MediaRecorder.  93
8.9 RecorderExample.html, showing how to record audio from
an audio node using recorder.js.  94
9.1 OfflineContext.html, generating 5 seconds of an oscillator
and storing it in a buffer.  97
9.2 OfflineContext2.html, recording an offline context, for
batch processing. bufferToWave.js is third-​party code, not
depicted here, for generating a wave file from an audio buffer.  98
9.3 OfflineContext3.html, showing use of suspend() and
resume() to give progress updates on an offline audio context.  99


10.1 combFilter.html, implementing a comb filter using the
delay node.  107
10.2 vibrato.html and vibrato.js, which implement a vibrato
effect using a modulated delay line.  109
10.3 feedbackDelay.html, which implements delay with
feedback, and user control of the delay and feedback gain.  110
10.4 KarplusStrong.html and KarplusStrong.js, showing a
basic implementation of the Karplus-​Strong algorithm
using the delay node.  112
11.1 Biquad.html and Biquad.js files to visualize the
biquad filters.  120
11.2 IIRFilter.html, audio.js and graphics.js, for plotting the
magnitude response of an IIR filter.  126
11.3 IIRInstability.html, showing an IIR filter becoming
unstable as a feedback coefficient changes.  127
11.4 BiquadInstability.html, showing a biquad filter becoming
unstable as a feedback coefficient changes.  128
12.1 Clipper.html, showing how to clip a signal with the
WaveShaperNode.  145
12.2 BitCrusher.html, using the WaveShaperNode to quantize
a signal.  146
13.1 AttackRelease.html and AttackRelease.js, for visualizing
the effect of changing Attack and Release parameters.  153
13.2 Compressor.html and Compressor.js, showing a dynamic
range compressor in action. The user can control all
parameters and a meter depicts the amount of gain
reduction at any time.  154
14.1 ConvolutionReverb.html, example use of a convolver
node to simulate reverb.  167
14.2 NormalizeIR.html, showing use of a convolver node with
a stored impulse response, also showing subtleties in the
normalize parameter.  168
14.3 FIR.html, showing how FIR filtering is performed using
the ConvolverNode.  170
15.1 ChannelFlip.html, showing how the ChannelSplitter
and ChannelMerger may be used to switch left and right
channels in a stereo source.  181
15.2 Ping-​pong delay using the ChannelMergerNode.  183
16.1 stereoPanning.html, a simple panning example.  190
16.2 StereoEnhancer.html, a stereo width enhancer.  191
17.1 listener.html and listener.js, showing use of the
PannerNode for moving a listener in a 2D space with a
fixed sound source.  205
17.2 panner.html and panner.js, showing use of the
PannerNode for positioning a source in the 3D space
around a listener.  206


18.1 basicNoise.html and basicNoise.js, for creating an audio
node that generates noise, using an audio worklet.  212
18.2 asyncNoise.html, which uses async await to create an
audio worklet node.  215
18.3 multipleInputs.html and multipleInputs.js, for creating an
audio node that generates single-​source, single-​channel
output based on the maximum samples from all channels
of all inputs.  216
18.4 panX.html and panX.js, for equal-​power panning across
an array of speakers.  218
18.5 gain.html and gainworklet.js, for applying gain to a noise
source.  221
18.6 smoothing.html and smoothingWorklet.js, for applying
exponential smoothing to a noise source.  223
18.7 filterOptionsWorklet.html and filterOptionsWorklet.js,
demonstrating use of processorOptions for selecting a
filter type.  225
18.8 filterMessaging.html and filterMessagingWorklet.js, using
the message port for selecting a filter type.  227
19.1 Pulse.html and PulseWorklet.js, which use an audio
worklet to create a pulse wave. By setting the duty cycle to
0.5, this also creates square waves. Note that here we do
not attempt to avoid aliasing.  230
19.2 BitCrusher.html, using an audio worklet to quantize a signal.  232
19.3 Compressor.html, using an audio worklet to apply
dynamic range compression.  233
19.4 stereoWidenerNode.html and stereoWidenerWorklet.js,
using an audio worklet to apply stereo width enhancement.  235
19.5 fixedDelay AudioWorklet, illustrating rewriting a buffer
each sampling period.  237
19.6 fixedDelay2 AudioWorklet, illustrating a fixed delay using
a circular buffer.  238
19.7 The Karplus-Strong algorithm using an audioworklet for
the feedback delay.  239

Resources

All code examples are available at:


https://github.com/joshreiss/Working-with-the-Web-Audio-API

YouTube videos related to the book can be found at:


https://tinyurl.com/y3mtauav
We make extensive use of the Web Audio API documentation
https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API
and especially the Web Audio API specification
www.w3.org/TR/webaudio/

Sound files used in the source code were all public domain or Creative
Commons licensed.

From Cambridge Multitracks,

• Chapter 4: Rachel multitrack, by Anna Blanton, at https://cambridge-mt.com/ms/mtk/#AnnaBlanton

From Nemisindo, https://nemisindo.com, we used

• Chapter 14: Applause.mp3 from https://nemisindo.com/models/applause.html
• Chapter 17: Drone.wav, at https://nemisindo.com/models/propeller.html?preset=Military%20Drone

From FreeSound, https://freesound.org/

• Chapter 8: beat1.mp3 at https://freesound.org/people/rodedawg81/sounds/79539/
• Chapter 10: trumpet.wav at https://freesound.org/people/MTG/sounds/357601/
• Chapter 11: flange love.mp3 at https://freesound.org/people/deleted_user_4338788/sounds/263391/
• Chapter 16: symphonic_warmup.wav from https://freesound.org/people/chromakei/sounds/400171/

Preface

The Web Audio API is the industry-standard tool for creating audio
applications for the web. It provides a powerful and versatile system for
controlling audio on the Web, allowing developers to generate sounds,
select sources, add effects, create visualizations and render audio scenes in
an immersive environment. The Web Audio API is gaining importance and
becoming an essential tool both for many of those whose work focuses on
audio, and those whose work focuses on web programming.
Though small guides and formal specifications exist for the Web
Audio API, there is not yet a detailed book on it, aimed at a wide audi-
ence of potential students, researchers and coders. Also, quite a lot of
the guides are outdated. For instance, many refer to the deprecated
ScriptProcessorNode, and make no mention of the AudioWorkletNode,
which vastly extends the Web Audio API’s functionality.
This book provides a definitive and instructive guide to working with
the Web Audio API. It covers all essential features, with easy-​to-​implement
code examples for every aspect. All the theory behind it is explained, so that
one can understand the design choices as well as the core audio processing
concepts. Advanced concepts are also covered, so that the reader will gain
the skills to build complex audio applications running in the browser.

Structure
The book is structured as follows. The book is divided into seven sections,
with six short interludes separating the sections, and most sections contain
several chapters. The organization is done in such a way that the book can
be read sequentially. With very few exceptions, features of the Web Audio
API are all introduced and explained in detail in their own chapter before
they are used in a code example in another chapter.
The first section is a single chapter. It gives an overview of the Web
Audio API, why it exists, what it does and how it is structured. It has source
code for a ‘Hello World’ application, the simplest program that does some-
thing using the Web Audio API, and then it extends that to showcase a few
more core features.
The second section concerns how to generate sounds with scheduled
sources. There is a chapter for each scheduled source: oscillators, audio
buffer sources, and constant source nodes.
The third section focuses on audio parameters. It contains two chapters:
one on scheduling and setting these parameters, and then one on connecting
to audio parameters and performing modulation.
Then there is a fourth section on source nodes and destination nodes,
beyond the scheduled sources and the default destination. It has chapters
on analysis and visualization of audio streams, on loading and recording
audio, and on performing offline audio processing.
At this point, the reader now has knowledge of all the main ways in
which audio graphs are constructed and used in the Web Audio API. The
remaining sections focus on performing more specialized functions with
nodes to do common audio processing tasks or to enable arbitrary audio
generation, manipulation and processing.
The fifth section focuses on audio effects, with chapters on delay, filtering,
waveshaping, dynamic range compression and reverberation. Each chapter
introduces background on the effect and details of the associated audio
node, with the exception of the filtering chapter, for which there are two
relevant nodes, the BiquadFilterNode and IIRFilterNode.
A sixth section deals with spatial audio, and consists of three chapters.
The first looks at how multichannel audio is handled in the Web Audio
API, and introduces audio nodes for splitting a multichannel audio stream
and for merging several audio streams into a multichannel audio stream.
Two further chapters in this section address stereo panning and spatial
rendering.
The final section unleashes the full power of the Web Audio API with
audio worklets. The first chapter in this section explains audio worklets
in detail and introduces all of their features with source code examples.
The final chapter in the book revisits many source code examples from
previous chapters, and shows how alternative (and in some ways, better)
implementations can be achieved with the use of audio worklets.
Chapters and sections may be read out of order. For instance, one
may choose to delve into audio effects and multichannel processing,
Chapter 10 to Chapter 17, before exploring the details of audio parameters,
destinations and source nodes, Chapter 5 to Chapter 9. In which case, just a
basic understanding of some nodes and connections from earlier chapters
is necessary to fully understand the examples. Or one may skip Chapter 9
entirely without issue, since the OfflineAudioContext is not used in other
chapters.
Only a very small amount of the full Web Audio API specification is
not covered in this book. This includes some aspects of measuring or
controlling latency, aspects that are not included in the Chrome browser
implementation, such as the MediaStreamTrackAudioSourceNode
(used only by Firefox), and discussion of deprecated features, such as the
ScriptProcessorNode.

Notation and coding conventions


The following conventions are used throughout this book.
Italics are often used for new terms.
Constant width font is used for source code and JavaScript or html
terms. Depending on the context, we also sometimes use plain text for the
general concept behind a programming aspect. For instance, we may either
refer to a gain node in plain text, or a GainNode when referring to the spe-
cific syntax used in coding.
For mathematical notation, brackets are generally used when we refer
to functions of discrete, integer samples, such as x[n], and parentheses are
used for functions of continuous time, such as x(t). Equations are usually
unnumbered.
We mostly use the terminology and conventions in the Web Audio API
documentation. So the audio signals being generated and processed are
referred to as audio streams, and this all happens within an audio graph.
However, the documentation refers to a block of 128 sample-​frames in a
render quantum. This is a bit awkward, so we refer to Web Audio processing
a block of 128 samples.
In general, the code examples should work individually ‘out of the box’,
without relying on external libraries or dependencies.
We tend to use simple coding guidelines that reduce the amount of text
in each source code example. So for instance, we do not use semi-​colons
unless necessary. Simple loops and functions are often presented as a single
line. We do not use the JavaScript method getElementById(), and
instead rely on the fact that element IDs are global variables. Similarly, we
do not use EventListeners unless needed. As an example, code examples
are more likely to use:

Volume.oninput = () => VolumeNode.gain.value = Volume.value

rather than

var VolumeElement = document.getElementById("Volume");
VolumeElement.addEventListener("input", changeVolume, false);
function changeVolume () {
  VolumeNode.gain.value = Volume.value;
}

However, variables are usually (but not always) declared in the code
examples. We also make no use of CSS files; our aim is to present working
examples of Web Audio API concepts, but not complete and pretty
applications.

Acknowledgments

The author is a member of the Centre for Digital Music at Queen Mary
University of London. This visionary research group has promoted adven-
turous research since its inception, and he is grateful for the support and
inspiration that they provide.
Much of the audio-related research that underlies techniques and
algorithms used in the Web Audio API was first published in conventions,
conferences or journal articles from the Audio Engineering Society (AES).
The author is greatly indebted to the AES, which has been promoting
advances in the science and practice of audio engineering since 1948.
The author has worked with Web Audio since 2017. Much of that
work has been in the field of procedural audio, which is essentially real-​
time, interactive sound synthesis. It led to the formation of the company
Nemisindo, which provides, among other things, a large online proced-
ural sound effect generation system based on the Web Audio API. Many
great researchers have worked with the author, either on projects leading to
Nemisindo or as part of the Nemisindo team, including Thomas Vassallo,
Adan Benito, Parham Bahadoran, Jake Lee, Rod Selfridge, Hazar Tez,
Jack Walters, Selim Sheta and Clifford Manasseh.
There is also an amazing community of Web Audio developers, whom
the author knows only through their contributions and discussions online.
Without their work, this book (and the Web Audio API itself) would have
been far weaker and less useful. Many of the examples and insights in the
book are based on their work. The best explanations often lie in their ori-
ginal contributions, whereas any errors or omissions are due to the author.
Finally, the author dedicates this book to his family: his wife Sabrina,
daughters Eliza and Laura, and parents Chris and Judith.

1 Introducing the Web Audio API

This chapter introduces the Web Audio API. It explains the motivations
behind it, and compares it to other APIs, packages and environments
for audio programming. It gives an overview of key concepts, such as
the audio graph and how connections are made. The AudioContext is
introduced, as well as a few essential nodes and methods that are explored
in more detail in later chapters. A ‘hello world’ application is presented
as a code example, showing perhaps the simplest use of the Web Audio
API to produce sound. We then extend this application to show alterna-
tive approaches to its implementation, coding practices, and how sound is
manipulated in an audio graph.

The Web Audio API


The Web Audio API is a high-​level Application Programming Interface for
handling audio operations in web applications. It makes audio processing
and analysis a fundamental part of the web platform. It has a lot of built-​
in tools, but also allows one to create their own audio processing routines
within the same framework. Essentially, it allows one to use a web browser
to perform almost any audio processing that one could create for stand-​
alone applications. In particular, it includes capabilities found in modern
game engines and desktop audio production applications, including
mixing, processing, filtering, analysis and synthesis tasks.
The Web Audio API is a signal flow development environment. It has a
lot in common with visual data flow programming, like LabView, Matlab’s
Simulink, Unreal’s BluePrint, PureData, or Max MSP. They all provide a
graphical representation of signal processing. But unlike the others, the
Web Audio API is text-​based JavaScript, not graphical. There are third-​
party tools to work with a graphical representation for web audio develop-
ment, but they are still in early stages.
With the Web Audio API, one can define nodes, which include sound
sources, filters, effects and destinations. One can also create his or her own
nodes. These nodes are connected together, thus defining the routing, pro-
cessing and rendering of audio.


The audio context


Audio operations are handled within an audio context. The audio operations
are performed with audio nodes (consisting of sources, processors and
destinations), and the nodes are connected together to form an audio
routing graph. The graph defines how an audio stream flows from sources
(such as audio files, streaming content or audio signals created within the
audio context) to the destination (often the speakers).
The audio context is defined with a constructor, AudioContext(), as
we will see in the Hello World example below.
All routing occurs within an AudioContext containing a single
AudioDestinationNode. In the simplest case, a single source can be routed
directly to the output, as in Figure 1.1. The audio nodes appear as blocks.
The arrows represent connections between nodes.
Modular routing allows arbitrary connections between different audio
nodes. Each node can have inputs and/​or outputs. A source node has no
inputs and a single output. Sources are often based on sound files, but
the sources can also be real-​time input from a live instrument or micro-
phone, redirection of the audio output from an audio element, or entirely
synthesized sound.
A destination node has one input and no outputs. Though the final
destination node is often the loudspeakers or headphones, you can also
process without sound playback (for example, if you want to do pure visu-
alization) or do offline processing, which results in the audio stream being
written to a destination buffer for later use.
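As a rough sketch of that offline case (the OfflineAudioContext is covered in detail in Chapter 9), the following renders one second of a tone into an AudioBuffer rather than playing it out loud:

// Sketch only: render one second of a 440 Hz tone into a buffer instead of the speakers
const offline = new OfflineAudioContext(1, 44100, 44100) // channels, length in samples, sample rate
const tone = new OscillatorNode(offline)
tone.connect(offline.destination)
tone.start()
offline.startRendering().then(renderedBuffer => {
  // renderedBuffer is an AudioBuffer holding the rendered audio, available for later use
  console.log('Rendered', renderedBuffer.duration, 'seconds')
})
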
Other nodes such as filters can be placed between the source and des-
tination nodes. Such nodes can often have multiple incoming and outgoing
connections. By default, if there are multiple incoming connections into a
node, the Web Audio API simply sums all the incoming audio signals. The
developer also doesn’t usually have to worry about low-​level stream format
details when two objects are connected together. For example, if a mono
audio stream is connected to a stereo input it should just mix to left and
right channels appropriately.
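For instance, if two oscillators are both connected to the same gain node, their outputs are simply added together at that node's input; a minimal sketch:

// Sketch only: two sources feeding one input are mixed by summation automatically
const context = new AudioContext()
const toneA = new OscillatorNode(context, {frequency: 440})
const toneB = new OscillatorNode(context, {frequency: 660})
const mix = new GainNode(context, {gain: 0.5})
toneA.connect(mix)
toneB.connect(mix) // second connection to the same node, summed with the first
mix.connect(context.destination)
toneA.start()
toneB.start()
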
Modular routing also permits the output of AudioNodes to be routed
to an audio parameter that controls the behavior of a different AudioNode.
In this scenario, the output of a node can act as a modulation signal rather
than an input signal.
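As a small sketch of this idea (modulation connections are explored properly in Chapter 6), a slow oscillator can be connected, through a gain node that sets the modulation depth, to the gain parameter of the node carrying an audible tone, giving a tremolo-like wobble:

// Sketch only: a 5 Hz oscillator drives an audio parameter rather than an audio input
const context = new AudioContext()
const carrier = new OscillatorNode(context)              // audible 440 Hz tone
const amp = new GainNode(context, {gain: 0.5})           // its gain parameter will be modulated
const lfo = new OscillatorNode(context, {frequency: 5})  // low-frequency modulator
const depth = new GainNode(context, {gain: 0.3})         // modulation depth
carrier.connect(amp).connect(context.destination)        // audio signal path
lfo.connect(depth).connect(amp.gain)                     // modulation signal path, into a parameter
carrier.start()
lfo.start()
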

[Diagram: Source → Destination, within an AudioContext]
Figure 1.1 The simplest audio context.




A single audio context can support multiple sound inputs and complex
audio graphs, so, generally speaking, we will only need one for each audio
application we create.
The default nodes of the Web Audio API are fairly minimal, only 19
in all.

1. AudioBufferSourceNode
2. MediaElementAudioSourceNode
3. MediaStreamAudioSourceNode
4. ConstantSourceNode
5. OscillatorNode
6. BiquadFilterNode
7. ChannelMergerNode
8. ChannelSplitterNode
9. ConvolverNode
10. DelayNode
11. DynamicsCompressorNode
12. GainNode
13. PannerNode
14. StereoPannerNode
15. WaveShaperNode
16. IIRFilterNode
17. AnalyserNode
18. MediaStreamAudioDestinationNode
19. AudioDestinationNode

The first five are all source nodes, defining some audio content and
where it comes from. The last three are all destinations, giving some
output. Everything else is an intermediate node, which processes the audio,
and has inputs and outputs. We will be talking about all of these nodes,
including providing examples, in later sections. We will also introduce the
AudioWorkletNode, which provides the means to design your own audio
node with its own functionality.
To give you a sense of how these nodes might be used, another
graph is shown in Figure 1.2. The idea of this graph is to shape some
noise and add some effects, perhaps to create a boomy explosion. An

BiquadNode
Lowshelf
BiquadNode
audioWorkletNode Peaking
gainNode waveshaperNode convolverNode destination
whiteNoise
BiquadNode
Peaking
BiquadNode curve buffer
setValueCurveAtTime Highshelf

Figure 1.2 A complex audio routing graph. It applies an envelope, filterbank, dis-
tortion and reverb to noise.
audioWorkletNode is used to generate a continual noise source. A
time-varying gain is applied by automating parameters on a gain node. Then a
filterbank of BiquadFilterNodes is applied to shape the frequency con-
tent, which is then summed back together into another gainNode. This
is then passed through a waveshaperNode to add distortion, based on a
waveshaping curve, and a convolverNode to add reverberation, based on
an impulse response. Finally, this is sent to the destination and heard by
the user. Several of the nodes in this example take additional information
besides audio streams and simple parameter settings, such as an array of
automation curve values for the first gain node, a waveshaping curve for
the waveshaper, and an audio buffer for the convolver.
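As a rough sketch of how such a graph might be wired up, assuming a noise-generating worklet processor has already been registered under the (hypothetical) name 'white-noise', and assuming envelopeCurve, distortionCurve and impulseResponse have been prepared elsewhere, the connections could look something like this:

// Sketch only: 'white-noise', envelopeCurve, distortionCurve and impulseResponse are assumed to exist
const context = new AudioContext()
const noise = new AudioWorkletNode(context, 'white-noise')
const envelope = new GainNode(context, {gain: 0})
envelope.gain.setValueCurveAtTime(envelopeCurve, context.currentTime, 2) // automated envelope
const filterbank = ['lowshelf', 'peaking', 'peaking', 'highshelf']
  .map(type => new BiquadFilterNode(context, {type}))
const sum = new GainNode(context)                        // sums the filterbank outputs
const shaper = new WaveShaperNode(context, {curve: distortionCurve})
const reverb = new ConvolverNode(context, {buffer: impulseResponse})
noise.connect(envelope)
filterbank.forEach(band => { envelope.connect(band); band.connect(sum) })
sum.connect(shaper).connect(reverb).connect(context.destination)
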
If much of that explanation does not make sense, don’t worry. One of
the main purposes of this book is to introduce and explain all of these
nodes, in sufficient detail that you should be able to use them to achieve
whatever audio processing goal you have in mind. But now, let’s give our
first code example.

What you need to get started


One nice thing about using JavaScript to develop browser-​based applications
is that the developer does not need to install a big development environ-
ment, and the applications are accessible by almost anyone. That said, a
few tools are still needed.

• The Chrome web browser –​of all available web browsers, Chrome
has perhaps the most extensive implementation of the Web Audio
API. Almost all features of the Web Audio API are implemented
in other popular browsers (Firefox, Safari, Edge, Opera, and their
mobile device equivalents) but there are enough subtle differences that
applications can’t be guaranteed to work out-​of-​the-​box in another
browser just because they work in Chrome. There are third-​party tools
to help ensure cross-​browser functionality, but for all the code herein,
we just stick with Chrome.
When running your applications, make sure to have the Developer
Console open in Chrome. That way, you will see any error messages that
appear. You can find the Developer Console by opening the Chrome
Menu in the upper-​right-​hand corner of the browser window and
selecting More Tools → Developer Tools.
• A source code editor –​this could be any text editor designed specifically
for editing source code. Atom, Visual Studio Code and Sublime Text
are popular choices. The author mainly used Atom, but it shouldn’t
really matter which one you use, as long as you can satisfy the next
bullet point.
• An http server package –​for most source code editors, this is an
add-​on. A lot of the code examples will not run properly without a
web server running, and launching the server with your project open
is usually the best way to quickly check your code. For atom, the
atom-​live-​server package satisfies all the needs for the code examples
herein.

Example: Hello World


In Code ­example 1.1 we have possibly the simplest program using the Web
Audio API. It just plays a simple sinusoid, the Web Audio equivalent of a
‘Hello World!’ application.

Code ­example 1.1.  Hello World Application, generating sound.

<button onclick='context.resume()'>Start</button>
<script>
context = new AudioContext()
Tone = context.createOscillator()
Tone.start()
Tone.connect(context.destination)
</script>

Most browsers will know this is an html file and display it correctly.
Inside <script> and </script> is the JavaScript code that uses
the Web Audio API. A new AudioContext is created. It has a single
OscillatorNode, with default settings. We will introduce the oscillator
node later, but for now it's sufficient to note that it is a source node that
generates a periodic waveform, and with its default values it is a 440 Hz
sine wave.
The oscillator is started, and connected to the destination. Any
audio node’s output can be connected to any other audio node’s input
by using the connect() function. context.destination is an
AudioDestinationNode, sending the audio stream to the default audio
output of the user’s system.
However, the script by itself will not do anything. The audio context is
suspended by default, and needs to be started by some user interaction. We
did this by creating a button on the web page, and having that perform the
line context.resume() when clicked.
This Hello World application does not showcase any intermediate pro-
cessing. That is, the source is connected directly to the destination. So let’s
extend it a little bit. In Code ­example 1.2, we have added a gain node, which
simply multiplies the input by some value, in order to produce the output.
We set that value to 0.25. Now, the source tone is connected to the gain
node and the gain node is connected to the destination. So the oscillator’s
signal level is reduced to one-fourth.


Code ­example 1.2.  Hello World, version 2.

<button onclick='context.resume()'>Start</button>
<script>
var context = new AudioContext()
var Tone = context.createOscillator()
var Volume = context.createGain()
Volume.gain.value = 0.25
Tone.start()
Tone.connect(Volume)
Volume.connect(context.destination)
</script>

Let’s take a step back now and look at a few lines of code in detail.
Changing the gain of an audio signal is a fundamental operation in audio
applications. createGain is a method of an audioContext that creates
a GainNode. The GainNode is an AudioNode with one input stream and
one output stream. It has a single parameter, gain, which represents the
amount of gain to apply. Its default value is 1, meaning that the input is
left unchanged.
Alternatively, we could have created a new gain node directly using a
gainNode constructor. This takes as input the context and, optionally,
parameter values. Besides allowing us to set the parameters when created,
it can be slightly more efficient than the createGain method. Also, serial
connections can be combined as A.connect(B).connect(C) …. So we
can rewrite this as in Code ­example 1.3.

Code ­example 1.3.  Hello World, version 3.

<button onclick='context.resume()'>Start</button>
<script>
var context = new AudioContext()
var Tone = new OscillatorNode(context)
var Volume = new GainNode(context, {gain: 0.25})
Tone.start()
Tone.connect(Volume).connect(context.destination)
</script>

The resulting audio graph, shown in Figure 1.3, is the same for Code
example 1.2 and Code example 1.3.

Example: adding user interaction


The web page that this will render is static. That is, other than clicking the
Start button, there is no user interaction with it. It simply plays out the
tone. So now we will add some interaction, allowing the user to change
some parameters of the audio nodes.

[Diagram: Tone → Volume → Destination, within an AudioContext]
Figure 1.3 A simple audio graph to generate a tone with reduced volume.

First, note that it is generally good practice to separate the front-end,
user interface-related code, from the back-end, processing and access-
related code. For a lot of web applications, this also makes sense since the
front-​end is mostly html and css files, and the back-​end is almost entirely
JavaScript. So let’s follow that practice here.
We want the gain parameter of the gain node to be controlled by the
user. So we will add a slider to the interface. In html, these are known as
range controls. We made this range control so that it can assume values
from 0 to 1 in increments of 0.01. When the user moves the slider, it will
update the gain parameter value with the value of the slider.
There are a few points to note here. For the gain control, we called the
actual user interface element VolumeSlider, and called the audio node
Volume. Also, we never defined a variable VolumeSlider. We can access
it because the web browser defines the html element IDs as global variables.
This was nice for giving us simple code for the example, but it’s often better
to be explicit and use document.getElementById('someElement')
rather than just someElement.
Now let’s add a few more features to make this a little more functional.
The slider is nice, but we might want to see the actual gain value
that is used. So we can display these amounts in the html with <span
id='SomeLabel'> and </span> in the html file and SomeLabel.innerHTML
= SomeValue in the JavaScript file.
We also want to replace the Start button with a Start/​Stop button to turn
the sound on and off. But the Web Audio API does not allow oscillators to
be started more than once (I don’t know why, and certainly calling start()
and stop() is the intuitive way to turn a sound source on and off). So there
are a few ways around this. One can pause the whole audio context with
context.suspend(), but here we would like to stop just one node. One
could create a new oscillator each time, which could be done by creating
the oscillator, setting its values and connecting it all inside the StartStop
function. But this involves a little extra code and thought. And it’s inelegant
since all other nodes are created consistently elsewhere. Another solution is
to connect or disconnect the oscillator rather than start or stop it.
This is shown in Code ­example 1.4. An oscillator is created but initially
unconnected. If the StartStop button is clicked while the oscillator is uncon-
nected, the oscillator is connected to a gain node. If the button is clicked
while the oscillator is connected, it disconnects the oscillator. A change in
the VolumeSlider value will update both the value of the Volume node’s
gain parameter and the text in the VolumeLabel html element with the
value of the slider.

Code ­example 1.4.  UserInteraction.html and UserInteraction.js.

<button onclick='StartStop()'>Start/Stop</button>
<p>Gain</p>
<input type='range' max=1 value=0.1 step=0.01 id='VolumeSlider'>
<span id='VolumeLabel'></span>
<script src='UserInteraction.js'></script>

var context = new AudioContext()
var Tone = context.createOscillator()
var Volume = new GainNode(context, {gain: 0.1})
Tone.start()
var Connected = false
Volume.connect(context.destination)
function StartStop() {
  if (Connected == false) {
    context.resume()
    Tone.connect(Volume)
    Connected = true
  } else {
    Connected = false
    Tone.disconnect(Volume)
  }
}
VolumeSlider.oninput = function() {
  VolumeLabel.innerHTML = this.value
  Volume.gain.value = this.value
}

We now have our first Web Audio API application with user controls.
It allows one to experiment with oscillators, listen to different volume
settings, and disconnect the oscillator at will. For a lot of this, like the
OscillatorNode and disconnect(), we only introduced just enough to
show off some functionality. They will be more formally introduced, with
more detail, in later sections.
