
Higher Order Ambisonics Spatialization System For Amedi Lab 2020 Draft3

The document describes a spatialization system capable of positioning up to 50 sound sources in 3D space using ambisonics. It details controls for individual source playback and spatialization, as well as room parameters and a listener position controller. Parameters for the audio engine and options for arranging sources in different spatial distributions are also outlined.

Uploaded by

Guy Fleisher
Copyright
© All Rights Reserved

Higher Order Ambisonics spatialisation system

Current capabilities:
- Up to 50 sound sources
- 7th order ambisonics spatialisation
- 12th order ambisonics spatialisation
- Separate source spatialisation automation lanes
- Real time dynamic control over room response parameters
- Real time dynamic control over listener position in room
- Real time dynamic control over spatial source distribution in space
- OSC input and control of spatialisation of sources over network (for usage with Unity3D)
- Binaural spatialisation with 4 different models for playback

Audio Engine

[Figure: Audio Engine panel, showing the CPU section, Audio rate section and Audio Setup section]

Parameters:
1. Main audio ON/OFF
2. CPU usage in %
3. RAM usage
4. Sampling Rate
5. Signal and I/O Vector Size
6. Soundcard
7. I/O
8. Local IP
9. Operating System
10. Free disk space
11. Multi-core support (parallel processing)

*This section shows the audio settings for the entire project.
The photo above shows my laptop’s settings.
Generally, your setup should be set to the Yamaha AD sound card (for both input and
output) and your Windows operating system.
The audio sampling rate should ALWAYS be 48,000 Hz, and the signal vector size and I/O vector
size should both be 128. Furthermore, the overdrive button should be on, and the interrupt button
should be off (as in the photo).
These settings are the default when you open the software, so you don’t need to change anything.
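To give a feel for what these settings mean in time, the duration of one signal vector is simply the vector size divided by the sampling rate. The short Python sketch below only does this arithmetic; the function name is illustrative and is not part of the software.

```python
def vector_duration_ms(vector_size, sample_rate_hz):
    """Duration of one block of `vector_size` samples, in milliseconds."""
    return vector_size / sample_rate_hz * 1000.0

# Recommended settings from this section: 128 samples at 48,000 Hz.
print(round(vector_duration_ms(128, 48000), 3))  # 2.667
```

So with the recommended settings, audio is processed in blocks of roughly 2.7 milliseconds.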
Source Playback Controls first draft

1. Source Selector; choose your source from the dropdown menu: sound file, click sound
(with impulse control), burst, sine tone and ADC (microphone input from sound card).
You can mute each source separately, choose to open a sound file, or drag & drop a
sound file into the selector.
2. Equalization parameters; change EQ parameters of source; Low/Mid/High frequency
attenuation for each source
3. Perceptual (psychoacoustic) parameters; change perceptual parameters of source:
Presence = compression of frequencies
Warmth = compression of lower frequencies
Brilliance = compression of higher frequencies
Room Mix = the amount the source is weighted into the room reverb
Running verb = amount of early reflection mix into reverb
Envelopment = the amount of “leak” of different reverb stages
4. Source Cartesian position (x, y, z); move source coordinates in cartesian space (x, y, z)
5. “Around” source-spinning abstraction; spin the source around the center listening point:
Init = initial position (in degrees) around the center listening point
Speed = speed of the spin in degrees per second. Negative (minus) values yield a
counter-clockwise spin
Grain = the amount of time between successive steps of the spin
Current = current position in degrees
6. Automation lanes for each spatial plane; write and play automations for
source position:

The automation lanes are set between (-2.) and (2.) on the Y axis and between 0:00
and 01:00 minutes on the X axis. To add a new automation point, simply click in the
function interface. To erase a point, use [shift + click]. To adjust a point with finer detail,
hold [ctrl + click] on the point and move it slowly. If you click a single point, you can view
its location and values (x, y) on the interface.
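As an illustration of how a lane of points could be read back during playback, here is a minimal Python sketch. It assumes straight-line (linear) interpolation between neighbouring points and that the value of the nearest point is held outside the drawn range; the manual does not state which curve type the software actually uses. The function name and the example points are hypothetical.

```python
from bisect import bisect_right

def lane_value(points, t_ms):
    """Value of an automation lane at time t_ms.

    `points` is a list of (time_ms, value) pairs sorted by time.
    Linear interpolation between points is an assumption of this sketch.
    """
    if not points:
        return 0.0
    times = [p[0] for p in points]
    i = bisect_right(times, t_ms)
    if i == 0:                # before the first point: hold its value
        return points[0][1]
    if i == len(points):      # at or after the last point: hold its value
        return points[-1][1]
    (t0, v0), (t1, v1) = points[i - 1], points[i]
    return v0 + (v1 - v0) * (t_ms - t0) / (t1 - t0)

# Hypothetical lane: -2 at 0:00, +2 at 0:30, 0 at 1:00 (times in milliseconds).
points = [(0, -2.0), (30000, 2.0), (60000, 0.0)]
print(lane_value(points, 15000))  # 0.0
print(lane_value(points, 45000))  # 1.0
```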

Furthermore, you have separate zoom controls for the X axis and the Y axis, which cover the
range from the beginning to the end of each automation.
The number box at the top right of each automation lane is the length of the automation itself in
milliseconds (1,000 milliseconds = 1 second, hence 60,000 ms = 60 seconds = 1 minute).
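The “Around” spin from item 5 above can be pictured as an angle that advances in small steps. The Python sketch below is only an illustration under stated assumptions: Grain is taken to be the time between steps in milliseconds, 0 degrees is taken to be straight ahead of the listener (+y), positive angles are clockwise when seen from above, and +x is to the right. None of these conventions are confirmed by the software, and the function names are hypothetical.

```python
import math

def spin_angle(init_deg, speed_deg_per_s, grain_ms, step):
    """Angle in degrees (0 to 360) after `step` updates spaced `grain_ms` apart."""
    elapsed_s = step * grain_ms / 1000.0
    return (init_deg + speed_deg_per_s * elapsed_s) % 360.0

def angle_to_xy(angle_deg, radius):
    """Assumed convention: 0 deg = front (+y), positive = clockwise, +x = right."""
    a = math.radians(angle_deg)
    return radius * math.sin(a), radius * math.cos(a)

# 90 deg/s for one second (10 steps of 100 ms) gives a quarter turn.
print(spin_angle(0, 90, 100, 10))   # 90.0
# A negative speed spins the other way: -90 degrees is the same as 270 degrees.
print(spin_angle(0, -90, 100, 10))  # 270.0
```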
Source Playback Controls New2
Changes made to source editing, spatial manipulation and spatial operations

Source Input Patcher: “50SourcesNEW”

This patcher allows you to load up to 50 separate sound samples to be played back and
manipulated using the 97 Speaker Array.
SingleSample

Samples can be loaded by pressing the LoadSample button. Similarly, samples may be added
using drag & drop onto the white rectangle area.
There are separate control buttons for play/stop of each sample. The Normalize Gain number
box allows for the dynamic application of gain normalization (a bit similar to a compressor, but
without the artefacts).

When you load a new sample into the source player, it should look like this:
Furthermore, you can control and edit the sample’s playback region, using the tools on the tools
panel to the left of the loaded sample.

The tools are (from top to bottom):
- Region Select
- Move selected region (re-fit into buffer)
- Move selected region (no region re-fit)
- Draw one waveform (experimental feature)

But most importantly, all the changes will be visible in the spatialisation patcher window
and the selected loop region will be used!
Transport Tool

1. Play; play all sources
2. Stop; stop all sources
3. Pause; pause all sources
4. Resume; resume play of all sources from the pause point
5. Mute/Unmute; mute/unmute all sources
6. Gain control; gain (level in decibels) of all sources
7. Sequential source play trigger; choose the time (in milliseconds) between each source play
cue and hit the start button. You can stop the triggering by clicking the start button again.
Otherwise, it will attempt to play all 50 sources sequentially. You can also change the trigger
speed dynamically (even after you’ve started).
8. Sequential source automation player; same as the sequential play trigger, but for the
automation lanes in each source.
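To estimate how long a sequential trigger run takes, the cue time of each source can be worked out from the chosen interval. This small Python sketch assumes the first source is cued immediately (at 0 ms) and that the interval stays fixed during the run; the function name is hypothetical.

```python
def cue_times_ms(num_sources, interval_ms):
    """Start time of each source, in ms, when sources are cued one after another."""
    return [i * interval_ms for i in range(num_sources)]

# 50 sources cued 250 ms apart: the last one starts after 49 intervals.
times = cue_times_ms(50, 250)
print(times[:3])   # [0, 250, 500]
print(times[-1])   # 12250
```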
Viewer/Operator

Spat Viewer: view and control source location and spatialisation
Spat Operator (Oper): view and control room related parameters

In order to view the windows, there are several controls:
Open = open the viewer/operator
Status = status of all sources/speakers in project
Shortcuts = shortcut list for quick reference
Help = help window panel
Snapshot = viewer/operator status (saved in a snapshot)
Spat Viewer/Spat Operator

Generally speaking, you should ALWAYS use Spat Viewer in order to spatialise sources in the
room. Spatialisation is measured in Cartesian space coordinates (x = left/right, y = front/back, z
= up/down). These are used because they are simpler to work with than polar coordinates (which
are based on angles around the 360 degrees of the circle).
Spat Operator is used to keep track of, and dynamically alter, each source’s information as it relates
to the room calculations, room reverb and the acoustics in the virtual space. Again, the most
important parameters are found on each source player panel, for quick access and easy
compatibility.
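If you ever need to relate the Cartesian positions to angles, the conversion is straightforward. The Python sketch below assumes 0 degrees azimuth is straight ahead (+y), positive azimuth is towards the right (+x), and elevation is measured upwards from the horizontal plane. These conventions are assumptions of this sketch and should be checked against the Spat Viewer before relying on them.

```python
import math

def cartesian_to_polar(x, y, z):
    """Return (azimuth_deg, elevation_deg, distance) for a Cartesian position.

    Assumed convention: azimuth 0 = front (+y), positive towards +x (right).
    """
    distance = math.sqrt(x * x + y * y + z * z)
    if distance == 0.0:
        return 0.0, 0.0, 0.0
    azimuth = math.degrees(math.atan2(x, y))
    # Clamp the ratio to guard against tiny floating-point overshoot.
    ratio = max(-1.0, min(1.0, z / distance))
    elevation = math.degrees(math.asin(ratio))
    return azimuth, elevation, distance

az, el, dist = cartesian_to_polar(1.0, 1.0, 0.0)
print(round(az, 3), round(el, 3), round(dist, 3))  # 45.0 0.0 1.414
```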
Room Control

1. Mute; mute room reverb for all sources
2. Reverb; Room reverb mix for all sources
3. Heaviness; the amount of lower frequency content in the reverb
4. Liveness; the amount of higher frequencies content in the reverb
5. Room size; the size of the room
6. Reset; reset button for all reverb parameters
7. Early reflection section:
- Minimum amount of time and maximum amount of time (in milliseconds) of early
reflections of the reverb and the distribution (into other reverb sections) over time.
Listener

This panel controls the listener’s head position in space. This has to do with the head related
transfer function for the room response.
1. Visible; show/hide listener head in the Spat.viewer window
2. Editable; enable/disable editing with mouse of listener head position in Spat.viewer
3. Two ready-made head positions:
Listener standing button; listener head at (0, 0, 1.4m) coordinates
Listener lying on bed button; listener head at (0, 0, 0.75m) coordinates
4. Dynamic position controls:
X axis; control listener’s head position over the X axis
Y axis; control listener’s head position over the Y axis
Z axis; control listener’s head position over the Z axis
Yaw; spin clockwise
Pitch; spin counter-clockwise
Roll; roll backwards/forwards
Spatial Arrangement of sources

This section is meant to simplify and accelerate working with a large number of sources. There
are 15 different spatial arrangements of sources in space which you may choose from. For easy
access, they are ordered into 3 distinct groups of arrangement:
1. Shapes; Equal area (around a sphere), Spiral arrangement, circular arrangement and upper
spiral arrangement.
2. All Directions (generally speaking, spherical 3D arrangement); Spherical covering (to cover
the largest amount of space), Spherical packing (best way to pack the sources into a sphere),
Minimal energy (the least amount of energy needed to cover a sphere), Maximal volume,
Tetrahedron (this tab has platonic solid shapes - for further information please see:
_solids), Nearly uniform arrangement, Spherical 0 design - according to
number of sources.
3. Upper (generally speaking, half of a 3D sphere); Upper Random, Upper Equiangle (same
angle from center [delta] of all sources), Upper nearly uniform, Upper equal area (each source
gets an equal area on the half-sphere)

***Pay special attention to the small yellow button (top right side): this button controls
the arrangement: based around a 3 dimensional sphere or a 3 dimensional cube.
General controls over source group arrangement:
- Yaw; control the entire source group spin on the X-axis clockwise (left to right)
- Pitch; control the entire source group spin on the X-axis counter clockwise (right to left)
- Roll; control the entire source group spin on the Y-axis (front to back)
- Xscale; scale all sources in group along the X axis (1.0 = no scaling)
- Yscale; scale all sources in group along the Y axis (1.0 = no scaling)
- Zscale; scale all sources in group along the Z axis (1.0 = no scaling)
- Distance; control the distance of all sources from the point (0, 0, 0)
- Group position X; move entire group over X axis (flatten Y axis)
- Group position Y; move entire group over Y axis (flatten X axis)
- Group position Z; move entire group over Z axis (flatten X axis)
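The scale and group position controls can be thought of as simple arithmetic on every source position. The following Python sketch illustrates this for the X/Y/Z scale and group position controls only. It assumes the scaling is applied before the group is moved, which the manual does not state; the function and parameter names are hypothetical.

```python
def transform_group(positions, scale=(1.0, 1.0, 1.0), offset=(0.0, 0.0, 0.0)):
    """Scale each (x, y, z) position per axis, then move the whole group by `offset`."""
    sx, sy, sz = scale
    ox, oy, oz = offset
    return [(x * sx + ox, y * sy + oy, z * sz + oz) for x, y, z in positions]

# Two hypothetical sources: double the X spread and raise the group by 1 metre.
sources = [(1.0, 2.0, 0.5), (-1.0, 2.0, 0.5)]
print(transform_group(sources, scale=(2.0, 1.0, 1.0), offset=(0.0, 0.0, 1.0)))
# [(2.0, 2.0, 1.5), (-2.0, 2.0, 1.5)]
```

With the default values (scale 1.0 on every axis, no offset) the positions are returned unchanged, matching the "1.0 = no scaling" note above.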
Presets

In order to save a preset (which includes all interface inputs), follow these steps:
1. In the number box on the right, input the number of the preset you want to save and press enter.
2. You will see the squares change color from a dark to a brighter shade.
Brighter squares mean that there is a preset saved in that location. Darker squares mean the
slot is empty.
In order to choose and recall a preset, simply click the square with the preset number you wish to load.
**BE CAREFUL; you can save on top of someone else’s preset quite easily, so consider keeping a
list of presets on a separate sheet of paper, so you don’t overwrite each other’s presets.
OSC control

In the projects, there is Open Sound Control (‘OSC’) support for source spatialisation. OSC is a
way of sending data packets over a network in order to transfer sound parameters without
compromising the speed of the connection (or causing audio interruptions). It uses the computer’s
built-in networking capabilities. (for more details, please see:
)
The OSC panel shows a number of parameter controls:
- Destination host (the IP of the computer you are connecting to)
- Send toggle - incoming data monitor - port number to send on - print toggle - speed limiter
- Receive toggle - incoming data monitor - port number to receive on - print toggle - speed limiter

Generally speaking, the OSC messages are hard-coded into the project’s interface, as seen
from this image, showing the routing ‘under-the-hood’, so for every change in OSC settings,
consult back with Guy, or simply change the ‘Route’ object’s argument pathway formatting
syntax under the OSC panel, to fit your required formatting.
The OSC addresses follow a simple syntax:
/[source index number from 1 - 10]/x y z positions (3 floating-point numbers)
(Notice the position of the “/” slashes)
The source index can be a float or an integer, between 1 and 10.
The x, y and z positions must be floating-point numbers.
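For testing without Unity3D, a position message can be built by hand. The Python sketch below uses only the standard library and follows the OSC 1.0 packet layout (padded address string, padded type tag string, big-endian 32-bit floats). The address "/1/xyz" is a placeholder of this sketch, not the confirmed address: check the ‘Route’ object under the OSC panel for the exact string the patch expects. The function names are hypothetical.

```python
import socket
import struct

def osc_string(text):
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)

def osc_message(address, x, y, z):
    """Build an OSC 1.0 message carrying three 32-bit big-endian floats."""
    return osc_string(address) + osc_string(",fff") + struct.pack(">fff", x, y, z)

def send_position(host, port, address, x, y, z):
    """Send one position message over UDP to the given host and port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(osc_message(address, x, y, z), (host, port))

# "/1/xyz" is a placeholder address for source 1 (an assumption, see above).
packet = osc_message("/1/xyz", 0.5, -1.0, 1.4)
print(len(packet))  # 28
```

To actually send, call for example `send_position("192.168.0.10", 6161, "/1/xyz", 0.5, -1.0, 1.4)`, where the IP address is made up for illustration and the port must match the one set in the OSC panel.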

As mentioned, the OSC messages are formatted this way to allow a direct connection
with Unity3D and the exchange of sound source position data in real time.
The port number and other general settings are NOT hard-coded, hence you need to input them
manually in order to make use of OSC.

Steps to follow:
You MUST enter the video computer’s IP address.
You MUST change the port number to the one the data is being sent on (should be port number
6161).
You should probably put a speed limit on the source input, in order to slow down the
incoming messages (in milliseconds).

I took the liberty to try and accommodate an experiment where one’s breath is monitored (via
sensors) and mapped to control the expansion of sound sources around the listener. This
requires OSC connection to a sensor (arduino? maxuino? other micro-processors?) which
outputs the breath parameter as floating-point numbers. This parameter could then be scaled to
better fit the sensor’s output range (with the input minimum and maximum range scaling).

The OSC address follows a simple syntax:
/breath/x y z positions (3 floating-point numbers)
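The input range scaling mentioned above amounts to a linear mapping from the sensor's range to the range you want for the source expansion. Here is a minimal Python sketch of such a mapping; clamping values that fall outside the input range is an assumption of this sketch, and the function name and example ranges are hypothetical.

```python
def scale_range(value, in_min, in_max, out_min, out_max):
    """Linearly map `value` from [in_min, in_max] to [out_min, out_max], clamped."""
    if in_max == in_min:
        return out_min  # avoid dividing by zero on a degenerate input range
    t = (value - in_min) / (in_max - in_min)
    t = max(0.0, min(1.0, t))
    return out_min + t * (out_max - out_min)

# Hypothetical example: breath value 0.0 to 1.0 mapped to a distance of 1 to 3.
print(scale_range(0.5, 0.0, 1.0, 1.0, 3.0))  # 2.0
print(scale_range(1.7, 0.0, 1.0, 1.0, 3.0))  # 3.0 (clamped)
```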
Binaural Spatialisation models (please see 3-D model for visual representation of speaker layouts)

As mentioned, the system has a Binaural decoding patch preset with four distinct speaker layout
combinations, in order to try to capture and transfer a stereo field experience, as in Binaural
playback, to playback over multiple speaker array systems. Since the speaker array system is
calibrated in the direction of the centre listening position, the room response plays a major part
in decoding the spherical harmonics of the B-format higher order ambisonics, and this is
implemented in the patch design.

Method A:
12 speakers in 3-dimensional alignment simulating closed-cup headphones through multi-
speaker array.
This method tries to emulate the closed-cup headphone positions with 12 speakers; open
centre, closed bottom and top, in an attempt to minimise room response.

Method B:
12 speaker array in widest stereo configuration for multi-speaker array.
This method tries to re-create the widest stereo positioning of a multi-speaker array, to use the
entire room response.

Method C:
16 speaker array representing 2 ears as the 2 walls (strengthened widest Binaural stereo).
This method tries to re-create the widest stereo positioning of a multi-speaker array, to use the
entire room response, while strengthening the left and right positions through the addition of a
centre speaker at low and medium level (ear level position).

Method D:
6 speaker array - true Binaural in stereo configuration for multi-speaker array
This method tries to re-create the perfect binaural positioning through the use of the centre
speakers column, bottom, medium and top positions.
This method does not take into account room response in speaker positioning.
Spat4Live

2 projects containing 2 custom spatialisation tools using Max for Live.

The Projects:
- [project1] for 29 speakers
- [project2] for binaural speaker setup

The projects have been saved with a defined preset of input/output channel routing.
There are 20 sources, spread over 20 channels.
The output channels are routed automatically to reflect 2D Higher Order Ambisonics of the
12th order, spread spatially in a symmetric formation over 29 speakers.
The ‘SpatMaster’ track controls ALL of the spatial commands for all the sources, through the
use of automation lanes.
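For reference, the number of ambisonic components grows with the order, following the standard formulas 2N + 1 for 2D (horizontal-only) and (N + 1) squared for full 3D. The Python sketch below just evaluates these formulas; it does not describe how the projects are built internally, and the function names are illustrative.

```python
def hoa_channels_2d(order):
    """Number of components for 2D (horizontal-only) ambisonics of a given order."""
    return 2 * order + 1

def hoa_channels_3d(order):
    """Number of components for full 3D ambisonics of a given order."""
    return (order + 1) ** 2

print(hoa_channels_2d(12))  # 25
print(hoa_channels_3d(7))   # 64
```

A 12th order 2D encoding therefore has 25 components, which is fewer than the 29 speakers used in [project1].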

The custom Max for Live device is based on the IRCAM Spat~ (Spatialisateur).

[Figure: custom Max for Live device, with panels for Viewer/Operator controls, Source arrangement formation presets, Source spatial control and spatial operations, Listener position and orientation controls, and Room Settings]

***For more details on source operations, listener movement, source spatial arrangements, viewer,
operator and OSC control, please refer to the relevant section in the Max patch description (pp. 11 - 16).
Overview of the custom Max for Live device (the bottom of the Ableton Live set should look like
this):
