Algorithms and Data Structures for a Music Notation System Based on GUIDO [Hermann Walter, Keith Hamel] (2002)
Dissertation
Referenten:
Prof. Dr. Hermann Walter, Darmstadt
Prof. Dr. Keith Hamel, Vancouver, Kanada
First of all, I would like to thank Prof. H. K.-G. Walter for his support. Without him,
research in the field of computer music would not have started at the computer
science department of the Darmstadt University of Technology.
I would also like to express my thanks to Holger H. Hoos and Keith Hamel for
their fundamental work on GUIDO Music Notation and for countless discussions
on the subject of music representation and music notation. Special thanks also to
Jürgen Kilian for many fruitful conversations and encouragement. Additionally, I
would like to thank Marko Görg, Matthias Huber, Martin Friedmann, and Christian
Trebing for working on different aspects of GUIDO. Parts of this thesis would not
have been possible without their work.
Many thanks go to the colleagues at the “Automata Theory and Formal Languages”
group of the computer science department of the Darmstadt University of Technol-
ogy, who have always been extremely helpful and open to discuss the sometimes
unfamiliar subjects of computer music.
Last but not least, I wish to express my love and gratitude to my wife for her trust
in my work.
Abstract
Abstract v
1 Introduction 1
5 Applications 121
5.1 The GUIDO NoteViewer . . . 121
5.1.1 Architecture of the GUIDO NoteViewer . . . 122
5.1.2 Interface Functions of the GNE . . . 124
5.2 The GUIDO NoteServer . . . 124
5.2.1 Browser-Based Access . . . 127
5.2.2 CGI-Based Access . . . 127
5.2.3 Usage of JavaScript . . . 128
5.2.4 The Java Scroll Applet . . . 130
6 Conclusion 137
Bibliography 140
Appendix
Erklärung (Declaration) 153
“In general, a study of the representation of musical notation induces a profound respect for its ingeniousness. Graphical features have been used very cleverly to convey aspects of sound that must be accommodated in the realization of a single event.” [SF97a], p. 11

Chapter 1

Introduction
sound waves became possible, these roles were beginning to lose importance. Only
because music could be recorded, and therefore easily distributed, did performers (for
example jazz musicians) begin to become as popular as previously only the composers
had been. Today, someone can be the composer and sole performer of a piece at
the same time; for these musicians, the need to produce a graphical score is of little
importance. Still, a large number of composers are producing scores today, which are
then interpreted and performed by musicians. Therefore, transmitting musical
information using graphical scores remains an important subject.
Conventional music notation as it is used today has evolved over a long period of
time, beginning well before the invention of printing techniques. One of the most
important contributors to today's music notation is Guido d'Arezzo (born around 992,
died around 1050), who was the first to place notes on and between lines, which form
the staves of a score as known today [Blu89b]. This was the birth of the
two-dimensional structure for graphical music representation that is still in use now.4
As the complexity of musical compositions grew and new instruments were invented,
music notation evolved continuously to adequately represent the new ideas. The
techniques for producing scores also changed over time. When printing techniques
were invented around 1500, they were also used and adapted for music notation.
[Portrait: Guido d'Arezzo]
The printing process for creating scores that was used from around 1600 until beyond
the invention of computer-driven printers is called engraving [Blu89c, eng98].5
Here, the (human) engraver uses specialized tools to “engrave” the score into a soft
copper plate. The quality and readability of the score greatly depend on the
experience of the human engraver. Very often, music notation is ambiguous, as there
exist multiple ways to represent a musical idea graphically: “Musical notation is not
logically self-consistent. Its visual grammar is open-ended – far more so than any
numerical system or formal grammar.”6 Because of the complexity of the task, several
books on the art of music engraving have been written [Had48, Rea79, Ros87, Vin88,
Wan88], which give detailed instructions on how to engrave music. These instructions
are usually derived by studying and extracting examples from “real” scores, which
were engraved by an expert human engraver. Nevertheless, for almost any typesetting
rule, there seems to exist at least one score,
4. As shown in Figure 1.1, the pitch of a note is indicated by the vertical placement on and between the lines of a staff. The attack time of an event is denoted by the horizontal placement on the staff. Events with the same attack time have the same horizontal position.
5. Some music publishers still employ human engravers today.
6. Quote taken from [SF97a], p. 15.
where this rule is explicitly broken in order to obtain a good overall result. This
makes automatic music typesetting a complicated task.
[Figure 1.1: pitch runs from low to high along the vertical axis; time position (0, ¼, ½, ¾) runs along the horizontal axis]
The invention of computers and high-quality laser printers has dramatically changed
the way musical scores are produced today: nowadays, almost all music typesetting7
is done with the help of computers. It must be noted, however, that even though
computer programs are capable of producing very fine scores, a large amount of
human interaction is still required if high-quality output is desired.
The issues and problems found when automatically typesetting music are manifold.
Don Byrd, who has written a dissertation on the subject, defines three main tasks
that must be solved by any music notation system: selecting, positioning, and
printing the music notation symbols [Byr84]. Each of these tasks requires knowledge
that was formerly held only by engravers. In order to automatically create music
notation, this human knowledge must be transferred into a computer system, a task
which is always challenging and interesting. Different approaches to encoding
engravers' knowledge in music notation systems have been tried. Most use a
case-based (or rule-based) approach to automatically select and position the music
notation symbols [BSN01, Gie01]. Unfortunately, there is currently little public
discussion of music notation algorithms: most commercial vendors of music notation
software do not publish or discuss their internal algorithms.
Even though quite a number of music notation programs are in use today, no
commonly used and widespread notation interchange format exists. Although many
proposals for such an interchange format have been made (for example NIFF or
SMDL), none of them is widely used. One of the main reasons for this seems to be
the complexity of any such format. Because music notation and musical structure
are generally so rich, the developed formalisms are also complex. Very often,
supporting such an interchange format involves enormous effort for an application
developer. Additionally, some developers of commercial music notation software
seem to have little interest in an unrestricted interchange of scores from one
application to another.
7. Note that the word “typesetting” is borrowed from printed text; scores actually were prepared using movable type for a period of time, but the results were not satisfactory. Nevertheless, the term “musical typesetting” describes the process of creating a visual score (for computer display or printing).
The only (structured) music representation format that is widely accepted today is
MIDI, which is not suited as a notation interchange format because it was never
intended as such; much of the information present in a graphical score cannot be
represented using MIDI. Many other music representation formalisms have been
developed since computers began to be used for music information processing. Most
of these were created for a specific purpose and not for notation interchange.
Beginning in 1996, a group of developers created a new music representation
language called GUIDO Music Notation8, which is named after Guido d'Arezzo
[HHRK98]. GUIDO is a human-readable, adequate music representation language
that is not primarily focused on conventional music notation; it is much more an
open format, capable of storing musical, structural, and notational information. One
of the main design criteria for GUIDO was adequateness, meaning that a simple
musical idea should have a simple GUIDO representation, whereas more complex
ideas may sometimes require more complex GUIDO representations. GUIDO is also
intuitive to read because commonly known musical names are used: it is easy to
understand that the GUIDO description [ \clef<"treble"> c d e ] describes a
treble clef and three consecutive notes c, d, and e. Despite the intuitive character of
simple GUIDO descriptions, GUIDO can very well be used for representing complex
musical ideas. It was shown in [HHR99] that GUIDO is also very well suited as a
notation interchange format.
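The human readability of simple GUIDO descriptions also makes them easy to process mechanically. The following sketch (an invented Python illustration, not code from the thesis, covering only a tiny fragment of the actual GUIDO grammar) tokenizes a basic description into tags, notes, and sequence brackets:

```python
import re

# Token patterns for a tiny subset of Basic GUIDO: tags such as
# \clef<"treble">, notes such as c, d#1*1/2 or g/8, and the sequence
# brackets [ ].  This is only an illustrative sketch, not the full
# grammar from the GUIDO specification.
TOKEN = re.compile(r"""
    (?P<tag>\\[A-Za-z]+(?:<[^>]*>)?) |                     # \tagName<...>
    (?P<note>[a-h](?:[#&]+)?(?:-?\d+)?(?:[*/]\d+(?:/\d+)?)?) |
    (?P<bracket>[\[\]])
""", re.VERBOSE)

def tokenize(guido: str):
    """Return a list of (kind, text) pairs for a simple GUIDO string."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(guido)]

print(tokenize('[ \\clef<"treble"> c d e ]'))
# → [('bracket', '['), ('tag', '\\clef<"treble">'),
#    ('note', 'c'), ('note', 'd'), ('note', 'e'), ('bracket', ']')]
```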
Because of its features, an increasing number of research groups and individuals
began to use GUIDO Music Notation as the underlying music representation
formalism in various music-related projects. In this context, the need for an
easy-to-use music notation system that converts GUIDO descriptions into
conventional graphical scores became quite strong; this demand led directly to the
design and implementation of the music notation system based on GUIDO that is
described in the present thesis. Other music notation software has been adapted to
import and export GUIDO [Ham98]. Even though GUIDO descriptions may contain
all necessary formatting information, the general process of converting an arbitrary
GUIDO description into a conventional score is not trivial: because a GUIDO
description can be highly unspecified, meaning that no explicit formatting
information is contained, the missing information must be generated automatically.
Additionally, the GUIDO description may contain notes or rests with unconventional
durations, for which no standard graphical representation is possible. These cases
must be dealt with by a set of music notation algorithms, which must first be
recognized, defined, and implemented. It should also be possible to describe each of
these music notation algorithms as a GUIDO-to-GUIDO transformation. This means
that each algorithm gets a GUIDO description as its input and returns an enhanced
GUIDO description, where well-defined formatting aspects have been automatically
added.
8. In the following, the terms GUIDO Music Notation and GUIDO are used interchangeably. Both terms describe the format as well as specific files; it will be clear from context what is meant.
This concept greatly simplifies the exchange of music notation algorithms and is
also very helpful for generally discussing the issues found in automatic music nota-
tion systems.
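The GUIDO-to-GUIDO transformation concept can be pictured as a pipeline of string-to-string functions. The following Python sketch is an invented illustration of this idea (the pass name `add_default_clef` and its behavior are hypothetical, not algorithms from the thesis):

```python
from typing import Callable

# A music notation algorithm, viewed as a GUIDO-to-GUIDO
# transformation: it consumes a GUIDO description and returns an
# enhanced one in which some formatting aspect has been added.
Transformation = Callable[[str], str]

def add_default_clef(guido: str) -> str:
    """Hypothetical example pass: prepend a treble clef if none is given."""
    if "\\clef" not in guido:
        return guido.replace("[", "[ \\clef<\"treble\">", 1)
    return guido

def apply_pipeline(guido: str, passes: list[Transformation]) -> str:
    """Chain several notation algorithms; the output of each pass is
    again valid GUIDO and serves as the input of the next one."""
    for p in passes:
        guido = p(guido)
    return guido

print(apply_pipeline("[ c d e ]", [add_default_clef]))
# → [ \clef<"treble"> c d e ]
```

Because every pass emits plain GUIDO, passes written by different groups can be exchanged and recombined freely, which is exactly the benefit the text describes.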
Because GUIDO descriptions are human-readable plain text, any application
supporting GUIDO requires an internal, computer-suited representation for storing,
manipulating, and traversing the musical data. As part of this thesis, an
object-oriented class library, called the Abstract Representation, was designed and
implemented as such an internal representation for GUIDO descriptions. The GUIDO
parser kit [Hoo] is used to convert a GUIDO description into an instance of the
Abstract Representation. Because the general structure of the Abstract
Representation closely matches the internal structure of GUIDO, the conversion
process is straightforward. As the Abstract Representation is an object-oriented data
structure, the required manipulation and traversal functions are an integral part of
the framework. These manipulation functions, which are mostly used by music
notation algorithms, include operations for deleting or splitting events or adding
musical markup.
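One such manipulation, splitting an event, can be sketched as follows. This is an invented simplification in Python (the `Event` class and `split_event` function are illustrative, not the actual C++ classes of the Abstract Representation):

```python
from dataclasses import dataclass
from fractions import Fraction

# Illustrative sketch of one Abstract-Representation-style operation:
# splitting an event at a given musical time offset.
@dataclass
class Event:
    pitch: str            # e.g. "a1"; a rest could use a sentinel like "_"
    duration: Fraction    # musical duration, e.g. Fraction(1, 2)

def split_event(seq: list[Event], index: int, at: Fraction) -> None:
    """Split seq[index] into two events whose durations sum to the
    original duration; useful e.g. when a note crosses a bar line
    (the two halves would then be joined by a tie)."""
    ev = seq[index]
    assert Fraction(0) < at < ev.duration
    seq[index:index + 1] = [Event(ev.pitch, at),
                            Event(ev.pitch, ev.duration - at)]

voice = [Event("a1", Fraction(3, 4))]
split_event(voice, 0, Fraction(1, 2))
print([(e.pitch, str(e.duration)) for e in voice])
# → [('a1', '1/2'), ('a1', '1/4')]
```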
In order to create a graphical score, an additional object-oriented class library, called
the Graphical Representation, was developed. Each notational element found in a
conventional score has a direct counterpart in the Graphical Representation. Two
steps are then required to create a conventional score from an arbitrary GUIDO
description: first, the GUIDO description must be converted into the Abstract
Representation, which must then be converted into the Graphical Representation.
Once the notational elements of the Graphical Representation have been determined,
they must be placed on lines and pages. This requires sophisticated algorithms for
spacing, line breaking, and page filling. All of these issues have been thoroughly
dealt with during the work on this thesis. The spacing algorithm used in most
current music notation systems, which has been previously discussed in the
literature [Gou87], has been enhanced to provide better results for complex
interactions of rhythms. The line breaking algorithm has been extended so that it
can be used to obtain optimally filled pages, which is a standard requirement for
graphical scores. There have been no previous publications on the subject of page
filling, although one commercial music notation system includes an implementation
of such an algorithm.
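The core idea of the duration-based spacing discussed in the literature ([Gou87] and others) is that the horizontal space allotted to an event grows with the logarithm of its duration: a half note gets more room than an eighth note, but not four times as much. The following Python sketch illustrates this idea; the constants and the exact spacing function are illustrative choices, not the tuned values used in the thesis:

```python
import math

def spacing_units(duration: float, shortest: float = 1 / 16) -> float:
    """Horizontal space for an event, in abstract spacing units:
    a base amount for the shortest duration on the line, plus a
    fixed increment per doubling of the duration."""
    base = 2.0      # space given to the shortest duration
    factor = 1.0    # additional space per doubling of the duration
    return base + factor * math.log2(duration / shortest)

line = [1 / 8, 1 / 8, 1 / 4, 1 / 2]          # durations on one line
print([spacing_units(d) for d in line])
# → [3.0, 3.0, 4.0, 5.0]
```

Note how the half note receives only 2.5 times the space of the eighth note although it lasts four times as long; this logarithmic compression is what keeps dense rhythms readable.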
The stand-alone music notation system that has been developed using the data
structures and algorithms described above is freely available and is being used by a
growing number of people and institutions around the world. The implemented
music notation system is also used in other areas: the GUIDO NoteServer is a free
Internet service that converts GUIDO descriptions into pictures of scores. This
service can be accessed using any standard web browser. Another part of the
research that was carried out while working on this thesis focused on the
development of a Java-based music notation editor.
The rest of this thesis is structured as follows: in the first part of chapter 2, GUIDO
Music Notation is thoroughly described and numerous examples for the various
features of GUIDO are given; several sections of this part of the chapter are enhanced
Introduction
In this chapter, GUIDO Music Notation,1 a music representation language which has
been under development since 1996 [HH96], will be introduced and compared to
other representation formalisms. GUIDO Music Notation is named after Guido
d'Arezzo (ca. 992-1050), a renowned music theorist of his time and an important
contributor to today's conventional musical notation [Blu89b]. His achievements
include the perfection of the staff system for music notation and the invention of
solmization (solfège). GUIDO is not primarily focused on conventional music
notation, but has been conceived as an open format, capable of storing musical,
structural, and notational information. GUIDO is split into three consecutive layers:
Basic GUIDO introduces the main concepts of the GUIDO design and makes it
possible to represent much of the conventional music of today. Advanced GUIDO
extends Basic GUIDO by adding exact score formatting and some more advanced
musical concepts. Finally, Extended GUIDO can represent user-defined extensions,
like microtonal information or user-defined pitch classes.
GUIDO Music Notation is designed as a flexible and easily extensible open standard.
In particular, its syntax does not restrict the features it can represent. Thus, GUIDO
can be easily adapted and customized to cover specialized musical concepts as might
be required in the context of research projects in computational musicology. More
importantly, GUIDO is designed in such a way that when such custom extensions are
used, the resulting GUIDO data can still be processed by other applications that
support GUIDO but are not aware of the custom extensions, which are gracefully
ignored. This design also greatly facilitates the incremental implementation of
GUIDO support in music software, which can speed up the software development
process significantly, especially for research software and prototypes.
1. In the following, the terms GUIDO Music Notation and GUIDO are used interchangeably. Both terms describe the format as well as specific files; it will be clear from context what is meant.
8 CHAPTER 2. THE GUIDO MUSIC NOTATION FORMAT
GUIDO has not been developed with a particular type of application in mind, but to
provide an adequate2 representation formalism for score-level music over a broad
range of applications. The application areas include notation software,
compositional and analytical systems and tools, musical databases, performance
systems, and music on the web.
This chapter introduces the most important musical features of GUIDO Music
Notation and then compares GUIDO to other music representation languages,
namely the MIDI File Format, DARMS, MuseData, SCORE, Common Music
Notation, Plaine and Easie Code, MusiXTeX, LilyPond, NIFF, SMDL, and
XML-based representations.
2. The notion of adequateness is a major design criterion of GUIDO. Synonyms for the term “adequate” are: decent, acceptable, all right, common, satisfactory, sufficient, unexceptionable, unexceptional, unimpeachable, unobjectionable. According to Merriam-Webster's Collegiate Dictionary, adequate means “sufficient for a specific requirement”. GUIDO is an adequate representation format because simple musical concepts can be represented in a simple way; only complex musical notions may require more complex representations.
3. Please note that the specified duration may be zero; an event with duration zero can be helpful in exact score formatting, as will be shown in later chapters.
2.1. BASIC GUIDO MUSIC NOTATION 9
[Figure: rendered two-line score of the following GUIDO description6]
[ \clef<"treble"> \key<"D"> \meter<"4/4">
  a1*1/2 b a/4. g/8 f#/4 g a/2 b a/4. g/8 f#/4 g
  a/2 a b c#2/4 d c#/2 b1 a/1 ]
6. These are the first three bars from the Bach chorale “Wie wunderbarlich” from the St. Matthew Passion.
7. It is possible to write [ f { c*1/4, e/8, g/16 } g ]. The GUIDO specification treats this as if [ f { c*1/4, e/4, g/4 } g ] had been written: the longest note of a chord determines the chord duration.
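The chord-duration rule from the footnote above can be stated as a small program. The following Python sketch is an invented illustration (a chord is simplified to a list of pitch/duration pairs, not the thesis's data structures):

```python
from fractions import Fraction

# The rule from the specification excerpt: the longest note of a chord
# determines the chord duration, and shorter members are read as if
# they had that duration.
def normalize_chord(chord: list[tuple[str, Fraction]]) -> list[tuple[str, Fraction]]:
    longest = max(d for _, d in chord)
    return [(pitch, longest) for pitch, _ in chord]

chord = [("c1", Fraction(1, 4)), ("e1", Fraction(1, 8)), ("g1", Fraction(1, 16))]
print(normalize_chord(chord))
# → [('c1', Fraction(1, 4)), ('e1', Fraction(1, 4)), ('g1', Fraction(1, 4))]
```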
[Figure 2.2: rendered four-voice score of the following GUIDO description in poly-normalform, followed by its chord-normalform]
{ [ % soprano
    \clef<"treble"> \key<"b"> \meter<"C">
    b1*1/4 | b b a# f# h c#2 d d e d \fermata( c# ) ],
  [ % alto
    \clef<"treble"> \key<"b"> \meter<"C">
    a1*1/4 | g g# f# c# b0 f#1 f# b c#2 h1 \fermata( a# ) ],
  [ % tenor
    \clef<"g2-8"> \key<"b"> \meter<"C">
    f#1*1/4 | e d c# a#0 e1 e d/8 e f#/4 g f# \fermata( f# ) ],
  [ % bass
    \clef<"bass"> \key<"b"> \meter<"C">
    d#0*1/4 | e e# f# f# g# a# b b a# b \fermata( f# ) ] }
The same music in chord-normalform:
[ { d#0*1/4, f#1*1/4, a, b } |
  { e0, e1, g, b } { e#0, d1, g#, b }
  { f#0, c#1, f#, a# } { f#0, a#, c#1, f# }
  { g#0, e1, b0, b1 } { a#0, e1, f#, c#2 }
  { b0, d1, f#, d2 } { b0, f#1, b, d2 }
  { a#0, g1, c#2, e } { b0, f#1, b, d2 }
  \fermata( { f#0, f#1, a#, c#2 } ) ]
number (-1) or a string “F” to produce the same key signature (F major, one flat).
Abbreviations for tag names exist: the \slur-tag can also be specified as \sl. The
same is true for a variety of tags (\bm for \beam, \i for \intens, and many more).
For a complete set of tag names and their abbreviations, see [HH96].
Range tags (like the \slur-tag in the example) can be specified either by using
round brackets or by using begin- and end-pairs. One of the slurs in the example of
Figure 2.3 is specified using a \slurBegin- and a \slurEnd-tag. Using this
mechanism, it is possible to allow a simple form of overlapping ranges. This feature
is enhanced in Advanced GUIDO (see section 2.2.1).
If a time signature has been set by using the \meter-tag, no bar lines have to be
encoded explicitly. Bar lines can be set explicitly to denote an up-beat. The \bar-tag
is equivalent to the use of “|” (using the \bar-tag it is also possible to explicitly set
bar numbers).
Note that parameter names are optional and can be omitted, as long as all required
parameters are specified. Using parameter names not only increases the readability
of Advanced GUIDO sources (which is beneficial when implementing GUIDO
support), it also makes it possible to specify parameters only partially, and to
assume default values for unspecified optional parameters.
[Figure: rendered score of the following description with three overlapping slurs]
[ \slurBegin:1 g0*1/4 \slurBegin:2 f1
  \slurBegin:3 g2 a0 \slurEnd:1 g1
  \slurEnd:2 f2 \slurEnd:3 ]
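An application reading numbered begin/end tags has to pair them up even when the ranges overlap. The following Python sketch is an invented illustration of that matching step (it works on a pre-tokenized sequence and is not code from the thesis):

```python
# Numbered \slurBegin:n / \slurEnd:n pairs allow overlapping ranges.
# This sketch matches the numbered begin/end tags of one sequence
# into ranges over event indices.
def match_ranges(tokens: list[str]) -> dict[int, tuple[int, int]]:
    """Map each slur id to (first event index, last event index)."""
    open_at: dict[int, int] = {}
    ranges: dict[int, tuple[int, int]] = {}
    event_index = -1
    for tok in tokens:
        if tok.startswith("\\slurBegin:"):
            open_at[int(tok.split(":")[1])] = event_index + 1
        elif tok.startswith("\\slurEnd:"):
            sid = int(tok.split(":")[1])
            ranges[sid] = (open_at.pop(sid), event_index)
        else:
            event_index += 1          # an ordinary event (note or rest)
    return ranges

tokens = ["\\slurBegin:1", "g0*1/4", "\\slurBegin:2", "f1",
          "\\slurBegin:3", "g2", "a0", "\\slurEnd:1", "g1",
          "\\slurEnd:2", "f2", "\\slurEnd:3"]
print(match_ranges(tokens))
# → {1: (0, 3), 2: (1, 4), 3: (2, 5)}
```

The result shows why round brackets alone would not suffice here: slur 2 begins inside slur 1 but ends after it, which strict nesting cannot express.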
Advanced GUIDO adds tag parameters to existing Basic GUIDO tags, which control
the physical dimensions of score pages, the position and size of systems and staves,
as well as the exact location of all graphical elements associated with staves. While
notes, rests, most notational symbols, and text are typically associated with staves,
Advanced GUIDO also supports graphical and text elements, e.g. a title, which can
be placed at specific locations on the page independently of staves.
While the full set of mechanisms and corresponding tags for completely specifying
score layout and formatting cannot be presented here in detail, some main concepts
are highlighted while others are outlined rather briefly. A small set of tags is used to
specify the physical dimensions and margins of score pages (\pageFormat), the
relative positioning of systems on a page (\systemFormat), the type and size of
staves in a system (\staffFormat) and their relative location (\staff), page and
system breaks (\newPage, \newSystem), etc. Spacing and layout are based on
graphical reference points, which Advanced GUIDO specifies for each notational
element. Some elements (such as systems, or composer and title information) are
positioned relative to the page; others are positioned relative to the staves or notes
they are logically associated with.
Generally, for elements associated with a staff, vertical offsets are specified relative
to the size of the staff, using the relative unit halfspace (hs). Likewise, the size of
many elements, such as notes or clefs, is always specified relative to the size of the
staff. The most important mechanism for exactly specifying horizontal locations is
the \space tag. Placed between two notational elements, e.g. two notes, it overrides
any automatic spacing and forces the given amount of horizontal space to separate
the horizontal reference positions of the elements.
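The effect of the relative halfspace unit can be pictured with a little arithmetic. Assuming one hs is half the distance between two adjacent staff lines (the numbers below are examples, not values prescribed by GUIDO):

```python
# Vertical offsets given in halfspaces (hs) scale automatically with
# the staff size: the same hs offset yields a smaller absolute offset
# on a smaller staff.  Illustrative arithmetic only.
def hs_to_mm(offset_hs: float, line_distance_mm: float) -> float:
    """Convert an hs offset to millimetres for a given staff size,
    assuming 1 hs = half the distance between two staff lines."""
    return offset_hs * line_distance_mm / 2.0

# The same 8 hs offset on a normal-sized and on a reduced staff:
print(hs_to_mm(8, 1.875))   # staff with 1.875 mm between lines → 7.5
print(hs_to_mm(8, 1.5))     # smaller staff → 6.0
```

This is why Advanced GUIDO can reuse the same offset values for full-size staves and for small cue staves without any rescaling.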
Advanced GUIDO includes all tags defined in Basic GUIDO, but allows exact
positioning and layout information to be specified using additional optional
parameters. For instance, it is possible to exactly define the slope of a slur, or to
change the size or type of a notehead, whenever this is needed. While automatically
created Advanced GUIDO descriptions10 will usually include complete layout and
formatting information, in many cases this information is only partially specified. In
such cases, an application reading the GUIDO description will try to infer
unspecified information automatically where needed. This is the major goal of the
present thesis: converting an “under-specified” GUIDO description into an
acceptable graphical score; the required steps for this conversion process are
examined in the following chapters.
If specific information is not required or supported by an application, it can easily be
ignored when interpreting the GUIDO input. Note that this approach allows the
adequate representation of scores: only when exact formatting information is really
needed do the corresponding additional tags and parameters have to be supplied.
10. Because Advanced GUIDO can be used as a notation interchange format, completely formatted scores can be saved as complete Advanced GUIDO descriptions by notation programs supporting GUIDO.
% The first page of a BACH Sinfonia, BWV 789
% Most of the element-positions are specified
% using Advanced GUIDO tags. The layout has
% been very closely copied from the URTEXT
% edition by the Henle-Verlag.
% This example has been prepared to show, that
% Advanced GUIDO is capable of exact score-
% formatting.
%
{ [ % general stuff
  \pageFormat<"a4",lm=0.8cm,tm=4.75cm,bm=0.1cm,rm=1.1cm>
  \title<"SINFONIA 3",pageformat="42",textformat="lc",dx=-2cm,dy=0.5cm>
  \composer<"BWV 789",dy=1.35cm,fsize=10pt>
  % indent of 1.05 cm for the first system.
  \systemFormat<dx=1.05cm>
  % voice 1
  \staff<1,dy=0.85cm> % next staff is 0.85 cm below
  \staffFormat<style="5-line",size=0.9375mm>
  % barlines go through whole system
  \barFormat<style="system">
  % measure 1 (voice 1)
  \clef<"treble"> \key<"D"> \meter<"C"> \stemsUp
  \restFormat<posy=-1hs> _/8 \beam( \fingering<text="3",
    dy=8hs,dx=-0.6hs,fsize=7pt>(f#2/16) g)
  \beam( \stemsUp<5hs> a/8
    \fingering<text="2",dx=1hs,dy=11.5hs,fsize=7pt>(
    \stemsUp<8hs> c ) )
  \beam( \stemsUp h1 e2/16 f#)
  \beam( \stemsUp<5hs> g/8 \stemsUp<8.5hs> h1 \stemsUp) \bar
  % measure 2 (voice 1)
  \beam( a1/8 d2/16 e) \beam( f#/8 a1)
  \beam( \stemsUp<12hs> g/16
    \stemsUp \fingering<text="4",fsize=7pt,dy=9hs,dx=-0.2hs>(f#2) e
    \stemsUp<8hs> d \stemsUp)
  \beam( \stemsUp<12hs> c# \stemsUp
    \fingering<text="5",fsize=7pt,dy=9hs>(h) a
    \stemsUp<8hs> g \stemsUp ) \bar
  % measure 3 (voice 1)
  \beam( \stemsUp<5.5hs> f#2/16 \stemsUp e
    \fingering<text="2",fsize=7pt,dy=10hs>( d )
    \stemsUp<5.5hs> e \stemsUp)
  \beam( \stemsUp<6hs> f#
    \fingering<text="1",fsize=7pt,dy=11hs>(e)
    \fingering<text="3",fsize=7pt,dy=11hs>(f#)
    \stemsUp<6.5hs> g# \stemsUp)
  \stemsUp<4.75hs> a/4 \stemsUp
  \slurBegin<dx1=2hs,dy1=3hs,dx2=0hs,dy2=2hs,h=2hs>
  \fingering<text="5",fsize=7pt,dy=6hs>( \stemsUp<4.5hs> e/4 )
  \bar \newSystem<dy=4cm>
  % measure 4 (voice 1)
  \staff<1,dy=0.98cm>
  e/4 \slurEnd
  \tie<dy1=2.4hs,dx2=-1hs,dy2=2.4hs,h=1.75hs>(
    \fingering<text="4 5",dy=8hs,fsize=7pt>( \stemsUp<5hs> d) d )
[Figure 2.5: the resulting engraved first page of Bach's Sinfonia 3, BWV 789]
Figure 2.5 shows an example illustrating Advanced GUIDO's exact formatting
features. The GUIDO code includes page, system, and staff formatting as well as
accurate formatting of slurs. Stem lengths and directions, beam groupings, and ties
are also specified precisely. The complete GUIDO description for the page can be
found in Appendix 6.
[Figure: score excerpt for two percussionists (Perc 1, Perc 2) and piano, notated in a 3+2/8 meter with glissandi in the piano part]
Using the \alter-tag, microtonal alterations of standard pitch classes, such as
quarter-tone accidentals, can be realized; \alter<-0.5>(a1/8) represents an eighth
note a1 altered downwards by a quarter tone. Likewise, different tuning systems can
be represented by two other new tags, \tuning and \tuningMap, which are used to
select predefined tuning systems, such as just tuning, or to specify a tuning scheme
based on systematic alterations of given pitch classes. Finally, concepts such as
quarter-tone scales or absolute pitches can be represented by parameterized events,
which generalize the concept of an event (such as a note or rest) in Basic or
Advanced GUIDO.
[ c2/8 d b1 e2 c e b1 d2
  { [ c2/8 d b1 e2 c e b1 d2 ],
    [ c2/8 e b1 d2 c d b1 e2 ] }
  c2/8 d b1 e2 c e b1 d2 ]
Using GUIDO as a music representation language offers some advantages over other representation languages, which will be described in the following sections. Nevertheless, one issue concerning the grouping structure of GUIDO has to be mentioned: when several sequences are used to construct a piece (i.e. the piece is encoded in poly-normalform; see section 2.1.2), synchronous events appear at different locations in the description. This is a common problem for many music representation languages, especially if they are human-readable. The issue is problematic because such GUIDO descriptions cannot be used for streaming purposes, where a client receives a stream of musical information from a server. In a streaming scenario, the receiving party must (in the worst case) already have received the complete GUIDO description before it can (dis-)play the score or the music. To overcome this disadvantage, a mechanism is needed that converts poly-normalform descriptions into chord-normalform GUIDO descriptions. Currently, one design decision in the Basic and Advanced GUIDO specification prohibits this: as was already mentioned in section 2.1.2, a chord is a collection of single events that all have the same duration. There are many instances where this restriction is too tight. One such instance can be seen in Figure 2.2 on page 11: the tenor voice contains two eighth notes in the second bar; these are omitted in the chord-normalform. To create a streamable GUIDO description, an extended chord-normalform must be defined, which allows chords to contain simple sequences instead of only notes. The details of this still need to be specified exactly.
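For the restricted case that the current specification does support (all voices carrying events of equal duration at every step), the poly- to chord-normalform conversion can be sketched as follows; the data layout and function are illustrative, not part of any GUIDO tool:

```python
# Sketch of the poly- to chord-normalform conversion discussed above,
# for the restricted case the current specification supports: at every
# step, all voices must carry events of the same duration. Durations
# are fractions of a whole note; pitch names are plain strings.

from fractions import Fraction

def poly_to_chords(voices):
    """voices: list of lists of (pitch, duration) with aligned durations."""
    chords = []
    for events in zip(*voices):
        durations = {d for _, d in events}
        if len(durations) != 1:
            # Exactly the restriction mentioned in the text: a chord may
            # only contain events of equal duration.
            raise ValueError("voices are not duration-aligned")
        chords.append(([p for p, _ in events], durations.pop()))
    return chords

soprano = [("g2", Fraction(1, 4)), ("a2", Fraction(1, 4))]
alto    = [("e2", Fraction(1, 4)), ("f2", Fraction(1, 4))]
print(poly_to_chords([soprano, alto]))
# first chord: (["g2", "e2"], Fraction(1, 4))
```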
Another issue, closely related to the streaming of GUIDO descriptions, is the editing of complex GUIDO descriptions: in a complex piece containing multiple voices, a single musical time position is present up to n times (where n is the number of sequences). Therefore, changing the musical content at one time position may require editing the GUIDO description at n different positions in the file. This is a common problem that can be found with other representation formats as well; the only solution for a text-based format would be the use of a two-dimensional spreadsheet, which would basically look like a two-dimensional, textual representation of the graphical score. While this approach may solve the problems concerned with editing, it requires somewhat complex formatting even for simple pieces. To
2.5. OTHER MUSIC REPRESENTATION LANGUAGES 21
overcome the issue in the case of GUIDO, a specialized GUIDO editor needs to be created, which is aware of the different voices and keeps track of mapping time positions to individual voice positions.
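The core of that bookkeeping can be sketched in a few lines; the representation of voices as plain duration lists is an assumption of this sketch:

```python
# Sketch of the bookkeeping a voice-aware GUIDO editor would need: an
# index from musical time position to the event index in every voice,
# so that one musical moment can be located at all n places in the file.
# Voices are given as duration lists (fractions of a whole note).

from fractions import Fraction

def time_index(voices):
    """Map each onset time to a dict {voice number: event index}."""
    index = {}
    for v, durations in enumerate(voices):
        t = Fraction(0)
        for i, d in enumerate(durations):
            index.setdefault(t, {})[v] = i
            t += d
    return index

q, e = Fraction(1, 4), Fraction(1, 8)
idx = time_index([[q, q], [e, e, q]])
# The second quarter of voice 0 and the third event of voice 1 share onset 1/4.
assert idx[Fraction(1, 4)] == {0: 1, 1: 2}
```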
2.5.1 MIDI
MIDI (Musical Instrument Digital Interface) was originally developed as a protocol for the communication between digital musical instruments (primarily synthesizers). Therefore, MIDI has a strong bias towards piano keyboards and mainly records keyboard events (keys being pressed and released) together with other control parameters. The MIDI file format [HwDCF+97] offers the means to store this protocol information in a file. In order to be an efficient, streamable format, MIDI data is transmitted and stored as a stream of bytes, which is not directly human-readable.12 MIDI is strongly performance-oriented and was never intended to be used as a general music representation language. The usage of MIDI for the interchange
12 Different MIDI formats exist; when using format 1, different tracks are placed sequentially in a file; therefore, format 1 MIDI files are not suitable for streaming.
22 CHAPTER 2. THE GUIDO MUSIC NOTATION FORMAT
of musical data is therefore somewhat limited: “Standard MIDI Files have the disadvantage that they represent relatively few attributes of music” [SF97a], page 24. Standard MIDI files cannot, for example, distinguish between D# and Eb, because these notes are realized by pressing the same key on a piano keyboard. There is also no mechanism for explicitly representing rests within MIDI files. Several extensions to MIDI have been suggested to overcome the various limitations of MIDI for a general musical and notational information interchange.13 So far, none of these extensions has been widely accepted; therefore, the basic problems remain.
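The enharmonic problem can be made concrete with a few lines of Python; the name-to-key mapping is standard MIDI semantics (middle C = 60), while the note-name convention used here is a simplification for the example, not GUIDO syntax:

```python
# Sketch of why MIDI cannot distinguish D# from Eb: both spellings map
# to the same key number, so the spelling is lost in the file.

BASE = {"c": 0, "d": 2, "e": 4, "f": 5, "g": 7, "a": 9, "b": 11}

def midi_key(letter: str, accidental: int, octave: int) -> int:
    """MIDI key number; octave 4 is the octave of middle C."""
    return 12 * (octave + 1) + BASE[letter] + accidental

d_sharp = midi_key("d", +1, 4)
e_flat = midi_key("e", -1, 4)
assert d_sharp == e_flat == 63  # the distinction is gone in MIDI
```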
Regardless of the shortcomings of MIDI as a music interchange language, millions of MIDI files exist today. Almost every computer music application supports MIDI either for importing or exporting musical data in a commonly understood format. MIDI is one of the most widely used music file formats in the realm of score-level representation. Everything that can be represented in a MIDI file can also be represented within a GUIDO description. Because GUIDO allows arbitrary event durations, there is no loss of information when directly converting MIDI into GUIDO.14 This does not hold for the inverse direction: in most cases, information is lost when converting from GUIDO to MIDI. The reason for this lies in the fact that much of the musical information contained in a GUIDO description simply cannot be explicitly represented using MIDI (e.g. slurs, articulation, etc.).
The complexity of converting MIDI files into graphical scores very much depends on the recorded MIDI data: if the notes are quantized (which means that all note onsets and durations are mapped onto a predefined set of allowed time positions), and the separation of simultaneous voices into several tracks has been carried out, then a conventional score can be created quite easily. Of course, many issues and tasks remain, for example deciding which voice to print in which staff of a piano score. Another dissertation dealing with many of the complex problems of this domain is currently being written by Jürgen Kilian and will be published within the next year [Kil02].
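The quantization step itself can be sketched as a simple grid-snapping operation; real quantizers, such as the one developed in [Kil02], are far more sophisticated, but the principle is this:

```python
# Sketch of onset quantization: each recorded onset (here in seconds)
# is snapped to the nearest point of a fixed grid of allowed time
# positions.

def quantize(onsets, grid):
    """Snap every onset to the nearest multiple of `grid`."""
    return [round(t / grid) * grid for t in onsets]

recorded = [0.02, 0.27, 0.49, 0.74]       # slightly imprecise performance
print(quantize(recorded, 0.25))           # -> [0.0, 0.25, 0.5, 0.75]
```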
The inverse process (producing MIDI files from GUIDO descriptions) is straightforward and has been implemented in [Fri00]. As stated above, most of the explicit musical markup is lost, although it is used for creating a musically “sensible” performance (for example, a \slur-tag in the GUIDO description changes the played duration of the notes in the MIDI file, resulting in a legato playback).
Conclusion The MIDI file format is widely used for representing score-level music but has significant drawbacks when being used as a notation interchange format. Because of the different underlying ideas, a direct comparison of MIDI and GUIDO does not yield interesting results. All information present in a MIDI file can be represented in GUIDO; the inverse does not hold.
13 See pages 73–108 in [SF97a].
14 A direct conversion from (unquantized) MIDI into GUIDO without any “intelligent” processing generally does not produce usable information (at least for music notation purposes).
2.5.2 DARMS
DARMS stands for Digital Alternate Representation of Musical Scores [SF97b]. Its development started as early as 1963, and it is still used as a music (and mainly score) representation language today, although currently no “modern” application for reading or writing DARMS exists. Because DARMS was developed to allow complete scores to be coded with a generic computer keyboard, the resulting code is human-readable.15 A multitude of DARMS dialects exists, each specified for special extensions (such as lute tablature). All dialects are closely related to the so-called Canonical DARMS, which was first described in 1976. The term “canonical” was deliberately chosen, meaning that the encoding of single elements of a score is unambiguous: there is only one way of encoding notes, rests, etc.
One major difference between DARMS and GUIDO lies in the fact that in DARMS, pitch is represented by a number which encodes a position on a staff (and thus depends on the current clef) rather than being represented explicitly as in GUIDO. The note “e” in the first octave would be represented by “1” if a treble clef was used, by “7” for an alto clef, and by “13” for a bass clef, because the position on the staff lines changes. Figure 2.8 shows the GUIDO and DARMS coding of a small musical fragment together with the matching graphical score.
DARMS (1): !G 1Q !C 7Q !F 13Q
DARMS (2): !G 1Q !C 7 !F 13
GUIDO (1): [ \clef<"treble"> e1/4 \clef<"alto"> e \clef<"bass"> e ]
GUIDO (2): [ \clef<"g"> e1/4 \clef<"c"> e \clef<"f"> e ]
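The position arithmetic behind Figure 2.8 can be sketched as follows. The bottom-line reference pitches are inferred from the figure (treble: e1, alto: f0, bass: g-1 in GUIDO’s octave numbering), and the convention that the bottom staff line is position 1 is taken from the example, not from the DARMS specification:

```python
# Sketch of the DARMS pitch encoding shown in Figure 2.8: a pitch is
# stored as a staff position relative to the current clef.

LETTERS = "cdefgab"
BOTTOM_LINE = {"treble": ("e", 1), "alto": ("f", 0), "bass": ("g", -1)}

def diatonic(letter: str, octave: int) -> int:
    return 7 * octave + LETTERS.index(letter)

def darms_position(letter: str, octave: int, clef: str) -> int:
    ref = BOTTOM_LINE[clef]
    return diatonic(letter, octave) - diatonic(*ref) + 1

# e1 as in Figure 2.8: 1 / 7 / 13 depending on the clef.
assert [darms_position("e", 1, c) for c in ("treble", "alto", "bass")] == [1, 7, 13]
```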
One important similarity between DARMS and GUIDO is the carry feature: in DARMS, the duration or the pitch of a note is carried over to the following note. This feature is used in the second DARMS encoding (denoted DARMS (2)) of the previous example (Figure 2.8): the duration (here a quarter note) is kept for all following notes. If the pitch stays the same and the duration changes, it suffices to encode the new duration without repeating the pitch (which is actually just the line number for the current staff). If neither the duration nor the pitch changes, either pitch or duration can be used for encoding the repeated note.
In GUIDO, the carry feature does not work for pitch information, as pitch information is explicit for every note. The carry feature does work for the duration and for the register (the octave); this makes hand-coding of musical examples fairly fast.
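GUIDO’s duration and octave carry can be sketched as a small expander; the tokenizer below is deliberately minimal (plain note names, optional octave, optional /duration) and ignores accidentals, dots, and tags:

```python
# Sketch of GUIDO's carry feature for duration and register: omitted
# octave and duration values are filled in from the previous note.

import re

TOKEN = re.compile(r"([a-g])(-?\d+)?(?:/(\d+))?$")

def expand(tokens, octave=1, denom=4):
    """Return (letter, octave, denominator) per note, carrying omitted values."""
    out = []
    for tok in tokens:
        letter, oct_s, dur_s = TOKEN.match(tok).groups()
        if oct_s is not None:
            octave = int(oct_s)
        if dur_s is not None:
            denom = int(dur_s)
        out.append((letter, octave, denom))
    return out

# "c1/4 d e/8 f" -- the octave 1 and the durations carry over:
assert expand(["c1/4", "d", "e/8", "f"]) == [
    ("c", 1, 4), ("d", 1, 4), ("e", 1, 8), ("f", 1, 8)]
```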
In DARMS, so-called push codes indicate non-printing rests; a similar feature is available in GUIDO. Here, the non-printing rests are called “empty” and behave
15 In order to save memory and to optimize performance, DARMS uses single characters to encode musical information; K stands for key signature, G and F are the treble and bass clef. Even though the resulting code is human-readable, the reader must know the meaning of the single characters in order to understand the encoded music.
just like normal events without being visible in the final score.
The concept of grouping several elements horizontally by using a comma (“,”) and whitespace (“ ”) for vertical placement is similar in DARMS and GUIDO.16
DARMS and GUIDO both encode music by storing several voices (called “parts” in DARMS and “sequences” in GUIDO) sequentially, which are then stacked horizontally.
One rather advanced feature of DARMS called “linear decomposition” allows encoding complex parts of a score in multiple passes, which is equivalent to “going backwards” in time. This feature is not directly available in GUIDO, but as GUIDO descriptions have no limitation on the number of contained voices, “linear decomposition” can easily be simulated by adding voices containing “empty” events together with the notes and rests that encode the relevant area of the score. Figure 2.9 shows a short fragment where linear decomposition (encoded in DARMS using “!& ... & ... $&”) is used for the last three notes. The equivalent GUIDO description is also shown: here, two voices are required.
DARMS:
!I1 !G -1QU 2QU !& 3EU( 2EU) / 2HU //
GUIDO:
{ [ \staff<1> \clef<"g2"> \stemsUp c1/4 e \beam( f/8 e ) | e/2 ] ,
  [ \staff<1> \stemsDown empty/2 \beam( d/8 c ) | c/2 ] }
Conclusion DARMS is a very well thought-out encoding scheme for music notation – especially considering that it was created in the 1960s. Because of the compressed nature of the encoding format, DARMS looks rather cryptic and cannot be understood intuitively. Also, encoding pitch information by positions on staff lines does not offer an advantage over explicitly encoding pitch information for each note. All information from a DARMS description can be encoded in GUIDO.
2.5.3 MuseData
The MuseData format has been developed by Walter B. Hewlett since 1982 at the Center for Computer Assisted Research in the Humanities (CCARH) [Hew97]. It is strongly focused on encoding the logical aspects of musical information, meaning that MuseData represents what can be thought of as essential musical information, which in turn can be (re-)used for analytical and notational purposes. Up to now, many classical works (by J. S. Bach, Beethoven, and many others) have been encoded as MuseData files. Some of the encoded works have also been published as scores
16 In order to allow an arbitrary amount of whitespace in the case of horizontal groupings (e.g. chords), GUIDO additionally uses curly and straight braces (“{”, “}”, “[”, “]”).
which are sold commercially. One interesting aspect of MuseData is that – similar to GUIDO – the encoded information does not need to be complete, in the sense that exact formatting information for notational elements need not be encoded; so-called print suggestions can be used to guide notation programs in producing exactly formatted scores from a MuseData file.
MuseData files are text-based, human-readable files. Different from many other text-based representation languages, the formatting within the file is an integral part of the represented music: each line is treated as a record, containing either header information (like the title and composer of a piece) or musical information. The musical attributes record contains attributes which usually pertain throughout the file (key signatures, time signatures, clef signs, etc.). So-called “data records” encode musical information (such as notes, rests, etc.) by using different predefined letters and numbers placed at specific column positions. This encoding makes it easy to find a specific location within a MuseData file, but additional overhead is required to ensure the correct encoding of a piece.
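The record-oriented, column-based style can be sketched as follows; the column layout used here is purely illustrative (the actual record definition is given in [Hew97]):

```python
# Sketch of fixed-column record parsing in the style of MuseData data
# records: fields are identified by column position, not by delimiters.
# The layout below (pitch name in the first columns, duration in
# divisions further right) is invented for the illustration.

def parse_record(line, pitch_cols=(0, 4), dur_cols=(5, 8)):
    """Extract the fields of one data record by column position."""
    pitch = line[pitch_cols[0]:pitch_cols[1]].strip()
    duration = int(line[dur_cols[0]:dur_cols[1]].strip())
    return pitch, duration

record = "D4     4"
# columns 0-3 hold the pitch, 5-7 the duration in divisions:
assert parse_record(record) == ("D4", 4)
```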
Figure 2.10 shows an example MuseData encoding together with the matching GUIDO description and the respective score. The header information is omitted in the MuseData encoding in order to concentrate on the representation of musical parameters. The first line encodes the key signature (2 sharps), the divisions per quarter note (4), the time signature (0/0), and the clef signs for the first and the second staff (4 is a treble clef, 22 is the bass clef). In the following lines, the notes and rests are encoded. A very powerful feature within MuseData is the “back” command, which is used to “rewind” the time. This is used in the example to print two simultaneous voices within the same part. This feature is similar to the “linear decomposition” feature of DARMS. The note names in the MuseData example can easily be identified (“D4”, “G3”, etc.). In MuseData, the (musical) duration of a note or rest does not necessarily have to match the notated duration; this is an important feature for a complete music representation language, because it allows the representation of uncommon durations without producing an unreadable score. A detailed description of the column-oriented formatting features of MuseData can be found in [Hew97]. For a comparison of MuseData and GUIDO it suffices to know that MuseData offers a large variety of formatting instructions commonly used in conventional scores.
While there are quite some similarities between GUIDO and MuseData, the most important difference is the restrictive formatting of MuseData files. While this may be an advantage when navigating within a MuseData file, it results in quite an amount of overhead when producing the files. Because MuseData and GUIDO were both created as complete music representation languages, they share the notion of encoding musical content rather than music notation. Both formats offer exact formatting features, which may be ignored by applications. The “back” feature of MuseData has no direct equivalent in GUIDO, but, as was shown in Figure 2.9, it can be simulated quite easily. As the MuseData file format is very well documented and a large number of MuseData files exist, a converting application
2.5.4 SCORE
SCORE is a commercial music notation system available for DOS and Windows. It has been developed since the 1980s by Leland Smith [Smi97]. Input to SCORE is encoded as an ASCII file, which contains a description of notes, time positions, and musical markup (like articulation and slurs).17
SCORE was created with high-quality publishing of musical scores in mind; it is therefore very strongly graphically oriented: one input file always describes exactly one page of a score. Many formatting decisions are made automatically by SCORE, but it is also possible to specify the exact location of all graphical elements through the use of parameter files. It is even possible to include arbitrary PostScript
17 In the following, the term SCORE will be used to describe both the input file format and the notation system. It will be clear from the context what is meant.
symbols and files in a score, making it an ideal tool for contemporary music publishing.
Because SCORE concentrates so much on the visual aspects of a score, the file format itself is not very useful as a general music representation language, although logical information for a piece can be extracted. Nevertheless, SCORE is a very successful notation program used by many publishers for high-quality musical typesetting. The notation algorithms implemented in SCORE are among the finest available and include automatic formatting of various parts of a score. Unfortunately, the developer has not published anything on the implemented algorithms; therefore, a direct comparison to other published notation algorithms cannot be made.
Even though SCORE input files are ASCII files, they follow a somewhat non-intuitive approach by first specifying the pitch of all notes, followed by the respective durations, followed by articulations, and finally ending with the description of beams and slurs.18
Conversion from SCORE to GUIDO would probably be possible; at least the basic data (like pitch, octave, most articulations, beaming, and slurs) could easily be converted to their respective GUIDO descriptions. From [Smi97] (p. 279): “[SCORE’s] strongly graphical nature produces occasional obstacles to analysis. For example, the differentiation of ties and slurs [...] must be inferred from other information.”
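The “parallel” input style can be illustrated schematically; the item codes below are invented for the example and are not actual SCORE syntax – the point is only that the k-th entry of every list describes the k-th note:

```python
# Sketch of parameter-list input in the style described above: pitches
# for all notes first, then the durations, then the articulations.

def combine(pitches, durations, articulations):
    """Zip the parallel parameter lists back into per-note records."""
    if not (len(pitches) == len(durations) == len(articulations)):
        raise ValueError("parameter lists must have equal length")
    return list(zip(pitches, durations, articulations))

notes = combine(["C4", "D4", "E4"], ["Q", "Q", "H"], [".", "", ">"])
assert notes[2] == ("E4", "H", ">")
```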
without accidentals, because no accidentals – besides the key signature – are visible. The encoding is not concerned with the sounding pitch but stores the visible pitch, ignoring the accidentals. When a melody needs to be written with another key signature, all following notes must be updated to reflect the change. This greatly differs from GUIDO and other formats, where the sounding pitch is encoded.
2.5.7 MusiXTeX
MusiXTeX [TME99] is part of a whole family of TeX-based music typesetting systems. It is a set of TeX macros to typeset orchestral scores and polyphonic music. It does not decide on aesthetic problems in music typesetting; therefore, the process of “coding MusiXTeX might appear to be (...) awfully complicated (...)”.19 The manual states right at the beginning: “If you are not familiar with TeX at all I would recommend to find another software package to do musical typesetting.” It is therefore not intended to be easily used by the average computer user. Nevertheless, being a non-commercial open-source development, MusiXTeX could be used as the underlying framework for a user-friendly notation system, although this would require a lot of work.
The basic idea of MusiXTeX is to describe a score as a textual description, which is then converted into a graphical score. As indicated above, MusiXTeX does not make formatting decisions: it is up to the user to provide the system with parameters for every little formatting detail (for example, the slope of a beam). Because of its graphical nature, MusiXTeX contains a very large number of musical symbols, but it is not very useful as a general music representation format. Because MusiXTeX is based on (plain) TeX, a score description encoded in MusiXTeX can be created and read
19 Citation taken from the MusiXTeX manual [TME99].
\Notes\ibu0f0\qb0{cge}\tbu0\qb0g|\hl j\en
\Notes\ibu0f0\qb0{cge}\tbu0\qb0g|\ql l\sk\ql n\en
\bar
\Notes\ibu0f0\qb0{dgf}|\qlp i\en
\notes\tbu0\qb0g|\ibbl1j3\qb1j\tbl1\qb1k\en
\Notes\ibu0f0\qb0{cge}\tbu0\qb0g|\hl j\en
\end{extract}
\end{music}
2.5.8 LilyPond
GNU LilyPond20 [NN01, MNN99] is another musical typesetting system based on TeX. Differing from MusiXTeX, LilyPond does make formatting decisions. The whole LilyPond system works very similarly to the GUIDO Notation Engine:21 a text file is converted into an internal score representation, which contains the exact location of each notational element that together constitute the graphical score. LilyPond uses TeX for the actual typesetting, whereas the GUIDO Notation Engine uses the so-called GUIDO Graphics Stream to describe which elements to put where.
Even though the name LilyPond encompasses the whole system for typesetting music, the name is also used to describe the input format.22 In the following, it should be clear from the context whether the complete system or just the input file format is meant.
20 LilyPond is distributed under the GNU Public License [GNU]; in the following, the term LilyPond will be used as a synonym for GNU LilyPond.
21 The GUIDO Notation Engine and the GUIDO Graphics Stream are described in detail in chapter 5.
22 In earlier versions of the LilyPond documentation, the name Mudela (short for “Music Description Language”) was used as the name for the input format. This name no longer seems to be used in current versions.
The encoding of musical data in LilyPond does not only address typesetting issues but also performance aspects of a musical score.23 A LilyPond description is human-readable plain text, which can be intuitively understood (as long as simple examples are encoded) due to the use of common musical names (for instance, note names like “c, d, e”, and instructions like “key”, “meter”, “time”, and the like). In this respect, LilyPond is very similar to GUIDO. LilyPond differs from GUIDO in that it allows the usage of variables and the definition of functions, which can change the default behavior of the notation system. These features make LilyPond more of a “programming language” than just a music representation formalism. This greatly enhances the expressive power of the language, but it also makes it more difficult to process or convert LilyPond to other formats.
As LilyPond clearly addresses users who are comfortable with specifying their musical input as text, great care has been taken to make the encoding of music as comfortable as possible. One nice feature is a relative mode for note entry, in which the octave of a note is interpreted with respect to the preceding note by taking the shortest interval; this eliminates the need to explicitly switch octaves too often, a common source of encoding errors when using GUIDO.
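The “shortest interval” rule can be sketched as follows; for simplicity, the interval is measured in diatonic steps, and a tie is resolved downwards, which is an arbitrary choice of this sketch (not necessarily LilyPond’s):

```python
# Sketch of relative-mode octave inference: each note is read in the
# octave that makes the interval from the preceding note smallest.

LETTERS = "cdefgab"

def relative_octaves(letters, start_octave=1):
    """Assign octaves so that each step to the next note is the shortest."""
    result = []
    prev = None
    for letter in letters:
        if prev is None:
            octave = start_octave
        else:
            p_letter, p_oct = prev
            # candidate octaves around the previous note
            octave = min(
                (p_oct - 1, p_oct, p_oct + 1),
                key=lambda o: abs(
                    (7 * o + LETTERS.index(letter))
                    - (7 * p_oct + LETTERS.index(p_letter))),
            )
        result.append((letter, octave))
        prev = (letter, octave)
    return result

# c, then b: the nearest b is a second below, i.e. in the octave below.
assert relative_octaves(["c", "b"]) == [("c", 1), ("b", 0)]
```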
The LilyPond system converts the input file into a TeX description, exactly specifying where each glyph24 has to be put on a page. It is left to TeX to actually create the graphical output (either a DVI file or PostScript output). In order to make musical typesetting decisions, LilyPond accesses the available font information (the width and height of individual glyphs). The musical font used – called feta – is distributed freely with the system.
The LilyPond system can be extended by advanced users. Because the input is interpreted and can contain complete functions (using Scheme, a functional programming language similar to Lisp), the LilyPond system is a very versatile and powerful notation system. One problem of LilyPond is its dependency on a variety of other programs and tools: a complete installation of TeX, GUILE, Scheme, Python, etc. is required to run LilyPond. The system setup is rather complicated and can usually not be carried out by people who are only interested in the music. The targeted user of LilyPond is the computer-literate, probably already using TeX for typesetting. It is not intended as a replacement for commercially available notation packages, and it does not offer a graphical frontend.
Finally, LilyPond is work in progress. It is constantly being enhanced, and new features are added. Being an open development, it will probably attract more programmers and users in the near future.
Comparing the LilyPond input format to GUIDO is not easy, because the intended use is different. LilyPond strongly focuses on notational issues, while GUIDO is meant to be a general music representation language. By being extremely flexible,
23 At the time of this writing, the only performance aspect that can be manipulated is the tempo of playback when converting a LilyPond file into a MIDI file.
24 A glyph is a single element of a font; most musical symbols are constructed by combining several glyphs.
LilyPond:
\score {
  \notes {
    \clef violin
    e4
    \clef alto
    e4
    \clef bass
    e4
  }
  \paper { }
}
2.5.9 NIFF
The Notation Interchange File Format (NIFF) [Gra97, Bel01] is a binary format that was meant to be used by the majority of commercial and non-commercial music notation programs as a commonly agreed format for exchanging music notation. The specification of the format was completed in 1995. NIFF is based on the Resource Interchange File Format (RIFF), a data structure defined by Microsoft (and also used in WAV files, for example).
The developers of NIFF, among them the major software companies dealing with music notation, envisioned a complete and universally usable description formalism that would allow the description and interchange of a large variety of scores. Therefore, the actual specification grew rather large, resulting in very few complete implementations of the format. As major sponsors dropped their support, there is currently no further development of the original format (see [Bel01]); the format is nonetheless completely specified and can be used free of charge by anyone interested. A software development kit (SDK), which allows reading and writing of NIFF files, is available for free [Bel01].
One major difference between NIFF and GUIDO lies in the fact that NIFF is a binary format, which requires specialized software for reading and writing a file. Another difference is the lack of adequacy in NIFF: to encode something as simple as a scale, the complete layout of the score has to be specified as well. This makes an intuitive use of NIFF almost impossible, which is not necessarily wrong by itself, as NIFF was never intended to be an easy-to-use representation formalism. Nevertheless, a direct comparison of GUIDO and NIFF is not a fruitful endeavor. However, the NIFF developers have identified and addressed many of the commonly understood problems of representing music notation. Their effort is a valuable resource for others working in this field.
“SMDL can represent sound, notation in any graphical form, and processes of each that can be described by any mathematical formula.”25 The design of SMDL is based upon several domains: the logical domain, which is also called “cantus”, is used for music representation, while the visual domain is responsible for graphical aspects, like for instance one or several images of a score. Another domain, the “gestural domain”, covers performance aspects of a piece. The general idea is to allow one piece of music in the logical domain to have many score representations in the visual domain and also different performances in the gestural domain. The most important musical representation aspects occur in the logical domain. Pitch can be represented either as a frequency or by using names, which are associated with frequencies by means of a previously defined lookup table. Duration may be specified using either real time or “virtual” time (which just means that the durations of events are defined relative to each other; conventional scores use virtual time). The use of virtual time is supported by HyTime, which is used to map the virtual time of an event to a real duration based upon tempo information. Any articulation and dynamic information can be encoded in SMDL.
The syntax of SMDL follows the well-known syntax of SGML, which is similar to the syntax of HTML. An important aspect of SMDL is the fact that the code was not devised to be easily human-readable; it is supposed to be used as an interchange format, which is read and written by computer programs. The overhead required for representing even simple music prohibits showing an example here; a rather large SMDL description can be found in [SN97]. One specific goal of SMDL is the ability to represent any item found in any other music representation code.
SMDL has not been widely accepted by music software developers so far. One of the reasons for this may be the fact that it is difficult for potential users and tool developers to see how SMDL might apply to their particular application. SMDL’s stated goal to cover everything found in any other music representation code makes the format very broad – a rather large overhead is required even when simple music is represented.
Because of its growing popularity, XML-based formats are being created in many domains (such as mathematics, chemistry, and music). Recently, a number of music representation languages based on XML have been presented [CGR01]. XML by itself is a meta-format, which mainly defines the syntax of a document. The semantics of an XML document is defined by a so-called Document Type Definition (DTD). The main work for any developer of an XML-based music representation language is therefore the creation of a DTD.
25 [SN97], page 470.
One major advantage of XML over other representation formalisms is the fact that a growing number of software tools exists for validating and creating XML documents. This does not solve any of the problems found when representing logical aspects of music or when converting a (logical) music representation into a graphical score, but it helps in validating the documents (the syntax and the conformance to the DTD can be checked without knowledge of the document domain).
At the time of this writing, a dozen different XML-based formats for music representation have been proposed; some of them are prototypical, while others cover almost all aspects required for general music representation. Because XML is a hierarchical format, an XML developer has to define which elements of music can be described by using hierarchies (for example, one piece contains many voices, one voice contains many measures, a measure contains chords, a chord contains notes, and so on). One interesting feature of XML is a grouping construct based on identifiers: an entity can be made part of a group using this mechanism. Some XML development tools are able to validate these grouping mechanisms (for example, to ensure that all identifiers are unique). As an example, consider a beam element that carries a unique identifier, which can then be used by the notes belonging to the particular beam group.
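A check of this kind can be sketched with a few lines of Python; the element and attribute names (beam, note, id, ref) below are invented for illustration, since each XML-based music format defines its own names in its DTD:

```python
# Sketch of the identifier-based grouping check mentioned above: an XML
# tool (here just ElementTree plus a few lines) verifies that group
# identifiers are unique and that every reference points to a group.

import xml.etree.ElementTree as ET

DOC = """
<measure>
  <beam id="b1"/>
  <note pitch="c2" dur="8" ref="b1"/>
  <note pitch="d2" dur="8" ref="b1"/>
</measure>
"""

def check_ids(xml_text):
    root = ET.fromstring(xml_text)
    ids = [e.get("id") for e in root.iter() if e.get("id") is not None]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate group identifier")
    # every reference must point to an existing group
    for e in root.iter():
        ref = e.get("ref")
        if ref is not None and ref not in ids:
            raise ValueError("dangling reference: " + ref)
    return ids

assert check_ids(DOC) == ["b1"]
```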
As each XML-based music representation language defines a different DTD, there is no general structure common to these encodings. One thing that is common to all formalisms is the descriptive richness of the formats. For example, there is a <note .../> construct in almost every XML-based format; this results in rather long encodings even for simple pieces.
Because XML is a syntactic rather than a semantic formalism, the representation of music has no ideal or natural match within XML. To demonstrate that XML per se (as opposed to one distinct XML-based representation) has no advantage over GUIDO, a DTD and converters for an XML-based music representation based on GUIDO, called GUIDOXML, were developed. A GUIDO description can simply be converted into the one-to-one corresponding GUIDOXML description and vice versa. Figure 2.15 shows a GUIDO description and the matching GUIDOXML code. Note that both descriptions of Figure 2.15 contain the same information. Because the conversion in both directions is almost “trivial”, it can be argued that GUIDO itself is a suitable basis for an XML music representation formalism.
XML-based music representation languages are still quite new; the future will show whether any of the proposed formats will be accepted and widely used.
disclosed to the public and which quite frequently change whenever a new version
of the program is released.26 Finale offers a plugin-interface, which can be used
to extract musical information from a file, but this technique requires a running
version of the program (for which a purchased license is needed). The same holds for
Sibelius. Therefore, the process of converting the files directly requires some degree
of reverse engineering. Most available conversion programs,27 which either write or
read the commercial formats, rely on limited documentation of the file formats and
only extract partial information from the files.
Even though a large number of scores encoded in Finale or Sibelius exists, these
file formats can not be regarded as general music representation languages. As long
as the underlying formats are not made publicly available, the data contained within
must first be converted into some "open" format before it can be truly interchanged
with other applications. Currently, the producers of commercial notation software
seem to have no interest in an open interchange format.
26 Finale does offer limited information on its format (called Enigma); the information is not complete and changes with every new release.
27 For example, there exists a converter from Finale files to LilyPond.
2.6 Conclusion
In this chapter, GUIDO Music Notation was described in detail, and features and
examples for each of the three layers of GUIDO (Basic, Advanced, and Extended
GUIDO) were given. Then, other music representation languages were compared to
GUIDO and their respective advantages and disadvantages were elucidated. Each
of the music representation formalisms presented in this chapter has a distinct area
of application – there is no general format that covers all conceivable aspects of
music representation and is easy to use at the same time. Because of their different
descriptive focus – some formalisms represent musical ideas better than notational
ones and vice versa – the choice of music representation language depends on the
task at hand. Many of the formalisms described above concentrate strongly on the
graphical appearance of a score (e.g. SCORE, MusiXTEX, and others). These
representations are definitely suited for a music notation system, but they lack the
power to completely represent the musical content of a piece. As SMDL clarified, a
score is just one possible view of a piece. Just as different editions (or extractions)
may exist, other views (like performance-oriented views) may exist. Therefore, the
focus on notational aspects alone is not sufficient for a general music representation
format. MuseData focuses on the logical domain of music representation, while still
being able to cover aspects of exact score formatting. While this approach is somewhat
similar to GUIDO, it has drawbacks because of its very strict formatting rules.
MuseData is not easily extensible, and it is also focused on music from the classical
and romantic periods. Music without a regular measure structure (either lacking a
time signature or having different time signatures for different voices) can not be
easily encoded in MuseData. Nevertheless, MuseData is a complete music
representation formalism.
There are many reasons for choosing GUIDO as the underlying music representation
language for a music notation system. First of all, GUIDO descriptions are easy
to create and parse. Secondly, the possibility to represent exact score-formatting
information within a GUIDO description is of great advantage – all algorithms of
the notation system can be described as GUIDO-to-GUIDO transformations. But,
as GUIDO primarily describes musical content, the formatting aspect within a
GUIDO description is just additional information. The fundamental musical structure
of a piece is always closely related to the structure of the GUIDO description.
Additionally, tools for removing all formatting information from a GUIDO description
exist; therefore, the essential musical information can be easily extracted.
Because GUIDO is such a flexible representation language, the conversion of
arbitrary GUIDO descriptions into conventional scores can be complicated: just consider
non-standard note durations (like c*11/14), or a complex piece given without exact
formatting instructions. The algorithms required for converting such GUIDO
descriptions into acceptable conventional scores will be presented in the following
chapters.
Chapter 3
A Computer Representation of
GUIDO Music Notation
Introduction
In chapter 2 it was shown that GUIDO is a powerful general music representation
language, which is not only suitable for representing musical structure but encodes
notational aspects as well. As GUIDO is a text-based format, applications that
work on GUIDO descriptions require an internal storage format in order to
efficiently access the represented musical and notational data.1 This storage format
generally differs depending on the application – a music notation application has
other requirements than a musical analysis tool. In the context of music notation,
great care must be taken when defining the internal representation: as will
be shown in the following chapters, music notation can be defined as the
transformation of under-specified GUIDO descriptions into well-defined GUIDO
descriptions. The under-specified GUIDO description does not contain complete
score-formatting information, which is automatically added by the music notation
system. These transformations require a powerful internal representation, as many
of the notation algorithms are carried out on this structural level. A well-structured
inner representation of GUIDO is not only valuable in the context of music notation,
but can be used for a broad range of applications. It should be evident that all
notationally relevant information present in a GUIDO description is encoded in the
implemented inner representation.
The work presented here did not start from scratch. Other applications using
GUIDO have been developed before this thesis was written.2 Nevertheless, most
1 Storing GUIDO descriptions internally as textual representations would require continuous re-parsing of the strings, which would be inefficient.
2 One major application certainly was the SALIERI system [HKRH98]; this comprehensive computer music system incorporates a musical programming language whose musical data type is closely related to GUIDO. As the definition of GUIDO became more complex, the development of SALIERI stagnated; therefore the inner representation used in the SALIERI system can not represent
of these applications use a task-oriented inner representation that does not
cover all aspects of GUIDO. One common tool used by almost every GUIDO-compliant
application is the GUIDO Parser Kit [Hoo], which includes a complete
lex/yacc-generated parser3 for GUIDO. The GUIDO Parser Kit does not include an
inner representation for GUIDO; it merely offers a collection of hook functions that
are called whenever a GUIDO event or tag is recognized in the parsed string. It is
up to the user to build and fill an inner representation from these hooks.
The inner representation for GUIDO descriptions that was developed as part of this
thesis is called "Abstract Representation" (AR). This chapter deals with the
object-oriented design and the structure of the AR. It will further be shown that any GUIDO
description (and its corresponding AR) can be converted into a semantic normal
form; this normal form simplifies the structure of the AR and is also useful for
clarifying the semantics of a GUIDO description. In the following, the transformation
of GUIDO descriptions into the AR will be explained. Finally, the requirements and
methods for accessing and manipulating the AR are presented.
[Figure: The document-view model applied to GUIDO. The Abstract Representation (the document) corresponds one-to-one to the textual GUIDO description and is rendered into views (such as a conventional score) through an access and manipulation protocol. Example GUIDO description: [ \clef<"treble"> \key<"D"> \meter<"4/4"> a1*1/2 b a/4. g/8 f#/4 g a/2 b a/4. g/8 f#/4 g a/2 a b c#2/4 d c#/2 b1 a/1 ] ]
The dvm was chosen as the model for the inner representation of GUIDO, because
this design pattern closely matches the structure of GUIDO descriptions with respect
to music notation: the musical information contained in a GUIDO description
does not necessarily represent a score. As will be shown in chapter 4, the process
of converting an arbitrary GUIDO description into a conventional score requires a
large number of steps. One graphical score represents only one possible view of the
GUIDO description; other music notation algorithms may result in a different score.
Therefore, the dvm captures the inherent structural properties of GUIDO.
4 WYSIWYG: What You See Is What You Get
[Figure: A GUIDO description is parsed into the Abstract Representation; the AR can be manipulated and dumped back as a GUIDO description, or converted into the Graphical Representation, which is in turn manipulated and dumped as a score.]
Taking these consequences a step further, it is now possible to define the GUIDO
Semantic Normal Form:
The GSNF can now be used to easily define the notion of semantic equivalence as
it is needed when interpreting different GUIDO descriptions:
Table 3.1 shows semantically equivalent GUIDO descriptions based on the above
rules; each line in the right column of Table 3.1 is the GSNF of the GUIDO
description in the left column. Some of the examples in Table 3.1 give rise to several
questions; for example, there seems to be no possibility to attach a range to
a simple tag (as in example no. 5, where one might interpret the \cresc-tag as
being attached to an \intens-tag). If the musical/notational intent is to attach a range
explicitly to a tag (as could be the case in example no. 5), it is however possible
to define empty events with no duration; example no. 5 would thus be written
as [ \cresc( empty*0/1 \intens<"p"> c d ) ]. Different graphical offsets
could then be applied to the \cresc-tag, which would be attached to the empty
event, which is not directly shown in the score.
2. If a range does not begin at an event, shift the beginning of the range to the
right until either an event or the range end is reached.5
3. If a range does not end at an event, shift the end of the range to the left until
either an event or the beginning of the range is reached.5
5. If more than one range tag begins or ends at the same event, order them al-
phabetically by their name.
As an example, consider the third row of Table 3.1: by applying rule 1, the beginning
of the slur range is shifted towards the right and the GSNF is obtained.
The GSNF leads towards a data structure for GUIDO sequences that distinguishes
between events, non-range tags, and range tags. Events and non-range tags can
be stored in an ordered list, where the order of elements is taken directly from
the GUIDO description. The time positions of the elements form a monotonically
rising function.6 For example, the sequence of elements for the third line of Table 3.1
is [ c \clef<"treble"> d e ]7 and the respective time positions are 0, 1/4, 1/4, and 2/4.8
The range tags are each split into a begin- and an end-tag, which store a pointer to
the location of their respective events within the element list. Figure 3.3 visualizes
the data structure for the third row of Table 3.1. Note that the range tags are
themselves ordered by the position they point to in the element list, and also
alphabetically if they begin or end at the same event.
The described data structure will be discussed in more detail when presenting the
structure of the AR in section 3.3. The advantage of separating the range tags
from non-range tags and events lies in the fact that they can thus be easily added,
removed or exchanged. These operations will be needed for the different notation
algorithms in chapter 4.
5 If another range tag is encountered during the search, it can be ignored.
6 There are some minor exceptions to this rule; consider for example the \cue-tag, which is used to describe cue notes. All events within the range of the cue tag add their duration to the time position, but when the range is closed, the time position is reset to the time position of the starting event of the range. The internal representation deals with this by introducing a completely independent voice, thereby maintaining the monotonically rising time function for each individual voice.
7 Note that the \slur-tag is not present in this list.
8 In GUIDO, the duration of the first event is 1/4 if nothing else is specified.
[Figure 3.3: Element list for [ c \clef<"treble"> d e ] with positions 1–4 (Event c, Tag clef, Event d, Event e) and the range tag list pointing into it.]
[Figure: UML sketch of the AR class structure: KF_IPointerList<ARMusicalObject> contains 0..n musical objects; ARSlur, ARClef, ARNote, and ARRest are descendants of ARMusicalObject.]
Class ARMusicalObject
All classes within the AR inherit directly or indirectly from class ARMusicalObject.
This class is the base entity for all musical information. Class ARMusicalObject
stores the duration and the time position of the respective instance, as this
information is shared by all musical objects in a GUIDO description. In the case
of tags, the duration is always zero.11 The time position depends on the entity's
position within the GUIDO sequence.
Because almost all classes are descendants of class ARMusicalObject, collections
of events or tags can be defined as lists containing objects of type ARMusicalObject.
Using the runtime type information (RTTI) feature of C++, it is possible to
determine the actual type of an object at runtime.
Class KF_IPointerList
This helper class implements a doubly linked list of pointers to objects of type T;
it is implemented using the template feature of C++. All functions for traversing,
adding, and removing elements are implemented. Class KF_IPointerList is used
intensively throughout the AR. In the future, class KF_IPointerList might be
exchanged for the list class from the Standard Template Library (STL) [Rob98], as
the STL usually offers very efficient data structures. Because of the object-oriented
design, such a change would be transparent to the user of the AR.
Class ARMusic
the definition of hierarchical scores (see also section 2.3.3): a complete score can be
interpreted as an event contained in another score.12
Class ARMusic defines some notational transformation functions, which will be de-
scribed in detail in chapter 4. Most importantly, class ARMusic offers parallel access
to the contained voices. This feature is important for manipulating the AR when it
is converted into a conventional score.
Class ARMusicalEvent
Class ARMusicalEvent is the base class for all GUIDO events: notes, chords13 and
rests. An event has a duration (which may be zero), and it can be the beginning or
end of a range tag. As GUIDO events can specify durations as fractions in
combination with dots (as in [ c1*1/4.. ]), class ARMusicalEvent has a member
variable that holds the number of dots for an event (two in the previous example).
The actual duration of the event is stored in a member variable of class
ARMusicalObject.
Class ARNote
Class ARNote inherits from class ARMusicalEvent and describes GUIDO note
events. Class ARNote has member variables for the note name, pitch, octave,
intensity,14 and a list of accidentals. When parsing arbitrary GUIDO descriptions, it
may well be that the duration of an instance of class ARNote is not displayable
as a single graphical element; such notes will be treated by the music notation
algorithms of chapter 4.
Class ARRest
Class ARRest inherits from class ARMusicalEvent and describes GUIDO rests.
Rests do not store additional information, because they are completely defined through
their duration, which is already held in class ARMusicalObject.
Class ARMusicalTag
Class ARMusicalTag is the base class for all tag classes. It stores common tag data
and offers common tag functions; class ARMusicalTag stores the following data:
allowRange: is 1 if the tag is allowed to have a range (as, for example, the
\fermata-tag). This parameter is used during the parsing of GUIDO descriptions:
if it is zero and the parsed tag has a range, the error flag is set and
the tag is ignored.
assoc: the association of the tag. This flag describes the direction of the
association that a tag has with respect to events. The assoc flag has one of the
following values: Left-Associated (LA), Right-Associated (RA), Don't-Care (DC),
Error-Left (EL), Error-Right (ER). If the tag is a begin-tag (like \slurBegin),
the association is always RA, which means that the tag is associated with the
next event (on the right-hand side in the GUIDO description); if the
tag is an end-tag (like \crescEnd), the association is always LA, which means
that the tag is associated with the previous event (on the left-hand
side in the GUIDO description). If the tag is not a range tag, the association is
DC. In the case of an error (the error variable is set), the association is either ER
or EL.
These parameters will be explained further in section 3.4, when the conversion
of GUIDO descriptions into the AR is presented.
Class ARPositionTag
Class ARPositionTag is one base class for all range tags.15 An instance of class
ARPositionTag stores a pointer to a position in the element list of class
ARMusicalVoice and also a pointer to the matching position tag (which is again an
instance of class ARPositionTag): the begin-tag stores a pointer to its matching
end-tag and vice versa. One range tag from a GUIDO description is internally
broken into two instances of class ARPositionTag: one begin- and one end-tag.
Therefore, the AR of the two GUIDO sequences [ \slur(c d e) ] and [ \slurBegin c
d e \slurEnd ] is identical.16
A tag that can occur either as a range tag or as a non-range tag (the \text-tag,
for example) inherits from both class ARMusicalTag and class ARPositionTag, so
it can be handled as either one.
Class ARMusicalVoice
Class ARMusicalVoice completely represents a GUIDO sequence. It is used by
class ARMusic to represent a complete voice within a piece. Class ARMusicalVoice
directly inherits from class KF_IPointerList<ARMusicalObject>, so it
15 As the AR uses multiple inheritance, a range tag usually has at least two base classes.
16 Internally, there is a distinction between the \slur-tag with a range and the \slurBegin- and \slurEnd-tags, so that the original structure of the input is not destroyed. Nevertheless, the two sequences are treated completely alike.
can be interpreted as a container for musical objects. The inherited list stores all
events and non-range tags in the order in which they appear in the GUIDO
description. Additionally, class ARMusicalVoice contains a list (class ptaglist), which
holds instances of class ARPositionTag. This list stores all range tags (begin- and
end-tags) ordered by their respective start- and end-positions in the voice's element
list. The list order follows directly from the definition of the GSNF in section 3.2.
Figure 3.5 visualizes the structure of class ARMusicalVoice for the given GUIDO
sequence.
[Figure 3.5: Structure of class ARMusicalVoice: an element list (positions 1–3 holding events) with the range tag list pointing into it.]
Class ARMusicalVoice offers all operations required for traversing the voice (see
section 3.5 for details on accessing the AR). The required state information is stored
in the helper class ARMusicalVoiceState.
Class ARMusicalVoiceState
Class ARMusicalVoiceState is needed when a voice is traversed sequentially (see
section 3.5). Class ARMusicalVoiceState stores all state information for an in-
stance of ARMusicalVoice, so that it is possible to resume a position within a
voice exactly. The information stored includes the current time position, the current
position-tags, and the current location within the element- and the position-tag-list.
Class ARSlur
Class ARSlur is included in this list of classes as an example of a range tag. Class
ARSlur represents the \slur-tag; being a tag, it inherits from class ARMusicalTag.
Because the \slur-tag is a range tag, class ARSlur also inherits from class
ARPositionTag.
Class ARClef
The first approach for handling chords in the AR treated chords like other events
(notes and rests). This approach worked quite well at first, because chords were
treated as single entities that had no additional markup. This changed when it
became clear that tags within individual chord notes are required for musical and
notational markup.17 Figure 3.6 shows a GUIDO description together with an
inner representation in which chords are stored as single musical events (containing
other events). Because the chord is stored at one single position in the event list,
range tags can only point to this one single location. In the example of Figure 3.6
the \tieBegin-tag points to the second position of the event list. As can be seen
from this example, the chord must "redirect" the tag to its correct location within
the chord. This redirection must be handled by some mechanism within the data
structure – this shows that a chord cannot be regarded as a simple event.
Representing chords as single events is also difficult when considering the GUIDO
Semantic Normal Form: if a chord is a single event, a tag can only affect the whole event.
There is no mechanism for "partial" tags in the GSNF. Additionally, the conversion
of this inner representation into a conventional score has to deal with chords in a
specialized way, introducing an additional layer of complexity for dealing with tags
within chords. Therefore, a new solution was developed:
17 Think of partial ties, for example, that begin at a single note within a chord and end somewhere later.
[Figure 3.6: Chords stored as single events: for [ c { c, \tieBegin e, g } e \tieEnd ], the chord { c, e, g } occupies position 2 of the event list (Event c at position 1, Event e at position 3), so the \tieBegin-tag can only point at the chord as a whole.]
The idea for a suitable inner representation of chords came from looking at the
graphical representation of a chord in a score: a chord is an entity where several
noteheads share a common stem, or at least a common horizontal position. Using
this observation, the idea of "graphical equivalence" was created: instead of
representing a chord as a special form of an event, it can be represented by a series of
events combined with some formatting information. The implemented inner
representation of chords handles chords as graphical entities which share a common
stem. Using this representation, all aspects of the GSNF (see section 3.2) can be
maintained. Figure 3.7 shows how the same example as presented in Figure 3.6 is
currently represented in the AR. The GUIDO description
[ c { c, \tieBegin e, g } e \tieEnd ]
is internally transformed into
[ c \chordBegin \displayDurationBegin<1,4,0>
\shareStemBegin empty*1/4 \chordComma c*0
\chordComma \tieBegin e*0
\chordComma g*0 \displayDurationEnd \shareStemEnd
\chordComma empty*0 \chordEnd e*1/4 \tieEnd ]
At first, this transformation might seem exaggerated, but using this formalism, all
issues concerning tags within chords can be resolved. Four new internal18 tags are
used:
18 Internal tags are non-standard GUIDO tags that are used within the developed notation system. They are not part of the GUIDO specification.
\chord: This tag groups all events and tags of a chord together. The musical
duration of the chord is encoded in the first (empty) event of the \chord-tag
range, which is added automatically during the conversion. All the other
events in the \chord-tag range get a musical duration of zero. This is important
because otherwise the musical time of the voice would not match the
musical time of the original GUIDO description. In the example above, the
musical duration of the chord (and therefore of the first empty event within the
\chord-tag range) is a quarter note.
\displayDuration: This tag enforces a specific visible duration to be used
when creating the notes and rests within its range. Instead of using the musical
duration (as given in the GUIDO description) for creating the notation
symbols, the tag parameters are used to create note heads and rests. In the
example above, the displayed duration is 1/4 without dots (the last parameter
is zero). Therefore, all events within the tag range are displayed as quarter
notes, even though their musical duration is zero.
\shareStem: All notes within the tag range share a common stem.
\chordComma: This tag groups the individual voices of the chord. Using the
\chordComma-tag in the transformed chord representation is mandatory,
because otherwise the inner chord representation could not be re-converted into
the equivalent GUIDO description.19
The current chord representation has more consequences than might first be
expected: when traversing a voice, an event-mode and a chord-mode must be
distinguished. In event-mode, everything within the voice is iterated sequentially. This
mode is used when converting the abstract representation of a voice into a
conventional score. In chord-mode, all events that are part of a chord are read in one step.
This mode is used for many manipulation routines, which will be described later in
this chapter.
[Figure 3.7: The inner chord representation: the element list c*1/4 empty*1/4 \chordComma c*0 \chordComma e*0 \chordComma g*0 \chordComma empty*0 e*1/4, with the ranges chordBegin/chordEnd, displayDurationBegin/displayDurationEnd, shareStemBegin/shareStemEnd, and tieBegin spanning the chord elements.]

3.4 Transforming GUIDO Descriptions into the AR
kit [Hoo] and the AR. During the parse, the hook functions of the GUIDO Parser Kit
successively call the routines of class ARFactory, which then builds the AR.
Class ARFactory is responsible for the following tasks:
Creating events (notes and rests) and handling the given parameters (duration,
accidentals, pitch, and octave)
Checking that arbitrary pairs of begin- and end-tags are correctly matched
Creating the inner chord representation, which was introduced in section 3.3.1
After the GUIDO description has been parsed, each voice is transformed into GUIDO
Semantic Normal Form. This is important because all further manipulation
functions on the AR require their input to be in GSNF; they "guarantee" (contract
model) that their output is in GSNF as well.
("U") with name "dx" and default value "0". The fourth parameter is similar to the
third; the fifth parameter is an optional ("o") float value ("F") with name "size" and
a default value of "1.0". A GUIDO description may contain the following \clef-tag:
[ \clef<type="bass",size=0.75,color="blue"> ], which defines a bass clef
with 75 % of the regular size and a blue color.
It is possible to define more than one parameter-value string for each tag, so that
different parameter constellations may be used in parallel. Class ARFactory uses
some helper classes to ensure that a tag in the given GUIDO description is checked
against all tag-parameter lists. Using this mechanism, it is quite easy to implement
arbitrary parameter names and values for any of the defined GUIDO tags.
start position of the element list and resets the voice state information in
curvst.
3. GetAt(POSITION pos): This function returns the object at the given
position pos of the element list.
[Figure: Parallel traversal of a two-voice example. The numbers 1–14 indicate the order in which the elements of
{ [ \clef<"treble"> \meter<"3/4"> c2/4. h1/8 | c2/2 _/4 ],
[ \clef<"bass"> \key<"D"> \meter<"3/4"> { d0/4, f# } { c, e } | c/2. ] }
are visited when both voices are traversed in parallel by increasing time position.]
Note that all operations described here require the AR to be in GSNF and likewise
ensure that the returned AR is in GSNF as well.
Removing range tags There are situations where range tags need to be removed
from the AR. Removing one tag actually removes two entries from the ptaglist,
as there is always a begin- and a matching end-tag. The removal operation can be
carried out in a straightforward way: the begin- and end-tags are removed from the
ptaglist and the ptagpos is set accordingly: if the current position within the
ptaglist was set on a position tag that is being removed, the position is set to the
following position tag. If there is no following tag, the position is set to NULL, which
is interpreted as a traversal beyond the end of the list. Reading Figure 3.9 from top
to bottom visualizes the removal of the \cresc-tag.
[Figure 3.9: Visualization of range tag removal (A, read top to bottom) and range tag insertion (B, read bottom to top): an element list with positions 1–4 (Event c, Tag clef, Event d, Event e, with a current position marker) and the range tag list holding slurBegin/slurEnd entries before and after the operation.]
Inserting range tags When inserting range tags (like, for example, a \tie-tag),
the begin- and end-positions of the range in the element list are needed, so that the
correct location in the ptaglist can be found. The actual insertion of the two
instances of ARPositionTag – the begin- and the end-tag – can then be done easily.
Some care has to be taken if there are no previous range tags present in the AR, in
which case the ptaglist must be created. Reading Figure 3.9 from bottom to top
visualizes the insertion of a \cresc-tag.
Changing the start and end positions This operation is mainly required when
events in the element list are being removed or split. Because the position tags in
the ptaglist store pointers to locations in the element list, all position tags pointing
to such events must be updated to match the new position in the element list. As
long as the order of the ptaglist is not disturbed, the re-pointering is a
straightforward process. Otherwise, the order of the elements in the ptaglist must be
changed to match the new situation. Figure 3.10 shows the effect of changing the
positions within the ptaglist in order to match the situation in the element list:
before the change, the \slur- and the \beam-tag both ended at the same event. After
the change, the \slur-tag ends before the \beam-tag; the order within the
ptaglist is changed.
[Figure 3.10: Changing the order within the range tag list: before the change, the element list holds positions 1–2 (Event c, Event d, with a current position marker) and the range tag list the entries beamBegin, slurBegin, beamEnd, slurEnd; after the change, the element list holds positions 1–3 and the \slur-tag ends before the \beam-tag.]
Splitting events When dealing with automatically inserted bar lines or
\newSystem- or \newPage-tags, it is sometimes necessary to split one event into two
events. This happens if a new bar line has to be inserted at a time position that
lies within the duration of an event. If an event is split, the resulting events need to
be joined by a \tie-tag, so that the musical idea remains the same. Splitting events
is a manipulation that is required often, especially when a time signature is set and
automatic bar lines are inserted. To split an event, the durations of the resulting
two events must be known. If the original event has duration d, then d = d1 + d2
must hold, where d1 and d2 are the durations of the two resulting events. First,
the original event changes its duration to d1. Then a new event with duration d2
is inserted (using the routine described below). Finally, the position tags must be
"re-pointered": depending on the tag type, tags that began at the original event can
now either begin at e1 or at e2. Figure 3.11 shows an example of splitting an event
because of a \bar-tag.
Figure 3.11: Splitting an event at a \bar-tag; the two resulting events are joined
by tieBegin and tieEnd.
Because of the special inner representation of chords (see section 3.3.1), the process
of splitting a chord into two chords is much more complicated. In this case, all
elements of the chord must be copied and joined by \tie-tags. Because some tags
within the chord may alter the state of the voice (for example a \staff-tag that
changes the staff on which the voice is displayed), the state of the voice at the first
element of the original chord must be saved. This saved state information must be
restored before the second chord is inserted. Figure 3.12 shows a chord containing
a \staff-tag and the implications of splitting this chord. All added or changed
information is displayed in bold type. Note that a new \staff-tag is inserted at
the beginning of the second chord. In order to focus on the main points, the figure
does not show the inner representation but uses the equivalent GUIDO descriptions
instead. Internally, the inner chord representation of section 3.3.1 is used.
{ [ \staff<1> \meter<"1/4">
{ \tieBegin:1 c/4,
\staff<2> \tieBegin:2 e }
\bar
{ \staff<1> c/4 \tieEnd:1,
\staff<2> e \tieEnd:2 } ],
[ \staff<2> \meter<"1/4">
c/4 \bar e ] }
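The splitting routine described above can be sketched as follows. This is a simplified illustration, not the actual AR code: the Fraction and Event types here are hypothetical stand-ins for the much richer AR classes, and the "re-pointering" of position tags is reduced to two tie flags on the events.

```cpp
#include <iterator>
#include <list>
#include <string>

// Simplified stand-ins for the AR classes (illustration only).
struct Fraction { int num, den; };
bool eq(Fraction a, Fraction b) { return a.num * b.den == b.num * a.den; }
Fraction sub(Fraction a, Fraction b) {
    // a - b on fractions, without reduction
    return Fraction{ a.num * b.den - b.num * a.den, a.den * b.den };
}

struct Event {
    std::string pitch;
    Fraction dur;
    bool tieBegin = false, tieEnd = false;
};

// Split the event at `it` so that it keeps duration d1; a new event with
// duration d2 = d - d1 is inserted directly after it, and both events are
// joined by a tie (d = d1 + d2 holds by construction).
void splitEvent(std::list<Event>& voice, std::list<Event>::iterator it, Fraction d1) {
    Fraction d2 = sub(it->dur, d1);
    Event second{ it->pitch, d2, /*tieBegin*/ false, /*tieEnd*/ true };
    it->dur = d1;
    it->tieBegin = true;
    voice.insert(std::next(it), second);
}
```

Splitting a d*3/4 at an offset of 1/4, as in Figure 4.1, yields d*1/4 tied to d*2/4.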
Inserting Events The insertion of new events into the AR can be accomplished
rather easily: once the location for the event in the doubly linked element list has
been found, it is simply linked in. Because all elements of the list store their time
position, all elements following the inserted event must increase their time position
by the duration of the inserted event. Because the event is stored at a new position
within the element list, the range tags are not affected by this change.
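The insertion step can be sketched as follows. The types are hypothetical simplifications (the AR stores time as exact fractions, not doubles), and the sketch assumes insertion before an existing element of the list:

```cpp
#include <iterator>
#include <list>

// Simplified element of the doubly linked element list: each element
// stores its absolute time position and its duration.
struct Elem { double timePos; double dur; };

// Insert a new event with duration `dur` before `pos` (pos must not be
// the end iterator); all elements from `pos` on are shifted forward in
// time by the duration of the inserted event.
std::list<Elem>::iterator insertEvent(std::list<Elem>& elems,
                                      std::list<Elem>::iterator pos, double dur) {
    auto it = elems.insert(pos, Elem{ pos->timePos, dur });
    for (auto p = std::next(it); p != elems.end(); ++p)
        p->timePos += dur;   // shift all following elements
    return it;
}
```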
Removing Events The removal of events is a little more complicated than the
insertion of events, because the range tags may have to be considered as well. At
first, the same premises as in the insertion case hold: the event that is about to be
removed has a position in the doubly linked list, and when it is removed, the
following elements have to update their time positions. Then it has to be checked
whether any range tag begins or ends at the removed event. Depending on the tag,
it can either be removed from the position tag list, or its position can be changed to
either the previous or the following event. This depends on the type of the tag and
cannot be generalized. As an example, consider the GUIDO description
[ \cresc( c d e ) f ].
If the e is removed, the resulting GUIDO description looks like
[ \cresc( c d ) f ].
In this case, the \cresc-tag updates its range to end at the previous event.
CHAPTER 3. AN INNER REPRESENTATION OF GUIDO
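The range-tag aspect of event removal can be sketched as follows; the types are hypothetical simplifications (events as plain characters, a range tag as an index pair), and the time-position update shown for insertion above is omitted here to focus on the re-pointering:

```cpp
#include <vector>

// A range tag covering the events from index `begin` to `end` inclusive
// (simplified stand-in for the position tags of the ptaglist).
struct RangeTag { int begin, end; };

// Remove the event at index idx and re-point the range tag, as in the
// \cresc example: a tag ending at (or after) the removed event shrinks
// back to the previous event.
void removeEvent(std::vector<char>& events, RangeTag& tag, int idx) {
    events.erase(events.begin() + idx);
    if (tag.end >= idx) --tag.end;
    if (tag.begin > idx) --tag.begin;
}
```

For [ \cresc( c d e ) f ], removing the e turns the tag's range (0..2) into (0..1), i.e. [ \cresc( c d ) f ].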
3.7 Conclusion
In this chapter, an application-oriented inner representation for GUIDO descriptions
called the Abstract Representation (AR) was introduced. It was shown that the AR can
be regarded as the document in the context of the Document-View-Model, a design
pattern that helps in encapsulating the data from a concrete type of display or ma-
nipulation. In order to facilitate the handling of GUIDO descriptions, the GUIDO Se-
mantic Normal Form (GSNF) was identified and the conversion of arbitrary GUIDO
descriptions into GSNF was described. The GSNF is a normal form that leads to
a clear distinction of GUIDO elements: events, non-range tags, and range tags.
Following that, the object-oriented design and the structure of the AR were described
in some detail. The main classes of the AR were presented, and it was shown how
GUIDO descriptions are converted into the AR. The final section of this chapter dealt
with access to and manipulation of the AR. All of the described routines are required
by the notation algorithms, which will be described in the following chapter.
Generally, applications dealing with any form of music representation need a suit-
able inner representation. In this chapter, it was shown that the AR offers such
a flexible and powerful representation for GUIDO descriptions. As already stated
above, the AR can not only be used for music notation purposes, but is a complete
inner representation for arbitrary GUIDO descriptions. Its manipulation functions
and routines make it useful for a whole range of computer music systems.
Chapter 4
Creating the Conventional Score
Introduction
The conversion of an arbitrary GUIDO description into a conventional score is a
process requiring a multitude of music notation algorithms, which will be described
in detail in this chapter. As was shown in the previous chapter, there is a one-to-
one correspondence between GUIDO descriptions and the Abstract Representation
(AR). Because the AR is a powerful data structure, which was developed to allow di-
rect and efficient manipulations, many of the notation algorithms described in this
chapter work directly on the AR. Basically, all conversions needed to turn the
AR into a conventional score can be described as GUIDO to GUIDO transformations:
beginning with an arbitrary GUIDO description and its corresponding AR, a series
of algorithms adds notationally relevant information to the AR (such as au-
tomatically creating bar lines if a time signature has been encoded in the GUIDO
description). After these transformations, the AR is converted into the so-called
"Graphical Representation" (GR), an object-oriented class structure whose
classes closely correspond to the visible elements of a graphical score. Like the
AR, the GR and all necessary algorithms have been developed as part of this thesis.
The conversion of the AR into the GR is a two-step procedure: first, the graphical
elements are created from the data contained in the AR; then another set of mu-
sic notation algorithms is responsible for finally placing the notational elements on
lines and pages.
This chapter is structured as follows. First, the notion of GUIDO to GUIDO transfor-
mations is explained in detail. Then, generic music notation algorithms are classi-
fied and their scope – which structural representation level they use – is identified;
the implemented algorithms are presented. After that, the Graphical Representa-
tion (GR) is introduced, and it is shown how the AR is converted into the GR. The
conversion process automatically determines all information that is necessary to
build a conventional score. Finally, conventional and newly developed algorithms for
spacing, line breaking, and page filling are explained in detail. Where appropriate,
similarities to text processing are mentioned.
68 CHAPTER 4. CREATING THE CONVENTIONAL SCORE
Figure 4.1: Simple GUIDO to GUIDO transformation. Part (b) of the figure shows
the description after the transformation:
(b) [ \clef<"treble"> \meter<"4/4">
c*3/4 \tie( d*1/4 \bar d*2/4 )
\tie( e*2/4 \bar e*1/4 ) ]
The notation system, which was developed as part of this thesis, was implemented
in such a way that a GUIDO description is first converted into a corresponding Ab-
stract Representation (AR); as indicated before, this correspondence is a one-to-one
relationship. Then, a number of notation algorithms, which will be described in
detail in the next section, manipulate the AR. These algorithms are all GUIDO to
GUIDO transformations: they get a GUIDO description as their input and return an
enhanced or modified GUIDO description as their output. Each algorithm only
manipulates a strictly defined musical or notational aspect of a score. Then the
AR is converted into the Graphical Representation (GR). All elements of the GR are
directly connected to elements of the AR; this is important so that all further algo-
rithms, which directly manipulate the GR, can still be described as GUIDO to GUIDO
transformations: the manipulation of an element of the GR is directly reflected in
the corresponding element of the AR.
4.1. MUSIC NOTATION AS GUIDO TO GUIDO TRANSFORMATIONS 69
Figure 4.2 shows how a GUIDO description is first converted into the AR, which is
then manipulated by a set of notation algorithms. Afterwards, the AR is con-
verted into the GR, where it is again manipulated by another set of algorithms.
The GR can then be used to directly render a score – either within an application or
as a graphics file. The results of all notation algorithms are stored within param-
eters of the AR; the corresponding enhanced GUIDO description contains a textual
representation of the completely formatted score.
Figure 4.2: The Abstract Representation is manipulated by Notation Algorithms I
(GUIDO to GUIDO transformations) and converted into the Graphical Represen-
tation, which is manipulated by Notation Algorithms II; the output is an enhanced
GUIDO description and the rendered score.
Because each notation algorithm deals only with strictly defined musical or nota-
tional aspects of a score, it is possible to exchange some of the algorithms to get a dif-
ferent score. As an example, consider the algorithm that decides on automatic beam
groups: the "standard" algorithm uses the current meter to decide on beam groups;
another algorithm might use a neural network approach to find beam groups.
The musical information present in the original GUIDO description is not changed
by the notational transformations. Rather, the original information is used inten-
sively to derive graphical "knowledge" that is added to the original description. In
conjunction with built-in notational knowledge (spacing, line breaking, etc.), the col-
lection of GUIDO to GUIDO transformations builds the core of the notation renderer.
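The chaining of such transformations can be sketched as follows. The interface is a hypothetical simplification: here each algorithm is modeled as a function from one GUIDO description (as a string) to an enhanced one, whereas the actual implementation works on the AR objects directly.

```cpp
#include <functional>
#include <string>
#include <vector>

// A GUIDO to GUIDO transformation, modeled as a plain function on the
// textual description (illustration only; the real algorithms work on
// the AR).
using GuidoTransform = std::function<std::string(const std::string&)>;

// Apply a pipeline of exchangeable transformations; each step manipulates
// exactly one notational aspect of the description.
std::string applyPipeline(std::string guido,
                          const std::vector<GuidoTransform>& algorithms) {
    for (const auto& transform : algorithms)
        guido = transform(guido);
    return guido;
}
```

Exchanging one element of the vector (e.g. the beaming step) changes the resulting score without touching the rest of the pipeline.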
These three steps have to be performed in any music processor that converts musi-
cal information into a printed score. When considering interactive music notation
systems, some of these steps are not clearly defined: an interactive system may
not require the first step, because in this case the user directly decides on the
symbols.[3] Also, in interactive systems, the positioning of symbols is usually done
semi-automatically: the system applies internal notational knowledge to place the
symbols first; then the user can manually adjust their graphical positions if needed.
The printing process has vastly improved since 1984 (when Don Byrd wrote his
thesis[2]); today, high-resolution laser printers are available at little cost even for home
users. Nevertheless, there are still issues of device-independent output that have to
be considered – especially if the notation system is used on screen and for printing.
4.2. MUSIC NOTATION ALGORITHMS 71
The music notation algorithms, which will be discussed in this section, may be clas-
sified according to the three steps from above. One class of algorithms deals mainly
with the selection process; this relates to the first step mentioned above. These algo-
rithms work on the inner representation only; they can thus be described as AR to
AR transformations. As an example of this class, consider the algorithm that de-
cides which notes to group together for beaming (the group selection depends
mainly on the current meter). Another class of algorithms deals with the positioning
of symbols; this relates to the second step from above. These algorithms require
knowledge of the graphical environment (for example the width and height of the
page, of a staff line, of a note, etc.); they mainly work on the graphical representation.
In the following, the notation algorithms on the AR level are explained. The
algorithms that manipulate the Graphical Representation will be presented in
sections 4.4 and 4.5.
[2] Pages 74–75.
[3] Sometimes it is still required to break one musical note into several graphical entities, as in the
case of automatically added bar lines.
(Score for the lyrics example: "The brown fox jumps o-ver the la-zy dog." set as
lyrics under the notes.)
[4] All examples in this section are given using GUIDO descriptions. In the actual implementation,
all algorithms of this section work directly on the (inner) Abstract Representation; for a better
understanding of the functionality of the algorithms, the equivalent GUIDO descriptions are used.
The implemented AR-level notation algorithms:
AutoDispatchLyrics
AutoBarlines
AutoBreaks
AutoKeys
AutoCheckStaffState
AutoDisplayCheck
AutoBeaming
AutoTies
is converted to
[ \text<"The">( c1/4 ) \text<"brown">( d ) \text<"fox">( e ) \text<"jumps">( f )
\text<"o-">( g ) \text<"ver">( g ) \text<"the">( g ) \text<"la-">( f )
\text<"zy">( d ) \text<"dog.">( c/2 ) ]
AutoKeys This routine iterates through a voice and checks for changes of key
signature. This is done because, if the key signature changes in a score, the old
key signature must be naturalized. This is achieved by placing an ARNaturalKey
object before the changed key, which will later be used to create the corre-
sponding graphical naturalization elements in the score. This routine does not
depend on any other routine.
Example: the GUIDO description
[ \key<"E"> e1/4 f# g# \key<"F"> h& a g f ]
is converted to
[ \key<"E"> e1/4 f# g# \natkey \key<"F"> h& a g f ]
[5] The \natkey-tag is used internally to create the graphical elements for the naturalization key in the score.
GUIDO description. If a time position tp for a new automatic bar line has been
determined, the algorithm must check whether an event is currently active:
if the onset time of the current event is earlier than tp and the end time of
the current event is later than tp, the event must be split into two events, and
an additional \tie-tag must be added to graphically join the split notes. The
algorithm makes use of all manipulation functions provided by the AR, which
are described in section 3.6 on page 60.
An example of the AutoBarlines routine can be seen in Figure 4.1 on page 68.
All of the following algorithms depend directly or indirectly on the output of
this algorithm.
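The split condition used by this routine can be sketched as follows; times are given as doubles for brevity, whereas the AR uses exact fractions:

```cpp
// An event must be split if and only if a new bar line at time position tp
// falls strictly inside the event's duration: the event's onset lies before
// tp and its end lies after tp.
bool mustSplit(double onset, double duration, double tp) {
    return onset < tp && tp < onset + duration;
}
```

In the example of Figure 4.1, the d starting at 3/4 with duration 3/4 must be split at the bar line at time 1, while an event ending exactly at the bar line is left untouched.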
AutoBreaks This is the only one of the notation algorithms that requires
parallel sequential access to all voices contained in the AR, as described
in section 3.5.2. The AutoBreaks algorithm does two things. First, it "multi-
plies" \newSystem- and \newPage-tags: if a \newSystem- or
a \newPage-tag is encountered in one voice, a similar tag is added at the same
time position in all other voices, if no explicit break tag is already present
there. This may result in events being split (similar to the AutoBarlines al-
gorithm from above) in some voices. The second task this algorithm performs
is to determine "possible break locations": when traversing the voices in par-
allel, the algorithm looks for possible break positions, which are vertical cuts
through all voices of a line of music that can later be used as line break po-
sitions. For conventional scores, good possible break locations usually are
the common bar lines of all voices. In order to work for all kinds of music,
the AutoBreaks algorithm does not rely solely on bar lines but evaluates how
well a specific time position within a specific voice is suited as a break po-
sition. The "goodness" of a break position for a voice is returned as a floating
point number that is calculated within each voice by taking into account the
current measure position[6] – a time position at the very end of a measure is
better suited than a time position within a measure – and by checking whether
an event is active at the given time position. By adding the results of all
voices for a given time position, it can be determined whether the time position is
a suitable possible break location. If such a location is found, a \pbreak-tag
(short for potential break) is inserted in all of the voices;[7] sometimes, an event
must be split using the manipulation routines of the AR because of an added
potential break. For conventional scores that have common time signatures
for all voices, this algorithm returns very good results. Nevertheless, problems
arise if music that has no common time signatures (or no time signatures at
all) is converted into a conventional score.
[6] Note that each voice may have a different time signature. Even though this is not very common,
it is easily possible to specify such music in GUIDO.
[7] The \pbreak-tag is again an internal tag that is only used within the described notation system;
it is not part of the official GUIDO specification.
One can easily construct music that
contains, for example, the two voices
[ g/2 g g g ... ],
[ g/4 g/2 g g g ... g ]
where the second voice is displaced by a quarter note against the first.
In this case, there are no good possible break locations. One solution for the
AutoBreaks algorithm is to simply leave the issue to the user: no automatic
line breaks would be calculated. Another solution is to simply add \pbreak-
tags if, for a certain amount of musical time, no "natural" breaks were added
to the music. This sometimes leads to falsely split events, because events are
split at time positions that are later not used as real line break positions; at
least, this solution returns a readable score. The current implementation uses
the second approach.
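The per-voice evaluation and summation described above can be sketched as follows. The weights and the threshold are hypothetical illustration values, not those of the actual implementation; only the structure (rate each voice, penalize active events, sum over all voices) follows the text.

```cpp
#include <vector>

// Simplified per-voice state at a candidate time position.
struct VoiceState {
    bool atMeasureEnd;   // is the position at the very end of a measure?
    bool eventActive;    // is an event sounding across this position?
};

// "Goodness" of a break position: sum of per-voice ratings. A measure end
// is preferred; splitting an active event is heavily penalized.
// (Weights 1.0 / 0.25 / 0.1 are illustrative assumptions.)
double breakGoodness(const std::vector<VoiceState>& voices) {
    double sum = 0.0;
    for (const auto& v : voices) {
        double g = v.atMeasureEnd ? 1.0 : 0.25;
        if (v.eventActive) g *= 0.1;
        sum += g;
    }
    return sum;
}

// A \pbreak-tag would be inserted where the summed goodness is high enough.
bool isPossibleBreak(const std::vector<VoiceState>& voices, double threshold) {
    return breakGoodness(voices) >= threshold;
}
```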
(Fragment of the AutoDisplayCheck example: for the duration 1/12, the algorithm
computes 3/2 · 1/12 = 1/8 = d_disp and sets d_db := 1, which means that the event
is displayed as an eighth note.)
AutoBeaming This algorithm iterates through the events of a voice and de-
termines which notes can be part of a beam group. The algorithm strongly
depends on the current meter and also respects explicitly specified beams.
Example: the GUIDO description
[ \meter<"4/4"> c1/8 d e f g a h c2 ]
is converted to
[ \meter<"4/4"> \beam( c1/8 d e f )
\beam( g a h c2 ) ]
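A meter-based grouping pass of this kind can be sketched as follows. This is a simplified illustration, not the actual algorithm: it beams runs of notes shorter than a quarter and starts a new group whenever a group boundary derived from the meter is crossed (for the 4/4 example above, a boundary every half note).

```cpp
#include <utility>
#include <vector>

struct Note { double onset, dur; };   // times as fractions of a whole note

// Return [begin, end) index pairs of beam groups. `group` is the duration
// of one beam group as derived from the meter (an assumption of this
// sketch; e.g. 0.5 = one half note for 4/4).
std::vector<std::pair<int, int>> autoBeam(const std::vector<Note>& notes, double group) {
    std::vector<std::pair<int, int>> beams;
    int start = -1;                              // start of the current run
    for (int i = 0; i < (int)notes.size(); ++i) {
        bool beamable = notes[i].dur < 0.25;     // eighth note or shorter
        bool boundary = start >= 0 &&
            (int)(notes[i].onset / group) != (int)(notes[start].onset / group);
        if (!beamable || boundary) {
            if (start >= 0 && i - start > 1)     // a single note gets no beam
                beams.push_back({start, i});
            start = beamable ? i : -1;
        } else if (start < 0) {
            start = i;
        }
    }
    if (start >= 0 && (int)notes.size() - start > 1)
        beams.push_back({start, (int)notes.size()});
    return beams;
}
```

For eight eighth notes in 4/4 this yields the two groups of four shown in the example.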
AutoTies This algorithm iterates through a voice and looks for \tie-tags.
As many of the previous algorithms automatically split events and join them
using \tie-tags, this algorithm has to be computed very late. The algorithm
breaks all \tie-tags that cover more than two events into a series of \tie-
Begin- and \tieEnd-tags that cover two events each. After the algorithm has
been computed, any \tie-tag found in the AR can be represented by one single
graphical tie in the score.
Example: the GUIDO description
[ \tie( c/4 c c ) ]
is converted to
[ \tieBegin:1 c/4 \tieBegin:2 c \tieEnd:1 c \tieEnd:2 ]
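The expansion into pairwise tie tags can be sketched as follows; emitting the tags as plain strings is a simplification of this sketch (the real algorithm manipulates AR objects):

```cpp
#include <string>
#include <vector>

// Expand one \tie over n events into n-1 overlapping \tieBegin:k /
// \tieEnd:k pairs, each covering exactly two adjacent events.
std::vector<std::string> expandTie(const std::vector<std::string>& events) {
    std::vector<std::string> out;
    int n = (int)events.size();
    for (int i = 0; i < n; ++i) {
        if (i < n - 1)
            out.push_back("\\tieBegin:" + std::to_string(i + 1));
        out.push_back(events[i]);
        if (i > 0)
            out.push_back("\\tieEnd:" + std::to_string(i));
    }
    return out;
}
```

Applied to c/4 c c, this reproduces the tag order of the example above.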
ferent computers using different operating systems, the function calls that produce
graphical output are encapsulated in a small module, which can be adapted to dif-
ferent graphical operating systems without changing the overall layout of the struc-
ture. This issue is not trivial, as each graphical operating system requires different
methods to access information regarding, for example, the width and height of font
symbols, and offers different capabilities with respect to virtual coordinate spaces
being mapped onto a screen or a printer.
In the following, the graphical requirements for any (conventional) music notation
system are exemplified. Then, the major classes of the GR are presented.
Figure 4.5: A sample score page. The annotations mark page text, staff, system,
system slice, line (system), notational elements (e.g. a tie), and a slur split by a
line break.
as the current implementation has more than 90 classes, this overview covers only
the most important ones. Almost all of the classes in the GR directly or indirectly
inherit from class GObject, which contains the data and functionality required for
any visible element. The data of GObject includes, for example, a (graphical) posi-
tion and a bounding box.
Class GRMusic can be found at the bottom left of Figure 4.6. One instance
of this class is created for every score that is built from a GUIDO de-
scription. As can be deduced from the diagram, GRMusic contains one or more
instances of class GRPage, which represent the pages of the score. To create
the pages and lines of a score, GRMusic "employs" a class GRStaffManager,
which is not shown in the diagram but will be explained below, when the con-
version of the AR into the GR is discussed in detail.
Class GRSystemSlice is a part of a single line of music. A set of GRSystem-
Slices builds a GRSystem. For a simple understanding, it is convenient to
think of system slices as measures of a score. Class GRSystemSlice contains
one or more instances of class GRStaff. Class GRSystemSlice is also re-
sponsible for graphical elements that belong to several staves; this might be,
for example, a beam that begins and ends in different staves, as it is shown
on the right.
Class GRStaff represents a part of a single staff of a line of music. The time
position and duration of the part represented by an instance of class GRStaff
are directly determined by the containing GRSystemSlice. Class GRStaff
directly inherits from class GRCompositeNotationElement, which is capable
of storing an arbitrary number of instances of class GRNotationElement.
Using this storage, class GRStaff stores the graphical elements placed upon
it. These might be musical markup, such as an instance of class GRClef, or a
musical event, such as an instance of class GRNote.
Figure 4.6: Class diagram (excerpt) showing GObject, GRTag, GRNotation-
Element, and GRPositionTag with their inheritance ("is a") and containment
("contains", 0..n) relations; a GRPositionTag references two GRNotation-
Elements.
Class GRTag is the base class for all GUIDO tags. As can be seen in Fig-
ure 4.6, a GUIDO tag either inherits directly from class GRTagARNotation-
Element, or it inherits directly from class GRPTagARNotationElement. This
mechanism is used to distinguish between range and non-range tags (see sec-
tion 3.2.2 on page 47 for a detailed description). As a range tag has a start-
and an end-element, class GRPositionTag contains two pointers to instances
of class GRNotationElement. These are used to determine the start- and
end-point of a range tag; for example, the beginning and ending note of a
slur for class GRSlur. Because all tags use multiple inheritance to inherit
not only from class GRTag but also from class GRARNotationElement, all
tags can be contained in one of the composite classes (either GRARComposite-
NotationElement or GRCompositeNotationElement) or their derivatives
(like, for example, GRStaff, GRSystem, or GRPage).
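The multiple-inheritance pattern described here can be sketched as follows. The class bodies are hypothetical, heavily reduced stand-ins for the real GR classes; only the inheritance structure follows the text.

```cpp
#include <string>
#include <vector>

// Base class for all tags (reduced stand-in).
class GRTag {
public:
    virtual ~GRTag() = default;
};

// Base class for graphical objects with a direct AR counterpart.
class GRARNotationElement {
public:
    virtual ~GRARNotationElement() = default;
    virtual std::string name() const = 0;
};

// A clef is both a tag and a graphical element with an AR counterpart,
// so it can be stored in any composite of notation elements.
class GRClef : public GRTag, public GRARNotationElement {
public:
    std::string name() const override { return "clef"; }
};

// Reduced stand-in for the composite classes that hold notation elements.
class GRComposite {
public:
    void add(GRARNotationElement* e) { elements.push_back(e); }
    std::vector<GRARNotationElement*> elements;
};
```

Because GRClef inherits from both bases, a composite (standing in for a staff, system, or page) can store it through the GRARNotationElement interface.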
Class GRPositionTag is the base class for representing range-tags (as ex-
plained in the previous paragraph). The issue of line breaking is crucial when
dealing with range-tags within the GR: if both the begin- and end-event of a
range-tag are located on one line of music, the graphical object (like, for exam-
ple, a slur) can be created directly. If the begin- and end-event of a range-tag
are located on different lines, then the graphical object must be split into sev-
eral graphical objects. This case can be seen in Figure 4.5, where a slur begins
in the last slice of the second voice in the second line and ends in the following
line: here, the slur, which is represented by only one object in the Abstract
Representation, is broken into two graphical objects. A mechanism within
class GRPositionTag has been implemented to deal with these cases.
Class GRARNotationElement is the base class for all graphical objects that
have a direct counterpart in the Abstract Representation (which is indicated by
using "AR" as part of the name). Consider, for example, class GRClef, which
inherits indirectly from class GRTag and from class GRARNotationElement.
This reflects that the graphical "clef" object has a direct counterpart in the AR
(which obviously is an instance of the class ARClef). An example of a class
that does not inherit from class GRARNotationElement is class GRStaff:
there is no class called ARStaff in the AR, because a staff is a purely graph-
ical entity, which is not directly reflected in the underlying abstract represen-
tation.[8]
The complete set of classes of the GR is too large to be discussed in detail here. It
should be clear that any visible element of a score has a matching class in the GR.
[8] Note that there is a \staff-tag in GUIDO, which can be used to define the staff on which a voice
is displayed. Nevertheless, there is no one-to-one correspondence between the graphical staff and the
\staff-tag.
4.4.1 Manager Classes
The two manager classes, which play an important role when converting the AR
into the GR, are presented in more detail in the following.
Class GRStaffManager
An instance of class GRStaffManager is used by class GRMusic to read the data
from the AR and to create a set of GRSystemSlice instances. GRStaffManager uses
instances of class GRVoiceManager to read the data from the individual voices;
all voices are traversed in parallel, so that notational elements that need to be
synchronized horizontally are created in parallel. GRStaffManager is responsible
for the following tasks:
Creation of springs and rods for the spring-rod model, which will later be used
for spacing and line breaking; details of this model are described in section 4.5.
4.4. CONVERTING THE AR INTO THE GR 83
Figure: Conversion of the AR into the GR. For each ARMusicalVoice contained in
the ARMusic object, an instance of class GRVoiceManager reads the voice data and
fills instances of class GRStaff; class GRStaffManager creates a set of GRSystem-
Slices from them. The Notation Algorithms II (spacing, line breaking) then assem-
ble the slices into the GRPages of GRMusic in the Graphical Representation.
Class GRVoiceManager
Some other classes, which have no direct graphical counterpart, are used during
the conversion from the AR into the GR. Class GRSpaceForceFunction, class
GRSpring, class GRRod, and class GRPossibleBreakPosition are mainly used
for spacing and line breaking; the details are explained in sections 4.5.1
and 4.5.2. An instance of class GRPossibleBreakState is created every time a
\pbreak-tag is encountered in a voice (see section 4.2). This object saves the com-
plete state of all staves at the current time position; if the position is later used as a
line break location, the saved information is used to create the graphical elements
at the beginnings of the staves in the next line. The required information is mainly
the current clef, the current key signature, and the current time signature. It is also
essential to remember which tagged ranges are currently active: in the case of a
slur or a tie, the correct endings and beginnings have to be created.
Another class that saves state information is class GRStaffState. It stores the
current state of an instance of class GRStaff. Because different voices can be writ-
ten on the same staff, it is essential that the current state of a staff is always acces-
sible. The information of class GRStaffState is used, for example, to determine the
vertical position of notes by comparing the pitch and register information with the
current clef. Class GRStaffState also keeps track of the current position within a
measure (which depends on the current meter) and of the current key signature.
4.5. SPACING, LINE BREAKING, AND PAGE FILLING 85
4.5.1 Spacing
When typesetting music, the issue of distributing leftover space between the
graphical elements of a line is fundamental for obtaining good and aesthetically
pleasing results. In the following, approaches to human and automatic spacing
are described, and an improved algorithm for spacing, which was developed as part
of this thesis, is presented.
In order to estimate how well a spacing algorithm performs its task, it is essential
to define what constitutes good spacing. Helene Wanske defines the requirements
for spacing as follows [Wan88]:
5. The overall impression of a score page shall be smooth; there shall be neither
"black clusters" nor "white holes". Any "surprise effects" shall be prevented. Nev-
ertheless, the disposition of notational elements on one page shall follow eco-
nomic considerations.
These rules are ranked according to their importance for music notation. They are
definitely "fuzzy", and it is generally not clearly decidable whether one line of music
is spaced "well" or "better" than another. During the work on this thesis, the
primary goal was the automatic calculation of spacing as it is found in "real" scores,
which have been produced by human engravers. In most cases, different editions
of one piece of music can be found; often, these editions differ in the musical
font used and almost certainly in their spacing. Therefore, a secondary goal was
the ability to adjust the spacing algorithm to suit the user's needs. One thing that
needs to be kept in mind is that different readers may require different scores: it
is probably much easier for a professional musician to read a tightly spaced score,
whereas someone learning to read music notation is clearly better off with loosely
spaced music.
In conventional music notation, the space after a notehead (or a rest) depends pri-
marily on the duration of the event. Traditionally, these relationships are specified
with respect to the note durations 1/16 : 1/8 : 1/4 : 1/2. Different engravers have used
[10] Wanske gives several examples where a spacing that is calculated strictly according to the
given rhythmical structure may lead to lines that appear to be unequally spaced. A line of music is
"optically balanced" if equal note durations are perceived to be equally spaced by a human reader.
slightly different relationships between note duration and space. Helene Wanske,
who gives a thorough[11] overview of (human) spacing, specifies a relationship of
1 : 1.4 : 1.8 : 2.2, meaning that an eighth note is followed by 1.4 times more space
than a sixteenth note [Wan88]. When the values are calculated for a fixed distance of
5 mm for the space after a sixteenth note, the resulting music notation can be seen
in Figure 4.8. These values give only a rough guideline for the actual engraving of
music: in most cases, additional symbols (like for example accidentals) or the need
to vertically align different voices leads to different amounts of white space af-
ter noteheads or rests. Additionally, when spacing different voices with unorthodox
note durations (for example a triplet versus a quintuplet), the individual tuplet
groups should be spaced evenly (this is a consequence of Wanske's third spacing
rule).
Automatic spacing of a line of music has been discussed quite often in the past
[Byr84, Gou87, BH91, HB95, Gie01]. The most important contribution to the prob-
lem was made by Gourlay in 1987. His spacing model, which in turn was inspired by
Donald Knuth's work on TeX [Knu98], apparently forms the basis for a large number
of the currently implemented notation systems.[12] His model can be described as a
spring-rod model:[13] a line of music is interpreted as a collection of springs with certain
spring constants, on which a force is exerted to stretch or shrink the line to a desired
length. Rods are introduced to ensure that noteheads and musical markup (like for
example accidentals) do not collide, especially when spacing is tight. The rods are
used to pre-stretch the springs to a guaranteed minimal extent. The issues found in
formatting music are more complex than those of formatting text: because si-
multaneous voices must be vertically aligned, a spacing algorithm must be capable
of handling overlapping note durations in different voices while still maintaining a
spacing that closely follows individual note durations.
[11] The section on spacing covers 46 pages (pages 107–153)!
[12] Obviously, it is not really possible to determine the implemented spacing algorithm for software that
is distributed without the source code.
[13] The original article by Gourlay does not use the word "springs" but speaks of boxes and glue,
which in turn was inspired by TeX. We find that the term "spring" more closely corresponds to the
physical model being used.
88 CHAPTER 4. CREATING THE CONVENTIONAL SCORE
[Table 4.1: the same line of music spaced at the forces 0, 350, 690, and 850]
The basic idea of Gourlay's algorithm can be summarized as follows: a single line
of a score is interpreted as a list of onset times, sorted by time position; an
onset time occurs whenever a note or rest begins in one of the voices. In order to de-
termine the correct amount of space to put between the onset times, springs are
inserted between successive onset times. In order to stretch (or shrink) the line to
its desired length, a force is exerted on the whole collection of springs. Each spring
is then stretched (or shrunk) depending on the external force and the individual
spring constant, which mainly depends on the spring duration14 and on the dura-
tion of the notes (or rests) present at the time position of the spring. Hooke's Law
describes how a force F, an extent x, and a spring constant c are related: F = c · x.
To avoid collisions of noteheads, accidentals, and other musical markup, rods are
introduced, which determine the minimum stretch for one or more springs. The
spring-rod-model generally agrees very well with Wanske's spacing rules, which
were defined at the beginning of this section. Almost all essential requirements
for spacing are automatically fulfilled when the model is used.
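As a toy illustration of this spring-rod behavior (a Python sketch, not the thesis implementation; all names are ours), each spring can be modeled by its constant c and the pre-stretched extent x_p that the rods enforce; under an external force F, a spring's extent is max(x_p, F/c), and the line's width is the sum of all spring extents:

```python
# Toy sketch of the spring-rod model: each spring has a constant `c`
# (Hooke's law F = c * x) and a rod-enforced pre-stretched extent `xp`.
def spring_extent(c, xp, force):
    """Extent of one spring under an external force.

    The rods guarantee the pre-stretched extent xp; the force only
    stretches the spring further once it exceeds the saved force c * xp.
    """
    return max(xp, force / c)

def line_width(springs, force):
    """Total width of a line of music: sum of all spring extents."""
    return sum(spring_extent(c, xp, force) for c, xp in springs)

springs = [(1.0, 2.0), (2.0, 0.5), (4.0, 1.0)]  # (c, xp) pairs
print(line_width(springs, 0.0))   # only the rod pre-stretch: 2.0 + 0.5 + 1.0
print(line_width(springs, 8.0))   # all springs stretched past their rods
```

With force 0 the rods alone determine the width; as the force grows, one spring after another is stretched beyond its pre-stretched extent, exactly as described for Table 4.1.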
Table 4.1 shows a short musical example together with the associated springs and
rods. The springs are shown above the line of music, the rods below. The table
demonstrates how different forces are applied to the same line of music. When the
applied force is zero, the rods prevent the notated symbols from colliding horizontally.
As the force increases, more and more springs are stretched further than the extent
14 The "spring duration" is the temporal distance of the two onset times in between which the spring is placed.
4.5. SPACING, LINE BREAKING, AND PAGE FILLING 89
of the rods. At Force 690, all springs are stretched further than the pre-stretch
induced by the rods.
The spring-rod-model also works very well for multi-voiced music: because rods
are created separately for each voice (or staff), notational elements of different
voices/staves may overlap horizontally, as long as the extent of the combined
springs is at least as wide as the rods require.

[Figure 4.10: Connected springs. Springs in series with constants c_1, c_2, c_3 combine to c = 1/(1/c_1 + 1/c_2 + 1/c_3) = 1.5; stretching them to x = 10 cm requires the force f = c · x = 15.]
This means that the spring will only be stretched further by an external force F_e if
F_e is greater than the pre-stretching force F_p. Otherwise, the extent simply remains
the pre-stretched extent x_p.
Therefore, when dealing with a series of springs from a line of music, the overall
spring constant depends on the applied force and on the pre-stretched springs: only
those springs whose saved force is smaller than the applied force are stretched
further, and therefore the overall spring constant must be computed using only these
springs. The other springs simply add their pre-stretched extent.
The Space-Force-Function
To efficiently compute the force required to stretch a line of music, a so-called space-
force-function was defined and implemented during the work on this thesis. This func-
tion calculates the force required to stretch a line of music to a given extent; evi-
dently, the pre-stretching of springs must be taken into account. The space-force-function
introduced here is used extensively in line breaking and page filling, which
are described at the end of this chapter.
Given a set of springs S = (s_1, ..., s_n) with respective spring constants c_i and pre-
stretch x_i for 1 ≤ i ≤ n, the space-force-function σ_s : extent → force calculates the
required force to stretch the springs to a given extent:

    σ_s(x) = c · (x − x_min) = f   with   c = 1 / Σ_{1≤i≤n, c_i·x_i ≤ f} (1/c_i)   and   x_min = Σ_{1≤i≤n, c_i·x_i > f} x_i     (4.2)
It is obvious that σ_s in equation (4.2) cannot be computed directly, because the exact
values for c and x_min that are required to calculate σ_s can only be determined
once the overall result for σ_s is already known. In order to compute the
function, a pre-processing step is performed, which sorts the set of springs by their
respective pre-stretching forces. Additionally, the sum of all pre-stretched spring
extents is computed. Then the calculation of σ_s simply requires a traversal of the
ordered set: Let S = (s_1, ..., s_n) be a set of springs with their respective spring
constants c_i, pre-stretching forces f_i, and pre-stretching extents x_i = f_i/c_i (for
1 ≤ i ≤ n). Let S be ordered so that f_i ≤ f_j for all i ≤ j. Algorithm 1 computes the
space-force-function by traversing this ordered set.
[Figure: extent as a function of force for the space-force-functions of 1/8, 1/16, and 1/32 note springs, with the forces f = 400, f = 800, and f = 1200 marked]
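The sorted traversal just described can be sketched as follows (a minimal Python reading of the idea behind Algorithm 1; function and variable names are ours, and a target below the rod-enforced minimum is simply clamped to force zero):

```python
def space_force(springs, target_extent):
    """Force needed to stretch springs (in series) to target_extent.

    springs: list of (c, xp) pairs with spring constant c and rod
    pre-stretch xp.  A spring only stretches past xp once the force
    exceeds its saved force f_i = c * xp.
    """
    if target_extent <= sum(xp for _, xp in springs):
        return 0.0          # the rods alone already provide this extent
    # Pre-processing: sort by the pre-stretching force f_i = c_i * x_i.
    ordered = sorted(springs, key=lambda s: s[0] * s[1])
    rest = sum(xp for _, xp in ordered)  # extent of still-blocked springs
    inv_c = 0.0                           # sum of 1/c_i of active springs
    for c, xp in ordered:
        f_i = c * xp
        # Stop once the target is reached before the next threshold f_i.
        if inv_c > 0 and rest + f_i * inv_c >= target_extent:
            break
        # Activate this spring: it now stretches instead of sitting at xp.
        rest -= xp
        inv_c += 1.0 / c
    # Solve rest + f * inv_c = target_extent for the force f.
    return (target_extent - rest) / inv_c
```

For example, `space_force([(1.0, 2.0), (2.0, 0.5), (4.0, 1.0)], 14.0)` yields the force 8.0, at which all three springs are stretched past their rods.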
where a is some constant (usually in the range of 0.4 to 0.6), and d_min is the
smallest duration for which the required space is predefined. In current notation
software (for instance, Finale or LilyPond) quite a number of minor varia-
tions of this scheme are actually implemented (see for example [Gie01]), but the
general principle of using a logarithmic function for calculating the required space
is widely agreed upon. Tweaking the parameters a and d_min results in slightly
different spacing, and it is usually a matter of personal (or publisher's) taste which
values to use.
The formula for σ(d) is also used to determine the spring constants for a given line of
a score: Let s be a spring between two successive onset times o_1 and o_2. The resulting
spring duration d_s is simply d_s = timeposition(o_2) − timeposition(o_1). Let d_i be the
shortest duration of all notes (or rests) beginning or continuing at timeposition(o_1).
Then the spring constant c is calculated as

    c = (d_i / d_s) · 1 / (σ(d_i) · space(d_min))     (4.4)
Gourlay introduced this formula in order to obtain optimal spacing even when rhyth-
mically complex structures (like tuplets versus triplets) are present. In the case of
monophonic pieces, the formula simplifies to

    c_simple = 1 / (σ(d_s) · space(d_min))

because d_i = d_s. Using this simple spring constant in Hooke's Law with a force
F = 1 and a duration d_s equal to d_min stretches the spring to the extent

    x = F/c = 1 · σ(d_min) · space(d_min) = space(d_min)

which is exactly the expected result.
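This calculation can be checked numerically (a sketch under assumptions: the thesis's equation (4.3) is not reproduced in this excerpt, so we assume the usual Gourlay-style logarithmic factor σ(d) = 1 + a·log2(d/d_min), which satisfies σ(d_min) = 1; the constant names are ours):

```python
import math

A = 0.5            # typical value for a, usually between 0.4 and 0.6
D_MIN = 1 / 32     # smallest duration with predefined space
SPACE_MIN = 1.0    # space(d_min), in arbitrary units

def sigma(d):
    """Assumed logarithmic space factor, normalized so sigma(D_MIN) = 1."""
    return 1.0 + A * math.log2(d / D_MIN)

def spring_constant(ds, di):
    """Spring constant per equation (4.4): ds is the spring duration,
    di the shortest note sounding at the spring's start."""
    return (di / ds) / (sigma(di) * SPACE_MIN)

# Monophonic sanity check: with di == ds == D_MIN and force F = 1,
# the extent F / c equals exactly space(d_min).
c = spring_constant(D_MIN, D_MIN)
print(1.0 / c)     # -> 1.0, i.e. SPACE_MIN
```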
As indicated above, a number of variations of Gourlay's algorithm have been pro-
posed. The most recent publication on spacing a line of music is by Haken and Blostein
[HB95]. They were the first to actually use the terms "springs" and "rods". Even
though their terminology is different, the general ideas of Gourlay are present. The
only significant difference lies in the handling of rods, which are called "blocking
widths" by Gourlay. Haken/Blostein introduced an elegant way to pre-stretch the
springs using a two-step process: first, only those rods are considered that
stretch a single spring. Then the remaining rods (which all span more than one spring)
are checked. In this phase, the force required to stretch a chain of springs to the de-
sired rod length is calculated. The rod requiring the maximum force is applied first;
other rods stretching the same springs can be discarded. This procedure is more effi-
cient than Gourlay's original algorithm, which simply iterates through all rods without
pre-sorting.
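The two-phase rod handling can be sketched as follows (a simplified Python reading of the Haken/Blostein idea, not their implementation; for brevity, phase two ignores pre-stretch already contributed by single-spring rods):

```python
def prestretch(consts, rods):
    """Two-phase rod handling (after Haken/Blostein, simplified).

    consts: spring constants c_i.
    rods:   (i, j, width) tuples forcing springs i..j-1 to span at
            least `width` together.
    Returns the per-spring pre-stretch extents x_i.
    """
    xp = [0.0] * len(consts)
    # Phase 1: a rod covering a single spring sets its pre-stretch directly.
    for i, j, width in rods:
        if j == i + 1:
            xp[i] = max(xp[i], width)

    def required_force(i, j, width):
        # Force stretching the chain i..j-1 (in series) to the rod width.
        inv_c = sum(1.0 / consts[k] for k in range(i, j))
        return width / inv_c

    # Phase 2: apply the multi-spring rod needing the maximum force first;
    # weaker rods over the same springs are then already satisfied.
    multi = [r for r in rods if r[1] > r[0] + 1]
    for i, j, width in sorted(multi, key=lambda r: -required_force(*r)):
        f = required_force(i, j, width)
        for k in range(i, j):
            xp[k] = max(xp[k], f / consts[k])
    return xp
```

For example, a rod of width 4.0 over two springs of constant 1.0 requires the force 2.0 and pre-stretches each spring to the extent 2.0.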
GNU LilyPond, a music typesetting system based on TEX [NN01], also uses
Gourlay's algorithm for spacing a line of music. One interesting detail of LilyPond's
spacing is the fact that the value of d_min in equation (4.3) is determined separately
for each measure. This sometimes leads to uneven spacing of measures, which
can be musically misleading, as shown in Figure 4.12, where the general rhythm
of measures one and two is quite similar but the spacing is very uneven.15
It is surprising to find that Gourlay's algorithm (and therefore all of its direct descen-
dants) fails to produce good spacing in certain situations. These de-
ficiencies stem from the fact that sequential notes of equal duration
may be spaced unevenly because of simultaneous notes in other voices. As stated
above, the coupling of note duration and spacing is very strong; therefore notes of
equal duration should be spaced equally whenever possible.17 Because these cases
do occur in real scores (and also in computer generated music), a solution to the
problem was sought and found. It led to an improved algorithm
for optimally spacing a line of music that automatically detects and corrects the
above-mentioned spacing errors. First, groups of notes with equal duration are de-
termined in the score. Then, for each such group, it is checked whether the original
Gourlay algorithm creates spacing errors. If this is the case, an alternate spacing is
calculated and applied if certain tolerance conditions are met.
Figure 4.14 shows how the improved spacing algorithm (b) compares to Gourlay's
original algorithm (a): the spacing of the triplet in the first voice is exactly equal
when the improved algorithm is used. This effect is highly desirable, because the
graphical appearance of the triplet now directly conveys that each individual triplet
note has equal duration. When the original spacing algorithm (a) is used, the space
between the first and the second note of the triplet is significantly greater than
that between the second and the third note, even though their durations are equal. We call
this error a "neighborhood spacing error".
Figure 4.14: Gourlay's original algorithm (a) compared to the improved spacing
algorithm (b)
The reason for the unequal spacing can be seen rather easily: Figure 4.15 shows a
rhythmically simpler example together with the created springs and their respec-
tive calculated spring constants. In Figure 4.15 (a), the first triplet note spans two
springs (s_1 and s_2) with spring constants c_1 = (1/16)/(1/16) · 1/σ(1/16) = 1/σ(1/16)
and c_2 = (1/12)/(1/48) · 1/σ(1/12) = 4/σ(1/12). When s_1 and s_2 are combined into a
single spring s_{1,2}, the spring constant is c_{1,2} = 1/(1/c_1 + 1/c_2) =
1/(σ(1/16) + (1/4) · σ(1/12)).
17 This can be directly derived from Wanske's third spacing rule of Section 4.5.1 on page 86. When dealing with scores that contain rhythmic complexity, this rule may have a rather low priority. Nevertheless, it should be applied wherever possible.
(a)
Spring number          1     2     3     4
Spring duration (d_s)  1/16  1/48  1/12  1/12
Smallest duration (d_i) 1/16 1/12  1/12  1/12
Spring extent          4.45  1.25  5.0   5.0

(b)
Spring number          1     2     3     4     5     6
Spring duration (d_s)  1/16  1/48  1/24  1/24  1/48  1/16
Smallest duration (d_i) 1/16 1/16  1/16  1/16  1/16  1/16
Spring extent          4.33  1.45  2.89  2.89  1.45  4.33

Figure 4.15: Spacing of Gourlay's algorithm including springs
It is obvious that the spring constant c_{1,2} is not equal to c_3 = c_4 = 1/σ(1/12).
Therefore, when a force F stretches the line, the springs s_1 and s_2 together are
stretched to the extent x_{1,2} = F/c_{1,2} = F · (σ(1/16) + (1/4) · σ(1/12)), and the
springs s_3 and s_4 are both stretched equally to x_3 = x_4 = F · σ(1/12). Clearly,
x_{1,2} and x_3 (and x_4) are not equal.
To understand why this error does not occur in the standard three-against-four case
(see Figure 4.15 (b)), one must understand how the fraction d_i/d_s in equation (4.4)
is supposed to work: consider Figure 4.15 (b), which differs from (a) by replacing the
second voice with four sixteenth notes. Now the value of d_i for each of the springs is
1/16, so that each spring is stretched as a fraction of a sixteenth-note spring. To clarify
this, consider springs s_2 and s_3 of Figure 4.15 (b). The respective spring constants
are c_2 = (1/16)/(1/48) · 1/σ(1/16) = 3/σ(1/16) and c_3 = (1/16)/(1/24) · 1/σ(1/16) =
(3/2)/σ(1/16). When springs s_2 and s_3 are combined, the spring constant is

    c_{2,3} = 1 / ((1/3) · σ(1/16) + (2/3) · σ(1/16)) = 1/σ(1/16)
The result is that the springs s_2 and s_3 are stretched so that their sequential com-
bination behaves just like a single spring for one sixteenth note. The same is true
for springs s_4 and s_5. Consequently, the springs s_1, s_{2,3}, s_{4,5}, and s_6 all have a
spring constant of 1/σ(1/16), so that the sixteenth notes of the second voice of
Figure 4.15 (b) are spaced evenly. Regarding the first voice of Figure 4.15 (b), the
respective spring constants c_{1,2}, c_{3,4}, and c_{5,6} are all equal to (3/4) · 1/σ(1/16),
so each note gets the same amount of space.
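The series combination used above can be checked numerically (values from Figure 4.15 (b); σ(1/16) cancels out, so any positive stand-in works):

```python
# Numeric check of the series combination from Figure 4.15 (b).
sigma_16 = 2.7               # arbitrary positive stand-in for sigma(1/16)

def combine(*cs):
    """Combined constant of springs in series: 1/c = sum of 1/c_i."""
    return 1.0 / sum(1.0 / c for c in cs)

c2 = 3.0 / sigma_16          # (1/16)/(1/48) * 1/sigma(1/16)
c3 = 1.5 / sigma_16          # (1/16)/(1/24) * 1/sigma(1/16)
c23 = combine(c2, c3)
print(abs(c23 - 1.0 / sigma_16) < 1e-9)  # behaves like one 1/16 spring
```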
One other spacing problem of Gourlay's algorithm concerns lines of music that are
spaced very loosely. Consider Figure 4.16, where a single line of music has been
stretched rather heavily. In this case, Gourlay's algorithm (a) results in unequal
spacing of the eighth notes in the second voice: because of the 32nd notes in the
first voice, the first eighth note of the second voice is followed by much more space
than the following ones, which are all spaced equally. The improved algorithm (b)
distributes the space evenly.
Figure 4.16: Loose spacing: Gourlay algorithm (a) and improved algorithm (b)
is extended to also cover the new neighborhood. This step is necessary so that
fixing a spacing error in one of the error regions does not introduce a new
spacing error in another voice.
5. For the finally determined error regions of step (4), three spring constants are
calculated: the spring constant for the original Gourlay algorithm, the spring
constant using an average of the d_i, and a spring constant using the
minimum d_i for the whole region.
6. The difference between the original Gourlay constant and the average constant is
determined. If it is within a tolerance band (less than 0.05), then the average
value for d_i is taken for the region. Otherwise, the difference between the origi-
nal constant and the minimum-d_i constant is calculated. If this difference is
within another tolerance band (less than 0.17), then the minimum duration of
the region is chosen as d_i. Otherwise, the original Gourlay value for d_i is chosen
for the region.
7. Before a line is finally spaced, an additional step is performed to avoid the
wrong spacing of loosely spaced lines. It is determined whether
any note with the smallest note duration of the whole line is followed by more
than two noteheads of white space using the calculated spring constant. If this
is the case, the smallest note duration of the whole line is taken as the value for
d_i and the spring constants for the line are recalculated accordingly.18
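The selection rule of step 6 can be sketched in a few lines (the tolerance bands 0.05 and 0.17 are the experimentally chosen values quoted below; the function and parameter names are ours):

```python
def choose_region_di(d_gourlay, c_gourlay, c_average, c_minimum,
                     d_average, d_minimum):
    """Pick the d_i used for an error region (sketch of step 6).

    c_*: the three candidate spring constants for the region.
    d_*: the corresponding d_i values.
    """
    if abs(c_gourlay - c_average) < 0.05:   # first tolerance band
        return d_average
    if abs(c_gourlay - c_minimum) < 0.17:   # second tolerance band
        return d_minimum
    return d_gourlay                         # keep the original value
```

With the constants from the trace below (1.81665 vs. 1.81631), the first band applies and the averaged duration is chosen.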
The values for the tolerance bands in step 6 of the algorithm have been chosen
through various experiments. Figure 4.17 shows an example of taking the average
(a), the minimum (b), and the original Gourlay value (c). Although the neighborhood
spacing error in the first voice is removed in cases (a) and (b), choosing the average
value results in too little space for the short notes at the beginning of the second
voice. Choosing the minimum duration (b) results in a spacing that is too wide.
In this example, the original Gourlay algorithm (c) gives the best result, although
the neighborhood spacing error in the first voice is clearly visible. The tolerance
bands indicate when to tolerate a neighborhood spacing error so that the
overall spacing remains acceptable.
To see how the algorithm solves the neighborhood spacing problem of the last sec-
tion (see Figure 4.15 (a)), we give a trace of the improved algorithm.
1. The original Gourlay algorithm is used to determine the values d_i for each
spring: 1/16, 1/12, 1/12, 1/12. These values can be found in Figure 4.15 (a).
2. The first voice has a neighborhood list of (1, 3, 4, 5), which means that the notes
covering springs 1 up to 3 (not including the last value), 3 to 4, and 4 to 5 are
successive notes having the same duration (in our example 1/12). There is
no neighborhood list for the second voice.
18 This step was performed to obtain the result of Figure 4.16 (b).
3. The neighborhood list (1, 3, 4, 5) is checked for spacing errors: the average for
the first note (spanning springs 1 and 2) is (16 + 12)/2 = 14; the average for the
second and the third note is 12 in both cases. Therefore, a spacing error is detected.
4. Because there are no more neighborhoods, the error region reaches from spring
1 up to spring 5.
5. The three spring constants are calculated: s_gourlay = 1.81665, s_minimum = 1.60325,
s_average = 1.81631.
6. The difference |s_gourlay − s_average| = 0.00034 is within the tolerance band. There-
fore, the average value d_i = (16 + 12 + 12 + 12)/4 = 13 is chosen; spacing is done as
if the smallest duration were a thirteenth note.
The resulting spacing can be seen in Figure 4.18 (a) and (b), which also shows a
comparison to the original Gourlay algorithm (c).
Spring number          1     2     3     4
Spring duration (d_s)  1/16  1/48  1/12  1/12
Average duration (d_i) 1/13  1/13  1/13  1/13
Spring extent          3.93  1.31  5.24  5.24
To show that the algorithm does not only concern somewhat constructed examples, Fig-
ure 4.19 shows a two-measure excerpt from a fugue by J. S. Bach (BWV 856) as it is
spaced by Finale (a), by the improved algorithm (b), and in the Henle Urtext edition
(c), which was spaced by an expert human engraver [vI70]. It is clearly visible
that (b) and (c) strongly emphasize the equal spacing of the eighth notes in the
bass voice of the second measure, whereas the spacing algorithm of Finale19 spaces
19 Finale spaces this example similarly to Gourlay's algorithm.
the first eighth note tighter than the following two. Of course, it can be argued
whether the spacing of (b) and (c) is better than that of (a); it is surely a mat-
ter of personal taste and also of musical semantics which spacing is preferred. The
point made here is that any spacing algorithm should be adjustable to accommodate
personal preferences.
Figure 4.19: Bach fugue BWV 856: Finale (a), improved algorithm (b), hand engraver (c)
2. If the piece covers more than one page, the individual lines should be created so
that the score covers all pages (including the last one) completely.
3. The locations where page turns occur should be musically sensible: positions with
less activity should be preferred over passages where many notes are being
played.
While the last of these points is far from being solved by any automatic system for
music notation, the other two can be tackled. As Hegazy and Gourlay have pointed
out in [HG87], the process of optimal line breaking in music is strongly related to
line breaking in conventional text typesetting. The most relevant research in this
area has been carried out by Donald Knuth [Knu98], and its results are part of the
TEX typesetting system. In music typesetting, the work of Hegazy and Gourlay is the
first to cover the issue. Their ideas are derived from Knuth, but as will be shown
in the following section, the problem is a little more complex in music typesetting
than it is when setting text. Although the issue of page filling (the second item in
the above list) is related to optimal line breaking, there exists no publication on the
subject, even though the music notation system SCORE by Leland Smith [Smi97]
contains an (unpublished) optimal page-fill algorithm. While working on this thesis,
an algorithm for optimal page filling was developed. Because it is an extension of the
optimal line breaking algorithm proposed by Hegazy and Gourlay, we first give
an overview of Knuth's original algorithm for line breaking in text; then the Hegazy-
Gourlay algorithm for optimal line breaking in music is presented. The new
optimal page-fill algorithm is presented in a separate section below.
The penalty is zero if the words can be set without stretching or compressing the
spaces. The penalty becomes infinite if there is no space left between the words of the line.21
20 Knuth's algorithm deals with stretching and compression of space differently. His algorithm obviously also deals with hyphenation.
21 Knuth's penalty function is more complex; for reasons of simplicity, the above definition suffices.
102 CHAPTER 4. CREATING THE CONVENTIONAL SCORE
The objective is to find a break sequence B_min = {b_0, b_1, ..., b_k} so that the sum of
the penalties of all k lines of the paragraph is minimized:

    for all break sequences B:   P(B_min) = Σ_{i=0}^{k−1} P_{b_i + 1, b_{i+1}} ≤ P(B)
Given n words, there are 2^(n−1) possible ways to break them into a paragraph.22 It is
therefore prohibitive to simply check all possible combinations. In reality, only a few
break sequences give good results, meaning that they have a low overall penalty.
Note that it is not sufficient to solve the problem with a simple first-fit algo-
rithm, which would begin at the first line and make choices as it sets each
line: this approach can easily lead to line breaks that do not result in an overall
minimum penalty; sometimes it is essential to set some earlier lines with a higher
penalty so that later lines can be set with a lower penalty, thus reducing the
overall penalty of the paragraph.
For the algorithm, Knuth introduced the concept of "feasible breakpoints": a feasi-
ble breakpoint is a possible entry b_i in the final sequence B_min with the restriction
that there exists a way to set the paragraph from the very beginning to the word
w_{b_i} so that the individual penalties of all lines of the thus created
subparagraph are within an allowed penalty range. When the algorithm runs, a
list of "active breakpoints" (called the active list) is maintained: this list contains those
feasible breakpoints that might be chosen for future breaks.
Algorithm 2 shows pseudo-code for Knuth's original optimal line break algorithm.
The algorithm commences by adding the very beginning of the paragraph to the
active list.23 Then all potential breakpoints (in our simplified case,
the breaks between words) are subsequently checked against each break posi-
tion in the active list. If there exists a way to set the line from an active breakpoint
a to a potential breakpoint b so that the penalty of that line is within the allowed
range, b is added to the active list. The algorithm also remembers which of the pos-
sibly multiple existing active breakpoints a gives the lowest overall penalty when
setting the paragraph up to w_b. The best predeces-
sor is remembered so that the break sequence can be retrieved later. If a potential
breakpoint b is encountered for which an entry a in the active list returns an infi-
nite penalty (meaning that the words from a to b cannot be placed on one line),
then a is removed from the active list.
The algorithm finishes by choosing the feasible breakpoint for the last word of the
paragraph that has the lowest overall penalty. Because the predecessor is known for
each feasible breakpoint, all previous breakpoints can subsequently be accessed.
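The scheme can be sketched as a dynamic program (a strongly simplified Python reading of Knuth's idea, not his algorithm: no hyphenation, no separate stretch/shrink, every line including the last is justified, and the penalty is simply the squared leftover space):

```python
import math

def optimal_breaks(word_lengths, line_width, space=1.0):
    """Minimum-total-penalty line breaking via dynamic programming."""
    n = len(word_lengths)
    # prefix[j] = length of words 0..j-1, plus one space after each word
    prefix = [0.0] * (n + 1)
    for j, w in enumerate(word_lengths):
        prefix[j + 1] = prefix[j] + w + space

    def penalty(i, j):
        """Penalty of a line holding words i..j-1."""
        natural = prefix[j] - prefix[i] - space   # no trailing space
        if natural > line_width:
            return math.inf                        # words do not fit
        return (line_width - natural) ** 2         # squared leftover space

    best = [math.inf] * (n + 1)   # best[j]: minimum penalty up to word j
    best[0] = 0.0
    pred = [0] * (n + 1)          # best predecessor breakpoint
    for j in range(1, n + 1):
        for i in range(j):
            p = best[i] + penalty(i, j)
            if p < best[j]:
                best[j], pred[j] = p, i
    breaks, j = [], n             # recover the break positions
    while j > 0:
        breaks.append(j)
        j = pred[j]
    return list(reversed(breaks))
```

Note that, unlike Knuth's active-list formulation, this sketch stores an entry per word; it illustrates the same optimality principle ("optimum breakpoints are always optimum for the subparagraphs they create").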
22 For a fixed number of lines k, there are C(n−1, k−1) ways to break the sequence of words. If the number of lines can vary from 1 to n, the resulting formula is Σ_{k=0}^{n−1} C(n−1, k) = 2^(n−1) (according to the binomial theorem).
23 This location is not really a breakpoint; it is treated as a breakpoint so that the rest of the algorithm need not deal with the first line as a special case.
The algorithm makes use of the principle of Dynamic Programming [Bel57, Pre00],
which solves an optimization problem by solving a series of carefully devised sub-
problems. Each solution is subsequently obtained by combining the solutions of one
or more of the previously solved subproblems. Dynamic Programming is applied in
two places in the line breaking algorithm: first, L_{i,j}
(the length of a subsequence of words) does not need to be recalculated for every
combination of an active breakpoint a and a potential breakpoint j. It is sufficient
to maintain a global variable L_{1,j} that stores the length of all words from the
beginning up to w_j. For each active breakpoint a, the length L_{1,a} is stored. Then
L_{a,j} can easily be calculated as L_{a,j} = L_{1,j} − L_{1,a}. The second place where Dynamic
Programming is applied is where the best previous breakpoint a is remembered
when a feasible breakpoint is added to the active list. This can be done because the
subparagraph from the beginning up to w_b using the breakpoint a has the small-
est total penalty. The line breaking of the remaining words (the lines after w_b) does not
affect the previous subparagraph. Knuth puts it into the following words: "The opti-
mum breakpoints for a paragraph are always optimum for the subparagraphs they
create". This is true even if there is a way to set the paragraph from the beginning
to w_b with a different number of lines using only feasible breakpoints. In this case,
only one previous active breakpoint a gives the lowest overall penalty – the number
of lines for the subparagraph is automatically chosen so that the global optimum is
maintained. This fact is very important when comparing text setting to music
setting: if the number of lines of a paragraph must be variable, active breakpoints
must be stored for each different line number, thereby increasing running time and
storage space. Also, even if the total number of lines does not matter, the described
mechanism only works if the remaining lines all have the same width. Otherwise, it
might be important which breakpoint occurs on which line in order to reach the global
minimum penalty. The running time of Knuth's algorithm is O(n · α), where α is the
maximum number of entries in the active list, which is simply the maximum number
of words in one line of text. As indicated above (and also described by Knuth), the
running time increases if the total number of lines needs to be adjustable and also
if the individual lines of the paragraph have differing widths.
Finding optimal line breaks in music has previously been discussed in [HG87].
Their work is very closely related to Knuth's algorithm, which was described
in the previous section. In order to adapt Knuth's algorithm for music,
some specific musical issues must be considered. Instead of dealing with a collection
of words, we now deal with a collection of system slices. A system slice (or slice for
short) is a horizontal area of the score which cannot be broken apart further. For
a first understanding of the problem, it is sufficient to treat system slices simply
is minimal. Note that b_0 = 0 for every sequence B. The resulting number of lines is
then m.
In order to calculate the sequence B_min that minimizes P, Algorithm 3, an
adaptation of Knuth's original algorithm, is used.
Algorithm 3 is quite similar to Knuth's algorithm. Instead of maintaining an active
list, an entry is stored for each slice, which saves the optimum predecessor and the
overall penalty of breaking at that particular slice. The algorithm automatically
leads to justified lines (including the last one), because the penalty function returns
smaller penalties for full lines.
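Algorithm 3 can be sketched in the same dynamic-programming style (a simplified Python reading, not the thesis implementation: each slice is reduced to a minimum width plus a combined spring constant, the merged space-force-function is approximated by Hooke's law, and a line is penalized by the squared distance of its justifying force from a user-chosen optimum force f_opt):

```python
import math

def break_slices(slices, line_extent, f_opt):
    """Optimal line breaking for system slices (sketch of Algorithm 3).

    slices: (min_width, c) pairs; min_width is the rod-enforced minimum,
    c the combined spring constant of the slice.
    """
    n = len(slices)

    def penalty(i, j):
        """Penalty of a line made of slices i..j-1."""
        min_width = sum(w for w, _ in slices[i:j])
        if min_width > line_extent:
            return math.inf               # cannot shrink below the rods
        inv_c = sum(1.0 / c for _, c in slices[i:j])
        force = (line_extent - min_width) / inv_c   # springs in series
        return (force - f_opt) ** 2

    best = [math.inf] * (n + 1)   # per-slice entry: best penalty so far
    best[0] = 0.0
    pred = [0] * (n + 1)          # per-slice entry: optimum predecessor
    for j in range(1, n + 1):
        for i in range(j):
            p = best[i] + penalty(i, j)
            if p < best[j]:
                best[j], pred[j] = p, i
    breaks, j = [], n
    while j > 0:
        breaks.append(j)
        j = pred[j]
    return list(reversed(breaks))
```

Because lines close to the optimum force get small penalties, the algorithm prefers evenly filled, justified lines, including the last one.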
Just like the text break algorithm, the line break algorithm for music makes use of
the principles of Dynamic Programming [Bel57], because for each slice only the best
way to break the previous slices is remembered. The running time of the algorithm
is bounded by the number of slices (n, the first loop) and the maximum number of
slices per line (the maximum value of j − i for which the penalty ρ(i, j) ≠ ∞). We
denote this parameter β. For all practical purposes, we can probably assume that β < 10 for
most scores. Because the algorithm does not contain an explicit value for β, it works
for any value found in real scores. The overall running time is then O(n · β · Time(ρ)).
This shows that the running time of ρ is a crucial part of the optimal line break
algorithm.
The actually implemented algorithm has to deal with certain musical aspects: when
a line of music is broken, the following line gets a clef and a key signature. It is
therefore important to leave room (horizontal space) for these notation elements.
24 As there are instances in music notation where line breaks do occur at positions within a measure, and there are even scores with no measure information, the general algorithm simply deals with measure-independent slices.
Because the clef and the key signature can change within a piece, a pointer to the
current clef and key has to be stored with each potential break location.
The penalty function ρ is calculated as follows:
where σ_{s_i,s_j} is simply the merger of all σ_s's from slice s_{i+1} up to slice s_j, lineextent is the
desired width of a line, and f_opt is a constant value set by the user (and dependent on
personal taste). Merging the σ_s's is a somewhat expensive operation: the merging
of two space-force-functions σ_s1 and σ_s2 has a running time of O(max{|σ_s1|, |σ_s2|}),
where |σ_s| is the number of springs in the space-force-function. If β slices are
merged, each containing a maximum of γ springs, the overall cost is O((β − 1) · β · γ).
Together with the formula from above, we get a running time for Algorithm 3 of
O(n · β³ · γ). Because most of the lines will later not be used, and a complete merger
of the s ’s is so expensive, a quicker (and less exact) mechanism was actually im-
plemented: Because a s is a piece-wise linear function, it is possible to identify a
slope for each linear section. This slope directly corresponds to the added spring
constants of those springs that are active at the respective force interval. In order
to efficiently approximate the merger of different s ’s, the spring-constant for the
linear segment at the optimum force is saved. This saved spring-constant approxi-
mates the function in the neighborhood of the optimum force. An approximate merg-
ing of different s ’s is then possible by just determining the overall spring-constant
using equation (4.1), which then just requires multiplications. Figure 4.20 demon-
strates, how this optimization compares to the exact merging of s ’s. The upper part
of Figure 4.20 shows three measures of a Bach piece. Each measure is treated as a
separate slice with its own s . The spring constants at the optimum force are shown
below the scores. For the approximation, the individual spring constants are simply
multiplied using equation (4.1). To determine the approximated force required to
stretch the three measures to a desired width of 4280, Hooks’ Law is used with the
approximated values. This results in an approximated force of 533:43. The lower
part of Figure 4.20 shows the actual merged space force function for the complete
three measure line. The actual force required to stretch the system is 536:3, so the
difference from the approximate value is less than one percent! For all practical pur-
poses, it is therefore sufficient to work with the approximated values. The running
time of is thus reduced to multiplications and additions.
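The approximate merger can be reproduced numerically with the three per-measure constants quoted in Figure 4.20 (a small sanity check in Python; since the figure's values are rounded, the results only approximate the quoted numbers):

```python
# Approximate merging of space-force functions: combine the per-measure
# spring constants at the optimum force (springs in series) and apply
# Hooke's law.  Constants taken from Figure 4.20.
c_measures = [0.488976, 0.427088, 0.379205]
x_min = 533                         # summed extents at the optimum force

c_app = 1.0 / sum(1.0 / c for c in c_measures)
force = (4280 - x_min) * c_app      # force to stretch the line to 4280

print(round(c_app, 6))              # close to the quoted 0.142363
print(round(force, 2))              # close to 533.43; exact merger: 536.3
```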
To finally demonstrate how the optimal line break algorithm works on a 27-mea-
sure Bach fugue, consider Figures 4.21 and 4.22. The 27 measures are bro-
ken into 12 lines covering two and a half pages. The individual lines are spaced
evenly and the overall appearance is quite good. The overall penalty for all lines
is 1777.37, which gives an average penalty of 1777.37/12 = 148.11 per line.
The user-selected optimum force for this particular calculation was set to 800. Fig-
ure 4.22 shows the calculated array of entries. The nodes in the graph represent
ends of measures, where a line break may occur, together with the overall penalty.
108 CHAPTER 4. CREATING THE CONVENTIONAL SCORE
[Figure 4.20 (scores and space-force plots omitted): three measures of a Bach piece, each treated as a separate slice. Upper part – approximation from the saved spring constants:
c_app = 1 / (1/0.488976 + 1/0.427088 + 1/0.379205) = 0.142363
x_app = 533
Force required for width 4280: (4280 − 533) · 0.142363 = 533.43
Lower part – exact merged space-force function for the complete line:
c_opt = 0.145318
x_opt = 534
Force required for width 4280: sff(4280) = 536.3]
The edges in the graph depict penalties associated with a line beginning at the end of the measure in node 1 and ending at the end of the measure in node 2. As an example, consider the lines beginning at the very bottom of Figure 4.22: the first line of the piece begins at the end of measure 0 and goes to the end of measure 2 (penalty: 188.106), measure 3 (penalty: 291.758), or measure 4 (penalty: 835.235). Figure 4.22 shows the optimum path and the first-fit path, which is not the same: in order to demonstrate that the first-fit algorithm leads to a bigger overall penalty, the entries for measures 25 and 27 are each shown twice. The actual algorithm only saves the entries for the optimum line-breaking path. Additionally, when calculating the optimal line break, only one predecessor is saved for each node. Figure 4.22 shows more than one predecessor to demonstrate that the algorithm tries different break combinations before the optimum solution is determined. To determine the breaking sequence, the algorithm begins at the top of Figure 4.22 in the node labeled “27/1777.37”. The dashed edges are subsequently followed to produce the sequence 27, 25, 22, 20, 18, 16, 13, 11, 9, 7, 5, 3, 0. Reversing this sequence shows which measures are endings of lines.
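The backward walk over the dashed edges amounts to following predecessor links and reversing the result. A minimal sketch, assuming each node's best predecessor is stored in an array `pred` with `pred[0] == -1` for the start node:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Recover the break sequence by walking predecessor links backwards from
// the final node (measure 27 in the example) down to node 0, then
// reversing: the result lists the measures that end each line, in order.
std::vector<int> breakSequence(const std::vector<int>& pred, int last) {
    std::vector<int> seq;
    for (int j = last; j != -1; j = pred[j]) seq.push_back(j);
    std::reverse(seq.begin(), seq.end());
    return seq;
}
```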
[Score omitted: FUGA I, J.S. Bach, BWV 846, set on two and a half pages. (c) Kai Renz, musical data taken from the MuseData database and automatically converted to GUIDO.]
Figure 4.21: A Bach fugue broken with the optimal line fill algorithm
4.5. SPACING, LINE BREAKING, AND PAGE FILLING 111
[Graph omitted: nodes labeled “measure/accumulated penalty” – 0/0, 2/188.106, 3/291.758, 4/371.609, 5/441.509, 6/490.502, 7/553.791, 8/618.229, 9/650.127, 10/717.802, 11/757.413, 12/810.913, 13/886.540, 15/972.124, 16/1011.85, 17/1154.30, 18/1136.35, 19/1300.01, 20/1287.73, 21/1397.21, 22/1371.45, 23/1523.60, 25/1574.11 and 25/1752.69 (shown twice), 27/1777.37 and 27/1955.95 (shown twice) – connected by edges carrying the individual line penalties; the optimum path is drawn dashed.]
Figure 4.22: Partial calculation graph for the optimal line break of the Bach fugue
quence B_min = (b_0, b_1, …, b_m), where each b_i contains three parameters: a slice number (referred to as b_i.pos below), a position on a page (referred to as b_i.height below), and a flag, which is set if the slice is the location of a page break (referred to as b_i.pagebreak); as before, B_min should minimize the penalty function P:

∀B, B is a break sequence:  P(B_min) = Σ_{i=0}^{m−1} P_{b_i.pos+1, b_{i+1}.pos} ≤ P(B)

and additionally the following four conditions must be met:

1. ∀ 1 ≤ i < m with b_i.pagebreak = false:
   b_{i+1}.height = b_i.height + height(b_i.pos, b_{i+1}.pos)

2. ∀ 1 ≤ i < m with b_i.pagebreak = true:
   b_{i+1}.height = height(b_i.pos, b_{i+1}.pos)

3. ∀ 1 ≤ i < m:  b_i.pagebreak = true ⇔
   b_i.height + height(b_i.pos, b_{i+1}.pos) > pageheight

4. b_m.height ≥ 0.75 · pageheight
The first condition means that the height of each break location increases as long as
no page break is encountered. The second condition means that the height is reset
if a page break is encountered. The third condition states that a page break may
only occur, if the following line does not fit on the current page. The last condition
requires the final measure to end on the last 75 percent of the page.
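The four conditions can be checked mechanically for any candidate break sequence. The sketch below is illustrative only: the `Break` structure, the assumed `height(i, k)` callback, and the reading of condition 4 as b_m.height ≥ 0.75 · pageheight are my reconstructions, not the thesis' code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// One element of a page-fill break sequence (illustrative names):
struct Break {
    int    pos;       // slice number where the line ends
    double height;    // accumulated height on the current page
    bool   pagebreak; // true if a new page starts after this break
};

// Check the four validity conditions on a break sequence b_0..b_m, given
// height(i, k) = vertical space needed by a line from slice i to slice k.
// Real code would compare heights with a tolerance, not with ==.
bool isValidPageSequence(const std::vector<Break>& b, double pageheight,
                         const std::function<double(int, int)>& height) {
    for (std::size_t i = 0; i + 1 < b.size(); ++i) {
        double h = height(b[i].pos, b[i + 1].pos);
        // conditions 1 and 2: heights accumulate within a page and are
        // reset after a page break
        double expect = b[i].pagebreak ? h : b[i].height + h;
        if (b[i + 1].height != expect) return false;
        // condition 3: a page breaks exactly when the next line no
        // longer fits on the current page
        if (b[i].pagebreak != (b[i].height + h > pageheight)) return false;
    }
    // condition 4: the final line must end in the lower part of the page
    return b.back().height >= 0.75 * pageheight;
}
```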
The algorithm works by dividing a page into a distinct number of “page areas”, which are called slots. Figure 4.23 shows an example page that is divided into 12 different slots. There are three allowed ways to break the first eight measures of the Bach fugue (BWV 846) into lines. This means that each breaking sequence of Figure 4.23 contains only lines of music whose individual penalties are within the (user-defined) penalty range. At the bottom of Figure 4.23, only two lines are required to set the eight measures; here, measure eight ends in slot 5. The overall penalty of the two lines is 1734.92, which is very high – not surprising, considering how tightly the two lines are spaced. In the middle of Figure 4.23, the end of measure eight falls in slot 7; the penalty associated with this particular breaking sequence is 796.896. At the top of Figure 4.23, the end of measure eight falls in slot 9; the penalty associated with this setting is 618.229. When dealing with optimal page fill, it is important to remember each of the different possibilities to break the music, together with the associated slot. To meet this goal, the algorithm needs to deal with the heights of slices and lines. Using the required height of a collection of slices, the output slot can be computed. For each slot, only the best break sequence is saved – this is where dynamic programming is used again.
The maximum number of slots directly determines how fast the algorithm runs; the running time is bounded by O(n) times the number of slots. Essentially, the algorithm
[Figure 4.23 (scores omitted): the three allowed ways to break the first eight measures of the Bach fugue (BWV 846) into lines, with measure eight ending in slot 9 (top, penalty 618.229), slot 7 (middle, penalty 796.896), and slot 5 (bottom, penalty 1734.92).]
carries out the optimum line fill algorithm once per slot. To determine the optimum break sequence that also fills the last page completely, the algorithm simply picks a small overall penalty value from the last slice which ends on the lower part of the page. Going backwards from this entry reveals the locations for line and page breaks. Note that the algorithm determines the number of pages automatically; this is similar to the optimal line break algorithm, which determines the number of lines automatically.
The user can influence the algorithm in numerous ways: by specifying direct line or page breaks, or by discouraging or encouraging certain breaks. These flags are used directly by the penalty function and result in optimum break sequences that also respect user input.
Algorithm 4 is an extension of Algorithm 3. For each slice, there are now slots reserved for entries that save the predecessor (slice and slot), the accumulated penalty, and the accumulated height. The algorithm iterates through all slices (loop for i = 0 to n − 1). For each slice, all slot entries (loop over j) are taken as a starting point for future lines. This is quite similar to the optimal line breaking algorithm. The optimal page fill algorithm does not only compute the penalty of a potential line beginning at the end of slice i and going up until slice k; it also computes the required height. If the accumulated height of the previous ending slice (called prevheight in the algorithm) added to the height of the current line is bigger than the page height, then the new line is placed on a new page. The accumulated height is used to determine in which slot the calculated break position should be placed. Only if the calculated penalty is smaller than the slot's current value does the new value replace the old calculation; this is again an application of dynamic programming. If the penalty of the potential lines becomes too large, the next slot of the current slice is processed. When the algorithm has finished, the calculated entry array is used to retrieve the page break sequence that fills all pages. The procedure for retrieving the sequence is shown in Algorithm 5. The algorithm first looks at the lower part of the last slice: it finds the minimum penalty for the lowest quarter of the page. If no value is found (this mostly happens if less than one page of music is being set), then the lowest overall penalty is determined – this entry is then treated as in the optimal line break algorithm. Finally, the break sequence is determined by going backwards from the last entry.
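The slice-by-slot dynamic program just described can be sketched as follows. This is a simplified reconstruction, not Algorithm 4 itself: `linePenalty(i, k)` and `lineHeight(i, k)` are assumed callbacks for a line from the end of slice i to the end of slice k, and the slot mapping is my own choice.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <limits>
#include <vector>

// entry[i][s] stores the best break sequence ending a line after slice i
// with its accumulated page height falling into slot s.
struct PageEntry {
    double penalty = std::numeric_limits<double>::infinity();
    double height  = 0.0;   // accumulated height on the current page
    int    predSlice = -1, predSlot = -1;
};

std::vector<std::vector<PageEntry>> optimalPageFill(
        int n, int numSlots, double pageheight, double maxLinePenalty,
        const std::function<double(int, int)>& linePenalty,
        const std::function<double(int, int)>& lineHeight) {
    auto slotOf = [&](double h) {   // map an accumulated height to a slot
        int s = static_cast<int>(h / pageheight * numSlots);
        return s < numSlots ? s : numSlots - 1;
    };
    std::vector<std::vector<PageEntry>> entry(
        n + 1, std::vector<PageEntry>(numSlots));
    entry[0][0].penalty = 0.0; // virtual break before the first slice
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < numSlots; ++j) {            // all slots of slice i
            const PageEntry& from = entry[i][j];
            if (!std::isfinite(from.penalty)) continue; // slot unused
            for (int k = i + 1; k <= n; ++k) {          // potential lines
                double p = linePenalty(i, k);
                if (p > maxLinePenalty) continue;       // line too bad
                double h = lineHeight(i, k);
                // place the line on a new page if it no longer fits
                double newHeight =
                    from.height + h > pageheight ? h : from.height + h;
                PageEntry& to = entry[k][slotOf(newHeight)];
                if (from.penalty + p < to.penalty) {    // DP improvement
                    to.penalty = from.penalty + p;
                    to.height = newHeight;
                    to.predSlice = i;
                    to.predSlot = j;
                }
            }
        }
    }
    return entry;
}
```

The finished array is then searched bottom-up as in Algorithm 5: pick the entry with the smallest penalty whose slot lies in the lowest quarter of the last page, and follow the (predSlice, predSlot) links backwards.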
To demonstrate how the optimum page fill algorithm works, Figure 4.24 shows the Bach fugue (BWV 846) of the previous section being set automatically on exactly two pages. Figure 4.25 shows the calculated entry-array grid. The measures of the finally chosen path are highlighted. Note that the optimum line break path of the previous section (requiring 3 pages) is also present in the calculated array. Figure 4.25 also shows that the first-fit approach leads to a bigger overall penalty than the optimal line break algorithm.
[Score omitted. (c) Kai Renz, musical data taken from the MuseData database and automatically converted to GUIDO.]
Figure 4.24: The Bach fugue (BWV 846) set by the optimum page fill algorithm
[Grid omitted: entries labeled “slice/slot/accumulated penalty”, including 2/2/188.106, 3/2/291.758, 4/2/835.235, 4/4/371.609, 5/4/441.509, 6/4/669.170, 6/6/490.502, 7/4/1132.64, and 7/6/553.791.]
Figure 4.25: The calculated graph for the optimum page fill of the Bach fugue (BWV
846)
Some of the assumptions used when describing the optimal line break and page fill algorithms must be reconsidered: mainly, the assumption that slices are simply treated as measures does not hold for all types of music. There are scores where breaks within measures are necessary and lead to much better results than when only complete measures can be broken. Additionally, some music does not contain measures, or different voices/staves can contain different meters and therefore different bar lines. Then the question of detecting suitable break locations gets complicated. In rare cases, music can be constructed that contains no possible break locations at all, because there are no common onset times for all given voices (see also the AutoBreaks algorithm on page 73).
Another issue that has to be mentioned is the question of how the slices are merged and how rods are treated that lie at slice boundaries. In the case of bar lines, the issue is fairly straightforward. If other break locations are considered, it must be ensured that several slices can be merged into one line without visible disruptions. Finally, Knuth's line break algorithm has been fine-tuned over a long period of time by carrying out user experiments. For example, Knuth mentions how loose lines followed by tight lines disturb the visual appearance. When dealing with optimal line breaks, the fine-tuning of the penalty function and the settings of the optimum force and the allowed penalty for individual lines must be carefully adjusted. This certainly requires some experience with a large body of scores.
4.6 Conclusion
This chapter dealt with the process of converting an arbitrary GUIDO description into a matching graphical representation of a score. First, it was shown that some of the required music notation algorithms can be directly described as GUIDO-to-GUIDO transformations. Then, those music notation algorithms that operate directly on the Abstract Representation were presented. Next, the graphical requirements for any conventional music notation system were exemplified, and the Graphical Representation (GR), an object-oriented hierarchical structure representing all visual elements of a score, was introduced. Subsequently, the conversion of a GUIDO description (and its corresponding AR) into the matching GR was explained. Finally, an improved algorithm for spacing a line of music was presented. Additionally, a completely new algorithm for optimal page fill was introduced, which makes optimal use of all pages required by the score.
The algorithms and procedures described in this chapter show that the creation of a score is a complicated task that strongly depends on the underlying data structure. In order to create “good” music notation, great care has to be taken when designing the overall structure and the available manipulation routines. Overall, the creation of conventional scores can be managed (at least partially) by using clever
algorithms.
Obviously, some issues which are essential for creating esthetically pleasing music notation have not been discussed, although they have been – at least partially – implemented in the current system. These issues concern subjects like completely automatic collision detection (and prevention) or, for example, the exact and complete description of how to calculate beam slopes for any conceivable case. While some of these techniques have been described elsewhere (see for example [Gie01]), some have not been dealt with in the accessible literature. A system capable of automatically avoiding all sorts of collisions of musical symbols has not yet been realized, although it seems that progress using “intelligent” approaches is possible; a fruitful endeavor might be the use of random local search algorithms in conjunction with evaluation functions to avoid local collisions of symbols. It is nevertheless futile to estimate whether such a system will be available in the near or more distant future.
Chapter 5
Applications
The previous chapters dealt with the algorithms and data structures involved in the conversion of arbitrary GUIDO descriptions into conventional scores. This chapter presents the notation system that has been designed and implemented during work on this thesis. All of the ideas presented in the previous chapters are incorporated in the so-called GUIDO Notation Engine (GNE). The objective of the GNE is the conversion of an arbitrary GUIDO description into a graphical score. The GNE is realized as a platform-independent library, which offers well-defined interface functions that can be used by applications requiring conventional music notation to be displayed. The GNE has been compiled as a Dynamic Link Library (DLL) for Windows and as a shared object library for Linux. As the complete source code is written in ANSI C++, it is easily possible to compile the notation engine on other systems. Currently, three different output mechanisms for a graphical score exist when using the GNE: a score can be retrieved as a GIF picture (a bitmap format), it can be drawn directly into a window (using operating-system-dependent drawing routines), or a so-called GUIDO Graphic Stream (GGS) can be retrieved, which is a platform-independent way to describe a rendered score. The first two retrieval methods are currently available only when using the Windows DLL, while the third approach is available under Linux or Windows.

At the moment, the GNE is used in three different application contexts: in connection with a standalone notation viewer, as an online service using standard Internet services, and as a (prototypical) interactive notation editor based on Java. These applications, together with extensions made possible through them, are presented in this chapter.
5.1 The GUIDO NoteViewer

The GUIDO NoteViewer is a standalone application running on Windows PCs. As the name suggests, the GUIDO NoteViewer is not a
notation editor. The NoteViewer reads GUIDO descriptions and creates a screen display or printer output of the rendered score. The user can zoom in and out of the score; it is possible to scroll within the window and to change the displayed page number. The GUIDO NoteViewer is somewhat similar to the Acrobat Reader, which displays PDF documents but does not allow them to be changed interactively.
Figure 5.1 shows a screen shot of the GUIDO NoteViewer. The GUIDO description of the score in the lower window is visible in a text editor.
Incoming MIDI files are first converted into a GUIDO description using the midi2gmn component, which has been realized by J. Kilian [Kil99]. The flow control unit then sends the GUIDO description to the GNE (called nview32.dll), where it is prepared and converted into an internal score representation, as was shown in the previous chapters. The flow control unit also creates a window on the screen, in which the GNE directly draws the score using functions supplied by the operating system. All user interaction (like, for example, zooming or changing the page number) is handled by the flow control unit, which sends formatting information to the GNE, which then redraws the score.
Another component deals with audio playback using MIDI: the flow control unit
sends a G UIDO description to the gmn2midi module, which creates a MIDI file.
This MIDI file is then played using routines provided by the operating system.
A text editor window within gmnview.exe can be used to edit the textual GUIDO description of any displayed score. If the GUIDO description is changed by the user, the flow control unit resends it to the GNE, which then redraws the score in the respective score window. The changed description may also be saved to a file.
Printing is handled similarly to screen display: all printing commands are generated by the GNE through functions provided by the operating system.
The communication between the flow control unit and the individual modules follows a well-defined interface, which is partly described in the following subsection.
Figure 5.2: Architecture of the GUIDO NoteViewer (gmnview.exe). The flow control unit receives user input and communicates with midi2gmn (creates a GUIDO description from a MIDI file), gmn2midi (creates a MIDI file from a GUIDO description, which is then played), the notation engine nview32.dll (converts the GUIDO description into a conventional score and draws it in the score window), and a text editor window.
5.1.2 Interface Functions of the GNE
nview parse This function is used to start a new conversion process: a filename for a GUIDO description file is provided; the file is parsed by the GNE and converted into an internal score representation. Finally, an integer handle is returned to the caller. This handle is then used in all of the following routines to uniquely specify the process.
OnDraw This function is used for score display and printing. Additional formatting information is supplied by the caller: a zoom factor, scrolling parameters, a page number, and the size and handle of the display window. Additionally, a flag can be set to trigger printing or PostScript output. The GNE then draws (or prints) the score within the window (or on paper).
getNumPages This routine can be used to determine the number of score pages
that were created by the conversion process. This is important for handling the
scrolling and printing of individual pages.
Several more interface functions are provided that are used in different application contexts, which will be described later in this chapter.
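The typical call sequence of these interface functions can be sketched as follows. The object below is a purely illustrative mock in JavaScript: the real GNE is a C++ DLL (nview32.dll), and the method names, signatures, and the fixed page count used here are simplified assumptions, not the actual API.

```javascript
// Illustrative mock of the GNE interface described above.
class NotationEngine {
  constructor() {
    this.processes = new Map(); // handle -> conversion process data
    this.nextHandle = 1;
  }

  // "nview parse": start a new conversion process for a GUIDO file
  // and return an integer handle identifying that process.
  parse(guidoFileName) {
    const handle = this.nextHandle++;
    // The real engine parses the file into an internal score
    // representation; this mock just records the name and
    // pretends the conversion produced two pages.
    this.processes.set(handle, { file: guidoFileName, pages: 2 });
    return handle;
  }

  // "getNumPages": number of score pages created by the conversion,
  // needed for scrolling and for printing individual pages.
  getNumPages(handle) {
    return this.processes.get(handle).pages;
  }

  // "OnDraw": draw one page using the supplied formatting
  // information (zoom factor, page number, ...).
  onDraw(handle, { zoom = 1.0, page = 1 } = {}) {
    const proc = this.processes.get(handle);
    return `drawing page ${page} of ${proc.file} at zoom ${zoom}`;
  }
}

// Typical usage: parse once, then query and draw via the handle.
const gne = new NotationEngine();
const h = gne.parse("example.gmn");
const pages = gne.getNumPages(h);
const out = gne.onDraw(h, { zoom: 1.5, page: 1 });
```

The handle returned by the parse call is the only state the caller needs to keep; every later request names the conversion process through it.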
5.2 The GUIDO NoteServer
The GUIDO NoteServer is a free online service which converts GUIDO descriptions into images of conventional scores that can be displayed in a standard Internet browser [RH98, RH01]. Figure 5.3 shows a screen shot of the GUIDO NoteServer: on the left, the default input page is shown; on the right, the resulting score image is displayed.
Figure 5.4: Architecture of the GUIDO NoteServer. GUIDO descriptions (e.g. [ c d e f g ]) reach the server from various sources: coded by hand, converted from MIDI with midi2GMN, converted from other notation packages (such as Sibelius or Finale), from a score database or MIR system, from GuidoXML, or via a virtual keyboard. Access is provided through CGI or the Secure Socket Layer; a Perl script administrates the server, which uses the GUIDO Notation Engine through the modules GMN2GIF, GMN2EPS, and GMN2MIDI to return a GIF, EPS, or MIDI rendering of the score, communicating via the GUIDO Graphic Stream.
gifserv: this script gets a GUIDO description and additional formatting parameters and simply returns an image of the score. This service is very useful for third-party applications that do not need a complete web environment but just want to include the score image within their own environment.
midserv: this script takes a GUIDO description and returns a MIDI file, which can then be played directly on the client computer using a standard sound card.
http://www.noteserver.org/scripts/salieri/gifserv.pl?defpw=16.0cm&
defph=12.0cm&zoom=1.0&crop=yes&mode=gif&gmndata=%5B%20c%20d%20e%20%20%5D%0A
The first part is a standard web address; after the question mark follows a series of name-value pairs separated by ampersands. In the URL above, the following variables are used:
defpw=16.0cm This sets the default page width to 16 cm.
crop=yes This sets the page-adjust flag, which ensures that only the portion of a page that contains the score is returned as the image.
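Such a request URL can also be assembled programmatically. The sketch below builds the query string for the gifserv script; the parameter names defpw, defph, zoom, crop, mode, and gmndata are taken from the example URL above, while the function name and defaults are illustrative assumptions.

```javascript
// Build a gifserv request URL from a GUIDO description and
// optional formatting parameters, as in the example URL above.
function buildGifservUrl(gmndata, params = {}) {
  const base = "http://www.noteserver.org/scripts/salieri/gifserv.pl";
  const defaults = {
    defpw: "16.0cm", // default page width
    defph: "12.0cm", // default page height
    zoom: "1.0",
    crop: "yes",     // return only the portion containing the score
    mode: "gif",
  };
  // Merge defaults with caller overrides and append the GUIDO data,
  // URL-encoding every value.
  const pairs = Object.entries({ ...defaults, ...params, gmndata })
    .map(([name, value]) => `${name}=${encodeURIComponent(value)}`);
  return `${base}?${pairs.join("&")}`;
}

const url = buildGifservUrl("[ c d e ]");
```

With the default parameters this reproduces a URL of the same shape as the example above, with the GUIDO description percent-encoded in the gmndata variable.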
GetMIDIURL This function takes a GUIDO description and returns the URL that converts the given description into a MIDI file.
<html> <head>
<title>How to embed Music Notation in WEB pages
using JavaScript</title>
<script language="JavaScript">
// Base address of the GUIDO NoteServer gifserv script
var NOTESERVER = "http://www.noteserver.org/scripts/salieri/gifserv.pl";

// this function takes a GMN-string and returns the URL
// that converts it into a GIF-file
function GetGIFURL(gmnstring,zoom,pagenum)
{
    gmnstring = escape(gmnstring);
    // escape() does not encode slashes, so do it explicitly
    gmnstring = gmnstring.replace(/\//g,"%2F");
    // parameter names follow the gifserv URL shown above;
    // the name "pagenum" is an assumption
    return NOTESERVER + "?mode=gif&zoom=" + zoom +
           "&pagenum=" + pagenum + "&gmndata=" + gmnstring;
};

// This function takes a GUIDO string, accesses the
// NoteServer (address specified in the constant above)
// and then embeds the GIF-Image in the document.
function IncludeGuidoStringAsPict(gmnstring,zoom,pagenum)
{
    if (!zoom) zoom = "";
    if (!pagenum) pagenum = "";
    document.write("<img src=" +
        GetGIFURL(gmnstring,zoom,pagenum) + ">");
};
</script>
For each of the above-mentioned JavaScript routines, there exists an additional function that takes a GUIDO-URL instead of a GUIDO description. A GUIDO-URL is just a standard web address which points to a valid GUIDO description. Using this mechanism, it is possible to easily create a score display for GUIDO-based music databases.
Because the scores that need to be embedded in web pages sometimes become quite large, another approach using Java has been implemented. The developed Java Applet uses the GUIDO NoteServer to display the image of a score. The image is scrollable, and it is further planned to realize an adjustable zoom factor. Figure 5.6 shows the Java Applet in a web page.
The shown Java Applet uses version 1.1.8 of Java; therefore, it runs on almost any platform.1 The basic structure of the scroll Applet is quite simple: the CGI-based access to the NoteServer is used to retrieve a GIF image of the score. This GIF image is then displayed within the scrollable area of the window. Because Java offers easy-to-use routines for accessing Internet content, the realization of the scroll Applet was fairly easy. Because of Java security issues, the Applet must be stored on the same machine as the GUIDO NoteServer. Although it is possible to adjust the security level for an Applet, it is generally very important to ensure that only trusted Applets are allowed access to other computers on the Internet.
1 Newer versions of Java (beginning with 1.2) do not run on Macintosh computers with OS 9 and earlier. Using the older Java version ensures that the scroll applet can be used by almost all Internet users.
The keyboard Applet is an excellent tool for learning how to write music using GUIDO. Because both the score and the GUIDO description are shown in parallel, and because the text can be edited directly, the intuitive nature of GUIDO can be easily seen. By supplying buttons for the most common musical markup (like clef, key, and meter), the user directly learns how tags are used within GUIDO descriptions.
5.3 Java-Based Music Notation and Editing
The general architecture of the Java-based music notation editor is shown in Figure 5.9. The layout is a standard client-server application; in this case, the client is the Java-based music notation editor and the server is the GUIDO Notation Engine. Communication between the client and the server is done using a newly defined protocol language, which is called the GUIDO Graphic Stream. This protocol language will be described in the next subsection. Additionally, a “FontServer” has been developed: this server is needed to create images of the musical symbols that are needed for score display. Because the Java version being used for the development of the editor does not allow the usage of arbitrary fonts, the FontServer had to be developed to guarantee platform-independent notation. The FontServer not only creates the image of the symbol, but also returns information on the bounding box, which is needed for hit-testing when reacting to user input.
Two interesting features have been (partially) implemented in the notation editor: hit-testing and editing constraints. Hit-testing is the process of evaluating which element needs to respond to a mouse click. In the current implementation, a grid is used to identify those objects that might be candidates for a hit. Editing constraints define the allowed movements of musical symbols: a note head, for example, can only be placed vertically directly on a staff line or in the space in between; all other vertical positions are not allowed. Therefore, when moving a note head, it must be ensured that its final location is an allowed position.
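The note-head constraint can be illustrated with a small sketch: allowed vertical positions (on a staff line or in a space between) are integer multiples of half the staff-space, so a drag position is snapped to the nearest such multiple. The function below is illustrative only; the names and the coordinate convention are assumptions, not taken from the editor's actual code.

```javascript
// Snap a vertical pixel position to the nearest allowed
// note-head position: on a staff line or in the space between
// two lines, i.e. an integer multiple of half the staff-space.
// staffTop is the y coordinate of the top staff line and
// staffSpace the distance between two staff lines (assumed names).
function snapNoteHeadY(y, staffTop, staffSpace) {
  const halfSpace = staffSpace / 2;
  const steps = Math.round((y - staffTop) / halfSpace);
  return staffTop + steps * halfSpace;
}
```

A drag released anywhere between two allowed positions thus lands on the closer of the two, which is exactly the behavior the editing constraint demands.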
Figure 5.9: Architecture of the Java-based music notation editor. The client (the Java music notation editor) interprets GGS, displays the score, handles user interaction, and creates and sends GGS editing requests; the server (the GUIDO Notation Engine) converts GUIDO descriptions into GGS, sends and interprets GGS, and updates the GUIDO description. A FontServer provides images of music notation elements on request. Client and server communicate over the Internet using the GUIDO Graphic Stream.
\unit<25>
\open_page<4979,7042>
\draw_staff<1,5,474,674,594>
\draw_image<"treble_clef",2,417,824>
\draw_image<"ledger_line",3,444,724>
\draw_image<"qnotehead",3,495,724>
\draw_stem<3,525,724,175>
\draw_image<"qnotehead",4,686,699>
\draw_stem<4,716,699,175>
\draw_image<"qnotehead",5,877,674>
\draw_stem<5,907,674,175>
\draw_image<"endbar",6,1028,674>
\close_page
A central concept of the GGS is the use of unique (integer) identifiers, which are used to group and identify individual graphical elements. Different GGS commands may use the same id to specify that all graphical elements using this id are treated as a unit. This can be seen in Figure 5.10, where the individual notes are created from note heads and stems, which share the same id. The id is also used for sending editing instructions from the client to the server. An editing instruction might look like nmove<4,725,599>, which instructs the server to move the element with the id 4 to the given position.
Currently, only a small subset of the GGS is actually implemented in the GUIDO Notation Engine; it is planned to realize complete GGS output for all implemented notational elements. A fully implemented GGS would directly result in a completely platform-independent music notation viewer (and editor).
The conversion of a GGS description into another document-oriented format, like (Encapsulated) PostScript or PDF, is very easy, because it merely requires a one-to-one conversion of GGS commands into other drawing commands.
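Because each GGS command in Figure 5.10 has the simple shape of a backslash-prefixed name followed by an optional comma-separated argument list in angle brackets, a client can tokenize the stream with a few lines of code. The parser below is a sketch based only on the commands shown in that figure; the grammar it assumes is inferred from them, not taken from an official GGS specification.

```javascript
// Parse a single GGS command such as
//   \draw_image<"qnotehead",3,495,724>
// into { name, args }, where numeric arguments become numbers
// and quoted arguments become strings.
function parseGgsCommand(line) {
  // command name, optionally followed by <arg,arg,...>
  const m = line.trim().match(/^\\([a-z_]+)(?:<(.*)>)?$/);
  if (!m) throw new Error("not a GGS command: " + line);
  const [, name, argString] = m;
  const args = argString === undefined ? [] :
    argString.split(",").map(a => {
      a = a.trim();
      return a.startsWith('"')
        ? a.slice(1, -1)   // quoted string argument
        : Number(a);       // numeric argument
    });
  return { name, args };
}
```

Applied line by line to the stream in Figure 5.10, this yields one command object per drawing instruction, and commands without arguments (such as \close_page) produce an empty argument list.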
5.4 GUIDO and Music Information Retrieval
In [HRG01], it was shown that GUIDO is an ideal candidate for structure-based MIR. The (prototypical) system which was developed allows searching for a musical melody (or fragments of one) in a body of scores that are stored as GUIDO descriptions in a database. The search is specified using an enhanced form of GUIDO. If a match is found in the database, the score can be retrieved with the locations of the matches highlighted. This highlighting is made possible by an enhancement of the GUIDO NoteServer: the interface was extended to offer the possibility of marking certain fragments of a score. An example of such a highlighted score can be seen in Figure 5.11. On the left, the search phrase [ g f e d ] can be seen; on the right, a score is displayed in which the found search pattern is shown in red.
By using the GUIDO NoteServer to display the result of the query, all of its existing features can be fully exploited. The user can listen to the music and scroll to other parts of the score. It is also possible to embed the image of the score in other web pages or in research papers.
5.5 Conclusion
The applications presented in this chapter give an overview of the variety of music notation applications made possible through the use of GUIDO Music Notation. Even though the developed applications are used by a growing number of people around the world, it became clear during the work on this thesis that the development of a complete notation system is an enormous, never-ending task. In spite of the incomplete nature of the music notation system, the general idea of using GUIDO, and especially the online aspect of music notation, is interesting for many users. Because online music notation and editing has not been presented elsewhere, it seems a fruitful endeavor to further improve the GUIDO Graphic Stream and the Java-based editor. A more complete notation editor could easily be used for online music education tools. Overall, the collection of applications proves that GUIDO Music Notation and the GUIDO Notation Engine are a solid foundation for music notation applications.
Chapter 6
Conclusion
Many music notation systems have been developed since computers began being used for music processing. While most systems are capable of automatically creating nicely formatted scores of simple music, a completely automatic system for complex music has not yet been built. The reason for this lies in the complexity of a musical score, where different, sometimes contradictory constraints have to be met. Very often when setting complex music, some typesetting rules have to be explicitly broken in order to get a reasonable result. Currently, human interaction is indispensable when producing high-quality scores.
In this thesis, the internals of a new music notation system were presented; it differs from other systems by its underlying music representation language. Using GUIDO Music Notation as the input language for the implemented music notation system created insights into the general procedures which are necessary when automatically creating conventional scores. Because GUIDO is an adequate, intuitive text-based language, the creation of musical data, represented as a GUIDO description, is fairly easy. Nevertheless, the process of automatically creating a graphical score from an arbitrary GUIDO description is far from trivial.
As GUIDO descriptions are human-readable plain text, the conversion into a graphical score requires a number of steps. First, it was shown that GUIDO descriptions can be converted into a so-called GUIDO Semantic Normal Form (GSNF). The definition of the GSNF was very helpful for defining an object-oriented, computer-suited representation for GUIDO, which is called the Abstract Representation. A GUIDO description in GSNF is converted into a one-to-one corresponding instance of the Abstract Representation. This instance is used for storing, manipulating, and traversing the musical data. The Abstract Representation has been designed and implemented in C++ using object-oriented design patterns; therefore, the data structure and the operations on the data are both part of the class library.
Once the GUIDO description has been converted into the one-to-one corresponding instance of the Abstract Representation, a number of music notation algorithms are executed. These music notation algorithms enhance a given GUIDO description by analyzing the input and applying common musical typesetting rules.
Once the music notation algorithms have been performed on the Abstract Representation, the contained data is used to create an instance of the so-called Graphical Representation, which is closely related to a graphical score. The conversion into the Graphical Representation requires the creation of the graphical notational elements visible in the score. Additionally, those notational elements have to be identified that require an equal horizontal position, because they have the same onset time.
A suitable way to deal with this issue is the use of the spring-rod model, which uses springs and rods to describe the layout of a line of music. Once the elements have been associated with springs, a line of music can be stretched to a desired extent by applying a “force” to the virtual springs. To prevent horizontal collisions, rods are used to pre-stretch the springs. The spring-rod model was first used for spacing lines of music in 1987 and produces quite good results, which closely match scores from human engravers. The issue of calculating spring constants for this model is not easy, and it was shown in this thesis that it can be improved, especially where rhythmically complex interactions of voices are concerned.
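The basic mechanics of the model can be sketched in a few lines: each spring has a spring constant c and stretches by force/c under a force, while a rod can impose a minimum extent; the force needed to stretch a line of springs in series to a desired length can then be found numerically. The code below is a deliberately simplified illustration of this model, with invented names, not the engine's actual spacing routine.

```javascript
// Illustrative spring-rod spacing: springs in series, each with a
// spring constant c; under a force f a spring stretches by f/c,
// but a rod can impose a minimum extent (pre-stretching it).
function lineLength(springs, force) {
  return springs.reduce(
    (len, s) => len + Math.max(force / s.c, s.minExtent || 0), 0);
}

// Find the force that stretches the line to the target length.
// lineLength grows monotonically with the force, so simple
// bisection on the force suffices.
function forceForLength(springs, target, tol = 1e-6) {
  let lo = 0, hi = 1;
  while (lineLength(springs, hi) < target) hi *= 2; // bracket
  while (hi - lo > tol) {
    const mid = (lo + hi) / 2;
    if (lineLength(springs, mid) < target) lo = mid; else hi = mid;
  }
  return (lo + hi) / 2;
}
```

For two springs with constants 1 and 2 and a target length of 3, the solver converges to a force of 2, since the springs stretch by 2 and 1 respectively; a rod simply raises the corresponding term of the sum to its minimum extent.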
After the Graphical Representation has been created from the Abstract Representation, another set of music notation algorithms needs to identify optimal line- and page-break positions. These depend on the spacing of the individual lines and also on the degree of page fill of the last page (in music, the last page of a score should be completely full). An optimal line-breaking algorithm for music, based on the line-breaking algorithm of TEX, has been previously published [HG87], but there has been no publication on optimally filling the pages of a score. In this thesis, a new optimal page-fill algorithm was designed and implemented. The new algorithm is an extension of the optimal line-breaking algorithm; it makes use of dynamic programming and is scalable so that it can be adapted to different requirements.
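The dynamic-programming idea behind such optimal breaking can be sketched as follows: for every prefix of the measure list, store the minimum total badness of breaking it into lines, where the badness of one line measures how far its natural width deviates from the line width. The sketch below is a generic illustration of the technique with an invented quadratic badness function; it is not the thesis's actual algorithm.

```javascript
// Optimal line breaking via dynamic programming:
// best[i] = minimal total badness of typesetting measures 0..i-1;
// the badness of a line is the squared deviation of its natural
// width from the target line width (a simple, invented measure).
function optimalBreaks(measureWidths, lineWidth) {
  const n = measureWidths.length;
  const best = new Array(n + 1).fill(Infinity);
  const breakBefore = new Array(n + 1).fill(0);
  best[0] = 0;
  for (let i = 1; i <= n; i++) {
    let width = 0;
    for (let j = i - 1; j >= 0; j--) {     // candidate line j..i-1
      width += measureWidths[j];
      if (width > lineWidth && j < i - 1) break; // overfull line
      const badness = (lineWidth - width) ** 2;
      if (best[j] + badness < best[i]) {
        best[i] = best[j] + badness;
        breakBefore[i] = j;
      }
    }
  }
  // Recover the chosen break positions from the back pointers.
  const lineStarts = [];
  for (let i = n; i > 0; i = breakBefore[i])
    lineStarts.unshift(breakBefore[i]);
  return { totalBadness: best[n], lineStarts };
}
```

The page-fill extension works on the same principle one level up: instead of measures packed into lines, lines are packed into pages, with an extra term rewarding a completely full last page.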
The developed music notation system is used within a couple of applications, which were also implemented during the work on this thesis. The stand-alone GUIDO NoteViewer can be used to create graphical scores from GUIDO descriptions, which can also be printed. The online GUIDO NoteServer is a client-server application on the Internet that converts a GUIDO description into a picture of a score. This free service can be accessed using any standard web browser. Because GUIDO is an intuitive and powerful music representation language, it is used by a continuously growing number of people and research groups. The developed GUIDO-based music notation tools are also widely used and accepted by users around the world.
Outlook
A music notation system is never complete in the sense that every single notation
feature required in any conceivable score is already implemented. As music notation
is continuously evolving to represent new musical ideas, there will probably never
be a single system that covers all needs.
Many features of conventional music notation have not been implemented in the presented music notation system. Some of these features can be implemented rather easily, merely requiring diligence (for example, the addition of different note shapes). Others are quite complicated and require a lot of additional research; these include automatic collision detection and prevention. The framework of the implemented music notation system provides solid ground for further research. Additionally, the underlying music representation language GUIDO is a powerful notation interchange format. Therefore, any enhancement to the implemented music notation algorithms can be directly coded into a GUIDO description, which can then be read by any GUIDO-compliant music notation software.
Another interesting topic for further research is the enhancement of the prototypical Java-based music notation editor, which uses the GUIDO Graphic Stream protocol. Because Java is platform independent, this could provide the basis for a truly interchangeable music notation system. Nevertheless, a lot of additional work is necessary to create a fully functional music notation editor.
Bibliography
[Boo91] Grady Booch. Object Oriented Design with Applications. The Benjamin/Cummings Publishing Company, Inc., 1991.
[How97] John Howard. Plaine and Easie Code: A Code for Music Bibliography. In Eleanor Selfridge-Field, editor, Beyond MIDI – The Handbook of Musical Codes, pages 362–372. The MIT Press, 1997.
[HR78] David Halliday and Robert Resnick. Physics. John Wiley & Sons, 1978.
[HRG01] Holger H. Hoos, Kai Renz, and Marko Görg. GUIDO/MIR – An Experimental Musical Information Retrieval System Based on GUIDO Music Notation. In Proceedings of the 2nd Annual International Symposium on Music Information Retrieval (ISMIR), pages 41–50, 2001.
[RH98] Kai Renz and Holger H. Hoos. A WEB-based Approach to Music Notation using GUIDO. In Proceedings of the 1998 International Computer Music Conference, pages 455–458, University of Michigan, Ann Arbor, Michigan, USA, 1998. International Computer Music Association.
[RH01] Kai Renz and Holger H. Hoos. WEB delivery of Music using the Guido NoteServer. In Proceedings of the First International Conference on WEB Delivering of Music – WEDELMUSIC, page 193. IEEE Computer Society, 2001.
[Ros87] Ted Ross. Teach Yourself the Art of Music Engraving & Processing. Hansen House, Miami Beach, Florida, 1987.
[SN97] Donald Sloan and Steven R. Newcomb. HyTime and Standard Music Description Language: A Document-Description Approach. In Eleanor Selfridge-Field, editor, Beyond MIDI – The Handbook of Musical Codes, chapter 30, pages 469–490. The MIT Press, 1997.
[TME99] Daniel Taupin, Ross Mitchell, and Andreas Egler. MusiXTEX – Using TEX to write polyphonic or instrumental music, version T.93 edition, April 1999.
[W3C02] HyperText Markup Language Home Page. World Wide Web Consortium; http://www.w3c.org/Markup, 2002.
The following is the complete Advanced GUIDO description of one page of Bach's Sinfonia 3 (BWV 789). The score page is shown in Figure 1.
% The first page of a BACH Sinfonia, BWV 789
% Most of the element-positions are specified
% using Advanced GUIDO tags. The layout has
% been very closely copied from the URTEXT
% edition by the Henle-Verlag.
% This example has been prepared to show that
% Advanced GUIDO is capable of exact score
% formatting.
%
{ [ % general stuff
\pageFormat<"a4",lm=0.8cm,tm=4.75cm,bm=0.1cm,rm=1.1cm>
\title<"SINFONIA 3",pageformat="42",textformat="lc",dx=-2cm,dy=0.5cm>
\composer<"BWV 789",dy=1.35cm,fsize=10pt>
% voice 1
Figure 1: SINFONIA 3, BWV 789 – the rendered first score page.
\bm(d c# d e)
\newSystem<dy=3.9cm>
% measure 10 (voice 1)
\staff<1,dy=1.48cm>
\bm( \stemsDown f# e d e)
\bm(f#/8 \acc(a1))
\bm(g# c#2/16 d)
\bm(e/8 \acc(g1))
% measure 11 (voice 1)
\bm( \stemsUp f# h/16 c#2)
\bm(d/8 f#1)
\bm( \stemsUp<13hs> e/16 \stemsUp d2 c#
\stemsUp<9hs> h1 \stemsUp )
\bm( \stemsDown<5.5hs> a# \stemsDown g2 f#
\stemsDown<8.5hs> e \stemsDown)
\newSystem<dy=4.45cm>
% measure 12 (voice 1)
\staff<1,dy=1.09cm>
\bm(d/16 c# h1 c#2 )
\bm(d h1 c#2 d)
\bm( \stemsUp<9.5hs> e#1 \stemsUp g# a
\stemsUp<6hs> h \stemsUp )
\tieBegin<dy1=3.2hs,dx2=0,dy2=3.2hs> c#2/4
% measure 13 (voice 1)
\bm(c#/16 \tieEnd f#1 g# a)
\tieBegin<curve="up"> h/4
\bm( h/16 \tieEnd e# f# g#)
\tieBegin<dx1=2hs,dy1=2.8hs,dx2=0hs,dy2=-2.1hs,h=1.1hs>
a/4
\newPage
% Here, the new page begins ....
% measure 14 (voice 1)
a/16 \tieEnd
% .....
] ,
[ % voice 2
\staff<1>
% measure 1 (voice 2)
\restFormat<posy=-8hs,dx=2.5cm> _/1
% measure 2 (voice 2)
_/1
% measure 3 (voice 2)
\restFormat<posy=-2hs> _/2 \restFormat<posy=-5hs> _/8
\stemsDown \beam( \fingering<text="2",fsize=7pt,dy=6hs>(c#2/16)
\stemsDown<7.5hs> d \stemsDown )
\beam( \stemsDown<8hs> e/8 \stemsDown
\stemsDown<5hs> \acc<dx=0.02cm>(
\fingering<"2",fsize=7pt,dy=-3hs,dx=0.8hs>( g1 ) )
\stemsDown )
% measure 4 (voice 2)
\bm( \stemsDown<4.5hs> f#1/8 \stemsDown
\fingering<text="2",fsize=7pt,dy=-6.5hs>(h/16)
\stemsDown<6hs> c#2 \stemsDown)
\bm( \stemsDown<7hs> d/8 \stemsDown<4.25hs> f#1 \stemsDown )
\bm( \stemsDown<5hs> e \stemsDown a/16 \stemsDown<7hs> h \stemsDown)
\bm( \stemsDown<7hs> c#2/8 \stemsDown<4.5hs> e1 \stemsDown )
% measure 5 (voice 2)
\bm( \stemsDown<5hs> d/16 \stemsDown
\fingering<text="3",fsize=7pt,dy=11hs>(c#2) h1
\stemsDown<7hs> a \stemsDown )
\bm( \stemsDown<6hs>
\fingering<text="1",fsize=7pt,dy=12.25hs>( g# )
\stemsDown f#2 e \stemsDown<8.5hs>
\fingering<text="1",dy=5hs,fsize=7pt>(d)
\stemsDown)
Curriculum vitae
1970 born in Darmstadt, Germany
1976-1980 Schillerschule, Darmstadt
1980-1983 Lichtenbergschule, Gymnasium, Darmstadt
1983-1987 Georg-Büchner-Schule, Gymnasium, Darmstadt
1987-1988 Northridge Highschool, Middlebury, Indiana, USA
1988-1990 Lichtenbergschule, Gymnasium, Darmstadt
General higher-education entrance qualification (Allgemeine Hochschulreife)
1990-1991 Städtische Kliniken Darmstadt, civilian service (Zivildienst)
1991-1993 Technische Universität Darmstadt
Studies in physics, intermediate diploma (Vordiplom)
1993-1997 Technische Universität Darmstadt
Studies in computer science, Diplom degree
1997-2002 Technische Universität Darmstadt
Automata Theory and Formal Languages group
Research assistant (Wissenschaftlicher Mitarbeiter)
Erklärung (Declaration)
I hereby declare that I have produced this thesis for the academic degree of Dr.-Ing., entitled “Algorithms and Data Structures for a Music Notation System based on GUIDO Music Notation”, independently and exclusively using the stated resources. I have not previously made any attempts to obtain a doctorate.