Procedural Generation of Text
Contents

1 Introduction
1.1 What is Procedural Generation?
1.2 Motivation

I Previous Works

2 Relevant Technologies
2.1 Natural Language Processing
2.2 Natural Language Generation

3 Related Software
3.1 Procedural Content in Games
3.2 Tale-Spin (and its Relatives)
3.3 Almost Goodbye

II Development

4 Concept
4.1 Project Goal and Delimitation
4.2 Use Cases
4.3 Scope and Requirements

5 Architecture
5.1 Preliminary
5.2 Text Tree Generator
5.3 Text Writer

6 Main Components
6.1 Templates
6.2 Ontology
6.3 Dictionary
6.4 Expressions
6.5 Plugin API

III Evaluation

7 Results
7.1 Same Story, Different Actors
7.2 Changing the Mood
7.3 Generating Factual Texts
7.4 Issues and Applications

8 Summary

9 Outlook

IV Appendix
1 | Introduction
Give a man a puzzle and you entertain him for a day; teach his computer to
create puzzles and you entertain him for a lifetime.
Quote 1.1: Not the original English proverb, but it illustrates the point.
Other problems are more complicated to handle, especially when they require (what could be described as) common sense or creativity to solve. An algorithm can create an image of a wooden texture, because the texture can be modelled and understood to the point that allows its creation process to be described in an abstract way. But can it paint a tree, or the moody landscape where it stands? Can
1.2 Motivation
Two things from the field of computer science have always fascinated me: emergent behaviour and artificial intelligence. I also enjoy playing video games once in a while and, even more so, developing the technology behind them. When I noticed a small trend towards Procedural Generation in modern games, I was intrigued: instead of game artists and designers preparing all the game content in advance, clever algorithms took part in the process of creating, adapting and recombining content, whenever needed, directly on the player's computer. They started drawing maps and assembling levels, randomizing items and creating player quests, generating an endless stream of game content that would have been too tedious and costly to create entirely by hand.
But one thing was missing. Despite all the variation and the different fields of content generation, I did not encounter a single example of a procedurally generated story. Surely, applying Procedural Generation to something like the main narrative of a story-driven game had to be tricky, but could it be done? And if it can be done, to what degree? Investigating this in its full depth would have required developing not only a testbed game first, but also the complete infrastructure for injecting a procedurally generated story and presenting it properly; a task that I deemed too big for a few months' work, and one that would also introduce a lot of unnecessary side-tasks.
In the end, I decided to simply strip away all game-related topics as well as all interactivity. Instead of game narrative, I decided to focus on written language as a more common and generic medium. What remains is just a few core questions: To what degree can Procedural Generation be applied to written stories, and to text in general? When the goal is to obtain practically usable results, what kind of work can an algorithm be trusted to do at sufficient quality? Is it possible to create a general-purpose framework for bringing human authors and algorithms together as collaborators?
While dealing with these questions, the goal of this work is not to discover something fundamentally new, but to investigate existing technologies and combine them into a bigger whole, creating the prototype of a productive system that allows human authors to enrich their texts with computer-generated variation and dynamically adjust them to different contexts.
Part I
Previous Works
2 | Relevant Technologies
This chapter deals with previous research and relevant techniques that are either required or at least useful for understanding this work's subject. Starting with a general look at the field of Natural Language Processing, a broad overview of the topic is provided before focusing on the discipline of Natural Language Generation.
Figure 2.1: A naive sentence split algorithm has plenty of opportunities to fail.
Correct results require more sophisticated algorithms.
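As a minimal sketch of this failure mode (illustrative Python written for this text, not code taken from any cited system), a splitter that naively breaks on every period followed by whitespace mishandles common abbreviations:

import re

text = "Dr. Smith arrived at 9 a.m. sharp. He brought the results."

# Naive rule: a sentence ends at every period that is followed by whitespace.
naive_sentences = re.split(r"(?<=\.)\s+", text)
print(naive_sentences)
# ['Dr.', 'Smith arrived at 9 a.m.', 'sharp.', 'He brought the results.']
# "Dr." and "a.m." are wrongly treated as sentence boundaries; robust splitting
# needs abbreviation lists, more elaborate rules or trained models.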
Figure 2.2: A token can consist of multiple words. Failing to recognize them will
change their meaning.
Part-of-Speech Tagging: After splitting a text into sentences and tokens, each token can be tagged with dictionary information, such as the part of speech it represents. Again, ambiguity can make this task more difficult: the English word set could be a noun, a verb or an adjective, depending on the context it is used in (Fig. 2.3).
Figure 2.3: For some words, there is no easy one-to-one mapping to a part of speech that could be resolved trivially using a dictionary. Multiple interpretations are possible, depending on the context.
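As a rough illustration of this ambiguity, the following sketch uses the NLTK library (an assumption made here purely for demonstration; this work does not prescribe a particular tagger):

# Requires the NLTK tokenizer and tagger resources, e.g. via
# nltk.download("punkt") and nltk.download("averaged_perceptron_tagger").
from nltk import pos_tag, word_tokenize

for sentence in ("She set the table for dinner.", "He bought a chess set."):
    print(pos_tag(word_tokenize(sentence)))
# In the first sentence, "set" should be tagged as a verb; in the second,
# as a noun. The exact tag labels depend on the tagger model used.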
Figure 2.4: In the above parse trees, individual tokens are grouped recursively
into (noun) phrase (NP) nodes. Depending on how the original plain text phrase
is read during parsing, the age of the addressed gender changes. Source: [13]
software. However, the parse tree can be analysed for cues that map purely linguistic entities and concepts into the working domain at hand: from a linguistic point of view, take or ball are just words, but for software controlling a robotic arm, they are also actions and objects. Transforming the linguistic data structure that a parsed sentence represents into a stream of commands is the final step in NLP [4] that can enable a robot to follow orders and a chatbot to respond to complex user questions.
Figure 2.5: Once natural language input has been transformed into a form that
the software can work with directly, further processing can take place outside
the domain of NLP.
While the above list of steps can only provide a very rough overview of the different tasks in a typical Natural Language Processing use case, it should have become clear that many of them do not have obvious solutions, and that straightforward approaches often fall prey to the ambiguity of natural language, or to a lack of context and deep knowledge. In many cases, a sentence or even a word could easily be interpreted in various different ways, and which of them is the right one may not be immediately obvious. To this day, machines are merely able to work with text; the degree to which one could speak of understanding text still remains arguable [5].
Still, it is important to note that Natural Language Processing is not only about understanding language, and that working with language, as opposed to understanding it, is not solely a consequence of limited technical capabilities, but also characterises a different subset of goals. There are many applications and tasks in NLP where truly understanding natural language is in fact not a requirement to perform at a sufficient quality level.
Speech Recognition is an NLP discipline that deals with the task of transcribing
spoken language into text, which can be regarded as a largely independent
[4] This is, of course, a large simplification. Several more layers of artificial intelligence tasks may follow, but they are not necessarily bound to the domain of Natural Language Processing.
[5] In fact, the task of truly understanding natural language is considered an AI-hard problem, meaning that solving it would imply solving the problem of artificial intelligence in general.
Of course, the above list is largely incomplete, but it illustrates that most of
the common NLP tasks deal with reading and processing natural language - but
not producing it.
They also describe a second category of NLG systems called Authoring Aids, which help people create routine documents. An example in their own words:
A doctor, for example, may spend a significant part of her day writing
referral letters, discharge summaries, and other routine documents;
while a computer programmer may spend as much time writing text
(code documentation, program logic descriptions, code walkthrough
reviews, progress reports, and so on) as writing code. Tools which
help such people quickly produce good documents may considerably
enhance both productivity and morale.
Lexical choice: Given a conceptual phrase to express in concrete text, the problem of lexical choice deals with selecting the right words for it. When writing about a downward price change on the stock market, whether its value is described to decrease, fall, or even plummet depends not only on the magnitude of the change, but on contextual knowledge as well. Deciding which of N synonymous words to use for any given case is at the core of this stage, which attempts to create a suitable mapping from semantics to wording.
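A minimal sketch of such a mapping for the stock-market example; the thresholds and verbs are illustrative assumptions made here, not values taken from any cited system:

def describe_price_change(percent_change):
    # Pick a verb for a downward price movement based on its magnitude.
    # A real NLG system would also take contextual knowledge into account.
    drop = -percent_change
    if drop >= 15:
        verb = "plummeted"
    elif drop >= 5:
        verb = "fell"
    else:
        verb = "decreased"
    return "The share price " + verb + " by {:.1f} percent.".format(drop)

print(describe_price_change(-18.2))  # "The share price plummeted by 18.2 percent."
print(describe_price_change(-1.3))   # "The share price decreased by 1.3 percent."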
Each of these stages represents a complex task to be solved, not all of which are well understood and practically explored at a general level. Thus, it is important to note that, even though Dale and Reiter described the above stages in detail and proposed potential approaches for dealing with them, their article is focused primarily on providing a basic overview and summary of generating natural language texts, as well as on presenting typical problems that arise in the process. Overall, the field of Natural Language Generation appears to be far from reaching a conclusive state on a large scale, even though promising results may have been achieved in various sub-tasks and simplified or specialized problem setups.
3 | Related Software
The concept of Procedural Generation is nothing new and, in fact, there have been many works dealing with it in one way or another. In the context of this thesis, three cases are especially interesting and will be presented in this chapter:
Procedural Content in Games will be used as a starting point, because it exhibits a great variety of generation techniques that can contribute to giving a basic overview. Getting more specific, the section about Tale-Spin will present some of the story-generating experiments that have been conducted to date and demonstrate why their focus does not match the goals of this work. Finally, the story experiment Almost Goodbye is chosen as an example to illustrate a more suitable approach.
the goal not being to dynamically create content, but to serve a huge amount of static content in an extremely efficient way. [15]
However, as storage devices and system memory showed a drastic increase in capacity in the following years, Procedural Generation was no longer a necessity for delivering large or detailed game worlds: all of the required content could be delivered directly with the game and simply accessed when needed. Additionally, video games became gradually more sophisticated in matters of audiovisual presentation, and quality requirements on game content consequently scaled with them, up to the point where the expectations of players and developers left creative algorithms unable to compete with human artists. [1]
In recent years, the Procedural Generation of game content has been experiencing a tentative renaissance, with increasing attention from developers and more frequent usage in commercial games. [1] Its common purpose, however, has changed: rather than being a vessel for delivering game assets in the most memory-efficient way possible, its focus has shifted towards achieving a seemingly infinite variety of game content that would be too time-consuming or impractical for human artists to create manually. Among the typical usage examples for Procedural Generation in modern games, three closely related forms can be identified:
When assuming that there are no holes or caves crossing it, a landscape can be described as a sequence of height values along a horizontal axis. These height values can be calculated using a mathematical function depending on their horizontal position and any number of additional parameters. When choosing a simple term like a sine function, the landscape may look artificial or alien to the human observer, but depending on the context the game provides, more complex terms can yield results that are sufficiently convincing.
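The following sketch (illustrative Python with arbitrary parameters, not taken from any particular game) shows the basic idea: height as a function of horizontal position, where summing a few sine terms of different frequency already looks less regular than a single one.

import math

def height(x):
    # A single sine term yields a perfectly regular, artificial-looking ridge;
    # summing terms of different frequency and amplitude (and, in practice,
    # noise functions) produces more convincing terrain.
    return (12.0 * math.sin(0.05 * x)
            + 4.0 * math.sin(0.23 * x + 1.7)
            + 1.5 * math.sin(0.91 * x + 0.3))

# A strip of landscape as a sequence of height values along the horizontal axis.
heights = [height(x) for x in range(200)]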
Because the Procedural Content of a game is usually generated as needed on the player's computer, there is no practical way of reviewing each resulting instance or manually removing undesirable outcomes, so content-generating algorithms are required to directly produce results of the desired quality. This restriction leaves only a narrow window for feasible use cases, especially when operating largely without human input. One of the more common ones is generating believable terrain, like the world generation algorithm in Minecraft, to name one prominent example. In a completely autonomous setup, it is able to create an arbitrary number of (practically) infinite maps (Fig. 3.2) that are consistent with the game world. [32]
Figure 3.3: Four different variations of the same animal template in No Man's Sky. Source: [24]
Since it was first introduced, Procedural Generation has become more and more capable, but is in most cases still far from generating results at a quality level comparable to human-authored assets. However, modern algorithms are slowly aspiring to be valuable collaborators in the content creation workflow of games, enabling developers to provide more varied, situation-dependent and customized experiences.
In his publication, Meehan presents some of the generated text outputs, including both the expected results (Lst. 3.1) and some interesting failure cases, which he comments on to shed some light on their technical cause and possible solutions. Most of them are attributed to a lack of knowledge, either by failing to properly simulate the perception that characters have of the world, or by not
GEORGE WAS VERY THIRSTY. GEORGE WANTED TO GET NEAR SOME WATER. GEORGE
WALKED FROM HIS PATCH OF GROUND ACROSS THE MEADOW THROUGH THE VALLEY
TO A RIVER BANK. GEORGE FELL INTO THE WATER. GEORGE WANTED TO GET NEAR THE
MEADOW. GEORGE COULDN'T GET NEAR THE MEADOW. WILMA WANTED GEORGE TO
GET NEAR THE MEADOW. WILMA WANTED TO GET NEAR GEORGE. WILMA GRABBED
GEORGE WITH HER CLAW. WILMA TOOK GEORGE FROM THE RIVER THROUGH THE
VALLEY TO THE MEADOW. GEORGE WAS DEVOTED TO WILMA. WILMA LET GO OF
GEORGE. GEORGE FELL TO THE MEADOW. THE END.
Listing 3.1: The usual output that Tale-Spin produced.
Once upon a time there was a dishonest fox and a vain crow. One day the crow was sitting in
his tree, holding a piece of cheese in his mouth. He noticed that he was holding the piece of
cheese. He became hungry, and swallowed the cheese. The fox walked over to the crow. The
end.
Listing 3.2: Transcript of Tale-Spin's rather "pragmatic" interpretation of The Fox and the Crow.
A lot of failure cases are commented in a way that suggests the existence, lack or transfer of knowledge to be the main culprit. While this assumption sounds reasonable with regard to the potentially very limited testing database and the prototype nature of the presented work, it certainly does not cover the full scale of the problem. The above example of The Fox and the Crow is picked by Meehan to demonstrate a case where adding more knowledge can produce unexpected results [2]: as soon as the crow knew about the food in its mouth, eating it seemed like the most reasonable course of action.
While this is a correct observation, it does not deal with the actual root of the problem: Tale-Spin simply didn't understand the purpose of the story it was supposed to tell, and it had no concept of a writer's intention either. It did not write a fable in order to convey a message or illustrate a moral, but merely documented the progress of an internal character simulation. This is not the way a typical human writer would work, and one can imagine that an approach like this will hardly be able to deliver plotlines of hand-authored quality, as meaning can only emerge through human interpretation of coincidental acts.
Going forward, this is only one of the problems that have since been addressed in Tale-Spin's successors, and there have been many similar works on the Procedural Generation of narrative up to the present day. Dealing with all of them in full detail is beyond the scope of this work, but, fast-forwarding to more recent times, the following selection of prominent examples should provide a rough overview:
Universe (1983) built on both Tale-Spin and Author, but aimed for something slightly different: the desired output was not a self-contained, complete story, but a never-ending series of episodes featuring complex character interaction and development. [4] This particular story generator resembled a TV series director more than a novel author and focused heavily on the interaction of a multitude of characters. As such, both individual character traits and author goals were tracked, in order to achieve a certain level of consistency and at the same time add an element of drama to keep things going. In addition to a planning system like previous algorithms used, Universe also made use of pre-defined plot fragments with stereotypical roles that could be filled dynamically when their requirements were met by the characters at hand. Resorting to this mixed approach of human-authored content and dynamic planning, Universe was able to generate more complex and complete fictional worlds than similar prototypes at the time. [18]
The reader can imagine a Knight who is sewing his socks and
pricked himself by accident; in this case, because the action of
sewing produced an injury to the Knight, Minstrel would treat
sewing as a method to kill someone.
Brutus (1999) was a story generator focused on the topic of betrayal, and the first program that actually attempted to produce high-quality text in addition to a meaningful plotline. Based on a mathematical theorem prover, it used logical pattern matches to assemble a story from pre-defined elements and was largely dependent on human input in order to generate a story [21], which enabled it to create texts at a comparatively high quality level (Lst. 3.3). They actually looked human-written, because (for the most part) they were. [7] Its authors consequently stated that Brutus is not creative at all, but a reverse engineering effort to see if a particular story can be re-assembled by an algorithm, given the appropriate input. [17]
Dave Striver loved [...] the fact that the university is free of the stark unforgiving trials of the
business world only this isn't a fact: academia has its own tests, and some are as merciless as
any in the marketplace. A prime example is the dissertation defense: to earn the Ph.D., to
become a doctor, one must pass an oral examination on one's dissertation. [...]
Dave [...] needed the signatures of three people on the first page of his dissertation, the
priceless inscriptions which, together, would certify that he had passed his defense. One of the
signatures had to come from Professor Hart, and Hart had often said [...] that he was honored
to help Dave secure his well-earned dream. At the defense, Dave thought that he eloquently
summarized Chapter 3 of his dissertation. [...] There were no further objections.
Professor Rodman signed. He slid the tome to Teer; she too signed, and then slid it in front of
Hart. Hart didn't move. "Ed?" Rodman said. Hart still sat motionless. Dave felt slightly dizzy.
"Edward, are you going to sign?"
Listing 3.3: A short excerpt from a text that was created by Brutus, using pre-
written text fragments. [7]
Fabulist (2010) built on works like Tale-Spin, Author and Universe, and had a strong focus on balancing character and plot consistency. Although an author-centric planner, i.e. one simulating the goals of a virtual author, the system was designed to make sure that character behaviour within the authored plot was consistent and believable. While Universe attempted the same by enforcing character traits to match their roles in plot fragments, Fabulist expanded on this by evaluating each character's individual motives and goals as well. Fabulist's author Mark Riedl explains the distinction with an example:
When a plot sequence requires characters to act in ways that are not deemed believable, Fabulist would attempt to repair the story plan by finding reasons for said characters to act as they are required to. This form of backtracking allows it to achieve a higher level of character and plot consistency than Universe. [8] Like most of its predecessors, Fabulist created rather technical English transcripts of the plot it had modelled, without focusing too much on literary value (Lst. 3.4).
[...] The genie is in the magic lamp. Jafar rubs the magic lamp and summons the genie out of
it. The genie is not confined within the magic lamp. Jafar controls the genie with the magic
lamp. Jafar uses the magic lamp to command the genie to make Jasmine love him. The genie
wants Jasmine to be in love with Jafar. The genie casts a spell on Jasmine making her fall in
love with Jafar. Jasmine is madly in love with Jafar. Jasmine wants to marry Jafar. [...]
Listing 3.4: A short excerpt from a text that was created by Fabulist. [16]
Listing 3.5: A text that was created by MOSS. [36]
Although approaching the topic from different angles and sometimes with
different goals, a certain characteristic is quite common among all of the above:
The assumption that at least part of a story generation process can be interpreted
as a planning problem in some form. Most of the mentioned systems focus on
modelling said planning problem, and solving it in a way that leads to satisfactory results with regard to the hypothesis that motivated their creation (Fig. 3.6). However, none of them were actually intended to be used productively; they were vessels for exploring abstract ideas and concepts in the domain of narrative generation and computational creativity.
Figure 3.6: An overview of Tale-Spin and its relatives, as presented in this chap-
ter.
It seems a natural consequence of this kind of exploration that issues of concrete applicability have largely been left untouched. From an academic point of view, a lot has been achieved in the last forty years of occasional research on the topic; yet, when asking a layman about the quality of the generated texts, he would hardly notice a similar improvement (see Lst. 3.1 and Lst. 3.5). One prominently absent topic appears to be the overall writing style of generated texts: looking at the output of Tale-Spin and similar systems, they usually follow a fixed structure and read more like technical transcripts than actual stories [5].
Of course, this observation misses the point of these works, but at the same time it demonstrates a dilemma that authors of story generating systems are facing: if one text generator is able to creatively build new stories while another one merely assembles pre-written fragments, the first one obviously appears preferable; but what if the first one only returns texts of poor quality, while the second
[5] The only exception here is Brutus, which resorted to using human-authored text fragments for constructing its final results.
Table 3.1: Examples of two kinds of satellite sentences, as given by Reed. [10]
Figure 3.7: A sample text from Almost Goodbye, before and after filling in satel-
lite sentences.
For example: when the main character is afraid, a rule is active that allows a TENSION symbol to be replaced with "Maybe it's too late for this" and similar sentences (Lst. 3.6). This is defined in a specific afraid grammar that will only be active as long as the current context features a main character with a matching emotional state. On the other hand, a main character that feels remembered would result in deactivating the afraid grammar and activating one that contains rules expanding to a more positive inner monologue, like a simple "It's going to be okay." Similarly, descriptions of the scene's environment will be replaced differently, depending on the location that was chosen or the time of day.
TENSION: *NOGO *ANXIOUS
Listing 3.6: An excerpt from the context-bound afraid grammar.
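A rough sketch of this mechanism follows (the rules shown are illustrative stand-ins written for this text, not Reed's actual grammar data): sub-grammars are switched depending on the protagonist's emotional state, so the same TENSION symbol expands differently in each context.

import random

# Context-bound sub-grammars: each maps symbols to possible expansions and is
# only active while its guarding condition on the story context holds.
AFRAID_GRAMMAR = {
    "TENSION": ["Maybe it's too late for this.", "I shouldn't have come."],
}
REMEMBERED_GRAMMAR = {
    "TENSION": ["It's going to be okay."],
}

def active_grammar(context):
    return AFRAID_GRAMMAR if context.get("mood") == "afraid" else REMEMBERED_GRAMMAR

def expand(symbol, context):
    return random.choice(active_grammar(context).get(symbol, [""]))

print(expand("TENSION", {"mood": "afraid"}))
print(expand("TENSION", {"mood": "remembered"}))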
of text it deals with: satellite sentences can only react to dialogue, and the dialogue itself is restricted to two characters, one of whom is the first-person narrator of the story. All the story-specific parameters like emotions, progression of time and locations are part of the system as well, so there would be no way to change them without rewriting parts of it.
Doesn't Scale Well: Because all text variation depends on a set of small, interdependent grammars, consistent sentence structure and narration can only be achieved through careful construction. Many grammar symbols represent globally used sentence fragments like AIRQUALITY or LIGHTVERB that can only be used when knowing what exactly they might expand to, and defined when knowing what exactly they are expected to represent. This knowledge, however, cannot be acquired easily: since all context sensitivity in the narration is achieved by activating and deactivating small sub-grammars, the definition and usage of certain symbols is scattered across the whole grammar database, and each newly introduced grammar could both use and define any symbol. There is no such thing as a scope, everything is defined globally, and naming collisions between grammar symbols are likely to occur and potentially hard to track. All of this is perfectly manageable in a small story like Almost Goodbye, but becomes hard to maintain once the number of grammars grows; it is hard to imagine setting up a general-purpose text generation library with this concept.
modifying the source code, which can get complicated quickly in decentralized environments like internet communities [9].
Summarizing this, Almost Goodbye is, within its scope, a great example of practically applied Procedural Generation, but not a general-purpose solution.
[9] When two authors develop extensions independently of each other and a third one wants to use both, he will have to merge them manually into one consistent patch.
Part II
Development
4 | Concept
In the course of this thesis, the software prototype of a custom text generator has been implemented. This chapter describes the core ideas behind its creation and illustrates the mindset that governed the process. As a reference to previous works and a hint towards its practically oriented nature, the name Scribe has been chosen as a working title.
Not a Plot Generator: Even in the context of storytelling, this work does not deal with plot generation or variation, as previous works like Tale-Spin did, but rather with textual representations of stories. The difference in
Not Specialized: This work aims for a general-purpose text generation system that could be used in various ways. It is not tied to a specific use case like, for example, Almost Goodbye. It is also not designed for interactive fiction, although it could certainly be used as a key component of an interactive fiction system.
As its name suggests, Scribe aims to be primarily "one who writes": focused on the practical aspect of writing and not a truly creative author in its own right, but a diligent helper. Humans may be far better text generators than any current software system could hope to be, but their throughput is limited; they can't write thousands of specialized text variants in a reasonable time. This is where Scribe has its place: as a tool for generating variation, expressing data and assembling human-authored text dynamically.
[1] A plot generator like Tale-Spin and a text generator like Scribe would actually make a good team: their tasks are mostly distinct and each could contribute some puzzle pieces that the other cannot.
Story Variations and Flavours: Authors could write stories with varying text passages, settings or moods. Important themes and morals can remain constant, but could be represented differently. Characters could be altered, scenes relocated and props switched out. In an experimental setup, an artist could even release a printed story where each book's content is slightly different from the next. In a more grounded use case, fables or short stories could, within limits, adapt to the reference frame of different cultures, or to different target audiences in general.
Structured Reports: Certain repetitive and well-structured types of news articles could be generated using data about the event; other systems have demonstrated this, for example, by writing experimental articles about local soccer matches. By spending less time on writing boilerplate text that uses a similar structure and phrases most of the time, news authors could afford to cover smaller events that might otherwise have fallen out of scope. [30] Likewise, any type of report that is primarily dependent on the data it seeks to communicate could be generated automatically as well.
Personalized Advertising: When possible, internet ads are often selected with regard to the specific user that is visiting a site. Text-based adverts could go a step further and adjust depending on the interests and traits of their current viewer, or depending on the website that currently incorporates them, in order to blend in and have a higher chance of being perceived as appropriate or useful.
common or generic phrases, but could still provide a more valuable starting
point for human authors and readers than a non-existent page. [37]
Game Development: Like Almost Goodbye, games have already proven to hold a viable place for Procedural Generation. They are an interactive medium, and interactivity can be a tricky thing to handle when all content is pre-defined and all situations have to be anticipated. Procedural Generation allows content to be created on the fly in dynamic situations, whenever they arise. Scribe could be part of that as well: either as a software system for purely generating game texts when needed, or as a partial reference model for implementing certain types of Procedural Generation systems.
Of course, there is no reason to regard the above list as closed and complete, but a general tendency already becomes apparent: most of the above would leverage a text generation system in order to reduce repetitive writing, increase the re-use of common text fragments and phrases, add variation to otherwise static text, or adapt it to fit various different scenarios.
Simple: Authors of literature and authors of program code are not commonly known to do each other's job very well, so requiring too much programming skill is not desirable for providing text content. Within reasonable limits, the data that powers the system should be as easy to read, understand and write as possible.
Modular: For both data storage and overall framework organization, a modu-
lar approach should be favoured over a monolithic one. This should ease
collaboration between multiple authors, as each author can work only on
a small, isolated part of the whole system. It could also allow the creation
of self-contained text libraries that deal with a specific topic and can be
added as a module to existing text projects.
All of the above points have been considered in the design of the presented prototype. As a general course of action, a top-down approach was chosen: beginning with a completely pre-written sample text, parts of it were replaced with their own abstract representation, which can dynamically expand to the original text, or to a semantically equivalent variation. This was repeated in an iterative process until further abstraction no longer seemed reasonable.
With regard to the overall goal of maintaining a high quality of output text, this approach had an obvious advantage over attempting to generate new texts from scratch: due to the nature of a bottom-up approach, a rough yet functional sketch of the whole text generation framework would have been required in order to obtain any kind of result and evaluate it for quality. Since neither the framework architecture nor the degree to which Procedural Generation
[2] Random Number Generators (or RNGs) can't really create random numbers, because computers are inherently deterministic. To overcome this limitation, most RNGs maintain an internal state that is transformed in a complex way to produce a series of pseudo-random numbers. The initial state of an RNG determines the complete sequence of numbers following it, and is consequently called the seed value.
could be applied was known at the start of this project, this seemed very impractical. By contrast, the iterative top-down process that was chosen produced readable output at all times and could easily be monitored and evaluated after each change.
5 | Architecture
5.1 Preliminary
In its essence, the task that needs to be performed by Scribe can be regarded as the process of transforming abstract information into human-readable text.
This process is usually not a sequential one, neither for human authors nor for algorithms: texts are much more than just a sequence of words, and one can imagine that it is impossible even for the most talented authors to write a bigger text one word at a time without planning ahead. Human-readable text is not a continuous stream, but a very structured way of transporting information, albeit not in the same sense that computer scientists tend to think about structured data.
On one of the lower levels, there is sentence structure, as expressed through syntax. Paragraphs are not randomly grouped sentences either, but instead show some resemblance to a thought process, each with an introduction, a core narrative and a closing part wrapping it up. Multiple paragraphs form sections or chapters, which together can form even bigger structural elements, and so on. Although the actual usage and meaning of each of those elements may vary strongly, they are neither random nor without purpose, and likely implement a certain concept that helps readers follow through. As a result of the structured nature of text, an abstract representation will exhibit similarly structured elements, and so will a Procedural Generation system for text that has to deal with them.
Figure 5.1: Formal grammars and similar nested structures can be visualized
as a tree that is recursively expanded. In Scribe, a text tree is built during the
generation process.
The algorithm at the core of the text tree generator is rather simple in con-
cept. For each tree node that is expanded, beginning with the root node, the
following basic procedure is run (Fig. 5.2):
1. Query knowledge from the source data, and perform basic reasoning when
requested. The resulting information will be used in the text generation
process.
2. Evaluate dynamic expressions in the current node. This may include in-
corporating information into the text, as well as deciding how to expand
certain symbolic parts of that text.
3. Resolve the symbols that were determined in the previous step, to get a
hold of the text segments to insert in their place.
Figure 5.2: An illustration of the five steps of the text tree generation algorithm.
The terms Ontology and Templates will be discussed in detail in the chapter Main Components. A concrete step-by-step example of the generation algorithm is available in Appendix A.
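As a greatly simplified sketch of this procedure (illustrative Python, not the prototype's actual implementation), each node substitutes queried information into its text and recursively expands the Template symbols it references:

import re
from dataclasses import dataclass, field

@dataclass
class Node:
    text: str                        # text body containing {Symbol} markers
    children: list = field(default_factory=list)

def expand(node, templates, knowledge):
    # Expand one text tree node: incorporate known data, then resolve the
    # remaining symbols to Templates and expand them as child nodes.
    for symbol in re.findall(r"\{(\w+)\}", node.text):
        if symbol in knowledge:
            node.text = node.text.replace("{%s}" % symbol, str(knowledge[symbol]))
        elif symbol in templates:
            child = Node(templates[symbol])
            expand(child, templates, knowledge)
            node.children.append((symbol, child))

def render(node):
    # Flatten a (sub-)tree back into plain text.
    text = node.text
    for symbol, child in node.children:
        text = text.replace("{%s}" % symbol, render(child), 1)
    return text

templates = {
    "Story": "{Morning} {Breakfast}",
    "Morning": "{Name} woke up early.",
    "Breakfast": "Coffee first, as always.",
}
root = Node(templates["Story"])
expand(root, templates, knowledge={"Name": "John"})
print(render(root))   # "John woke up early. Coffee first, as always."

Each Node here stands in for a text tree node whose sub-tree represents one concrete portion of the output, which is also what makes partial re-generation possible.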
As stated in the concept chapter, a top-down approach is useful to make sure that the desired high quality of output texts can be reached:
As more and more parts of a formerly pre-written text are substituted with dy-
namically generated ones, a sudden loss in quality will be easier to spot during
development than a failure to gain it when trying to create a completely new
text from scratch.
Building on the concept of recursive expansion, the data structure of a tree emerges naturally. An advantage of also using it to represent the generated text itself is the fact that each tree node can again be regarded as the root of a sub-tree, each of which individually represents a smaller, but more concrete portion of the output text. This not only allows the system to easily re-generate or modify only certain segments of a previously created text, but also helps users understand its results by providing them in a hierarchically structured way. (Fig. 5.3)
Figure 5.3: Each text tree node represents a text portion that includes the ones
represented by all of its child nodes.
When looking at the tree of text elements, it also resembles the way a human author might approach writing that text: its root node represents the text that has yet to exist and the seed from which it will grow. Going one layer deeper, there is a very broad structure of the text, like a separation into introductory chapters, conclusions and the main narrative, all of it still abstract, but not quite as much as the parent node. Each subsequent child layer adds detail and fleshes out the content a bit more, until a leaf node is reached that really just contains a concrete text fragment. Similarly, a human might start with an idea, then broadly draft the narrative structure and refine it until he is ready to concretize it by writing. (Fig. 5.4)
Although the text tree generator does most of the work in Scribe, the task of text generation is not yet done when it has finished. After all, it didn't produce an output text, but an abstract representation of one, which can be subject to further processing and modification, if the user so wishes. A text tree may be a useful data structure, but it cannot be read by a human; transforming it into a more accessible format is a step that still needs to be done.
Figure 5.4: A text tree is also the result of recursive concretization of an abstract
origin.
6 | Main Components
The Scribe prototype can be split into five different areas of responsibility. Each of them will be explained in this chapter to shed some light on the thought and design process behind it, illustrated with selected implementation details.
First, a generic Template system is described that expands on the approaches of Almost Goodbye to define pre-written text fragments in a structured way. As a consequent extension of this, the implementation of an Ontology layer is detailed, which acts as the knowledge backend of the system; for more language-specific data, a simple Dictionary is introduced next. Tying all of these systems together, an Expression parser allows small logical statements within Template texts to be evaluated in order to replace them with suitable text or invoke system functionality. Finally, a Plugin API opens the framework to external additions in a modular way, without requiring source code access.
6.1 Templates
At its core, Scribe is built upon a data structure that is called a Text Template, or simply Template. In the very first iteration of the prototyping process, a Template consisted of a fixed text block that was forwarded to the output mechanism, which was of course not very procedural. As a first measure for introducing text variation, a feature was added to allow Templates to leave blanks within their text body, which could refer to a different Template that would be filled in later (Fig. 6.1). In this simple form, it was equivalent to the formal grammars used in Almost Goodbye, each Template being a replacement rule mapping its name symbol to a text statement.
Still, the question remains why Scribe would use pre-written text in the first place: in a different approach, a text generation system could try to synthesize natural language text by itself, based on structured data and knowledge about how to form sentences, which would enable it to deviate from pre-defined fragments and express any kind of meaning. However, this would also deprive users of any direct influence on the text output beyond modifying abstract data models. Writing quality would be limited by the capabilities of a text synthesis module, which is not desirable with regard to the previously stated project goals. A big advantage of choosing a grammar-like backend is not only its high text quality potential, but also the added flexibility and predictability: since all output text is derived directly from pre-written and recombined text fragments, an author has full control over the generated results and can apply his writing skills in a useful way. Stylistic devices, corporate speech or just the peculiar preferences of a single artist can be incorporated into generated texts without problems.
Figure 6.1: A trivial Template design similar to the formal grammars from Almost
Goodbye.
is adding the file defining them to the appropriate folder; again, this kind of
workflow reinforces the modular nature of the system and aims towards easing
collaboration. (Fig. 6.3)
Figure 6.2: When multiple Templates share the same name, they form a pool
from which one instance is selected randomly.
Figure 6.3: Multiple Templates can form a module, which can be added to and
removed from the Template Database individually.
To reduce the need for these dependencies, Template Parameters were introduced as a feature: when referring to a Template, a special function call syntax can be used to pass any number of arguments to it. They can be of any type, including various forms of raw data like numbers and text strings, but most importantly Template symbols as well, which can then be expanded within the text, or passed on with subsequent calls. This allows different Template modules to communicate with each other without having to know or access their internals; they can instead provide a public API for others to invoke, and exchange information through Parameters. (Fig. 6.5)
Figure 6.5: Template Parameters can help reduce complexity by restricting access patterns to flow in one direction only. Here, Module B is entirely self-contained; it doesn't depend on any outside information and can be used in various setups.
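A rough sketch of the idea behind Fig. 6.5 (the names and the plain functional call style are invented for illustration and are not the prototype's syntax): Module B's Template depends only on what it is passed, so other modules can invoke it without reaching into its internals.

# Module B: a self-contained Template that depends only on its Parameters.
def pet_intro(pet_name):
    return "A small dog named " + pet_name + " was already waiting at the door."

# Module A: invokes B's public Template and passes information explicitly
# instead of relying on shared global symbols.
def morning_scene(owner, pet_name):
    return owner + " opened the door. " + pet_intro(pet_name)

print(morning_scene("John", "Toby"))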
changing something. Not only did this diminish usability, it also required careful maintenance and would otherwise be an opportunity for mistakes. This issue was addressed by allowing each Template to define more than one text body, one of which would be selected randomly when the Template is invoked, while all of them share the same context.
To allow more specialized text generation with regard to Feature and Parameter values, optional Conditions were introduced for both Templates and individual text bodies: whether they are considered available in a random selection event is determined by the result of a boolean expression, which can depend on all Parameters and Features that are present in their scope. This allows users to supply very specific text fragments which are used only when their equally specific conditions are met, resulting in a more fine-grained and less generic context awareness.
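A minimal sketch of such a conditional selection (illustrative Python; the condition style and context values are assumptions made here): only text bodies whose condition holds in the current scope take part in the random choice.

import random

TEXT_BODIES = [
    {"condition": lambda ctx: ctx.get("mood") == "grumpy",
     "text": "He slammed the alarm clock onto the floor."},
    {"condition": lambda ctx: ctx.get("mood") == "cheerful",
     "text": "He hummed along with the alarm clock's melody."},
    {"condition": lambda ctx: True,   # generic fallback body
     "text": "He turned off the alarm clock."},
]

def pick_body(ctx):
    available = [b["text"] for b in TEXT_BODIES if b["condition"](ctx)]
    return random.choice(available)

print(pick_body({"mood": "grumpy"}))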
In their final design (Fig. 6.6), Templates have become a versatile device for assembling text from pre-written fragments. They can be defined in a self-contained, modular way, and provide the means for configuration through parameters and scoped data access, as well as variation and conditional specialization. Unlike the formal grammars from Almost Goodbye, they do not double as variables or context carriers, but are clearly aimed towards generating text. Using Parameters, any kind of data can be passed to a Template for evaluation and text generation, which can then be made implicitly available to subsequently invoked Templates through Features.
One thing that wasn't detailed so far is that Features also have a second purpose: they allow Templates to submit queries for retrieving data themselves.
6.2 Ontology
Ontology deals with describing domain-specific knowledge in a structured form.
While Templates only deal with parameterized written text and its variation, an
Ontology layer can provide the added value of both world and domain-based
knowledge.
In the context of Scribe, this solves the problem of information availability: consider a Template that provides a text passage describing the early Monday morning of a character. Following modular principles, the character's name is passed as a Parameter, so it can be incorporated in the text, but that name alone doesn't tell us much about him. Does he prefer coffee, tea or something else? Is he a morning person? Does he like his job? The Template which has the job of describing that person's morning has no idea about any of this, and thus can hardly describe anything in a meaningful way. Passing all of this information explicitly via Parameters or Features seems impractical and short-sighted, as it would limit future text fragments to a fixed set of variables that strictly define what can be known. The calling site might have all the information, but the invoked Template has no way of asking for it and would be bound to whatever kind of useful information was anticipated when it was first defined. (Fig. 6.7)
With an added Ontology layer, this is no longer a problem. Rather than passing a character's name as a Parameter, an Ontology symbol that refers to that character can be passed, allowing the Template to access not only his name for a textual representation, but also any kind of information that is available in the data layer. This includes relations to other entities and abstract knowledge as well: Does he have a pet? What's the name of that pet? Is it a dog? Suddenly, all of this becomes available for text generation and specialization, where a lack of information would previously have forced authors to resort to writing rather generic texts (Fig. 6.8). It also allows a limited degree of reasoning: if the character has a pet, which is a dog, and dogs tend to wake their owners, it can be assumed that it would be reasonable for the character to be woken by his pet [3].
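A minimal sketch of this kind of lookup over a set of triples, mirroring the pet example from the footnote (entity and property names are illustrative; the prototype itself uses RDF and SPARQL for this, as described below):

# World knowledge as subject-predicate-object triples (simplified, in memory).
TRIPLES = {
    ("John", "OwnsPet", "Toby"),
    ("Toby", "IsA", "Dog"),
    ("Dog", "WakesUpOwner", "true"),
}

def objects(subject, predicate):
    return [o for (s, p, o) in TRIPLES if s == subject and p == predicate]

def is_woken_by_pet(person):
    # Tiny reasoning chain: person -> pet -> species -> species trait.
    for pet in objects(person, "OwnsPet"):
        for species in objects(pet, "IsA"):
            if "true" in objects(species, "WakesUpOwner"):
                return True
    return False

print(is_woken_by_pet("John"))   # True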
Another benefit of introducing a distinct Ontology layer is the clear separation of responsibility into writing and knowledge representation. Templates are no longer required to work in both domains (using facilities that aren't up to the task), but can instead focus on assembling text bodies, while querying data from a component that is specialized in providing it. In a more metaphorical sense, when regarding Templates as a subset of a virtual author's writing abilities, the Ontology layer would assume the role of his semantic memory.
For representing knowledge in the Scribe prototype, an implementation of the Resource Description Framework (RDF) was chosen. Using this specification, all knowledge is modelled as a set of relations. Each relation consists of a subject, a predicate and an object, and is typically called a data triple. Consequently, an RDF dataset consists of any number of triples and can be visualized as a relational graph of semantic entities. Unlike more traditional, object-oriented
[3] Example: If the main character has an OwnsPet relation to a Toby entity, which has an IsA relation to Dog, and Dog has a WakesUpOwner property, an appropriate Ontology query can reveal that the main character will be woken by Toby.
Figure 6.8: With an added Ontology layer, all knowledge is available to all Tem-
plates. Information about specific entities can be queried using a symbol.
Figure 6.9: The object-oriented approach on the left results in a single, atomic block of information that can be neither extended nor split up. The RDF approach on the right, by contrast, allows both.
This approach is reinforced by the core concept of RDF, where "anyone can say anything about anything" [12], a very useful trait with regard to the modular structure that Scribe is aiming for. When designing the Template Database in the previous chapter, it was a main goal to keep it open for additions: each module could be designed in a self-contained way, but still be extended by any other module, and new modules can be introduced simply by dropping them into a file system folder. For the same usability and maintainability reasons, the system's knowledge representation layer should aim for a similar degree of
modularity and extensibility; and this is where the triple structure of RDF really pays off: because each object is an aggregate of independent triples, data can be organized freely across multiple distinct datasets. For a single object, different Ontology modules can contribute different kinds of knowledge, in the same way that different books can deal with different aspects of a single topic. Mirroring the Template component's approach, the Ontology Database additively loads every RDF dataset that is located within a certain file system directory; it acts like a metaphorical bookshelf where individual books can be added and removed as needed.
For accessing knowledge, an implementation of the SPARQL query language was chosen. The main application of a query language in Scribe is to obtain a list of semantic entities that match certain conditions, and SPARQL is a text-based way of expressing those conditions so the underlying query engine can process them. Being designed specifically for RDF, a SPARQL query defines a set of relations that need to be matched; either as simple as checking for a property value (Lst. 6.1), or involving more complex relations (Lst. 6.2).
SELECT ?name
WHERE
{
    @{Param:SomeEntity} Property:Name ?name .
}
Listing 6.1: The core of a simple SPARQL query in Scribe; it retrieves the name of an entity that is passed via parameter.
In Scribe, queries are typically defined by Templates, which make use of them for setting default values of Parameters and Features. When a query yields multiple result values, each Feature or Parameter can be configured individually to either accept the first result, a random result, or all of them at once. When no results are available at all, a sequence of fallback queries and values can be specified, so the Template can still produce a generic text when the required specific data is not available. To further configure queries, they have access to all Features and Parameters in their defining Template's scope.
With RDF and SPARQL, Scribe's Ontology component is able to provide arbitrary amounts of world and domain knowledge, which can be used in the text generation process. This not only allows information to be managed in a much more structured and maintainable way than with the pure formal grammar approach used by previous systems, but also provides limited reasoning capabilities on top of that.
SELECT ?entity
WHERE
{
    ?entity Property:IsA* ?actor .
    ?actor Property:HasAttribute Flag:Actor .
Listing 6.2: A more complex query. It retrieves a semantic entity that is both flagged as an actor, and preyed upon by the main actor, which is passed as a parameter.
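Purely as an illustration of how such a query is evaluated against an RDF dataset in code, the following sketch uses the Python rdflib library (an assumption made here for demonstration; the prototype's own query syntax is the one shown in Lst. 6.1 and 6.2, and its data uses different namespaces):

from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")
graph = Graph()
graph.add((EX.John, EX.ownsPet, EX.Toby))
graph.add((EX.Toby, EX.hasName, Literal("Toby")))

QUERY = """
SELECT ?name WHERE {
    ?person ex:ownsPet ?pet .
    ?pet    ex:hasName ?name .
}
"""
for row in graph.query(QUERY, initNs={"ex": EX}):
    print(row.name)   # "Toby"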
An Ontology layer can also serve as a solution for certain grammatical aspects of writing: when an actor is provided via Parameter, or chosen dynamically using a query, text fragments need to address the topic of grammatical gender accordingly. Is it he, she, or it? Is it his arm, her arm or its arm? What is the plural of wolf? A correct sentence cannot be formed without knowing this, and clearly, providing pre-written sentences for each grammatical configuration is not a viable solution.
Instead, each semantic entity in the Ontology database could specify the proper wording, and Templates could access it dynamically via Ontology query. The entity John Smith would specify "he" and "his", while Jane Smith would return "she" and "her". An entity Wolf could specify "wolves" as a plural form. By abstracting away grammatical information, Templates can use the same base fragments and still produce correct texts independently of an entity's gender.
Unfortunately, this approach has several problems: for one, Templates would get cluttered with a lot of queries, because each single word that depends on the grammar of an entity would require one. An author would not be able to form a single sentence without having to think about queries and data access, which is a no-go from a usability perspective. More importantly though, storing grammatical information is not what the Ontology layer was designed for in the first place; it should store world and domain knowledge, but not language-specific lookup tables. Lastly, creating a strong tie between Ontology and specific languages only for the purpose of looking up grammatical cases seems like an unnecessary way to introduce synchronization issues in multi-language setups. It appeared obvious that a different solution should be considered.
6.3 Dictionary
Dictionaries are a way to store language-dependent data about words and phrases, and they can provide a solution to the problem of creating grammatically correct sentences without knowing all their elements up front.
In the domain of Natural Language Processing, a Dictionary usually describes a database that holds information about words: how to use them in a grammatically correct way, and sometimes additional data like synonym groups, translations or pronunciations. Using this database, text synthesis and translation programs are able to form grammatically correct sentences, find translations or provide text-to-speech services. Manually applying grammatical rules requires deep knowledge of the language at hand, and also introduces strong ties between the grammar algorithm and the language it is dealing with; neither of these is considered a necessary addition to this work, so a much simpler, language-independent and entirely data-driven approach was chosen.
            Singular    Plural
Noun:Wolf   wolf        wolves
Noun:Sheep  sheep       sheep
Noun:...    ...         ...

Table 6.1: An excerpt from a simple noun Dictionary. Each symbol can map to
multiple words, depending on the required form.
Table 6.2: When referring to a dynamically chosen entity using a pronoun, the
correct form can be retrieved from a special Dictionary that holds gender-specific
word forms.
So why not write the word itself directly instead? The real advantage of Dictionaries
emerges when Ontology entities are allowed to link to Dictionary symbols: They no
longer have a direct string representation that is inserted into the text, but instead
refer to a Dictionary row, which enables fragment writers to refer to the entity's
plural without knowing which entity it will be (Lst. 6.3).
Of all the {Var:SomeAnimal.Plural}, one was especially clever.
Listing 6.3: Linking Ontology and Dictionaries allows accessing the correct form
of an entity that is not known when writing the text.
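To make this mechanism more concrete, the following is a minimal C# sketch of how such a lookup could work; the class and member names are illustrative assumptions and do not reflect Scribe's actual implementation.

using System.Collections.Generic;

// A Dictionary row maps a symbol like "Noun:Wolf" to its word forms.
class DictionaryRow
{
    public string Symbol;                      // e.g. "Noun:Wolf"
    public Dictionary<string, string> Forms;   // e.g. "Singular" -> "wolf", "Plural" -> "wolves"
}

// An Ontology entity no longer carries a plain text representation,
// but a reference to a Dictionary symbol instead.
class OntologyEntity
{
    public string Id;          // e.g. "Knowledge:Wolf"
    public string TextSymbol;  // e.g. "Noun:Wolf"
}

class WordFormResolver
{
    private readonly Dictionary<string, DictionaryRow> rows = new Dictionary<string, DictionaryRow>();

    public void Add(DictionaryRow row) => rows[row.Symbol] = row;

    // Resolves an access like {Var:SomeAnimal.Plural}: the entity bound to the
    // variable is followed to its Dictionary row and the requested form is returned.
    public string Resolve(OntologyEntity entity, string form) =>
        rows[entity.TextSymbol].Forms[form];
}

With a row for Noun:Wolf loaded as in Table 6.1, Resolve(wolfEntity, "Plural") yields wolves, so the fragment from Listing 6.3 can be completed no matter which animal the query bound to Var:SomeAnimal.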
Table 6.3: By directly linking the Noun:Wolf row to the grammatical one of
the appropriate gender, it inherits all gender-specific data and can access the
appropriate word forms as if they were defined directly.
John opened his eyes to a sharp ringing noise that tore through the comforting silence of his
room. He reached out for the alarm clock.
Mary opened her eyes to a sharp ringing noise that tore through the comforting silence of her
room. She reached out for the alarm clock.
Figure 6.10: Using linked Dictionary rows, all grammar information is accessible
with only one main symbol that can be accessed as if it had defined all forms
itself. This allows texts to be written conveniently without knowing the gender
of certain entities beforehand.
system decide which word to choose for a certain entity while at the same time
maintaining grammatical correctness as described above.
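The row linking from Table 6.3 can be pictured with a similarly hedged sketch: a Dictionary row that does not define a gender-specific form itself simply delegates the lookup to the grammatical row it is linked to. Again, the structure is an assumption made for illustration only.

using System.Collections.Generic;

// A Dictionary row can link to another row (e.g. a grammatical gender row)
// and inherit every form it does not define itself.
class LinkedRow
{
    public Dictionary<string, string> OwnForms = new Dictionary<string, string>();
    public LinkedRow Link;   // e.g. the row for John Smith links to the "he/him/his" row

    public string GetForm(string form)
    {
        if (OwnForms.TryGetValue(form, out var word))
            return word;
        return Link != null ? Link.GetForm(form) : null;
    }
}

A row for John Smith linked to the masculine gender row would then answer a request for the "my/his" form with his, while the same template text evaluated for Jane Smith yields her, producing the two variants shown in Figure 6.10.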
Aware of the project's modularity goal, the Dictionary component also follows
in the footsteps of Ontology and Templates: Users can define any number of
Dictionaries, which are additively loaded from a file system folder. This approach
was successfully used in previously described main components for creating a
modular interface to user-generated content, and there was simply no reason to
deviate from that path, as Dictionaries are inherently modular anyway.
6.4 Expressions
Expressions are instructions for processing a set of values to produce a new value.
In Scribe, they play a vital role in connecting human-written text to the rest of
the system.
In the very first Template implementation, a text fragment was nothing but a
completely static block of text: It would simply be passed to the system output
as is, and there was no way to insert variables or other Templates at specific
locations. While conceptualizing these first steps and adding more and more dy-
namic text generation tools to the system, it quickly became obvious that simple,
generic % markers like the ones Almost Goodbye used wouldn't be enough: They solved
the issue of knowing where to insert generated text, but not what kind of text
to generate, which was not a problem in the specific and limited scope of a
short piece of interactive fiction, but it certainly would be in a general-purpose
environment.
His favourite cat was {Var:FavouriteCat.Age} years old.
Listing 6.4: Similar to accessing word forms from a Dictionary in Fig. 6.10, it
should be possible to access Ontology entity relations to retrieve data as well.
At the very least, each marker in the text should be able to specify a symbol
that refers to the Template or variable value to insert. This first setup already
allowed users to access most of a Template's feature set, simply by allowing them
to specify which kind of content to insert, and where to insert it in the text. It
was not able, however, to account for accessing different word forms using the
Dictionary component, or a more in-depth usage of the Ontology layer: Both of
them require the user to specify not only the name of a target entity, but also
the name of the member to access (Lst. 6.4). Furthermore, Templates that
expose parameters require some way to specify their value when referring to
them, similar to a typical function call in a programming language. Lastly,
there needs to be some way to not only access Ontology data, but also process it,
like checking whether one character is older than another, in order to get the
most out of it.
Table 6.4: An open list of different use cases for expressions in text fragments.
With arithmetic and logical operations, function calls and member access
already on the growing requirement list, the stretch from sophisticated text in-
sertion markers to full expression evaluations became smaller and smaller. So,
rather than extending the initial support for symbol markers, they were re-
placed entirely with a custom expression language that is able to provide both
the required basic features of text insertion, as well as more complex opera-
tions. Its syntax and semantics were specified directly in source code, using a
library called Sprache [34], which does the low-level work of parsing and trans-
forming any provided input text accordingly. Depending on the specification that
is passed to the library, its parsing output can be any kind of structured data
like the abstract representation of an XML document. In Scribe's implementa-
tion, the resulting data structure is an expression tree [35] that can be compiled
into a method at runtime, which can then be executed in order to evaluate the
expression. (Fig. 6.11)
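To illustrate this pipeline, here is a minimal, self-contained sketch of the same idea for a toy language that can only add numbers; it is not Scribe's actual grammar, but it shows how a Sprache specification produces an expression tree that is compiled into a method and executed.

using System;
using System.Linq.Expressions;
using Sprache;

static class TinyExpressionLanguage
{
    // A number literal becomes a constant node of the expression tree.
    static readonly Parser<Expression> Number =
        Parse.DecimalInvariant
             .Select(s => (Expression)Expression.Constant(
                 double.Parse(s, System.Globalization.CultureInfo.InvariantCulture)))
             .Token();

    // A "+" chains two sub-expressions into an addition node.
    static readonly Parser<Expression> Sum =
        Parse.ChainOperator(Parse.Char('+').Token(), Number,
            (op, left, right) => Expression.Add(left, right));

    public static double Evaluate(string input)
    {
        Expression tree = Sum.Parse(input);                                     // parse into an expression tree
        Func<double> method = Expression.Lambda<Func<double>>(tree).Compile();  // compile it into a method
        return method();                                                        // evaluate by executing it
    }
}

Calling TinyExpressionLanguage.Evaluate("1 + 2 + 4") parses the input, builds the tree (1 + 2) + 4 and returns 7; Scribe's real expression language adds identifiers, member access and function calls on top of the same principle.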
Both of these processing steps, parsing and evaluation, require a certain de-
gree of context information to work. When parsing an expression, identifier sym-
bols can mean different things depending on where that expression is located:
An identifier like Var:Actor can refer to one variable when used in Template A,
but a different one when used in Template B, depending on the scope of each.
From an implementation perspective, this can be solved by passing the required
context data on to the expression parser via parameter; but since it would re-
quire either anticipating what kind of context data is needed, or blindly offering
all of it, this could later turn out to be a poorly maintainable choice. Not only
would the exact specification of each context parameter affect the way in which
both the parser and the environment invoking it need to be implemented, it
would also make adding, removing or changing these parameters more difficult.
(Fig. 6.12)
Figure 6.12: Passing all context information via parameter ties parser and envi-
ronment strongly together. This can make maintenance more difficult, especially
when having multiple different environments to support.
To prevent this, an interface for querying specific context data was defined.
By passing only the interface on to the expression parser, it remains entirely
decoupled from its environment while at the same time being able to access
context information as needed. A setup like this also makes it easier to use the
expression parser in different environments - like inside of Ontology queries, or
as conditional statements for Templates and individual text bodies. (Fig. 6.13)
Figure 6.13: Introducing an interface that allows the parser to retrieve context
data from its environment can help provide a clear separation.
Figure 6.14: Similar to the IParserContext interface from Fig. 6.13, the
IEvalContext interface provides a clean separation between expression evalu-
ation and the context expressions are evaluated in.
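A minimal sketch of this decoupling could look as follows. Only the interface names IParserContext and IEvalContext are taken from the figures above; their members are assumptions chosen for illustration.

using System.Collections.Generic;

// The expression parser sees only this interface, never the Template,
// Ontology or Dictionary classes behind it.
interface IParserContext
{
    bool IsKnownIdentifier(string identifier);
}

// Evaluation is decoupled in the same way: values are requested on demand.
interface IEvalContext
{
    object GetValue(string identifier);
}

// A Template scope is just one possible environment; an Ontology query or a
// conditional text body could implement the same interfaces differently.
class TemplateScope : IParserContext, IEvalContext
{
    private readonly Dictionary<string, object> variables = new Dictionary<string, object>();

    public void Set(string name, object value) => variables[name] = value;
    public bool IsKnownIdentifier(string identifier) => variables.ContainsKey(identifier);
    public object GetValue(string identifier) => variables[identifier];
}

Because parser and evaluator only ever talk to these two interfaces, adding or changing context data means changing an implementation, not the parser's signature.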
Identifiers: All identifiers in the Scribe expression language are treated as liter-
als, which are resolved at runtime. That way, they can be passed around
or stored in variables, similar to pointers in C or C++. Since every object
is referred to via identifier, this works not only with Templates, but also
Dictionary entries, Ontology entities and even with variables themselves.
Being able to handle all kinds of objects that way makes it a lot easier for
Templates of different modules to communicate with each other, because
they have the means to exchange the building blocks on which the whole
text generation system is based.
to specify an intuitive short name for each word form. The plural of a noun
can be shortened simply with an s, resulting in a very minimal way to request
the proper plural form of a variable's Ontology entity:
Var:SomeAnimal.s
Maintenance efforts put a high workload on the task of creating a new
programming language and also introduce a second point of failure: Does
my script produce the wrong output because there is an error in my script,
or because there is a problem with the language it is written in? Especially
in the context of this thesis, implementing and maintaining a custom pro-
gramming language would carry a high potential to be a significant time
sink.
The way a plugin interface is specified will also define the contact points
between core and plugin: where exactly it is possible to add custom functionality,
and how it is done. In the current Scribe implementation there are two of them:
Part III
Evaluation
7 | Results
In the concept phase of this work, a number of potential use cases for Scribe
was defined, most of them falling into at least one of three categories: Using
text generation as a way to automate writing repetitive text segments, using it
to introduce variations to otherwise static text, and finally leveraging it as a tool
to deal with the complex task of adapting a text according to multiple variables.
With these categories in mind, a set of three test cases has been determined in
order to evaluate the prototype that was created in the course of this thesis:
1. Telling the same story with different actors can illustrate Scribe's ability
to complete a text fragment with variables, including both finding suitable
solutions for variable constraints and properly reacting to them within the
text, for example by adapting to different genders of an actor.

2. Changing the mood of a pre-written story tests the system's ability to
introduce not random, but consistent variation to an otherwise static text,
altering its overall sentiment without changing the underlying plot.
3. Generating short factual texts from data serves the purpose of testing
Scribe's usefulness in a different domain than varying fictional stories,
where text is written and consumed primarily to deliver information, rather
than for its entertainment value.
For each of these test cases, a concrete example has been chosen in order to
add a practical element. While in some instances further consideration
was necessary to find an optimal angle on how to use Scribe's facilities, no more
system changes were implemented at this point. All following discussion deals
purely with different ways to use the previously described text generation system
and should not be regarded as a continuation of development, but rather as an
exemplary display of potential workflow elements.
The Lamb that belonged to the sheep whose skin the Wolf was wearing began to follow the
Wolf in the Sheep's clothing. So, leading the Lamb a little apart, he soon made a meal off her
and for some time he succeeded in deceiving the sheep, and enjoying hearty meals.
Listing 7.1: The version of The Wolf in Sheep's Clothing that was chosen as a basis
for this test case. Source: [44]
The Sheep is considered the secondary actor, which, in addition to the basic
actor constraints, is required to be an entity that the main actor is preying
upon.
The Lamb is a special version of The Sheep, which is led astray and devoured by
the main actor. It is not vital to the story and can be described simply as a
child or young of the secondary actor, in case there is no special entity in
the database for filling that role.
The Shepherd is, on an abstract level, the protector of the secondary actor, who
watches over them and keeps the main actor at bay. This role is required
for the original version of the fable, but, in cases where the Ontology layer
does not provide such an entity, can be skipped by assuming that the group
of secondary actors simply protect or watch over themselves.
The Dog is an optional actor that serves the role of the protector's helper. Sim-
ilar to The Lamb, its appearance is non-vital and more of a decoration to
the overall story.
The first three roles are determined in the story's root template, as they will
be used throughout the story independently of any variation that might be in-
troduced. While the young version of the secondary actor remains optional due
to its paraphrase fallback, main and secondary actor themselves are required for
the story to exist in the first place. Stories where either of those roles remain un-
filled will be discarded by the system at an early stage and trigger a backtracking
step to find a valid solution. In contrast, the protector roles are optional
for a different reason: When no suitable protector can be found, text fragments
that require one simply become unavailable, leaving only those which assume
that the secondary actor has no protector. Rather than requiring
certain entities to exist, the text is shaped based on their availability.
Once upon a time, there was a hyena who liked to eat antelopes, yet had a hard time getting at
them, because they were very careful and kept their eyes open for intruders. But the hyena
was patient and followed the antelopes in some distance, day in and day out, collecting old
parts of their fur that they had left behind, and making a cloak out of it.
After a while, an unsuspecting antelope ended up separated from the rest, and alone with the
hyena, who soon swallowed it.
Listing 7.2: A generated version of the fable with different actors. Note that,
since there is no protector defined for antelopes, the text has been altered to not
require one.
Difficulty arises from the fact that, while the fable's moral only requires a
very abstract setup, its concrete form includes various details depending on its
choice of actors. The secondary actor having a protector or not is an example
of the more significant ones, but it also shows in details like the construction
of a disguise: Finding the pelt of a flayed secondary actor only makes sense if
that actor both has a pelt in the first place and is a domesticated animal that is
flayed or sheared. At the same time, other kinds of secondary actors might offer
different ways to obtain a disguise that might be described instead.
A fox found great difficulty in getting at the chickens, as they always keep their eyes open for a
threat. But the fox was patient and followed the chickens in some distance, day in and day out,
collecting old parts of their feathers that they had left behind, and making a cloak out of it.
Without being noticed, one of the chickens' children started to follow the fox in chicken's
clothing and was swallowed soon after.
Listing 7.3: Another generated version of the fable. In this case, chickens are
preyed upon, which do not have a pelt like the original actor. Instead, the fox
ends up collecting their feathers.
The process of concretizing an abstract idea is a creative one that feels natu-
ral to humans, but is yet very difficult (if not impossible) to solve for a machine.
Scribe itself has no way of coming up with new ideas, or even determining that
picking up the flayed pelt of a chicken doesn't make any sense. It doesn't under-
stand the text it is writing, but simply evaluates and executes the patterns that are
defined by previous human input. Consequently, its problematic inability to be
truly creative is solved by not trying to be creative at all and instead leveraging
human creativity wherever possible. When describing the act of obtaining a dis-
guise in Scribe, a viable practice is to provide various conditional text fragments
with specific descriptions1 , and a generic text template2 as a fallback. While the
generic variant ensures that there is a valid text description in all cases, specific
versions can paint a more lively picture of the concerned plot point whenever
their conditions are met.
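As a rough illustration of that practice, fragment selection can be thought of as picking the first variant whose condition holds, with an unconditional generic variant at the end of the list. The data types and attribute names in this C# sketch are invented for illustration and do not reflect Scribe's internal structures.

using System;
using System.Collections.Generic;
using System.Linq;

class Fragment
{
    public Func<IReadOnlyDictionary<string, string>, bool> Condition;
    public string Text;
}

static class DisguiseScene
{
    // Specific variants first, generic fallback last, so some description always
    // exists while a better-fitting one is preferred whenever it is available.
    static readonly List<Fragment> Variants = new List<Fragment>
    {
        new Fragment { Condition = a => a.TryGetValue("Covering", out var c) && c == "pelt",
                       Text = "finding the flayed pelt of one of them" },
        new Fragment { Condition = a => a.TryGetValue("Covering", out var c) && c == "feathers",
                       Text = "collecting old feathers that they had left behind" },
        new Fragment { Condition = a => true,
                       Text = "obtaining a disguise" }
    };

    public static string Describe(IReadOnlyDictionary<string, string> actor) =>
        Variants.First(v => v.Condition(actor)).Text;
}

Called with an actor whose Covering attribute is "feathers", Describe returns the feather variant used in Listing 7.3; for an actor without any matching attribute, the generic fallback keeps the story intact.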
Although this was only the first test case for Scribe, it quickly became clear
that its Expressions and Dictionary components were able to serve their purpose
1 For example a specific text fragment for Animals that are flayed, or one for Animals with
some form of fur or coat.
2 For example simply not detailing how exactly the disguise is obtained and just stating that
this is the case.
as intended and provide an easy way to integrate dynamic variables into the text
across different grammatical situations. As detailed previously, grammar was
not applied by the system in any way, but by the human author writing the text
templates, who used member and dictionary lookups on variables in order to
invoke different forms of an entity's text representation.
However, a rather fundamental issue became evident as well: Re-using a
text fragment in different contexts and thus reducing human workload is only
possible if that text fragment is generic enough to be equally applicable in all
of them; but especially in creative texts, this genericness can easily be perceived
by the reader as lifelessness, lacking the carefully arranged detail one would
expect from a human-written text. Purely factual texts, like business reports or
corporate E-Mail, don't have this problem to the same degree, because a certain
sobriety is expected from them. As far as more creative text for recreational
reading is concerned, this problem can be countered by preferring a high number
of specific text fragments over a few generically varied ones. (Fig. 7.1)
Figure 7.1: Text generation based on pre-written and adapted text fragments
puts the author in a position of conflicting goals: writing generic text to re-
duce workload, and writing specific texts to improve quality. The ideal balance
between the two depends on the nature of the written text.
ent aspect, while "One who has a goal will find a way" completely switches from
the sheep's cautious perspective to appraising the value of the wolf's cleverness.
The likely reason this drastic change still works so well is that the human
reader does all the work by reinterpreting the text according to the moral it
tells him to convey: although nothing changed in the text itself, it appears that
the reader's view of it can be shifted easily using only a few anchoring cues. This
could have implications not only in the domain of procedural story generation,
but also when it comes to creating variations of factual texts and reports that
present the same content with different sentiments and shifted views. However
interesting, expanding on this would exceed the scope of this work and will be
left to the investigation of future works.
I showed him the plot and we watched at the kitchen window so he could see the birds he was
meant to scare. He took it all in, but I could see his hands shake. I said 'Are you nervous?' He
said he had Essential Tremors and he probably drank too much, but the shaking helped his
work. 'Makes you look real,' he said.
Before dawn every morning I let the cat out, and I see Vincent setting up. I think it's nice he's
there and I wave, and he waves.
Listing 7.4: The original text of the short story Scarecrow. Source: [45]
1. Creating a story template with a fixed plot that can be procedurally varied
to be delivered with either a positive or a negative sentiment. Where the
first test case could theoretically afford to operate mostly on an easy "insert
actor here" replacement basis, this is not possible here. Rather than
being a part of the text itself, the overall sentiment is a meta information
that emerges from the sum of all content and wording, as well as the in-
teraction between the reader's expectation and the writer's intention. Changing
the mood does not require replacing core facts, but altering the way in
which they are presented.
2. Extending an existing story template with a new one in a modular way. This
point is about testing whether a viable modularity can be achieved when
designing procedural story templates and data structures. In theory, would
an author be able to write a distinct set of templates and data that adds
a different spin to the story, without modifying any parts of the existing
setup?
The first one was achieved by defining an environment variable that stores
the overall mood value (positive or negative) and allows all of the story's
text fragments and templates to react to it by describing things either in a good
or in a bad light. First results don't indicate fundamental problems with this
approach and it should be possible to scale it accordingly when using it on bigger
text bodies. Of course, reducing a story's mood to a simple, constant, binary
representation of positive or negative is hardly an adequate practice for more
complex texts, but beyond the scope of a simple proof-of-concept test case, there
is no reason not to use a more fine-grained adjustment when needed.
So we hired this scarecrow. Frederick. One of my friends had this cousin who offered to do the
job and I thought it sounded like a weird job to do at all; so I called him up to see if he was
serious. He came around before I drove to work to meet up and I just gave him the job right
away.
I walked him around the plot and we watched from a distance so he could see the birds he was
meant to scare. He glared at them with a strange look in his face, and I could see his hands
shake. I said 'Are you nervous?'. He didn't answer at first, and with his empty eyes stared into
oblivion; then snapped out of it and said something about 'Essential Tremors' and drinking too
much.
When the cat wants out every morning, I see Frederick setting up. Our cat doesn't really want
him here, and I'm having second thoughts about this as well.
Listing 7.5: One of the negative versions of the story Scarecrow.
Tinkering with ideas from the realm of plot variation, it is important to keep
in mind that unlike TaleSpin and its relatives, Scribe has no way of performing an
actual character or world simulation. All dynamic variation would be the result
of an author's careful planning, and the combinatoric complexity that arises from
assembling a lot of dynamically adjusted small parts into a single whole. Whether
or not a system like Scribe could successfully be used to create a procedural text
on the scale of a book is yet to be determined, and while a reasonable scepti-
cism is indeed advisable on that matter, the test cases that have been performed
so far have not yielded a fundamental reason to reject this possibility.
Shifting focus to the domain of extensibility, the second test case was per-
formed in order to find out whether it was possible to extend an existing procedural
text template in a non-intrusive and modular way. Aside from the previously
existing positive / negative mood setup, would it be possible to introduce a dif-
ferent theme to the story, without the need for modifying existing structures? As
an example, an alternative version should be introduced where the scarecrow is
perceived as a ghost or supernatural entity, in order to give the story a different
spin with a touch of mystery. The example case was consciously chosen to in-
troduce a new element to the text, without the comfort of being able to simply
re-use existing structures with additional data.
We hired a scarecrow. Gudrun. One of my friends had this cousin who offered to do the job
and I thought it sounded like a weird job to do at all; so I called her up to see if she was
serious. She came around noon to meet up and I just gave her the job right away.
She seemed like a nice girl, but I somehow couldn't really focus on what she said. Her words
just slipped my mind. I walked her around the plot and we watched from a distance so she
could see the birds she was meant to scare. None would show up, which was a little strange.
'They know when I want them to leave', she said, 'and I don't want them to bother you'.
Gudrun is setting up every morning now, but most people don't notice her. Only our cat does.
Scout was always a bit anxious around the wide open field, but since we have Gudrun, that
somehow seems to be fine. I think it's nice she's there.
Listing 7.6: The basic story Scarecrow in a positive mood, but using ghost
story templates that were introduced using a second module.
Figure 7.3: In order to give an existing story a new spin, its root Template can
be wrapped in a dummy which sets an environment variable. Conditional child
Templates can then check that variable in order to override existing text frag-
ments with different ones.
While human authors are certainly much more capable of writing encyclo-
paedic articles than an algorithm that has no way of understanding the topic it
is dealing with, the main advantage of automatically created text on Wikipedia
is language coverage: English and German are among the most represented lan-
guages, with between one and five million articles each, but around 160 languages
are well below the 10,000 mark, and a lot of them don't even reach a thou-
sand articles. [41] Even the least represented languages, Kanuri and Herero, have
around five million speakers but not a single article besides their front page.
Supplying the less maintained language-specific Wikipedia sites with content is
a huge effort with only very few active contributors. Since a lot of articles can
be grouped into categories with a core set of information that can be described
3 In Wikipedia terms, a stub article is one that contains only a few sentences of very basic
information, without any in-depth exploration of the topic. Articles of this kind usually act as
placeholders, until someone invests the time and research effort to expand them.
4 Cebuano, also known as Visayan, is a language spoken by about 20 million people in the
Philippines.
5 Waray-Waray is spoken by around 2.6 million people and, like Cebuano, originates from the
Philippines as well.
Merging Redundant Data: Some information, like the name or the run length
of a film, was expressed redundantly through multiple data properties, but
none of them was guaranteed to be used at all. Since iterating through
a set of potentially existing properties at runtime in order to search for a
bit of information that may or may not exist seemed like an unnecessary
point of failure, all redundant data was mapped to a single, well-defined
property each. As a result, the name or run length of a film can be looked
6 Plants and animals have a scientific classification, works of art have an artist and a time of
creation, etc.
up using a single identifier, and if that doesn't yield any results, it can safely
be assumed that the requested information is simply not available.
Extracting Implicit Data: Evaluating explicit property data like the name or
runtime of a film is trivial, but covers only a part of the overall available
information. In the ontology data that was obtained from DBpedia, there
is a notable amount of information encoded within the categories a certain
entity is associated with. If a film is categorized under "1940s American
Western", its (rough) release date, nationality and genre can be derived
from it, most of which would have been unknown otherwise. Implicit in-
formation like this can then be transformed into explicit property values to
allow easy and well-defined access patterns (a rough sketch of this step
follows after this list).
analysed for cues on its subject's gender. An easy way to do this is counting the
number of occurrences of each gender's personal pronouns7 and assuming
that the gender with the highest count is likely the one of the person that
is described.
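Both preparation steps lend themselves to small, hedged sketches. The category label format and the pronoun lists below are assumptions made for illustration; they do not describe the actual preprocessing code that was used.

using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class DataPreparation
{
    // Turns an implicit category label like "1940s American Western"
    // into explicit, well-defined properties.
    static readonly Regex FilmCategory =
        new Regex(@"^(?<decade>\d{4})s\s+(?<nationality>\w+)\s+(?<genre>.+)$");

    public static Dictionary<string, string> ToExplicitProperties(string category)
    {
        var properties = new Dictionary<string, string>();
        var match = FilmCategory.Match(category);
        if (match.Success)
        {
            properties["ReleaseDecade"] = match.Groups["decade"].Value + "s";
            properties["Nationality"]   = match.Groups["nationality"].Value;
            properties["Genre"]         = match.Groups["genre"].Value;
        }
        return properties;
    }

    static readonly string[] Masculine = { "he", "him", "himself", "his" };
    static readonly string[] Feminine  = { "she", "her", "herself", "hers" };

    static int CountPronouns(string text, string[] pronouns) =>
        Regex.Matches(text.ToLowerInvariant(), @"\b[a-z]+\b")
             .Cast<Match>()
             .Count(m => pronouns.Contains(m.Value));

    // Guesses the gender of a described person by the dominant personal pronouns.
    public static string GuessGender(string abstractText)
    {
        int m = CountPronouns(abstractText, Masculine);
        int f = CountPronouns(abstractText, Feminine);
        if (m == f) return "unknown";
        return m > f ? "male" : "female";
    }
}

For example, ToExplicitProperties("1940s American Western") yields ReleaseDecade = 1940s, Nationality = American and Genre = Western, and GuessGender("She directed her first film ...") returns "female".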
As it turned out, preparing the film ontology dataset for its usage in Scribe
was a non-trivial task that took considerable time to complete, mainly due to prevalent
problems with the source data. The transformation process into a more usable
form is still far from perfect, but was deemed good enough for the task at hand,
especially since the main focus of this exercise is Scribe's text generation
capabilities, rather than practical data mining applications. The final ontology
dataset that was used by Scribe consists of a guaranteed Name property and a
range of additional properties (See Table 7.1), which may or may not be defined
for any given film entity.
Once the film ontology dataset and Dictionary were generated and ready to be
used, writing the actual text Templates in Scribe was a comparatively small task,
consisting mostly of coming up with suitable ways to describe each fact, struc-
turing the text accordingly and reducing repetitive wording in the final result.
A recurring problem was the fact that nearly all data properties are optional,
resulting in a vast number of conditional text fragments that are used, replaced
with a fallback or ignored, depending on the existence of their required proper-
ties. Simply mentioning a film's writer and director in the same phrase requires
four different variants, depending on which of the properties are available and
which aren't:
written by {Var:Film.Writer}
directed by {Var:Film.Director}
written and directed by {Var:Film.Director}
written by {Var:Film.Writer} and directed by {Var:Film.Director}
Other text fragments form full sentences based on property values and are
either included in the final text, or simply ignored when no appropriate data is
available; which is easier to handle on the author side, but can lead to a list of
disjoint sentences with a bad reading flow. Building at least parts of the final text
on a sub-sentence level allowed sentences to be joined manually and transitions to be created
7 he, him, himself, his / she, her, ...
Table 7.1: A list of relevant film data entries from the processed result ontology.
instead, at the cost of a more complex Template structure and higher effort for
the text author. This again highlights the issue of specialization versus reusability,
as pictured earlier in Fig. 7.1: When trying to achieve a continuous reading flow
and narrative, specialized text is required; whereas the easiest thing to do with
a dataset is generating a list of disjoint sentences, each stating a simple fact.
The more specialized a text gets, the less reusable and dynamic it becomes, so
pursuing the goal of text quality appears to be in direct conflict with attempting
to reduce the overall workload by sticking to generic patterns.
Evaluating the quality of Scribe's generated text can be a bit tricky to do in an
objective way, because the quality of a text can't be easily defined when moving
beyond describing stylistic issues. [22] Instead, each generated film summary was
compared to the corresponding excerpt from the human-written Wikipedia text
in order to put it into perspective. When asking test candidates to decide which
version of a text was taken from Wikipedia, and which one was generated, they
were usually able to tell the difference, but not always easily so. Although,
without an actual study and a lot more participants, this finding has little more
than anecdotal value, it was still notable that some candidates even scored as
low as 50 percent in classifying the origin of a given text; meaning that, in these
specific cases, the generated texts remained essentially indistinguishable from
actual Wikipedia excerpts.
When assuming that these first results are representative of Scribe's poten-
tial, it can be expected to be a viable helper for populating some of the lesser-
maintained languages on Wikipedia, one specific domain at a time.
However, it also became apparent that generating more than just a quick
summary would pose a serious problem to the system, as writing a whole article
about a topic has more creative elements to it than wrapping a predefined set
of data points into sentences; in fact, even this task alone turned out to be a
less than ideal fit for Scribe's capabilities: Being primarily a tool for varying and
adjusting pre-defined text Templates, it is easy to generate individual sentences
from facts, but there is no way to make sure these sentences actually form a
continuous text, a single narrated entity. (See Fig. 7.5)
It was written by {Var:Film.Writer}. It was directed by {Var:Film.Director}. It yielded a gross
revenue of {Var:Film.Revenue}.
Written by {Var:Film.Writer} and directed by {Var:Film.Director}, it yielded a gross revenue of
{Var:Film.Revenue}.
Figure 7.5: The first text is easy to generate, because each sentence is optional
and completely unrelated to the others, which also makes it hard to read. The
second text is much easier to read, but requires a lot more effort in its imple-
mentation, because it is unclear which of the required facts are available and the
more fluent version creates a strong dependency between them.
Since all writing is done by an author, there is no easy way to insert a new fact
into the existing text, because all sentences have been carefully crafted to work
in the absence or presence of various fragments and yield a nicely readable text
result. Adding a new sentence in between requires complete knowledge of the
surrounding text generation behaviour, which complicates the writing process
significantly. While Scribe works reasonably well with a high text-to-data ratio
like in the previous two test cases, it starts to be a lot less useful when it comes to
transforming raw data into a tightly packed text summary. It is a system that is
most helpful when the main issue is organizing text and adjusting it to different
variables; not for writing the text itself.
Table 7.2: A direct comparison between a select set of film summaries, generated
from data on one side and manually written by humans on the other.
✓ Structure: The main benefit of using Scribe over just writing a lot of text
variants is the framework it provides for not only structuring data and
text, but also integrating both to produce a result. Templates, Dictionary,
Ontology and Expressions work together as anticipated and allow authors
to separate data from text, access it using an abstract layer and present
it using a pre-defined scheme of text fragments, which are dynamically
assembled into a complete text. Especially in cases with a high number of
variables and variations, this can provide a helpful structure to keep track
of it all.
✓ Throughput: Once set up, the system requires no further human input
besides the data it should process during text generation. Given a uniform
dataset and a set of Templates on how to transform this data into a text,
the time consumption of generating a new output becomes negligible. This
can help with tasks like generating Wikipedia article stubs, but also reduce
maintenance efforts for introducing changes to a uniform text template
compared to maintaining text variants manually: Independently from the
amount of generated texts, as long as they are generated procedurally, all
it takes to modify them is to adjust their generation Template.
✗ Illiterate: The system doesn't understand any part of the text on a deeper
layer and, as a result, is unable to work with it in a more meaningful way
than basic character string operations. Since it cannot perform tasks that
require a certain degree of semantic knowledge, all of them have to be
performed by the user. The previous issue of manually joining sentences
and writing transitions due to the system's inability to help on that matter
is one prominent example of this.
Wrapping this up, it can be expected that a high potential for real-world use
cases of Scribe and similar systems is situated in domains where a recurring
body of text fragments and phrases is used and re-used in order to convey a set
of basic data points to a reader. Generating Wikipedia stub articles is a successful
example of this, but one can easily imagine using a similar system for generating
corporate E-Mail and text reports as well. The originally anticipated idea of
procedurally generating or varying stories can be successfully implemented as
well, even though it is important to note that the system isn't actually a creative
entity by itself, but designed to help a creative author organize and maintain a
large body of variations.
8 Such as Open Office or Microsoft Word.
8 | Summary
In this work, the concept of Procedural Generation was applied to the domain of
authoring natural language text and, in doing so, the capabilities and limitations
of a specific approach were explored practically. As a vital requirement of this,
the prototype system Scribe was developed and tested with three distinct use
cases, each focusing on different aspects of the generation and writing process.
Although the scope of this project later shifted towards a more general direc-
tion, the original idea was to determine to which degree fictional stories can be
procedurally generated without severely impacting their quality.
Figure 8.1: The Scribe testing interface that was used for displaying generated
text results.
Over the course of the initial research effort, various former story generation
systems were described and analysed, including early plot generators like Tale-
Spin, Minstrel and Brutus, as well as the more productively oriented procedural
writing experiment Almost Goodbye. While the discussed plot generators rely on
a world simulation approach with only little reliability and limited text quality,
Almost Goodbye manages to satisfy all production quality requirements using a
formal grammar based approach, albeit a strongly specialized one. In an effort
to build upon this, the presented prototype system was conceptualized.
Instead of recursive text replacement using formal grammars, a more com-
plex design was crafted, involving distinct system components for text fragments
(Templates), grammatical word forms (Dictionary), domain knowledge / data
(Ontology) and generation logic (Expressions, Plugins). By relying on a special-
ized solution for each problem of the chosen approach, Scribe is able to expand
the core idea of fragment-based procedural text generation as seen in Almost
Goodbye to a more general-purpose scope. A proof of concept for this was con-
ducted as part of the evaluation and testing process. (See Fig. 8.1 and Fig. 8.2)
Figure 8.2: A debug view of generated text that allows its internal structure to
be inspected.
A basic functionality test of all five system components was done by adapt-
ing a pre-written fable to different actors that were dynamically chosen from
a database while solving for a set of constraints. The system's ability to intro-
duce not random, but thoroughly consistent variation was tested by changing
the mood of a pre-written short story, both in a basic version and an extended
setup where entirely new aspects were introduced into the plot for a subset of
generated stories. Finally, Scribe was successfully used to generate Wikipedia ar-
ticle stubs from a database, in order to prove its viability in a completely different
domain with different requirements.
9 | Outlook
Figure 9.1: In a Live Editing approach, the user is able to see and interact with
both the source Template as well as the resulting generated text at the same
time.
Template Features and Parameters could be defined using a fixed user in-
terface as opposed to typing text representations of them, and when writ-
ing embedded Expressions or Ontology queries, their respective specialized
editors could be used directly to guide users through the process.
In order to provide some additional debugging tools, the preview could
also include an advanced mode, which would allow inspecting the values of
Ontology: Scribe's Ontology layer is built upon RDF, a system that relies on data
Triples to express both object properties and relations between objects.
Providing an editor for this kind of data can be done in multiple ways with
different weights on overview and editing convenience.
1 For reference, see applications like Microsoft Excel, Google Sheets or LibreOffice Calc.
Figure 9.3: An example of an editable table view for Ontology data Triples,
visualized using Google Sheets.
Expressions: Scribe Expressions are the glue that ties together all other sys-
tems, so it is even more important to provide a convenient way of editing
and testing them. Because they represent a limited subset of a program-
ming language, experience from providing usable source code editing soft-
ware can be applied here as well: When editing an Expression, syntax
highlighting could help users parse the code using visual cues and auto-
complete features could help with writing it. After the user stops typing, Expres-
sions could be verified for correctness and show error details if necessary;
and if there are no errors, the system could automatically derive sample
input from the Expression's context in order to display a selection of likely
outputs.
Usability can often seem like an open-ended topic and indeed, most of the
above ideas and concepts would require a tremendous effort to be researched,
implemented and evaluated properly. Even though resorting to existing solutions
for the tasks at hand could greatly reduce the workload, designing, developing
and iterating upon what essentially amounts to a full Integrated Develop-
ment Environment is not a small endeavour. Yet, it might be a worthwhile one
when it comes to turning the developed prototype into a usable product.
Figure 9.4: An example of a text editor integration for Scribe, visualized using
Google Docs.
to different use cases and require a different set of skills, integrating both in a
single application would need to be carefully evaluated for viability.
Going deeper into the core of the implemented prototype system, there are
quite a few opportunities for improvement as well. As described earlier, a re-
curring problem in the examined test cases was the fact that Scribe isn't able to
handle the actual content of a Template and, as a result, can only use Templates as
fixed building blocks. There is no way for a Template to react to its surround-
ings, which can lead to several stylistic problems in the generated output text.
Especially problematic is the inadvertent occurrence of word repetitions and
sequences of simple or similar sentences, both of which contribute to an imme-
diate loss in perceived writing quality for the human reader; yet the localized
nature of Scribe's text Templates makes it impossible to counteract the issue
efficiently. Without knowing the final text tree structure, there is no way to de-
tect these stylistic problems, and even knowing it, a Template has no easy
way to react accordingly. Addressing these issues requires a global view of the
generated output, because they occur only after assembling the text. With the ex-
isting system in mind, there are two techniques that could significantly improve
its output quality:
Stylistic Reprocessing: As soon as a text node's sub-tree has been fully gener-
ated, the final text portion of that node is available and can be analysed
with regard to its stylistic quality and rated on a uniform scale between
zero and one. [22] With the presented system, it would be easily possible to
apply a threshold to the stylistic rating and re-generate the sub-tree, should
its value be too low: Since most Templates implement a certain degree
of variation in structure and wording, it is likely that some instances will
be rated better than others. Contrary to Scribe's current implementation,
where the first valid instance will be the final one, this would force the
system to recursively re-write parts of the text until a satisfying style rating
is achieved or a maximum number of attempts is exceeded.2 A minimal
sketch of this loop follows after these two points.
Stylistic Postprocessing: Even though re-generating parts of the text tree has a
good chance to get the best out of the source material, it will still fail when
a stylistic improvement requires the alteration of text that remains static
across different text versions, or a change in structure that wasn't antici-
pated by Template authors. When two subsequent Templates each define a
2 Interestingly enough, this behaviour also seems to mirror the process of iteratively reviewing
and polishing a text as found in human authors.
trivial sentence, varying each sentence individually will not solve the prob-
lem of a bumpy reading flow. Instead, an Aggregation of both sentences
would be required, as described in Dale and Reiter's Natural Language Gen-
eration stages (Section 2.2). Similarly, a Lexical Choice algorithm could
attempt to switch out overused words with an equivalent synonym.
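The re-generation loop from the first point could be sketched roughly as follows, under the assumption that a rating function in the sense of [22] is available; generateSubTree and rateStyle are placeholder hooks, not existing Scribe methods.

using System;

static class StylisticReprocessing
{
    // generateSubTree: produces one instance of a node's sub-tree text.
    // rateStyle: rates a text on a uniform scale between zero and one.
    public static string RegenerateUntilAcceptable(
        Func<string> generateSubTree,
        Func<string, double> rateStyle,
        double threshold = 0.7,
        int maxAttempts = 10)
    {
        string best = null;
        double bestRating = double.NegativeInfinity;

        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            string candidate = generateSubTree();   // Templates vary wording, so each run differs
            double rating = rateStyle(candidate);
            if (rating > bestRating) { best = candidate; bestRating = rating; }
            if (rating >= threshold) break;         // good enough, keep this instance
        }
        return best;   // otherwise keep the best attempt seen so far
    }
}

The postprocessing techniques from the second point, such as Aggregation and Lexical Choice, would then operate on the assembled result itself rather than re-generating it.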
with the processed text content on a deeper level. To expand on this, shifting
the project's focus more towards Creation and Adaption (Fig. 3.4) of text should
certainly be among the goals of future prototypes.
Bibliography
[1] TOGELIUS, JULIAN; SHAKER, NOOR; NELSON, MARK J.: Procedural Content Generation in
Games: A Textbook and an Overview of Current Research
Published by Springer, 2015

[6] PÉREZ Y PÉREZ, RAFAEL; SHARPLES, MIKE: Three Computer-Based Models of Storytelling:
BRUTUS, MINSTREL and MEXICA
2004 in "Knowledge-Based Systems, Volume 17, Issue 1"

[7] BRINGSJORD, SELMER; FERRUCCI, DAVID: Artificial Intelligence and Literary Creativity: In-
side the Mind of BRUTUS, a Storytelling Machine
1999 in "Computational Linguistics, Volume 26, Number 4"

[8] RIEDL, MARK; YOUNG, MICHAEL: Narrative Planning: Balancing Plot and Character
2004 in "Journal of Artificial Intelligence Research 39"

[11] CHATMAN, SEYMOUR: Story and discourse. Narrative structure in fiction and film
Published by Cornell University Press, 1980

[12] KLYNE, GRAHAM; CARROLL, JEREMY; MCBRIDE, BRIAN: Resource Description Framework
(RDF): Concepts and Abstract Data Model
2002-08-29,
http://www.w3.org/TR/2002/WD-rdf-concepts-20020829, Last accessed 2015-04-27
[13] BIRD, STEVEN; KLEIN, EWAN; LOPER, EDWARD: Natural Language Processing
Distributed with the Natural Language Toolkit, 2008

[14] DALE, ROBERT; REITER, EHUD: Building Applied Natural Language Generation Systems
2000 in "Cambridge University Press"

[15] SPUFFORD, FRANCIS: Masters of their universe, edited extract from Backroom Boys: The
Secret Return Of The British Boffin
2004-07-31 in "The Guardian"
[36] REILLY, CLAIRE: Absolutely fabulist: The computer program that writes fables
2014-08-03,
http://www.cnet.com/news/absolutely-fabulist-computer-program-writes-fables/, Last accessed
2015-04-11
Text Sources
Image Sources
Part IV
Appendix
A | Text Generation Example
The goal of this appendix is to demonstrate a full text generation cycle (Fig. 5.2)
with a concrete example. For the sake of simplicity and in order to focus on
providing a basic overview, certain implementation details may be skipped.
Source Data
Scribe's text generation process depends heavily on the available source data.
Excerpts from external files that are relevant to it are displayed as follows:
<Template name="Template:EasySample" root="true">
  <Feature name="Var:Predator">
    <Query mode="Random">
      SELECT ?obj
      WHERE { ?obj System:IsA Knowledge:Animal . }
    </Query>
  </Feature>
  <Feature name="Var:HuntedAnimal">
    <Query mode="Random">
      SELECT ?obj
      WHERE {
        ?obj System:IsA Knowledge:Animal .
        @{Var:Predator} Knowledge:PredatorOf ?obj .
      }
    </Query>
  </Feature>
  <Text>
    {Template:Title}
Listing A.1: Templates\EasySample.xml
Knowledge:Animal
    System:Text Noun:Animal .

Knowledge:Sheep
    System:IsA Knowledge:Animal ;
    System:Text Noun:Sheep .

Knowledge:Wolf
    System:IsA Knowledge:Animal ;
    System:Text Noun:Wolf ;
    Knowledge:PredatorOf Knowledge:Sheep .
Listing A.2: Ontology\EasySample.ttl
Column: EasySample:Nouns, singular, null, primary
Column: EasySample:Nouns, plural/s, null
Listing A.3: Dictionary\EasySample.dct
Column: Grammar:Genders, I/he, he, primary
Column: Grammar:Genders, me/him, him
Column: Grammar:Genders, myself/himself, himself
Column: Grammar:Genders, my/his, his
Column: Grammar:Genders, mine/his, his
Listing A.4: Dictionary\Gender.dct
Generation Process
The following is a simplified step-by-step description of the process of generating
a text from the available source data.
Preparation
1. The system is initialized by generating the root node of a text tree.
Query
Since Template:EasySample defines two Feature variables, their values need to
be determined. Both Var:Predator and Var:HuntedAnimal define a query in
order to do so.
6. All variables of the current node are now defined, no further queries are
required.
Evaluate
The current node's text body has three inline expressions: Template:Title,
Var:Predator and Var:HuntedAnimal.s. Each of them needs to be evaluated
in order to determine a suitable text representation.
Resolve
The current node's expressions have been evaluated, but except for Var:HuntedAnimal.s,
none of them can be inserted into the text directly. The remaining values need
to be resolved.
Create
Each inline expression can potentially spawn a sub-node of the current one, de-
pending on the resolved value.
Iterate
All previously described steps will now be performed recursively on the current
node's child nodes. In this case, there is only one child node that was spawned
by Template:Title. Since it only contains a static text, its generation process is
considered trivial and will not be described in detail.
Results
Due to the limited Template and Ontology databases, there is only a single pos-
sible result for text tree generation, which consists of a root node with a single
child node and some expressions that have been resolved to string representa-
tions as described above. Transforming the tree into a simple text string results
in the following:
A Simple Example
Listing A.5: The resulting sample text.