Visualising Software Architecture

This document introduces the C4 model for visualizing software architecture. It discusses drawing diagrams at four levels of detail (context, container, component, and class) to depict architectural concepts and their relationships. The C4 model aims to provide a shared visual language to facilitate communication about software design.

The Art of Visualising Software Architecture

Communicating software architecture with sketches, diagrams and the C4 model

Simon Brown
This book is for sale at http://leanpub.com/visualising-software-architecture

This version was published on 2016-03-06

This is a Leanpub book. Leanpub empowers authors and publishers with the Lean
Publishing process. Lean Publishing is the act of publishing an in-progress ebook using
lightweight tools and many iterations to get reader feedback, pivot until you have the right
book and build traction once you do.

© 2015 - 2016 Simon Brown


Contents

1. Introduction
1.1 Past, present and future
1.2 About this book

2. Draw one or more diagrams
2.1 Where do we start?
2.2 Some examples
2.3 Common problems
2.4 The hidden assumptions of diagrams

3. Creating a shared vocabulary
3.1 Common abstractions over a common notation
3.2 Static structure
3.3 Software systems
3.4 Containers
3.5 Components
3.6 Components vs classes
3.7 Non-OO components
3.8 Modules and subsystems?
3.9 Microservices?
3.10 Platforms, frameworks and libraries?
3.11 Create your own shared vocabulary
3.12 From abstractions to diagrams

4. Level 1: Context diagram
4.1 Intent
4.2 Structure
4.3 People
4.4 Software systems
4.5 Interactions
4.6 Motivation
4.7 Audience
4.8 Required or optional?

5. Level 2: Container diagram
5.1 Intent
5.2 Structure
5.3 Containers
5.4 Interactions
5.5 People and software systems
5.6 Software system boundary
5.7 Motivation
5.8 Audience
5.9 Required or optional?

6. Level 3: Component diagram
6.1 Intent
6.2 Structure
6.3 Components
6.4 Interactions
6.5 People, software systems and containers
6.6 Container boundary
6.7 Motivation
6.8 Audience
6.9 Required or optional?

7. Level 4: Class diagram
7.1 Intent
7.2 Structure
7.3 Motivation
7.4 Audience
7.5 Required or optional?

8. Notation
8.1 Titles
8.2 Keys and legends
8.3 Elements
8.4 Lines
8.5 Layout
8.6 Orientation
8.7 Labels and acronyms
8.8 Quality attributes
8.9 Diagram scope

9. Aligning the diagrams and the code
9.1 The model-code gap
9.2 An architecturally-evident coding style
9.3 Diagrams should reflect reality

10. Sketches, diagrams, models and tooling
10.1 Sketches
10.2 Diagrams
10.3 Models

11. Other diagrams
11.1 Enterprise context
11.2 User interface mockups and wireframes
11.3 Domain model
11.4 Runtime and behaviour
11.5 Business process and workflow
11.6 Infrastructure
11.7 Deployment
11.8 And more
11.9 Architectural view models

12. Appendix A: Financial Risk System
12.1 Background
12.2 Functional Requirements
12.3 Non-functional Requirements

1. Introduction
We’ve reached an interesting point in the software development industry. Globally
distributed teams are building Internet-scale software systems in all manner of programming
languages, with architectures ranging from monolithic systems through to those composed
of dozens of microservices. Agile and lean approaches are now no longer seen as niche ways
to build software, and even the most traditional of organisations are seeking fast feedback
with minimum viable products to prove their ideas. Techniques such as automated testing
and continuous delivery coupled with the power of cloud computing make this a reality for
organisations of any size too. But there’s something still missing.
Ask somebody in the building industry to visually communicate the architecture of a building
and you’ll likely be presented with site plans, floor plans, elevation views, cross-section views
and detail drawings. In contrast, ask a software developer to communicate the software
architecture of a software system using diagrams and you’ll likely get a confused mess of
boxes and lines.
I’ve asked thousands of software developers to do just this over the past decade and continue
to do so today. The results still surprise me, with the thousands of photos taken during these
software architecture sketching workshops suggesting that effective visual communication
of software architecture is a skill that’s sorely lacking in the software development industry.

A selection of typical “boxes and lines” diagrams

1.1 Past, present and future

If you cast your mind back in time, the structured processes of times past provided a reference
point for both the software design process and how to communicate the resulting designs.
Some well-known examples include the Rational Unified Process (RUP) and the Structured
Systems Analysis And Design Method (SSADM). Although the software development
industry has moved on in many ways, we seem to have forgotten some of the good things
that these prior approaches gave us, especially with respect to communication.
As an industry, we do have the Unified Modeling Language (UML), which is a standardised
notation for communicating the design of software systems. However, while you can argue
whether UML offers an effective way to communicate software designs or not, that’s often
irrelevant because many teams have already thrown out UML or simply don’t know it. Such
teams typically favour informal “boxes and lines” diagrams instead, which usually don’t
make much sense unless they are accompanied by a detailed narrative.
Despite all of the progress in recent years, many software teams lack the ability to
communicate software architecture before, during and after the construction of that software.
The result is often that these same teams seem to lack technical leadership, direction and
consistency. If you want to ensure that everybody is contributing to the same end-goal, you
need to be able to effectively communicate the vision of what it is you’re building. And if
you want agility and the ability to move fast, you need to be able to communicate that vision
efficiently too.

1.2 About this book

This book focusses on the visual communication of software architecture. You’ll notice that
the title of this book includes the word “art”. I’ve seen a number of debates over the years
about whether software development is a craft or an engineering discipline. Although I think
it should be an engineering discipline, I believe we’re a number of years away from this
being a reality. So while this book doesn’t present a formalised, standardised method to
communicate software architecture, it does provide a collection of ideas and techniques that
thousands of people across the world have found useful. The core of this is my C4 software
architecture model, which you’ll find covered in more depth here than in my Software
Architecture for Developers¹ book. You’ll also find additional discussion about notation, the
various uses for diagrams, the value of creating a model and tooling.
¹http://leanpub.com/software-architecture-for-developers
2. Draw one or more diagrams
As I mentioned in the introduction, I’ve asked thousands of software developers to draw
software architecture diagrams during workshops I’ve run. Sometimes this is done as part of
a software architecture kata, where groups of people are tasked with designing a software
system. Other times it’s done as part of a diagramming workshop where I ask software
developers to draw some pictures to describe the software architecture of a system they are
currently working on. Either way, the result is the same - an ad hoc collection of “boxes and
lines” diagrams.
The task is literally phrased as “draw one or more diagrams to describe your software system”.
As you can probably imagine, the resulting diagrams are all very different. Some diagrams
show a very high level of abstraction, others present low-level design details. Some diagrams
show static structure, others show runtime and behavioural aspects. Some diagrams show
technology choices, others don’t.

2.1 Where do we start?

When you think about it, this result is unsurprising. Asking people what they found
challenging about the exercise reveals that perhaps this isn’t a skill they are proactively being
taught. I regularly hear the following questions and comments during the workshops:

• “What types of diagram should we draw?”
• “What notation should we use?”
• “What level of detail should we present?”
• “Who is the audience for these diagrams?”

I run this as a group-based exercise, typically with between two and five people per group.
Rather than making the exercise easier, having a group of people with different backgrounds
and experience tends to complicate matters. Why? Simply because, unlike the building
industry, the software development industry lacks a standard, consistent way to think about,
describe and visually communicate software architecture. I believe there are a number of
reasons that contribute to this:

1. In their haste to adopt agile approaches in recent years, many software teams have
thrown out the baby with the bath water - modeling and documentation have been
thrown out alongside traditional plan-driven processes and methodologies. That may
sound a little extreme, but many of the software teams I work with only have a very
limited amount of documentation for their software systems.
2. Teams that still do see the value in documents and diagrams have typically
abandoned the Unified Modeling Language (UML) in favour of an approach that is more
lightweight and pragmatic. We’ll discuss UML later in the book, but my anecdotal
evidence, based upon meeting and speaking to thousands of software developers,
suggests that, optimistically, only one in ten developers uses UML.
3. There are very few people out there who teach software teams how to effectively
model, visualise and communicate software architecture. Based upon running
workshops for computer science undergraduates, this includes universities too.

2.2 Some examples

Let’s look at some examples. The small selection of photos that follow are taken from my
workshops, where groups have been tasked to design a simple financial risk system and
draw one or more diagrams to communicate the software architecture of it. The purpose of
the financial risk system is to import data from two data sources (a “Trade Data System” and a
“Reference Data System”), merge the datasets, perform some risk calculations and produce a
Microsoft Excel compatible report for a number of business users. A subset of those business
users can additionally modify some of the parameters that are used during the calculations.
You can see the full set of requirements for the financial risk system in Appendix A.

The shopping list

Regardless of whether this is the only software architecture diagram or one of a collection
of software architecture diagrams, this diagram doesn’t tell you much about the solution.
Essentially it’s just a shopping list of technologies.

There’s a Unix box and a Windows box, with some additional product selections that include
JBoss (a Java EE application server) and Microsoft SQL Server. The problem is, I don’t know
what those products are doing and there seems to be a connection missing between the Unix
box and the Windows box. It’s essentially a bulleted list that’s been presented as a diagram.

Boxes and no lines

When people talk about software architecture, they often refer to “boxes and lines”. This next
diagram has boxes, but no lines.

This is a three-tier solution (I think) that uses the Microsoft technology stack. There’s an
ASP.NET web thing at the top, which I assume is being used for some sort of user interaction,
although that’s not shown on the diagram. The bottom section is labelled “SQL Server” and
there are lots of separate cylinders. To be honest though, I’m left wondering whether these
are separate database servers, schemas or tables.
Finally, in the middle, is a collection of boxes, which I assume are things like components,
services, modules, etc. From one perspective, it’s great to see how the middle-tier of the
overall solution has been decomposed into smaller chunks and these are certainly the types
of components/services/modules that I would expect to see for such a solution. But again,
there are no responsibilities and no interactions. Software architecture is about structure,
which is about things (boxes) and how they interact (lines). This diagram has one, but not
the other. It’s telling a story, but not the whole story.

The “functional view”

This is similar to the previous diagram and is very common, particularly in large
organisations for some reason.

Essentially the group that produced this diagram has simply documented their functional
decomposition of the solution into things, which I again assume are components, services,
modules, etc but I could be wrong. Imagine a building architect drawing you a diagram of
your new house that simply had a collection of boxes labelled “Cooking”, “Eating”, “Sleeping”,
“Relaxing”, etc.
This diagram suffers from the same problem as the previous diagram (no responsibilities and
no interactions) plus we additionally have a colour coding to decipher. Can you work out
what the colour coding means? Is it related to input vs output functions? Or perhaps it’s
business vs infrastructure? Existing vs new? Buy vs build? Or maybe different people simply
had different colour pens! Who knows. I often get asked why the central “Risk Assessment
Processor” box has a noticeably thicker border than the other boxes. I honestly don’t know,
but I suspect it’s simply because the marker pen was held at a different angle.

The airline route map

This is one of my all-time favourites. It was also the one and only diagram that this particular
group used to present their solution.

The central spine of this diagram is great because it shows how data comes in from the source
data systems (TDS and RDS) and then flows through a series of steps to import the data,
perform some calculations, generate reports and finally distribute them. It’s a super-simple
activity diagram that provides a nice high-level overview of what the system is doing. But
then it all goes wrong.
I think the green circle on the right of the diagram is important because everything is pointing
to it, but I’m not sure why. And there’s also a clock, which I assume means that something
is scheduled to happen at a specific time.
The left of the diagram is equally confusing, with various lines of differing colours and styles
zipping across one another. If you look carefully you’ll see the letters “UI” (User Interface)
upside-down. The reason? People were simply writing from wherever they sat around the
table.

Generically true

This is another very common style of diagram. Next time somebody asks you to produce a
software architecture diagram of a system, present them this photo and you’re done!

It’s a very “Software Architecture 101” style of diagram where most of the content is generic.
Ignoring the source data systems at the top of the diagram (TDS and RDS), we have boxes
generically labelled transport, archive, audit, report generation, error handling and arrows
labelled error and action. And look at the box in the centre - it’s labelled “business logic”,
which is not hugely descriptive!

The “logical view”

This diagram is also relatively common. It shows the overall shape of the software
architecture (including responsibilities, which I really like) but the technology choices are
left to your imagination.

There’s a common misconception that “software architecture” diagrams should be “logical”
in nature rather than include technology choices, especially before any code is written. After
all, I’m often told that the financial risk system “is a simple solution that can be built with
any technology”, so it doesn’t really matter anyway. I disagree that this is the case and the
issue of including or omitting technology choices is covered in more detail elsewhere in the book.

Missing technology details

This next diagram tells us that the solution is an n-tier Java EE system but, like the previous
diagram, it omits some important technology details.

The lines between the web server and the application server have no information about how
this communication occurs. Is it SOAP? RESTful services? XML over HTTP? Remote method
invocation? Asynchronous messaging? It’s not clear.
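
One way to make this point concrete is to treat every line on a diagram as a relationship that carries both an intent and a technology. The following is a minimal Python sketch of that idea rather than anything prescribed by the book; the `Relationship` class, its field names, the helper functions and the "Database" and "JDBC" entries are all hypothetical, chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Relationship:
    """A line on a diagram: who talks to whom, why, and how."""
    source: str
    destination: str
    description: str                  # the intent of the interaction
    technology: Optional[str] = None  # the protocol, e.g. "SOAP" or "JDBC"


def describe(rel: Relationship) -> str:
    """Render a line label, making a missing technology choice visible."""
    how = rel.technology if rel.technology else "technology not stated"
    return f"{rel.source} -> {rel.destination}: {rel.description} [{how}]"


def missing_technology(rels: List[Relationship]) -> List[Relationship]:
    """Return the lines that leave the reader guessing about the protocol."""
    return [r for r in rels if not r.technology]


# Hypothetical lines; only the first pair of boxes comes from the diagram
# discussed above, the rest is assumed for the sake of the example.
lines = [
    Relationship("Web Server", "Application Server", "Requests risk reports from"),
    Relationship("Application Server", "Database", "Reads from and writes to", "JDBC"),
]
```

Calling `missing_technology(lines)` returns only the web server to application server line, which is exactly the one the diagram leaves unexplained.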

Deployment vs execution context

This next one is a Java solution consisting of a web application and a number of server-side
components. Although it provides a simple high-level overview of the solution, it’s missing
some information and you need to make some educated guesses to fill in the blanks.

If you look at the Unix box in the centre of the diagram, you’ll see two smaller boxes labelled
“Risk Analysis System” and “Data Import Service”. If you look closely, you’ll see that both
boxes are annotated “JAR”, which is the deployment mechanism for Java code (Java ARchive).
Basically this is a ZIP file containing compiled Java bytecode. The equivalent in the .NET
world is a DLL.
And herein lies the ambiguity. What happens if you put a JAR file on a Unix box? Well,
the answer is not very much other than it takes up some disk space. And cron (the Unix
scheduler) doesn’t execute JAR files unless they are really standalone console applications,
the sort that have a “public static void main” method as a program entry point. By deduction
then, I think both of those JAR files are actually standalone applications and that’s what I’d
like to see on the diagram. Rather than the deployment mechanism, I want to understand the
execution context.

Homeless Old C# Object (HOCO)

If you’ve heard of “Plain Old C# Objects” (POCOs) or “Plain Old Java Objects” (POJOs), this
is the homeless edition. This diagram mixes up a number of different levels of detail.

In the bottom left of the diagram is a SQL Server database, and at the top left of the diagram is
a box labelled “Application”. Notice how that same box is also annotated (in green) “Console-
C#”. Basically, this system seems to be made up of a C# console application and a database.
But what about the other boxes?
Well, most of them seem to be C# components, services, modules or objects and they’re
much like what we’ve seen on some of the other diagrams. There’s also a “data access” box
and a “logger” box, which could be frameworks or architectural layers. Do all of these boxes
represent the same level of granularity as the console application and the database? Or are
they actually part of the application? I suspect the latter, but the lack of boundaries makes
this diagram confusing. I’d like to draw a big box around most of the boxes to say “all of
these things live inside the console application”. I want to give those boxes a home. Again, I
do want to understand how the system has been decomposed into smaller components, but
I also want to know about the execution context too.

Choose your own adventure

This is the middle part of a more complex diagram.



It’s a little like those “choose your own adventure” books that I used to read as a kid.
You would start reading at page 1 and eventually arrive at a fork in the story where you
decide what should happen next. If you want to attack the big scary creature you’ve just
encountered, you turn to page 47. If you want to run away like a coward, it’s page 205 for
you. You keep making similar choices and eventually, and annoyingly, your character ends
up dying and you have to start over again.
This diagram is the same. You start at the top and weave your way downwards through what
is a complex asynchronous and event-driven style of architecture. You often get to make a
choice - should you follow the “fail event” or the “complete event”? As with the books, all
paths eventually lead to the (SNMP) trap on the left of the diagram.
The diagram is complex, it’s trying to show everything and the single colour being used
doesn’t help. Removing some information and/or using colour coding to highlight the
different paths through the architecture would help tremendously.

Stormtroopers

To pick up on something you may have noticed from previous sketches, I regularly see
diagrams that include unlabelled users/actors. Essentially they are faceless clones. I don’t
know who they are or why they are using the software.

Should have used a whiteboard!

The final diagram is a great example of why whiteboards are such useful bits of kit!

2.3 Common problems

All joking aside, these diagrams do suffer from one or more of the following problems:

• Colour coding is not explained or is inconsistent.
• The purpose of diagram elements (i.e. different styles of boxes and lines) is not
explained.
• Key relationships between diagram elements are missing or ambiguous.
• Generic terms such as “business logic” are used.
• Technology choices (or options) are omitted.
• Levels of abstraction are mixed.
• Diagrams try to show too much detail.
• Diagrams lack context or a logical starting point.

In addition, the problems associated with a single diagram are often exacerbated when a
collection of diagrams is created, because:

• The notation (colour coding, line styles, etc) is not consistent between diagrams.
• The naming of elements is not consistent between diagrams.
• The logical order in which to read the diagrams isn’t clear.
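
Some of these problems can be checked mechanically once the element names on each diagram are written down. As a rough illustration of the naming problem only, here is a small Python sketch; the `inconsistent_names` function and the diagram titles are hypothetical, and the check is deliberately naive: it only catches names that differ by case, spacing or punctuation, not abbreviations such as "TDS".

```python
from typing import Dict, List, Set


def inconsistent_names(diagrams: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Group element names that differ only by case, spacing or punctuation.

    `diagrams` maps a diagram title to the element names drawn on it.
    Returns a normalised name mapped to the differing spellings found.
    """
    spellings: Dict[str, Set[str]] = {}
    for names in diagrams.values():
        for name in names:
            # Normalise by lower-casing and dropping non-alphanumerics.
            key = "".join(ch for ch in name.lower() if ch.isalnum())
            spellings.setdefault(key, set()).add(name)
    return {k: sorted(v) for k, v in spellings.items() if len(v) > 1}


# Hypothetical element names taken from two diagrams of the same system.
diagrams = {
    "Context": {"Trade Data System", "Reference Data System"},
    "Containers": {"trade-data system", "Reference Data System"},
}
```

Here `inconsistent_names(diagrams)` reports the two spellings of the trade data system and nothing else, because "Reference Data System" is written the same way on both diagrams.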

The example diagrams typify what I see during my workshops and these types of problems are
incredibly common. A quick Google image search¹ will uncover a plethora of similar block
diagrams that suffer from many of the same problems we’ve seen already. I’m sure you will
have seen diagrams like this within your own organisations too.
¹https://www.google.com/search?q=software+architecture+diagrams&tbm=isch

2.4 The hidden assumptions of diagrams

One of the easiest ways to understand whether a diagram makes sense is to give it to
somebody else and ask them to interpret it without providing a narrative. I’m a firm believer
that diagrams should stand alone. Any narrative should complement the diagram rather than
explain it. However, I often hear groups in my workshops say the following:

• “We’ll talk through the diagrams.”
• “This doesn’t make sense, but we’ll explain it during the presentation.”

The assumption that a diagram will be accompanied by a narrative creates a gap between the
information captured on the paper and what remains in people’s heads. Diagrams that need
explaining have limited value, especially when used for the purpose of creating long-lived
documentation.
The problems highlighted broadly fall into the category of content or notation. The rest of
this book will tackle both, illustrating how to create simple software architecture diagrams.
3. Creating a shared vocabulary
The diagrams we’ve seen so far have been an ad hoc collection of “boxes and lines” and,
although notation is important, one of the fundamental problems I believe we have in the
software development industry is that we lack a common, shared vocabulary with which
to think about and describe the software systems we build. Next time you’re sitting in a
conversation about software design, listen out for how people use terms like “component”,
“module”, “sub-system”, etc.

3.1 Common abstractions over a common notation

My goal is to see teams able to discuss their software systems with a common set of
abstractions rather than struggling to understand what the various notational elements are
trying to show. For me, a common set of abstractions is more important than a common
notation.
Most maps are a great example of this principle in action. If you get two different maps of
your local area and lay them out side by side, they would both show the major roads, rivers,
lakes, forests, towns, districts, schools, churches and so on. Visually though, these maps will
probably use different notation in terms of colour-coding, line styles, iconography, etc. In
other words, the maps are showing the same things (the same abstractions), but the notation
varies. The key to understanding them is exactly that: a key or legend tucked away in a corner
somewhere. We can do the same with our software architecture diagrams.
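
The map analogy can be sketched in a few lines of Python: one set of elements, two notations, and a legend that explains each notation. This is an illustration under stated assumptions rather than part of the C4 model; the `Element` class, the two notation tables and the bracket styles are all invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Element:
    name: str
    kind: str  # the abstraction, e.g. "Person" or "Software System"


# Two hypothetical notations for the same abstractions.
NOTATION_A = {"Person": "({})", "Software System": "[{}]"}
NOTATION_B = {"Person": "<{}>", "Software System": "#{}#"}


def render(elements: List[Element], notation: Dict[str, str]) -> List[str]:
    """Draw every element using the symbol the notation assigns to its kind."""
    return [notation[e.kind].format(e.name) for e in elements]


def legend(notation: Dict[str, str]) -> List[str]:
    """Build the key that tells a reader what each symbol means."""
    return [f"{symbol.format('...')} = {kind}"
            for kind, symbol in sorted(notation.items())]


elements = [
    Element("Business User", "Person"),
    Element("Financial Risk System", "Software System"),
]
```

`render(elements, NOTATION_A)` and `render(elements, NOTATION_B)` look different on the page, yet both describe the same two abstractions, and `legend(...)` is what lets a reader move between them.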

Diagrams are the maps that help software developers navigate a complex
codebase.

3.2 Static structure

In order to get to this point though, we need to agree upon some vocabulary. And this is
the step that is usually missed during my software architecture sketching workshops. Teams
charge headlong into the exercise without having a shared understanding of the terms they
are using.

“This is a component of our system”, says one developer, pointing to a box on a
diagram labelled “Web Application”.

I’ve witnessed groups of people having design discussions using terms like “component”
where they are clearly not talking about the same thing. Yet everybody in the group
is oblivious to this. For me, the answer is simple. Each group needs to agree upon the
vocabulary, terminology and abstractions they are going to use. The notation can then evolve.
So, notation aside (we’ll cover that later in the book), my approach to tackling this problem
is to introduce a shared vocabulary that we can use to describe our software. The primary
aspect I’m interested in is the static structure. And I’m interested in the static structure from
different levels of abstraction. Once this static structure is understood and in use, it’s easy
to supplement it with other information to illustrate runtime/behavioural characteristics,
infrastructure, deployment models, etc.

A simple model of architectural constructs used to define the static structure of a software system

Assuming that you’re using an object-oriented programming language (e.g. Java, C#, C++,
etc), I like to think of my software system as being a hierarchy of simple building blocks as
follows:

A software system is made up of one or more containers (web applications,
mobile apps, standalone applications, databases, file systems, etc), each of which
contains one or more components, which in turn are implemented by one or
more classes.
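
The hierarchy above can be sketched as a small data model. This is only an illustration under stated assumptions: the class names (`SoftwareSystem`, `Container`, `Component`) mirror the vocabulary, while the “Online Shop” example and everything inside it are hypothetical and not taken from any real system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    name: str
    classes: List[str] = field(default_factory=list)  # implementation classes


@dataclass
class Container:
    name: str
    components: List[Component] = field(default_factory=list)


@dataclass
class SoftwareSystem:
    name: str
    containers: List[Container] = field(default_factory=list)

    def outline(self) -> List[str]:
        """Return an indented outline, one line per building block."""
        lines = [self.name]
        for container in self.containers:
            lines.append("  " + container.name)
            for component in container.components:
                lines.append("    " + component.name)
                for cls in component.classes:
                    lines.append("      " + cls)
        return lines


# Hypothetical example system, used purely for illustration.
shop = SoftwareSystem("Online Shop", [
    Container("Web Application", [
        Component("Order Service", ["OrderService", "OrderServiceImpl"]),
    ]),
    Container("Batch Process", [
        Component("Report Generator", ["ReportGenerator"]),
    ]),
])

print("\n".join(shop.outline()))
```

Each level of indentation in the printed outline corresponds to one step down the hierarchy: software system, container, component, class.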

3.3 Software systems

A software system is the highest level of abstraction and represents something that delivers
value to its users, whether they are human or not.

3.4 Containers

Put simply, a container represents something that hosts code or data. A container is
something that needs to be running in order for the overall software system to work. In
real terms, a container is something like:

• Server-side web application: A Java EE web application running on Apache Tomcat,
an ASP.NET MVC application running on Microsoft IIS, a Ruby on Rails application
running on WEBrick, a Node.js application, etc.
• Client-side web application: A JavaScript application running in a web browser using
AngularJS, Backbone.JS, jQuery, etc.
• Client-side desktop application: A Windows desktop application written using WPF,
an OS X desktop application written using Objective-C, a cross-platform desktop
application written using JavaFX, etc.
• Mobile app: An Apple iOS app, an Android app, a Microsoft Windows Phone app, etc.
• Server-side console application: A standalone (e.g. “public static void main”) appli-
cation, a batch process, etc.
• Microservice: A single microservice, hosted in anything from a traditional web server
to something like Spring Boot, Dropwizard, etc.
• Database: A schema or database in a relational database management system, doc-
ument store, graph database, etc such as MySQL, Microsoft SQL Server, Oracle
Database, MongoDB, Riak, Cassandra, Neo4j, etc.
• Blob or content store: A blob store (e.g. Amazon S3, Microsoft Azure Blob Storage,
etc) or content delivery network (e.g. Akamai, Amazon CloudFront, etc).
• File system: A full local file system or a portion of a larger networked file system (e.g.
SAN, NAS, etc).
• Shell script: A single shell script written in Bash, etc.
• etc

A container is essentially a context or boundary inside which some code is executed or some
data is stored. The name “container” was chosen because I wanted a name that didn’t imply
anything about the physical nature of how that container is executed. For example, some
web servers run multiple threads inside a single process, whereas others run single threads
across multiple processes. When I’m thinking about the static structure of a software system,
I don’t want to concern myself with the details of whether a web application is using one
operating system process or many when it’s servicing users. It’s an important detail, but we
can get into that later.

Containers are separately deployable

It’s also worth noting that each container should be a separately deployable thing. Again,
the physical deployment is another important detail that we will look at later, but, in theory
anyway, every container can be deployed onto a separate piece of hardware; whether that
hardware is physical, virtual or containerised. The implication here is that communication
between containers is likely to require an out-of-process or remote procedure call.
To give an example, let’s imagine you’re building a website that is comprised of two different
web applications (e.g. a desktop version and a mobile version, or an end-user version serving
HTML and an API endpoint serving JSON). There are a number of scenarios to consider:

1. Each web application is packaged up into separately deployable units (e.g. two Java
WAR files, two ASP.NET web applications, etc). This is two containers, regardless of
whether both deployable units are actually deployed into the same physical web server
(this is simply a deployment optimisation).
2. Although you think about the two web applications as being logically separate, they
are actually inseparable because they are packaged as a single deployment unit (e.g. a
single Java WAR file or ASP.NET web application). This is a single container.

As a final note, put simply, a container refers to an execution context and it’s really a runtime
construct. This means that libraries or modules (e.g. a JAR file, a DLL, a .NET assembly)
should not be considered to be containers, unless they are runnable on their own, of course.
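
One way to make the two scenarios above concrete is to count distinct deployable units rather than logical applications. The sketch below assumes a simple mapping from each logical web application to the unit it is packaged in; the function name `containers_for` and the WAR file names are hypothetical.

```python
from typing import Dict, Set


def containers_for(packaging: Dict[str, str]) -> Set[str]:
    """Given logical application -> deployable unit, return the containers.

    Two logical applications packaged into the same unit collapse into a
    single container, because they cannot be deployed separately.
    """
    return set(packaging.values())


# Scenario 1: two separately deployable units (hypothetical file names).
scenario_1 = {"End-user website": "website.war", "JSON API": "api.war"}

# Scenario 2: logically separate, but packaged as one deployment unit.
scenario_2 = {"End-user website": "site.war", "JSON API": "site.war"}

print(len(containers_for(scenario_1)))  # two containers
print(len(containers_for(scenario_2)))  # one container
```

Where the units end up being deployed (the same web server or different ones) is deliberately not part of this model.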

3.5 Components
The word “component” is a hugely overloaded term in the software development industry,
but I like to think of a component as simply being a grouping of related functionality
encapsulated behind a well-defined interface. How those components are
packaged (e.g. one component vs many components per JAR file, DLL, shared library, etc)
is an orthogonal concern and, from my perspective, doesn’t affect how we think about
components.

3.6 Components vs classes

If you’re using an object-oriented programming language, your components will be
implemented using one or more classes. Let’s look at a quick example to better define what a
component is in the context of some code.
The Spring PetClinic¹ application is a sample codebase used to illustrate how to use the Spring
framework to build web applications in Java. It’s a typical layered architecture consisting of
a number of web MVC controllers, a service containing business logic and some repositories
for data access, along with some domain and util classes too. If you download a copy of the
GitHub repository, open it in your IDE of choice and visualise it by drawing a UML class
diagram of the code, you’ll get something like this.

As you would expect, this diagram is showing you all of the Java classes and interfaces that
make up the Spring PetClinic web application, plus all of the relationships between them.
The properties and methods are hidden on the diagram because they add too much noise to
the picture. This isn’t a complex codebase by any stretch of the imagination but, by showing
classes and interfaces, the diagram is showing too much detail.
¹https://github.com/spring-projects/spring-petclinic

Let’s remove those classes that aren’t relevant to having an “architecture” discussion about
the system. In other words, let’s only try to show those classes that have some static structural
significance. In concrete terms, this means excluding the model (domain) and util classes.

After a little rearranging, this diagram is much better and we now have a simpler diagram
with which to reason about the software architecture. We can also see the architectural layers
(controllers, services and repositories). In order to show the true picture of the dependencies
on this diagram, we need to show the interface and implementation classes for the service
and repositories. To simplify the diagram, we could treat the ClinicService and each of the
*Repository things as a “component”, by collapsing the interface and implementation classes
together on the diagram.

As this example illustrates, we can think of a component as simply being a collection of
implementation classes behind an interface. Although there’s a simple mapping from one
interface and one implementation class to a component in this example, components are
typically made up of a larger number of classes in real-world software systems.
Although this example illustrates a traditional layered architecture, the same principles are
applicable regardless of how you package your code (e.g. by layer, feature or component) or
the architectural style in use (e.g. layered, hexagonal, ports and adapters, etc). My aim in all
of this is to minimise, and in fact remove, the gap between how software developers think
about components from a logical and physical perspective. Components should be real things,
evident in the code, rather than logical constructs that are used in architecture discussions
only.
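
The “collapsing” step described in this section can be sketched in a few lines. The sketch assumes you already have a list of implementation classes and the interface each one implements; the class names are illustrative, modelled on the PetClinic naming style rather than copied from its source, and `collapse` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (implementation class, interface it implements); illustrative names only.
IMPLEMENTATIONS: List[Tuple[str, str]] = [
    ("ClinicServiceImpl", "ClinicService"),
    ("JdbcOwnerRepositoryImpl", "OwnerRepository"),
    ("JdbcPetRepositoryImpl", "PetRepository"),
]


def collapse(implementations: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group implementation classes behind the interface they implement.

    Each interface becomes one component; its value is the list of classes
    that would be hidden inside that component's box on the diagram.
    """
    components: Dict[str, List[str]] = defaultdict(list)
    for implementation, interface in implementations:
        components[interface].append(implementation)
    return {name: sorted(classes) for name, classes in components.items()}


for component, classes in collapse(IMPLEMENTATIONS).items():
    print(component, "<-", ", ".join(classes))
```

In a real-world codebase each component would typically gather more than one implementation class, but the grouping logic stays the same.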

3.7 Non-OO components


As I said before, a component is a logical grouping of functionality. But what if you’re
not using an object-oriented programming language and you don’t have classes? Well, a
component is simply a way to step up one level of abstraction from the low-level building
blocks that you have in the technology you’re using. It’s a way to step slightly away from
the code-level constructs, and you can use this same approach to define what a component
means to you. For example:

• Object-oriented programming languages (e.g. Java, C#, C++, etc): A component is
made up of classes and interfaces.
• Procedural programming languages (e.g. C): A component could be made up of a
number of C files in a particular directory.
• JavaScript: A component could be a JavaScript module, which is made up of a number
of objects and functions.
• Functional programming languages: A component could be a module (a concept
supported by languages such as F#, Haskell, etc), which is a logical grouping of related
functions, types, etc.
• Relational database: A component could be a logical grouping of functionality; based
upon a number of tables, views, stored procedures, functions, triggers, etc.
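
As an illustration of the procedural case in the list above, the following sketch treats each directory of C files as one component. The directory-per-component convention is an assumption made for this example, and the file paths and the `components_by_directory` helper are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, List


def components_by_directory(paths: List[str]) -> Dict[str, List[str]]:
    """Group source files into components named after their directory."""
    components: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        source_file = PurePosixPath(path)
        components[source_file.parent.name].append(source_file.name)
    return dict(components)


# Hypothetical source tree of a C codebase.
files = [
    "src/billing/invoice.c",
    "src/billing/tax.c",
    "src/audit/log.c",
]

print(components_by_directory(files))
```

The same idea carries over to the other entries in the list: pick the grouping construct your technology offers (module, directory, schema) and step up one level from it.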

3.8 Modules and subsystems?

If you’re familiar with the definition of software architecture from books such as Software
Architecture in Practice², you will have noticed that I don’t use the term “module” as a part
of the static structure definition. A module typically refers to an implementation unit (e.g.
a library or some other collection of programming elements) that may be combined with
other modules into a component, which itself is instantiated to create component instances
at runtime. While this model makes sense, I find it adds an additional level of detail that is
usually unnecessary when thinking about a software system from a “big picture” perspective.
For this reason, I’ve deliberately avoided using the term module and instead focus on the
identification of coarser-grained components within the static structure.
I’ve also avoided using the term “subsystem”, which some people use to refer to a collection
of related components or a functional slice of a software system. The problem I have with the
term “subsystem” is that it’s often difficult to map this concept onto a real-world codebase.
If the concept of components and modules, or systems and subsystems, is useful, then feel
free to build that into the shared vocabulary that you create.
²http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=30264

3.9 Microservices?

Given the degree of hype and discussion around microservices at the moment, it’s worth
being explicit about how to describe microservices using the vocabulary we’ve defined so
far. If a typical modular monolithic application is a container with a number of components
running inside it, a microservice is simply a container with a much smaller number of
components running inside it. The actual number of components will depend upon the
implementation strategy. It could range from the very simple (i.e. one, where a microservice
is a container with a single component running inside) through to something like a mini-
layered or hexagonal architecture.

3.10 Platforms, frameworks and libraries?

You might also be wondering where platforms, frameworks and libraries fit into all of this.
After all, platforms and frameworks are usually something that you build your software
on top of, while libraries are things that your software uses. In most cases, these are
really just technology choices that components make use of and are simply implementation
details rather than components in their own right. Having said that, there are times
when your software will use components provided by platforms, frameworks and libraries.
Understanding how you use platforms, frameworks and libraries is the key to understanding
how they fit into a static model of your software.

3.11 Create your own shared vocabulary

I’ve illustrated the vocabulary that I use here and it works for the majority of organisations
I work with. But, of course, there are no universal rules. Sometimes, rather than introducing
something new, it’s easier for an organisation to stick with the vocabulary they are already
using, ensuring that it is explicitly defined and understood by everybody. As an example,
one organisation I worked with builds software in C. Instead of “container” and “component”,
they use the terms “component” and “module” respectively. In this case, a “component” refers
to an executable built in C, which in turn is made up of a number of modules. Although the
terminology is different, we still have a hierarchical structure that can be used to describe a
software system at a number of different levels of abstraction.

3.12 From abstractions to diagrams

With a shared vocabulary in mind, we can now move on to draw some diagrams at varying
levels of abstraction. I call this my “C4 model”; Context, Containers, Components and Classes
(or Code, if you want to take a more generic view).
As a quick note, what follows is not a description of a design process, it’s simply a collection
of diagrams that you can use to describe the static structure of a software system.
4. Level 1: Context diagram
A context diagram can be a useful starting point for diagramming and documenting a
software system, allowing you to step back and look at the big picture.

4.1 Intent

A context diagram helps you to answer the following questions.

1. What is the software system that we are building (or have built)?
2. Who is using it?
3. How does it fit in with the existing environment?

4.2 Structure

Draw a simple block diagram showing your software system as a box in the centre,
surrounded by its users and the other software systems that it interacts with. Detail isn’t
important here as this is your zoomed-out view showing a big picture of the system
landscape. The focus should be on people (actors, roles, personas, etc) and software systems
rather than technologies, protocols and other low-level details. It’s the sort of diagram that
you could show to non-technical people.
Let’s look at an example. The techtribes.je¹ website provides a way to find people, tribes
(businesses, communities, interest groups, etc) and content related to the tech, IT and digital
sector in Jersey and Guernsey, the two largest of the Channel Islands. At the most basic level,
it’s a content aggregator for local tweets, news, blog posts, events, talks, jobs and more. Here’s
a context diagram that provides a visual summary of this.
¹http://techtribes.je

A context diagram for techtribes.je


4.3 People

These are the people who use your software system. Whether you model them as individual
people, users, roles, actors or personas is your choice. Typically I’ll capture the following
information about people:

• Name: The name of the person, user, role, actor or persona.
• Description: A short description of the person, their role, responsibilities, etc.

4.4 Software systems

These are the other software systems that your software system interacts with. Again, I’ll
capture the following information:

• Name: The name of the software system.
• Description: A short description of the software system, its responsibilities, etc.

Optionally, I may want to capture some information about the location of the software system
relative to my point of reference. For example, if I’m building a software system inside an
organisational boundary, that software system may interact with external systems outside
that boundary (e.g. on the public Internet). For example, a software system I’m building for
a bank may interact with a third-party fraud prevention system on the Internet. In this case,
I might label the fraud prevention software system as being an “External Software System”
rather than just a “Software System”, because it sits outside of the organisation that I work
for.
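
The information captured for people and software systems, including the optional internal/external label, could be recorded as follows. This is a minimal sketch: the `external` flag and the `label` method are assumptions made for the example, and the bank scenario is paraphrased from the text above with hypothetical names and descriptions.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    description: str


@dataclass
class SoftwareSystem:
    name: str
    description: str
    external: bool = False  # True if it sits outside our organisation

    def label(self) -> str:
        """Text to place inside the box on the context diagram."""
        kind = "External Software System" if self.external else "Software System"
        return f"{self.name}\n[{kind}]\n{self.description}"


# Hypothetical elements for a bank's context diagram.
customer = Person("Customer", "Somebody who holds an account with the bank.")
banking = SoftwareSystem("Online Banking", "Lets customers manage their accounts.")
fraud = SoftwareSystem(
    "Fraud Prevention",
    "Third-party service that scores transactions for fraud risk.",
    external=True,
)

print(fraud.label())
```

Keeping the location as data rather than as a drawing convention means the notation (colour, border style, label text) can change later without touching the model.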

System scope

Determining which software systems to include on the diagram requires you to ask yourself
which software systems sit outside the scope or boundary of your software system. This
basically comes down to ownership or understanding whether you have responsibility for
maintaining the software system in question. Who owns the software system? Who looks
after it?
The techtribes.je example is very clear. I don’t own Twitter, GitHub or people’s blogs so
they are clearly outside of the scope of my system. For this reason, they are included on the
diagram to illustrate that they are dependencies of the techtribes.je system.

If we think about the financial risk system though, one of the key requirements is to generate
a Microsoft Excel compatible report for a number of business users. Here’s a basic context
diagram that summarises the requirements².

A basic context diagram for the financial risk system

The method we choose to distribute the report will have an effect on what we see on the
context diagram. For example:

Scenario 1: Reports are saved on a network file share

My typical approach to this scenario is to treat the file system as internal to the financial risk
system boundary, so it wouldn’t appear on the context diagram. Although the network file
system might be a centralised service that I don’t own, I will probably have ownership of a
directory structure that resides on it.
²Please see https://structurizr.com/public/3621 for the diagram key.

Scenario 2: Reports are e-mailed

If the financial risk system is e-mailing report notifications to a collection of business users,
perhaps containing a hyperlink to view the report, I would expect to see an e-mail system on
the diagram, assuming that I don’t own that e-mail system of course.

A context diagram for the financial risk system, where report notifications are e-mailed to business users

Alternatively, you could exclude the e-mail system and show the notification via e-mail as
an interaction between the financial risk system and the business user.

An alternative context diagram for the financial risk system, where report notifications are e-mailed to
business users

This version of the context diagram also makes sense and tells the story, but I like the
explicitness of including the e-mail system as a box on the diagram because it makes it much
easier to identify system dependencies at a glance. The choice is yours though.

Scenario 3: Reports are uploaded to a corporate wiki

If the report was being shared via a corporate wiki or Microsoft SharePoint, I would show
this on the context diagram. Again, we can think of the corporate wiki as a software system
that I don’t have ownership of.

A context diagram for the financial risk system, where reports are uploaded to a corporate wiki
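
The ownership rule behind the three scenarios can be summarised in a short sketch: anything you depend on but don’t own appears as a separate box, and anything you own is treated as inside your system boundary. The `Dependency` type, the `owned_by_us` flag and the function name are hypothetical, introduced only for this illustration.

```python
from typing import List, NamedTuple


class Dependency(NamedTuple):
    name: str
    owned_by_us: bool


def systems_on_context_diagram(dependencies: List[Dependency]) -> List[str]:
    """Return the dependencies that should be drawn as separate systems."""
    return [d.name for d in dependencies if not d.owned_by_us]


# The report distribution options for the financial risk system.
options = [
    Dependency("Report directory on the network file share", owned_by_us=True),
    Dependency("E-mail system", owned_by_us=False),
    Dependency("Corporate wiki", owned_by_us=False),
]

print(systems_on_context_diagram(options))
# ['E-mail system', 'Corporate wiki']
```

Ownership is a judgment call rather than a hard fact, so the flag is best agreed within the team rather than derived automatically.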

Notation

From a notation perspective, you may have seen diagrams that have represented software
systems as actors, using the traditional “stick man” icon. This comes from UML where “an
actor specifies a role played by a user or any other system that interacts with the subject”.
I’ve done this myself in the past but I shy away from doing it now as it tends to cause too
much confusion. After all, why would you want to visually represent a software system as a
person shape?

4.5 Interactions

Try to annotate every interaction between people and software systems on the diagram with
some information about the purpose of that interaction. This avoids creating a diagram with
a collection of boxes with ambiguous lines connecting everything together.
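
If your diagrams are backed by a simple model, this advice can be checked mechanically. The sketch below assumes each interaction is recorded with a source, a destination and a purpose, and reports any line that lacks a purpose. The `Interaction` type and the example interactions are hypothetical.

```python
from typing import List, NamedTuple


class Interaction(NamedTuple):
    source: str
    destination: str
    purpose: str


def unlabelled(interactions: List[Interaction]) -> List[Interaction]:
    """Return the interactions that have no stated purpose."""
    return [i for i in interactions if not i.purpose.strip()]


# Hypothetical interactions on a context diagram.
interactions = [
    Interaction("User", "techtribes.je", "Finds people, tribes and content using"),
    Interaction("techtribes.je", "Twitter", ""),  # an ambiguous line
]

for interaction in unlabelled(interactions):
    print(f"Missing purpose: {interaction.source} -> {interaction.destination}")
```

A check like this only catches missing labels; whether a label is actually meaningful still needs a human reader.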

4.6 Motivation

You might ask what the point of such a simple diagram is. Here’s why it’s useful:

• It makes the context and scope explicit so that there are no assumptions.
• It shows what is being added (at a high level) to an existing environment.
• It’s a high-level diagram that technical and non-technical people can use as a starting
point for discussions.
• It provides a starting point for identifying who you potentially need to go and talk to
as far as understanding inter-system interfaces is concerned.

A context diagram doesn’t show much detail but it does help to set the scene and is a starting
point for other diagrams. I will often draw this diagram during a requirements gathering
workshop, to ensure that everybody understands the scope of what we’ve been tasked to
build.

4.7 Audience

Technical and non-technical people, inside and outside of the immediate software develop-
ment team.

4.8 Required or optional?

All software systems should have a context diagram.


5. Level 2: Container diagram
Once you understand how your system fits in to the overall environment with a context
diagram, a useful next step is to illustrate the high-level technology choices with a container
diagram.

5.1 Intent

The container diagram shows the high-level shape of the software architecture and how
responsibilities are distributed across it. It also shows the major technology choices and how
the containers communicate with one another. It’s a simple, high-level technology focussed
diagram that is useful for software developers and support/operations staff alike. A container
diagram helps you answer the following questions:

1. What is the overall shape of the software system?
2. What are the high-level technology decisions?
3. How are responsibilities distributed across the system?
4. How do containers communicate with one another?
5. As a developer, where do I need to write code in order to implement features?

5.2 Structure

Draw a simple block diagram showing the high-level technical elements that your software
system consists of. As an example, the following diagram shows the containers that make up
the techtribes.je website.

A container diagram for techtribes.je


Put simply, techtribes.je is made up of an Apache Tomcat web server that provides users with
information, and that information is kept up to date by a standalone content updater process.
All data is stored either in a MySQL database, a MongoDB database or the file system.
It’s worth pointing out that this diagram says nothing about the number of physical instances
of each container. For example, there could be a farm of web servers running against a
MongoDB cluster, but this diagram doesn’t show that level of information. Instead, I show
physical instances, failover, clustering, etc on a separate deployment diagram that illustrates
the mapping of containers onto infrastructure.

5.3 Containers

As I’ve already said, essentially a container represents an execution environment or data
storage. I’ll capture the following information about each container:

• Name: The name of the container (e.g. “Internet-facing web server”, “Database”, etc).
• Technology: The implementation technology (e.g. Apache Tomcat 7, Microsoft IIS 8.5,
etc).
• Description: A short descriptive statement. In the case of execution environments, this
is a list of the container’s key responsibilities. In the case of a data store, I’ll list the
major entities, tables, files, etc that are being stored.

File systems and log files vs data

My techtribes.je container diagram shows that search indexes are stored on a file system that
is shared by the web application and the content updater. This is an important use of the file
system, which is why I’ve included it on the diagram.
However, the web application and content updater also write log files to the file system, but
this isn’t shown. You could argue that the same is true for MySQL and MongoDB. In fact,
many of the containers you’ll draw on diagrams write log files to a file system. Although
this is undoubtedly important, I typically omit this detail for brevity.

5.4 Interactions

Typically, communication between containers is inter-process communication. It’s very
useful to explicitly identify this and summarise how these interfaces will work. As with
any diagram, I recommend annotating all interactions rather than simply having a diagram
with a collection of boxes and ambiguous unlabelled lines connecting everything together.
Useful information to annotate the interactions with includes:

• The purpose of the interaction (e.g. “reads/writes data from”, “sends reports to”, etc).
• The communication mechanism (e.g. Web Services, REST, Java Remote Method
Invocation, Windows Communication Foundation, Java Message Service).
• The communication style (e.g. synchronous, asynchronous, batched, two-phase com-
mit, etc).
• Protocols and port numbers (e.g. HTTP, HTTPS, SOAP/HTTP, SMTP, FTP, RMI/IIOP,
etc).
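
The annotations listed above can be captured as optional fields on each interaction, so that details you don’t yet know can simply be left out. This is a sketch under assumptions: the `ContainerInteraction` type, its `label` format and the JDBC example are hypothetical and are not taken from the techtribes.je diagram.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContainerInteraction:
    source: str
    destination: str
    purpose: str
    mechanism: Optional[str] = None  # e.g. REST, Java Message Service
    style: Optional[str] = None      # e.g. synchronous, asynchronous
    protocol: Optional[str] = None   # e.g. HTTPS, SMTP

    def label(self) -> str:
        """Build the text for the line, skipping any unknown details."""
        details = [d for d in (self.mechanism, self.style, self.protocol) if d]
        suffix = f" [{', '.join(details)}]" if details else ""
        return f"{self.source} -> {self.destination}: {self.purpose}{suffix}"


# Hypothetical examples: one fully annotated, one with the purpose only.
detailed = ContainerInteraction(
    "Web Application", "MySQL Database", "reads data from",
    mechanism="JDBC", style="synchronous",
)
minimal = ContainerInteraction(
    "Content Updater", "File System", "writes search indexes to",
)

print(detailed.label())
print(minimal.label())
```

Only the purpose is mandatory here, which matches the recommendation that every line should at least say why it exists.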

How much detail?

If you’re drawing a container diagram during an up-front design exercise, you might not
have some of the technical details to hand. That’s fine, simply add what you know. If, on
the other hand, you’re drawing a diagram to document an existing system, it’s more likely
that you’ll be able to add some of the finer details; such as protocols, port numbers, etc. The
choice is yours, add as much detail as you feel is necessary.

5.5 People and software systems

As with the context diagram, I usually include the same collection of people (users, actors,
personas, etc) and software systems. If you look back at the techtribes.je container diagram,
you’ll notice that it has the same people (top) and external software systems (bottom) as the
techtribes.je context diagram. I do this because it helps to tell the overall story of how the
system works and it provides continuity between the two diagrams.

People vs web browsers

The techtribes.je container diagram illustrates that the various types of users use the web
application. Strictly speaking, this isn’t true though. The users use a web browser, which in
turn uses the server-side web application. This raises the question, why didn’t I include the
web browser on the diagram?
In this specific instance, the web browser is simply a delivery mechanism for static content
(HTML and CSS, with a tiny amount of JavaScript) and it doesn’t add much to the story, so
I’ve excluded the web browser. There are times when I definitely would add the web browser
though, for example, if the web browser was hosting a single page application or a large
JavaScript application. If the web browser is a significant part of the software architecture,
I’ll add it to the diagram. If not, I won’t.

5.6 Software system boundary

If you do choose to include people and software systems, I recommend drawing a bounding
box around the containers on your diagram to explicitly show the system boundary. This
system boundary should correspond with the single box that appears on the context diagram.

The container diagram shows a zoom in of the system boundary

5.7 Motivation

Where a context diagram shows your software system as a single box, a container diagram
opens this box up to show what’s inside it. This is useful because:

• It makes the high-level technology choices explicit.
• It shows the relationships between containers and how those containers communicate.

In addition, during my software architecture sketching workshops, I often see groups
producing a high-level context diagram that shows the software system in question as a
single black box, and a second diagram that shows all of the components that reside within
the entirety of the software system. This second diagram is usually cluttered and confusing
because it presents too much information. It’s also usually not clear how the components
are grouped together from a deployment perspective. The container diagram provides a nice
intermediate diagram between the two contrasting levels of abstraction. It also logically leads
you to level 3 (a component diagram), where you open up a single container to show only
the components that reside within that container.

5.8 Audience

Technical people inside and outside of the immediate software development team; including
everybody from software developers through to operational and support staff.

5.9 Required or optional?

All software systems should have a container diagram.


6. Level 3: Component diagram
Following on from a container diagram showing the high-level technology elements, the next
step is to zoom in and decompose each container further to show the components inside it.

6.1 Intent

A component diagram helps you answer the following questions.

1. What components/services is the system made up of?
2. Is it clear how the system works at a high level?
3. Do all components/services have a home (i.e. reside in a container)?
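
The third question lends itself to a simple automated check. The sketch below assumes you keep a list of known components and a mapping from each container to the components it hosts; `homeless_components` is a hypothetical helper, and one component is deliberately left unassigned to show what the check reports.

```python
from typing import Dict, List


def homeless_components(
    components: List[str], containers: Dict[str, List[str]]
) -> List[str]:
    """Return the components that do not reside in any container."""
    housed = {name for members in containers.values() for name in members}
    return sorted(set(components) - housed)


components = [
    "Scheduled Content Updater",
    "Twitter Connector",
    "GitHub Connector",
    "News Feed Connector",
]

# "News Feed Connector" is deliberately missing from the mapping.
containers = {
    "Content Updater": [
        "Scheduled Content Updater",
        "Twitter Connector",
        "GitHub Connector",
    ],
}

print(homeless_components(components, containers))
# ['News Feed Connector']
```

An empty result means every component has a home, which is the state you want before drawing a component diagram per container.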

6.2 Structure

Whenever people are asked to draw “architecture diagrams”, they usually end up drawing
diagrams that show the components that make up their software system. That is basically
what a component diagram shows, except we only want to see the components that reside
within a single container at a time.
As illustrated by the container diagram, techtribes.je includes a standalone process that pulls
in content from Twitter, GitHub and blogs. The following diagram illustrates the high-level
internal structure of the content updater in terms of its components.

This diagram shows that the content updater is made up of a number of components. A
scheduled content updater component uses a Twitter connector, a GitHub connector and a
news feed connector to retrieve information from the outside world. It then also uses some
additional components to store this information into the appropriate data store. This diagram
shows how the content updater is divided into components, what each of those components
is, their responsibilities and the technology/implementation details.

6.3 Components

As we saw when creating our shared vocabulary, components are the coarse-grained building
blocks of your system. I’ll capture the following information for each component:

• Name: The name of the component.
• Technology: The implementation technology for the component (e.g. Plain Old
[Java|C#|Ruby|etc] Object, Enterprise JavaBean, Windows Communication Founda-
tion service, etc).
• Description: A short description of the component, usually a brief sentence describing
the component’s responsibilities.

You can think about and identify components regardless of how the code is packaged and
the architectural style in use. With this in mind, your component diagram should reflect the
architectural style in use; whether that’s a layered architecture, hexagonal architecture or
something else entirely.

Infrastructure components and cross-cutting concerns

Infrastructure components are important parts of most software systems, yet you may or
may not want to include them on your component diagram. For example, if you have a
logging component, it’s likely to be used by the majority of other components within the
container. Drawing this component and all of the interactions can result in a very cluttered
diagram. In the techtribes.je component diagram, the “Logging Component” is used by all of
the other components, but I didn’t want to draw the lines to it from every other component
for exactly this reason - the resulting diagram looks very cluttered. Instead, I’ve simply used
an asterisk to denote this on the diagram. The other option is to simply not include the logging
component if it doesn’t add much value in helping you tell the story.

Shared components and libraries

I’m often asked whether a component diagram should include shared components (for
example, from a shared or static library) and how such components should be represented.
If including the shared component helps tell the story, then it should certainly be included.
If you want to illustrate that a particular component is a shared component or sits within a
specific library/module, again, you can simply use a notation (e.g. a symbol or colour coding)
to represent this fact.

Multiple component implementations

The example component diagram for the techtribes.je content updater shows a number of
distinct components. And if you look in the source code, you’ll notice that the component
interface and implementation classes are bundled together because there is only ever a single
implementation of each component. While this pattern holds true for many codebases, there
are certainly times where a component interface will have multiple implementations. This is
particularly true of software products (rather than bespoke software), where the collection
of active components will be selected through configuration when the product is installed.
Common examples include different implementations of data storage components (e.g. one
for Microsoft SQL Server, one for MySQL, etc), different logging components (e.g. local disk
or a message queue) or pluggable authentication components for integration with different
identity providers.
Having multiple component implementations raises the question of how this should be
illustrated on a component diagram. If we take a simple example of a logging component
with multiple implementations (e.g. local disk vs a message queue), one of which is chosen
at deployment time via configuration, there are a few approaches to diagramming this:

1. The first approach is to omit the fact that there are multiple component implementations
and simply draw a component diagram as if there were only a single implementation.
Here, I would draw the logging component and describe its responsibilities,
which in this case might be to “log errors and other system events”.
2. If I wanted to include the fact that the component implementation can vary, I might add
some additional text to the logging component box on the diagram to say something
like “Log entries are stored using local disk or sent to a message queue, depending
upon the implementation chosen by configuration at deployment time”. You could also
achieve the same result by highlighting the logging component using a colour coding
or symbol. The diagram key would then explain what the colour coding or symbol
means.
3. The other approach is to have one component diagram per implementation option.
This works best if you only have a small number of component implementation
combinations (e.g. you only have one or two components where the implementation
can be swapped in at runtime). Having separate diagrams for specific component
implementations can also be useful if those component implementations themselves
introduce other components, which wouldn’t be seen on a diagram otherwise.

As with many of the things discussed in this book, there is no “right” answer, and it really
depends on what story you need to tell.
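
To make the logging example more concrete, here is a minimal Python sketch of one
component interface with two implementations, one of which is selected through
configuration. The class names, the `logging` configuration setting and the in-memory
lists that stand in for a disk file and a message queue are all assumptions made for
illustration.

```python
from abc import ABC, abstractmethod


class LoggingComponent(ABC):
    """Component interface: log errors and other system events."""

    @abstractmethod
    def log(self, message: str) -> None: ...


class LocalDiskLoggingComponent(LoggingComponent):
    def __init__(self) -> None:
        self.lines = []   # stands in for a file on local disk

    def log(self, message: str) -> None:
        self.lines.append(message)


class MessageQueueLoggingComponent(LoggingComponent):
    def __init__(self) -> None:
        self.queue = []   # stands in for a real message queue

    def log(self, message: str) -> None:
        self.queue.append(message)


# Hypothetical mapping from a configuration value to an implementation.
IMPLEMENTATIONS = {
    "local-disk": LocalDiskLoggingComponent,
    "message-queue": MessageQueueLoggingComponent,
}


def create_logging_component(config: dict) -> LoggingComponent:
    """Pick the implementation named by the hypothetical 'logging' setting."""
    return IMPLEMENTATIONS[config["logging"]]()


logger = create_logging_component({"logging": "message-queue"})
logger.log("Content update started")
```

The callers only ever see `LoggingComponent`, which is why the first diagramming approach
(a single logging box) is often enough.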

How much detail?

If you’re drawing a component diagram during an up-front design exercise, you might not
have some of the technical details to hand. Once again, don’t worry, simply add what you
know. If, on the other hand, you’re drawing a diagram to document an existing system, you’ll
have those finer details to hand; such as the frameworks you may be using to help implement
a component. As with many other aspects of the diagrams, the choice of how much detail
you include is yours.

Can I show components on the container diagram?

Particularly with small software systems, it can be tempting to skip creating a component
diagram and show all of the components as nested boxes inside the respective container on
the container diagram itself. While that’s certainly an option that you can experiment with,
I tend to find that, even with small software systems, a single diagram showing containers
and their components gets too cluttered.
My personal preference is to keep the container diagram as simple as possible, and to annotate
each of the containers you show on the diagram with a short list of the responsibilities rather
than show all of the components. Not only will this result in a clean and simple diagram, but
it will also provide a nice high-level technology diagram that you can show to people like
operations and support staff. It also provides a nice starting point for a separate component
diagram, and you can sanity check that the components you show on the component diagram
correlate with the responsibilities marked on the container diagram.

6.4 Interactions
To reiterate the same advice given for other diagram types, it’s useful to annotate the
interactions between components rather than simply having a diagram with a collection
of boxes and ambiguous lines connecting them all together. Useful information to add to the
diagram includes:

• The purpose of the interaction (e.g. “uses”, “persists data using”, “delegates to”, etc).
• Communication style (e.g. synchronous, asynchronous, etc).
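
As a rough sketch of the two annotations listed above, an interaction could be recorded
with its purpose and communication style, which also makes it easy to spot lines that
have been left unlabelled. The `Interaction` class and the component names and
interactions below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    source: str
    destination: str
    purpose: str = ""   # e.g. "uses", "persists data using", "delegates to"
    style: str = ""     # e.g. "synchronous", "asynchronous"

    def label(self) -> str:
        """Annotation to write next to the line on the diagram."""
        return f"{self.purpose} [{self.style}]" if self.style else self.purpose


def unannotated(interactions):
    """Return the interactions that would be drawn as ambiguous, unlabelled lines."""
    return [i for i in interactions if not i.purpose]


# Component names and interactions are hypothetical.
interactions = [
    Interaction("Scheduled Content Updater", "Twitter Connector", "uses", "synchronous"),
    Interaction("Scheduled Content Updater", "Tweet Component", "persists data using"),
    Interaction("Tweet Component", "Logging Component"),
]

for missing in unannotated(interactions):
    print(f"No annotation: {missing.source} -> {missing.destination}")
```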

6.5 People, software systems and containers

As with the container diagram, it can be useful to include people, software systems and
other containers to help put some context around the container you’ve zoomed in upon. The
techtribes.je component diagram for the Content Updater includes the containers (top) and
software systems (bottom) that it interacts with. Again, you’ll notice the placement of these
elements remains consistent across the various techtribes.je diagrams.

6.6 Container boundary

If you’re going to include people, software systems and containers on a component diagram,
I recommend drawing a bounding box to explicitly show the boundary of the container.

The component diagram shows a zoom in of a single container

6.7 Motivation
Decomposing your software system into a number of components is software design at a
slightly higher level of abstraction than classes and the code itself. An audit component might
be implemented using a single class backing onto a logging framework (e.g. log4j, log4net, etc)
but treating it as a distinct component lets you also see it for what it is, which is a key building
block of your architecture. Working at this level is an excellent way to understand how your
system will be internally structured, where reuse opportunities can be realised, where you
have dependencies between components (or between components and containers), and so on.
Breaking down the overall problem into a number of separate parts also provides you with a
basis to get started with some high-level estimation, which is great if you’ve ever been asked
for ballpark estimates for a new project.
A component diagram shows the logical components that reside inside each of the containers.
This is useful because:

• It shows the high-level decomposition of your software system into components with
distinct responsibilities.
• It shows where there are relationships and dependencies between components.
• It provides a framework for high-level software development estimates and how the
delivery can be broken down.

Designing a software system at this level of abstraction is something that can be done in a
number of hours or days rather than weeks or months. It also sets you up for designing/coding
at the class and interface level without worrying about the overall high-level structure.

6.8 Audience

Technical people within the software development team.

6.9 Required or optional?

I don’t typically draw a component diagram for data storage containers (e.g. databases,
file systems, content stores, etc) or for those containers that are very simple in nature (e.g.
microservices). Exceptions aside, yes, all containers should have a component diagram.
7. Level 4: Class diagram
The final level of detail is a class diagram showing the internals of an individual component.

7.1 Intent

The intent of a class diagram is to illustrate the structure of the code and, in this case, how a
component is implemented.

7.2 Structure

The best way to create a class diagram is using UML, either by generating it automatically
from the code or by drawing it freehand. Let’s look at some examples by zooming in on the
“TweetComponent” from the techtribes.je “Content Updater” component diagram.
Here’s a UML class diagram that has been automatically generated by reverse-engineering
the code.

A UML class diagram generated by IntelliJ IDEA


And here’s another version of the class diagram that I’ve drawn myself using OmniGraffle
and the UML 2.1 Collection stencil¹. I’ve used the + and # symbols along with a colour-coding
(black vs grey) to signify that elements are public or package-protected respectively.

A UML class diagram drawn using OmniGraffle

Both diagrams show more or less the same information. The decision as to which approach
you choose often comes down to a trade-off between visual style, flexibility and the overhead
of keeping the diagram up to date as the code changes.

How much detail?

The danger with class diagrams is that it’s very easy to include a considerable amount of
detail, and this is especially true if you are auto-generating diagrams from code. Although it’s
tempting to include every field/property/attribute and method, I would resist this temptation
and only include as much information as you need to tell the story that you want to tell.
Typically, I will only include the attributes and methods that are relevant to the narrative I
want to create.
UML class diagrams have often been used to describe an entire application, but this just
results in a huge mess of overlapping boxes and lines, regardless of how well-structured the
code is. The key to using class diagrams is to limit their scope. In this case, scope is limited
to the internals of a component.
¹https://stenciltown.omnigroup.com/#stencil=uml-21-collection

7.3 Motivation

Having this final level of abstraction provides a way to map the high-level, coarse-grained
components into real-world code elements. It helps to bridge what are sometimes seen as
two very different worlds: the software architecture and the code.

7.4 Audience

Technical people within the software development team, specifically software developers.

7.5 Required or optional?

Since this level of detail lives in the code, software developers can get this detail on demand
from the code itself. Therefore, this is definitely an optional level of detail, and I don’t
typically draw class diagrams for anything but the most complex of components or as a
template when I want to describe a pattern that is used across a codebase.
8. Notation
Now that we’ve created a shared vocabulary and looked at how to create diagrams at a
number of different levels of abstraction, let’s look at notation.

8.1 Titles

The first thing that can really help people to understand a diagram is including a title. If you’re
using a notation like UML, the diagram elements will probably provide a clue as to what the
context of the diagram is. That doesn’t really help if you have a collection of diagrams that
are all just boxes and lines though. Include a short and meaningful title on every diagram.
And if the diagrams should be read in a specific order, make sure this is clear in the title,
perhaps by the use of a numbering scheme.

8.2 Keys and legends

One of the advantages of using a notation like UML is that it provides a standardised set
of diagram elements for each type of diagram. In theory, if somebody is familiar with these
elements, they should be able to understand your diagram. In the real world this isn’t always
the case, but this certainly isn’t the case with “boxes and lines” diagrams where the people
drawing the diagrams are inventing the notation as they go along.
There’s nothing wrong with inventing your own notation, but make sure that you give
everybody an equal chance of understanding it by including a key/legend somewhere on
or near the diagram. Here are the things that you might want to include explanations of:

• Shapes
• Line styles
• Colours
• Borders
• Acronyms


You can often make assumptions and interpret the use of diagram elements without a key. For
example, I’ve heard people say the following sort of thing during my software architecture
sketching workshops:

“the grey boxes seem to be the existing systems and the red boxes are the new
systems”

Even if the notation seems obvious to you, I recommend playing it safe and adding a
key/legend. Even the seemingly obvious can be misinterpreted by people with different
backgrounds and experience.
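
If your diagrams are held as data, one way to play it safe is to check that every style
used on a diagram is explained by the key/legend. This is a minimal Python sketch under
that assumption; the styles and their meanings are invented for the example.

```python
# Styles used on a diagram, as (kind, value) pairs. All values are hypothetical.
styles_used = {
    ("shape", "box"),
    ("shape", "cylinder"),
    ("colour", "grey"),
    ("colour", "red"),
    ("line style", "dashed"),
}

# What the key/legend currently explains.
legend = {
    ("shape", "box"): "Software system",
    ("shape", "cylinder"): "Database",
    ("colour", "grey"): "Existing system",
    ("colour", "red"): "New system",
}


def missing_from_legend(used, key):
    """Return the styles that appear on the diagram but are not explained."""
    return sorted(style for style in used if style not in key)


for kind, value in missing_from_legend(styles_used, legend):
    print(f"Add to key/legend: {kind} '{value}'")
```

Here the dashed line style is the only thing a reader would have to guess at.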

8.3 Elements

Most “boxes and lines” diagrams that I’ve seen aren’t just boxes and lines, with teams using
a variety of shapes to represent elements within their software architecture. For example,
you’ll often see cylinders on a diagram and many people will interpret them to be a database
of some description. But this isn’t always the case.
My recommendation is that you start with a pure “boxes and lines” diagram, using a very
utilitarian notation and then add shapes, colour and borders to add additional information
or make the diagram more aesthetically pleasing. In order to show some example software
architecture diagrams in this book, I’ve needed to create my own notation, which includes
the following information for each element:

• Person: Name and description.
• Software system: Name and description.
• Container: Name, technology and description.
• Component: Name, technology and description.

As you will have seen from the example diagrams, each of the elements is drawn as follows:

Examples of the elements I use on my diagrams

This is the notation that I’ve gradually settled on over the years. It’s easy to draw on a
whiteboard or in tooling, plus it works well on sticky notes and index cards. Do feel free to
create your own notation though.

Description/responsibilities

Naming is one of the hardest things in software development, so do resist the temptation
to have a diagram full of boxes that only contain names. If you look at most software
architecture diagrams, this is exactly what they are - a collection of named boxes. As with
many other things, naming is always open to interpretation and ambiguity.
A really simple, yet effective, way to add an additional layer of information to, and remove
ambiguity from, an architecture diagram, is to annotate diagram elements with a short
descriptive statement of what their responsibilities are. A bulleted list (7 +/- 2 items¹) or
a short sentence works well.
Provided it’s kept short (and using a smaller font for this information can help too), adding
more text onto diagrams can help provide a really useful “at a glance” view of what the
software system does and how it’s been structured. Take a look at the following diagrams -
which do you prefer?
¹http://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two

Adding additional descriptive text to diagram elements can remove ambiguity

Technology

The majority of architecture diagrams I see omit any information about technology, instead
focussing on illustrating the functional decomposition and major conceptual elements.
Asking people why their diagrams don’t show any technology decisions results in a number
of related responses, particularly if those diagrams are being drawn as a part of an up front
design exercise:

• “we don’t want to force a solution on developers”.
• “it’s an implementation detail”.
• “we follow the ‘last responsible moment’ principle”.

If you’re drawing software architecture diagrams retrospectively, for documentation after the
software has been built, there’s really no reason for omitting technology decisions. However,
others don’t necessarily share this view and I often hear the following comments:

• “the technology decisions will clutter the diagrams”.
• “but everybody knows that we only use ASP.NET against an Oracle database”.

It seems that regardless of whether diagrams are being drawn before, during or after the
software has been built, there’s a common misconception that architecture diagrams should
be conceptual in nature. I like to ensure that software architecture diagrams are grounded
in reality by including technology choices wherever possible, particularly on the container
and component diagrams. Including technology choices on software architecture diagrams
removes ambiguity, even if you’re working in an environment where all software is built
using a standard set of technologies and patterns.
Of course, if you’re retrospectively diagramming an existing codebase, the technology
decisions have already been made and therefore you have the information to add to the
diagrams. But what about when the diagrams are being drawn during an up front design
exercise? Perhaps you don’t know what technologies you’re going to use. Or perhaps there
are a number of options.
Imagine that you’re designing a software system. Are you really doing this without thinking
about how you’re actually going to implement it? Are you really thinking in terms of
conceptual boxes and functional decomposition? If the answer to these questions is “no”,
then my recommendation is to include as much information as possible. For example, if your
container diagram shows a database but you’re not sure whether it will be Microsoft SQL
Server or MySQL, why not state this by annotating the database element with “Microsoft
SQL Server or MySQL”? Doing this at least makes it explicit that a decision needs to be made,
and it shows the options that are under consideration too.
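
The database example can be sketched in a few lines of Python. The function names, the
container names and the bracketed label format are assumptions for illustration; the
point is simply that an element can carry a list of candidate technologies and that
anything with more (or fewer) than one candidate is an open decision.

```python
def technology_label(options):
    """Join the candidate technologies into the text shown on the element."""
    return " or ".join(options) if options else "Technology to be decided"


def undecided(elements):
    """Names of elements that still need a technology decision."""
    return [name for name, options in elements.items() if len(options) != 1]


# Hypothetical containers from an up front design exercise.
containers = {
    "Web Application": ["ASP.NET"],
    "Database": ["Microsoft SQL Server", "MySQL"],
}

for name, options in containers.items():
    print(f"{name} [Container: {technology_label(options)}]")
```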

Colour

Software architecture diagrams don’t have to be black, white and various shades of grey.
Colour can be used to provide differentiation between diagram elements or to ensure that
emphasis is/isn’t placed on them. For example, you could colour-code elements according to:

• Existing vs new.
• Off-the-shelf product vs custom build.
• Technology type or platform.
• Risk profile (e.g. risk to build; high-medium-low risk; red-amber-green).
• Size and/or complexity.
• Ownership (i.e. elements you own vs elements somebody else owns).
• Internal vs external (i.e. elements within your organisation vs those outside).
• Elements you’re modifying or removing in the next release/sprint/phase vs those that
will remain untouched.

If you’re going to use colour, and I recommend that you should, make sure that it’s obvious
what your colour coding scheme is by including a reference to those colours in the key/legend.
Colour can make a world of difference. Just be aware of anybody on your team who suffers
from colour blindness and make sure that your colour scheme works if the diagram will be
printed on black and white printers.
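
One rough way to test a colour scheme for black and white printing is to convert each
colour to an approximate grey level and look for pairs that end up too close together.
The Python sketch below uses the common luma weights (0.299, 0.587, 0.114); the RGB
values and the threshold of 40 are assumptions chosen for the example, and the check says
nothing about colour blindness, which needs separate consideration.

```python
def grey_level(rgb):
    """Approximate brightness (0-255) of an RGB colour when printed in greyscale.

    Uses the common luma weights 0.299, 0.587 and 0.114 for red, green and blue.
    """
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def hard_to_tell_apart(colours, min_difference=40):
    """Return pairs of colour names whose grey levels are closer than min_difference."""
    names = sorted(colours)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(grey_level(colours[a]) - grey_level(colours[b])) < min_difference
    ]


# A hypothetical red-amber-green risk scheme.
risk_colours = {
    "high risk": (255, 0, 0),
    "medium risk": (255, 191, 0),
    "low risk": (0, 128, 0),
}

print(hard_to_tell_apart(risk_colours))
```

With these particular values, the red and green come out at almost the same grey level,
so the high and low risk elements would be hard to tell apart on a black and white
printout unless the text or a border also carries that information.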

Shapes

Using different shapes can be a great way to add an additional level of information,
supplement, enhance or add emphasis to specific elements. Using shapes can also make a
diagram look more aesthetically pleasing. Although this sounds shallow, people are more
likely to look at diagrams that are easy on the eye. Consider the two versions of the
techtribes.je container diagram that follow. They both show exactly the same information
for every element (name, element type, technology if applicable and description), but one
uses only boxes whereas the other uses some shapes.

The diagram that uses shapes is certainly easier to read from a distance, with the shapes
helping to provide a quicker “at a glance” view. But the shapes are simply enhancing the
diagram; they don’t really add any information that isn’t already present in the text that
resides inside the elements.
The Unified Modeling Language has numerous diagram types and an even higher number
of element types that can appear on those diagrams. Anecdotally, interpreting the notation
is one of the major reasons cited for not adopting UML, and many software developers have
told me that there are simply too many diagram types and nuances in the notation.
In contrast, I like to use a very simple notation consisting of a small number of shapes. Over
the numerous years that I’ve been running my software architecture sketching workshop, I’ve
observed that most developers typically only use the following shapes on their diagrams:

• Boxes (squares, rectangles and rounded boxes).
• People shapes (the “stick man”, “head and shoulders”, etc).
• Cylinders (e.g. to represent databases).
• Folder shapes (e.g. to represent file systems).

My advice is to keep diagrams as simple as possible, but do feel free to use whatever shapes
you like. Again, don’t forget to include the shapes on the key/legend.

Borders

Like shapes, adding borders (e.g. double lines, coloured lines, dashed lines, etc) around
diagram elements can be a great way to add emphasis or to group related elements together.
If you do this, make sure that it’s obvious what the border means, either by labelling the
border or by including an explanation on the key/legend.

Size

A quick note about the size of elements. Be careful about how you size elements on diagrams.
I’ve witnessed a tendency for people to make assumptions about elements that are sized
differently from others. Larger elements are often assumed to represent something larger,
more complex or more significant, while smaller elements tend to take on the inverse
characteristics. Unless
you are specifically making a statement about size, complexity or significance, I recommend
drawing all elements approximately the same size.

8.4 Lines

Lines are an important part of most architecture diagrams, acting as the glue that holds all
of the boxes together. The big problem with lines is exactly that though - they tend to be
thought of as the unimportant things that hold the other, more significant, elements of the
diagram together. As a result, lines often don’t receive much focus.

Directionality

Even though most relationships between elements are bi-directional (e.g. a request followed
by a response), I usually choose the most significant direction and represent that as a
uni-directional line. This raises the question, “which way do you point the arrows?”.
In the techtribes.je example, on the context diagram, my users use techtribes.je and techtribes.je
uses a number of other software systems. If we look at the line between techtribes.je and
Twitter, I could have drawn it in a number of ways:

• [techtribes.je] – gets profile information and tweets from -> [Twitter]
• [techtribes.je] – receives profile information and tweets from -> [Twitter]
• [Twitter] – sends profile information and tweets to -> [techtribes.je]

In this example, there is no “right answer”. I tend to prefer drawing a line from the initiator
to the receiver. All options are equally valid though and the style you adopt is your decision.
My advice is to try to be consistent and to avoid lines without annotations.

Line style

As with elements, you can use different line styles and colour to add an additional level of
information to your diagram. For example, perhaps synchronous interactions are illustrated
using solid lines, whereas asynchronous interactions are illustrated using dashed lines. And
perhaps HTTPS connections are coloured green, while HTTP connections are coloured
amber.
Once again, ensure that any styling supplements the existing information wherever possible
and that the styles you use end up being described on the key/legend.
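
If you draw diagrams with tooling rather than by hand, the line style rules described
above can be written down once and applied consistently. This Python sketch emits a
Graphviz DOT edge statement for a line; the choice of Graphviz, the style tables, the hex
value used for amber and the element names are all assumptions made for the example.

```python
# Hypothetical style rules; each one should also appear on the key/legend.
LINE_STYLE = {"synchronous": "solid", "asynchronous": "dashed"}
LINE_COLOUR = {"HTTPS": "green", "HTTP": "#FFBF00"}  # amber as a hex value


def dot_edge(source, destination, description, style, protocol):
    """Render one annotated line as a Graphviz DOT edge statement."""
    return (
        f'"{source}" -> "{destination}" '
        f'[label="{description}", style={LINE_STYLE[style]}, '
        f'color="{LINE_COLOUR[protocol]}"]'
    )


print(dot_edge("Web Browser", "Web Application", "uses", "synchronous", "HTTPS"))
```

Note that the label still says what the interaction is for; the style and colour only
supplement it.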

8.5 Layout

Using electronic drawing tools makes laying out diagram elements easier since you can move
them around as much as you want. Many people prefer to design software while stood in
front of a whiteboard or flip chart though, particularly because it provides a larger and better
environment for collaboration. The trade-off here is that you have to think more about the
layout of diagram elements because it can become awkward if you’re having to constantly
draw, erase and redraw elements of your diagrams when you run out of space.
Sticky notes and index cards can help to give you some flexibility if you use them as a
substitute for drawing boxes. And if you’re using a Class-Responsibility-Collaboration² style
technique to identify candidate classes/components/services during a design session, you can
use the resulting index cards as a way to start creating your diagrams.
²http://en.wikipedia.org/wiki/Class-Responsibility-Collaboration_card

Examples of where sticky notes and index cards have been used instead of drawing boxes

Need to move some elements? No problem, just move them. Need to remove some elements?
No problem, just take them off the diagram and throw them away. Sticky notes and index
cards can be a great way to get started with software architecture diagrams, but I find that
the resulting diagrams can look cluttered. Oh, and sticky notes often don’t stick well to
whiteboards, so have some blu-tack³ handy!

8.6 Orientation

Imagine you’re designing a 3-tier web application that consists of a web-tier, a middle-tier
and a database. If you’re drawing a container diagram, which way up do you draw it? Users
and web-tier at the top with the database at the bottom? The other way up? Or perhaps you
lay out the elements from left to right?
Most of the architecture diagrams that I see have the users and web-tier at the top, but this
isn’t always the case. Sometimes those same diagrams will be presented upside-down or back-
to-front, perhaps illustrating the author’s (potentially subconscious) view that the database
is the centre of their universe. Although there is no “correct” orientation, drawing diagrams
“upside-down” from what we might consider the norm can either be confusing or used to
great effect. The choice is yours.
³http://en.wikipedia.org/wiki/Blu-Tack

I do recommend that you try to put the most important thing in the centre of your diagram
and work around it. Additionally, try to keep the placement of elements consistent between
diagrams. As an example, all of the people in my techtribes.je diagrams are placed at the top,
and my system dependencies are placed at the bottom.

8.7 Labels and acronyms

You’re likely to have a number of labels on your diagrams; including names of software
systems, domain concepts and terminology, etc. Where possible, avoid using acronyms and
if you do need to use acronyms for brevity, ensure that they are documented in a glossary or
on the key/legend. While the regular team members might have an intimate understanding
of common domain acronyms, people outside or new to the team probably won’t.
The exceptions here are acronyms used to describe technology choices, particularly if they are
used widely across the industry. Examples include things like JMS (Java Message Service),
POJO (plain old Java object) and WCF (Windows Communication Foundation). Let your
specific context guide whether you need to explain these acronyms and if in doubt, play it
safe and use the full name/term or add the acronym to the key/legend.

8.8 Quality attributes

In some cases, adding information about quality attributes provides an extra degree of
narrative about what the software system does and how it has been designed. The simplest
way to do this is to add text to the diagram. For example, the description of diagram elements
could be extended to include a note about the number of potential users, the expected number
of concurrent users, etc. The lines between elements can also include additional information
about data volumes being transferred, target message latencies, target request/response times,
etc. Other cross-cutting concerns such as security are harder to show on a diagram and are
usually better left to being described in lightweight supplementary documentation.
When it comes to illustrating solutions that address quality attributes, you could add some
text to indicate that particular parts of the software architecture are replicated or clustered to
address scalability and/or availability quality attributes, for example. My context, container
and component diagrams typically don’t show this though, because I’ll save that type of
information for a separate deployment diagram that shows how technology is mapped onto
infrastructure.

My simple advice is that you shouldn’t try to force everything onto a diagram. Lightweight
supplementary documentation is sometimes a better approach to create a narrative that
explains how quality attributes are addressed.

8.9 Diagram scope

The example diagrams I’ve presented in the book have all conveniently fitted on one page.
Of course, in the real-world, you’re likely to have larger and more complicated software
systems. Simply by their very nature, component diagrams are prone to becoming large and
complex. So what do you do when your diagram doesn’t fit on one page?
A simple, yet naive, approach is to use a larger canvas. If you’re working on A4 paper, grab
some A3 paper or stick two sheets of A4 paper together. If you’re working on A3 paper,
try to find some flip chart paper. If you’re working with an electronic drawing tool, simply
increase the page size. The problem with increasing the page size is that it allows you to show
more information, which leads to more clutter. I’ve seen gigantic diagrams on the walls of
organisations that I’ve visited, some of which have been printed on special purpose plotter
machines that accommodate very large paper sizes.
Even with a relatively small software system, it’s tempting to try and include the entire
story on a single diagram. For example, if you have a web application, it seems logical to
create a single component diagram that shows all of the components that make up that web
application. Unless your software system really is that small, you’re likely to run out of room
on the diagram canvas or find it difficult to discover a layout that isn’t cluttered by a myriad
of overlapping lines. Using a larger diagram canvas can sometimes help, but large diagrams
are usually hard to interpret and comprehend because the cognitive load is too high. And if
nobody understands the diagram, nobody is going to look at it.
As an example of this in action, let’s look at what a component diagram for the techtribes.je
web application would look like if you chose to include all of the components.

A component diagram for the techtribes.je web application

This component diagram has been automatically created using some tooling that identifies
components and their dependencies from the code. It’s comprehensive, but it’s a mess! And
it’s difficult to determine whether this mess is caused by the architecture being a mess or the
diagram showing too much information. If we look at this diagram a little more closely, we
can see there are really three things that cause this diagram to be cluttered:

1. We’re showing every web-MVC controller (the components at the top of the diagram),
and therefore every dependency path through the web application.
2. The “ContentSourceComponent” is being used by a large number of other components
(this may or may not be a good thing from an architectural perspective, of course).
3. The “LoggingComponent” is used by nearly every other component.

Removing the “LoggingComponent” will remove some of the clutter, but it doesn’t really help
that much. Increasing the page size won’t help either. Instead, we need a different approach.

A better solution is to split that single complex diagram into a larger number of simpler
diagrams, each with a specific focus around a business area, functional area, functional
grouping, bounded context, use case, user interaction, feature set, etc. For the techtribes.je
web application component diagram, we could do this by creating a single diagram per web-MVC
controller. For example, here’s a component diagram that focusses on the “TweetController”
(the page on the website that shows recent tweets by local people and businesses).

A component diagram for the techtribes.je web application, focussed on the TweetController

The key with this approach is to ensure that each of the separate diagrams tells a different
part of the same overall story, at the same level of abstraction.
9. Aligning the diagrams and the code
One of the biggest problems I see with software architecture diagrams is that they never
quite match the code. Sometimes the diagrams are horribly out of date, perhaps because the
overhead in maintaining them is too high, but at other times the abstractions being shown in
the diagram don’t actually reflect the code. And this is especially prevalent at the component
level.

Abstraction allows us to reduce detail and manage complexity

Let’s imagine that you’ve inherited an undocumented existing codebase, which is somewhere
in the region of two million lines of Java code, perhaps broken up into approximately one
hundred thousand Java classes. And let’s say that you’ve been given the task of creating some
software architecture diagrams to help describe the system to the rest of the team. Where do
you start?
If you have enough time and patience, drawing a class diagram of the codebase is certainly
an option. Although by the time you’ve finished drawing the diagram, it’s likely to be out of
date. Automating this process with a static analysis or diagramming tool isn’t likely to help
matters either. The problem here is there’s too much information to comprehend.
Instead, what we tend to do is look for related groups of classes and draw a
diagram showing those. These related groups of classes are usually referred to as modules,
components, services, layers, packages, namespaces, subsystems, etc. The same can be said
when you’re doing some up front design for a new software system. Although you could
start by sketching out class diagrams, this is probably diving into the detail too quickly.
There are a number of benefits to thinking about a software system in terms of components,
but essentially it allows us to think and talk about the software as a small number of high-
level abstractions rather than the hundreds and thousands of individual classes that make up
most enterprise object-oriented systems. Abstractions help us to reason about a big and/or
complex software system.

9.1 The model-code gap


Although we might refer to things like components when we’re describing a software
system, and indeed many of us consider our software systems to be built from a number
of collaborating components, that structure isn’t usually evident in the code. This is one of
the reasons why there is a disconnect between software architecture and coding as disciplines
- the architecture diagrams on the wall say one thing, but the code says another.
When you open up a codebase, it will often reflect some other structure due to the
organisation of the code. The architectural view of a software system and the code are often
very different. This is sometimes why you’ll see people ignore
architecture diagrams and say, “the code is the only single point of truth”. George Fairbanks
names this the “Model-code gap” in his book titled Just Enough Software Architecture¹.
The premise is that while we think about our software systems as being constructed
of components, modules, services, layers, etc, we don’t have these same concepts in the
programming languages that we use. For example, does Java have a “component” or “layer”
keyword? No, our Java systems are built from a collection of classes and interfaces, typically
organised into a number of packages. It’s this mismatch between architectural concepts and
the code that can hinder our understanding.

Package by layer

Let’s assume that we’re building a web application based upon the Web-MVC pattern.
There are a number of ways that we can organise our source code. Packaging code by
layer is typically the default approach because, after all, that’s what the books, tutorials and
framework samples tell us to do. If you do a search on the web for tutorials related to Spring
MVC or ASP.NET MVC, for example, you’ll likely see this in the example code used. I spent
most of my career building software systems in Java and I too used the same packaging
approach for the majority of the codebases that I worked on.
Here we’re organising code by grouping things of the same type. In other words, you’ll have
a package for domain classes, one for web controllers/views, one for “business services”, one
for data access, another for integration points and so on. I’m using the Java terminology of
a “package” here, but the same is applicable to namespaces (e.g. in C#), folders of files, etc.
¹http://rhinoresearch.com/book

Layers are the primary organisation mechanism for the code. Terms such as “separation of
concerns” are used to justify this approach and layered architectures are generally thought
of as a “good thing”. Need to switch out the data access mechanism? No problem, everything
is in one place. Each layer can also be tested in isolation from the others around it, using
appropriate mocking techniques, etc. The problem with layered architectures is that they
often turn into a big ball of mud² because, in Java anyway, you need to mark your classes
as public for much of this to work. And once you mark classes as public, without discipline,
code in any other layer of your architecture can use them.
Organising a codebase by layer makes it easy to see the overall technical structure of the
software but there are trade-offs. For example, you need to delve inside multiple layers in
order to make a change to a feature, user story or functional unit. Also, many codebases end
up looking eerily similar given the fairly standard approach to layering within enterprise
systems. In Screaming Architecture³, Uncle Bob Martin says that if you’re looking at a
codebase, it should scream something about the business domain. The same should be said
for a software architecture diagram.
²http://www.laputan.org/mud/
³http://blog.8thlight.com/uncle-bob/2011/09/30/Screaming-Architecture.html

Package by feature

Packaging by layer isn’t the only answer though and, instead of organising code by horizontal
slice, “package by feature” seeks to do the opposite by organising code by vertical slice.

Now everything related to a single feature (or feature set) resides in a single place (package,
namespace, folder, etc). You can still have a layered architecture, but the layers reside inside
the feature packages. In other words, horizontal layering is the secondary organisation
mechanism. The often cited benefit to “package by feature” is that it’s “easier to navigate
the codebase when you want to make a change to a feature”, but this is a minor thing given
the power of modern IDEs.
What you can do now though is hide feature-specific classes and keep them out of sight from
the rest of the codebase by marking them as hidden or internal to that package. In Java, you
can do this using the default access modifier (package-private), and with C# you could use
the internal keyword if each feature corresponds to a separate assembly. The big question
though is what happens when that new feature set C needs to access data from features A
and B? Again, in Java, you’ll need to start making classes publicly accessible from outside of
the packages and the big ball of mud will likely again emerge.

Code structure vs architecture

Although there’s nothing particularly wrong with packaging and organising code using
either of these approaches, this code structure often doesn’t quite reflect the abstractions
that we think about when we view the system from an architecture perspective. If you’re
using an object-oriented programming language, do you talk about “objects” when you’re
having architecture discussions? In my experience, the answer is “no”. I typically hear
people referring to concepts such as components and services instead. But what are these
components and services? And where are they in the code?
Often, a “component” that we see on an architecture diagram is actually implemented by a
combination of classes across a number of different layers. To use an example, my component
diagram for the techtribes.je Content Updater shows a “Tweet Component” that provides
a way to store and access tweets in a MongoDB store. The diagram suggests that it’s a
single black box component, but my initial implementation was very different. The following
diagram illustrates why.

For my initial implementation, I’d taken a “package by layer” approach and broken my tweet
component down into a separate service and data access object. This is a great example
of where the code doesn’t quite reflect the architecture - the tweet component is a single
box on an architecture diagram but implemented as a collection of classes across a layered
architecture when you look at the code. The “Tweet Component” only exists as a conceptual
thing in the codebase.

Imagine having a large, complex codebase where the architecture diagrams tell a different
story from the code. The easy way to fix this is to simply redraw the component diagram
to show that it’s really a layered architecture made up of services collaborating with data
access objects. The result is a much more complex diagram, and it feels like that diagram
is starting to show too much detail.

9.2 An architecturally-evident coding style

George’s answer to the model-code gap is simple - we should use an “architecturally-evident
coding style”. In other words, we should drop hints into our codebase so that the code reflects
the architectural intent. In concrete terms, this could be achieved by:

• Naming conventions: If you’re implementing something that you think of as a
component, ensure that this is apparent from the naming. For example, the name of a
class, interface, package, namespace, etc could include the word “component”.
• Packaging conventions: In addition, perhaps you group everything related to a single
component into a single package, namespace, module, folder, etc.
• Machine-readable metadata: Alternatively, why not include machine-readable metadata
in the code so that parts of it can be traced back to the architectural vision? In real
terms, you could use Java Annotations or C# Attributes to signify classes as being
architecturally important.

This all sounds very sensible and relatively easy to do but, in my experience, I rarely see
teams doing this.
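
As a rough illustration of the machine-readable metadata option, here is a minimal Java sketch. The @ArchitecturalComponent annotation, its description attribute and the example types are hypothetical names invented for this illustration; they are not taken from the techtribes.je codebase or from any framework.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical marker annotation: flags a type as architecturally significant.
// RUNTIME retention keeps it visible to reflection-based tooling.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ArchitecturalComponent {
    String description() default "";
}

// Something we think of as a "component" on the diagram.
@ArchitecturalComponent(description = "Stores and retrieves tweets")
interface TweetComponent {
}

// An implementation detail that should not appear on a component diagram.
class TweetDao {
}

class MetadataDemo {

    // True if the type carries the (hypothetical) architectural marker.
    static boolean isComponent(Class<?> type) {
        return type.isAnnotationPresent(ArchitecturalComponent.class);
    }

    public static void main(String[] args) {
        System.out.println(isComponent(TweetComponent.class)); // true
        System.out.println(isComponent(TweetDao.class));       // false
        System.out.println(
            TweetComponent.class.getAnnotation(ArchitecturalComponent.class).description());
    }
}
```

The same hint could just as easily be carried by a naming or packaging convention; the annotation simply makes it explicit and queryable.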

Packaging by component

Back to my “Tweet Component”. What I actually did was to change the code to match the
architectural vision and intent. I reorganised the code to be packaged by component rather
than packaged by layer. In essence, I merged the services and data access objects together into
a single package so that I was left with a public interface and a hidden (package-private)
implementation. The goal in doing this refactoring is increased modularity through an
architecturally-evident coding style.

The basic premise here is that I want my codebase to be made up of a number of coarse-
grained components, with some sort of presentation layer (web UI, desktop UI, API,
standalone app, etc) built on top. A “component” in this sense is a combination of the business
and data access logic related to a specific thing (e.g. domain concept, bounded context or
aggregate from Domain-Driven Design, etc).
If another component wants access to the collection of tweets stored in the MongoDB store,
it is forced to go through the public interface of the Tweet Component. No direct access to
the data access layer is allowed, and you can enforce this if you use Java’s access modifiers
properly. The same can be achieved in C# with the internal keyword and by making every
component a separate assembly. Again, “architectural layering” is a secondary organisation
mechanism.
Here’s what the restructured TweetComponent⁴ looks like.
⁴https://github.com/techtribesje/techtribesje/tree/master/techtribes-core/src/je/techtribes/component/tweet

Each sub-package of je.techtribes.component⁵ houses a separate component, complete with
its own internal layering and configuration. As far as possible, all of the internals are package
scoped. You could potentially pull each component out and put it in its own project or source
code repository to be versioned separately. This approach will likely seem familiar to you
if you’re building something that has a very explicit loosely coupled architecture such as a
distributed system made up of distinct components or microservices.
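
The idea of a public interface in front of hidden internals can be sketched in a few lines of Java. This is an illustration under stated assumptions rather than the actual techtribes.je code: all of the names are hypothetical, an in-memory list stands in for the MongoDB store, and an outer class with private nested classes stands in for a package with package-scoped classes so that the example fits in a single file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in for a component package. In a real codebase this would be a package,
// and the two hidden classes below would be package-scoped rather than private.
final class TweetComponentSketch {

    private TweetComponentSketch() {
    }

    // The only type that code outside the component is meant to use.
    public interface TweetComponent {
        void store(String author, String text);

        List<String> recentTweets(int limit);
    }

    // Factory method, so callers never name the implementation class.
    public static TweetComponent create() {
        return new DefaultTweetComponent(new InMemoryTweetDao());
    }

    // Hidden data access layer (an in-memory list instead of a real data store).
    private static final class InMemoryTweetDao {
        private final List<String> rows = new ArrayList<>();

        void insert(String row) {
            rows.add(row);
        }

        List<String> findAll() {
            return new ArrayList<>(rows);
        }
    }

    // Hidden service layer: the layering lives inside the component.
    private static final class DefaultTweetComponent implements TweetComponent {
        private final InMemoryTweetDao dao;

        DefaultTweetComponent(InMemoryTweetDao dao) {
            this.dao = dao;
        }

        @Override
        public void store(String author, String text) {
            dao.insert(author + ": " + text);
        }

        @Override
        public List<String> recentTweets(int limit) {
            List<String> all = dao.findAll();
            Collections.reverse(all); // newest first
            int count = Math.min(Math.max(limit, 0), all.size());
            return new ArrayList<>(all.subList(0, count));
        }
    }
}
```

Code outside TweetComponentSketch can call create() and use the interface, but it cannot reference InMemoryTweetDao directly; the compiler rejects the attempt, which is the enforcement described above.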

Layers are an implementation detail

I’m fairly confident that most people are still building something more monolithic in nature
though, despite thinking about their system in terms of components. I’ve certainly packaged
parts of monolithic codebases using a similar approach in the past but it’s tended to be fairly
ad hoc. Let’s be honest, organising code into packages isn’t something that gets a lot of brain-
time, particularly given the refactoring tools that we have at our disposal. Organising code
by component lets you explicitly reflect the concept of “a component” from the architecture
into the codebase.
With a package by component approach, the components are the architecturally significant
structural building blocks and layers are now a component implementation detail rather than
being the primary organisation mechanism.

9.3 Diagrams should reflect reality

There’s often very little mapping from the architecture into the code and back again.
Visualising a software architecture can help to create a good shared vision within the team,
which can help it go faster. Having a simple and explicit mapping from the architecture to
the code can help even further, particularly when you start looking at collaborative design
and collective code ownership. Furthermore, it helps bring software architecture firmly back
into the domain of the development team, which is ultimately where it belongs.
⁵https://github.com/techtribesje/techtribesje/tree/master/techtribes-core/src/je/techtribes/component
This isn’t a book about software design but there’s clearly a relationship between the
architecture of a software system and how that architecture is visualised as a collection
of diagrams. The style of architecture you’re using needs to be reflected on your software
architecture diagrams; whether that’s layers, components, microservices or something else
entirely. Structuring your code around components isn’t “the one true way” but if you are
building monolithic software systems and think of them as being made up of a number of
smaller components, ensure that your codebase reflects this. Similarly, if you organise your
code by layer or feature, ensure that this is reflected on the software architecture diagrams.
10. Sketches, diagrams, models and tooling
Now that we’ve created a shared vocabulary and seen how to draw some pictures at varying
levels of abstraction, let’s look at the life cycle of pictures and the various ways we can create
them.

10.1 Sketches

Whether you’re undertaking an up-front design exercise or retrospectively documenting
an existing software system, most people will start with sketches on a piece of paper or
whiteboard. Sketching software architecture diagrams, particularly on a whiteboard, is a
great way to collaborate, exchange ideas and try things out. The tools are simple too; you
simply need a canvas and some coloured marker pens.
To prevent the sketches from morphing into those ad hoc “boxes and lines” diagrams that we
saw right back at the start of the book, I recommend that you take a few minutes to create
your shared vocabulary and agree the diagram types that you want to produce. Be conscious
of the notation too but don’t worry about including all of the detail. In other words, do try
to be precise, but don’t worry too much about the fidelity of the diagrams, especially if the
sketches will have a short lifespan.

10.2 Diagrams

At some point, you’ll probably need to create something more formal than a collection of
sketches on a whiteboard. Why? Perhaps you need to record the diagrams for historical
purposes, or maybe the diagrams need to be included in technical specifications, work/bid
proposals, etc.

Photos

The simplest way to record sketches digitally is to take a photo. This is certainly an option if
you’re not worried about presenting the sketches, but often there’s a need to create a version
of the sketches that looks a little more polished. The other disadvantage of photos is they are
hard to update. Imagine you spot a missing component after taking a photo and clearing the
whiteboard.

Drawing tools

At this point, the default option for many people is to create an electronic version of the
diagram using tooling, and there are many options. The most common is to use a desktop
drawing tool such as Microsoft Visio, OmniGraffle, SimpleDiagrams, etc. There are some
web-based solutions too; including draw.io, Gliffy, Lucidchart, etc. Alternatively you could
use the diagram creation facilities in Microsoft Word, Microsoft PowerPoint, Apple Keynote,
etc. Most of these drawing tools allow you to produce an image-based (e.g. PNG) export so
the diagrams can be embedded in other documents or web pages. Some of the web-based
tools provide direct integration with wikis such as Atlassian Confluence.
These drawing tools allow you to create any type of diagram you like, by dragging elements
onto a canvas and customising the colours, text positioning, line styles, etc. There is a little
up-front work required to recreate the notation you’ve used on your sketches, but after that
it’s a simple matter of copying and pasting elements to create your diagram.
Many of these tools allow you to create templates or stencils that you can use to make
the diagramming process more efficient too. For example, Dennis Laumen¹ has created an
OmniGraffle stencil² that will save you some time. After installing it, you can simply drag
the C4 elements (people, software systems, containers, components, etc) onto your diagram
canvas and change the text (name and description) as needed.
¹https://twitter.com/dennislaumen
²https://stenciltown.omnigroup.com/#stencil=c4

A C4 stencil for OmniGraffle

One problem with this type of tool is consistency, or rather, the lack of it. Once you start
creating multiple diagrams, you need to put some effort into ensuring that your diagram
elements are named and styled consistently across those diagrams. This is easy to do when
you only have a small number of simple diagrams, but the challenge increases with the
number and complexity of your diagrams.
Another problem is that the files created by these drawing tools often aren’t amenable to
being version controlled. It’s not that you can’t add them to a version control system, it’s
more a case of it being tricky to understand the difference between versions, especially if
you have a binary file format.

Text-based diagramming tools

A solution to the version control problem is to use a text-based diagramming tool such as
WebSequenceDiagrams, yUML, nomnoml, PlantUML, etc. These tools allow you to write
text that describes a set of elements and the interactions between them. The diagram is then
created for you.

WebSequenceDiagrams allows you to create diagrams using text

The majority of these tools are UML focussed, which is great if you want to use UML, not so
much otherwise. You do also lose a degree of control over the graphical styling and layout
of the resulting diagrams. The upside, of course, is that it’s easy to create diagrams, at least
simple diagrams anyway. Additionally, as software developers, we often find working with
text much easier than messing around with boxes and lines in a drawing tool.

While text-based diagramming tools relieve some of the burden of manually creating,
styling and moving boxes in a drawing tool, they still don’t necessarily resolve the issue
of consistency. For that you need to move onto modeling tools.

10.3 Models

The software architecture diagrams we’ve discussed so far are simply that … diagrams.
Regardless of whether you’re using pen and paper or a tool like Microsoft Visio, these
diagrams are pictures created by drawing boxes and lines freehand on a diagram canvas. You
have all of the control over what you draw, and with that control comes the responsibility
to ensure the diagram is accurate and consistent. Diagrams are static and we can’t ask them
any questions. Diagrams are purely visual representations.
The other strategy is to create a model of our software system. In contrast to a collection of
diagrams, a model is a non-visual representation or definition of the software system. You
can then create a number of visual representations (diagrams) based upon the content of that
model. Models are also typically machine-readable, so they can be queried or transformed
into other representations too.

Modeling tools

There are many tools that support this way of working; such as No Magic MagicDraw, Sparx
Enterprise Architect, Archi, IBM Rational Software Architect, ArgoUML, etc. There are also
modeling tool plugins/extensions for many of the popular IDEs. Essentially they all follow
the same principle, by providing you with a way to create and populate a model. You then
use this model to create a number of diagrams. Here, a diagram represents a specific view of
the model.
As an example, let’s imagine that we want to create a context diagram. With a diagramming
tool, to draw a software system, we need to create a box and put some text inside it. And
we need to do this for every software system we want to include on the diagram. With a
modeling tool, we create a definition for each software system (e.g. by specifying its name
and description) and then use that element on the diagram by dragging it onto the canvas.
If you need to use the same software system across two diagrams, you just drag the element
onto the second diagram canvas.
The power of having a model starts to come into play when you need to rename that software
system. All you do is rename it in the model and all occurrences of the software system
across all diagrams are renamed too. Compare this to a collection of diagrams where you
need to check every diagram and rename any occurrences that you find. This is how a
model introduces and improves consistency over a collection of static diagrams.
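
The difference between a model and a collection of diagrams can be shown with a small Java sketch. The class names here (SoftwareSystem, DiagramView) and the example system are hypothetical and are not the API of any particular modeling tool; the point is only that each diagram holds a reference to a shared element rather than its own copy of the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class ModelSketch {

    // A model element: defined once, referenced by any number of diagrams.
    static final class SoftwareSystem {
        private String name;
        private final String description;

        SoftwareSystem(String name, String description) {
            this.name = name;
            this.description = description;
        }

        String getName() {
            return name;
        }

        String getDescription() {
            return description;
        }

        void rename(String newName) {
            this.name = newName;
        }
    }

    // A diagram is a view onto the model: it stores references, not copied labels.
    static final class DiagramView {
        private final String title;
        private final List<SoftwareSystem> elements = new ArrayList<>();

        DiagramView(String title) {
            this.title = title;
        }

        String getTitle() {
            return title;
        }

        void add(SoftwareSystem system) {
            elements.add(system);
        }

        // The labels are read from the model every time the view is rendered.
        List<String> labels() {
            return elements.stream()
                    .map(SoftwareSystem::getName)
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        SoftwareSystem crm = new SoftwareSystem("CRM", "Stores customer details");
        DiagramView context = new DiagramView("Context");
        DiagramView containers = new DiagramView("Containers");
        context.add(crm);
        containers.add(crm);

        crm.rename("Customer Hub"); // one change in the model...

        System.out.println(context.getTitle() + ": " + context.labels());       // ...shows up here
        System.out.println(containers.getTitle() + ": " + containers.labels()); // ...and here
    }
}
```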
These tools typically support many different types of models and notations; including the
Unified Modeling Language, SysML, ArchiMate and so on. This is great if you want to use
these languages, otherwise you’re out of luck. The other aspect you can’t ignore is that
you have to create the model, and often it can be a time-consuming task to populate the
model with information. If you’re modelling a software system as part of an up-front design
exercise, the only real option you have is to use the modeling tool’s user interface to populate
the model. This can sometimes require lots of tedious data entry. If you have an existing
codebase though, some modeling tools will provide the option to reverse-engineer the code
and populate the model for you.

Reverse-engineering and static analysis tools

There are many tools available that will help you reverse-engineer a codebase to create a
model. Some of the modeling tools mentioned previously will do this, as will the popular
IDEs. In addition, there’s a category of static analysis tools that will do this too. Examples
include Structure101, NDepend, Lattix, Sonargraph, etc. These tools work by scanning your
codebase for elements and their dependencies. For an object-oriented programming language
such as Java, this usually means creating a model of the code based upon the set of classes,
interfaces and packages along with the dependencies between all of these elements. Although
the primary purpose of static analysis tools is to provide you with information about the quality
of your code, most of them also create some sort of architecture diagrams. This approach also
resolves the thorny issue of how to keep diagrams up to date as a codebase evolves.
If you’ve ever tried to use a static analysis or modelling tool to automatically generate meaningful
diagrams of your codebase via reverse-engineering though, you will have probably been left
frustrated. The resulting diagrams tend to include too much information by default and they
usually show you code-level elements rather than those you would expect to see on a software
architecture diagram. We’ve seen a simple example of this already by trying to automatically
create a UML class diagram from the Spring PetClinic codebase.

Most static analysis tools won’t show you a single UML class diagram of a codebase,
instead they’ll start by showing you the top-level packages/namespaces and the dependencies
between them. Double-click a package to expand it and you’ll be shown the sub-packages
and classes that reside within that package, along with the dependencies between them.
Although some static analysis tools claim to generate “architecture diagrams”, the diagrams
they actually create are still very code focussed. Like us browsing a codebase, these tools see
classes and interfaces in packages/namespaces when reverse-engineering code. Some tools
can be given rules to recognise architectural constructs (e.g. layers or components) but this
isn’t typically the default out-of-the-box experience. In essence, these diagramming tools
also suffer from the model-code gap. And furthermore, everything we need to understand
the software system from an architectural perspective should be in the code.

Why isn’t the architecture in the code?

And this raises an interesting question. If the code is often cited as the “single point of truth”,
why isn’t a description of the architecture in the code? Let’s look at this in the context of the
C4 model.

Level 1: Context

For a given software system, a context diagram shows the key types of user (actors, roles,
personas, etc) and system dependencies. Is it possible to get this information from the code?
The answer is, “not really”.

• Users: I should be able to get a list of user roles from the code. For example, many
software systems will have some security configuration that describes the various
user roles, Active Directory groups, etc and the parts of the system that such users
have access to. The implementation details will differ from codebase to codebase and
technology to technology, but in theory this information is available somewhere in
the absence of an explicit list of user types.
• System dependencies: The list of system dependencies is a little harder to extract from
a codebase. Again, we can scrape security configuration to identify links to systems
such as LDAP and Active Directory. We could also search the codebase for links to
known libraries, APIs and service endpoints (e.g. URLs), and make the assumption
that these are system dependencies. But what about those system interactions that
are done by copying a file to a network share? I know this sounds archaic, but it still
happens. Understanding inbound dependencies is also tricky.
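
To make the idea of searching for service endpoints a little more concrete, here is a minimal sketch in Java. The class name, the choice of URL schemes and the regular expression are assumptions made for illustration; a real scan would need to read files from disk and cope with configuration formats, and it would still miss interactions such as the file copy mentioned above.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DependencyScanner {

    // Matches http(s) and ldap(s) URLs and captures the host name only.
    private static final Pattern ENDPOINT =
            Pattern.compile("\\b(?:https?|ldaps?)://([A-Za-z0-9.-]+)");

    // Returns the distinct hosts mentioned in the text, in order of first appearance.
    // These are only candidate system dependencies; somebody still needs to confirm them.
    static Set<String> candidateHosts(String sourceText) {
        Set<String> hosts = new LinkedHashSet<>();
        Matcher matcher = ENDPOINT.matcher(sourceText);
        while (matcher.find()) {
            hosts.add(matcher.group(1));
        }
        return hosts;
    }

    public static void main(String[] args) {
        String source =
                "String api = \"https://api.example.com/v1/users\";\n"
              + "String directory = \"ldap://directory.internal:389\";\n"
              + "String orders = \"https://api.example.com/v1/orders\";";
        System.out.println(candidateHosts(source));
    }
}
```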

Level 2: Containers

Zooming in on the software system, a container diagram shows the various web applications,
mobile apps, databases, file systems, standalone applications, etc and how they interact to
form the overall software system. Again, some of this information will be present, in one
form or another, in the codebase. For example, you could scrape this information from:

• IDE project files: Information about executable artifacts (and therefore containers)
could in theory be extracted from IntelliJ IDEA project files, Microsoft Visual Studio
solution files, Eclipse workspaces, etc.
• Build scripts: Automated build scripts (e.g. Ant, Maven, Gradle, MSBuild, etc)
typically generate executable artifacts or have module definitions that can again be
used to identify containers.
• Infrastructure provisioning and deployment scripts: Infrastructure provisioning
and deployment scripts (e.g. Puppet, Chef, Vagrant, Docker, etc) will probably result
in deployable units, which can again be identified and this information used to create
the containers model.

Extracting information from such sources is useful if you have a microservices architecture
with hundreds of separate containers but, if you simply have a web application talking to a
database, it may be easier to explicitly define this rather than going to the effort of scraping
it from the code.
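
As a sketch of the build script option, the following Java reads the module names from a Maven multi-module POM using the XML parser that ships with the JDK. The class name and the module names in the example are hypothetical, and treating each module as a candidate container is an assumption that will not hold for every build.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class BuildScriptScraper {

    // Returns the text of every <module> element in the given POM content.
    static List<String> moduleNames(String pomXml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(pomXml.getBytes(StandardCharsets.UTF_8)));
            NodeList modules = document.getElementsByTagName("module");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < modules.getLength(); i++) {
                names.add(modules.item(i).getTextContent().trim());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the POM content", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical multi-module build: each module is a candidate container.
        String pom = "<project><modules>"
                + "<module>core</module>"
                + "<module>web</module>"
                + "<module>updater</module>"
                + "</modules></project>";
        System.out.println(moduleNames(pom));
    }
}
```

A library module such as "core" would probably not be a container in its own right, which is one reason the scraped list still needs a human to curate it.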

Level 3: Components

Zooming in further to a container is the component diagram. Since even a relatively small
software system may consist of a large number of components, this is a level that we certainly
want to extract from the code. But it turns out that even this is tricky. Usually there’s a lack
of an architecturally-evident coding style, which makes it hard to identify components in
the code. This is particularly true in older systems where the codebase lacks modularity and
looks like a sea of thousands of classes interacting with one another. Assuming that there
is some structure to the code, “components” can be extracted using a number of different
approaches, depending on the codebase and the degree to which an architecturally-evident
coding style has been adopted:

• Metadata: The simplest approach is to annotate the architecturally significant elements
in the codebase and extract them automatically. Examples include finding Java
classes with specific annotations or C# classes with specific attributes. These could
be your own annotations or those provided by a framework such as Spring (e.g.
@Controller, @Service, @Repository, etc), Java EE (e.g. @EJB, @MessageDriven, etc)
and so on.
• Naming conventions: If no metadata is present in the code, often a naming convention
will have been consciously or unconsciously adopted that can assist with finding those
architecturally significant code elements. For example, finding all classes where the
name matches “xxxService” or “xxxRepository” may do the trick.
• Packaging conventions: Alternatively, perhaps each sub-package or sub-namespace
(e.g. com.mycompany.myapp.components.xxx) represents a component.
• Module systems: If a module system is being used (e.g. OSGi), perhaps each of the
module bundles represents a component.
• Build scripts: Similarly, build scripts often create separate modules/JARs/DLLs from
a single codebase and perhaps each of these represents a component.
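
To make the first two approaches concrete, here's a minimal sketch in plain Java. It assumes a hypothetical @ArchitecturalComponent marker annotation and a handful of made-up sample classes (none of these names come from a real framework), and it works on a list of classes that has already been loaded rather than scanning a real codebase.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical marker annotation; a real codebase might rely on framework annotations instead.
@Retention(RetentionPolicy.RUNTIME)
@interface ArchitecturalComponent {}

// Hypothetical sample classes standing in for a real codebase.
@ArchitecturalComponent class VetController {}
@ArchitecturalComponent class ClinicService {}
class VetRepository {}   // not annotated, but follows the naming convention
class StringUtils {}     // neither annotated nor conventionally named

class ComponentCandidates {

    // Metadata strategy: keep classes carrying the marker annotation.
    static final Predicate<Class<?>> ANNOTATED =
            type -> type.isAnnotationPresent(ArchitecturalComponent.class);

    // Naming convention strategy: keep classes named xxxService or xxxRepository.
    private static final Pattern NAME = Pattern.compile(".+(Service|Repository)");
    static final Predicate<Class<?>> CONVENTIONALLY_NAMED =
            type -> NAME.matcher(type.getSimpleName()).matches();

    // Returns the simple names of the classes accepted by the given strategy, in input order.
    static List<String> find(List<Class<?>> codebase, Predicate<Class<?>> strategy) {
        return codebase.stream()
                .filter(strategy)
                .map(Class::getSimpleName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Class<?>> codebase = List.of(
                VetController.class, ClinicService.class, VetRepository.class, StringUtils.class);
        System.out.println(find(codebase, ANNOTATED));
        System.out.println(find(codebase, CONVENTIONALLY_NAMED));
        // Strategies can be combined when a codebase mixes both styles.
        System.out.println(find(codebase, ANNOTATED.or(CONVENTIONALLY_NAMED)));
    }
}
```

Expressing each strategy as a predicate keeps them interchangeable, which matters because the right strategy depends on how architecturally-evident the coding style of a given codebase is.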

Auto-generating the software architecture model

Ultimately, I’d like to auto-generate as much of the software architecture model as possible
from the code, but this isn’t currently realistic because most codebases don’t include enough
information about the software architecture to be able to do this effectively. This is true both
at the “big picture” level (context and containers) and the lower level (components). One
solution to this problem is to enrich the information that we can get from the code, with that
which we can’t get from the code.

There have been a number of attempts to create Architecture Description Languages (ADLs)³
that can be used to formally define the architecture of a software system, although my
experience suggests these are rarely used in real-world projects. There are a number of
reasons for this, ranging from typical real-world time and budget pressures through to the
lack of perceived benefit from creating an academic description of a software system that
isn’t reflective of the source code. Unless you’re building software in a safety critical or
regulated environment, the Agile Manifesto statement of valuing working software over
comprehensive documentation holds true.

Extract and supplement

In my own attempt to solve this problem, I’ve created Structurizr⁴ to combine the benefits
you get from creating a model with text and having it kept up to date using static analysis
techniques. Put simply, it’s a way to create a software architecture model as code, and then
have that model visualised by some simple tooling. The goal of Structurizr is to allow people
to create simple, versionable, up-to-date and scalable software architecture models.
Structurizr is a tool of two halves. First is an open source library⁵ that can be used to create a
software architecture model using code. It’s an architecture description language based upon
the C4 model, implemented in Java. Let’s see how we might define a software architecture
model for the Spring PetClinic system. If I was going to draw a context diagram, it would
simply consist of a single type of user (a clinic employee) using the Spring PetClinic system.
With Structurizr, we can represent this in code as follows.

Workspace workspace = new Workspace("Spring PetClinic");
Model model = workspace.getModel();

SoftwareSystem springPetClinic = model.addSoftwareSystem("Spring PetClinic",
    "Allows employees to view and manage information regarding the veterinarians, the clients, and their pets.");

Person clinicEmployee = model.addPerson("Clinic Employee", "An employee of the clinic");

clinicEmployee.uses(springPetClinic, "Uses");

Stepping down to containers, the Spring PetClinic system is made up of a Java web
application that uses a database to store data. Again, we can represent this in code as follows
(I’ve made some assumptions about the technology stack the system is deployed on).
³http://en.wikipedia.org/wiki/Architecture_description_language
⁴https://www.structurizr.com
⁵https://github.com/structurizr/java

Container webApplication = springPetClinic.addContainer("Web Application",
    "Allows employees to view and manage information regarding the veterinarians, the clients, and their pets.",
    "Apache Tomcat 7.x");

Container relationalDatabase = springPetClinic.addContainer("Relational Database",
    "Stores information regarding the veterinarians, the clients, and their pets.", "HSQLDB");

clinicEmployee.uses(webApplication, "Uses", "HTTP");

webApplication.uses(relationalDatabase, "Reads from and writes to", "JDBC, port 9001");

At the next level of abstraction, we need to open up the web application to see the components
inside it. Although we couldn’t really get the two previous levels of abstraction from the
codebase easily, we can get the components. All we need to do is understand what a
“component” means in the context of this codebase. We can then use this information to
help us find and extract them in order to populate the software architecture model.
Spring MVC uses Java annotations (@Controller, @Service and @Repository) to signify
classes as being web controllers, services and repositories respectively. Assuming that we
consider these to be our architecturally significant code elements, it’s then a simple job of
extracting these annotated classes (Spring Beans) from the codebase.

ComponentFinder componentFinder = new ComponentFinder(
    webApplication,
    "org.springframework.samples.petclinic",
    new SpringComponentFinderStrategy(),
    new JavadocComponentFinderStrategy(new File("./src/main/java/"), 150)
);
componentFinder.findComponents();

Built into the SpringComponentFinderStrategy are some rules that automatically collapse
the interface and implementation of a Spring Bean, so the controllers, services and repositories
are treated as “components” rather than a number of separate interfaces and classes.
The dependencies between components are also identified and extracted. In addition, the
JavadocComponentFinderStrategy will parse the class-level Javadoc comment from the
source file for inclusion in the model.
The final thing we need to do is connect the user to the web controllers, and the repositories
to the database. This is easy to do since the software architecture model is represented in
code.

// Connect the user (defined earlier as clinicEmployee) to every web controller.
webApplication.getComponents().stream()
    .filter(c -> c.getTechnology().equals(SpringComponentFinderStrategy.SPRING_MVC_CONTROLLER))
    .forEach(c -> clinicEmployee.uses(c, "Uses"));

// Connect every repository to the database.
webApplication.getComponents().stream()
    .filter(c -> c.getTechnology().equals(SpringComponentFinderStrategy.SPRING_REPOSITORY))
    .forEach(c -> c.uses(relationalDatabase, "Reads from and writes to"));

With the software architecture model in place, we now need to create some views with which
to visualise the model. Again, we can do this using code. First the context diagram, which
includes all people and all software systems.

ViewSet viewSet = workspace.getViews();

SystemContextView contextView = viewSet.createContextView(springPetClinic);
contextView.addAllSoftwareSystems();
contextView.addAllPeople();

Next is the container diagram.

ContainerView containerView = viewSet.createContainerView(springPetClinic);
containerView.addAllPeople();
containerView.addAllContainers();

And finally, the component diagram.

ComponentView componentView = viewSet.createComponentView(webApplication);
componentView.addAllComponents();
componentView.addAllPeople();
componentView.add(relationalDatabase);

There are a few minor details omitted here for brevity (specifically related to styling the
elements), but that’s essentially all the code you need to create a software architecture model
and views for this sample codebase. The full source code for this example can be found on
the Structurizr for Java repository⁶.
⁶https://github.com/structurizr/java/blob/master/structurizr-examples/src/com/structurizr/example/spring/petclinic/SpringPetClinic.java

Visualising the software architecture model

The code we’ve just seen simply creates an in-memory representation of the software
architecture model, in this case as a collection of Java objects. The open source Structurizr for
Java library also includes a way to export this model to an intermediate JSON representation,
which can then be imported into some tooling that is able to visualise it.

structurizr.com⁷ is the other half of the story. It’s a web application that takes a software
architecture model (via an API) and provides a way to visualise it.

For example:

StructurizrClient client = new StructurizrClient("https://api.structurizr.com", "key", "secret");
client.putWorkspace(1, workspace);

Aside from changing the colour, size and position of the boxes, the graphical representation is
relatively fixed. This in turn frees you up from messing around with creating static diagrams
in drawing tools. The result of visualising the Spring PetClinic model, after moving the boxes
around, is something like the following. Here’s the context diagram.
⁷https://www.structurizr.com

Next is the container diagram.

And finally we have the component diagram for the web application.

Structurizr will automatically generate a diagram key too.

The live version of the diagrams can be found at structurizr.com⁸ and they allow you to
double-click a component on the component diagram in order to navigate directly to the
Spring PetClinic code that is hosted on GitHub. This links the software architecture diagrams
with the code.

Alternative visualisations

It’s worth pointing out that Structurizr is my vision of what I want from a simple software
architecture diagramming tool, but you’re free to take the output from the open source library
and create your own tooling to visualise the model.
For example, it’s easy to visualise the hierarchy of elements within the model using something
like D3⁹.
⁸https://www.structurizr.com/public/1
⁹http://d3js.org

Or, if you’re a fan of Graphviz¹⁰, you can export the various views to a DOT file.
¹⁰http://www.graphviz.org
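
As a rough illustration of what a DOT export might involve, here's a minimal sketch that writes a set of relationships as a Graphviz digraph. The DotExport and Relationship types are hypothetical stand-ins of my own, not part of the Structurizr library, and the sketch only relies on basic DOT syntax (a digraph, quoted node names and edge labels).

```java
import java.util.List;

// Hypothetical exporter; it is not part of any existing library.
class DotExport {

    // A directed relationship between two named elements of the model.
    static final class Relationship {
        final String from;
        final String to;
        final String description;

        Relationship(String from, String to, String description) {
            this.from = from;
            this.to = to;
            this.description = description;
        }
    }

    // Renders the relationships as a DOT digraph, one labelled edge per relationship.
    static String toDot(String graphName, List<Relationship> relationships) {
        StringBuilder dot = new StringBuilder();
        dot.append("digraph ").append(quote(graphName)).append(" {\n");
        for (Relationship r : relationships) {
            dot.append("  ").append(quote(r.from))
               .append(" -> ").append(quote(r.to))
               .append(" [label=").append(quote(r.description)).append("];\n");
        }
        dot.append("}\n");
        return dot.toString();
    }

    // Wraps a value in double quotes, escaping any double quotes inside it.
    private static String quote(String value) {
        return "\"" + value.replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.print(toDot("Containers", List.of(
                new Relationship("Clinic Employee", "Web Application", "Uses"),
                new Relationship("Web Application", "Relational Database", "Reads from and writes to"))));
    }
}
```

The resulting text can be saved to a file and passed to the Graphviz tools for layout and rendering.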

Alternatively, you could export the model to XMI (for importing into UML tools), or visualise
it with a desktop app, an IDE plugin, etc. The choice is yours.

Software architecture as code opens opportunities

Having the software architecture model as code opens a number of opportunities for creating
the model (e.g. extracting components automatically from a codebase) and communicating
it (e.g. you can slice and dice the model to produce a number of different views as necessary).
For example, showing all components for a large system will result in a very cluttered
diagram. Instead, you can simply write some code to programmatically create a number
of smaller, simpler diagrams, perhaps one per vertical slice, web controller, user story, etc.
You can also opt to include or exclude any elements as necessary. For example, I typically
exclude logging components because they tend to be used by every other component and
serve no purpose other than to clutter the diagram.
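
Here's a minimal sketch of that kind of slicing, assuming a deliberately simplified model in which components are just names with a list of dependencies. The SlicedViews class, its methods and the sample component names are all hypothetical; this isn't the Structurizr API, only an illustration of creating one smaller view per web controller while excluding a logging component.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// A hypothetical, minimal stand-in for a component model.
class SlicedViews {

    // Maps each component name to the names of the components it uses directly.
    private final Map<String, List<String>> uses = new LinkedHashMap<>();

    void addComponent(String name, String... dependencies) {
        uses.put(name, List.of(dependencies));
    }

    // Builds one view: the starting component plus everything reachable from it,
    // skipping any component whose name is in the excluded set.
    Set<String> viewFor(String start, Set<String> excluded) {
        Set<String> view = new LinkedHashSet<>();
        Deque<String> toVisit = new ArrayDeque<>();
        toVisit.push(start);
        while (!toVisit.isEmpty()) {
            String current = toVisit.pop();
            if (excluded.contains(current) || !view.add(current)) {
                continue; // excluded, or already part of this view
            }
            for (String dependency : uses.getOrDefault(current, List.of())) {
                toVisit.push(dependency);
            }
        }
        return view;
    }

    // Produces one smaller view per web controller (identified here by a naming convention).
    Map<String, Set<String>> viewPerController(Set<String> excluded) {
        Map<String, Set<String>> views = new LinkedHashMap<>();
        for (String name : uses.keySet()) {
            if (name.endsWith("Controller")) {
                views.put(name, viewFor(name, excluded));
            }
        }
        return views;
    }

    // Made-up sample data in which every component uses a logging component.
    static SlicedViews sample() {
        SlicedViews model = new SlicedViews();
        model.addComponent("VetController", "VetService", "Logger");
        model.addComponent("OwnerController", "OwnerService", "Logger");
        model.addComponent("VetService", "VetRepository", "Logger");
        model.addComponent("OwnerService", "OwnerRepository", "Logger");
        model.addComponent("VetRepository", "Logger");
        model.addComponent("OwnerRepository", "Logger");
        model.addComponent("Logger");
        return model;
    }

    public static void main(String[] args) {
        sample().viewPerController(Set.of("Logger"))
                .forEach((controller, view) -> System.out.println(controller + ": " + view));
    }
}
```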
Since the models are code, they are also versionable alongside your codebase and can be
integrated with your automated build system to keep your models up to date. This provides
accurate, up-to-date, living software architecture diagrams that actually reflect the code.

The road ahead

In order to adopt this approach, you do need to consider how your software architecture
reflects your code and vice versa. The model-code gap needs to be as small as possible
so that meaningful diagrams can be automatically generated from the code. And it’s here
that we face two key challenges. First of all, we need to get people thinking about software
architecture once again so that they are able to think about, describe and discuss the various
structures needed to reason about a large and/or complex software system. And secondly,
we need to find a way to get these structures into the codebase using architecturally-evident
coding styles to ensure that the model-code gap is minimised. As an industry, I think we
still have a long way to go but, in time, I hope that the thought of using a drawing tool like
Microsoft Visio for creating software architecture diagrams will seem ridiculous.
11. Other diagrams
The context, container and component diagrams that we’ve seen so far are often sufficient
to describe a software system. However, sometimes it can be useful to draw some additional
diagrams to highlight different aspects.

Understand the static structure first

The primary focus of this book is describing and communicating the static structure of a
software system; from the big picture of how a software system fits into its environment
down to its components and the classes that implement them. Once you have a shared
vocabulary with which to describe the static structure of a software system (at varying levels
of abstraction), it becomes easy to communicate other aspects of that system based upon the
static structure.

The static structure defines the core of the software architecture model

Architectural views

There are a number of different ways to think about, describe and visualise a software system.
Examples include IEEE 1471¹, ISO/IEC/IEEE 42010², Philippe Kruchten’s 4+1 model³, etc.
What they have in common is that they all provide different “views” onto a software system
to describe different aspects of it. For example, there’s often a “logical view”, a “physical
view”, a “development view” and so on.
The big problem I’ve found in the real world with many of these approaches is that it starts
to get confusing very quickly if the whole team isn’t versed in the terminology used. For
example, I’ve heard people argue about what the difference between a “conceptual view”
and a “logical view” is. And let’s not even start asking questions about whether technology
¹http://en.wikipedia.org/wiki/IEEE_1471
²http://en.wikipedia.org/wiki/ISO/IEC_42010
³http://en.wikipedia.org/wiki/4%2B1_architectural_view_model
is permitted in a logical view. Perspective is important too. If I’m a software developer, is
the “development view” about the code, or is that the “implementation view”? But what
about the “physical view”? Code is the physical output, after all. But then “physical view”
means something different to infrastructure architects. But what if the target deployment
environment is virtual rather than physical?
A common theme throughout this book has been about creating a shared vocabulary, and
the same applies when you’re considering which other diagrams to draw. One way to resolve
the terminology issue is to ensure that everybody on the team can point to a clear definition
of what the various diagram types or architectural views are. Just be aware that different
software architecture books often use different names to describe the same architectural view.
This one included, of course.

11.1 Enterprise context

The C4 model provides a static view of a single software system but, in the real world,
software systems never live in isolation. For this reason, and particularly if you manage
a collection of software systems, it’s often useful to understand how all of these software
systems fit together within the bounds of an enterprise. To do this, I’ll simply add another
diagram that sits on top of the C4 diagrams, to show the enterprise context from an IT
perspective. C4 therefore becomes C5, with this extra enterprise context diagram showing:

• The organisational boundary.
• Internal and external users.
• Internal and external systems (including a high-level summary of their responsibilities
and data owned).

Essentially this becomes a high-level map of the software systems at the enterprise level,
with a C4 drill-down for each software system of interest. As a caveat, I do appreciate
that enterprise architecture isn’t simply about technology but, in my experience, many
organisations don’t have an enterprise architecture view of their IT landscape. In fact, it
shocks me how often I come across organisations of all sizes that lack such a holistic view,
especially considering IT is usually a key part of the way they implement business processes
and service customers. Sketching out the enterprise context from a technology perspective
at least provides a way to think outside of the typical silos that form around IT systems.

11.2 User interface mockups and wireframes

Mocking up user interfaces with tools such as Balsamiq⁴ is a fantastic way to understand
what is needed from a software system and to prototype ideas. Such sketches, mockups and
wireframes will provide a view of the software system that is impossible with the C4 model.

11.3 Domain model

Most real-world software systems represent business domains that are non-trivial and, if
this is the case, a diagram summarising the key domain concepts can be a useful addition.
The format I use for these is a UML class diagram where each UML class represents an
entity in the domain. For each entity, I’ll typically include important attributes/properties
and the relationships between entities. A domain model is useful regardless of whether you’re
following a Domain-driven design⁵ approach or not.

11.4 Runtime and behaviour

The majority of this book has concentrated on an approach to thinking about, describing
and communicating software architecture that is focussed around static structure: software
systems, containers, components and classes. Whenever I’ve needed to document a software
system in the past, from a diagramming perspective anyway, most of what I’ve created echoes
this sentiment, with the majority of my diagrams being descriptions of the static structure.
My software architecture sketching workshops also confirm a similar trend. During the initial
iteration (where I provide very little guidance and hopefully don’t influence the outcome), I
estimate that 80-90% of the diagrams I see reflect static structure.
Software isn’t static though, and needs to be executed in order to actually do something
useful. For this reason, it can be useful to create diagrams that illustrate what happens at
runtime for important use cases, user stories or scenarios. To do this, I simply take the concept
of sequence and collaboration diagrams from UML and apply them to the static elements in
the C4 model. For example, you could illustrate how a use case is implemented by drawing a
sequence diagram of how components interact at runtime. Or you could show the interaction
between containers if you have more of a microservices style of architecture, where every
service is deployed in a separate container.
⁴https://balsamiq.com
⁵http://en.wikipedia.org/wiki/Domain-driven_design

Given the number of execution paths through even a relatively small software system
(user stories, success/failure scenarios, conditional logic, asynchronous processing, etc), it’s
obviously not practical to document everything. Especially not at the class or method level.
In fact, if you want to figure out how something works, it’s often easier to just dive into the
code, run an automated test or use a debugger. This assumes that you have a good starting
point and know where to look, of course.
If I think back to the software systems that I’ve documented, and my documentation approach
in general, I very rarely describe the dynamic aspects of a software system. A few examples
where I have done this include:

• Explaining how a low-level design pattern works at the code level (the interactions
between classes).
• Explaining the interactions between applications and services during user authentica-
tion when using a federated security provider (the handshaking between an ASP.NET
application running Windows Identity Foundation against Microsoft Active Directory
Federation Services isn’t straightforward).
• Explaining the typical flow of asynchronous messages that implement a business
process.

While the dynamic aspects of a software system are certainly important, I don’t typically
find that documenting them adds much value. As I said, rather than documenting every
execution path through a software system, I’ll only do this in order to explain the significant
or complex scenarios, especially where they are not evident by reading the code.

Sequence and collaboration diagrams

In terms of how to diagram the dynamic aspects, UML sequence and collaboration diagrams
are typically the way that most people do this, myself included, although I tend to use UML as
a sketching notation rather than precisely following the specification. There are lots of UML
introductions on the web, but essentially both diagram types show the same information,
albeit from a slightly different perspective.
A UML sequence diagram is typically made up of a number of items (left to right) and a
timeline (top to bottom). The diagram illustrates how the various items collaborate (using
horizontal arrows) by sending messages, making requests, etc. The vertical order of the
arrows illustrates the sequencing. You can use sequence diagrams to illustrate any sequence
of items collaborating. Commonly these items are code elements such as classes, but there’s
nothing preventing you showing people, software systems, containers or components. As an
example, here’s a sequence diagram to illustrate how a user of the Spring PetClinic system
gets a list of vets working in the clinic.

This particular example was created using WebSequenceDiagrams⁶, using the following text:

title View list of vets

Clinic Employee->VetController: Requests list of vets from /vets
VetController->ClinicService: Calls findVets
ClinicService->VetRepository: Calls findAll
VetRepository->Relational Database: select * from vets
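
Because the diagram definition is just text, it can also be generated from code. The following is a minimal sketch that assumes only the title line and the "source->destination: message" lines shown above; the SequenceText and Interaction names are my own, invented for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper that renders interactions in the text format shown above.
class SequenceText {

    // One message sent from a source element to a destination element.
    static final class Interaction {
        final String from;
        final String to;
        final String description;

        Interaction(String from, String to, String description) {
            this.from = from;
            this.to = to;
            this.description = description;
        }
    }

    // Renders a title line, a blank line, then one line per interaction in order.
    static String render(String title, List<Interaction> interactions) {
        String body = interactions.stream()
                .map(i -> i.from + "->" + i.to + ": " + i.description)
                .collect(Collectors.joining("\n"));
        return "title " + title + "\n\n" + body;
    }

    public static void main(String[] args) {
        System.out.println(render("View list of vets", List.of(
                new Interaction("Clinic Employee", "VetController", "Requests list of vets from /vets"),
                new Interaction("VetController", "ClinicService", "Calls findVets"),
                new Interaction("ClinicService", "VetRepository", "Calls findAll"),
                new Interaction("VetRepository", "Relational Database", "select * from vets"))));
    }
}
```

The order of the list is the order of the arrows on the diagram, so the sequencing is captured by the data structure itself.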

There are a number of tools and approaches to creating sequence diagrams, but they all create
similar diagrams, with the official UML specification detailing ways to add more precision
and semantics onto the diagrams.
The other approach is a collaboration diagram⁷. Essentially this shows exactly the same
information as a sequence diagram, although that information is presented in a different
way. Typically, this is a simpler “boxes and lines” style diagram where the lines have been
annotated and numbered to indicate the ordering of collaborations. Here’s the same scenario
of a clinic employee requesting a list of vets, this time illustrated using a collaboration
diagram.
⁶http://www.websequencediagrams.com
⁷In UML 2.x, this is called a communication diagram, but the purpose and content is essentially the same.

I prefer a collaboration diagram because it tends to be easier to draw, especially on
a whiteboard, but either diagram works provided you’re not trying to show too many
collaborations.

11.5 Business process and workflow

Related to sequence and collaboration diagrams are process models. Sometimes I want to
summarise a particular business process or user workflow that a software system implements,
without getting into the technicalities of how it’s implemented. A UML activity diagram or
traditional flowchart is a great way to do this.

11.6 Infrastructure

A map of your infrastructure can be a useful thing to capture because of the obvious
relationship between software and infrastructure. There are a number of ways to describe
infrastructure, ranging from infrastructure diagrams in Microsoft Visio through to automated
scripts that manage and provision infrastructure on a cloud provider.

11.7 Deployment

It’s often useful to describe the deployment mapping between containers and infrastructure.
For example, a database-driven website could be deployed onto a single server or across a
server farm made up of hundreds of servers, depending on the need to support scalability,
resilience, security, etc. A deployment diagram would typically show how container instances
(and perhaps component instances inside them) are deployed onto infrastructure.
Even if your deployment is fully automated, it can still sometimes be useful to have a diagram
summarising the deployment mapping.

11.8 And more

The diagrams from the C4 model plus those I’ve listed here are usually enough for me to
adequately describe how a software system is designed, built and works. I try to keep the
number of diagrams I use to do this to a minimum and I advise you to do the same. Some
diagrams can be automatically generated (e.g. an entity relationship diagram for a database
schema) but if you need an A0 sheet of paper to display it, you should consider whether the
diagram is actually useful. Do create more diagrams if you need to describe something that
isn’t listed here and if a particular diagram doesn’t add any value, simply discard it.

11.9 Architectural view models

Something you might be wondering is how the C4 model compares to some of the other
architectural view models, such as Philippe Kruchten’s 4+1 architectural view model⁸ and
the collection offered in Software Systems Architecture⁹ by Eoin Woods and Nick Rozanski.

“4+1” View Model (Philippe Kruchten)

The “4+1” view model consists of five different views that can be used to describe a
software system. The original definition can be found in Philippe’s IEEE paper, Architectural
Blueprints—The “4+1” View Model of Software Architecture¹⁰, although many people have
refined the model since the original paper was published. Some of what you’ll find written
about “4+1” on the Internet doesn’t necessarily match the content of the original paper either,
and many people have subtly redefined the views of the model to better map onto the notation
or methodology they were using at the time (e.g. UML). My summary of “4+1” is as follows:
⁸http://en.wikipedia.org/wiki/4%2B1_architectural_view_model
⁹http://www.viewpoints-and-perspectives.info
¹⁰http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=469759

• Logical View: This view describes the functionality delivered by the system. It’s
usually one or more high-level diagrams that show the major functional building
blocks and how they are related.
• Process View: This view describes how the logical building blocks are combined
together into physical processes (or execution units). It’s used to capture concurrency,
inter-process communication, etc.
• Physical View: This view describes the infrastructure and deployment topology of the
software.
• Development View: This view describes how the functional building blocks in
the logical view are implemented by software developers using modules, components,
classes, layers, etc.
• Scenarios View: A number of selected use cases, or scenarios, are used to drive and
illustrate the content of the four previous views.

Software Systems Architecture (Eoin Woods and Nick Rozanski)

“Software Systems Architecture” by Eoin Woods and Nick Rozanski defines another model
with which to describe a software system. This builds upon the “4+1” view model and presents
a collection of seven viewpoints as follows:

• Context Viewpoint: This describes how the software system fits into the surrounding
environment (i.e. people and other software systems).
• Functional Viewpoint: This is similar to the “Logical View” in the “4+1” model; it
shows the functional building blocks that make up the software system. It was renamed
to make the intent clearer (i.e. you’re looking at the functional building blocks that
make up the software system).
• Information Viewpoint: This describes how the software system stores and uses
information, from a static structural perspective (e.g. entity relationship models, etc),
how that information is used at runtime (e.g. the flow of information through the system,
information state models, etc), how it’s archived, volumetrics and so on. This viewpoint
allows information to be seen as a first-class citizen, rather than a by-product of the
software system.
• Concurrency Viewpoint: This is similar to the “Process View” in the “4+1” model.
• Development Viewpoint: This is similar to the “Development View” in the “4+1”
model.
• Deployment Viewpoint: This is similar to the “Physical View” in the “4+1” model.
• Operational Viewpoint: This viewpoint is used to describe the operational aspects
of the software system, including installation, operation, upgrades, data migration,
configuration management, administration, monitoring, support models, etc.

Minimising the gap between the logical and development views

In my experience, a “Logical View” (or “Functional View”) is typically what people think of
when asked to draw “an architecture diagram”. While that’s a useful view to create, it’s often
confusing, especially for software developers, when that logical view of functional building
blocks doesn’t easily map onto or reflect the code. Also, this split between the “Logical
View” and the “Development View” is often what I see in organisations where there is a
clear separation between “the architects” and “the developers”. This isn’t necessarily a bad
thing, but the transition between the “Logical” and “Development” views doesn’t mandate
a process hand-off between separate architecture and development teams. Process hand-offs
tend to further exaggerate the distance between the “Logical” and “Development” views.
My personal preference is to minimise, and in fact remove, the gap between a logical view of
what the system does and the real-world view of how that functionality is implemented in
code. As we’ve already discussed in previous chapters, this is about minimising the model-
code gap. In essence, the C4 model spans and combines what you might find in a “Logical
View” and “Development View” into a single description of the static structure, across a
number of different levels of abstraction:

• System Context: This shows you what the system does and how it fits into the world
around it (i.e. users and other software systems).
• Containers: This shows how the functionality delivered by the system is partitioned
up across the high-level building blocks (i.e. containers).
• Components: This shows how the functionality delivered by a particular container is
partitioned across components within that container.

You can say that the C4 model also includes some of what you would find in the “4+1 Process
View”, especially given that the container diagram shows execution units. That’s certainly
true, although there still may be occasions when it’s worth creating a specialised version of
a container diagram to highlight concurrency or synchronisation.
12. Appendix A: Financial Risk System
12.1 Background

A global investment bank based in London, New York and Singapore trades (buys and sells)
financial products with other banks (counterparties). When share prices on the stock markets
move up or down, the bank either makes money or loses it. At the end of the working day,
the bank needs to gain a view of how much risk it is exposed to (e.g. of losing money)
by running some calculations on the data held about its trades. The bank has an existing
Trade Data System (TDS) and Reference Data System (RDS) but needs a new Risk System.

Trade Data System

The Trade Data System maintains a store of all trades made by the bank. It is already
configured to generate a file-based XML export of trade data at the close of business (5pm) in
New York. The export includes the following information for every trade made by the bank:

• Trade ID
• Date
• Current trade value in US dollars
• Counterparty ID

Reference Data System

The Reference Data System maintains all of the reference data needed by the bank. This
includes information about counterparties, each of which represents an individual, a bank,
etc. A file-based XML export is also available and includes basic information about each
counterparty. A new organisation-wide reference data system is due for completion in the
next 3 months, with the current system eventually being decommissioned.


12.2 Functional Requirements

The high-level functional requirements for the new Risk System are as follows.

1. Import trade data from the Trade Data System.
2. Import counterparty data from the Reference Data System.
3. Join the two sets of data together, enriching the trade data with information about the
counterparty.
4. For each counterparty, calculate the risk that the bank is exposed to.
5. Generate a report that can be imported into Microsoft Excel containing the risk figures
for all counterparties known by the bank.
6. Distribute the report to the business users before the start of the next trading day (9am)
in Singapore.
7. Provide a way for a subset of the business users to configure and maintain the external
parameters used by the risk calculations.
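
Requirements 3 to 5 can be sketched in a few lines. The following is a minimal sketch rather than a design. The record field names, the use of CSV as the Excel-importable format, and in particular the placeholder risk measure (the sum of absolute trade values scaled by a configurable factor) are all assumptions made for illustration; the requirements do not define how risk is calculated.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List


def enrich_trades(trades: List[dict], counterparties: Dict[str, dict]) -> List[dict]:
    """Requirement 3: attach counterparty information to each trade."""
    enriched = []
    for trade in trades:
        counterparty = counterparties[trade["counterparty_id"]]
        enriched.append({**trade, "counterparty_name": counterparty["name"]})
    return enriched


def risk_by_counterparty(enriched: List[dict], factor: float = 1.0) -> Dict[str, float]:
    """Requirement 4, using a placeholder measure (NOT a real risk model)."""
    totals: Dict[str, float] = defaultdict(float)
    for trade in enriched:
        totals[trade["counterparty_id"]] += abs(trade["value_usd"]) * factor
    return dict(totals)


def to_csv(risk: Dict[str, float]) -> str:
    """Requirement 5: render the figures as CSV text, one row per counterparty."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["counterparty_id", "risk_usd"])
    for counterparty_id in sorted(risk):
        writer.writerow([counterparty_id, f"{risk[counterparty_id]:.2f}"])
    return buffer.getvalue()
```

The factor argument stands in for the externally configurable parameters mentioned in requirement 7.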

12.3 Non-functional Requirements

The non-functional requirements for the new Risk System are as follows.

Performance

• Risk reports must be generated before 9am the following business day in Singapore.
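
It helps to work out how long the batch window actually is. The sketch below measures the time between the trade export at 5pm in New York and the 9am deadline in Singapore. The fixed UTC offsets are assumptions (New York at UTC-5 in winter and UTC-4 in summer, Singapore at UTC+8 all year), and the calendar date is arbitrary; a real implementation would use a time zone database instead.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Singapore is UTC+8 all year round.
SINGAPORE = timezone(timedelta(hours=8))


def batch_window(new_york_utc_offset_hours: int) -> timedelta:
    """Time between the 5pm New York export and 9am the next day in Singapore."""
    new_york = timezone(timedelta(hours=new_york_utc_offset_hours))
    export_ready = datetime(2016, 1, 11, 17, 0, tzinfo=new_york)  # arbitrary date
    deadline = datetime(2016, 1, 12, 9, 0, tzinfo=SINGAPORE)
    return deadline - export_ready
```

Under these assumptions the window is three hours when New York is at UTC-5 and four hours when it is at UTC-4, which is the time available to import, calculate and distribute.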

Scalability

• The system must be able to cope with trade volumes for the next 5 years.
• The Trade Data System export includes approximately 5,000 trades now and it is
anticipated that there will be an additional 10 trades per day.
• The Reference Data System counterparty export includes approximately 20,000
counterparties and growth will be negligible.
• There are 40-50 business users around the world that need access to the report.
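
These figures can be turned into a rough capacity estimate. The sketch below reads "an additional 10 trades per day" as linear growth in the size of the daily export and assumes 260 trading days per year. Both readings are assumptions, and the requirement could reasonably be interpreted differently.

```python
def projected_trades(current: int = 5000,
                     growth_per_day: int = 10,
                     years: int = 5,
                     trading_days_per_year: int = 260) -> int:
    """Estimated number of trades in the export after the given number of years."""
    return current + growth_per_day * trading_days_per_year * years
```

With the defaults this gives 18,000 trades after five years; counting every calendar day instead (365 per year) gives 23,250. Either way the export stays in the tens of thousands of records.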

Availability

• Risk reports should be available to users 24x7, but a small amount of downtime (less
than 30 minutes per day) can be tolerated.

Failover

• Manual failover is sufficient for all system components, provided that the availability
targets can be met.

Security

• This system must follow bank policy that states system access is restricted to
authenticated and authorised users only.
• Reports must only be distributed to authorised users.
• Only a subset of the authorised users are permitted to modify the parameters used in
the risk calculations.
• Although desirable, there are no single sign-on requirements (e.g. integration with
Active Directory, LDAP, etc).
• All access to the system and reports will be within the confines of the bank’s global
network.

Audit

• The following events must be recorded in the system audit logs:
– Report generation.
– Modification of risk calculation parameters.
• It must be possible to understand the input data that was used in calculating risk.

Fault Tolerance and Resilience

• The system should take appropriate steps to recover from an error if possible, but all
errors should be logged.
• Errors preventing a counterparty risk calculation being completed should be logged
and the process should continue.
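
The second requirement amounts to isolating failures per counterparty. The following is a minimal sketch of that idea; the function name, the logger name and the shape of the calculate callback are assumptions made for illustration.

```python
import logging
from typing import Callable, Dict, Iterable, List, Tuple

logger = logging.getLogger("risk")  # hypothetical logger name


def calculate_all(counterparty_ids: Iterable[str],
                  calculate: Callable[[str], float]
                  ) -> Tuple[Dict[str, float], List[str]]:
    """Run the calculation per counterparty, logging failures and carrying on."""
    results: Dict[str, float] = {}
    failed: List[str] = []
    for counterparty_id in counterparty_ids:
        try:
            results[counterparty_id] = calculate(counterparty_id)
        except Exception:
            # Log the error with its traceback, then move to the next counterparty.
            logger.exception("Risk calculation failed for counterparty %s",
                             counterparty_id)
            failed.append(counterparty_id)
    return results, failed
```

Returning the list of failed counterparties alongside the results keeps the failures visible to whatever produces the report, rather than leaving them only in the log.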

Internationalization and Localization

• All user interfaces will be presented in English only.
• All reports will be presented in English only.
• All trading values and risk figures will be presented in US dollars only.

Monitoring and Management

• A Simple Network Management Protocol (SNMP) trap should be sent to the bank’s
Central Monitoring Service in the following circumstances:
– When there is a fatal error with a system component.
– When reports have not been generated before 9am Singapore time.

Data Retention and Archiving

• Input files used in the risk calculation process must be retained for 1 year.

Interoperability

• Interfaces with existing data systems should conform to and use existing data formats.
