Johanna Drucker, Information Visualization
Information visualization
6a Basics of visualization
Information visualizations are a part of everyday communications and
scholarship. These graphics have powerful rhetorical force. The visualiza-
tions are often more easily consumed than the complex research data on
which they depend. Understanding the process by which visualizations are
made helps bring into focus what they show and what they conceal.
All information visualizations are metrics expressed as graphics. The
implications of this simple statement are far ranging. Data can be very dif-
ficult to interpret in tabular form. Very few individuals are skilled at reading
spread sheets, let alone relational databases, to make sense of information.
A query might produce thousands of data points. Information visualizations
are used to make this quantitative data legible. They are particularly useful
for seeing patterns in large amounts of information, making these apparent
in a condensed form.
Anything that can be quantified (given a numerical value) can be turned
into a graph, chart, diagram, or other visualization.
Points, lines, and areas can be plotted using analog tools—paper and
colored pencils—and many of the formats used in digitally produced visu-
alizations are centuries old. The process of making graphs by hand is slow
and deliberate. Each point has to be marked, each line created by connecting
dots or using mathematical formulae, and each area calculated. At each step
of hand-drawing a graph or chart, we reflect on how it is made.
But the ease of production afforded by computational means makes it
possible to create polished and sophisticated graphics without critical reflec-
tion. We can easily overlook the fact that all parts of the process—from
creating quantified information to producing visualizations—are acts of
interpretation. In addition, the ability to read a visualization requires under-
standing the semantics of graphic formats. Visual forms create meaning,
they don’t just display it. A bar chart makes a different statement than a pie
chart, for instance, and such insights are crucial to the critical engagement
with information visualization (Lengler and Eppler 2007).
Benefits and liabilities
To begin, consider the two components of a visualization separately—the
metrics and the graphics. Here are two versions of the same information, a
table and a bar chart:
Figure 6.1a Segment of a table and 6.1b Bar chart generated from the same informa-
tion (JD)
The table is not very complicated: it puts dates in one column and the number
of pages output by an author in a second one. All of the information
in it makes good sense but trying to read columns of numbers to see a
pattern in them is difficult. The chart makes clear that a steady output of
pages occurred in 1972, matched by one spike in 1971, and followed by
low output in 1973. The comparison of values is easily done in the visual
format, and if we imagine extending the table to include hundreds or thou-
sands of data points, this fact would be even more dramatically clear.
What is the relationship of the data to the visualization? In this situation,
a line of dates is charted on the x-axis and a set of values is indicated by the
y-axis. The conventions of charts make this easy to read and even intuitive
in layout. But is there an inherent visual form in the data? One interesting
exercise is to put the same data into other graphical formats to see what
happens. Here are two examples of the same data but in a line chart and a
pie chart.
We are immediately confronted with the question of what features of the
graphical display are meaningful. For instance, the continuous line on the left
graphing the dates suggests that the rate of change in the data about pages
is a significant factor. But the “number of pages” data is actually a discrete
value. While the bar chart compares the values of each segment to each other,
the line chart makes these part of a continuous process, though this is not the
case. By contrast, the pie chart suggests that each entry is part of a whole—
that the sum total of pages is significant, not the difference in their value. The
values are hard to compare, the dates are lost entirely, and the concept of
the “whole” of the author’s output has no meaning. Neither of these charts
makes the correlation of date and page output as clear as the initial bar chart.
These are both “bad” graphics (and possibly bad data as well).
The point is that nothing in the data dictates the form of the visualiza-
tion. These and a host of other charts can be generated from the same data.
Figures 6.2 and 6.3 Other visualizations of the same data in Figure 6.1 (JD)
Any data set can be put into a pie chart, a continuous graph, a scatter plot,
a tree map, and so on. The challenge is to understand how the information
visualization creates an argument and then make use of the graphical format
whose features serve your purpose. Any sense that data have an inherent
visual form is an illusion. [See Exercise #1: A range of graphs.]
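As a minimal sketch of this point, the Python below takes one small data set and derives the quantities that three chart types would actually draw. The year and page values are hypothetical stand-ins (the table in Figure 6.1a is not reproduced here), and the function names are illustrative only.

```python
# Hypothetical data: pages written per year (not the values in Figure 6.1a).
pages_by_year = {1971: 50, 1972: 30, 1973: 10, 1974: 10}


def bar_heights(data):
    """A bar chart draws each value as an independent height."""
    return dict(data)


def pie_shares(data):
    """A pie chart draws each value as a fraction of the total."""
    total = sum(data.values())
    return {key: value / total for key, value in data.items()}


def line_slopes(data):
    """A line chart draws a slope between each pair of neighboring points."""
    keys = sorted(data)
    return {
        (a, b): (data[b] - data[a]) / (b - a)
        for a, b in zip(keys, keys[1:])
    }


if __name__ == "__main__":
    print("bar heights:", bar_heights(pages_by_year))
    print("pie shares: ", pie_shares(pages_by_year))
    print("line slopes:", line_slopes(pages_by_year))
```

The same four numbers yield three different sets of drawn quantities: heights, shares of a total, and rates of change. The total behind the pie and the slopes behind the line are supplied by the chart form, not by the table.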
Data creation, as we noted earlier (see Sections 2a and 2b), depends
on parameterization. As stated before, this means that anything that can be
measured, counted, or given a metric or numerical value can be turned into
data. The concept of parameterization is crucial to visualization because the
ways in which we assign value to the data will have a direct impact on the
ways it can be displayed. Visualizations are convincing by virtue of their
graphic qualities and can easily distort the data. While all visualizations are
interpretations, some are more suited to the structure of a given data set
than others.
Visualization basics
In many cases, the graphic image is an artifact of decisions made about the
design, not of the data. Understanding some basics
of the relation between graphics and metrics is essential.
Here are some fundamental guidelines for thinking about which chart
to use:
• The distinction between discrete and continuous data is one of the most
significant considerations in choosing a design. Example: in visualizing the
height of students in a class, making a continuous graph that connects
the dots makes no sense at all. There is no continuity between the height
of one student and another. Individual height is a discrete value.
• If you are showing change over time or any other variable, then a con-
tinuous graph is the right choice. Example: Change in height for indi-
vidual students over a five-year period.
• If a graph shows quantities with area, use it for percentages of a whole,
as in a pie chart, not for comparative values. If you scale a circle by the
length of its radius, or a square by the length of its side, the area grows
with the square of that length, and you introduce distortion into the
relation of the elements. This is a common error. Example: The population
of a town doubled from ten thousand to twenty thousand in five years. The
data is visualized with two squares on a map, with the second having sides
twice the length of the first (10,000 to 20,000). But the area of the
second square is four times that of the first, not double.
• The way in which you label and order the elements in a chart will make
some arguments more immediately evident. If you want to compare
quantities, be sure they are displayed in proximity. Example: When
comparing the population sizes of states, should you put the states in
alphabetical order or put the data in size order? Which is going to make
the information more legible?
• The use of labels is crucial and their design can either aid or hinder leg-
ibility. Where are the labels? How much work are you adding to your
reader’s experience?
• Another consideration and challenge is the choice of a scale. When
values are relatively close, the scale of the chart can be kept consistent.
Figure 6.5 Classic error in which a value increases numerically but the area increases
geometrically. The quantity on the right is twice that on the left, but the
area is four times as large (JD)
But imagine the charts of date and page outputs in the example above
if in one year the author produced 2000 pages. To show this value,
the scale would need to extend to forty times its current height. The
result would be that the difference between 20 pages and 50 pages
would barely register. The legibility of the graph and patterns would
be altered. To deal with such anomalies, charts are drawn with “bro-
ken” or modified scales, leaving a gap between lower and upper values.
These gaps need to be noted and taken into account in some kind of
legend, labeling, or documentation. [See Exercise #2: Reverse engineer-
ing a visualization.]
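The effect of an outlier on a linear scale, as described in the last guideline above, can be checked with simple arithmetic. The sketch below assumes a chart 400 pixels tall (an arbitrary figure chosen for illustration) and a hypothetical helper, bar_height_px, that maps a value to a bar height.

```python
CHART_HEIGHT_PX = 400  # assumed drawing height, chosen only for illustration


def bar_height_px(value, scale_max, chart_height=CHART_HEIGHT_PX):
    """Map a value to a bar height on a linear scale that starts at zero."""
    return value / scale_max * chart_height


if __name__ == "__main__":
    # Compare a scale that tops out at 50 pages with one stretched to 2000.
    for scale_max in (50, 2000):
        low = bar_height_px(20, scale_max)
        high = bar_height_px(50, scale_max)
        print(f"scale 0-{scale_max}: 20 pages = {low:.0f}px, "
              f"50 pages = {high:.0f}px, gap = {high - low:.0f}px")
```

Under these assumptions, the bars for 20 and 50 pages differ by 240 pixels when the scale tops out at 50, and by 6 pixels once the scale is stretched to 2000.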
Figure 6.6a and 6.6b Charts showing ordering and labels: The first chart makes it
easy to find individuals by name, the second makes it easy to
compare heights and correlate with names (JD)
Figure 6.7 The scale has to stretch to include the height of the outlier and makes it
difficult to compare the differences among the close values in the middle
range. Making a “break” in the scale could allow focus on the area in
which the meaningful information is present (JD)
a format that exaggerated this information. She used the difference in her
data values to set the length of a radius in a circular form, also known as a
polar area diagram, thus distorting the area. (This is similar to the example
of the square, above, but here the area is calculated by the standard formula
A = πr², that is, area = pi times the square of the radius r.) The contrast was dramatic, and
she won her argument.
This kind of exaggeration can be very misleading in any chart that
uses area as a feature of its graphical form. As already noted, when using
graphics that are based on area, such exaggerations are built in. This
distortion is a regular feature of information display on maps, as will be
seen ahead.
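The arithmetic behind this distortion is easy to verify. The sketch below, with hypothetical function names, compares what happens when a value sets the side of a square or the radius of a circle directly with what happens when the area itself is made proportional to the value.

```python
import math


def area_ratio_if_length_scaled(value_ratio):
    """Area ratio when a value sets a side length or radius directly.

    Holds for squares (side**2) and circles (pi * r**2) alike.
    """
    return value_ratio ** 2


def length_ratio_for_true_area(value_ratio):
    """Side or radius ratio that makes area proportional to the value."""
    return math.sqrt(value_ratio)


if __name__ == "__main__":
    ratio = 20_000 / 10_000  # the population example: the value doubles
    print("area ratio if the side is doubled:",
          area_ratio_if_length_scaled(ratio))
    print("side ratio that doubles the area: ",
          round(length_ratio_for_true_area(ratio), 3))

    # The same check with circles, using A = pi * r**2.
    r1, r2 = 1.0, 2.0
    print("circle area ratio for doubled radius:",
          (math.pi * r2 ** 2) / (math.pi * r1 ** 2))
```

Doubling the side or radius quadruples the area. To make the area double along with the value, the side or radius should grow by a factor of about 1.414, the square root of 2.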
Figure 6.8 William Playfair Chart of the National Debt, The Commercial and Political Atlas, 1786 (Public domain)
Takeaway
Information visualizations are metrics expressed as graphics. Information
visualizations allow large amounts of (often complex) data to be depicted
visually in ways that reveal patterns, anomalies, and other features of the
data. No data has an inherent visual form. Any data set can be expressed
in any number of standard formats, but only some of these are appropriate
for the features of the data. Common errors include misuse
of area, continuity, and other graphical qualities. The rhetorical force of
visualization is often misleading. All visualizations are interpretations, not
presentations of fact. Some graphic features of visualizations are artifacts
of the display, not of the data, and can contribute to the reification of
misinformation. Understanding the language of graphics is an art that
combines conceptual insight with design acuity. Still, even a novice can
produce useful graphics with current platforms and tools. The challenge
is to produce graphics that are appropriate to the research task and com-
munication of arguments.
Exercises
Recommended readings
D’Ignazio, Catherine, and Lauren Klein. 2016. “Feminist Data Visualization.” IEEE.
www.academia.edu/28173807/Feminist_Data_Visualization.
Drucker, Johanna. 2011. “Humanities Approaches to Graphical Display.” Digital
Humanities Quarterly. www.digitalhumanities.org/dhq/vol/5/1/000091/000091.html.
Lupi, Giorgia. 2017. “Data Humanism: The Revolutionary Future of Data Visualization.”
PRINT. www.printmag.com/post/data-humanism-future-of-data-visualization.
Properties of networks
Networks exhibit varying degrees of closed-ness and open-ness. Researchers
interested in complex or emergent systems are attentive to the ways bound-
ary conditions are maintained under different circumstances, helping to
define the limits of a system. Social networks are almost never closed, and
like kinship relations or communications, they can quickly escalate to a very
high volume of connections. Epidemiologists trying to track the spread of a
disease are aware that the connections among individuals can grow
exponentially in a very short period of time. Network analysis is an essential
feature of textual studies, particularly of citations and influences. Network
analysis plays a large role in policy and resource allocation as well as in
other kinds of research work.
To reiterate what was already stated, the basic elements of any net-
work are nodes and edges. The degree of agency or activity assigned to
any node and the different attributes that can be assigned to any rela-
tion or edge will be structured into the data model. The data for linked
“nodes” are understood as “source” and “target” (even though these can
be reciprocal and can also be unrelated). Edges are the connections specified
between the nodes.
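A minimal sketch of this data model, assuming a tiny invented edge list: each row names a source and a target, and both the node list and each node's number of connections (its degree) can be derived from the rows. The names below are hypothetical and stand in for whatever entities a project records.

```python
from collections import Counter

# Hypothetical edge list: (source, target) pairs, as in two spreadsheet columns.
edges = [
    ("Ada", "Ben"),
    ("Ada", "Cy"),
    ("Ben", "Cy"),
    ("Cy", "Dee"),
]


def node_list(edge_rows):
    """Collect every name that appears as a source or a target."""
    return sorted({name for row in edge_rows for name in row})


def degrees(edge_rows):
    """Count connections per node, treating each edge as undirected."""
    counts = Counter()
    for source, target in edge_rows:
        counts[source] += 1
        counts[target] += 1
    return counts


if __name__ == "__main__":
    print("nodes:", node_list(edges))
    for name, degree in degrees(edges).most_common():
        print(f"{name}: {degree} connection(s)")
```

Treating edges as undirected is a modeling choice made here for brevity. A project that cares about the direction of a relation would count sources and targets separately.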
For an example of this in action, look at the project Kindred Britain,
which studies connections among about 30,000 British individuals. The project
is meant to show the many ways in which connections form through social
networks, family ties, business, and political circumstances.
Another interesting example looks at the genre of “exchange poems” that
were part of medieval Chinese culture. These had traditionally been char-
acterized by schools and styles. But new research positioned them in social
networks. To paraphrase the work of the project director, Tom Mazanec,
it turns out that the Buddhist monks in the 7th to 10th centuries of the
Tang dynasty were central “nodes” in the network of literary production
(Mazanec 2017). Graphing these has changed the way this form of Chi-
nese poetry is understood and its place in cultural and social life. Relations
between literary forms and social activity that were not noted before were
revealed through the analysis.
Art historians Pamela Fletcher and Anne Helmreich used network analy-
sis to study the London art market, and found surprising insights from sales
records and auction catalogs (Fletcher and Helmreich 2012). Artists and
styles that have not necessarily been seen as important by later art historians
turned out to play a significant role in the markets of the time, even if they
have largely vanished from the canon. [See Exercise #1: Kindred Britain, a
social network project.]
Figure 6.11 Network graph, edgelist, and nodelist (Image courtesy of Nick Schwi-
eterman) (NS)
Complex systems
Systems that follow non-linear processes of development are called complex.
This does not mean complicated. A complex system can be as simple as a
relationship between two people, a person and an environment, or an envi-
ronment and changing conditions (Clemens 2019). What makes it complex
is that the development of the system cannot be predicted—because the pro-
cesses are non-linear and/or non-deterministic from a statistical standpoint.
The conditions in which they emerge continue to change and elements in the
system interact in unpredictable ways. Weather systems are a paradigmatic
example of complex systems, but so are stock markets, political processes,
social relations of all kinds, and cultural activities. Who could have pre-
dicted that a conceptual artist named Marcel Duchamp would confound
the conventions of the Western art world in 1917 by displaying a urinal
upside down in an exhibit? Or that Mao Tse-tung would come to power in
the Chinese Revolution? Or that the presence of the Missions in Australia
would create an opportunity for art practices that were 20,000 years old to
become codified in the medium of paint on oil and board? (Artlandish n.d.).
These are examples of complexity at work. Many—even most—cultural
processes are complex but modeling these requires more than creation of a
data set. This work involves modeling behaviors of agents and conditions
in a system.
Information designers—and artists—have been intrigued with visualiz-
ing complexity. Art exhibitions featuring data aesthetics have become com-
mon (Remondino, Stabellini, and Tamborrini 2018). The result has been a
rich vocabulary of vivid and dynamic information visualizations—as well
as some “eye candy” that may be more seductive than meaningful (Lima
2013). The process of constructing data and formulae for visualizing com-
plexity is more complicated than it is for other visualizations (Yau 2007–
2020). [See Exercise #4: Complexity.]
Advanced network theory pays attention to emergent properties of
systems. The capacity of networks to “self-organize” using very sim-
ple procedures that produce increasingly complex results makes them
useful models for looking at many kinds of behaviors in human and
non-human systems. Networks do not have to be dynamic, but complex
systems almost always are. The study of systems theory and of networks
is relatively recent and only emerged as a distinct field of research in
the last few decades. We might argue, however, that novelists and play-
wrights have been observing social networks for much longer, as have
observers of animal behavior, weather and climate, and the movements
of heavenly bodies held in relation to each other by magnetism, gravity,
and other forces. Most dynamic phenomena are complex systems gov-
erned by non-linear processes.
Takeaway
Networks consist of nodes (entities) and edges (relations). The data
model for a network is a simple three-part formula of entity-relation-
entity. This can be structured in a spreadsheet and exported to create a
network visualization. Networks emphasize relations and connections
of exchange and influence. Refining the relations among nodes beyond
the concept of a single relation is important and so is the change of
relations over time. Social networks change constantly, as do communi-
cation networks, and the relations among the technology that supports
a network and the psychological, social, or affective bonds can alter
independently.
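One way to sketch the entity-relation-entity model as spreadsheet rows is shown here, using Python's standard csv module. The column names (source, relation, target, year) and the rows themselves are assumptions made for illustration; a given network tool may expect different headers.

```python
import csv
import io

# Hypothetical rows: entity, relation, entity, plus a year so change can be tracked.
rows = [
    {"source": "Ada", "relation": "sibling", "target": "Ben", "year": 1850},
    {"source": "Ada", "relation": "business partner", "target": "Cy", "year": 1861},
    {"source": "Ada", "relation": "correspondent", "target": "Cy", "year": 1870},
]


def to_csv(edge_rows):
    """Serialize edge rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["source", "relation", "target", "year"]
    )
    writer.writeheader()
    writer.writerows(edge_rows)
    return buffer.getvalue()


def edges_in_year_range(edge_rows, start, end):
    """Keep only the relations recorded between start and end, inclusive."""
    return [row for row in edge_rows if start <= row["year"] <= end]


if __name__ == "__main__":
    print(to_csv(rows))
    print(edges_in_year_range(rows, 1860, 1875))
```

Recording a relation type and a date on each row allows the same pair of nodes to be linked in more than one way, and allows the network to be filtered by period.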
Exercises
Recommended readings
Grandjean, Martin, and Aaron Mauro. 2015. “A Social Network Analysis of Twit-
ter: Mapping the Digital Humanities Community.” Cogent: Arts and Humanities
3 (1). www.tandfonline.com/doi/full/10.1080/23311983.2016.1171458.
Weingart, Scott. 2011. “Demystifying Networks, Parts I & II.” Journal of Digital
Humanities 1 (1). http://journalofdigitalhumanities.
References cited
Apostol, Tom. 1969. “A Short History of Probability.” In Calculus, Vol. II. John
Wiley & Sons. http://homepages.wmich.edu/~mackey/Teaching/145/probHist.html.
Artlandish. n.d. “Australian Aboriginal Art.” www.aboriginal-art-australia.com/
aboriginal-art-library/the-story-of-aboriginal-art/.
Bhasin, Jasin. 2019. “Graph Analytics—Introduction and Concepts of Centrality.”
Towards Data Science. https://towardsdatascience.com/graph-analytics-introduction-and-concepts-of-centrality-8f5543b55de3.
Clemens, Marshall. 2019. “Visualizing Complex Systems.” New England Complex
Systems Institute. https://necsi.edu/visualizing-complex-systems-science.
Fenton, William. 2015. “Humanizing Maps: An Interview with Johanna Drucker.” PC.
www.pcmag.com/news/humanizing-maps-an-interview-with-johanna-drucker.
Fletcher, Pamela, and Anne Helmreich. 2012. “Local/Global: Mapping Nineteenth-
Century London’s Art Market.” Nineteenth Century Art Worldwide 11 (3).
www.19thc-artworldwide.org/autumn12/fletcher-helmreich-mapping-the-london-
art-market.
Friendly, Michael. 2007. “DataVis.” www.datavis.ca/index.php.
Lengler, Ralph, and Martin J. Eppler. 2007. www.visual-literacy.org/periodic_table/
periodic_table.html.
Lima, Manuel. 2013. Visual Complexity: Mapping Patterns of Information.
New York, NY: Princeton Architectural Press. https://medium.com/@mslima/visualcomplexity-com-ad9a12fa2c1a.
Lupi, Giorgia. 2017. “Dear Data, the Project.” http://giorgialupi.com/dear-data.
Mansky, Jackie. 2018. “W.E.B. Du Bois’s Visionary Infographics Come Together
for the First Time in Color.” Smithsonian Magazine.
www.smithsonianmag.com/history/first-time-together-and-color-book-displays-web-du-bois-visionary-infographics-180970826/.
Mazanec, Tom. 2016–17. “Chinese Exchange Poems.” https://cdh.princeton.edu/projects/chinese-exchange-poems/.
Norman, Jeremy. 2004–2020. “The History of Information.”
www.historyofinformation.com/detail.php?entryid=2929.
Rana, Ashish. 2018. “Getting Started with Network Data Sets.” Towards Data
Science. https://towardsdatascience.com/getting-started-with-network-datasets-92ec54958c07.
Remondino, Chiara L., Barbara Stabellini, and Paolo Tamborrini. 2018. “Exhibition:
Visualizing Complex Systems.” https://systemic-design.net/wp-content/uploads/2019/05/RSD7Exhibition_VisualizingComplexSystems.pdf.
Wild Maths. n.d. https://wild.maths.org/rené-descartes-and-fly-ceiling.
Yau, Nathan. 2007–2020. “Flowing Data Site.” https://flowingdata.com.
Zer-Aviv, Mushon. 2016. “If Everything Is a Network, Nothing Is a Network.”
Visualising Information for Advocacy, visualisingadvocacy.org. https://visualisingadvocacy.org/node/739.html.
Resources
Cytoscape https://cytoscape.org/.
Gephi https://gephi.org/.
Kindred Britain http://kindred.stanford.edu/#.
Network Graphs (Flourish Studio) https://app.flourish.studio/@flourish/network-graph.
Social Network Graphs https://gwu-libraries.github.io/sfm-ui/posts/2017-09-08-sna.