DVT 5th Unit
information to the targeted audience, while bearing in mind the task or purpose of the
visualization (exploration, confirmation, or presentation).
Steps in designing visualizations
Creating a visualization involves deciding how to map the data fields to graphical attributes, selecting and
implementing methods for modifying views, and choosing how much data to visualize.
1. Intuitive mappings from data to visualization
2. Selecting and modifying views
3. Information density—when is it too much or too little
4. Keys, labels, and legends
5. Using color with care
6. The importance of aesthetics
Intuitive mapping from data to visualization
To create the most effective visualization for a particular application, it is critical to consider the semantics of
the data and the context of the typical user.
In addition, the more consistent the designer is in predicting the user's expectations, the less chance there will
be for misinterpretation. Intuitive mappings also lead to more rapid interpretation, as translation time is
reduced.
For example, in Figure 13.1, images of planets are used to plot the relationship between the distance from each
planet to the sun and the duration of its orbit.
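The mapping itself can be sketched in code. The following minimal example (all names, pixel sizes, and data values are illustrative, not taken from the text) linearly maps a planet's distance from the sun to a horizontal screen position:

```python
# Distances from the sun (AU) and orbital periods (years); illustrative values.
planets = {
    "Mercury": (0.39, 0.24),
    "Venus":   (0.72, 0.62),
    "Earth":   (1.00, 1.00),
    "Mars":    (1.52, 1.88),
}

def to_screen(value, vmin, vmax, size):
    """Linearly map a data value in [vmin, vmax] to a pixel coordinate in [0, size]."""
    return (value - vmin) / (vmax - vmin) * size

# x position encodes distance; an analogous call would lay out the orbit axis.
xs = {name: to_screen(dist, 0.0, 2.0, 800) for name, (dist, _) in planets.items()}
print(xs["Earth"])  # 400.0: 1 AU lands mid-screen on an 800-pixel axis
```

Because screen position is the most accurately judged graphical attribute, this kind of direct linear mapping usually needs no key beyond labeled axes.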
Mapping spatial data attributes, such as longitude and latitude, to screen position is perhaps the most common
and intuitive mapping found in visualizations.
Some of the earliest visualizations took advantage of the ability of humans to correlate position on the
drawing medium with position in the three-dimensional world.
Color has specific interpretations in fields such as cartography (land use classification) and geology
(stratigraphic layer classification), and thus the application domain for the visualization may dictate the
logical use for the color attribute.
Selecting and Modifying Views
The key to developing an effective visualization is to be able to anticipate the types of views and view
modifications that will be of most use to the typical user, and then provide intuitive controls for setting and
customizing the views.
View modifications fall into a number of categories, and their inclusion as part of the functionality should be
considered based on user priorities.
1. Scrolling and zooming operations are needed if the entire data set cannot be presented at the resolution
desired by the user.
2. Color map control is almost always desirable, minimally supporting a set of different palettes, and preferably
offering the user control of either individual colors or the complete palette.
3. Mapping control allows users to switch between different ways of visualizing the same data. Features of the
data that are hidden in one mapping may stand out in others.
4. Scale control permits the user to modify the range and distribution of values for a particular data field prior
to its mapping. Similarly, data clipping and other forms of filtering allow the user to focus on subsets of the data.
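As a minimal sketch, the data clipping described in item 4 can be thought of as a filter applied before the mapping step (the field name and values here are hypothetical):

```python
def clip_to_range(records, field, lo, hi):
    """Data clipping: keep only records whose field lies in [lo, hi],
    so the user can focus on a subset before it is mapped to graphics."""
    return [r for r in records if lo <= r[field] <= hi]

data = [{"temp": t} for t in (12, 55, 73, 99, 140)]
focused = clip_to_range(data, "temp", 50, 100)
print([r["temp"] for r in focused])  # [55, 73, 99]
```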
Information Density—When Is It Too Much or Too Little?
One of the key decisions one makes when designing a visualization is determining how much information to
display. This gives rise to two extreme situations.
The first, which might be called "gratuitous graphics," occurs when there is very little information to present;
many published graphics convey only one or two values, which a short sentence or small table could communicate just as well.
The other extreme, trying to convey too much information, is also a common problem. Excessive information
content can lead to confusion, intimidation, and difficulties in interpretation on the part of the viewer.
Important information contained within the data can be lost or deemphasized on a cluttered display, and
viewers may have a hard time determining where to focus their attention.
There are many effective solutions to the problem of excessive information content in a visualization.
One method is to give the user the option of disabling or enabling different components of the display. In this
manner, a user can decide which parts are most important to her, and can have the less important information
displayed on demand.
Another solution is to use multiple screens, either as disjoint panes or with partial occlusion. This method
makes better use of screen space, while making each of the individual pieces of data readily available.
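The enable/disable approach can be sketched as a set of toggleable layers (a hypothetical structure for illustration, not the API of any particular toolkit):

```python
class LayerView:
    """Toggleable display components: less important information is
    shown on demand rather than always drawn."""
    def __init__(self):
        self.layers = {}                       # name -> [visible, items]
    def add(self, name, items, visible=True):
        self.layers[name] = [visible, items]
    def toggle(self, name):
        self.layers[name][0] = not self.layers[name][0]
    def render(self):
        # Only visible layers contribute to the final display.
        return [item for visible, items in self.layers.values()
                if visible for item in items]

view = LayerView()
view.add("data points", ["p1", "p2"])
view.add("grid lines", ["g1"], visible=False)   # hidden until requested
print(view.render())   # ['p1', 'p2']
view.toggle("grid lines")                        # user asks for the grid
```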
Keys, Labels, and Legends
Supporting information should begin with a detailed caption indicating the particular data fields being
displayed and the mappings that were used. Additionally, grid or tick marks should be displayed to convey the
ranges of values of interest for numeric fields when absolute judgments are important, and all axes should be
labeled with appropriate units.
If symbols are being used, a key must be provided, either along the border of the display or within a separate
widget.
The use of grid and tick marks can be both a boon and a curse to the visualization. Poor choices of the types of
markings and the density used can occlude the data being displayed and lead to a cluttered appearance.
The actual positions of the markings can also have a bearing on how readily the data is interpreted. Based on
the semantics of the data, certain gaps between markings may make more sense to the user than others.
One final rule of thumb pertains to the use of multiple frames or windows. It is important to follow a
consistent labeling and gridding scheme. Changing the position of labels and keys or the range of values
shown (for the same field) can cause confusion and increase the risk of misinterpretation.
If range changes are necessary (e.g., for views that differ in level of detail), the label, as well as the grid
markings, should convey the change. Similarly, if different color mappings are necessary, the visualizations
must clearly convey this information.
Using color with care
One of the most frequently misused parameters in visualization design is that of color. Selecting the wrong
color map or attempting to convey too much quantitative information through color can lead to ineffective or
misleading visualizations.
Also, since color perception is context-dependent (a particular color will appear quite different, depending on
adjacent colors), the characteristics of the data itself can influence how the colors are perceived.
Guidelines can assist in the effective use of color in visualization.
1. If the visualization task involves absolute judgment, keep the number of distinct values being conveyed small,
since viewers can reliably distinguish only a limited number of levels.
2. Use redundant mappings if possible, e.g., map a particular field to both color and size to improve the chances of
the data being communicated accurately.
3. In creating a color map for conveying numeric information, make sure that both hue and lightness are changed
for each entry.
4. Include a labeled color key to help users interpret the colors
5. When possible, use semantically resonant colors in the visualization; these will be easier for users to learn and
remember.
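Guidelines 1 and 3 can be sketched with the standard-library `colorsys` module: a small, discrete palette in which every entry changes both hue and lightness. The hue range and lightness bounds below are arbitrary choices for illustration:

```python
import colorsys

def discrete_palette(n):
    """n distinct RGB colors varying BOTH hue and lightness, so the map
    still reads correctly in grayscale reproduction. Keeping n small
    supports absolute judgment (guideline 1)."""
    colors = []
    for i in range(n):
        t = i / max(n - 1, 1)
        hue = 0.66 * (1 - t)          # blue (cool/low) toward red (warm/high)
        lightness = 0.25 + 0.5 * t    # dark toward light
        colors.append(colorsys.hls_to_rgb(hue, lightness, 0.9))
    return colors

palette = discrete_palette(5)   # five levels: few enough to name in a key
```

A labeled color key (guideline 4) would then pair each of the five entries with its data range.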
The Importance of Aesthetics
Once we have ensured that our designed visualization conveys the desired information to the user
(function), the final step is to assess the aesthetics (form) of the results. The best visualizations are
both informative and pleasing to the eye. In contrast, a visualization might be so visually
unappealing that it detracts from the communication process. An aesthetically pleasing visualization
invites the viewer to study it in depth.
There are many guidelines for attractive visualization design that can be drawn from the art and
graphic design communities. These include:
Focus. The viewer's focus should be drawn toward the part of the visualization that is most
important. If the important components are not sufficiently emphasized, viewers don't have
sufficient cues for guiding their inspection.
Balance. The screen space should be used effectively, with the most important components in the
center. Emphasis should not be given to any particular border.
Simplicity. Don't try to carry too much information in one display, and don't use graphics gimmicks
simply because they are available (e.g., using 3D Phong shaded histograms when a bar or line chart
could convey the same information). A useful procedure to follow once a visualization has been
designed is to iteratively remove features and measure the loss of information being conveyed.
Features whose removal results in minimal loss can probably be discarded.
Problems in Designing Effective Visualization
These problems have deeper roots, relating to decisions about what to visualize and which method is most
appropriate. They include:
Misleading Visualizations
One of the foremost rules of visualization is that the image should be an accurate depiction of the data, yet
violations of this rule are widespread. These so-called "viz lies" can be found everywhere, from the most
prestigious journals to company portfolios. Here we identify some of the common strategies for creating
misleading visualizations.
Data Scrubbing: Raw data can often be very rough in form, and the temptation when creating
a visualization is to remove some of the roughness. Unfortunately, the selection of which data
to remove is sometimes biased to eliminate data that does not support a particular point the
author is espousing.
Outlier removal is a common tactic in this situation. Unless there is reason to believe that the
outliers resulted from flaws in the data acquisition process, they should not be removed
without informing the viewer and providing the option for the outliers to be displayed.
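One hedged way to honor this rule is to flag outliers for emphasis rather than delete them. The sketch below uses Tukey's 1.5 x IQR convention (one common choice, not the only one) via the standard-library `statistics` module:

```python
import statistics

def flag_outliers(values, k=1.5):
    """Tukey's rule: mark (do not remove) points beyond k*IQR of the
    quartiles, so the viewer can choose to display or hide them."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [(v, v < lo or v > hi) for v in values]

flags = flag_outliers([10, 11, 12, 13, 100])
# 100 is flagged for optional display, not silently scrubbed
```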
Unbalanced scaling: Scaling is a powerful tool in visualization, since careful selection of scale
factors can reveal patterns and structures not visible in unscaled views. However, scaling can
be used to deceive the viewer into believing that a trend is stronger or weaker than is supported
by the data.
For example, in a perspective view, the size of objects in the background is reduced in both width and height,
so their apparent area shrinks much faster than either dimension alone.
Range Distortion: Viewers often have an expectation about the range for a particular data dimension; by setting
this range to be significantly different from that expectation, the viewer may be deceived into misinterpretation.
This is often done by moving an axis so it no longer corresponds with the expected "zero value".
The designer may want to give the user the option of moving this baseline to avoid wasting free space, but it
should be made clear what the baseline is, especially if it departs from the established norm.
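The distortion can be quantified: the ratio of drawn bar heights depends entirely on where the baseline sits. A small sketch with invented numbers:

```python
def visual_exaggeration(values, baseline):
    """Ratio of drawn bar heights when the axis starts at `baseline`
    instead of zero: a truncated axis inflates small differences."""
    heights = [v - baseline for v in values]
    return max(heights) / min(heights)

# Sales of 98 vs. 100 units:
honest = visual_exaggeration([98, 100], 0)    # ~1.02: bars look nearly equal
cut    = visual_exaggeration([98, 100], 96)   # 2.0: one bar looks doubled
```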
Abusing Dimensionality: Errors in interpretation rise with the power of the dimensionality being portrayed. Our
errors in judging volume are much worse than those for area, which in turn are worse than those for length.
Therefore, mapping a scalar value to a graphical attribute such as volume can dramatically increase the
likelihood of erroneous interpretation.
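The compression is easy to see numerically: if a scalar is encoded so that the length, area, or volume of a glyph is proportional to the value, the drawn side length grows only with the corresponding root of the value ratio (a simple illustrative calculation, not from the text):

```python
def drawn_side_ratio(v1, v2, dim):
    """If a value is encoded as length (dim=1), area (dim=2), or volume
    (dim=3) of a glyph, the ratio of drawn side lengths is the dim-th
    root of the true value ratio."""
    return (v1 / v2) ** (1.0 / dim)

print(drawn_side_ratio(2, 1, 1))  # 2.0: a doubled value drawn twice as long
print(drawn_side_ratio(2, 1, 3))  # ~1.26: as a cube, it barely looks bigger
```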
Visual Nonsense – Comparing Apples & Oranges
Visualizations are designed to convey information, and it is important that the information be meaningful.
Visualizations are often created by combining data sets from different sources. However, it is easy to combine
unrelated components into a single visualization and identify what seems to be structure;
For example, plotting stock market values against occurrences of sunspots (see Figure 13.18). In this case,
coincidental relationships can be confused with causal relationships. In deciding what data to combine, it is
important to first ensure that there is some logic in the combination.
One of the problems found in analytic pattern recognition/data mining processes is that these irrelevant
relationships are often discovered and reported, and must then be eliminated by a domain specialist. The
visualization designer should attempt to avoid creating nonsense graphics before they are presented to users.
Another factor that must be considered is compatibility between temporal and spatial ranges for data being
compared.
Thus, for example, one shouldn't compare the sales of a particular product in one year for a particular region
of the country with the sales of the same product for a different region and year, unless one is hypothesizing
that a migration in interest for the product is occurring.
Compatibility in units also needs to be examined in creating a data set for visualization. For example, food
products that are measured in terms of price per volume are often mixed with those measured in price per
weight. An effective visualization of this data might normalize them both to price per serving.
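A sketch of that normalization, where the serving sizes and prices are invented for illustration:

```python
def price_per_serving(price, amount, unit, serving_size):
    """Normalize mixed price-per-volume and price-per-weight data to a
    common price-per-serving basis before they are plotted together."""
    servings = amount / serving_size[unit]
    return price / servings

serving_size = {"ml": 250, "g": 30}   # assumed serving definitions
juice  = price_per_serving(3.00, 1000, "ml", serving_size)  # 0.75 per serving
cereal = price_per_serving(4.50, 450, "g", serving_size)    # 0.30 per serving
```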
Losing Data in the Chart Junk
Including labeled grid or tick marks is important on visualizations that require quantitative assessment.
However, excessive use of such markings is an example of chart junk. Chart junk can be defined as any
supplementary (nondata) graphics in a visualization that are not necessary for the accurate interpretation of the
data.
This additional information can lead not only to visualizations that appear overly complex, but also to
occlusion and deemphasis of the actual data.
Deciding the amount of supplementary graphics to put in a visualization is sometimes a difficult process, since
the designer might not know the needs of all potential users. In some visualization tasks, users can switch
between qualitative overviews and quantitative analysis.
A good rule of thumb is to provide sufficient tools to support the user’s quantitative needs, but with the option
of disabling them or altering their degree of presence in the visualization.
Raw versus Derived data
In some visualizations, it is common practice to throw out all of the raw data and only show
the smooth approximation derived from that data.
This forces the viewer to trust that the approximation is an accurate portrayal of the data,
which is often not the case when the designer blindly applies statistical fitting algorithms.
It is best to show both the raw data and the fitted model, and to allow one or the other to
be deemphasized or filtered out on demand (see Figure 13.19).
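A minimal sketch of keeping both: fit a least-squares line in plain Python and retain the raw points alongside the model, so neither is discarded (data values are invented):

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = a + b*x; the raw (xs, ys) points
    are kept so they can be shown together with the fitted model."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

xs = [0, 1, 2, 3, 4]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
a, b = linear_fit(xs, ys)
# Plot (x, y) as points AND (x, a + b*x) as the line, letting the
# viewer deemphasize either layer on demand.
```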
Yet another form of cleaning the data is the process of resampling, where raw data positioned
either on a sparse grid or randomly are used to create data that are either denser or on a
regularly spaced grid. This can result in a much richer visualization, approaching that of
continuous sampling. But it again deceives the user into believing the data set is much larger
than it actually is.
The denser the resampling, the more likely that the user will misinterpret the data, unless the
phenomenon being observed has little variability.
For example, Figure 13.20 shows the locations of global temperature monitoring stations. Clearly, there are large
voids where no stations exist, so resampling could lead to many wrong conclusions; for instance, the entire
northern part of South America would be interpolated from the readings of four or five stations, with the
conclusion being that the region has dropped in temperature over the past century.
Insufficient sampling is another problem. As the images in Figure 13.21 show, a sampling
that doesn't look at the data characteristics can miss many important features. The left image
is sampled and interpolated uniformly, while the right image uses contour information to
add sample points where significant changes occur.
Absolute versus Relative Judgment
Humans have a fairly limited ability to make absolute judgments of visual stimuli. This
implies that visualizations that depend too heavily on users performing accurate
measurements of graphical attributes such as position, length, and color will result in
problems in interpretation.
One means of combating this human limitation is to design visualizations that either rely on
relative rather than absolute judgment, or that are restricted to only using a small number of
distinct values for each graphical attribute being used to convey information.
Bounding boxes, grids, and tick marks are all excellent tools for converting an absolute
judgment task to one that depends more on relative judgment.
By comparing the length or position of a graphical entity against a quantified structure, users
can more rapidly determine the approximate value relative to the known levels. Using
residuals (e.g., subtracting values from their means) can also change a measurement task to
one of deciding whether a value is above or below a particular level (see Figure 13.22).
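The residual transformation is a one-liner: subtracting the mean turns "estimate this value" into "is this point above or below the line?" (data values invented):

```python
def residuals(values):
    """Subtract the mean so the task becomes a relative judgment
    against a single reference level."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

r = residuals([13, 15, 9, 16, 7])     # mean is 12
above = [v > 0 for v in r]            # [True, True, False, True, False]
```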
Research Directions in Visualization
Visualization is a sufficiently mature field that many groups of researchers are dedicating time to contemplate
and predict the future directions for it.
In this section, we will identify and elaborate upon some of the common themes within these and other research
agenda reports.
Issues of Data
Scale: Perhaps the dominant data issue is sheer size; techniques that work well for thousands of records often
fail, both visually and computationally, when scaled to millions or billions of records, so scalable
representations and algorithms remain a core challenge.
Static versus dynamic: While most visualization techniques to date have been developed with the assumption
that data is static (e.g., in files or databases), there is growing interest in the visual analysis of dynamic
data.
An increasing number of streaming data sources are being studied in the database and data mining
communities, and efforts to perform visual analysis on this type of data are starting to emerge.
The basic concept is that the data is continually arriving, and has to be analyzed in real time, both because of
the urgency with which analysis must be performed, as well as the fact that the volume of data precludes its
permanent storage.
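A common sketch of this constraint is a fixed-size sliding window: old values are discarded as new ones arrive, so only a bounded summary is ever held in memory (window size and values are illustrative):

```python
from collections import deque

class StreamSummary:
    """Sliding-window summary for continually arriving data that
    cannot be stored permanently: only the last `window` values survive."""
    def __init__(self, window=100):
        self.window = deque(maxlen=window)
    def push(self, value):
        self.window.append(value)   # oldest value is dropped automatically
    def mean(self):
        return sum(self.window) / len(self.window)

s = StreamSummary(window=3)
for v in [10, 20, 30, 40]:
    s.push(v)
print(s.mean())   # 30.0: only 20, 30, 40 remain in the window
```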
Spatial versus nonspatial data: A growing number of application areas for visualization include both spatial
and nonspatial data, including many scientific and engineering fields. To provide analysts with a powerful
environment for studying this data, several recent efforts have focused on the integration of the spatial
visualization techniques normally found in scientific visualization with the nonspatial techniques that are
common in information visualization.
Nominal versus ordinal: The graphical attributes to which we map data in our visualizations, such as position,
size, and color, are primarily quantitative in nature, while it is quite common to have data that is not
quantitative, such as the name of a gene or the address of an employee. If this nominal data is to be used in the
visualization, a mapping is needed. However, it is also important to ensure that relationships derived from visual
analysis are truly part of the data, and not an artifact of the mapping.
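A sketch of such a mapping for nominal data: each distinct value gets an arbitrary but stable color, and the code deliberately avoids implying any ordering the data does not have (gene names and palette are invented):

```python
def nominal_color_map(values, palette):
    """Assign one distinct color per nominal value, in first-seen order;
    the assignment is arbitrary, so no ordering artifact is introduced."""
    distinct = list(dict.fromkeys(values))   # dedupe, keep encounter order
    if len(distinct) > len(palette):
        raise ValueError("more categories than distinguishable colors")
    return {v: palette[i] for i, v in enumerate(distinct)}

genes = ["TP53", "BRCA1", "TP53", "EGFR"]
mapping = nominal_color_map(genes, ["red", "green", "blue", "orange"])
# {'TP53': 'red', 'BRCA1': 'green', 'EGFR': 'blue'}
```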
Structured versus nonstructured: Data can be classified based on the degree to which it follows a predictable
structure. For example, tables of numbers would be considered highly structured, while newspaper articles may
be regarded as unstructured. In between, we can have semi-structured data, such as an e-mail message that
contains both a structured component (sender, time, receiver) and an unstructured part (message body).
Time: Time is a special variable (and attribute of data). Dynamic data provides one view of time; a volume
visualization over time deals with a physical representation; and a common interactive visualization uses time
as a control. Spatio-temporal databases, queries, and the visualization of their results are becoming prominent
as more data is made publicly available.
Variable quality: While most visualization systems and techniques assume the data is complete and reliable, in
fact most sources of data do not match these constraints. Often there are missing fields in the data, due to
acquisition problems (e.g., defective sensors, incomplete forms). The quality of the data itself may also be
problematic: out-of-date information can have low certainty associated with it, inaccurate sensors may produce
values with significant variability, and manual entry of data can be error-prone.
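One hedged way to surface such quality issues is to report coverage alongside any derived statistic, rather than silently dropping missing values (field names and readings are invented):

```python
def summarize_with_quality(records, field):
    """Report a statistic together with its coverage, so the viewer can
    judge how much of the data actually supports it."""
    present = [r[field] for r in records if r[field] is not None]
    coverage = len(present) / len(records)
    mean = sum(present) / len(present) if present else None
    return {"mean": mean, "coverage": coverage}

readings = [{"t": 20.0}, {"t": None}, {"t": 22.0}, {"t": 24.0}]
print(summarize_with_quality(readings, "t"))
# mean 22.0 backed by only 75% coverage: worth conveying visually
```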
Issues of Cognition, Perception and reasoning
Many of the foundational concepts in data and information visualization have their roots in our understanding
of human perception, particularly in aspects of selecting effective mappings of data to graphical attributes such
as color and size.
Tasks such as discovering associated patterns of change in the data will involve not only visualization of the
data, but also how the data is changing and how those changes may be associated with other changes taking
place.
These higher-level discoveries can then be used by the analyst to form, confirm, or refute hypotheses, expand
or correct mental models, and provide confidence in decision-making processes.
Beyond decision making, we can also envision the expansion of visualization in the process of human learning.
Different visual learning tools and mechanisms are likely to be needed to address different styles of learning.
In both problem solving and learning activities, we can also imagine using visualizations as a mechanism to
expand and support the memory process, which is critical to both activities.
Issues of System Design
One of the most crucial research challenges in developing visualization tools is determining how best to
integrate computational analysis with interactive visual analysis.
While many visualization systems support a modest number of computational tools, such as clustering,
statistical modeling, and dimension reduction, and similarly many computational analysis systems support some
amount of visualization, such as visualizing the analysis results, there have been no systems developed to date
that provide a truly seamless integration of visual and computational techniques.
Another key problem in visualization system design is the development of powerful new interaction paradigms
to support the user's tasks. Many researchers believe that existing modes of interaction during visual analysis
are ill-suited to the tasks being performed.
This is likely due to the fact that while advances in hardware and visualization techniques have been moving
by leaps and bounds, interaction methods have expanded at a much slower pace.
Another issue is that most visualization systems require an expert user. Visualization in the last few years has
made its appearance in daily newspapers and television, and interactive visualization is common on the web.
These are minimally interactive and often very simple visualizations. Can we develop ones that are engaging
and easy to use? This is becoming critical as we are encountering the era of the democratization of data.
Finally, we are still developing visualizations and visualization systems based on experience, pragmatic
research, and heuristics. We do not yet have a science of visualization. There have been attempts at automating
the process: given data, automatically generate a visualization.
Issues of Evaluation
In the early days of visualization research, rigorous evaluation was rarely performed; the assumption was that
some visualization was better than no visualization for many tasks, and that if a new technique were developed,
it was sufficient to simply place a couple of sets of images side by side and do a qualitative judgment of the
results.
More recently, there have been a large number of concerted efforts to incorporate a more formal evaluation
process into visualization research, not only to enable quantification of improvements as they occur, but also to
validate that visualization has measurable benefits to analysis and decision making (68,321).
While many strategies have been developed and tested, there are many avenues of research toward improving
the overall process. Some unanswered questions include:
• How important are aesthetics in designing visualizations, and how can they be measured?
• How can we use the understanding of human perceptual and cognitive limitations to design and improve
visualizations?
• How do we measure the benefits of visual analysis as compared to more traditional computational analysis?
• What quantitative and qualitative measures of usability are most important for different categories of users
(novices versus experts) and different domains?
• How do we measure the information content, distortion, or loss in a visualization and use this information to
produce more accurate and informative visualizations?
• What are the relative benefits of long, longitudinal studies with a small number of users, versus limited tests
with a large number of subjects?
• What mixture of domain knowledge and visualization knowledge is needed to design and develop effective
tools?
Issues of Hardware
Whenever computer technology advances, the applications that employ this technology must be reassessed to
see how the advances can be leveraged. For visualization, there are several technologies that can and will have
an impact.
Hand-held displays: Most people these days carry with them at least one form of digital display, whether it be
mobile phones, PDAs, portable games, or tablets. While most visualization systems have been designed for
desktop (or larger) displays, there are still significant opportunities to deliver interactive representations of
information and data on these smaller devices.
Examples of potential applications abound. For maintenance of aircraft, ships, and even buildings, having
detailed presentations of wiring and plumbing diagrams, sensor output, and access paths can greatly simplify a
technician's tasks.
For crisis management during an emergency, police, firefighters, medical personnel, and other key players need
interactive real-time access to information presented in a clear, unambiguous fashion. For those who monitor
border crossings, rapid access to risk assessments, cargo manifests, and travel histories can help prevent entry
by unwelcome individuals and material. The key is to develop visual solutions that make effective use of the
limited display space and interactivity options.
Display walls: At the other extreme, large-scale displays, often involving multiple panels stretching 10 to 30 feet
in each direction, are becoming more and more common, not only for control centers, but also for
investigating large data and information spaces.
Rather than simply replicating a desktop environment on such walls, a better solution is to redesign the visual
analysis environment to arrange the displays of different types and different views of information in a way that
supports the analysis. In this way, high-resolution displays can always be visible, rather than windows being
covered as in a typical desktop solution; viewers need only move their heads or shift their focus to see
different views.
Immersive environments. Virtual and augmented reality systems have been frequently used within the
visualization field. Virtual walk-throughs and fly-throughs have been used in a diversity of fields, including
architecture, medicine, and aeronautics.
A key problem with this technology is the need to render the visualizations with minimal latency, which has
spawned significant research in algorithm optimization. While the "killer application" for virtual environments
has yet to be discovered, it will undoubtedly require significant visualization technology.
Google Glass and similar devices providing an augmentation of the user's world are creating a number of
opportunities for improved interactions and visualizations in dealing with the real world and the projected one.
Graphics processing units: The development of special-purpose graphics hardware has actually exceeded the
growth in performance of general-purpose CPUs, primarily driven by the computer game industry.
Due to the architecture of a typical GPU, existing algorithms designed for CPUs do not, in general, port directly
to the GPU, but require a nearly total redesign.
However, as more and more software and hardware engineers become versed in this programming paradigm, it
is likely we will see a growing use of this technology, not only for graphics, but also for complex algorithms in
general.
Interaction devices. Each new device for user interaction with the computer opens up a wide range of
possibilities for use in visualization. Voice/sound input and output have been extensively studied, though they
are rarely an integral component of a visualization system.
Another avenue for development is to examine how different controllers used in modern game consoles could
be employed to support visualization. The popular Wii input wand could be used for specifying actions via
gestures. Other devices, such as head and eye trackers, may have significant potential in the visualization field.
Brain control is another area providing great opportunities for controlling and interacting with visualizations.
Issues of Applications
Many advances in the field of visualization have been driven by the needs of a particular application domain.
These advances are then often generalized to be applicable to many other domains and problems.
Breadth-based innovations: Another direction of research is to broaden the number of applications in which
data and information visualization can be applied. Indeed, it is hard to imagine an area in which visualization
would not be applicable, as all areas of society are experiencing a glut of information, while at the same time,
display devices have become ubiquitous.
In many applications, visual information presentation is rapidly replacing much of the textual communication,
such as weather reports, stock market behavior, health statistics, and so on. Daily schedules are often best
captured in a graphical presentation of an hourly or daily calendar.
Graphs are used to capture complex social networks, organizational charts, process flows, and communication
patterns.
Another critical problem is the conversion of data and information into a format that is amenable to existing
visualization techniques. While many tools now accept input from standard database tables and spreadsheet
files, much data and information is still stored either in proprietary formats or as unstructured files. Concerted
efforts are needed, both in the visualization of unstructured data, as well as in automatic or semi-automatic
conversion of unstructured data into structured forms, to tap into these rich sources of new visualization
applications.