Editor:
Theresa-Marie Rhyne
Background
The September/October 2004 issue of CG&A introduced the term visual analytics to the computer
science literature.1 In 2005, an international team
of multidisciplinary panelists consensually and
collectively defined the then newly established
area as the science of analytical reasoning facilitated by interactive visual interfaces.2 The means
and targets of VA have since evolved and expanded
significantly, covering both scientific and nonscientific data of different types, shapes, sizes, domains, and applications. As extreme-scale datasets
began revolutionizing our daily working life,3 researchers looked to VA for solutions to their big-data problems.
Today's extreme-scale VA applications often combine high-performance computers for computation, high-performance database appliances and/or
Addressing the top challenges has profound, far-reaching implications, not only for fulfilling the
critical science and technical needs but also for facilitating the transfer of solutions to a wider community. We thus evaluate the problems from both
technical and social perspectives. The order of the
following challenges doesn't reflect their relative
importance but rather the content correlation
among individual challenges.
1. In Situ Analysis
The traditional postmortem approach of storing
data on disk and then analyzing the data later
might not be feasible beyond petascale in the near
future. Instead, in situ VA tries to perform as
much analysis as possible while the data are still
in memory. This approach can greatly reduce I/O
Visualization Viewpoints
Human-computer interaction has long been a substantial barrier in many computer science development
areas. Visual analytics (VA) is no exception. Significantly
increasing data size in traditional VA tasks inevitably compounds the existing problems. We identified 10 major
challenges regarding interaction and user interfaces in
extreme-scale VA. The following is a brief revisit of our
study of the topic.1
5. Heterogeneous-Data Fusion
Many extreme-scale data problems are highly heterogeneous. We must pay proper attention to analyzing the interrelationships among heterogeneous data objects or entities.
The challenge is to extract the right amount of semantics
from extreme-scale data and interactively fuse it for VA.
July/August 2012
3. Large-Data Visualization
This challenge focuses primarily on data presentation in VA, which includes visualization tech-
References
1. P.C. Wong, H.-W. Shen, and C. Chen, "Top Ten Interaction Challenges in Extreme-Scale Visual Analytics," Expanding the Frontiers of Visual Analytics and Visualization, J. Dill et al., eds., Springer, 2012, pp. 197–207.
2. S. Ashby et al., The Opportunities and Challenges of Exascale Computing: Summary Report of the Advanced Scientific Computing Advisory Committee (ASCAC) Subcommittee, US Dept. of Energy Office of Science, 2010; http://science.energy.gov/~/media/ascr/ascac/pdf/reports/Exascale_subcommittee_report.pdf.
5. Algorithms
Traditional VA algorithms often weren't designed
with scalability in mind. As a result, many algorithms either are too computationally expensive or can't
produce output clear enough for humans to consume easily. In addition, most algorithms assume a postprocessing model in which
all data are readily available in memory or on a
local disk.
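As a minimal sketch of an alternative to the postprocessing model, the Python fragment below computes summary statistics in a single pass over data that arrives one chunk at a time, so the full dataset never has to reside in memory or on a local disk. The names `RunningStats`, `simulated_chunks`, and `summarize` are hypothetical and introduced only for illustration, and the update rule is Welford's well-known single-pass method rather than a technique taken from any particular VA system.

```python
from typing import Iterable, Iterator, List


class RunningStats:
    """Single-pass mean and variance using Welford's update rule."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        """Fold one value into the running statistics."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Population variance; 0.0 until at least one value is seen."""
        return self._m2 / self.count if self.count else 0.0


def simulated_chunks(n_chunks: int, chunk_size: int) -> Iterator[List[float]]:
    """Hypothetical data source that yields one chunk at a time."""
    for c in range(n_chunks):
        yield [float(c * chunk_size + i) for i in range(chunk_size)]


def summarize(chunks: Iterable[List[float]]) -> RunningStats:
    """Consume chunks as they arrive; each chunk can be discarded afterward."""
    stats = RunningStats()
    for chunk in chunks:
        for value in chunk:
            stats.update(value)
    return stats


if __name__ == "__main__":
    result = summarize(simulated_chunks(n_chunks=4, chunk_size=25))
    print(result.count, result.mean, result.variance)
```

Only three numbers are retained between chunks, which is the property that matters here: memory use stays constant regardless of how much data passes through.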
We must develop algorithms to address both
data-size and visual-efficiency issues. We need to
introduce novel visual representations and user
interaction. Furthermore, user preferences must
be integrated with automatic learning so that the
visualization output is highly adaptable.
When visualization algorithms have an immense search space for control parameters, automatic algorithms that can organize and narrow
the search space will be critical to minimize the
effort of data analytics and exploration.
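One simple way to picture such narrowing is a coarse-to-fine search over a single control parameter. The sketch below is an illustration under stated assumptions, not a method from the literature cited here: `narrow_parameter_range` and `toy_score` are hypothetical names, the parameter is one-dimensional, and the quality measure is assumed to have a single peak inside the starting range.

```python
from typing import Callable, Tuple


def narrow_parameter_range(
    score: Callable[[float], float],
    low: float,
    high: float,
    samples: int = 9,
    rounds: int = 4,
) -> Tuple[float, float]:
    """Shrink [low, high] around the best-scoring sample, round by round."""
    for _ in range(rounds):
        step = (high - low) / (samples - 1)
        candidates = [low + i * step for i in range(samples)]
        best = max(candidates, key=score)
        # Keep one grid step on either side of the best sample,
        # clamped to the current range.
        low, high = max(low, best - step), min(high, best + step)
    return low, high


def toy_score(x: float) -> float:
    """Hypothetical quality measure that peaks at x = 0.3."""
    return -(x - 0.3) ** 2


if __name__ == "__main__":
    print(narrow_parameter_range(toy_score, 0.0, 1.0))
```

Each round evaluates only a handful of parameter values, so the analyst is shown a small, shrinking set of candidates instead of the whole range. A real visualization would replace `toy_score` with a measure of output quality, possibly one learned from user preferences.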
8. Parallelism
To cope with the sheer size of data, parallel processing can effectively reduce the turnaround time
for visual computing and hence enable interactive
data analytics. Future computer architectures will
likely have significantly more cores per processor.
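The basic pattern behind this kind of data parallelism is to split the data, process the pieces independently, and merge the partial results. The sketch below applies it to a histogram, a common building block of visual summaries. The function names are hypothetical, and a thread pool is used only to keep the example self-contained: in CPython, threads show the decomposition but do not speed up pure-Python arithmetic, so a real deployment would more likely use processes, MPI ranks, or GPU kernels.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def partial_histogram(
    values: Sequence[float], low: float, high: float, bins: int
) -> List[int]:
    """Count the values of one data partition into equal-width bins."""
    counts = [0] * bins
    width = (high - low) / bins
    for v in values:
        if low <= v <= high:
            # The upper edge belongs to the last bin.
            idx = min(int((v - low) / width), bins - 1)
            counts[idx] += 1
    return counts


def parallel_histogram(
    values: Sequence[float], low: float, high: float, bins: int, workers: int = 4
) -> List[int]:
    """Split the data, histogram each part concurrently, then merge."""
    chunk = max(1, -(-len(values) // workers))  # ceiling division
    parts = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(
            pool.map(lambda part: partial_histogram(part, low, high, bins), parts)
        )
    # Merging is a cheap element-wise sum of the partial counts.
    totals = [0] * bins
    for counts in partials:
        for i, c in enumerate(counts):
            totals[i] += c
    return totals


if __name__ == "__main__":
    data = [i / 100 for i in range(100)]
    print(parallel_histogram(data, 0.0, 1.0, bins=4))
```

Because the merge step only adds small arrays of counts, the amount of data exchanged between workers is independent of the input size, which is what lets this pattern scale as core counts grow.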
Discussion
The preceding top-10 list echoes challenges described in a 2007 DOE workshop report.6
Compared to that report, our list concentrates particularly on human cognition and user interaction
issues raised by the VA community. It also pays increased attention to database issues found in both
public clouds and private clusters. In addition, it
focuses on both scientific and nonscientific applications with data sizes reaching exabytes and beyond.
Acknowledgments
The article benefited from a discussion with Pat Hanrahan. We thank John Feo, Theresa-Marie Rhyne, and
the anonymous reviewers for their comments. This
research has been supported partly by the US Department of Energy (DOE) Office of Science Advanced
Scientific Computing Research under award 59172,
program manager Lucy Nowell; DOE award DOESC0005036, Battelle Contract 137365; DOE SciDAC
grant DE-FC02-06ER25770; the DOE SciDAC Visualization and Analytics Center for Enabling Technologies; DOE SciDAC grant DE-AC02-06CH11357; US
National Science Foundation grant IIS-1017635; and
the Pfizer Corporation. Battelle Memorial Institute
manages the Pacific Northwest National Laboratory
for the DOE under contract DE-AC06-76R1-1830.
References
1. P.C. Wong and J. Thomas, "Visual Analytics," IEEE Computer Graphics and Applications, vol. 24, no. 5, 2004, pp. 20–21.
2. J.J. Thomas and K.A. Cook, eds., Illuminating the Path: The Research and Development Agenda for Visual Analytics, IEEE CS, 2005.
3. B. Swanson, "The Coming Exaflood," The Wall Street J., 20 Jan. 2007; www.discovery.org/a/3869.
4. P.C. Wong, H.-W. Shen, and C. Chen, "Top Ten Interaction Challenges in Extreme-Scale Visual Analytics," Expanding the Frontiers of Visual Analytics and Visualization, J. Dill et al., eds., Springer, 2012, pp. 197–207.
5. "ASCR Research: Scientific Discovery through Advanced Computing (SciDAC)," US Dept. of Energy, 15 Feb. 2012; http://science.energy.gov/ascr/research/scidac.
6. C. Johnson and R. Ross, Visualization and Knowledge Discovery: Report from the DOE/ASCR Workshop on Visual Analysis and Data Exploration at Extreme Scale, US Dept. of Energy, Oct. 2007; http://science.energy.gov/~/media/ascr/pdf/program-documents/docs/Doe_visualization_report_2007.pdf.
Pak Chung Wong is a project manager and chief scientist
in the Pacific Northwest National Laboratory's Computational and Statistical Analytics Division. Contact him at
[email protected].
Han-Wei Shen is an associate professor in Ohio State University's Computer Science and Engineering Department.
Contact him at [email protected].
Christopher R. Johnson is a Distinguished Professor of
Computer Science and the director of the Scientific Computing and Imaging Institute at the University of Utah. Contact
him at [email protected].
Chaomei Chen is an associate professor in Drexel University's College of Information Science and Technology. Contact
him at [email protected].
Robert B. Ross is a senior fellow of the Computation Institute at the University of Chicago and Argonne National
Laboratory. Contact him at [email protected].
Contact department editor Theresa-Marie Rhyne at
[email protected].
IEEE Computer Graphics and Applications