Lecture 3

Big Data visual analytics

Uploaded by

Jeswaanth Gogula
Copyright
© © All Rights Reserved

Big Data Visual Analytics (CS 661)

Instructor: Soumya Dutta


Department of Computer Science and Engineering
Indian Institute of Technology Kanpur (IITK)
email: [email protected]
Reading Materials for Lectures 1 & 2
• Visualization Analysis and Design by T. Munzner
• Chapter 1, Chapter 2, Chapter 5
• Book available online at IITK Library

Reading Materials for Lecture 3 (Today)


• The Visualization Toolkit by Will Schroeder, Ken Martin, Bill Lorensen
• Chapter 1, Chapter 4, Chapter 5
• Get the pdf: https://vtk.org/vtk-textbook/
• Reference for learning VTK:
• VTK User’s Guide
• Get the pdf: https://vtk.org/vtk-users-guide/
• Examples: https://kitware.github.io/vtk-examples/site/Python/
IITK CS661: Big Data Visual Analytics: Soumya Dutta 2
Announcements
• If you are not yet on the HelloIITK course page, please email me with
your IITK email address ASAP
• If you have not yet formed a group for the assignments, do so by tonight
and update the Google spreadsheet
• I will randomly assign the remaining students to groups of 2


Acknowledgements
• Some of the following slides are adapted from the excellent course
materials and tutorials made available by:
• Prof. Han-Wei Shen (The Ohio State University)
• Prof. Klaus Mueller (State University of New York at Stony Brook)
• David DeMarle (Intel)



Scientific Data Analysis and Visualization


Imaging, Computer Graphics, and Visualization

Image processing
• Study of 2D pictures, or images
• Transform, extract features
• Analyze the data

Computer Graphics
• Process of creating images using a computer
• 2D paint-and-draw
• Sophisticated 3D rendering techniques

Visualization
• Process of exploring, transforming, and viewing data as images and plots
• Gain understanding and insight into the data

“The purpose of visualization is insight, not pictures” – Ben Shneiderman
Scientific Visualization (SciVis)

• Technique for comprehending data and extracting knowledge from
the results of simulations, computations, or measurements
• Field in computer science that encompasses
• data representation and processing algorithms
• visual representations
• user interface
• other sensory presentation such as sound, touch, AR, VR
• Relatively new domain of research (~36 years)
• Formal inception in 1987 by US NSF
• “Visualization in Scientific Computing” by McCormick et al.


Example – Application in Climate Science

https://www.youtube.com/watch?v=8Df96rx3i9g


Example – Application in Asteroid Impact Assessment

https://www.youtube.com/watch?v=95z0qRNFFxs


Big Data in Scientific Visualization
• Big data describes datasets that are so large,
complex, or rapidly changing that they push the
very limits of our analytical capability [1]

Megascale (10^6 FLOPS) (1961)
Gigascale (10^9 FLOPS) (1972)
Terascale (10^12 FLOPS) (1997)
Petascale (10^15 FLOPS) (2009)
Exascale (10^18 FLOPS) (2022)

[Figure: FRONTIER, world’s fastest supercomputer, at Oak Ridge National Laboratory. Credit: OLCF]

[1] NIST Big Data Public Working Group: https://bigdatawg.nist.gov/_uploadfiles/NIST.SP.1500-1.pdf
Big Data in Scientific Visualization

Exa-Star: Simulation of Universe
The target science includes simulations of astrophysical explosions (such as supernovae and neutron stars) to understand the cosmic origin and the fundamental physics.
Targeted to simulate 8192^3 resolution grid with 6 scalar quantities, i.e., 4TB per time step

MFIX-Exa: Studies CLR
Deliver high-fidelity multiphase flow modeling capabilities for applications in reducing CO2 emissions from fossil fuel power plants.
The simulation will use billions of particles to model the physics. A simulation with 200 million particles and 100 time steps will need 160 TB storage.

TURBO: Studies Engine Stall
Enable understanding of rotating stall in Jet engines and develop stall detection as well as stall prevention measures.
Just a single revolution produces 5TB of data and hundreds of revolutions are needed to run.
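Storage figures like the ones above come from simple arithmetic on grid size, number of quantities, and bytes per value. The sketch below is only a back-of-the-envelope estimate: the slide does not state the numeric precision, so the 8-byte double and the binary unit (TiB) are assumptions, and the function name is hypothetical. Under those assumptions a single scalar field on an 8192^3 grid occupies 4 TiB per time step.

```python
# Back-of-the-envelope storage estimate for a regular 3D grid.
# Assumption (not stated on the slide): each value is an 8-byte double.
BYTES_PER_VALUE = 8
TIB = 2 ** 40  # bytes in one tebibyte


def grid_storage_bytes(resolution, num_scalars, bytes_per_value=BYTES_PER_VALUE):
    """Bytes needed to store num_scalars fields on a resolution^3 grid."""
    return resolution ** 3 * num_scalars * bytes_per_value


one_field = grid_storage_bytes(8192, 1)
six_fields = grid_storage_bytes(8192, 6)
print(f"one scalar field : {one_field / TIB:.0f} TiB per time step")
print(f"six scalar fields: {six_fields / TIB:.0f} TiB per time step")
```

Changing `bytes_per_value` to 4 gives the single-precision estimate; either way, one time step is far larger than the memory of a typical workstation.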


Big Data Characterization
• The “5Vs” of Big Data
• Velocity: Rate at which data is generated
• Volume: The extreme size of the data
• Variety: Diverse types of structured and unstructured data
• Veracity: Quality and truthfulness of the data
• Value: Contributes to informed decision making

The 6th V: Visualization!


Visualization Pipeline



Direct and Inverse Mapping



Data Representation & Scientific Data Model


Data Representation
• Scientific data is continuous in nature
• Measurements of physical quantities that are studied by various disciplines
• Mathematically, continuous data is defined as a function

• In practice, such data is sampled in a discrete form in computers for
representation, manipulation, analysis, and visualization
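To make the step from a continuous function to discrete samples concrete, here is a minimal sketch. The field `f` and the helper `sample_uniform` are hypothetical and chosen only for illustration; real scientific data comes from simulations or measurements, not a closed-form expression.

```python
import math


def f(x, y):
    """A hypothetical continuous scalar field, used only for illustration."""
    return math.sin(x) * math.cos(y)


def sample_uniform(func, x_range, y_range, nx, ny):
    """Evaluate func at nx * ny equally spaced points; returns rows of values."""
    (x0, x1), (y0, y1) = x_range, y_range
    dx = (x1 - x0) / (nx - 1)
    dy = (y1 - y0) / (ny - 1)
    return [[func(x0 + i * dx, y0 + j * dy) for i in range(nx)]
            for j in range(ny)]


# The computer only ever stores this finite table of samples.
samples = sample_uniform(f, (0.0, math.pi), (0.0, math.pi), nx=5, ny=4)
print(len(samples), "rows x", len(samples[0]), "columns")
```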


Scientific Dataset
• A scientific dataset is a related collection of data with a spatial context
• In practice, we deal with discrete datasets sampled from a
continuous domain for digital representation


Scientific Dataset: Example

[Figure panels: Continuous domain; Discrete samples; Discrete samples: Grid points; Reconstructed data from grid points; Data values are observed at grid points; Grid points connected to form cells]
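Reconstructing data between grid points means interpolating within the cell that contains the query location. The slide does not name an interpolation scheme, so the sketch below assumes bilinear interpolation on a 2D grid with unit spacing; the function name and the sample values are hypothetical.

```python
def bilinear(values, x, y):
    """Reconstruct a value at (x, y) from samples on a unit-spaced 2D grid.

    values[j][i] holds the sample at grid point (i, j); (x, y) must lie
    inside the grid.
    """
    i = min(int(x), len(values[0]) - 2)  # lower-left corner of the cell
    j = min(int(y), len(values) - 2)
    tx, ty = x - i, y - j                # local cell coordinates in [0, 1]
    bottom = (1 - tx) * values[j][i] + tx * values[j][i + 1]
    top = (1 - tx) * values[j + 1][i] + tx * values[j + 1][i + 1]
    return (1 - ty) * bottom + ty * top


# Samples of the plane v = i + j, observed at the grid points only.
grid = [[0.0, 1.0, 2.0],
        [1.0, 2.0, 3.0]]
print(bilinear(grid, 0.5, 0.5))
```

At a grid point the function returns the stored sample itself; in between, the value blends the four corners of the enclosing cell.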
Scientific Dataset: Various Cell Types

[Figure panels: Linear Cell Types; Non-linear Cell Types]
Demo: Linear Cell
Code: https://kitware.github.io/vtk-examples/site/Python/GeometricObjects/LinearCellDemo/


Scientific Dataset: Attribute Data



Scientific Dataset: Grid Types
Grid types
• Uniform grid
• Rectilinear grid
• Structured grid
• Unstructured grid


Scientific Dataset: Uniform Grid
• Axis-aligned box
• Sample points are equally spaced
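Because the points of a uniform grid are equally spaced, their positions never need to be stored: an origin, a spacing, and the grid dimensions are enough. A minimal sketch follows, assuming a per-axis spacing and an ordering in which the first index varies fastest; the helper names and numbers are hypothetical.

```python
def uniform_grid_point(origin, spacing, index):
    """Position of grid point (i, j, k), computed rather than stored."""
    return tuple(o + s * n for o, s, n in zip(origin, spacing, index))


def flat_index(dims, index):
    """Offset of point (i, j, k) in a flat value array, with i varying fastest."""
    (nx, ny, _), (i, j, k) = dims, index
    return i + nx * (j + ny * k)


origin, spacing, dims = (0.0, 0.0, 0.0), (0.5, 0.5, 2.0), (4, 3, 2)
print(uniform_grid_point(origin, spacing, (3, 2, 1)))
print(flat_index(dims, (3, 2, 1)))
```

Only the data values themselves occupy memory proportional to the number of points; the geometry costs a handful of numbers.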


Scientific Dataset: Rectilinear Grid
• Axis-aligned box
• Sample points are not equally spaced
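A rectilinear grid keeps the axis-aligned layout but stores one coordinate array per axis, so spacing can vary along each axis. The sketch below uses hypothetical coordinate values to show that the geometry costs only the sum of the axis lengths, not their product.

```python
# One coordinate array per axis; spacing may vary along each axis.
x_coords = [0.0, 0.1, 0.3, 0.7, 1.5]
y_coords = [0.0, 1.0, 3.0]
z_coords = [0.0, 2.0]


def rectilinear_point(index):
    """Position of grid point (i, j, k), looked up per axis."""
    i, j, k = index
    return (x_coords[i], y_coords[j], z_coords[k])


num_points = len(x_coords) * len(y_coords) * len(z_coords)
stored_coords = len(x_coords) + len(y_coords) + len(z_coords)
print(rectilinear_point((2, 1, 1)))
print(num_points, "points described by", stored_coords, "stored coordinates")
```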


Scientific Dataset: Structured Grid
• Allows explicit placement of every sample point
• Yet preserves the matrix-like ordering
• Structured grids can be seen as a deformation of
uniform/rectilinear grids
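In a structured grid every point position is stored, but the cells are still implied by the (i, j) ordering. As a rough 2D illustration, the sketch below bends a 3 x 8 grid onto a quarter annulus; the shape, dimensions, and helper name are hypothetical.

```python
import math

nr, ntheta = 3, 8  # grid dimensions (radial, angular)

# Every point is stored explicitly; index i (radial) varies fastest.
points = [
    ((1.0 + i) * math.cos(j * math.pi / (2 * (ntheta - 1))),
     (1.0 + i) * math.sin(j * math.pi / (2 * (ntheta - 1))))
    for j in range(ntheta) for i in range(nr)
]


def cell_corners(i, j):
    """Point indices of quad cell (i, j); derived from the ordering alone."""
    p = i + nr * j
    return (p, p + 1, p + 1 + nr, p + nr)


print(len(points), "explicit points")
print(cell_corners(0, 0))
```

No connectivity list is stored: the neighbors of a point follow from its indices, exactly as in a uniform grid.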


Scientific Dataset: Unstructured Grid
• Allows us to define both the sample points and cells explicitly
• Different cell types can be mixed
• Connectivity is explicitly specified
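An unstructured grid stores both the points and the cell connectivity, which is what allows different cell types to be mixed. The sketch below uses plain Python lists with hypothetical coordinates and type names; it only illustrates the data layout, not any particular library's storage format.

```python
# Explicit point list (2D for brevity).
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (2.0, 0.5)]

# Explicit connectivity: each cell names its type and its point indices.
cells = [
    ("quad", (0, 1, 2, 3)),
    ("triangle", (1, 4, 2)),
]


def cell_coordinates(cell):
    """Resolve a cell's point indices into actual coordinates."""
    _, point_ids = cell
    return [points[p] for p in point_ids]


for cell in cells:
    print(cell[0], cell_coordinates(cell))
```

The quad and the triangle share points 1 and 2, so their common edge is stored once in `points` and referenced twice in `cells`.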


Visualization Toolkit (VTK)

https://vtk.org/


What is VTK?
• VTK: Visualization Toolkit
• An open source, freely available
software library for 3D visualization,
graphics, and image processing
• Support for hundreds of algorithms
• Object-oriented design with wrappers for
different interpreted languages

(Source: vtk.org)


Installing VTK
• https://anaconda.org/conda-forge/vtk
• You can also use pip to install VTK


VTK: Resources
• Examples: https://kitware.github.io/vtk-examples/site/Python/
• VTK User’s Guide: https://vtk.org/vtk-users-guide/
• VTK Textbook: https://vtk.org/vtk-textbook/


VTK System Architecture
Wrapper (Python, Java)
• Binary installation: Tcl/Tk shell, Java interpreter, Python interpreter
• Source code installation: Tcl/Tk source, Java JDK, Python source

C++ core
• Binary installation: libraries and includes (dll and .h files, or .a and .h files)
• Source code installation: all class source code (could take hours to compile)

Binary installation: if you will use the classes to build your application
Source code installation: if you want to extend VTK


VTK classes

(https://vtk.org/doc/nightly/html/classes.html)
VTK Data Types



VTK Pipeline Execution
Source → Filter → Mapper → Actor → Renderer → Render Window
(direction of data flow)


VTK Pipeline Execution: Source
• Source indicates the dataset source
• Involves data loaders for various data types
• Uniform, structured, rectilinear, etc.


VTK Pipeline Execution: Filter
• Filters are another name for algorithms in VTK
• Threshold, connected component, surface
extraction, volume render, etc.


VTK Pipeline Execution: Mapper
• Mappers convert data into graphical primitives
• Mappers require one or more input data objects,
output from Filters
• Example: vtkPolyDataMapper, which takes geometry
such as a cylinder or cone as input and converts it to
renderable geometry


VTK Pipeline Execution: Actor
• Actors represent graphical data or objects with various
properties for rendering
• A VTK actor contains
– object properties (color, shading type, etc.)
– geometry
– transformations
• VTK actors need to work together with lights (vtkLight) and
camera (vtkCamera) to make a scene


VTK Pipeline Execution: Renderer
• vtkRenderer coordinates the rendering process
involving lights, camera, and actors
• vtkRenderer creates a default camera and light
if not present, but needs to have at least one
actor


VTK Pipeline Execution: Render Window
• The class vtkRenderWindow ties the
entire rendering process together
• Manages all the platform-dependent window
management issues and hides the details from
the user


A Simple VTK Program

$> python Pyramid.py

Demo
