Lecture_5_GridComputing-2014

HPC & BigData
Grid Computing
High Performance Computing Curriculum
UvA-SARA
http://www.hpc.uva.nl/
Outline
• e-Science
• Grid approach
• Grid computing
• Programming models for the Grid
• Grid-middleware
• Web Services
• Open Grid Service Architecture (OGSA)
Doing Science in the 21st Century
• Scientific applications today
– are CPU intensive
– produce and process huge data sets
– require access to expensive, geographically distributed instruments
Online Access to Scientific Instruments
[Figure: Advanced Photon Source; real-time collection; tomographic reconstruction; archival storage; wide-area dissemination; desktop & VR clients with shared controls]
DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago
From the Grid tutorials available at: http://www.globus.org
CPU-Intensive Science: Optimization Problem NUG30
• NUG30 is a quadratic assignment problem (QAP)
– given a set of n locations and n facilities, the goal is to assign each facility to a location at minimum total cost
– there are n! possible assignments
• NUG30 was proposed in 1968 as a test of computer capabilities, but remained unsolved because of its great complexity.

“Nug30 Quadratic Assignment Problem Solved by 1,000”, https://scout.wisc.edu/archives/r7125
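
To make the n! growth concrete, the following is a minimal brute-force sketch of a QAP on a made-up 3-facility instance. The function names and the FLOW and DIST matrices are illustrative assumptions, not the NUG30 data, and the cost function (flow times distance, summed over all pairs) is the standard QAP objective rather than something stated on the slide.

```python
from itertools import permutations
from math import factorial


def qap_cost(assignment, flow, dist):
    """Total cost when facility i is placed at location assignment[i]."""
    n = len(assignment)
    return sum(
        flow[i][j] * dist[assignment[i]][assignment[j]]
        for i in range(n)
        for j in range(n)
    )


def solve_qap_brute_force(flow, dist):
    """Try all n! assignments; only feasible for very small n."""
    n = len(flow)
    best = min(permutations(range(n)), key=lambda p: qap_cost(p, flow, dist))
    return best, qap_cost(best, flow, dist)


# Hypothetical instance: three facilities, three locations on a line.
FLOW = [[0, 5, 2], [5, 0, 3], [2, 3, 0]]
DIST = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

BEST_ASSIGNMENT, BEST_COST = solve_qap_brute_force(FLOW, DIST)

# Size of the search space for n = 30, the NUG30 case.
NUG30_SEARCH_SPACE = factorial(30)
```

For n = 3 the loop visits 6 assignments; for n = 30 the search space is 30!, roughly 2.65 × 10^32, which is why exhaustive enumeration is out of reach and large pools of distributed resources become interesting.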


To solve these problems?
[Figure: three applications (LHC, NUG30, Online Access), each split into an application-specific part and a potential generic part, the management of communication and computing. The generic part rests on Grid Services, which harness multi-domain distributed resources.]
“VL-e project”, UvA
The Grid Problem
• Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources
• Enable communities (“Virtual Organizations”) to share geographically distributed resources as they pursue common goals, assuming the absence of: central location, central control, existing trust relationships.
From the Grid tutorials available at: http://www.globus.org
Some Definitions of the Grid
“A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.” Carl Kesselman & Ian Foster.

“The overall motivation for Grids is to enable the routine interactions of resources geographically and organizationally dispersed to facilitate large-scale science and engineering.” The Vision for a DOE Science Grid, William Johnston, Lawrence Berkeley Nat. Lab.

“Making possible a shared large wide-area computational infrastructure, a concept which has been named the Grid.” Peter Dinda, Georgia Tech, 2001.
The real Grid target
• A Grid is a system that is able to
– Coordinate resources that are not subject to centralized control
– Use standard, open, general-purpose protocols and interfaces
– Deliver nontrivial qualities of service

“Ian Foster’s 3 point checklist”


Coordinated Sharing
• The sharing is controlled by the providers and consumers
– what is shared?
– who is allowed to share?
– under which conditions does sharing occur?

• Sharing relationships
– client-server, peer-to-peer, and brokered
– access control: fine-grained access control, delegation, local/global policies

From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”, Foster et al.
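
The three questions above (what, who, under which conditions) can be sketched as a small policy check. This is only an illustration: the SharingRule class, its fields, and the resource and organization names are assumptions made for the example and do not come from any grid middleware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharingRule:
    resource: str        # what is shared
    virtual_org: str     # who is allowed to share
    max_cpu_hours: int   # condition under which sharing occurs


def is_allowed(rules, resource, virtual_org, cpu_hours):
    """Return True if at least one provider rule permits the request."""
    return any(
        rule.resource == resource
        and rule.virtual_org == virtual_org
        and cpu_hours <= rule.max_cpu_hours
        for rule in rules
    )


# Hypothetical provider policy.
RULES = [
    SharingRule("cluster-a", "physics-vo", 100),
    SharingRule("storage-b", "bio-vo", 10),
]
```

A request for 50 CPU hours on cluster-a from physics-vo passes, while the same request from bio-vo, or a request for 500 CPU hours, is refused. A real deployment would combine local and global policies and handle delegation, which this sketch leaves out.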
What is Grid Computing
• Grid computing is the use of hundreds, thousands, or millions of geographically and organizationally dispersed and diverse resources to solve:
→ problems that require more computing power than is available from a single machine or from a local area distributed system
Potential Grid Application
• An application that requires a grid solution is likely to be distributed (Distributed Computing) and to fit one of the following paradigms:
– High Throughput Computing
– High Performance Computing

Grid computing will be mainly needed for large-scale, high-performance computing.
Distributed Computing
• Distributed computing is a programming model in which processing occurs in many geographically distributed places.
– Processing can occur wherever it makes the most sense, whether that is on a server, Web site, personal computer, etc.

• Distributed computing and grid computing either
– overlap, or distributed computing is a subset of grid computing
From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”, Foster et al.
High Throughput Computing
• HTC employs large amounts of computing power for very lengthy periods
– HTC is needed for doing sensitivity analyses, parametric studies or simulations to establish statistical confidence.
• The features of HTC are
– Availability of computing power for a long period of time
– Efficient fault tolerance mechanism
• The key to HTC in grids
– Efficiently harness the use of all available resources across organizations
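
A parametric study with a simple fault tolerance mechanism can be sketched as below. This is a toy under stated assumptions: a local thread pool stands in for grid workers, TransientFailure stands in for a lost or pre-empted job, and flaky_square is a made-up task that fails on its first attempt for odd parameters.

```python
from concurrent.futures import ThreadPoolExecutor


class TransientFailure(Exception):
    """Stands in for a lost worker or a pre-empted job."""


def run_with_retry(task, param, max_attempts=3):
    """Re-run a failed task up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task(param, attempt)
        except TransientFailure:
            if attempt == max_attempts:
                raise


def flaky_square(param, attempt):
    # Hypothetical task: odd parameters fail on their first attempt.
    if param % 2 == 1 and attempt == 1:
        raise TransientFailure(f"worker lost while computing {param}")
    return param * param


def parameter_sweep(task, params, workers=4):
    """Run one independent task per parameter value, results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: run_with_retry(task, p), params))


RESULTS = parameter_sweep(flaky_square, range(6))
```

Because the tasks are independent, throughput depends on how many workers stay busy over time rather than on how fast any single task finishes, which is the HTC concern named above.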
High Performance Computing
• HPC brings enormous amounts of computing power to bear over relatively short periods of time.
– HPC is needed for decision support or applications under sharp time constraints, such as weather modeling
• HPC applications
– are large in scale and complex in structure
– have real-time requirements
– ultimately must run on more than one type of HPC system
HPC/HTC requirements
• HPC/HTC requires a balance of computation and communication among all resources involved.
– Managing computation,
– communication,
– data locality
Programming Model for the Grid
• To achieve petaflop rates on tightly/loosely coupled grid clusters, applications will have to allow:
– extremely large granularity or produce massive parallelism such that high latencies can be tolerated.

• This type of parallelism, and the performance delivered by it in a heterogeneous environment, is
– currently manageable by hand-coded applications
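
Why large granularity helps tolerate latency can be shown with a back-of-the-envelope model. The model is an assumption made for illustration: each task pays one communication latency that is not overlapped with computation, so the useful fraction of time is compute / (compute + latency).

```python
def efficiency(compute_s, latency_s):
    """Fraction of wall-clock time spent computing for one task."""
    return compute_s / (compute_s + latency_s)


def min_grain_for(target_efficiency, latency_s):
    """Smallest compute time per task that reaches the target efficiency.

    Solves compute / (compute + latency) = target for compute.
    """
    return target_efficiency / (1.0 - target_efficiency) * latency_s


# Hypothetical wide-area latency of 0.1 s per task.
GRAIN_90 = min_grain_for(0.90, 0.1)   # about 0.9 s of work per task
GRAIN_99 = min_grain_for(0.99, 0.1)   # about 9.9 s of work per task
```

Under this model, moving from 90% to 99% efficiency at the same latency needs roughly eleven times more work per task, which is the sense in which grid applications must be coarse grained.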
Programming Model for the Grid
• A programming model can be presented in different forms: a language, a library API, or a tool with extensible functionality.

• The successful programming model will
– enable both high performance and the flexible composition and management of resources.
– influence the entire software lifecycle: design, implementation, debugging, operation, maintenance, etc.
– facilitate the effective use of all manner of development tools, e.g., compilers, debuggers, performance monitors, etc.
Grid Programming Issues
• Portability, Interoperability, and Adaptability
• Discovery
• Performance
• Fault Tolerance
• Security
Programming models
• Shared-state models
• Message passing models
• RPC and RMI models
• Hybrid Models
• Peer to Peer Models
• Web Service Models
• ...
Grid Middleware Definition
• Architecture identifies the fundamental system components, specifies purpose and function of these components, and indicates how these components interact with each other.
• Grid architecture is a protocol architecture, with protocols defining the basic mechanisms by which VO users and resources negotiate, establish, manage and exploit sharing relationships.
• Grid architecture is also a service standard-based open architecture that facilitates extensibility, interoperability, portability and code sharing.

“Introduction to Grid Technology”, B. Ramamurthy


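Architecture
[Figure: layered Grid architecture shown next to the Internet Protocol Architecture. Grid layers, top to bottom:]
• Application
• Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
• Resource: “Sharing single resources”: negotiating access, controlling use
• Connectivity: “Talking to things”: communication (Internet protocols) & security
• Fabric: “Controlling things locally”: access to, & control of, resources
Internet Protocol Architecture, top to bottom: Application (next to Collective and Resource), Transport and Internet (next to Connectivity), Link (next to Fabric)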
Emergence of Open Grid Standards
[Figure: timeline from 1990 to 2010; vertical axis “Increased functionality, standardization”. Stages in order: Custom solutions; Globus Toolkit (de facto standard, single implementation), alongside Internet standards; Open Grid Services Arch (real standards, multiple implementations), alongside Web services, etc.; Managed shared virtual systems (computer science research).]
“Grid Computing and Scaling Up the Internet”, I. Foster, IPv6 Forum, an
Examples of Grid Middleware
• Globus Toolkit (GT4.x, now GT5.x)
– www.globus.org
• Legion/Avaki
– http://www.avaki.com/
– http://legion.virginia.edu/
• Sun Grid Engine
– http://www.sun.com/service/sungrid/overview.jsp
• Unicore
– http://www.unicore.org
The Grid Middleware
• Software toolkit addressing key technical areas
– Offer a modular “bag of technologies”
– Enable incremental development of grid-enabled tools and applications
– Define and standardize grid protocols and APIs

• Focus is on inter-domain issues, not clustering
– Collaborative resource use spanning multiple organizations
– Integrates cleanly with intra-domain services
– Creates a “collective” service layer
“Basics Globus Toolkit™ Developer Tutorial”, Globus Team, 2003
Globus Approach
• Focus on architecture issues
– Provide implementations of grid protocols and APIs as basic infrastructure
– Use to construct high-level, domain-specific solutions
• Design principles
– Keep participation cost low
– Enable local control
– Support for adaptation
[Figure: layered stack, top to bottom: Applications, Diverse global services, Core Globus services, Local OS]
“Basics Globus Toolkit™ Developer Tutorial”, Globus Team, 2003
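Globus Toolkit 2.0 Components
[Figure: a Client outside the site boundary talks to the MDS Grid Index Info Server, and through the Globus Security Infrastructure to the site's MDS Grid Resource Info Server and Gatekeeper; inside the site are the Job Manager, RSL Library, Local Resource Manager and the created Processes. Numbered interactions:]
1 MDS client API calls to locate resources (Client to MDS: Grid Index Info Server)
2 MDS client API calls to get resource info (Client to MDS: Grid Resource Info Server)
3 Query current status of resource (Grid Resource Info Server to Local Resource Manager)
4 GRAM client API calls to request resource allocation and process creation; GRAM client API state change callbacks (Client and Gatekeeper)
5 Create (Gatekeeper to Job Manager)
6 Parse (Job Manager to RSL Library)
7 Allocate & create processes (Job Manager to Local Resource Manager)
8 Monitor & control (Job Manager to Process)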
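
The “Parse” step hands a resource request written in RSL to the RSL Library. As a rough illustration of what such a request carries, here is a toy parser for the simplest RSL shape, a flat conjunction of attribute=value relations such as &(executable=/bin/date)(count=1). This is not the Globus RSL library: parse_simple_rsl is a hypothetical helper, and it ignores quoting, nested expressions, multi-requests and substitutions.

```python
import re

# One "(name = value)" relation; values may not contain parentheses here.
_RELATION = re.compile(r"\(\s*(\w+)\s*=\s*([^()]*?)\s*\)")


def parse_simple_rsl(rsl):
    """Parse a flat '&(attr=value)(attr=value)...' request into a dict."""
    text = rsl.strip()
    if not text.startswith("&"):
        raise ValueError("only '&' conjunctions are handled in this sketch")
    return {name: value for name, value in _RELATION.findall(text[1:])}


# Hypothetical job request: run /bin/date as a single process.
REQUEST = "&(executable=/bin/date)(count=1)"
JOB = parse_simple_rsl(REQUEST)
```

In the flow listed above, a description like JOB is what the Job Manager would need before asking the Local Resource Manager to allocate and create processes.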
Best of Two Worlds
[Figure: the Open Grid Services Architecture (share, manage, access) combines two columns:]
• Web Services: applications on demand, secure and universal access, business integration
• Grid Protocols: resources on demand, global accessibility, vast resource scalability

“Open Grid Services Architecture Evolution”, J.P. Prost, IBM Montpellier, France, Ecole Bruide 2004
Web Services
• Increasingly popular standards-based framework for accessing network applications
– W3C standardization; Microsoft, IBM, Sun, others
• WSDL: Web Services Description Language
– Interface Definition Language for Web services
• SOAP: Simple Object Access Protocol
– XML-based RPC protocol; common WSDL target
• WS-Inspection
– Conventions for locating service descriptions
• UDDI: Universal Desc., Discovery, & Integration
– Directory for Web services
“Globus Toolkit Futures: An Open Grid Services Architecture”, Ian Foster et al., Globus Tutorial, Argonne National Laboratory, January 29, 2002
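
To show what “XML-based RPC” means in practice, the sketch below builds a SOAP 1.1 request envelope with Python's standard library. The envelope namespace is the SOAP 1.1 one; the service namespace urn:example:weather, the GetForecast operation and its city parameter are invented for the example and do not refer to a real service.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_request(service_ns, operation, params):
    """Wrap a remote call as Envelope > Body > operation > parameters."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation on a hypothetical service.
REQUEST_XML = build_soap_request(
    "urn:example:weather", "GetForecast", {"city": "Amsterdam"}
)
```

In a real deployment the operation name and parameter types would come from the service's WSDL description, and the envelope would typically be sent over HTTP.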
The Need to Support Transient Service Instances
• “Web services” address discovery & invocation of persistent services
– Interface to persistent state of entire enterprise
• In Grids, must also support transient service instances, created/destroyed dynamically
– Interfaces to the states of distributed activities
– E.g. workflow, video conf., dist. data analysis
• Significant implications for how services are managed, named, discovered, and used
– In fact, much of the work is concerned with the management of service instances
“Globus Toolkit Futures: An Open Grid Services Architecture”, Ian Foster et al., Globus Tutorial, Argonne National Laboratory, January 29, 2002
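
The idea of service instances that are created and destroyed dynamically can be sketched with a small factory that also acts as a registry. Everything here is hypothetical: the class and method names are not an OGSA interface, and the expiry-based cleanup (instances vanish unless their lifetime is extended) is an assumption added to illustrate one way of managing transient state.

```python
import itertools


class ServiceInstance:
    """A transient service with a handle and an expiry time."""

    def __init__(self, handle, expires_at):
        self.handle = handle
        self.expires_at = expires_at


class Factory:
    """Creates, tracks and removes transient service instances."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.registry = {}

    def create(self, now, lifetime_s):
        handle = f"instance-{next(self._ids)}"
        self.registry[handle] = ServiceInstance(handle, now + lifetime_s)
        return handle

    def keep_alive(self, handle, now, lifetime_s):
        """Extend the lifetime of an instance that is still in use."""
        self.registry[handle].expires_at = now + lifetime_s

    def destroy(self, handle):
        """Explicitly remove an instance; unknown handles are ignored."""
        self.registry.pop(handle, None)

    def sweep(self, now):
        """Remove and return the handles of all expired instances."""
        expired = [h for h, inst in self.registry.items() if inst.expires_at <= now]
        for handle in expired:
            del self.registry[handle]
        return expired
```

Time is passed in explicitly so the behaviour is easy to follow: an instance created at time 0 with a 5 second lifetime is removed by sweep(6), while one with a 10 second lifetime survives until it expires or is destroyed.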
Open Grid Services Architecture
• Service orientation to virtualize resources
• From Web services:
– Standard interface definition mechanisms: multiple protocol bindings, multiple implementations, local/remote transparency
• Building on Globus Toolkit:
– Grid service: semantics for service interactions
– Management of transient instances (& state)
– Factory, Registry, Discovery, other services
– Reliable and secure transport
• Multiple hosting targets: J2EE, .NET, …

“Globus Toolkit Futures: An Open Grid Services Architecture”, Ian Foster et al., Globus Tutorial, Argonne National Laboratory, January 29, 2002
Open Grid Services Architecture Objectives
• Manage resources across distributed heterogeneous platforms
• Deliver seamless QoS
• Provide a common base for autonomic management solutions
• Define open, published interfaces
• Exploit industry-standard integration technologies
– Web Services, SOAP, XML, ...
• Integrate with existing IT resources

“Open Grid Services Architecture Evolution”, J.P. Prost, IBM Montpellier, France, Ecole Bruide 2004
