f32 Book Parallel Pres pt4

This document discusses low-diameter architectures like hypercubes. It covers topics like hypercubes and their algorithms, sorting and routing on hypercubes, other hypercubic architectures, and other networks. Specific topics covered include the definition and main properties of hypercubes, embeddings and their usefulness, embedding of arrays and trees, simple hypercube algorithms, and matrix operations on hypercubes.

Uploaded by

Harish Kumar


Part IV

Low-Diameter Architectures

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 1

About This Presentation
This presentation is intended to support the use of the textbook
Introduction to Parallel Processing: Algorithms and Architectures
(Plenum Press, 1999, ISBN 0-306-45970-1). It was prepared by
the author in connection with teaching the graduate-level course
ECE 254B: Advanced Computer Architecture: Parallel Processing,
at the University of California, Santa Barbara. Instructors can use
these slides in classroom teaching and for other educational
purposes. Any other use is strictly prohibited. © Behrooz Parhami

Edition    Released       Revised
First      Spring 2005    Spring 2006

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 2

IV Low-Diameter Architectures
Study the hypercube and related interconnection schemes:
• Prime example of low-diameter (logarithmic) networks
• Theoretical properties, realizability, and scalability
• Complete our view of the “sea of interconnection nets”

Topics in This Part

Chapter 13 Hypercubes and Their Algorithms
Chapter 14 Sorting and Routing on Hypercubes
Chapter 15 Other Hypercubic Architectures
Chapter 16 A Sampler of Other Networks

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 3

13 Hypercubes and Their Algorithms
Study the hypercube and its topological/algorithmic properties:
• Develop simple hypercube algorithms (more in Ch. 14)
• Learn about embeddings and their usefulness

Topics in This Chapter

13.1 Definition and Main Properties
13.2 Embeddings and Their Usefulness
13.3 Embedding of Arrays and Trees
13.4 A Few Simple Algorithms
13.5 Matrix Multiplication
13.6 Inverting a Lower-Triangular Matrix

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 4

13.1 Definition and Main Properties
Begin studying networks that are intermediate between the diameter-1 complete network and the diameter-p^(1/2) mesh

Intermediate architectures: logarithmic or sublogarithmic diameter

Network diameters, from sublogarithmic to superlogarithmic:
  1                    Complete network
  2                    PDN
  log n / log log n    Star, pancake
  log n                Binary tree, hypercube
  n^(1/2)              Torus
  n/2                  Ring
  n – 1                Linear array

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 5

Hypercube and Its History
Binary tree has logarithmic diameter, but small bisection
Hypercube has a much larger bisection
Hypercube is a mesh with the maximum possible number of dimensions
2 × 2 × 2 × ... × 2 (q = log_2 p factors)
We saw that increasing the number of dimensions made it harder to
design and visualize algorithms for the mesh
Oddly, at the extreme of log_2 p dimensions, things become simple again!

Brief history of the hypercube (binary q-cube) architecture

Concept developed: early 1960s [Squi63]
Direct (single-stage) and indirect (multistage) versions: mid 1970s
Initial proposals [Peas77], [Sull77] included no hardware
Caltech’s 64-node Cosmic Cube: early 1980s [Seit85]
Introduced an elegant solution to routing (wormhole switching)
Several commercial machines: mid to late 1980s
Intel iPSC (personal supercomputer), CM-2, nCUBE (Section 22.3)
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 6
Basic Definitions

Hypercube is a generic term; 3-cube, 4-cube, . . . , q-cube in specific cases

Fig. 13.1 The recursive structure of binary hypercubes.
(a) Binary 1-cube, built of two binary 0-cubes, labeled 0 and 1
(b) Binary 2-cube, built of two binary 1-cubes, labeled 0 and 1
(c) Binary 3-cube, built of two binary 2-cubes, labeled 0 and 1
(d) Binary 4-cube, built of two binary 3-cubes, labeled 0 and 1

Parameters:
p = 2^q (number of nodes)
B = p/2 = 2^(q–1) (bisection width)
D = q = log_2 p (diameter)
d = q = log_2 p (node degree)
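The parameters of the q-cube can be checked by brute force on small cases. The following is a minimal sketch, not part of the original slides; the function names are illustrative assumptions, and the bisection count simply cuts the cube along its highest dimension.

```python
from collections import deque

def hypercube_links(q):
    """Return the set of undirected links of the binary q-cube."""
    links = set()
    for x in range(1 << q):
        for k in range(q):
            y = x ^ (1 << k)              # flip bit k: dimension-k neighbor
            links.add((min(x, y), max(x, y)))
    return links

def diameter(q):
    """Largest BFS distance from node 0 (sufficient because of symmetry)."""
    dist = {0: 0}
    queue = deque([0])
    while queue:
        x = queue.popleft()
        for k in range(q):
            y = x ^ (1 << k)
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return max(dist.values())

def links_cut_by_top_dimension(q):
    """Number of links crossing between the two (q-1)-subcubes."""
    top = 1 << (q - 1)
    return sum(1 for a, b in hypercube_links(q) if (a & top) != (b & top))
```

For q = 4 this gives 16 × 4/2 = 32 links, diameter 4, and 8 links across the cut, in line with D = q and B = p/2.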

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 7

The 64-Node
Hypercube

Only sample
wraparound
links are
shown to
avoid clutter

Isomorphic to the 4 × 4 × 4 3D torus
(each has 64 × 6/2 links)

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 8

Neighbors of a Node in a Hypercube
In the labels below, x'_k denotes the complement of bit x_k.

x_{q–1} x_{q–2} . . . x_2 x_1 x_0       ID of node x
x_{q–1} x_{q–2} . . . x_2 x_1 x'_0      dimension-0 neighbor; N_0(x)
x_{q–1} x_{q–2} . . . x_2 x'_1 x_0      dimension-1 neighbor; N_1(x)
. . .
x'_{q–1} x_{q–2} . . . x_2 x_1 x_0      dimension-(q – 1) neighbor; N_{q–1}(x)

These are the q neighbors of node x.

Nodes whose labels differ in k bits (at Hamming distance k) are connected by a shortest path of length k
Both node- and edge-symmetric
Strengths: symmetry, log diameter, and linear bisection width
Weakness: poor scalability
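Because the dimension-k neighbor differs from x only in bit k, it can be computed with a single XOR. A small sketch under that observation (the helper names are assumptions made for illustration):

```python
def neighbor(x, k):
    """Dimension-k neighbor N_k(x): complement bit k of the node label."""
    return x ^ (1 << k)

def neighbors(x, q):
    """All q neighbors of node x in the q-cube."""
    return [neighbor(x, k) for k in range(q)]

def hamming_distance(x, y):
    """Number of bit positions in which labels x and y differ."""
    return bin(x ^ y).count("1")
```

For example, `neighbors(0b0101, 4)` yields 0100, 0111, 0001, and 1101, each at Hamming distance 1 from 0101.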
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 9
13.2 Embeddings and Their Usefulness
Fig. 13.2 Embedding a seven-node binary tree into 2D meshes of various sizes.
  First embedding:   Dilation = 1, Congestion = 1, Load factor = 1
  Second embedding:  Dilation = 2, Congestion = 2, Load factor = 1
  Third embedding:   Dilation = 1, Congestion = 2, Load factor = 2

Expansion: Ratio of the number of nodes (9/7, 8/7, and 4/7 here)
Dilation: Longest path onto which an edge is mapped (routing slowdown)
Congestion: Max number of edges mapped onto one edge (contention slowdown)
Load factor: Max number of nodes mapped onto one node (processing slowdown)
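The three slowdown measures can be computed mechanically once a node mapping and a routing rule are fixed. The sketch below is an illustration only: it uses a hypercube as the host (the slide's figure uses meshes), routes every guest edge by correcting bits from dimension 0 upward, and all names are assumptions.

```python
from collections import Counter

def host_path(src, dst, q):
    """Path in the q-cube that corrects differing bits from dimension 0 upward."""
    path, x = [src], src
    for k in range(q):
        if ((x ^ dst) >> k) & 1:
            x ^= 1 << k
            path.append(x)
    return path

def embedding_metrics(guest_edges, node_map, q):
    """Return (dilation, congestion, load_factor) of a guest graph in the q-cube."""
    link_use = Counter()
    dilation = 0
    for u, v in guest_edges:
        path = host_path(node_map[u], node_map[v], q)
        dilation = max(dilation, len(path) - 1)
        for a, b in zip(path, path[1:]):
            link_use[frozenset((a, b))] += 1      # undirected host link
    congestion = max(link_use.values(), default=0)
    load_factor = max(Counter(node_map.values()).values())
    return dilation, congestion, load_factor

# A four-node ring mapped into the 2-cube in two different ways.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
natural_map = {0: 0b00, 1: 0b01, 2: 0b10, 3: 0b11}
gray_map = {0: 0b00, 1: 0b01, 2: 0b11, 3: 0b10}
```

With the natural binary mapping the ring edges 1-2 and 3-0 are stretched over two links, whereas the Gray-code mapping keeps every measure at 1.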
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 10
13.3 Embedding of Arrays and Trees
Fig. 13.3 Hamiltonian cycle in the q-cube.

Alternate inductive proof: Hamiltonicity of the q-cube is equivalent to the existence of a q-bit Gray code

Basis: q-bit Gray code beginning with the all-0s codeword and ending with 10^(q–1) exists for q = 2: 00, 01, 11, 10

Inductive step: prefix the (q – 1)-bit Gray code with 0, then follow it with the (q – 1)-bit Gray code in reverse, prefixed with 1:

0 000 . . . 000
0 000 . . . 001
0 000 . . . 011
. . .
0 100 . . . 000
1 100 . . . 000
. . .
1 000 . . . 011
1 000 . . . 001
1 000 . . . 000
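The reflected construction translates directly into a short recursive routine. This is a sketch with assumed function names; the check at the end confirms that consecutive codewords, taken cyclically, are hypercube neighbors.

```python
def gray_code(q):
    """Reflected q-bit Gray code as a list of integers, starting at all 0s."""
    if q == 1:
        return [0, 1]
    prev = gray_code(q - 1)
    top = 1 << (q - 1)
    # (q-1)-bit code with prefix 0, then the same code reversed with prefix 1.
    return prev + [top | c for c in reversed(prev)]

def is_hamiltonian_cycle(codes, q):
    """True if codes visits every q-cube node once and cyclic successors are neighbors."""
    if sorted(codes) != list(range(1 << q)):
        return False
    return all(bin(a ^ b).count("1") == 1
               for a, b in zip(codes, codes[1:] + codes[:1]))
```

The last codeword is always 10...0, one bit away from the all-0s start, which is what closes the cycle.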
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 11
Mesh/Torus Embedding in a Hypercube

Fig. 13.5 The 4 × 4 mesh/torus is a subgraph of the 4-cube.

Is a mesh or torus a subgraph of the hypercube of the same size?

We prove this to be the case for a torus (and thus for a mesh)
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 12
Torus is a Subgraph of Same-Size Hypercube
A tool used in our proof

Product graph G1 × G2:
  Has n1 × n2 nodes
  Each node is labeled by a pair of labels, one from each component graph
  Two nodes are connected if their labels agree in one component and their other components are connected in the corresponding component graph

Fig. 13.4 Examples of product graphs (for example, a three-node ring 0, 1, 2 times a two-node graph a, b gives the 3-by-2 torus with nodes 0a, 0b, 1a, 1b, 2a, 2b).

The 2^a × 2^b × 2^c . . . torus is the product of 2^a-, 2^b-, 2^c-, . . . node rings
The (a + b + c + ... )-cube is the product of a-cube, b-cube, c-cube, . . .
The 2^q-node ring is a subgraph of the q-cube
If a set of component graphs are subgraphs of another set, the product
graphs will have the same relationship
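The product argument suggests a concrete mapping: label each torus coordinate with a Gray code and concatenate the codewords. The sketch below checks this for small tori; the mapping function and its name are illustrative assumptions, not taken from the slides.

```python
def gray(i):
    """i-th codeword of the reflected binary Gray code."""
    return i ^ (i >> 1)

def torus_to_cube(row, col, a, b):
    """Map node (row, col) of the 2^a x 2^b torus to a label in the (a+b)-cube."""
    return (gray(row) << b) | gray(col)

def torus_is_subgraph(a, b):
    """Check that every torus link maps onto a hypercube link, with no node collisions."""
    rows, cols = 1 << a, 1 << b
    images = set()
    for r in range(rows):
        for c in range(cols):
            x = torus_to_cube(r, c, a, b)
            images.add(x)
            # Check the wraparound-aware "down" and "right" neighbors.
            for nr, nc in (((r + 1) % rows, c), (r, (c + 1) % cols)):
                y = torus_to_cube(nr, nc, a, b)
                if bin(x ^ y).count("1") != 1:
                    return False
    return len(images) == rows * cols
```

`torus_is_subgraph(2, 2)` covers the 4 × 4 torus in the 4-cube; the wraparound links work because the Gray code is cyclic.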
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 13
Embedding Trees in the Hypercube
The (2^q – 1)-node complete binary tree is not a subgraph of the q-cube
Proof by contradiction based on the parity of node label weights (number of 1s in the labels): successive tree levels must alternate between even-weight and odd-weight labels

The 2^q-node double-rooted complete binary tree is a subgraph of the q-cube

Fig. 13.6 The 2^q-node double-rooted complete binary tree is a subgraph of the q-cube (built from double-rooted trees in the (q–1)-cube 0 and the (q–1)-cube 1, with new roots chosen among neighbors such as N_a(x), N_b(x), and N_c(x)).
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 14
A Useful Tree Embedding in the Hypercube
The (2^q – 1)-node complete binary tree can be embedded into the (q – 1)-cube

Despite the load factor of q, many tree algorithms entail no slowdown

Fig. 13.7 Embedding a 15-node complete binary tree into the 3-cube (processors 000 through 111; tree levels use dim-2, dim-1, and dim-0 links).
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 15
13.4 A Few Simple Algorithms
Semigroup computation on the q-cube
  Processor x, 0 ≤ x < p do t[x] := v[x]     {initialize “total” to own value}
  for k = 0 to q – 1 processor x, 0 ≤ x < p, do
    get y := t[N_k(x)]
    set t[x] := t[x] ⊗ y
  endfor

Commutativity of the operator ⊗ is implicit here. How can we remove this assumption?

Fig. 13.8 Semigroup computation on a 3-cube.
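A sequential simulation makes the data flow easy to follow. In this sketch all processors act on a snapshot of their neighbors' values in each step. The `ordered` flag is one possible answer to the question about commutativity, offered as an assumption rather than as the slides' own solution: the value from the lower-numbered subcube is always placed on the left.

```python
def semigroup(values, op, ordered=False):
    """Simulate semigroup computation on a q-cube with p = len(values) processors."""
    p = len(values)
    q = p.bit_length() - 1
    assert p == 1 << q, "number of processors must be a power of 2"
    t = list(values)
    for k in range(q):
        # Snapshot of what each processor receives from its dimension-k neighbor.
        y = [t[x ^ (1 << k)] for x in range(p)]
        for x in range(p):
            if ordered and (x >> k) & 1:
                t[x] = op(y[x], t[x])     # neighbor holds the lower-indexed half
            else:
                t[x] = op(t[x], y[x])
    return t
```

With a commutative operator such as `max`, both variants agree. With string concatenation, only the ordered variant leaves the same result in every processor.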
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 16
Parallel Prefix Computation
Parallel prefix computation on the q-cube
  Processor x, 0 ≤ x < p, do t[x] := u[x] := v[x]
    {initialize subcube “total” and partial prefix}
  for k = 0 to q – 1 processor x, 0 ≤ x < p, do
    get y := t[N_k(x)]
    set t[x] := t[x] ⊗ y
    if x > N_k(x) then set u[x] := y ⊗ u[x]
  endfor

Legend: t: subcube “total”; u: subcube prefix

Commutativity of the operator ⊗ is implicit in this algorithm as well. How can we remove this assumption?

Fig. 13.9 Parallel prefix computation on a 3-cube.
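The same simulation style works for parallel prefix. The sketch below keeps the subcube total t and the partial prefix u per processor. Unlike the pseudocode, it also orders the operands of the total update, which is an assumption made here so that non-commutative operators behave as expected.

```python
def parallel_prefix(values, op):
    """Simulate parallel prefix on a q-cube; returns the prefix held by each processor."""
    p = len(values)
    q = p.bit_length() - 1
    assert p == 1 << q, "number of processors must be a power of 2"
    t = list(values)          # subcube "total"
    u = list(values)          # subcube prefix
    for k in range(q):
        y = [t[x ^ (1 << k)] for x in range(p)]   # totals received from neighbors
        for x in range(p):
            if (x >> k) & 1:                      # x > N_k(x): neighbor precedes x
                u[x] = op(y[x], u[x])
                t[x] = op(y[x], t[x])
            else:
                t[x] = op(t[x], y[x])
    return u
```

For the values 1 through 8 under addition, processor x ends with the sum of the first x + 1 values.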
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 17
Sequence Reversal on the Hypercube
Reversing a sequence on the q-cube
  for k = 0 to q – 1 Processor x, 0 ≤ x < p, do
    get y := v[N_k(x)]
    set v[x] := y
  endfor

Fig. 13.11 Sequence reversal on a 3-cube.
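Exchanging values along every dimension in turn moves the value of node x to the node whose label is the bitwise complement of x, which is node p – 1 – x. A minimal simulation (the function name is an assumption):

```python
def reverse_on_cube(values):
    """Simulate sequence reversal on a q-cube with p = len(values) processors."""
    p = len(values)
    q = p.bit_length() - 1
    assert p == 1 << q, "number of processors must be a power of 2"
    v = list(values)
    for k in range(q):
        # Every processor replaces its value with that of its dimension-k neighbor.
        v = [v[x ^ (1 << k)] for x in range(p)]
    return v
```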

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 18
Ascend, Descend, and Normal Hypercube Algorithms

Graphical depiction of ascend, descend, and normal algorithms (vertical axis: hypercube dimension, 0 to q – 1; horizontal axis: algorithm steps 0, 1, 2, 3, . . .).

Dimension-order communication

Examples shown alongside the depiction: semigroup computation, parallel prefix computation, and sequence reversal on the 3-cube.

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 19

13.5 Matrix Multiplication
p = m^3 = 2^q processors, indexed as ijk (with three q/3-bit segments)

1. Place elements of A and B in registers R_A and R_B of m^2 processors with the IDs 0jk
2. Replicate inputs: communicate across 1/3 of the dimensions
3, 4. Rearrange the data by communicating across the remaining 2/3 of dimensions so that processor ijk has A_ji and B_ik
5. R_C := R_A × R_B
6. Move C_jk to processor 0jk

Example: [1 2; 3 4] × [5 6; 7 8] = [19 22; 43 50]

Fig. 13.12 Multiplying two 2 × 2 matrices on a 3-cube.
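The data placement can be mimicked with a dictionary keyed by the processor index (i, j, k). This sketch skips the communication steps and only shows which operands meet in which processor. Summing the products over the i segment before they reach processor 0jk is how the last step is read here, since C_jk is the sum over i of A_ji B_ik.

```python
def cube_matmul(A, B):
    """Multiply m x m matrices the way m^3 processors indexed (i, j, k) would."""
    m = len(A)
    index = [(i, j, k) for i in range(m) for j in range(m) for k in range(m)]
    # After replication and rearrangement, processor (i, j, k) holds A[j][i] and B[i][k].
    RA = {(i, j, k): A[j][i] for i, j, k in index}
    RB = {(i, j, k): B[i][k] for i, j, k in index}
    # One multiplication per processor.
    RC = {ijk: RA[ijk] * RB[ijk] for ijk in index}
    # Combine along the i segment so that processor (0, j, k) ends up with C[j][k].
    return [[sum(RC[(i, j, k)] for i in range(m)) for k in range(m)]
            for j in range(m)]
```

For the 2 × 2 example of the figure, `cube_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` reproduces the product shown there.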
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 20
Analysis of Matrix Multiplication
The algorithm involves communication steps in three loops, each with q / 3 iterations (in one of the 4 loops, 2 values are exchanged per iteration)

T_mul(m, m^3) = O(q) = O(log m)

Analysis in the case of block matrix multiplication (m × m matrices):

Matrices are partitioned into p^(1/3) × p^(1/3) blocks of size (m / p^(1/3)) × (m / p^(1/3))
Each communication step deals with m^2 / p^(2/3) block elements
Each multiplication entails 2m^3 / p arithmetic operations

T_mul(m, p) = m^2 / p^(2/3) × O(log p) + 2m^3 / p
              (communication)            (computation)
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 21
13.6 Inverting a Lower-Triangular Matrix
For A = [ B 0 ; C D ] we have A^(–1) = [ B^(–1) 0 ; –D^(–1)CB^(–1) D^(–1) ]
(matrices are written row by row, with rows separated by semicolons)

Check:
[ B 0 ; C D ] × [ B^(–1) 0 ; –D^(–1)CB^(–1) D^(–1) ]
  = [ BB^(–1) 0 ; CB^(–1) – DD^(–1)CB^(–1) DD^(–1) ]
  = [ I 0 ; 0 I ]

Because B and D are both lower triangular, the same algorithm can be used recursively to invert them in parallel

T_inv(m) = T_inv(m/2) + 2T_mul(m/2) = T_inv(m/2) + O(log m) = O(log^2 m)
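The block formula leads to a short recursive routine. The following sequential sketch uses exact fractions to avoid rounding. It inverts B and D one after the other, whereas on the hypercube the two recursive calls would proceed in parallel. The helper names are assumptions.

```python
from fractions import Fraction

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def invert_lower_triangular(A):
    """Invert a lower-triangular matrix by recursive 2 x 2 block partitioning."""
    m = len(A)
    if m == 1:
        return [[Fraction(1) / A[0][0]]]
    h = m // 2
    B = [row[:h] for row in A[:h]]
    C = [row[:h] for row in A[h:]]
    D = [row[h:] for row in A[h:]]
    Bi = invert_lower_triangular(B)       # these two calls are independent
    Di = invert_lower_triangular(D)
    # Lower-left block of the inverse: -D^(-1) C B^(-1), i.e. two multiplications.
    E = [[-x for x in row] for row in matmul(matmul(Di, C), Bi)]
    top = [Bi[r] + [Fraction(0)] * (m - h) for r in range(h)]
    bottom = [E[r] + Di[r] for r in range(m - h)]
    return top + bottom
```

The two calls to `matmul` per level correspond to the 2T_mul(m/2) term in the recurrence.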

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 22

14 Sorting and Routing on Hypercubes
Study routing and data movement problems on hypercubes:
• Learn about limitations of oblivious routing algorithms
• Show that bitonic sorting is a good match to hypercube

Topics in This Chapter

14.1 Defining the Sorting Problem
14.2 Bitonic Sorting on a Hypercube
14.3 Routing Problems on a Hypercube
14.4 Dimension-Order Routing
14.5 Broadcasting on a Hypercube
14.6 Adaptive and Fault-Tolerant Routing

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 23

14.1 Defining the Sorting Problem
Arrange data in order of processor ID numbers (labels), from the smallest value to the largest value

The ideal parallel sorting algorithm: T(p) = Θ((n log n)/p)
This ideal has not been achieved in all cases for the hypercube

1-1 sorting (p items to sort, p processors)
  Batcher’s odd-even merge or bitonic sort: O(log^2 p) time
  O(log p)-time deterministic algorithm not known

k-k sorting (n = kp items to sort, p processors)
  Optimal algorithms known for n >> p or when average running time is considered (randomized)
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 24
Hypercube Sorting: Attempts and Progress
No bull’s eye yet!

There are three categories of practical sorting algorithms:
1. Deterministic 1-1, O(log^2 p)-time
2. Deterministic k-k, optimal for n >> p (that is, for large k)
3. Probabilistic (1-1 or k-k)

Pursuit of O(log p)-time algorithm is of theoretical interest only

Running times shown in the figure (a timeline marked 1960s, 1980, 1987, 1988, and 1990, with regions labeled “fewer than p items”, “more than p items”, “practical, deterministic”, and “practical, probabilistic”):
• log^2 n for n = p, bitonic (one of the oldest parallel algorithms; discovered 1960, published 1968)
• log p log n / log(p/n), n ≤ p/4; in particular, log p for n = p^(1–ε)
• log n randomized
• (n log n)/p for n >> p
• log n (log log n)^2
• log n log log n
• log n: ?

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 25

Bitonic Sequences
Bitonic sequence:
  (a) Rises, then falls:   1 3 3 4 6 6 6 2 2 1 0 0
  (b) Falls, then rises:   8 7 7 6 6 6 5 4 6 8 8 9
  Cyclic shift of (a)
  Cyclic shift of (b):     8 9 8 7 7 6 6 6 5 4 6 8   (the previous sequence, right-rotated by 2)

In Chapter 7, we designed bitonic sorting nets
Bitonic sorting is ideally suited to hypercube

Fig. 14.1 Examples of bitonic sequences.
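All four cases share one property: going once around the sequence cyclically, the direction of change switches at most twice. A small checker based on that observation (an illustrative sketch; ties between equal neighbors are ignored):

```python
def is_bitonic(seq):
    """True if seq, viewed cyclically, changes direction at most twice."""
    n = len(seq)
    signs = []
    for i in range(n):
        a, b = seq[i], seq[(i + 1) % n]   # includes the wraparound pair
        if a != b:
            signs.append(1 if b > a else -1)
    if not signs:
        return True                        # constant sequence
    changes = sum(1 for i in range(len(signs))
                  if signs[i] != signs[(i + 1) % len(signs)])
    return changes <= 2
```

The three sequences of Fig. 14.1 all pass this test, while a sequence such as 1 3 2 4 does not.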
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 26
Sorting a Bitonic Sequence on a Linear Array
Time needed to sort a bitonic sequence on a p-processor linear array:

B(p) = p + p/2 + p/4 + . . . + 2 = 2p – 2

Not competitive, because we can sort an arbitrary sequence in 2p – 2 unidirectional communication steps using odd-even transposition

Steps shown in the figure:
1. Shift right half of data to left half (superimpose the two halves)
2. In each position, keep the smaller of the two values and ship the larger value to the right
3. Each half is a bitonic sequence that can be sorted independently

Fig. 14.2 Sorting a bitonic sequence on a linear array.
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 27
Bitonic Sorting on a Linear Array
5 9 10 15 3 7 14 12 8 1 4 13 16 11 6 2
----> <---- ----> <---- ----> <---- ----> <----
5 9 15 10 3 7 14 12 1 8 13 4 11 16 6 2
------------> <------------ ------------> <------------
5 9 10 15 14 12 7 3 1 4 8 13 16 11 6 2
----------------------------> <----------------------------
3 5 7 9 10 12 14 15 16 13 11 8 6 4 2 1
------------------------------------------------------------>
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Fig. 14.3 Sorting an arbitrary sequence on a linear array through

recursive application of bitonic sorting.

Sorting an arbitrary sequence of length p (recall that B(p) = 2p – 2):

T(p) = T(p/2) + B(p) = T(p/2) + 2p – 2 = 4p – 4 – 2 log_2 p

Alternate derivation:
T(p) = B(2) + B(4) + . . . + B(p) = 2 + 6 + . . . + (2p – 2) = 4p – 4 – 2 log_2 p
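Both closed forms are easy to confirm numerically for powers of 2. The sketch below evaluates the sums directly; B and T follow the slide's notation, and everything else is an assumption.

```python
def B(p):
    """Steps to sort a bitonic sequence on a p-processor linear array: p + p/2 + ... + 2."""
    total, size = 0, p
    while size >= 2:
        total += size
        size //= 2
    return total

def T(p):
    """Steps to sort an arbitrary sequence: T(p) = T(p/2) + B(p), with T(1) = 0."""
    return 0 if p == 1 else T(p // 2) + B(p)
```

For p = 16, B(16) = 30 and T(16) = 2 + 6 + 14 + 30 = 52 = 4 × 16 – 4 – 2 × 4.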

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 28

14.2 Bitonic Sorting on a Hypercube
For linear array, the 4p-step bitonic sorting algorithm is inferior to
odd-even transposition which requires p compare-exchange steps
(or 2p unidirectional communications)
The situation is quite different for a hypercube

Sorting a bitonic sequence on a hypercube: Compare-exchange values in the upper subcube (nodes with x_{q–1} = 1) with those in the lower subcube (x_{q–1} = 0); sort the resulting bitonic half-sequences

B(q) = B(q – 1) + 1 = q
Complexity: 2q communication steps

Sorting a bitonic sequence of size n on q-cube, q = log_2 n
  for l = q – 1 downto 0 processor x, 0 ≤ x < p, do
    if x_l = 0
    then get y := v[N_l(x)]; keep min(v(x), y); send max(v(x), y) to N_l(x)
    endif
  endfor

This is a “descend” algorithm
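The descend loop can be simulated directly, and repeating it on growing subcubes with alternating directions gives a full bitonic sort. This is a sketch under the usual formulation of bitonic sorting; the function names and the step counter are assumptions added for illustration.

```python
def bitonic_merge(v, q):
    """Sort a bitonic sequence of length 2^q in place: one dimension per step, q-1 down to 0."""
    for l in range(q - 1, -1, -1):
        for x in range(1 << q):
            if not (x >> l) & 1:              # x_l = 0: x keeps the smaller value
                n = x | (1 << l)              # dimension-l neighbor
                if v[x] > v[n]:
                    v[x], v[n] = v[n], v[x]
    return v

def bitonic_sort(v, q):
    """Sort an arbitrary sequence of length 2^q; returns (sorted list, compare-exchange steps)."""
    steps = 0
    for stage in range(1, q + 1):             # merge within subcubes of dimension `stage`
        for l in range(stage - 1, -1, -1):
            steps += 1
            for x in range(1 << q):
                if not (x >> l) & 1:
                    n = x | (1 << l)
                    descending = (x >> stage) & 1
                    out_of_order = v[x] < v[n] if descending else v[x] > v[n]
                    if out_of_order:
                        v[x], v[n] = v[n], v[x]
    return v, steps
```

The step count returned by `bitonic_sort` is 1 + 2 + ... + q = q(q + 1)/2 compare-exchange steps.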
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 29
Bitonic Sorting on a Hypercube
T(q) = T(q – 1) + B(q)
     = T(q – 1) + q
     = q(q + 1)/2
     = O(log^2 p)

Fig. 14.4 Sorting a bitonic sequence of size 8 on the 3-cube (compare-exchange along dimension 2, then dimension 1, then dimension 0; the figure marks the data ordering in the lower and upper cubes).

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 30

14.3 Routing Problems on a Hypercube
Recall the following categories of routing algorithms:
Off-line: Routes precomputed, stored in tables
On-line: Routing decisions made on the fly
Oblivious: Path depends only on source and destination
Adaptive: Path may vary by link and node conditions

Good news for routing on a hypercube:

Any 1-1 routing problem with p or fewer packets can be solved in
O(log p) steps, using an off-line algorithm; this is a consequence
of there being many paths to choose from

Bad news for routing on a hypercube:

Oblivious routing requires Ω(p^(1/2)/log p) time in the worst case
(only slightly better than mesh)
In practice, actual routing performance is usually much closer to
the log-time best case than to the worst case.
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 31
Limitations of Oblivious Routing
Theorem 14.1: Let G = (V, E) be a p-node, degree-d network. Any oblivious
routing algorithm for routing p packets in G needs Ω(p^(1/2)/d) worst-case time

Proof Sketch: Let P_{u,v} be the unique path used for routing messages from u to v
There are p(p – 1) possible paths for routing among all node pairs
These paths are predetermined and do not depend on traffic within the network
Our strategy: find k node pairs u_i, v_i (1 ≤ i ≤ k) such that u_i ≠ u_j and v_i ≠ v_j for i ≠ j, and the paths P_{u_i,v_i} all pass through the same edge e
Because ≤ 2 packets can go through a link in one step, Ω(k) steps will be needed for some 1-1 routing problem
The main part of the proof consists of showing that k can be as large as p^(1/2)/d
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 32
14.4 Dimension-Order Routing
Source       01011011
Destination  11010110
Differences  ^   ^^ ^
Path:        01011011
             11011011
             11010011
             11010111
             11010110

Unfolded hypercube (indirect cube, butterfly) facilitates the discussion, visualization, and analysis of routing algorithms

Fig. 14.5 Unfolded 3-cube or the 32-node butterfly network (2^q rows, q + 1 columns; folding the butterfly gives the hypercube, unfolding the hypercube gives the butterfly).

Dimension-order routing between nodes i and j in q-cube can be viewed as routing from node i in column 0 (q) to node j in column q (0) of the butterfly
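The path in the example above corrects the differing bits from the highest dimension down. A short sketch of that rule follows; the function name and the `descend` flag are assumptions, and setting the flag to False corrects bits from dimension 0 upward instead.

```python
def dimension_order_path(src, dst, q, descend=True):
    """Nodes visited when the differing bits of src and dst are corrected in dimension order."""
    dims = range(q - 1, -1, -1) if descend else range(q)
    path, x = [src], src
    for k in dims:
        if ((x ^ dst) >> k) & 1:      # labels differ in bit k: cross dimension k
            x ^= 1 << k
            path.append(x)
    return path

path = dimension_order_path(0b01011011, 0b11010110, 8)
labels = [format(node, "08b") for node in path]
```

Here `labels` lists the five nodes of the example path, and the number of hops equals the Hamming distance between source and destination.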
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 33
Self-Routing on a Butterfly Network
Fig. 14.6 Example dimension-order routing paths (8-row butterfly with columns 0 to 3; the two traversal directions are labeled ascend and descend).

Number of cross links taken = length of path in hypercube

From node 3 to 6: routing tag = 011 ⊕ 110 = 101 “cross-straight-cross”
From node 3 to 5: routing tag = 011 ⊕ 101 = 110 “cross-cross-straight”
From node 6 to 1: routing tag = 110 ⊕ 001 = 111 “cross-cross-cross”
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 34
Butterfly Is Not a Permutation Network
Fig. 14.7 Packing is a “good” routing problem for dimension-order routing on the hypercube.

Fig. 14.8 Bit-reversal permutation is a “bad” routing problem for dimension-order routing on the hypercube.
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 35
Why Does Bit-Reversal Routing Lead to Conflicts?
Consider the (2a + 1)-cube and messages that must go from nodes
0 0 0 . . . 0 x_1 x_2 . . . x_{a–1} x_a (a + 1 leading zeros) to nodes
x_a x_{a–1} . . . x_2 x_1 0 0 0 . . . 0 (a + 1 trailing zeros)
If we route messages in dimension order, starting from the right end,
all of these 2^a = Θ(p^(1/2)) messages will pass through node 0
Consequences of this result:
1. The Θ(p^(1/2)) delay is even worse than Ω(p^(1/2)/d) of Theorem 14.1
2. Besides delay, large buffers are needed within the nodes

True or false? If we limit nodes to a constant number of message
buffers, then the Θ(p^(1/2)) bound still holds, except that messages are
queued at several levels before reaching node 0

Bad news (false): The delay can be Θ(p) for some permutations

Good news: Performance usually much better; i.e., log_2 p + o(log p)
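The claim that all of these messages meet at node 0 can be confirmed by tracing each route. This is a brute-force sketch with assumed names; it corrects bits starting from dimension 0, the right end of the label.

```python
def bit_reverse(x, q):
    """Reverse the q-bit binary label of x."""
    return int(format(x, f"0{q}b")[::-1], 2)

def messages_through_node0(a):
    """Count bit-reversal routes in the (2a+1)-cube, from sources 0...0 x_1...x_a, that visit node 0."""
    q = 2 * a + 1
    count = 0
    for x in range(1 << a):               # sources with a + 1 leading zeros
        dst = bit_reverse(x, q)
        node, visited = x, {x}
        for k in range(q):                # dimension order, starting from the right end
            if ((node ^ dst) >> k) & 1:
                node ^= 1 << k
                visited.add(node)
        if 0 in visited:
            count += 1
    return count
```

For a = 3 (a 7-cube with p = 128 nodes), all 2^3 = 8 routes pass through node 0.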

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 36

Wormhole Routing on a Hypercube
Good/bad routing problems are good/bad for wormhole routing as well
Dimension-order routing is deadlock-free

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 37

14.5 Broadcasting on a Hypercube
Flooding: applicable to any network with all-port communication
00000 Source node
00001, 00010, 00100, 01000, 10000 Neighbors of source
00011, 00101, 01001, 10001, 00110, 01010, 10010, 01100, 10100, 11000 Distance-2 nodes
00111, 01011, 10011, 01101, 10101, 11001, 01110, 10110, 11010, 11100 Distance-3 nodes
01111, 10111, 11011, 11101, 11110 Distance-4 nodes
11111 Distance-5 node

Binomial broadcast tree with single-port communication

Fig. 14.9 The binomial broadcast tree for a 5-cube (root 00000; time steps 0 to 5; the source first sends to 10000, then 00000 and 10000 send to 01000 and 11000, and so on down to the dimension-0 neighbors).
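With single-port communication, the binomial tree doubles the number of informed nodes in every step. The sketch below generates one such schedule, sending along the highest dimension first as the node labels in the figure suggest. The function name and the schedule format are assumptions.

```python
def binomial_broadcast(q, source=0):
    """Return, per time step, the list of (sender, receiver) pairs in a q-cube broadcast."""
    have = {source}
    schedule = []
    for step in range(q):
        k = q - 1 - step                          # highest dimension first
        sends = [(x, x ^ (1 << k)) for x in sorted(have)]
        have |= {receiver for _, receiver in sends}
        schedule.append(sends)
    return schedule

schedule = binomial_broadcast(5)
```

In the 5-cube, step 0 is the single transfer from 00000 to 10000, step 1 reaches 01000 and 11000, and after 5 steps all 32 nodes hold the message.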
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 38
Hypercube Broadcasting Algorithms
Fig. 14.10 Three hypercube broadcasting schemes as performed on a 4-cube, for a message consisting of parts A, B, C, and D:
• Binomial-tree scheme (nonpipelined)
• Pipelined binomial-tree scheme
• Johnsson & Ho’s method (to avoid clutter, only A shown)

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 39

14.6 Adaptive and Fault-Tolerant Routing
There are up to q node-disjoint and edge-disjoint shortest paths between
any node pairs in a q-cube
Thus, one can route messages around congested or failed nodes/links
A useful notion for designing adaptive wormhole routing algorithms is
that of virtual communication networks
Each of the two subnetworks in Fig. 14.11 is acyclic
Hence, any routing scheme that begins by using links in subnet 0, at some point switches the path to subnet 1, and from then on remains in subnet 1, is deadlock-free

Fig. 14.11 Partitioning a 3-cube into subnetworks (subnetwork 0 and subnetwork 1) for deadlock-free routing.
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 40
Robustness of the Hypercube
Rich connectivity provides many alternate paths for message routing

The node that is furthest from S is not its diametrically opposite node in the fault-free hypercube

The fault diameter of the q-cube is q + 1.

(Figure: a hypercube with source S, a destination, and three faulty nodes marked X.)

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 41

15 Other Hypercubic Architectures
Learn how the hypercube can be generalized or extended:
• Develop algorithms for our derived architectures
• Compare these architectures based on various criteria

Topics in This Chapter

15.1 Modified and Generalized Hypercubes
15.2 Butterfly and Permutation Networks
15.3 Plus-or-Minus-2^i Network
15.4 The Cube-Connected Cycles Network
15.5 Shuffle and Shuffle-Exchange Networks
15.6 That’s Not All, Folks!

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 42

15.1 Modified and Generalized Hypercubes
Fig. 15.1 Deriving a twisted 3-cube by redirecting two links in a 4-cycle (left: 3-cube and a 4-cycle in it; right: twisted 3-cube).

Diameter is one less than the original hypercube

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 43

Folded Hypercubes
Fig. 15.2 Deriving a folded 3-cube by adding four diametral links (left: a diametral path in the 3-cube; right: folded 3-cube).

Diameter is half that of the original hypercube

Fig. 15.3 Folded 3-cube viewed as 3-cube with a redundant dimension (left: folded 3-cube with dim-0 links removed; right: after rotating by 180 degrees and renaming, diametral links replace dim-0 links).
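The effect of the diametral links on the diameter can be measured with a breadth-first search. In this sketch the folded q-cube is modeled as the q-cube plus a link from each node to its bitwise complement; the function name is an assumption.

```python
from collections import deque

def folded_cube_diameter(q):
    """BFS diameter of the folded q-cube, measured from node 0 (the network is node-symmetric)."""
    mask = (1 << q) - 1
    dist = {0: 0}
    queue = deque([0])
    while queue:
        x = queue.popleft()
        # q ordinary hypercube neighbors plus the diametral neighbor.
        for y in [x ^ (1 << k) for k in range(q)] + [x ^ mask]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return max(dist.values())
```

The search gives diameter 2 for the folded 3-cube and the folded 4-cube and 3 for the folded 5-cube, that is, q/2 rounded up.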

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 44

Generalized Hypercubes

A hypercube is a power or homogeneous product network

q-cube = (o–o)^q; qth power of K_2
Generalized hypercube = qth power of K_r
(node labels are radix-r numbers)
Node x is connected to y iff x and y differ in one digit
Each node has r – 1 dimension-k links

Example: radix-4 generalized hypercube
Node labels are radix-4 numbers x_3 x_2 x_1 x_0 (the figure shows the dimension-0 links)

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 45

15.2 Butterfly and Permutation Networks
Fig. 7.4 Butterfly and wrapped butterfly networks.
  Butterfly: 2^q rows, q + 1 columns (columns 0 to 3 in the 8-row example)
  Wrapped butterfly: 2^q rows, q columns (column 3 is merged with column 0)

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 46

Structure of Butterfly Networks

The 16-row butterfly network (dimensions 0 to 3, columns 0 to 4).

Fig. 15.5 Butterfly network with permuted dimensions (dimension order 1, 0, 2).

Switching these two row pairs converts this to the original butterfly network.
Changing the order of stages in a butterfly is thus equivalent to a relabeling of the rows (in this example, row xyz becomes row xzy)
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 47
Fat Trees
Fig. 15.6 Two representations of a fat tree.

Skinny tree?

Fig. 15.7 Butterfly network redrawn as a fat tree (front view: binary tree; side view: inverted binary tree).

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 48

Butterfly as Multistage Interconnection Network
Fig. 6.9 Example of a multistage memory access network (p processors 0000 to 1111, log_2 p columns of 2-by-2 switches, p memory banks 0000 to 1111).

Fig. 15.8 Butterfly network used to connect modules that are on the same side (log_2 p + 1 columns of 2-by-2 switches).

Generalization of the butterfly network
High-radix or m-ary butterfly, built of m × m switches
Has m^q rows and q + 1 columns (q if wrapped)
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 49
Beneš Network
Fig. 15.9 Beneš network formed from two back-to-back butterflies (processors 000 to 111, 2 log_2 p – 1 columns of 2-by-2 switches, memory banks 000 to 111).

A 2^q-row Beneš network:

Can route any 2^q × 2^q permutation
It is “rearrangeable”

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 50

Routing Paths in a Beneš Network
To which memory modules can we connect proc 4 without rearranging the other paths?

What about proc 6?

Fig. 15.10 Another example of a Beneš network (2^(q+1) inputs, 2^q rows, 2q + 1 columns, 2^(q+1) outputs).
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 51
15.3 Plus-or-Minus-2^i Network

Fig. 15.11 Two representations of the eight-node PM2I network (links ±1, ±2, and ±4).

The hypercube is a subgraph of the PM2I network
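The subgraph claim follows because flipping bit i of x either adds or subtracts 2^i. A brute-force check, written as a sketch with assumed function names:

```python
def pm2i_neighbors(x, q):
    """Neighbors of node x in the 2^q-node PM2I network: x +/- 2^i (mod 2^q)."""
    p = 1 << q
    nbrs = set()
    for i in range(q):
        nbrs.add((x + (1 << i)) % p)
        nbrs.add((x - (1 << i)) % p)
    return nbrs

def hypercube_is_subgraph_of_pm2i(q):
    """True if every hypercube link x -- x XOR 2^k is also a PM2I link."""
    return all((x ^ (1 << k)) in pm2i_neighbors(x, q)
               for x in range(1 << q) for k in range(q))
```

In the eight-node network, node 0 has the five neighbors 1, 2, 4, 6, and 7, because +4 and –4 lead to the same node.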

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 52

Unfolded PM2I Network

Data manipulator network was used in Goodyear MPP, an early SIMD parallel machine.

“Augmented” means that switches in a column are independent, as opposed to all being set to same state (simplified control).

Fig. 15.12 Augmented data manipulator network (2^q rows, q + 1 columns).
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 53
15.4 The Cube-Connected Cycles Network
Fig. 15.13 A wrapped butterfly (left) converted into cube-connected cycles (both drawn with 2^q rows; q columns on the left, q columns/dimensions on the right).
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 54
Another View of The CCC Network
Example of hierarchical substitution to derive a lower-cost network from a basis network

Fig. 15.14 Alternate derivation of CCC from a hypercube.

Replacing each node of a high-dimensional q-cube by a cycle of length q is how CCC was originally proposed
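Under this substitution view, a CCC node can be named by a pair (x, j): cycle x, position j. The sketch below lists its neighbors, assuming that position j on each cycle carries the dimension-j cube link. The function name is an assumption.

```python
def ccc_neighbors(x, j, q):
    """Neighbors of node (x, j) in the CCC derived from a q-cube (q >= 3)."""
    return [
        (x, (j + 1) % q),        # next node on cycle x
        (x, (j - 1) % q),        # previous node on cycle x
        (x ^ (1 << j), j),       # cube link along dimension j
    ]

q = 3
nodes = [(x, j) for x in range(1 << q) for j in range(q)]
```

For q = 3 this gives 3 × 2^3 = 24 nodes, each of degree 3, compared with degree q in the hypercube itself.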

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 55

Emulation of Hypercube Algorithms by CCC
Ascend, descend, and normal algorithms (plot of hypercube dimension, 0 to q–1, against algorithm steps).
CCC node address: Cycle ID = x (2^m bits), Proc ID = y (m bits)
Node (x, j) is communicating along dimension j; after the next rotation, it will be linked to its dimension-(j + 1) neighbor.
Fig. 15.15 CCC emulating a normal hypercube algorithm.
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 56
15.5 Shuffle and Shuffle-Exchange Networks
Panels for the eight nodes 000 to 111: Shuffle, Exchange, Shuffle-Exchange, Alternate Structure, Unshuffle

Fig. 15.16 Shuffle, exchange, and shuffle–exchange connectivities.

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 57
Shuffle-Exchange Network

Nodes 0 to 7; links are labeled S (shuffle) or SE (shuffle-exchange).
Fig. 15.17 Alternate views of an eight-node shuffle–exchange network.

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 58

Routing in Shuffle-Exchange Networks
In the 2^q-node shuffle network, node x = x_(q–1)x_(q–2) . . . x_2x_1x_0
is connected to x_(q–2) . . . x_2x_1x_0x_(q–1) (cyclic left-shift of x)

In the 2^q-node shuffle-exchange network, node x is
additionally connected to x_(q–2) . . . x_2x_1x_0x'_(q–1), where x'_(q–1) is the complement of x_(q–1)

Source        01011011
Destination   11010110
Positions that differ (bit 7 is leftmost): 7, 3, 2, 0
01011011 Shuffle to 10110110 Exchange to 10110111
10110111 Shuffle to 01101111
01101111 Shuffle to 11011110
11011110 Shuffle to 10111101
10111101 Shuffle to 01111011 Exchange to 01111010
01111010 Shuffle to 11110100 Exchange to 11110101

11110101 Shuffle to 11101011

11101011 Shuffle to 11010111 Exchange to 11010110
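
The trace above follows a simple rule: shuffle q times, and after each shuffle use the exchange link if the bit now in the rightmost position disagrees with the corresponding destination bit. A sketch under that reading, with illustrative function names:

```python
def shuffle(x, q):
    """Cyclic left shift of the q-bit label x."""
    mask = (1 << q) - 1
    return ((x << 1) | (x >> (q - 1))) & mask


def route_shuffle_exchange(src, dst, q):
    """Return the list of node labels visited from src to dst."""
    path = [src]
    cur = src
    for k in range(q):
        cur = shuffle(cur, q)
        path.append(cur)
        # The bit now in position 0 started in position q-1-k and will
        # return there after the remaining shuffles.
        target_bit = (dst >> (q - 1 - k)) & 1
        if (cur & 1) != target_bit:
            cur ^= 1  # exchange: complement the least-significant bit
            path.append(cur)
    return path


path = route_shuffle_exchange(0b01011011, 0b11010110, 8)
```

For the example above, the path has 8 shuffle steps and 4 exchange steps, one exchange per differing bit position.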
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 59
Diameter of Shuffle-Exchange Networks
For 2^q-node shuffle-exchange network: D = q = log_2 p, d = 4
With shuffle and exchange links provided separately, as in Fig. 15.18,
the diameter increases to 2q – 1 and node degree reduces to 3

Nodes 0 to 7; exchange links (E) are dotted, shuffle links (S) are solid.
Fig. 15.18 Eight-node network with separate shuffle and exchange links.
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 60
Multistage Shuffle-Exchange Network
Four drawings of the eight-row network: two with q + 1 columns (0 to 3) and two with q columns (0 to 2).
Fig. 15.19 Multistage shuffle–exchange network (omega network) is the same as butterfly network.
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 61
15.6 That’s Not All, Folks!
When q is a power of 2, the 2^q q-node cube-connected cycles network
derived from the q-cube, by replacing each node with a q-node cycle,
is a subgraph of the (q + log_2 q)-cube ⇒ CCC is a pruned hypercube
Other pruning strategies are possible, leading to interesting tradeoffs

Pruning rule (illustrated on nodes 000 to 111):
All dimension-0 links are kept
Even-dimension links are kept in the even subcube
Odd-dimension links are kept in the odd subcube

D = log_2 p + 1
d = (log_2 p + 1)/2
B = p/4

Fig. 15.20 Example of a pruned hypercube.

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 62

Möbius Cubes
Dimension-i neighbor of x = x_(q–1)x_(q–2) ... x_(i+1)x_i ... x_1x_0 is:
x_(q–1)x_(q–2) ... 0x'_i ... x_1x_0 if x_(i+1) = 0 (x_i complemented, as in q-cube)
x_(q–1)x_(q–2) ... 1x'_i ... x'_1x'_0 if x_(i+1) = 1 (x_i and bits to its right complemented)
Here x'_k denotes the complement of bit x_k.
For dimension q – 1, since there is no x_q, the neighbor can be defined
in two possible ways, leading to 0- and 1-Möbius cubes
A Möbius cube has a diameter of about 1/2 and an average internode
distance of about 2/3 of that of a hypercube
Fig. 15.21 Two 8-node Möbius cubes (0-Möbius cube and 1-Möbius cube).
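
The neighbor rule can be written as a small function. This is a sketch under the definition given on this slide; the parameter `kind` (0 or 1) is a name chosen here to stand in for the missing bit x_q at dimension q – 1.

```python
def mobius_neighbor(x, i, q, kind):
    """Dimension-i neighbor of node x in a q-dimensional kind-Möbius cube."""
    upper = kind if i == q - 1 else (x >> (i + 1)) & 1
    if upper == 0:
        return x ^ (1 << i)            # complement bit i only
    return x ^ ((1 << (i + 1)) - 1)    # complement bits i, i-1, ..., 0


def mobius_neighbors(x, q, kind):
    """All q neighbors of node x, ordered by dimension."""
    return [mobius_neighbor(x, i, q, kind) for i in range(q)]
```

Because the controlling bit x_(i+1) is never changed by a dimension-i move, applying the same move twice returns to the starting node, so the links are bidirectional.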
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 63
16 A Sampler of Other Networks
Complete the picture of the “sea of interconnection networks”:
• Examples of composite, hybrid, and multilevel networks
• Notions of network performance and cost-effectiveness

Topics in This Chapter

16.1 Performance Parameters for Networks
16.2 Star and Pancake Networks
16.3 Ring-Based Networks
16.4 Composite or Hybrid Networks
16.5 Hierarchical (Multilevel) Networks
16.6 Multistage Interconnection Networks

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 64

16.1 Performance Parameters for Networks
A wide variety of direct
interconnection networks
have been proposed for, or
used in, parallel computers

They differ in topological,

performance, robustness,
and realizability attributes.

Fig. 4.8 (expanded)

The sea of direct
interconnection networks.

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 65

Diameter and Average Distance
Diameter D (indicator of worst-case message latency)
Routing diameter D(R); based on routing algorithm R
Average internode distance Δ (indicator of average-case latency)
Routing average internode distance Δ(R)

For the 3 × 3 mesh (nodes P0 to P8):
Sum of distances from corner node: 2 × 1 + 3 × 2 + 2 × 3 + 1 × 4 = 18
Sum of distances from side node: 3 × 1 + 3 × 2 + 2 × 3 = 15
Sum of distances from center node: 4 × 1 + 4 × 2 = 12
Δ = (4 × 18 + 4 × 15 + 12) / (9 × 8) = 2 [or 144 / 81 = 16 / 9]

For the 3 × 3 torus:
Average distance: Δ = (4 × 1 + 4 × 2) / 8 = 1.5 [or 12 / 9 = 4 / 3]

Finding the average internode distance of a 3 × 3 mesh.
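
The same numbers can be reproduced by breadth-first search. The sketch below averages over ordered pairs of distinct nodes (the p(p – 1) convention); the function names are illustrative.

```python
from collections import deque
from itertools import product


def grid_neighbors(node, rows, cols, wrap):
    """Neighbors in a rows x cols mesh (wrap=False) or torus (wrap=True)."""
    r, c = node
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if wrap:
            yield (nr % rows, nc % cols)
        elif 0 <= nr < rows and 0 <= nc < cols:
            yield (nr, nc)


def distance_sum(src, rows, cols, wrap):
    """Sum of shortest-path distances from src to all other nodes."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in grid_neighbors(u, rows, cols, wrap):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return sum(dist.values())


def average_distance(rows, cols, wrap):
    """Average internode distance over ordered pairs of distinct nodes."""
    nodes = list(product(range(rows), range(cols)))
    p = len(nodes)
    total = sum(distance_sum(s, rows, cols, wrap) for s in nodes)
    return total / (p * (p - 1))
```

Dividing by p^2 instead of p(p – 1) gives the bracketed alternatives, 16/9 and 4/3.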

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 66
Bisection Width

Indicator of random communication capacity

Node bisection and link bisection

Hard to determine; intuition can be very misleading

Fig. 16.2 A network (32 nodes, numbered 0 to 31) whose bisection width is not as large as it appears.
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 67
Determining the Bisection Width
Establish upper bound by taking a number of trial cuts.
Then, try to match the upper bound by a lower bound.

Establishing a lower bound on B:
Embed K_p into p-node network
Let c be the maximum congestion
B ≥ ⌊p^2/4⌋/c

Example: An embedding of K9 into 3 × 3 mesh (two mesh links in the figure are labeled 7)
Bisection width of K9 = 4 × 5 = 20
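
As a hedged illustration of the two-sided approach, the sketch below finds the exact bisection width of the 3 × 3 mesh by brute force over all 4/5 splits and compares it with the congestion-based lower bound. The congestion value c = 7 is an assumption read off the figure labels, and the function names are not from the slides.

```python
from itertools import combinations


def mesh_edges(rows, cols):
    """Edge list of a rows x cols mesh with nodes numbered row by row."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            v = r * cols + c
            if c + 1 < cols:
                edges.append((v, v + 1))
            if r + 1 < rows:
                edges.append((v, v + cols))
    return edges


def bisection_width(p, edges):
    """Minimum number of cut edges over all splits into floor(p/2) and ceil(p/2) nodes."""
    best = len(edges)
    for half in combinations(range(p), p // 2):
        side = set(half)
        cut = sum((u in side) != (v in side) for u, v in edges)
        best = min(best, cut)
    return best


def congestion_lower_bound(p, c):
    """Lower bound ceil(floor(p^2 / 4) / c) from an embedding of K_p with congestion c."""
    kp_bisection = (p * p) // 4
    return -(-kp_bisection // c)  # integer ceiling


exact = bisection_width(9, mesh_edges(3, 3))
bound = congestion_lower_bound(9, 7)  # c = 7 is assumed from the figure
```

Brute force is only practical for very small networks, which is why matching trial cuts against an embedding-based bound is the usual route.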

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 68

Degree-Diameter Relationship
Age-old question: What is the best way to interconnect p nodes
of degree d to minimize the diameter D of the resulting network?
Alternatively: Given a desired diameter D and nodes of degree d,
what is the max number of nodes p that can be accommodated?

Moore bounds (digraphs)
Counting outward from node x: d nodes at distance 1, d^2 nodes at distance 2, and so on
p ≤ 1 + d + d^2 + . . . + d^D = (d^(D+1) – 1)/(d – 1)
D ≥ log_d [p(d – 1) + 1] – 1
Only ring and K_p match these bounds

Moore bounds (undirected graphs)
Counting outward from node x: d nodes at distance 1, d(d – 1) nodes at distance 2, and so on
p ≤ 1 + d + d(d – 1) + . . . + d(d – 1)^(D–1)
= 1 + d [(d – 1)^D – 1]/(d – 2)
D ≥ log_(d–1)[(p – 1)(d – 2)/d + 1]
Only ring with odd size p and a few other
networks match these bounds
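
A small sketch that evaluates both bounds by direct summation, which avoids the division by d – 2 when d = 2. The function names are chosen here for illustration.

```python
def moore_bound_directed(d, D):
    """Max nodes in a digraph with out-degree d and diameter D: sum of d^k, k = 0..D."""
    return sum(d ** k for k in range(D + 1))


def moore_bound_undirected(d, D):
    """Max nodes in an undirected graph with degree d and diameter D."""
    return 1 + sum(d * (d - 1) ** (k - 1) for k in range(1, D + 1))
```

For example, `moore_bound_undirected(3, 2)` evaluates 1 + 3 + 6, the 10-node limit for degree 3 and diameter 2.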
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 69
Moore Graphs
A Moore graph matches the bounds on diameter and number of nodes.

For d = 2, we have p ≤ 2D + 1
Odd-sized ring satisfies this bound

For d = 3, we have p ≤ 3 × 2^D – 2
D = 1 leads to p ≤ 4 (K4 satisfies the bound)
D = 2 leads to p ≤ 10 and the first nontrivial example (Petersen graph)

Node labels in the figure: 11010, 00111, 10101, 01101, 11001, 10110, 01011, 01110, 11100, 10011
Fig. 16.1 The 10-node Petersen graph.
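
That the Petersen graph meets the bound for d = 3, D = 2 can be confirmed numerically. The sketch below uses the standard Kneser construction (nodes are the 2-element subsets of a 5-element set, adjacent when disjoint); this labeling is an assumption made for the example and is not necessarily the labeling used in the figure.

```python
from collections import deque
from itertools import combinations

# Nodes: all 2-element subsets of {0, 1, 2, 3, 4}; edges join disjoint subsets.
nodes = [frozenset(c) for c in combinations(range(5), 2)]
adj = {u: [v for v in nodes if not (u & v)] for u in nodes}


def eccentricity(src):
    """Largest shortest-path distance from src, found by BFS."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


diameter = max(eccentricity(u) for u in nodes)
degrees = {len(adj[u]) for u in nodes}
```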

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 70
How Good Are Meshes and Hypercubes?
For d = 4, we have D ≥ log_3[(p + 1)/2]
So, 2D mesh and torus networks are far from optimal in diameter,
whereas butterfly is asymptotically optimal within a constant factor

For d = log_2 p (as for d-cube), we have D = Ω(d / log d)

So the diameter d of a d-cube is a factor of log d over the best possible
We will see that star graphs match this bound asymptotically

Summary:
For node degree d, Moore’s bounds establish the lowest possible
diameter D that we can hope to achieve with p nodes, or the largest
number p of nodes that we can hope to accommodate for a given D.
Coming within a constant factor of the bound is usually good enough;
the smaller the constant factor, the better.

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 71

Layout Area and Longest Wire
The VLSI layout area required by an interconnection network is
intimately related to its bisection width B

If B wires must cross the bisection in 2D layout
of a network and wire separation is 1 unit, the
smallest dimension of the VLSI chip will be ≥ B
The chip area will thus be Ω(B^2) units
p-node 2D mesh needs O(p) area
p-node hypercube needs at least Ω(p^2) area

The longest wire required in VLSI layout affects network performance

For example, any 2D layout of a p-node hypercube requires wires of
length Ω((p / log p)^(1/2)); wire length of a mesh does not grow with size
When wire length grows with size, the per-node performance is bound
to degrade for larger systems, thus implying sublinear speedup

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 72

Measures of Network Cost-Effectiveness
Composite measures, that take both the network performance and its
implementation cost into account, are useful in comparisons

One such measure is the degree-diameter product, dD

Mesh / torus: Θ(p^(1/2))
Binary tree: Θ(log p)
Pyramid: Θ(log p)
Hypercube: Θ(log^2 p)
Not quite similar in cost-performance
However, this measure is somewhat misleading, as the node degree d
is not an accurate measure of cost; e.g., VLSI layout area also
depends on wire lengths and wiring pattern, and bus-based systems
have low node degrees and diameters without necessarily being
cost-effective
Robustness must be taken into account in any practical comparison of
interconnection networks (e.g., tree is not as attractive in this regard)

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 73

16.2 Star and Pancake Networks
Has p = q! nodes
Each node labeled with a string x_1x_2 ... x_q which is a permutation of {1, 2, ... , q}
Node x_1x_2 ... x_i ... x_q is connected to x_ix_2 ... x_1 ... x_q for each i (note that x_1 and x_i are interchanged)
When the i th symbol is switched with x_1, the corresponding link is called a dimension-i link

Fig. 16.3 The four-dimensional star graph (24 nodes, 1234 through 4321; links of dimensions 2, 3, and 4).

d = q – 1; D = ⌊3(q – 1)/2⌋
D, d = O(log p / log log p)
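
A minimal sketch of the star-graph definition, with a breadth-first search to measure the diameter for small q. Nodes are represented as tuples of symbols; the function names are illustrative.

```python
from collections import deque


def star_neighbors(node):
    """Swap the first symbol with the symbol in position i, for i = 2..q."""
    out = []
    for i in range(1, len(node)):
        lst = list(node)
        lst[0], lst[i] = lst[i], lst[0]
        out.append(tuple(lst))
    return out


def star_diameter(q):
    """Return (diameter, node count) of the q-dimensional star graph via BFS.

    The star graph is vertex-symmetric, so one BFS from any node suffices.
    """
    start = tuple(range(1, q + 1))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in star_neighbors(u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values()), len(dist)
```

For q = 4 the neighbors of 1234 are 2134, 3214, and 4231, matching the figure.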
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 74
Routing in the Star Graph
Source node                 1 5 4 3 6 2
Dimension-2 link to         5 1 4 3 6 2
Dimension-6 link to         2 1 4 3 6 5
Last symbol now adjusted
Dimension-2 link to         1 2 4 3 6 5
Dimension-5 link to         6 2 4 3 1 5
Last 2 symbols now adjusted
Dimension-2 link to         2 6 4 3 1 5
Dimension-4 link to         3 6 4 2 1 5
Last 3 symbols now adjusted
Dimension-2 link to         6 3 4 2 1 5
Dimension-3 link to         4 3 6 2 1 5
Last 4 symbols now adjusted
Dimension-2 link (Dest’n)   3 4 6 2 1 5

We need a maximum of two routing steps per symbol, except that last
two symbols need at most 1 step for adjustment ⇒ D ≤ 2q – 3
The diameter of star is in fact somewhat less: D = ⌊3(q – 1)/2⌋
Clearly, this is not a shortest-path routing algorithm.
Correction to text, p. 328: diameter is not 2q – 3
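
The trace above can be reproduced with a short routine that fixes symbols from the last position backward: bring the needed symbol to the front (if it is not already there), then swap it into place. This is a sketch of that reading of the slide, with a hypothetical function name.

```python
def star_route(src, dst):
    """Return a list of (dimension, node) hops from src to dst in a star graph."""
    cur = list(src)
    hops = []
    for i in range(len(cur) - 1, 0, -1):   # 0-based index i is position i + 1
        if cur[i] == dst[i]:
            continue
        j = cur.index(dst[i])              # where the needed symbol currently sits
        if j != 0:
            cur[0], cur[j] = cur[j], cur[0]
            hops.append((j + 1, tuple(cur)))
        cur[0], cur[i] = cur[i], cur[0]
        hops.append((i + 1, tuple(cur)))
    return hops


hops = star_route((1, 5, 4, 3, 6, 2), (3, 4, 6, 2, 1, 5))
dims = [d for d, _ in hops]
```

For this source and destination the routine takes 9 hops, which equals 2q – 3 for q = 6.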
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 75
Star’s Sublogarithmic Degree and Diameter
d = Θ(q) and D = Θ(q); but how is q related to the number p of nodes?

p = q! ≈ e^(–q) q^q (2πq)^(1/2) [using Stirling’s approximation to q!]

ln p ≈ –q + (q + 1/2) ln q + ln(2π)/2 = Θ(q log q) or q = Θ(log p / log log p)

Hence, node degree and diameter are sublogarithmic

Star graph is asymptotically optimal to within a constant factor with regard
to Moore’s diameter lower bound
Routing on star graphs is simple and reasonably efficient; however,
virtually all other algorithms are more complex than the corresponding
algorithms on hypercubes

Network diameter     4    5    6    7    8     9
Star nodes          24   --  120  720   --  5040
Hypercube nodes     16   32   64  128  256   512
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 76
The Star-Connected Cycles Network
Replace degree-(q – 1) nodes with (q – 1)-cycles

This leads to a scalable version of the star graph whose node degree of 3 does not grow with size

The diameter of SCC is about the same as that of a comparably sized CCC network

However, routing and other algorithms for SCC are more complex

Fig. 16.4 The four-dimensional star-connected cycles network (node 1234 becomes the cycle 1234,2; 1234,3; 1234,4).
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 77
Pancake Networks
Similar to star networks in terms of node degree and diameter
Dimension-i neighbor obtained by “flipping” the first i symbols;
hence, the name “pancake”
Example from the figure: node 1234 has neighbors 2134 (Dim 2), 3214 (Dim 3), and 4321 (Dim 4)

We need two flips per symbol in the worst case; D ≤ 2q – 3

Source node                 1 5 4 3 6 2
Dimension-2 link to         5 1 4 3 6 2
Dimension-6 link to         2 6 3 4 1 5
Last 2 symbols now adjusted
Dimension-4 link to         4 3 6 2 1 5
Last 4 symbols now adjusted
Dimension-2 link (Dest’n)   3 4 6 2 1 5
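
The flip operation is a prefix reversal, so the routing trace can be replayed in a few lines. This sketch only replays the dimensions listed above; it does not choose them. Names are illustrative.

```python
def pancake_flip(node, i):
    """Dimension-i neighbor: reverse the first i symbols of the label."""
    return node[:i][::-1] + node[i:]


def replay(node, dims):
    """Apply a sequence of dimension-i flips and return every node visited."""
    visited = [node]
    for i in dims:
        node = pancake_flip(node, i)
        visited.append(node)
    return visited


trace = replay((1, 5, 4, 3, 6, 2), [2, 6, 4, 2])
```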
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 78
Cayley Networks
Group:
A semigroup with an identity element and inverses for all elements.
Example 1: Integers with the addition operator form a group (with multiplication they do not, because most integers have no multiplicative inverse).
Example 2: Permutations, with the composition operator, form a group.

Elements of S are “generators” of G if every element of G can be expressed as a finite product of their powers

Cayley graph:
Node labels are from a group G, and a subset S of G defines the connectivity via the group operator ∗
Node x is connected to node y iff x ∗ γ = y for some γ ∈ S
In the figure, node x is linked through generators Gen 1, Gen 2, Gen 3 to nodes x ∗ γ1, x ∗ γ2, x ∗ γ3

Star and pancake networks are instances of Cayley graphs
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 79
Star as a Cayley Network
Four-dimensional star:

Group G of the permutations of {1, 2, 3, 4}

The generators are the following permutations:
(1 2) (3) (4)
(1 3) (2) (4)
(1 4) (2) (3)

The identity element is:
(1) (2) (3) (4)

In the figure, the link from 1234 to 4231 is labeled (1 4) (2) (3), and a dimension-3 link is labeled (1 3) (2) (4).

Fig. 16.3 The four-dimensional star graph.
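
A sketch of the Cayley-graph view, assuming each generator acts on positions (so x ∗ γ rearranges the symbols of x). The generators are written in 0-based one-line form, and the function names are illustrative.

```python
def compose(x, g):
    """x * g: position k of the result holds the symbol of x at position g[k]."""
    return tuple(x[k] for k in g)


# 0-based one-line forms of (1 2)(3)(4), (1 3)(2)(4), (1 4)(2)(3)
generators = [(1, 0, 2, 3), (2, 1, 0, 3), (3, 1, 2, 0)]


def cayley_graph(start, gens):
    """Adjacency lists of the Cayley graph reachable from start."""
    adj = {}
    frontier = [start]
    while frontier:
        x = frontier.pop()
        if x in adj:
            continue
        adj[x] = [compose(x, g) for g in gens]
        frontier.extend(adj[x])
    return adj


star4 = cayley_graph((1, 2, 3, 4), generators)
```

All 24 permutations are reached, which is what it means for the three transpositions to generate the group.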

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 80

16.3 Ring-Based Networks
Rings are simple, but have low performance and lack robustness

Hence, a variety of multilevel and augmented ring networks have been proposed

Fig. 16.5 A 64-node ring-of-rings architecture composed of eight 8-node local rings and one second-level ring.
Figure labels: local ring, remote ring, message source S, message destination D.
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 81
Chordal Ring Networks
Routing algorithm: Greedy routing
Given one chord type s, the optimal length for s is approximately p^(1/2)
General node connectivity: node v is linked to v – s_i and v + s_i for each skip s_0 = 1, s_1, . . . , s_(k–1)
Fig. 16.6 Unidirectional ring, two chordal rings, and node connectivity in general.
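
A minimal sketch of greedy routing, assuming forward-only (unidirectional) links and a skip list that includes 1: at each node, take the longest skip that does not overshoot the destination. The function name and the example skip values are illustrative.

```python
def greedy_route(src, dst, p, skips):
    """Nodes visited after src when routing greedily on a p-node chordal ring."""
    hops = []
    cur = src
    while cur != dst:
        remaining = (dst - cur) % p
        step = max(s for s in skips if s <= remaining)  # longest skip that fits
        cur = (cur + step) % p
        hops.append(cur)
    return hops


# Worst case over all destinations for p = 64 with skips 1 and 8 (about p^(1/2)).
worst = max(len(greedy_route(0, dst, 64, [1, 8])) for dst in range(1, 64))
```

With skip 8 on 64 nodes, a destination at distance 63 needs seven long skips followed by seven unit steps.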

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 82
Chordal Rings Compared to Torus Networks
The ILLIAC IV interconnection scheme, often described as 8 × 8 mesh or torus, was really a 64-node chordal ring with skip distance 8.
Fig. 16.7 Chordal rings redrawn to show their similarity to torus networks.

Perfect Difference Networks

A class of chordal rings, recently studied at UCSB (two-part paper
in IEEE TPDS, August 2005), has a diameter of D = 2
Perfect difference set {0, 1, 3}: All numbers in the range 1-6 mod 7
can be formed as the difference of two numbers in the set.
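
Both claims can be checked for the 7-node example. The sketch assumes that the network links node v to v ± s mod 7 for each nonzero s in the set; that connectivity rule and the function names are assumptions made for illustration.

```python
from itertools import permutations


def is_perfect_difference_set(s, n):
    """True if every value 1..n-1 occurs exactly once as (a - b) mod n."""
    diffs = sorted((a - b) % n for a, b in permutations(s, 2))
    return diffs == list(range(1, n))


def circulant_eccentricity(skips, n):
    """Largest distance from node 0 when v is linked to v + s and v - s (mod n)."""
    dist = {0: 0}
    frontier = [0]
    while frontier:
        nxt = []
        for v in frontier:
            for s in skips:
                for w in ((v + s) % n, (v - s) % n):
                    if w not in dist:
                        dist[w] = dist[v] + 1
                        nxt.append(w)
        frontier = nxt
    return max(dist.values())
```

Since the network is the same from every node's point of view, the eccentricity of node 0 equals the diameter.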
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 83
Periodically Regular Chordal Rings
Nodes are divided into p/g groups of g nodes each:
Group 0: nodes 0 to g–1
Group 1: nodes g to 2g–1
Group 2: nodes 2g to 3g–1
Group i: nodes ig to (i+1)g–1
Group p/g–1: nodes p–g to p–1
A skip link leads to the same relative position in the destination group
Skip types shown in the example: s0, s1, s2
Fig. 16.8 Periodically regular chordal ring.

Modified greedy routing: first route to the head of a group;
then use pure greedy routing

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 84

Some Properties of PRC Rings

Fig. 16.9 VLSI layout for a 64-node periodically regular chordal ring.
Figure labels: “No skip in this dimension”, “Dimension 1”, s4 = 16
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 85
16.4 Composite or Hybrid Networks

Motivation: Combine the connectivity schemes from two (or more) “pure” networks in order to:
• Achieve some advantages from each structure
• Derive network sizes that are otherwise unavailable
• Realize any number of performance / cost benefits

A very large set of combinations have been tried
New combinations are still being discovered

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 86

Composition by Cartesian Product Operation
Properties of product graph G = G′ × G″:
Nodes labeled (x′, x″), x′ ∈ V′, x″ ∈ V″
p = p′p″
d = d′ + d″
D = D′ + D″
Δ = Δ′ + Δ″
Routing: G′-first
(x′, x″) → (y′, x″) → (y′, y″)
Broadcasting
Semigroup & parallel prefix computations
Example in the figure: nodes {0, 1, 2} × nodes {a, b} = 3-by-2 torus with nodes 0a, 0b, 1a, 1b, 2a, 2b
Fig. 13.4 Examples of product graphs.
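
The size, degree, and diameter relations can be checked on the small 3-by-2 example. This is a sketch with graphs stored as adjacency dictionaries; the function and variable names are chosen here for illustration.

```python
from collections import deque


def cartesian_product(adj1, adj2):
    """Adjacency lists of the product graph: move in one factor, hold the other."""
    return {
        (u, v): [(u2, v) for u2 in adj1[u]] + [(u, v2) for v2 in adj2[v]]
        for u in adj1
        for v in adj2
    }


def diameter(adj):
    """Largest shortest-path distance over all node pairs, by repeated BFS."""
    worst = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        worst = max(worst, max(dist.values()))
    return worst


ring3 = {0: [1, 2], 1: [0, 2], 2: [0, 1]}   # 3-node ring
link2 = {"a": ["b"], "b": ["a"]}            # 2 nodes joined by one link
prod = cartesian_product(ring3, link2)
```

Here p = 3 × 2, d = 2 + 1, and D = 1 + 1, in line with the product rules listed above.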
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 87
Other Properties and Examples of Product Graphs
If G′ and G″ are Hamiltonian, then the p′ × p″ torus is a subgraph of G
For results on connectivity and fault diameter, see [Day00], [AlAy02]

Mesh of trees (Section 12.6) Product of two trees

Fig. 16.11 Mesh of trees compared with mesh-connected trees.

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 88
16.5 Hierarchical (Multilevel) Networks
We have already seen several examples of hierarchical networks:
multilevel buses (Fig. 4.9); CCC; PRC rings
Fig. 16.13 Hierarchical or multilevel bus network.
Can be defined from the bottom up or from the top down
Bottom up: Take first-level ring networks and interconnect them as a hypercube
Top down: Take a top-level hypercube and replace its nodes with given networks
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 89
Example: Mesh of Meshes Networks

The same idea can be used to form ring of rings, hypercube of hypercubes, complete graph of complete graphs, and more generally, X of Xs networks

When network topologies at the two levels are different, we have X of Ys networks

Generalizable to three levels (X of Ys of Zs networks), four levels, or more

Fig. 16.12 The mesh of meshes network exhibits greater modularity than a mesh. (Figure labels: N, W, E, S)
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 90
Example: Swapped Networks
Build a p^2-node network using p-node building blocks (nuclei or clusters)
by connecting node i in cluster j to node j in cluster i
Also known in the literature as OTIS (optical transpose interconnect system) network
Node address: (Cluster #, Node #); the example uses 2-bit fields 00, 10, 01, 11
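
A minimal sketch of the swap rule, assuming a 4-node nucleus wired as a 2-cube (the nucleus topology is an assumption for the example; the slide only fixes the intercluster rule). Names are illustrative.

```python
def swapped_neighbors(cluster, node, nucleus_adj):
    """Neighbors of (cluster, node): nucleus links plus at most one swap link."""
    nbrs = [(cluster, v) for v in nucleus_adj[node]]
    if cluster != node:
        nbrs.append((node, cluster))   # node i in cluster j <-> node j in cluster i
    return nbrs


# Hypothetical nucleus: 4 nodes connected as a 2-cube.
nucleus = {0: [1, 2], 1: [0, 3], 2: [3, 0], 3: [2, 1]}
all_nodes = [(c, v) for c in nucleus for v in nucleus]
```

Nodes whose cluster number equals their node number have no intercluster link, so the network is not quite regular.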

Fig. 4.8 (modified)

The sea of indirect
interconnection networks.
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 92
Self-Routing Permutation Networks
Do there exist self-routing permutation networks? (The butterfly network
is self-routing, but it is not a permutation network)

Permutation routing through a MIN is the same problem as sorting

Input      After sort by MSB   After sort by the middle bit   After sort by LSB
7 (111)    0                   0                              0 (000)
0 (000)    1                   1                              1 (001)
4 (100)    3                   3                              2 (010)
6 (110)    2                   2                              3 (011)
1 (001)    7                   4                              4 (100)
5 (101)    4                   5                              5 (101)
3 (011)    6                   7                              6 (110)
2 (010)    5                   6                              7 (111)
Fig. 16.14 Example of sorting on a binary radix sort network.
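
The stage-by-stage behavior in the example can be sketched in software as a stable split on one bit per stage, applied within blocks that halve in size from stage to stage. This models only the data movement, not the switch hardware, and assumes the input is a permutation of 0 to 2^bits – 1; the function name is illustrative.

```python
def radix_sort_stages(keys, bits):
    """Return the key order at the input and after each one-bit sorting stage."""
    stages = [list(keys)]
    cur = list(keys)
    for b in range(bits - 1, -1, -1):      # MSB first
        block = 1 << (b + 1)               # stage works within blocks of 2^(b+1) keys
        nxt = []
        for start in range(0, len(cur), block):
            seg = cur[start:start + block]
            zeros = [k for k in seg if not (k >> b) & 1]
            ones = [k for k in seg if (k >> b) & 1]
            nxt.extend(zeros + ones)       # stable: keys with bit 0 go first
        cur = nxt
        stages.append(cur)
    return stages


stages = radix_sort_stages([7, 0, 4, 6, 1, 5, 3, 2], 3)
```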
Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 93
Partial List of Important MINs
Augmented data manipulator (ADM): aka unfolded PM2I (Fig. 15.12)
Banyan: Any MIN with a unique path between any input and any output (e.g. butterfly)
Baseline: Butterfly network with nodes labeled differently
Beneš: Back-to-back butterfly networks, sharing one column (Figs. 15.9-10)
Bidelta: A MIN that is a delta network in either direction
Butterfly: aka unfolded hypercube (Figs. 6.9, 15.4-5)
Data manipulator: Same as ADM, but with switches in a column restricted to same state
Delta: Any MIN for which the outputs of each switch have distinct labels (say 0 and 1
for 2 × 2 switches) and the path label, formed by concatenating the switch output labels
leading from an input to an output, depends only on the output
Flip: Reverse of the omega network (inputs ↔ outputs)
Indirect cube: Same as butterfly or omega
Omega: Multi-stage shuffle-exchange network; isomorphic to butterfly (Fig. 15.19)
Permutation: Any MIN that can realize all permutations
Rearrangeable: Same as permutation network
Reverse baseline: Baseline network, with the roles of inputs and outputs interchanged

Spring 2006 Parallel Processing, Low-Diameter Architectures Slide 94
