Von Neumann Architecture vs. Parallel Processing
The computer architecture Von Neumann devised in the 1940s remains the basis for mainstream computing today. Its core components were the input/output devices, memory, the arithmetic/logic unit, and the control unit. Not much has changed over the years except our ability to make these units run faster. The technology behind these devices has changed, but the theoretical concepts have not. For instance, vacuum tubes have given way to transistors, and manufacturing methods have been refined to make those transistors smaller and smaller. In the last few years, however, the exponential gains we had been achieving by refining our manufacturing methods have tapered off. Improvements of this kind are still being made, but they are modest, nowhere near the rate seen from the 1940s to the 1970s, or even from the 1970s to the 1990s. Because these gains are leveling off quickly, a change was needed. Parallel processing, while not a radical departure from the Von Neumann architecture, allows for additional speedup via multiple processor cores that execute instructions in parallel.
This change in the approach to achieving speedup is exemplified by an excellent quote from the reading: "If you cannot build something to work twice as fast, do two things at once. The results will be identical." (Schneider & Gersting, 2007, p. 226). The results are indeed identical if done correctly, although the added complexity comes at a price. Not only are multiple processors needed, but each typically requires its own cache memory, along with an intricate inter-processor communication system to keep the duplicated cached memory up to date and consistent and to keep instructions appropriately allocated among the processors. Because instructions are no longer executed in a strictly sequential fashion, we must ensure that the instructions we execute out of order can be reordered without changing the result. Instruction order is limited mainly by data dependencies: for example, if instruction A needs to read a memory address that a later instruction B writes to, then B depends on A and cannot be executed before A or in parallel with it. Depending on the code being executed, this can be a major limitation on parallel processing. Techniques such as loop unrolling and register renaming exist for increasing the parallelism available among instructions, but they too come at a cost in complexity. The end result, once all the added complexity has been dealt with, is indeed a faster machine. So the ends here justify the means, but the overall complexity of parallel processor systems may eventually slow or stop the speedup growth we can achieve with them.
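To make the dependence and loop-unrolling ideas concrete, here is a small sketch in C (an illustrative example, not taken from either text; the arrays and loop bounds are invented). The first loop has independent iterations, so unrolling it simply exposes more work that the hardware is free to overlap; the second loop, as written, carries a dependence from one iteration to the next, so its additions must proceed one after another.

    /* Illustrative sketch: independent iterations vs. a loop-carried dependence. */
    #include <stdio.h>

    #define N 1000

    int main(void) {
        double a[N], b[N], c[N];
        for (int i = 0; i < N; i++) { b[i] = i; c[i] = 2.0 * i; }

        /* Independent iterations, unrolled by four: the four additions in the
           body have no data dependencies on one another, so an out-of-order
           or superscalar core may execute them in parallel. */
        for (int i = 0; i < N; i += 4) {
            a[i]     = b[i]     + c[i];
            a[i + 1] = b[i + 1] + c[i + 1];
            a[i + 2] = b[i + 2] + c[i + 2];
            a[i + 3] = b[i + 3] + c[i + 3];
        }

        /* Loop-carried dependence: each iteration reads the sum produced by
           the previous one, so these additions cannot overlap as written. */
        double sum = 0.0;
        for (int i = 0; i < N; i++)
            sum += a[i];

        printf("sum = %f\n", sum);
        return 0;
    }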
The Von Neumann architecture has, for the last 20 years or so, extended its usable lifetime by incorporating execution methods similar in spirit to those used in multi-core machines today. A major limitation of purely sequential processing was the unnecessary idling of system resources. Even if a Von Neumann machine does not have multiple cores and thread-level parallelism, some degree of instruction-level parallelism is achievable through a pipelined execution path: "All processors since about 1985 use pipelining to overlap the execution of instructions and improve performance. This potential overlap among instructions is called instruction level parallelism (ILP), since the instructions can be evaluated in parallel." (Hennessy & Patterson, 2007, p. 66). A key difference between a Von Neumann machine with instruction-level parallelism and a multi-processor machine with thread-level parallelism is that with a single core, instructions are not truly executed simultaneously; rather, their stages are overlapped, since there is only one core performing the execution. Ultimately, the goal of instruction-level parallelism is to reduce the clock cycles per instruction (CPI), and this is done by utilizing system resources more effectively. While the stages of a single instruction must be performed sequentially, the stages of different instructions can overlap one another, and since the different stages require different resources, they can proceed at the same time.
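As a rough, back-of-the-envelope illustration of what reducing CPI buys (an added illustration using the standard performance identity, not a formula quoted from the texts):

    \text{CPU time} = \text{instruction count} \times \text{CPI} \times \text{clock cycle time}

With a k-stage pipeline and no stalls, a new instruction can complete every cycle, so the effective CPI approaches 1 rather than roughly k, and execution time shrinks accordingly for the same instruction count and clock rate.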
The text (Schneider & Gersting, 2007, p. 219) describes these basic phases as the fetch phase, the decode phase, and the execute phase, but when implementing a basic instruction pipeline we generally use five stages, the extra two being memory access and register write-back. Having five stages means that up to five instructions can be in flight at once, each occupying a different stage of the pipeline. A visual aid (from Wikipedia's Parallel Computing article) can help to conceptualize how this works:
[Figure: "A canonical five-stage pipeline in a RISC machine (IF = Instruction Fetch, ID = Instruction Decode, EX = Execute, MEM = Memory access, WB = Register write back)."]
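The following small C sketch (an illustrative example with hypothetical, independent instructions and no stalls modeled) prints a cycle-by-cycle chart of five instructions flowing through these five stages, conveying the same overlap the figure depicts:

    /* Prints which stage each of five independent instructions occupies on
       each clock cycle of a classic five-stage pipeline (no hazards/stalls). */
    #include <stdio.h>

    int main(void) {
        const char *stages[] = {"IF", "ID", "EX", "MEM", "WB"};
        const int num_stages = 5;
        const int num_instrs = 5;
        const int total_cycles = num_instrs + num_stages - 1;  /* 9 cycles */

        printf("cycle:");
        for (int c = 1; c <= total_cycles; c++)
            printf("%5d", c);
        printf("\n");

        for (int i = 0; i < num_instrs; i++) {
            printf("  i%d  ", i + 1);
            for (int c = 1; c <= total_cycles; c++) {
                int stage = (c - 1) - i;      /* stage index in this cycle */
                if (stage >= 0 && stage < num_stages)
                    printf("%5s", stages[stage]);
                else
                    printf("%5s", ".");
            }
            printf("\n");
        }
        return 0;
    }

Run as written, the chart shows all five instructions finishing by cycle 9, rather than the 25 cycles that executing all five stages of each instruction strictly one after another would take.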
While executing instructions in this overlapped, pipelined fashion speeds up execution and helps utilize our system resources, it will not achieve a fivefold decrease in CPI. Since a chain is only as strong as its weakest link, the stage that requires the most time sets the pace: the clock period must be long enough for the slowest stage, and instructions in the other stages must wait for it before advancing.
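A rough, hedged calculation (an added illustration, not taken from the texts) shows why the improvement falls short of the stage count even in the best case. If n instructions flow through a k-stage pipeline with no stalls, the last one finishes after k + (n - 1) cycles, whereas strictly sequential execution would take n times k cycles at the same clock:

    \text{Speedup} = \frac{n \times k}{k + (n - 1)} \rightarrow k \quad \text{as } n \rightarrow \infty

For k = 5 and n = 5 this is 25/9, or roughly 2.8, not 5; and if, say, the memory-access stage takes twice as long as the others, the clock period is set by that stage and the realized speedup drops further. The numbers here are illustrative only.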
In conclusion, the days of the single-processor Von Neumann architecture are coming to an end because the concepts it is based on cannot be refined much further, whether in terms of transistor size, clock frequency, or even clever logic-level tweaks such as pipelining. Short of a breakthrough in one of those areas, the most practical way to achieve significant speedup is through thread-level, multi-core parallel processing. Parallel processing machines have been gaining steadily in popularity over the last 10 years and now dominate the sales racks at Best Buy. The overall architecture behind these machines is nothing revolutionary: they are based on the Von Neumann architecture and vary only in that they have multiple processing cores and require all the circuitry needed for those cores to play nicely with one another. While we wait for quantum computing to come of age, or for some other more drastic change in how computing is done, hopefully we will be able to exploit parallel processing to achieve speedup for years to come.
References:

Hennessy, J. L., & Patterson, D. A. (2007). Computer Architecture: A Quantitative Approach (4th ed.).

Schneider, G. M., & Gersting, J. L. (2007). Invitation to Computer Science (3rd ed., Java version).

Wikipedia: Parallel computing. http://en.wikipedia.org/wiki/Parallel_computing