2-Architecture

Uploaded by

Asdasd Sdas
Copyright
© © All Rights Reserved

Operating Systems

Eric Lo

2 Modern Computer Architecture


Overview, Glossary, and Revision
Basic Computer Architecture

http://faculty.ycp.edu/~dhovemey/spring2006/cs101/lecture/lecture1.html
https://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/1_Introduction.html
Inside a CPU -- Registers

Ref: Teach ICT

Inside a CPU
• Program Counter:
• A register that contains the memory address of the next instruction
• Instruction Cycle (Fetch-Decode-Execute-MemoryAccess-WriteBack)
– “Fetch”: fetch the next instruction from the memory
– “Decode”: prepare the registers needed by the next step
– E.g., the instruction ADD a, b will
» Load the content of a into register 1
» Load the content of b into register 2
– “Execute”: carry out the computation, or compute the memory address
– “MemoryAccess”: read from or write to memory
• Only for LOAD/STORE instructions
– “WriteBack”: write the result back to a register
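The five phases above can be sketched as a toy interpreter. This is only a minimal sketch and not how real hardware is built: the four-instruction ISA, the struct layouts, and the names step and run are hypothetical, and there is no bounds checking.

```c
#include <stdint.h>

/* Hypothetical toy ISA, invented for this sketch only. */
enum opcode { OP_HALT, OP_LOAD, OP_STORE, OP_ADD };

struct instr {
    enum opcode op;
    int rd;   /* destination register (LOAD, ADD) or source register (STORE) */
    int rs;   /* second operand register (ADD only, otherwise 0) */
    int addr; /* memory address (LOAD, STORE only) */
};

struct cpu {
    int pc;          /* program counter: index of the next instruction */
    int32_t reg[4];  /* general-purpose registers */
    int32_t mem[16]; /* data memory */
};

/* Runs one instruction cycle. Returns 0 on HALT, 1 otherwise. */
int step(struct cpu *c, const struct instr *program)
{
    /* Fetch: read the instruction the program counter points at. */
    struct instr in = program[c->pc];
    c->pc++;

    /* Decode: read the operand registers. */
    int32_t a = c->reg[in.rd];
    int32_t b = c->reg[in.rs];
    int32_t result = 0;

    /* Execute: carry out the computation. */
    switch (in.op) {
    case OP_HALT:  return 0;
    case OP_ADD:   result = a + b; break;
    case OP_LOAD:  break; /* the address is already in the instruction */
    case OP_STORE: break;
    }

    /* MemoryAccess: only LOAD and STORE touch memory. */
    if (in.op == OP_LOAD)
        result = c->mem[in.addr];
    if (in.op == OP_STORE)
        c->mem[in.addr] = a;

    /* WriteBack: LOAD and ADD write their result to a register. */
    if (in.op == OP_LOAD || in.op == OP_ADD)
        c->reg[in.rd] = result;
    return 1;
}

/* Runs until HALT; returns the number of instructions completed before it. */
int run(struct cpu *c, const struct instr *program)
{
    int count = 0;
    while (step(c, program))
        count++;
    return count;
}
```

A program that loads mem[0] and mem[1], adds them, and stores the sum to mem[2] completes four instructions before the HALT is fetched.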

CPU speed = CPU clock rate


E.g., a 1.8 GHz CPU performs 1,800,000,000 clock cycles per second
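The arithmetic behind the clock-rate example can be written out as follows. This is a small sketch; the function names are hypothetical and the results are rounded down to whole units.

```c
#include <stdint.h>

/* Length of one clock cycle in picoseconds, rounded down. */
uint64_t cycle_time_ps(uint64_t clock_hz)
{
    return UINT64_C(1000000000000) / clock_hz;
}

/* Number of whole clock cycles that elapse in the given milliseconds. */
uint64_t cycles_in_ms(uint64_t clock_hz, uint64_t ms)
{
    return clock_hz * ms / 1000;
}
```

At 1.8 GHz one cycle lasts roughly 555 picoseconds, so even a single millisecond spans 1.8 million cycles.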

Sequential vs Scalar (Pipeline)

4 instructions are running in parallel, though in different phases

Ref: lighterra.com. The Memory Access phase is missing from the figure because sequential processors are so old that they had no separate MA phase at that time.
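The benefit of pipelining can be estimated with a simple cycle count. The sketch below assumes an idealized machine in which every phase takes exactly one clock cycle and the pipeline never stalls; both function names are hypothetical.

```c
/* Cycles needed when each instruction finishes all k phases
 * before the next one starts (sequential processor). */
long sequential_cycles(long n, long k)
{
    return n * k;
}

/* Cycles needed on an ideal k-stage pipeline: the first instruction
 * takes k cycles, then one more instruction finishes every cycle. */
long pipelined_cycles(long n, long k)
{
    return n == 0 ? 0 : k + (n - 1);
}
```

With 4 instructions and 4 phases, the sequential processor needs 16 cycles and the ideal pipeline needs 7. Real pipelines do worse than this bound because of stalls and mispredicted branches.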

Scalar vs Superscalar
• Scalar = pipelining
• Super-scalar = instruction-level parallelism
= multiple pipelines running in parallel, so more than one instruction can complete per clock cycle

Processor Design

RISC vs CISC. Which one is better? Why?


Which one dominates the market? Why?

http://www.markedbyteachers.com

Intel

RF Wireless World

Intel’s Dilemma
• Now almost all compilers generate CISC ISA code

[Figure: machine code, then Decode (CISC -> RISC), then Execute RISC code]

More inside a modern CPU

edux.pjwstk.edu.pl
More inside a modern CPU
• With speculative execution
– Instructions executed != Instructions retired
• OoO execution (More instruction-level parallelism)
– Examines a sliding window of consecutive instructions
• The “instruction window”
– “Ready” instructions that don’t (or no longer) depend on any earlier instruction can be executed first
– https://en.wikipedia.org/wiki/Out-of-order_execution
• Can the compiler do the job?
– The compiler doesn’t have the runtime information
• E.g., it can’t peek ahead past a branch, because the outcome is only known at run time
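The readiness test on the instruction window can be sketched as follows. This is a simplified illustration, not a description of any real CPU: the struct op layout and the function names are hypothetical, and only true (read-after-write) dependencies are checked, on the assumption that register renaming removes the other kinds.

```c
#include <stdbool.h>

#define NO_REG (-1)

/* One entry of the instruction window. */
struct op {
    int dst;        /* register written, or NO_REG */
    int src1, src2; /* registers read, or NO_REG */
    bool done;      /* true once the instruction has executed */
};

/* True if instruction o reads register reg. */
static bool reads(const struct op *o, int reg)
{
    return reg != NO_REG && (o->src1 == reg || o->src2 == reg);
}

/* Instruction i is ready if it has not run yet and no earlier,
 * still-pending instruction in the window produces one of its inputs. */
bool is_ready(const struct op *window, int i)
{
    if (window[i].done)
        return false;
    for (int j = 0; j < i; j++) {
        if (!window[j].done && reads(&window[i], window[j].dst))
            return false;
    }
    return true;
}

/* Number of instructions in the window that could be issued right now. */
int count_ready(const struct op *window, int n)
{
    int ready = 0;
    for (int i = 0; i < n; i++)
        ready += is_ready(window, i);
    return ready;
}
```

In a window holding "r1 = load", "r2 = r1 + r1" and "r3 = r4 + r5", the first and third instructions are ready at once, while the second becomes ready only after the load is done.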

Multi-core + NUMA
• Even more cycles in parallel

Intel Sandy Bridge


Ref: Paul Sim

Memory Hierarchy

Ref. DZone
Why do we have to know all this?
• Random access vs. sequential access time on HDD
• OSes and all other systems have long been optimized to reduce random access on HDD
• “Tape is Dead, Disk is Tape, Flash is Disk, RAM Locality is King.”
• What about
– Random access vs. sequential access time on RAM?
– Random access hurts the CPU’s speculative execution
• => branch mis-prediction => expensive!
• How to achieve “if” without using “if”?
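One common answer to the last question is to replace the conditional jump with arithmetic. The two functions below are a minimal sketch with hypothetical names; the cast from uint32_t back to int32_t relies on the two's-complement behaviour of mainstream compilers.

```c
#include <stdint.h>

/* Returns a if cond is nonzero and b otherwise, without if/else. */
int32_t select_branchless(int cond, int32_t a, int32_t b)
{
    /* mask is all ones when cond != 0 and all zeros otherwise. */
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);
    return (int32_t)(((uint32_t)a & mask) | ((uint32_t)b & ~mask));
}

/* Counts the elements below threshold. The comparison yields 0 or 1,
 * which is added directly instead of guarding count++ with an if. */
int count_below(const int32_t *v, int n, int32_t threshold)
{
    int count = 0;
    for (int i = 0; i < n; i++)
        count += (v[i] < threshold);
    return count;
}
```

Whether this is actually faster depends on the compiler and the data: optimizing compilers often turn a plain if into a conditional move on their own, so it is worth measuring before rewriting code this way.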

Wait… how are these hardware things related to OS?
• The kernel is the piece of software that deals directly with the hardware
• OS/kernel programmers should arguably be writing the most efficient programs on earth
– Efficient in memory usage
• Remember, the PDP-7 only had 9 kB RAM
– Efficient in speed
• If kernel programmers don’t care about speed, who cares more?
– Different implementations for different architectures!
• E.g., Intel Xeon’s cache is inclusive whereas AMD Opteron’s cache is non-inclusive
– When reading X from memory,
» Inclusive: a copy of X is placed in every level from L1 to L3
» Non-inclusive: X is read into L1 directly (L2 and L3 have no copies)
• Kernel programmers need to know this when designing cache replacement policies
• That also explains why…

Different implementations are needed for different architectures

CPU’s Privileged Instructions
• Some instructions just add two values
• Some instructions are privileged
– E.g., set the segmentation boundary of the memory
• The CPU has a 1-bit register that records whether it is currently in user mode or kernel mode
– If (user-mode & this-instruction-is-privileged) then
• generate an “insufficient privilege access” exception
• On an exception / a hardware interrupt
– The CPU goes to a hardcoded memory address to look up the corresponding handler
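The mode check and the handler lookup can be modelled in software as follows. This is only a sketch of the idea, since real CPUs do this in hardware: the struct toy_cpu type, the vector numbering, and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum mode { USER_MODE = 0, KERNEL_MODE = 1 };
enum vector { VEC_PRIVILEGE_FAULT = 0, VEC_COUNT };

struct toy_cpu;
typedef void (*handler_fn)(struct toy_cpu *);

struct toy_cpu {
    enum mode mode;                 /* the 1-bit mode register */
    handler_fn handlers[VEC_COUNT]; /* handler table at a fixed location */
    int faults;                     /* privilege faults handled so far */
};

/* Example kernel handler: it only records that a fault happened. */
void count_fault(struct toy_cpu *cpu)
{
    cpu->faults++;
}

/* On an exception: switch to kernel mode, look up the handler in the
 * table, run it, then restore the mode that was interrupted. */
static void raise_exception(struct toy_cpu *cpu, enum vector v)
{
    enum mode saved = cpu->mode;
    cpu->mode = KERNEL_MODE;
    if (cpu->handlers[v] != NULL)
        cpu->handlers[v](cpu);
    cpu->mode = saved;
}

/* Returns true if the instruction ran, false if it trapped. */
bool execute(struct toy_cpu *cpu, bool instruction_is_privileged)
{
    if (cpu->mode == USER_MODE && instruction_is_privileged) {
        raise_exception(cpu, VEC_PRIVILEGE_FAULT);
        return false;
    }
    return true;
}
```

A privileged instruction issued in user mode never executes; it only triggers the kernel's handler, which decides what happens to the offending program.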

Ref: T. Anderson book
[Figure: physical memory addresses from 000000H to 0003FFH (1024 bytes), in kernel space]
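If the address range above holds the handler table, the location of each entry follows from simple arithmetic. The sketch below assumes 4-byte entries, as in the x86 real-mode interrupt vector table; that entry size and the function names are assumptions, not something the figure states.

```c
#include <stdint.h>

#define TABLE_BASE 0x000000u /* first byte of the table */
#define TABLE_END  0x0003FFu /* last byte of the table */
#define ENTRY_SIZE 4u        /* assumption: 4 bytes per handler entry */

/* Total size of the table in bytes. */
uint32_t table_size_bytes(void)
{
    return TABLE_END - TABLE_BASE + 1u;
}

/* Number of handler entries that fit in the table. */
uint32_t entry_count(void)
{
    return table_size_bytes() / ENTRY_SIZE;
}

/* Address of the entry for a given exception or interrupt number. */
uint32_t entry_address(uint32_t vector)
{
    return TABLE_BASE + vector * ENTRY_SIZE;
}
```

Under this assumption the 1024 bytes hold 256 entries, and the CPU finds the entry for vector n at address 4n without any search.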

Hacking?
• So, can a malicious user write code that accesses other memory regions by using some privileged instruction directly?
– Any normal program must run on top of your OS, and your OS won’t permit your program to set that bit
– So the most powerful attack is to gain physical access to a machine, insert a boot device, and reboot into your own OS…
• But this is not the kind of security we are concerned with

System Calls
• What about a benign program that wants to do something low-level?
– It goes through system calls
• written by OS developers
• exposed for anyone to write programs on

int a_sys_call() {
1-bit register = kernel-mode;
… access the kernel memory
1-bit register = user-mode;
}
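From the user program's side, a system call looks like an ordinary function call. The sketch below assumes a POSIX system and uses the standard write() wrapper; print_message is a hypothetical helper name. The switch to kernel mode and back happens inside the call and is not visible in the C code.

```c
#include <string.h>
#include <unistd.h>

/* Asks the kernel to write msg to standard output.
 * Returns the number of bytes written, or -1 on error. */
ssize_t print_message(const char *msg)
{
    /* write() traps into the kernel, which performs the device access
     * on the program's behalf and then returns to user mode. */
    return write(STDOUT_FILENO, msg, strlen(msg));
}
```

Calling print_message("hello\n") normally returns 6, the number of bytes the kernel accepted.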
