2-Architecture

Operating Systems

Eric Lo

2 Modern Computer Architecture


Overview, Glossary, and Revision
Basic Computer Architecture

http://faculty.ycp.edu/~dhovemey/spring2006/cs101/lecture/lecture1.html
https://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/1_Introduction.html
2
Inside a CPU -- Registers

Ref: Teach ICT

3
Inside a CPU
• Program Counter:
• A register that contains the memory address of the next instruction
• Instruction Cycle (Fetch-Decode-Execute-MemoryAccess-WriteBack)
– “Fetch”: fetch the next instruction from memory
– “Decode”: decode the instruction and prepare the registers for the next step
– E.g., the instruction ADD a, b will
» Load the content of a into register 1
» Load the content of b into register 2
– “Execute”: carry out the computation or compute the memory address
– “MemoryAccess”: read from or write to memory
• Only for LOAD/STORE instructions
– “WriteBack”: write the result back to a register
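
The cycle above can be sketched as a small interpreter loop. This is only an illustration under stated assumptions: the four-opcode instruction set, the Instr struct, and run_toy_cpu are hypothetical names invented here, not any real ISA, and the phases that real hardware separates are collapsed into single C statements.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-opcode toy ISA; not any real instruction set. */
enum { OP_HALT, OP_LOAD, OP_ADD, OP_STORE };

typedef struct { uint8_t op, reg, addr; } Instr;

/* Runs the program until OP_HALT and returns the number of instructions
   executed. `mem` is the data memory, `regs` the register file. */
int run_toy_cpu(const Instr *program, int program_len, int *mem, int *regs) {
    int pc = 0;        /* program counter: index of the next instruction */
    int executed = 0;
    for (;;) {
        assert(pc >= 0 && pc < program_len);
        Instr in = program[pc++];      /* Fetch, then advance the PC */
        executed++;
        switch (in.op) {               /* Decode */
        case OP_LOAD:                  /* MemoryAccess + WriteBack */
            regs[in.reg] = mem[in.addr];
            break;
        case OP_ADD:                   /* Execute + WriteBack into register 0 */
            regs[0] = regs[0] + regs[in.reg];
            break;
        case OP_STORE:                 /* MemoryAccess */
            mem[in.addr] = regs[in.reg];
            break;
        case OP_HALT:
            return executed;
        default:
            assert(!"unknown opcode");
            return executed;
        }
    }
}
```

A program of LOAD, LOAD, ADD, STORE, HALT adds two memory cells and stores the sum in a third, which mirrors the ADD a, b example above.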

CPU speed = CPU clock rate


E.g., a 1.8 GHz CPU performs 1,800,000,000 clock cycles per second

4
Sequential vs Scalar (Pipeline)

4 instructions are running in parallel, though in different phases

Ref: lighterra.com. The MemoryAccess phase is missing from the figure because sequential processors predate it; they had no separate MemoryAccess phase.
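
The gain from overlapping phases can be put into numbers. The sketch below assumes an idealized pipeline (one phase per cycle, no stalls or hazards); the function names are hypothetical.

```c
#include <assert.h>

/* Cycles to run n instructions of `stages` phases each with no overlap
   (sequential processor), one phase per cycle. */
long sequential_cycles(long n, long stages) {
    assert(n >= 0 && stages > 0);
    return n * stages;
}

/* Idealized pipeline with no stalls: the first instruction needs `stages`
   cycles, and each later one completes one cycle after its predecessor. */
long pipelined_cycles(long n, long stages) {
    assert(n >= 0 && stages > 0);
    return n == 0 ? 0 : stages + (n - 1);
}
```

With 4 instructions and 4 phases this gives 16 cycles sequentially versus 7 cycles pipelined. Real pipelines fall short of this whenever an instruction has to wait for an earlier result or a branch is mispredicted.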

5
Scalar vs Superscalar
• Scalar = pipelining
• Superscalar = instruction-level parallelism = multiple instruction cycles in parallel
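
As a rough model of the difference, assume every instruction is independent and the processor can start `width` instructions per cycle. Both the model and the name superscalar_cycles are assumptions made for illustration; real processors rarely sustain their full issue width.

```c
#include <assert.h>

/* Idealized cycle count for n independent instructions on a pipeline with
   `stages` phases that can start `width` instructions per cycle.
   width == 1 is the scalar (plain pipelined) case. */
long superscalar_cycles(long n, long stages, long width) {
    assert(n >= 0 && stages > 0 && width > 0);
    if (n == 0) return 0;
    long issue_groups = (n + width - 1) / width;   /* ceil(n / width) */
    return stages + (issue_groups - 1);
}
```

For 8 instructions on a 5-phase pipeline, a width of 1 takes 12 cycles, a width of 2 takes 8, and a width of 4 takes 6.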

6
Processor Design

RISC vs CISC. Which one is better? Why?


Which one dominates the market? Why?

http://www.markedbyteachers.com

7
Intel

RF Wireless World

8
Intel’s Dilemma
• Now almost all compilers generate CISC ISA code

machine code -> Decode: CISC -> RISC -> Execute RISC code

9
More inside a modern CPU

edux.pjwstk.edu.pl
10
More inside a modern CPU
• With speculative execution
– Instructions executed != Instructions retired
• OoO execution (More instruction-level parallelism)
– Examines a sliding window of consecutive instructions
• The “instruction window”
– “Ready” instructions that don’t (or no longer) depend on
any earlier instructions can be executed first
– https://en.wikipedia.org/wiki/Out-of-order_execution
• Can the compiler do the job?
– The compiler doesn’t have the runtime information
• E.g., it can’t peek ahead if the next instruction is a branch
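
The effect of the instruction window can be sketched with a toy scheduler. Everything here is a simplifying assumption: one instruction issued per cycle, at most one dependency per instruction, fixed latencies, and no speculation or retirement. WinInstr and run_cycles are hypothetical names.

```c
#include <assert.h>

#define MAX_INSTR 64

/* Hypothetical instruction record: waits on at most one earlier
   instruction (`dep`, or -1 for none) and takes `latency` cycles. */
typedef struct { int dep; int latency; } WinInstr;

/* Issues at most one instruction per cycle, picking the oldest ready one
   among the `window` oldest unissued instructions. window == 1 models
   in-order issue. Returns the cycle at which the last result is available. */
int run_cycles(const WinInstr *prog, int n, int window) {
    assert(n > 0 && n <= MAX_INSTR && window > 0);
    int done_at[MAX_INSTR];        /* cycle when each result is available */
    int issued[MAX_INSTR] = {0};
    int remaining = n, finish = 0;

    for (int now = 0; remaining > 0; now++) {
        int seen = 0;              /* unissued instructions examined so far */
        for (int i = 0; i < n && seen < window; i++) {
            if (issued[i]) continue;
            seen++;
            int d = prog[i].dep;
            assert(d < i);         /* may only depend on earlier instructions */
            int ready = (d < 0) || (issued[d] && done_at[d] <= now);
            if (!ready) continue;  /* look further into the window */
            issued[i] = 1;
            done_at[i] = now + prog[i].latency;
            if (done_at[i] > finish) finish = done_at[i];
            remaining--;
            break;                 /* one issue per cycle */
        }
    }
    return finish;
}
```

Take a 3-cycle load, an add that depends on it, and two independent 1-cycle instructions. With a window of 1 the add blocks everything behind it and the run ends at cycle 6; with a window of 4 the independent instructions fill the load's waiting time and the run ends at cycle 4.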

11
Multi-core + NUMA
• Even more cycles in parallel

Intel Sandy Bridge


Ref: Paul Sim

12
Memory Hierarchy

Ref: DZone

13
Why do we have to know all this?
• Random access vs. sequential access time on HDD
• The OS and other systems have long been optimized to
reduce random access on HDDs
• “Tape is Dead, Disk is Tape, Flash is Disk, RAM Locality is King.”
• What about
– Random access vs. sequential access time on RAM?
– Random access hurts the CPU’s speculative execution
• => branch mis-prediction => expensive!
• How to achieve “if” without using “if”?
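
One common answer to the last question is to turn the condition into arithmetic. The two functions below are a minimal sketch with hypothetical names. Note that whether the compiler actually emits branch-free machine code depends on the compiler and its flags; the source merely avoids writing a conditional branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns a if cond is nonzero, else b, without an `if`:
   mask is all ones when cond != 0 and all zeros otherwise. */
uint32_t select_u32(int cond, uint32_t a, uint32_t b) {
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);
    return (a & mask) | (b & ~mask);
}

/* Counts the elements >= threshold. The comparison result (0 or 1) is
   added directly instead of guarding an increment with `if`. */
size_t count_at_least(const int *v, size_t n, int threshold) {
    assert(v != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)(v[i] >= threshold);
    return count;
}
```

The counting loop is the typical case: on unsorted data an `if (v[i] >= threshold) count++;` branch is hard to predict, while the arithmetic form gives the predictor nothing to guess.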

14
Wait… how are these hardware things related to the OS?
• The kernel is the piece of software that deals directly with the
hardware
• OS/Kernel programmers should arguably be writing the most
efficient programs on earth
– Efficient in memory usage
• Remember, the PDP-7 had only 9 kB of RAM
– Efficient in speed
• If kernel programmers don’t care about speed, who will?
– Different implementations for different architectures!
• E.g., Intel Xeon’s cache is inclusive whereas AMD Opteron’s cache is non-inclusive
– When reading X from memory,
» Inclusive: a copy of X is kept at every level, from L1 to L3
» Non-inclusive: X is read into L1 directly (L2 and L3 hold no copy)
• Kernel programmers need to know this when designing cache replacement
policies
• That also explains why…

15
Different implementations are needed for different
architectures
• x

16
CPU’s Privileged Instructions
• Some instructions just add two values
• Some instructions are privileged
– E.g., set the segmentation boundary of the memory
• The CPU has a 1-bit register that records whether it is currently in
user mode or kernel mode
– If (user-mode & this-instruction-is-privileged) then
• generate an “insufficient privilege access” exception
• On an exception / a hardware interrupt
– The CPU goes to a hardcoded memory address to look up
the corresponding handler
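
The check above can be modeled in a few lines. This is a software sketch of what the hardware does, not real CPU code: the Cpu struct, the exception numbers, and check_instruction are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { USER_MODE = 0, KERNEL_MODE = 1 };
enum { EXC_NONE = 0, EXC_PRIVILEGE = 1 };   /* hypothetical exception numbers */

typedef struct {
    int mode;             /* models the 1-bit mode register */
    int last_exception;   /* exception raised most recently, if any */
} Cpu;

/* Models the hardware check. Returns true if the instruction may execute.
   On a violation it records the exception and switches to kernel mode,
   because the exception handler is kernel code. */
bool check_instruction(Cpu *cpu, bool is_privileged) {
    assert(cpu != NULL);
    if (cpu->mode == USER_MODE && is_privileged) {
        cpu->last_exception = EXC_PRIVILEGE;
        cpu->mode = KERNEL_MODE;
        return false;
    }
    return true;
}
```

The point of the model is that user code never writes the mode bit itself; the only way the bit flips to kernel mode is through an exception, an interrupt, or a trap, all of which land in kernel code.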

17
T.Anderson book
Physical Memory Address
from 000000H to 0003FFH (1024 bytes)
Kernel space
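
The address range above suggests a simple lookup calculation. The sketch assumes 4 bytes per table entry, which the slide does not state; under that assumption the 1024-byte table holds 256 entries. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VECTOR_TABLE_BASE 0x000000u
#define VECTOR_TABLE_SIZE 1024u   /* bytes, 000000H to 0003FFH */
#define ENTRY_SIZE        4u      /* assumption: 4 bytes per entry */

/* Address of the table entry that holds the handler for `vector`. */
uint32_t vector_entry_address(uint32_t vector) {
    assert(vector < VECTOR_TABLE_SIZE / ENTRY_SIZE);
    return VECTOR_TABLE_BASE + vector * ENTRY_SIZE;
}
```

Because the table lives in kernel space, a user program cannot overwrite an entry to redirect an interrupt to its own code.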

18
Hacking?
• So, can a malicious user write code that accesses other
memory regions by using a privileged instruction directly?
– Any normal program must run on top of your OS, and
your OS won’t permit the program to set that bit
– So the most powerful attack is to gain physical
access to a machine, insert a boot device, and reboot
into your own OS…
• But this is not the kind of security we are concerned with

19
System Calls
• What if a benign program wants to do something low-level?
– It goes through system calls
• written by OS developers
• exposed for anyone to build programs on
int a_sys_call() {
    mode_bit = KERNEL_MODE;   /* 1-bit register: enter kernel mode */
    … access the kernel memory
    mode_bit = USER_MODE;     /* 1-bit register: back to user mode */
}
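
From the user program's side, a system call looks like an ordinary function call. The sketch below assumes a POSIX system; pipe(), write(), read(), and close() are standard POSIX wrappers, while echo_through_kernel is a hypothetical helper written for this example.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends `msg` through a kernel pipe and reads it back into `out`.
   pipe(), write(), read() and close() are system-call wrappers: each one
   traps into the kernel, which does the privileged work and returns.
   `msg` must be short enough to fit in the pipe buffer.
   Returns the number of bytes read, or -1 on error. */
ssize_t echo_through_kernel(const char *msg, char *out, size_t cap) {
    assert(msg != NULL && out != NULL);
    int fds[2];
    if (pipe(fds) != 0) return -1;

    size_t len = strlen(msg);
    ssize_t written = write(fds[1], msg, len);
    close(fds[1]);

    ssize_t got = -1;
    if (written == (ssize_t)len)
        got = read(fds[0], out, cap);
    close(fds[0]);
    return got;
}
```

The bytes never travel directly from `msg` to `out`: they are copied into a buffer in kernel memory by write() and copied out again by read(), which is exactly the kind of access the program could not perform on its own.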

20
