Comparch Book Summary Ish
Week 2
2.1 (Introduction)
2.2 (Operations of the Computer Hardware);
2.3 (Operands of the Computer Hardware);
2.4 (Signed and Unsigned Numbers);
3.5.1 (Floating Point);
3.10 (Concluding Remarks);
___________________________________________________________________________
Week 3
A.2 (Gates, Truth tables, and Logic Equations); A.3.1 (Combinational Logic - decoder); A.3.2 (Combinational Logic -
multiplexer); A.5.1 (1-bit ALU) ; A.8 (Flip-flops, latches, registers)
___________________________________________________________________________
Week 4
2.5 (Representing Instructions in the Computer);
2.6 (Logical Operations);
2.7 (Instructions for Making Decisions);
RISC-V simulator demo
___________________________________________________________________________
Week 5
2.8 (Supporting Procedures in Computer Hardware);
2.12 (Translating and Starting a Program);
2.13 (A C Sort Example to Put It All Together);
2.14 (Arrays versus Pointers);
A.7 (clocks); Extra: interrupts
___________________________________________________________________________
Week 6
4.1 (Introduction);
4.2 (Logic Design Conventions);
4.3 (Building a Datapath);
4.4 (A Simple Implementation Scheme);
4.5 (An Overview of Pipelining);
5.1 (Introduction);
5.3 (The Basics of Caches);
Extra: System Bus incl. address lines, data lines and control lines
Week 7
privileged mode!
5.5.1 (Dependable Memory Hierarchy - Defining failure);
5.6 (Virtual Machines);
5.18.6 (Historical Perspective and Further Reading - xxxxx);
5.18.7 (Historical Perspective and Further Reading - xxxxx);
6.1 (Introduction);
6.14 (Concluding Remarks);
A.10 (Finite-State Machines);
Saltzer & Schroeder (1975)
Note: All terms are relative to computer architectures
Chapter 1.2
Background information.
Chapter 1.3
Prediction: Start working rather than wait until you know for sure, assuming that the mechanism is
able to recover from a misprediction without it being too expensive.
Input device: A mechanism through which the computer is fed information, such as a keyboard.
Output device: A mechanism that conveys the result of a computation to a user, such as a display, or
to another computer.
<Shit about LCD and pixels (matrix of bits called bit map) i guess go read the book>
Integrated circuit: Also called a chip. A device combining dozens to millions of transistors.
Central processor unit (CPU): The active part of the computer, which contains the datapath and
control and which adds numbers, tests numbers, signals I/O devices
to activate, and so on.
Control: The component of the processor that commands the datapath, memory, and I/O devices
according to the instructions of the program.
Dynamic random access memory (DRAM): Memory built as an integrated circuit; it provides random
access to any location. (Access times are 50 nanoseconds and cost
per gigabyte in 2012 was $5 to $10.)
Cache memory: A small, fast memory that acts as a buffer for a slower, larger memory.
Static random access memory (SRAM): Memory built as an integrated circuit, but faster and less
dense than DRAM.
Instruction set architecture: An abstract interface between the hardware and the lowest-level
software that encompasses all the information necessary to write a
machine language program that will run correctly, including
instructions, registers, memory access, I/O, and so on.
Application binary interface (ABI): The user portion of the instruction set plus the operating system
interfaces used by application programmers. It defines a standard for
binary portability across computers.
Keep in mind that adding two positive numbers can result in an overflow (= the result of an
operation cannot be represented with the available hardware). A 4-bit unsigned number can only
represent 2^4 = 16 values, so 0-15.
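A minimal sketch of the overflow described above, assuming a hypothetical 4-bit unsigned register (the names BITS and add_unsigned are made up for this example):

```python
BITS = 4  # width of the hypothetical register


def add_unsigned(a, b, bits=BITS):
    """Add two unsigned numbers and report whether the result overflowed."""
    full = a + b
    result = full & ((1 << bits) - 1)   # keep only the bits that fit
    overflow = full > (1 << bits) - 1   # True if the real sum needs more bits
    return result, overflow


print(add_unsigned(5, 6))   # 11 still fits in the range 0-15
print(add_unsigned(9, 8))   # 17 does not fit in 4 bits
```

The second call shows what the hardware would be left with: only the low 4 bits of the real sum.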
Subtraction in binary also works the same as subtraction in decimals (see image below). Simply put
the two binary numbers under each other and subtract them.
A more solid option to subtract numbers in binary is to add the two’s complement representation of
the negative number(s) to your positive number.
First convert your negative number to a two’s complement number, for example -7, by taking the
following steps:
1. Convert +7 to binary.
2. Invert +7. (zeros become ones, ones become zeros)
3. Add 1 to the inverted number.
You can check if your calculated two's complement number is correct by adding up the place values
of all bits that are 1, but counting the leftmost bit as a negative value.
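The three steps and the check can be sketched like this. The function names and the default width of 8 bits are assumptions for the example, not something from the book:

```python
def twos_complement(value, bits=8):
    """Return the bit string of -value, following the three steps above."""
    positive = format(value, f"0{bits}b")                            # 1. convert +value to binary
    inverted = "".join("1" if b == "0" else "0" for b in positive)   # 2. invert every bit
    plus_one = (int(inverted, 2) + 1) & ((1 << bits) - 1)            # 3. add 1
    return format(plus_one, f"0{bits}b")


def signed_value(bit_string):
    """Check: sum the place values, counting the leftmost bit as negative."""
    bits = len(bit_string)
    total = -int(bit_string[0]) * 2 ** (bits - 1)
    for i, b in enumerate(bit_string[1:], start=1):
        total += int(b) * 2 ** (bits - 1 - i)
    return total


minus_seven = twos_complement(7)
print(minus_seven, signed_value(minus_seven))
```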
After this, you can add the two’s complement number to your positive number.
This was calculated as a 7-bit number, so the addition produced an extra carry bit out of the leftmost
position. You can ignore this bit, and the rest of your number will be the result.
The next example is calculated as an 8-bit number. The result of this calculation is a negative (two’s
complement) number. To check if the number is correct you can apply the same method as shown
earlier: adding up the place values of all bits, with the leftmost bit counted as negative.
Overflow occurs in subtraction when we subtract a negative number from a positive number and get
a negative result, or when we subtract a positive number from a negative number and get a positive
result.
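A rough sketch of subtraction done by adding the two's complement, including the overflow rule above. It assumes both inputs already fit in the chosen width; the function name and the 8-bit default are made up for the example:

```python
def subtract(a, b, bits=8):
    """Compute a - b on `bits`-bit two's complement numbers.

    Returns (result, overflow). The subtraction is done by adding the
    two's complement of b; any carry out of the top bit is dropped.
    """
    mask = (1 << bits) - 1
    neg_b = (~b + 1) & mask               # two's complement of b
    raw = ((a & mask) + neg_b) & mask     # add and drop the carry out
    # Reinterpret the bit pattern as a signed number.
    result = raw - (1 << bits) if raw >> (bits - 1) else raw
    # Overflow rule from the text: the operands have different signs and
    # the sign of the result differs from the sign of a.
    overflow = (a < 0) != (b < 0) and (result < 0) != (a < 0)
    return result, overflow


print(subtract(12, 7))       # ordinary case
print(subtract(100, -100))   # positive minus negative giving a negative result
```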
Arithmetic Logic Unit (ALU): Hardware that performs addition, subtraction and usually logical
operations such as AND and OR.
Chapter 3.3 Multiplication
Multiplication in binary is done the same way as multiplication in decimal. You can simply put
the two numbers under each other and multiply every digit in the bottom number with all the digits
of the top number.
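The pencil-and-paper method can be sketched as a shift-and-add loop. This is only an illustration for non-negative numbers (the function name is made up), not the hardware algorithm from the book:

```python
def multiply_binary(multiplicand, multiplier):
    """Multiply two non-negative integers the pencil-and-paper way."""
    product = 0
    shift = 0
    while multiplier:
        if multiplier & 1:                     # current digit of the bottom number is 1
            product += multiplicand << shift   # add a shifted copy of the top number
        multiplier >>= 1                       # move on to the next digit
        shift += 1
    return product


print(format(multiply_binary(0b1000, 0b1001), "b"))
```

Because every digit of the bottom number is 0 or 1, each partial product is either zero or a shifted copy of the top number.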
__________________________________________________________________________________
Week 2: 2.1-4, 3.5.1 and 3.10
Chapter 2.1 Introduction
Instruction set: The list of commands understood by a given architecture.
Stored-program concept: The idea that instructions and data of many types can be stored in
memory as numbers and thus be easy to change, leading to the stored-program computer.
Chapter 2.2
We use the RISC-V assembly language notation. All instructions in RISC-V take 1 clock cycle, except
load and store.
RISC-V has 32 registers: x0-x31. In RISC-V, data must be in registers to perform arithmetic.
a=b+c
d=a–e
The compiler will translate these from the high-level programming language instructions to RISC-V
assembly language instructions. These instructions will look like this:
add a, b, c
sub d, a, e
Chapter 2.3
The size of a register in the RISC-V architecture is 32 bits. This group of 32 bits is also referred to as a
word. A group of 64 bits would be called a doubleword in the RISC-V architecture.
Address: A value used to delineate the location of a specific data element within a memory array.
f = (g + h) – (i + j);
In RISC-V code:
g = h + A[8]
In RISC-V code:
lw x9, 32(x22) // (load word) x9 temporarily gets the word at address x22 + offset 32 (array A at index 8, 32 = 8x4)
add x20, x21, x9 // x20 (= g) gets the value of x21 (= h) + x9
sw x9, 32(x22) // (store word) The value of x9 is written back to memory at x22 + offset 32 (array A at index 8); not needed for g = h + A[8], shown only to illustrate sw
An alternative that avoids the load instruction is a version of the arithmetic instruction in which one
operand is a constant. To add a constant number to a register we can use addi.
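Value of a digit: d x base^i (d = the digit, i = its position counted from the right, starting at 0).
Example for 11 in binary:
1 x 2^1 = 2
1 x 2^0 = 1
_______+
3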
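The d x base^i rule can be checked with a few lines of code. The function name is made up for the example, and i is assumed to count from the right starting at 0:

```python
def positional_value(digits, base=2):
    """Sum d * base**i over all digits, with i counted from the right starting at 0."""
    total = 0
    for i, d in enumerate(reversed(digits)):
        total += int(d) * base ** i
    return total


print(positional_value("11"))        # binary 11
print(positional_value("123", 10))   # the same rule works for decimal
```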
Chapter 3.10
Concluding remarks, just read the book I guess.
__________________________________________________________________________________
Week 3:
To be continued
A.2 (Gates, Truth tables, and Logic Equations); A.3.1 (Combinational Logic - decoder); A.3.2
(Combinational Logic - multiplexer); A.5.1 (1-bit ALU) ; A.8 (Flip-flops, latches, registers)
A problem occurs when an instruction needs longer fields than those shown above. For this reason,
different formats were introduced. The format mentioned above is called the R-type.
All types:
R = Register, Arithmetic instruction format
I = Immediate, Loads & immediate arithmetic
S = Stores
SB = Conditional branch format
UJ-type = Unconditional jump format
U-type = Upper immediate format
I-type format:
S-type format:
Examples:
Code example:
A[30] = h + A[30] + 1;
(I-type)
lw x9, 120(x10) // Temporary register x9 gets the value of register x10 (array A) with offset 120 (index 30, 120 = 30x4).
(R-type)
add x9, x21, x9 // Temporary register x9 gets the value of x21 (h) + x9 (array A with index 30)
(I-type)
addi x9, x9, 1 // Temporary register x9 gets the value of x9 (currently h+A[30]) + constant 1
(S-type)
sw x9, 120(x10) // Store the value of x9 back into x10 with offset 120 (A[30])
Keep in mind that the offset in the S-type instruction is stored in two fields: immediate[11:5] and
immediate[4:0]. The first field holds the upper 7 bits (3 = 0000011) and the second holds the lower
5 bits (24 = 11000). Put these two after each other and you get 0000011 11000, which is the binary
value for 120.
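The split of the store offset over the two immediate fields can be reproduced like this (the function names are made up for the example):

```python
def split_s_type_immediate(offset):
    """Split a 12-bit store offset into the immediate[11:5] and immediate[4:0] fields."""
    imm12 = offset & 0xFFF        # keep 12 bits (two's complement for negative offsets)
    upper = (imm12 >> 5) & 0x7F   # immediate[11:5], 7 bits
    lower = imm12 & 0x1F          # immediate[4:0], 5 bits
    return upper, lower


def join_s_type_immediate(upper, lower):
    """Put the two fields back together into the 12-bit offset."""
    return (upper << 5) | lower


upper, lower = split_s_type_immediate(120)
print(format(upper, "07b"), format(lower, "05b"), join_s_type_immediate(upper, lower))
```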
A logical bit shift left simply consists of moving the bits one to the left. In this process the most
significant bit is lost and a 0 is added to the right side.
A logical bit shift left by 4 consists of moving the bits 4 spots to the left. In this case the 4 most
significant bits are lost and 4 zeros are added to the right side.
Mind that doing this would be the same as calculating 9x2x2x2x2=144. 2 being the base (binary).
A logical bit shift right simply consists of moving the bits one to the right. In doing so, the least
significant bit is discarded, and a 0 is added in front of the most significant bit.
Mind that doing this would be the same as calculating 26/2=13. 2 being the base (binary). If the
division does not come out evenly, the fractional part is discarded. For example:
5/2 would be 2.5. We discard the fraction (.5) and the bit shift result is 2.
An arithmetic bit shift right consists of moving the bits to the right. The least significant bit is lost
and the most significant bit (the sign bit) is copied into the vacated position, so the sign of the
number is preserved.
slli x11, x19, 4 // register x11 gets the value of x19 << 4 bits
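The three kinds of shift can be sketched on 32-bit values as follows. Python integers have no fixed width, so the mask stands in for the register size; the function names just borrow the RISC-V mnemonics and are otherwise made up:

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def sll(value, amount):
    """Shift left logical: bits pushed past bit 31 are lost, zeros enter on the right."""
    return (value << amount) & MASK32


def srl(value, amount):
    """Shift right logical: zeros enter on the left."""
    return (value & MASK32) >> amount


def sra(value, amount):
    """Shift right arithmetic: the sign bit is copied into the vacated positions."""
    value &= MASK32
    if value & 0x80000000:   # negative in two's complement
        value -= 1 << 32     # reinterpret as a negative Python int
    return (value >> amount) & MASK32


print(sll(9, 4), srl(26, 1), srl(5, 1), hex(sra(0xFFFFFFF8, 1)))
```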
Code:
and x9, x10, x11 // register x9 gets the value of x10 & x11
Example of AND (&):
0011
1010
____AND
0010
xor x9, x10, x11 // register x9 gets the value of x10 ^ x11
Example of XOR (^):
0011
1010
____XOR
1001
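The two worked examples can be double-checked with the matching operators (the variable names are made up):

```python
a = 0b0011
b = 0b1010

and_result = format(a & b, "04b")   # a bit is 1 only where both inputs are 1
xor_result = format(a ^ b, "04b")   # a bit is 1 only where the inputs differ

print(and_result, xor_result)
```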
Chapter 2.7
RISC-V assembly language includes two decision-making instructions, similar to an if statement:
beq (branch if equal) and bne (branch if not equal).
if (i == j) f = g + h; else f = g − h;
bne x22, x23, Else // Go to the Else block if x22 (i) does not equal x23 (j)
add x19, x20, x21 // Skipped if x22 (i) does not equal x23 (j)
beq x0, x0, Exit // Unconditional branch, an expression that is always true
Else: sub x19, x20, x21 // Skipped if x22 (i) equals x23 (j)
Exit: // End of the if-else statement
while (save[i] == k)
i += 1;
Loop: slli x10, x22, 2 // Temporary register x10 gets the value of i*4 (bitshift 2)
add x10, x10, x25 // x10 = address of save[i]
lw x9, 0(x10) // Temporary register x9 gets the value of save[i]
bne x9, x24, Exit // Go to exit if x9 (save[i]) is not equal to x24 (k)
addi x22, x22, 1 // x22(i) = x22(i) + 1
beq x0, x0, Loop // Unconditional branch (the condition is always true), loop again
Exit: // Stop the loop
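To see what the loop does step by step, here is a rough model of it. Memory is modeled as a dictionary from byte address to word, and the array contents and base address 1000 are invented for the example:

```python
def run_while_loop(memory, base, i, k):
    """Mirror of the assembly loop: memory maps byte addresses to words.

    base plays the role of x25, i of x22, and k of x24.
    """
    while True:
        offset = i << 2           # slli: i * 4, since each word is 4 bytes
        address = offset + base   # add: address of save[i]
        word = memory[address]    # lw: load save[i]
        if word != k:             # bne: leave the loop when save[i] != k
            break
        i += 1                    # addi: i = i + 1
    return i


# Hypothetical array save = [7, 7, 7, 3] stored from byte address 1000.
save = [7, 7, 7, 3]
memory = {1000 + 4 * index: value for index, value in enumerate(save)}
print(run_while_loop(memory, 1000, 0, 7))
```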
<Something about indexoutofbounds shortcut and branch address table for switch/case>
Register x10-x17: Eight parameter registers in which to pass parameters or return values.
Register x1: One return address register to return to the point of origin.
Jump and link instruction (jal): Branches to an address and simultaneously saves the address of the
following instruction to the destination register rd.
jal x1, ProcedureAddress // Jump to ProcedureAddress and write return address to x1.
Jump and link register (jalr): Branches to the address stored in a register. To return from a
procedure, the register used is x1: jalr x0, 0(x1).
Program counter (PC): The register containing the address of the instruction in the program being
executed.
Stack: A data structure for spilling registers organized as a last-in-first-out queue. Elements can be
added and removed using push and pop.
Stack pointer: A value denoting the most recently allocated address in a stack that shows where
registers should be spilled or where old register values can be found. In RISC-V, it is register sp, or x2.
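A small model of push and pop with a stack pointer. It assumes the usual RISC-V convention that the stack grows toward lower addresses and that each spilled register takes 4 bytes; the class name and the start address are invented for the example:

```python
class Stack:
    """Tiny model of a register-spill stack that grows toward lower addresses."""

    def __init__(self, top_address=0x7FFFFFF0):
        self.sp = top_address   # stack pointer (register x2 in RISC-V)
        self.memory = {}        # byte address -> 32-bit word

    def push(self, value):
        self.sp -= 4            # make room for one word
        self.memory[self.sp] = value

    def pop(self):
        value = self.memory.pop(self.sp)
        self.sp += 4            # release the word
        return value


stack = Stack(top_address=100)
stack.push(11)
stack.push(22)
print(stack.sp, stack.pop(), stack.pop(), stack.sp)
```

The last value pushed is the first one popped, which is the last-in-first-out behavior from the definition.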
TODO:
RISC-V makes sure register x0 always has the value 0. If a value is written to this register, the
write is discarded and the register still holds 0. Whenever the register x0 is used, it always supplies a 0.
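A possible way to model this behavior, assuming 32 registers of 32 bits each (the class and method names are made up):

```python
class RegisterFile:
    """32 registers, x0 through x31; x0 always reads as 0."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, number):
        return 0 if number == 0 else self.regs[number]

    def write(self, number, value):
        if number != 0:                          # writes to x0 have no effect
            self.regs[number] = value & 0xFFFFFFFF


registers = RegisterFile()
registers.write(0, 1234)   # ignored
registers.write(9, 1234)
print(registers.read(0), registers.read(9))
```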
<Something about pseudoinstructions giving RISC-v a richer set of assembly language instructions>
Linker (Also called link editor): A systems program that combines independently assembled
machine language programs and resolves all undefined labels into an executable file.
Executable file: A functional program in the format of an object file that contains no unresolved
references. It can contain symbol tables and debugging information. A “stripped executable” does
not contain that information. Relocation information may be included for the loader.
Dynamically linked libraries (DLLs): Library routines that are linked to a program during execution.
The lazy procedure linkage is used in the below example.
Java bytecode: Instruction from an instruction set designed to interpret Java programs.
Java Virtual Machine (JVM): The program that interprets Java bytecodes.
Just In Time compiler (JIT): The name commonly given to a compiler that operates at runtime,
translating the interpreted code segments into the native code of the computer.
Chapter 2.13
<Swap procedure>
<Sort example>
Chapter 2.14
<Arrays vs pointers>
Chapter A.7
Edge-triggered clocking: A clocking scheme in which all state changes occur on a clock edge.
Clocking methodology: The approach used to determine when data are valid and stable relative to
the clock.
Synchronous system: A memory system that employs clocks and where data signals are read only
when the clock indicates that the signal values are stable.
In an edge-triggered design, either the rising or falling edge of the clock is active and causes state to
be changed.
Week 6: 4.1-5, 5.1
Chapter 4.1
For every instruction in RISC-V, the first two steps are identical:
1. Send the program counter (PC) to the memory that contains the code and fetch the instruction
from that memory.
2. Read one or two registers, using fields of the instruction to select the registers to read. For the lw
instruction, we need to read only one register, but most other instructions require reading two
registers.
After these two steps, the actions required to complete the instruction depend on the instruction
class.
Above is a basic implementation of the RISC-V subset, including the necessary multiplexors and
control lines (blue). The PC does not require a write control, since it is written once at the end of
every clock cycle; the branch control logic determines whether it is written with the incremented PC
or the branch target address. Appendix A in the book describes the multiplexor.
Chapter 4.2
Combinational element: An operational element, such as an AND gate or an ALU. Given a set of
inputs, it always produces the same output because it has no internal storage.
Other elements in the design are not combinational, but instead contain state.
Control signal: A signal used for multiplexor selection or for directing the operation of a functional
unit; contrasts with a data signal, which contains information that is operated on by a functional
unit.
Chapter 4.3
Datapath element: A unit used to operate on or hold data within a processor. In the RISC-V
implementation, the datapath elements include the instruction and data memories, the register file,
the ALU, and adders.
Register file: A state element that consists of a set of registers that can be read and written by
supplying a register number to be accessed.
Sign-extend: To increase the size of a data item by replicating the high-order sign bit of the original
data item in the high-order bits of the larger, destination data item.
Branch target address: The address specified in a branch, which becomes the new program counter
(PC) if the branch is taken. In the RISC-V architecture, the branch target is given by the sum of the
offset field of the instruction and the address of the branch.
Branch taken: A branch where the branch condition is satisfied and the program counter (PC)
becomes the branch target. All unconditional branches are taken branches.
Branch not taken (or untaken branch): A branch where the branch condition is false and the
program counter (PC) becomes the address of the instruction that sequentially follows the branch.
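Sign extension and the choice of the next PC can be sketched together. This is a simplification: it assumes the offset field already holds a byte offset in 12 bits and that instructions are 4 bytes long, and it skips how the real branch encoding stores the offset. The function names are made up:

```python
def sign_extend(value, bits):
    """Replicate the sign bit of a `bits`-wide field into a Python int."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & sign_bit else value


def next_pc(pc, offset_field, taken, offset_bits=12):
    """Return the next PC for a conditional branch located at address pc."""
    if taken:
        return pc + sign_extend(offset_field, offset_bits)   # branch target address
    return pc + 4                                            # next sequential instruction


print(sign_extend(0xFF8, 12))
print(next_pc(100, 0xFF8, taken=True), next_pc(100, 0xFF8, taken=False))
```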
The operations of arithmetic-logical (or R-type) instructions and the memory instructions datapath
are quite similar. The key differences are the following:
- The arithmetic-logical instructions use the ALU, with the inputs coming from the two
registers. The memory instructions can also use the ALU to do the address calculation,
although the second input is the sign-extended 12-bit offset field from the instruction.
- The value stored into a destination register comes from the ALU (for an R-type instruction)
or the memory (for a load).
Chapter 4.4
Truth table: From logic, a representation of a logical operation by listing all the values of the inputs
and then in each case showing what the resulting outputs should be.
Don’t-care term: An element of a logical function in which the output does not depend on the values
of all the inputs. Don’t-care terms may be specified in different ways.
Although the single-cycle design will work correctly, it is too inefficient to be used in modern
designs.
At the end of a clock cycle, all data that is used in subsequent clock cycles must be stored in a state
element.
Data used by subsequent instructions in a later clock cycle is stored into one of the programmer-
visible state elements: the register file, the PC, or the memory. In contrast, data used by the same
instruction in a later cycle must be stored into one of these additional registers that are appended to
each functional unit.
- The Instruction register (IR) and the Memory data register (MDR) are added to save the
output of the memory for an instruction read and a data read, respectively. Two separate
registers are used, since, as will be clear shortly, both values are needed during the same
clock cycle.
- The A and B registers are used to hold the register operand values read from the register
file.
Next-state function: A combinational function that, given the inputs and the current state,
determines the next state of a finite-state machine.
Spatial locality: The locality principle stating that if a data location is referenced, data locations with
nearby addresses will tend to be referenced soon.
Memory hierarchy: A structure that uses multiple levels of memories; as the distance from the
processor increases, the size of the memories and the access time both increase while the cost per
bit decreases.
Block (or line): The minimum unit of information that can be either present or not present in a
cache.
Hit rate: The fraction of memory accesses found in a level of the memory hierarchy.
Miss rate: The fraction of memory accesses not found in a level of the memory hierarchy.
Hit time: The time required to access a level of the memory hierarchy, including the time needed to
determine whether the access is a hit or a miss.
Miss penalty: The time required to fetch a block into a level of the memory hierarchy from the lower
level, including the time to access the block, transmit it from one level to the other, insert it in the
level that experienced the miss, and then pass the block to the requestor.
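One common way to combine these terms is the average access time: hit time plus miss rate times miss penalty. That formula is not in the notes above, so treat this as an extra; the numbers and function names are invented for illustration:

```python
def miss_rate(hits, accesses):
    """Fraction of accesses not found in this level (1 - hit rate)."""
    return 1 - hits / accesses


def average_access_time(hit_time, miss_rate_value, miss_penalty):
    """Every access pays the hit time; misses additionally pay the miss penalty."""
    return hit_time + miss_rate_value * miss_penalty


rate = miss_rate(hits=95, accesses=100)
print(rate, average_access_time(hit_time=1, miss_rate_value=rate, miss_penalty=100))
```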
Week 7:
privileged mode!
Not sure what was meant by this?
1. Service accomplishment, where the service is delivered as specified
2. Service interruption, where the delivered service is different from the specified service
Transitions from state 1 to state 2 are caused by failures, and transitions from state 2 to state 1 are
called restorations.
AFR: Annual Failure Rate. The percentage of devices that would be expected to fail in a year for a
given MTTF.
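A quick sketch of how the AFR follows from the MTTF, assuming the MTTF is given in hours and a year has 8760 hours (the function name is made up):

```python
HOURS_PER_YEAR = 24 * 365   # 8760, ignoring leap years


def annual_failure_rate(mttf_hours):
    """Fraction of devices expected to fail within one year."""
    return HOURS_PER_YEAR / mttf_hours


afr = annual_failure_rate(1_000_000)
print(f"{afr * 100:.3f}% of devices per year")
```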
Reliability: A measure of the continuous service accomplishment.
Fault tolerance: Using redundancy to allow the service to comply with the service specification
despite faults occurring.
Fault forecasting: Predicting the presence and creation of faults, allowing the component to be
replaced before it fails.
Error detection code: A code that enables the detection of an error in data, but not the precise
location and, hence, correction of the error.
1. Start numbering bits from 1 on the left, contrary to the traditional numbering of the
rightmost bit being 0.
2. Mark all bit positions that are powers of 2 as parity bits (positions 1, 2, 4, 8, 16, …).
3. All other bit positions are used for data bits (positions 3, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, …).
4. The position of a parity bit determines the sequence of data bits that it checks:
- Bit 1 (0001two) checks bits (1,3,5,7,9,11,...), which are the bits where the rightmost bit of the address is 1
(0001two, 0011two, 0101two, 0111two, 1001two, 1011two,…).
- Bit 2 (0010two) checks bits (2,3,6,7,10,11,14,15,…), which are the bits where the second bit
to the right in the address is 1.
- Bit 4 (0100two) checks bits (4–7, 12–15, 20–23,…), which are the bits where the third bit to
the right in the address is 1.
- Bit 8 (1000two) checks bits (8–15, 24–31, 40–47,...), which are the bits where the fourth bit
to the right in the address is 1.
Note that each data bit is covered by two or more parity bits.
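The numbering scheme above can be turned into a small encoder. This sketch assumes even parity (each parity bit makes the number of 1s in its group even), and the function names are made up:

```python
def covered_positions(parity_position, length):
    """Positions (numbered from 1) checked by the parity bit at parity_position."""
    return [p for p in range(1, length + 1) if p & parity_position]


def hamming_encode(data_bits):
    """Place data bits in the non-power-of-2 positions and fill in even parity bits."""
    # Find the code length: grow it until every data bit has a slot.
    length = 0
    placed = 0
    while placed < len(data_bits):
        length += 1
        if length & (length - 1):   # not a power of 2, so a data position
            placed += 1
    code = [0] * (length + 1)       # index 0 is unused so positions start at 1
    data = iter(data_bits)
    for p in range(1, length + 1):
        if p & (p - 1):
            code[p] = int(next(data))
    parity = 1
    while parity <= length:
        ones = sum(code[p] for p in covered_positions(parity, length))
        code[parity] = ones % 2     # even parity: make the group's total even
        parity <<= 1
    return "".join(str(b) for b in code[1:])


print(covered_positions(2, 15))
print(hamming_encode("10011010"))
```

Each data position has at least two 1 bits in its binary address, which is why every data bit ends up in two or more parity groups.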
1. Managing software.
2. Managing hardware.
6.1 (Introduction);
Multiprocessor: A computer system with at least two processors. This computer is in contrast to a
uniprocessor, which has one, and is increasingly hard to find today.
Parallel processing program: A single program that runs on multiple processors simultaneously.
Cluster: A set of computers connected over a local area network that function as a single large
multiprocessor.
Shared memory multiprocessor (SMP): A parallel processor with a single physical address space.