MPC601UM
PowerPC 601™
Organization
Following is a summary and a brief description of the major sections of this manual:
• Chapter 1, “Overview,” is useful for readers who want a general understanding of
the features and functions of the PowerPC architecture and the 601 processor. This
chapter also provides a general description of how the 601 differs from the PowerPC
architecture.
• Chapter 2, “Registers and Data Types,” is useful for software engineers who need to
understand the PowerPC programming model and the functionality of the registers
implemented in the 601. This chapter also describes PowerPC conventions for
storing data in memory.
• Chapter 3, “Addressing Modes and Instruction Set Summary,” provides an overview
of the PowerPC addressing modes and a description of the instructions implemented
by the 601, including the portion of the PowerPC instruction set and the additional
instructions implemented by the 601.
Specific differences between the 601 implementation and the PowerPC
implementation of individual instructions are noted.
• Chapter 4, “Cache and Memory Unit Operation,” provides a discussion of cache
timing, look-up process, MESI protocol, and interaction with other units. This
chapter contains information that pertains both to the PowerPC virtual environment
architecture and to the specific implementation in the 601.
• Chapter 5, “Exceptions,” describes the exception model defined in the PowerPC
operating environment architecture and the specific exception model implemented
in the 601.
• Chapter 6, “Memory Management Unit,” provides descriptions of the MMU,
interaction with other units, and address translation. Although this chapter does not
provide an in-depth description of both the 64-bit and 32-bit memory management
model defined by the PowerPC operating environment architecture, it does note
differences between the defined 32-bit PowerPC definition and the 601 memory
management implementation.
• Chapter 7, “Instruction Timing,” provides information about latencies, interlocks,
special situations, and various conditions to help make programming more efficient.
This chapter is of special interest to software engineers and system designers.
Because each PowerPC implementation is unique with respect to instruction timing,
this chapter primarily contains information specific to the 601.
Conventions
This document uses the following notational conventions:
ACTIVE_HIGH Names for signals that are active high are shown in uppercase text
without an overbar.
ACTIVE_LOW A bar over a signal name indicates that the signal is active low—for
example, ARTRY (address retry) and TS (transfer start). Active-low
signals are referred to as asserted (active) when they are low and
negated when they are high. Signals that are not active-low, such as
AP0–AP3 (address bus parity signals) and TT0–TT4 (transfer type
signals) are referred to as asserted when they are high and negated
when they are low.
mnemonics Instruction mnemonics are shown in lowercase bold.
italics Italics indicate variable command parameters, for example, bcctrx
x'0F' Hexadecimal numbers
b'0011' Binary numbers
rA|0 The contents of a specified GPR or the value 0.
REG[FIELD] Abbreviations or acronyms for registers are shown in uppercase
text. Specific bit fields or ranges are shown in brackets.
x In certain contexts, such as a signal encoding, this indicates a don’t
care. For example, if TT0–TT3 are binary encoded b'x001', the state
of TT0 is a don’t care.
CR Condition register
EA Effective address
IQ Instruction queue
IU Integer unit
L2 Secondary cache
LIFO Last-in-first-out
LR Link register
MQ MQ register
No-Op No operation
PR Privilege-level bit
RAW Read-after-write
SR Segment register
WAR Write-after-read
WAW Write-after-write
Terminology Conventions
Table ii describes terminology conventions used in this manual.
Table ii. Terminology Conventions
IBM This Manual
Interrupt Exception
Relocation Translation
Table iii describes register and bit naming conventions used in this manual.
Table iii. Register and Bit Name Convention
IBM This Manual
Block effective page index (BATx[BEPI]) Block logical page index (BATx[BLPI])
[Figure: PowerPC 601 block diagram. Blocks shown: RTCU and RTCL; instruction unit with instruction queue and issue logic; integer unit (IU) with GPR file and XER; branch processing unit (BPU) with CTR, CR, and LR; floating-point unit (FPU) with FPR file and FPSCR; MMU with UTLB, ITLB, and BAT array; 32-Kbyte cache (instruction and data) with physical address tags; memory unit with read queue, write queue, and snoop address path; system interface with address and data buses.]
The other two elements in the write queue are used for store operations and writing back
modified sectors that have been deallocated by updating the queue; that is, when a cache
location is full, the least-recently used cache sector is deallocated by first being copied into
the write queue and from there to system memory. Note that snooping can occur after a
sector has been pushed out into the write queue and before the data has been written to
system memory. Therefore, to maintain a coherent memory, the write queue elements are
compared to snooped addresses in the same way as the cache tags. If a snoop hits a write
queue element, the data is first stored in system memory before it can be loaded into the
cache of the snooping bus master. Coherency checking between the cache and the write
queue prevents dependency conflicts. Single-beat writes in the write queue are not snooped;
coherency is ensured through the use of special cache operations that accompany the
single-beat write operation on the bus.
Execution of a load or store instruction is considered complete when the associated address
translation completes, guaranteeing that the instruction has completed to the point where it
is known that it will not generate an internal exception. However, after address translation
is complete, a read or write operation can still generate an external exception.
Load and store instructions are always issued and translated in program order with respect
to other load and store instructions. However, a load or store operation that hits in the cache
can complete ahead of those that miss in the cache; additionally, loads and stores that miss
the cache can be reordered as they arbitrate for the system bus.
If a load or store misses in the cache, the operation is managed by the memory unit, which
prioritizes accesses to the system bus. Read requests, such as loads, RWITMs, and
instruction fetches, have priority over single-beat write operations. The priorities for
accessing the system bus are listed in Section 4.10.2, “Memory Unit Queuing Priorities.”
1.3.1 Features
The 601 is a high-performance, superscalar PowerPC implementation. The PowerPC
architecture allows optimizing compilers to schedule instructions to maximize performance
through efficient use of the PowerPC instruction set and register model. The multiple,
independent execution units allow compilers to maximize parallelism and instruction
Although exceptions have other characteristics as well, such as whether they are maskable
or nonmaskable, the distinctions shown in Table 1-1 define categories of exceptions that the
601 handles uniquely. Note that Table 1-1 includes no synchronous imprecise instructions.
Reserved 00000 —
System reset 00100 A system reset is caused by the assertion of either SRESET or HRESET.
Machine check 00200 A machine check is caused by the assertion of the TEA signal during a data bus
transaction.
Data access 00300 The cause of a data access exception can be determined by the bit settings in
the DSISR, listed as follows:
1 Set if the translation of an attempted access is not found in the primary
hash table entry group (HTEG), or in the rehashed secondary HTEG, or in
the range of a BAT register; otherwise cleared.
4 Set if a memory access is not permitted by the page or BAT protection
mechanism described in Chapter 6, “Memory Management Unit”; otherwise
cleared.
5 Set if the access was to an I/O segment (SR[T] =1) by an eciwx, ecowx,
lwarx, stwcx., or lscbx instruction; otherwise cleared. Set by an eciwx or
ecowx instruction if the access is to an address that is marked as
write-through.
6 Set for a store operation and cleared for a load operation.
9 Set if an EA matches the address in the DABR while in one of the three
compare modes.
11 Set if eciwx or ecowx is used and EAR[E] is cleared.
Instruction access 00400 An instruction access exception is caused when an instruction fetch cannot be
performed for any of the following reasons:
• The effective (logical) address cannot be translated. That is, there is a page
fault for this portion of the translation, so an instruction access exception
must be taken to retrieve the translation from a storage device such as a
hard disk drive.
• The fetch access is to an I/O segment.
• The fetch access violates memory protection. If the key bits (Ks and Ku) in
the segment register and the PP bits in the PTE or BAT are set to prohibit
read access, instructions cannot be fetched from this location.
External interrupt 00500 An external interrupt occurs when the INT signal is asserted.
Alignment 00600 An alignment exception is caused when the 601 cannot perform a memory
access for any of several reasons, such as when the operand of a floating-point
load or store operation is in an I/O segment (SR[T] = 1) or when a scalar
load/store operand crosses a page boundary. Specific exception sources are
described in Section 5.4.6, “Alignment Exception (x'00600').”
Program 00700 A program exception is caused by one of the following exception conditions,
which correspond to bit settings in SRR1 and arise during execution of an
instruction:
• Floating-point enabled exception—A floating-point enabled exception
condition is generated when the following condition is met:
(MSR[FE0] | MSR[FE1]) & FPSCR[FEX] is 1.
FPSCR[FEX] is set by the execution of a floating-point instruction that
causes an enabled exception or by the execution of a “move to FPSCR”
instruction that results in both an exception condition bit and its
corresponding enable bit being set in the FPSCR.
• Illegal instruction—An illegal instruction program exception is generated
when execution of an instruction is attempted with an illegal opcode or illegal
combination of opcode and extended opcode fields (including PowerPC
instructions not implemented in the 601), or when execution of an optional
instruction not provided in the 601 is attempted (these do not include those
optional instructions that are treated as no-ops). The PowerPC instruction
set is described in Chapter 3, “Addressing Modes and Instruction Set
Summary.”
• Privileged instruction—A privileged instruction type program exception is
generated when the execution of a privileged instruction is attempted and the
MSR register user privilege bit, MSR[PR], is set. In the 601, this exception is
generated for mtspr or mfspr with an invalid SPR field if SPR[0] = 1 and
MSR[PR] = 1. This may not be true for all PowerPC processors.
• Trap—A trap type program exception is generated when any of the
conditions specified in a trap instruction is met.
Decrementer 00900 The decrementer exception occurs when the most significant bit of the
decrementer (DEC) register transitions from 0 to 1. The exception must also be enabled with
the MSR[EE] bit.
I/O controller interface error 00A00 An I/O controller interface error exception is taken only when an operation to an
I/O controller interface segment fails (such a failure is indicated to the 601 by a
particular bus reply packet). If an I/O controller interface exception is taken on a
memory access directed to an I/O segment, the SRR0 contains the address of
the instruction following the offending instruction. Note that this exception is not
implemented in other PowerPC processors.
Reserved 00B00 —
System call 00C00 A system call exception occurs when a System Call (sc) instruction is executed.
Reserved 00D00 Other PowerPC processors may use this vector for trace exceptions.
Reserved 00E00 The 601 does not generate an interrupt to this vector. Other PowerPC
processors may use this vector for floating-point assist exceptions.
Reserved 00E10–00FFF —
Run mode/trace exception 02000 The run mode exception is taken depending on the settings of the HID1 register
and the MSR[SE] bit.
The following modes correspond with bit settings in the HID1 register:
• Normal run mode—No address breakpoints are specified, and the 601
executes from zero to three instructions per cycle.
• Single instruction step mode—One instruction is processed at a time. The
appropriate break action is taken after an instruction is executed and the
processor quiesces.
• Limited instruction address compare—The 601 runs at full speed (in parallel)
until the EA of the instruction being decoded matches the EA contained in
HID2. Addresses for branch instructions and floating-point instructions may
never be detected.
• Full instruction address compare mode—Processing proceeds out of IQ0.
When the EA in HID2 matches the EA of the instruction in IQ0, the
appropriate break action is performed. Unlike the limited instruction address
compare mode, all instructions pass through the IQ0 in this mode. That is,
instructions cannot be folded out of the instruction stream.
The following mode is taken when the MSR[SE] bit is set.
• MSR[SE] trace mode—Note that in other PowerPC implementations, the
trace exception is a separate exception with its own vector x'00D00'.
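The DSISR bit settings listed for the data access exception can be checked in software along the following lines. This is a minimal sketch, not code from this manual: the names ppc_bit, decode_dsisr, and DSISR_CAUSES are hypothetical, and the sketch assumes the usual PowerPC convention that bit 0 is the most significant bit of a 32-bit register.

```python
# Hypothetical helper names; the bit numbers and meanings come from the
# DSISR list in the data access exception entry above.
DSISR_CAUSES = {
    1: "translation not found in primary HTEG, secondary HTEG, or BAT range",
    4: "access not permitted by page or BAT protection",
    5: "eciwx/ecowx/lwarx/stwcx./lscbx to an I/O segment, "
       "or eciwx/ecowx to write-through memory",
    6: "store operation (cleared for a load)",
    9: "EA matched the DABR in one of the compare modes",
    11: "eciwx or ecowx used while EAR[E] is cleared",
}


def ppc_bit(value, n):
    """Return bit n of a 32-bit value, where bit 0 is the most significant bit."""
    return (value >> (31 - n)) & 1


def decode_dsisr(dsisr):
    """Return the descriptions of every listed DSISR bit that is set."""
    return [text for n, text in sorted(DSISR_CAUSES.items()) if ppc_bit(dsisr, n)]
```

For example, a DSISR value with bits 1 and 6 set (0x42000000) would be reported as a store whose translation was not found.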
[Figure: 601 pipeline diagram. Labels shown: fetch arbitration (FA); cache (memory subsystem) stages CARB and CACC; dispatch unit with instruction queue entries IQ0–IQ7 (instructions in the IQ are said to be in the dispatch stage, DS); data access queueing unit with ISB and FPSB; stages BE, MR, BW, ID, IE, IC, IWA, and IWL; floating-point unit (FPU) stages F1, FD, FPM, FPA, FWA, and FWL. Note 1: An integer instruction can be passed to the ID stage in the same cycle in which it enters IQ0.]
The system interface supports bus pipelining, which allows the address tenure of one
transaction to overlap the data tenure of another. The extent of the pipelining depends on
external arbitration and control circuitry. Similarly, the 601 supports split-bus transactions
for systems with multiple potential bus masters—one device can have mastership of the
address bus while another has mastership of the data bus. Allowing multiple bus
transactions to occur simultaneously increases the available bus bandwidth for other
activity and as a result, improves performance.
The 601 supports multiple masters through a bus arbitration scheme that allows various
devices to compete for the shared bus resource. The arbitration logic can implement priority
protocols, such as fairness, and can park masters to avoid arbitration overhead. The MESI
protocol ensures coherency among multiple devices and system memory. Also, the 601's
on-chip cache and UTLB and optional second-level caches can be controlled externally.
The 601 clocking structure allows the bus to operate at integer multiples of the processor
cycle time.
The following sections describe the 601 bus support for memory and I/O controller
interface operations. Note that some signals perform different functions depending upon
the addressing protocol used.
[Figure: 601 signal groups (partial). Transfer attribute: TBST, CI, WT, GBL, CSE0–CSE2, HP_SNP_REQ. Address termination: AACK, ARTRY, SHD. Clocks: 2X_PCLK, PCLK_EN, BCLK_EN, RTC. System status: CKSTP_OUT, HRESET, SRESET, RSRV, SC_DRIVE. ESP interface and ESP scan interface. Test signals: SYS_QUIESC, RESUME, QUIESC_REQ.]
This chapter describes the PowerPC 601 microprocessor’s register organization, how these
registers are accessed, and how data is represented in these registers. The 601 always
operates in one of three distinct states which are described as follows:
• Normal instruction execution state—In this state, the 601 executes instructions in
either user mode or supervisor mode. User mode can be entered from supervisor
mode by executing the appropriate instructions. If an exception is detected while in
user mode, the processor enters supervisor mode and begins executing the
instructions at a predetermined location associated with the type of exception
detected. In supervisor mode, the program has access to memory, registers,
instructions, and other resources not available to programs executing in user mode.
• Reset state—In the reset state all processor instruction execution is aborted, registers
are initialized appropriately, and external signals are placed in the high-impedance
state. For more information about the reset state, see Section 2.7, “Reset.”
• Checkstop state—When a processor is in the checkstop state, instruction processing
is suspended and generally cannot be restarted without resetting the processor. The
checkstop state is provided to help identify and diagnose problems. The checkstop
state is described in Section 5.4.2.2, “Checkstop State (MSR[ME] = 0).”
The PowerPC architecture defines register-to-register operations for all computational
instructions. Source data for these instructions are accessed from the on-chip registers or
are provided as immediate values embedded in the opcode. The three-register instruction
format allows specification of a target register distinct from the two source registers, thus
preserving the original data for use by other instructions and reducing the number of
instructions required for certain operations. Data is transferred between memory and
registers with explicit load and store instructions only.
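[Figure: FPSCR layout. FX (bit 0), FEX (1), VX (2), OX (3), UX (4), ZX (5), XX (6), VXSNAN (7), VXISI (8), VXIDI (9), VXZDZ (10), VXIMZ (11), VXVC (12), FR (13), FI (14), FPRF (15–19), reserved (20), VXSOFT (21), VXSQRT (22), VXCVI (23), VE (24), OE (25), UE (26), ZE (27), XE (28), reserved (29), RN (30–31).]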
1 FEX Floating-point enabled exception summary (FEX). This bit signals the occurrence of any of
the enabled exception conditions. It is the logical OR of all the floating-point exception bits
masked with their respective enable bits. The mcrfs instruction implicitly clears FPSCR[FEX]
if the result of the logical OR described above becomes zero. The mtfsf, mtfsfi, mtfsb0, and
mtfsb1 instructions cannot set or clear FPSCR[FEX] explicitly. This is not a sticky bit.
2 VX Floating-point invalid operation exception summary (VX). This bit signals the occurrence of
any invalid operation exception. It is the logical OR of all of the invalid operation exceptions.
The mcrfs instruction implicitly clears FPSCR[VX] if the result of the logical OR described
above becomes zero. The mtfsf, mtfsfi, mtfsb0, and mtfsb1 instructions cannot set or clear
FPSCR[VX] explicitly. This is not a sticky bit.
3 OX Floating-point overflow exception (OX). This is a sticky bit. See Section 5.4.7.4, “Overflow
Exception Condition.”
4 UX Floating-point underflow exception (UX). This is a sticky bit. See Section 5.4.7.5, “Underflow
Exception Condition.”
5 ZX Floating-point zero divide exception (ZX). This is a sticky bit. See Section 5.4.7.3, “Zero
Divide Exception Condition.”
6 XX Floating-point inexact exception (XX). This is a sticky bit. See Section 5.4.7.6, “Inexact
Exception Condition.”
7 VXSNAN Floating-point invalid operation exception for SNaN (VXSNAN). This is a sticky bit. See
Section 5.4.7.2, “Invalid Operation Exception Conditions.”
8 VXISI Floating-point invalid operation exception for ∞-∞ (VXISI). This is a sticky bit. See Section
5.4.7.2, “Invalid Operation Exception Conditions.”
9 VXIDI Floating-point invalid operation exception for ∞/∞ (VXIDI). This is a sticky bit. See Section
5.4.7.2, “Invalid Operation Exception Conditions.”
10 VXZDZ Floating-point invalid operation exception for 0/0 (VXZDZ). This is a sticky bit. See Section
5.4.7.2, “Invalid Operation Exception Conditions.”
11 VXIMZ Floating-point invalid operation exception for ∞*0 (VXIMZ). This is a sticky bit. See Section
5.4.7.2, “Invalid Operation Exception Conditions.”
12 VXVC Floating-point invalid operation exception for invalid compare (VXVC). This is a sticky bit.
See Section 5.4.7.2, “Invalid Operation Exception Conditions.”
13 FR Floating-point fraction rounded (FR). The last floating-point instruction that potentially
rounded the intermediate result incremented the fraction. See Section 2.5.6, “Rounding.” This
bit is not sticky.
14 FI Floating-point fraction inexact (FI). The last floating-point instruction that potentially rounded
the intermediate result produced an inexact fraction or a disabled overflow exception. See
Section 2.5.6, “Rounding.” This bit is not sticky.
15–19 FPRF Floating-point result flags (FPRF). This field is based on the value placed into the target
register even if that value is undefined. Refer to Table 2-2 for specific bit settings.
15 Floating-point result class descriptor (C). Floating-point instructions other than the
compare instructions may set this bit with the FPCC bits, to indicate the class of
the result.
16–19 Floating-point condition code (FPCC). Floating-point compare instructions always
set one of the FPCC bits to one and the other three FPCC bits to zero. Other
floating-point instructions may set the FPCC bits with the C bit, to indicate the
class of the result. Note that in this case the high-order three bits of the FPCC
retain their relational significance indicating that the value is less than, greater
than, or equal to zero.
16 Floating-point less than or negative (FL or <)
17 Floating-point greater than or positive (FG or >)
18 Floating-point equal or zero (FE or =)
19 Floating-point unordered or NaN (FU or ?)
20 — Reserved
21 VXSOFT Not implemented in the 601. This is a sticky bit. For more detailed information refer to
Table 5-17 and Section 5.4.7.2, “Invalid Operation Exception Conditions.”
22 VXSQRT Not implemented in the 601. For more detailed information refer to Table 5-17 and Section
5.4.7.2, “Invalid Operation Exception Conditions.”
23 VXCVI Floating-point invalid operation exception for invalid integer convert (VXCVI). This is a sticky
bit. See Section 5.4.7.2, “Invalid Operation Exception Conditions.”
24 VE Floating-point invalid operation exception enable (VE). See Section 5.4.7.2, “Invalid
Operation Exception Conditions.”
25 OE Floating-point overflow exception enable (OE). See Section 5.4.7.4, “Overflow Exception
Condition.”
26 UE Floating-point underflow exception enable (UE). This bit should not be used to determine
whether denormalization should be performed on floating-point stores. See Section 5.4.7.5,
“Underflow Exception Condition.”
27 ZE Floating-point zero divide exception enable (ZE). See Section 5.4.7.3, “Zero Divide Exception
Condition.”
28 XE Floating-point inexact exception enable (XE). See Section 5.4.7.6, “Inexact Exception
Condition.”
29 — Reserved. This bit may be implemented as the non-IEEE mode bit (NI) in other PowerPC
implementations.
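The two summary bits described above, FPSCR[VX] and FPSCR[FEX], are pure functions of other FPSCR bits. The following sketch recomputes them from a raw 32-bit FPSCR value. It is illustrative only: the function and constant names are hypothetical, bit 0 is taken as the most significant bit, and VXSOFT (bit 21) and VXSQRT (bit 22) are left out of the VX calculation on the assumption that bits not implemented in the 601 never contribute.

```python
def ppc_bit(value, n):
    """Return bit n of a 32-bit value, where bit 0 is the most significant bit."""
    return (value >> (31 - n)) & 1


# Invalid operation cause bits from the table above:
# VXSNAN, VXISI, VXIDI, VXZDZ, VXIMZ, VXVC, VXCVI.
# Bits 21 and 22 are omitted (assumption: not implemented in the 601).
VX_CAUSE_BITS = (7, 8, 9, 10, 11, 12, 23)

# (exception bit, enable bit) pairs: OX/OE, UX/UE, ZX/ZE, XX/XE.
EXCEPTION_ENABLE_PAIRS = ((3, 25), (4, 26), (5, 27), (6, 28))


def summary_vx(fpscr):
    """VX is the logical OR of all invalid operation exception bits."""
    return int(any(ppc_bit(fpscr, n) for n in VX_CAUSE_BITS))


def summary_fex(fpscr):
    """FEX is the OR of each exception bit masked with its enable bit."""
    if summary_vx(fpscr) and ppc_bit(fpscr, 24):  # VX masked with VE
        return 1
    return int(any(ppc_bit(fpscr, x) and ppc_bit(fpscr, e)
                   for x, e in EXCEPTION_ENABLE_PAIRS))
```

With ZX set and ZE clear, summary_fex returns 0; setting ZE as well makes it return 1, which matches the description of FEX as an enabled exception summary.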
01001 –Infinity
10010 –Zero
00010 +Zero
00101 +Infinity
1 Positive (GT)—This bit is set when the result is positive (and not zero).
3 Summary overflow (SO)—This is a copy of the final state of XER[SO] at the completion of the instruction.
4 Floating-point exception (FX)—This is a copy of the final state of FPSCR[FX] at the completion of the
instruction.
5 Floating-point enabled exception (FEX)—This is a copy of the final state of FPSCR[FEX] at the
completion of the instruction.
6 Floating-point invalid exception (VX)—This is a copy of the final state of FPSCR[VX] at the completion of
the instruction.
7 Floating-point overflow exception (OX)—This is a copy of the final state of FPSCR[OX] at the completion
of the instruction.
*Here, the bit indicates the bit number in any one of the four-bit subfields, CR0–CR7.
[Figure: MQ register, bits 0–31.]
The MQ register is not defined in the PowerPC architecture. However, in the 601, it may be
modified as a side effect during the execution of the mulli, mullw, mulhw, mulhwu, divw,
and divwu instructions, which are PowerPC instructions.
The value written to the MQ register during these operations is operand-dependent;
therefore, the MQ contents become undefined after any of these instructions executes. In
addition, the MQ is modified by the implementation-specific instructions supported by the
601 that are not part of the PowerPC architecture. These are listed in Table 2-6.
Table 2-6. PowerPC 601 Microprocessor-Specific Instructions that Modify the MQ
Register
Mnemonic Instruction Name Read/Write
The Move to Special Purpose Register (mtspr) and Move from Special Purpose Register
(mfspr) instructions can access the MQ register. The SPR number for the MQ register is 0.
The MQ register is not part of the PowerPC architecture and will not be supported in other
PowerPC microprocessors.
The MQ register is cleared by hard reset.
[Figure: XER layout. SO (bit 0), OV (1), CA (2), reserved (3–15), compare byte (16–23), reserved (24), byte count (25–31).]
XER is designated SPR1. The bit definitions for XER, shown in Table 2-8, are based on the
operation of an instruction considered as a whole, not on intermediate results. For example,
the result of the Subtract from Carrying (subfcx) instruction is specified as the sum of three
values. This instruction sets bits in the XER based on the entire operation, not on an
intermediate sum.
0 SO Summary Overflow (SO). The summary overflow bit (SO) is set whenever an instruction (except
mtspr) sets the overflow bit (OV) to indicate overflow, and it remains set until software clears it (with
the mtspr or mcrxr instruction). It is not altered by compare instructions or other instructions that
cannot overflow.
1 OV Overflow (OV). The overflow bit is set to indicate that an overflow has occurred during execution
of an instruction. Integer add and subtract instructions having OE = 1 set OV if the carry out of bit 0 is
not equal to the carry out of bit 1, and clear it otherwise. The OV bit is not altered by compare
instructions or other instructions that cannot overflow.
2 CA Carry (CA)—In general, the carry bit is set to indicate that a carry out of bit 0 occurred during
execution of an instruction. Add carrying, subtract from carrying, add extended, and subtract from
extended instructions set CA to one if there is a carry out of bit 0, and clear it otherwise. The CA
bit is not altered by compare instructions, or other instructions that cannot carry, except that shift
right algebraic instructions set the CA bit to indicate whether any “1” bits have been shifted out of
a negative quantity.
3–15 — Reserved
16–23 This field contains the byte to be compared by a Load String and Compare Byte Indexed (lscbx)
instruction. Note that lscbx is not a part of the PowerPC architecture.
24 — Reserved
25–31 This field specifies the number of bytes to be transferred by a Load String Word Indexed (lswx),
Store String Word Indexed (stswx) or Load String and Compare Byte Indexed (lscbx) instruction.
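The XER fields in the table above can be extracted from a raw register value as sketched below. The helper names ppc_field and decode_xer, and the dictionary keys compare_byte and byte_count, are hypothetical labels chosen for this illustration; the bit ranges are those given in the table, with bit 0 as the most significant bit.

```python
def ppc_field(value, first, last):
    """Extract bits first..last (inclusive) of a 32-bit value, bit 0 = MSB."""
    width = last - first + 1
    return (value >> (31 - last)) & ((1 << width) - 1)


def decode_xer(xer):
    """Split a 32-bit XER value into the fields defined in the table above."""
    return {
        "SO": ppc_field(xer, 0, 0),             # summary overflow
        "OV": ppc_field(xer, 1, 1),             # overflow
        "CA": ppc_field(xer, 2, 2),             # carry
        "compare_byte": ppc_field(xer, 16, 23), # byte compared by lscbx
        "byte_count": ppc_field(xer, 25, 31),   # bytes moved by lswx/stswx/lscbx
    }
```

For instance, decode_xer(0x80004110) reports SO set, a compare byte of 0x41, and a byte count of 16.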
[Figure: (1) RTCU register, bits 0–31. (2) RTCL register: bits 0–1 reserved (00), bits 2–24 RTCL, bits 25–31 reserved (0000000).]
The RTC runs constantly while power is applied and the external 7.8125 MHz oscillator is
connected. Note that the RTC will not be implemented in other PowerPC processors. The
condition register is cleared by hard reset. Note that when an external clock is connected to
the RTC, the RTCL and RTCU registers are incremented automatically.
Both registers are cleared by a hard reset.
2.2.5.3.1 Real-Time Clock Lower (RTCL) Register
The RTCL functions as a 23-bit counter that provides the lower word of the RTC. As an
indicator of the granularity of the RTC, enough bits are implemented to provide a resolution
that is finer than the time required to execute 10 Add Immediate (addi) instructions. The
following details describe the RTCL:
• Bits 0–1 and bits 25–31 are not implemented. (The number of low-order bits
required is determined by the frequency of the oscillator, 7.8125 MHz.)
• The least significant implemented bit of the RTCL (bit 24) is incremented every
128 ns.
• The period of the RTCL is one billion nanoseconds (one second).
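The arithmetic behind these figures can be checked with a short sketch. It assumes that a value read from the RTCL is a nanosecond count with the unimplemented bits read as zero, which is an interpretation of the list above rather than a statement from it; all names below are hypothetical.

```python
OSCILLATOR_HZ = 7_812_500          # 7.8125 MHz external oscillator
NS_PER_SECOND = 1_000_000_000      # period of the RTCL

# Bits 2-24 (bit 0 = MSB) are implemented; bits 0-1 and 25-31 are not.
RTCL_IMPLEMENTED_MASK = 0x3FFFFF80

# One oscillator period in nanoseconds: 10^9 / 7,812,500 = 128.
NS_PER_TICK = NS_PER_SECOND // OSCILLATOR_HZ


def rtcl_nanoseconds(rtcl_raw):
    """Nanoseconds within the current second (assumed meaning of RTCL)."""
    return rtcl_raw & RTCL_IMPLEMENTED_MASK


def rtcl_ticks(rtcl_raw):
    """Number of 128 ns oscillator ticks represented by an RTCL value."""
    return rtcl_nanoseconds(rtcl_raw) // NS_PER_TICK


def implemented_bit_count():
    """Bits needed to count to one second, minus bits below the tick size."""
    bits_for_one_second = (NS_PER_SECOND - 1).bit_length()  # 30 bits
    bits_below_tick = NS_PER_TICK.bit_length() - 1          # 7 bits (128 = 2^7)
    return bits_for_one_second - bits_below_tick
```

Under these assumptions the count works out to 30 - 7 = 23 bits, which agrees with the description of the RTCL as a 23-bit counter occupying bits 2–24.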
[Figure: link register (LR), holding the branch address in bits 0–31.]
Note that although the two least-significant bits can accept any values written to them, they
are ignored when the LR is used as an address. The link register can be accessed by the
mtspr and mfspr instructions using SPR number 8. Fetching instructions along the target
path (loaded by an mtspr instruction) is possible provided the link register is loaded
sufficiently ahead of the branch instruction. It is usually possible for the 601 to fetch along
a target path loaded by a branch and link instruction.
Both conditional and unconditional branch instructions include the option of placing the
effective address of the instruction following the branch instruction in the LR.
As a performance optimization, and as an aid for handling the precise exception model, the
601 implements a two-entry link register shadow. Shadowing allows the link register to be
updated by branch instructions that are executed out-of-order with respect to integer
instructions without destroying machine state information if any integer instruction takes
a precise exception. This is not visible from software. The link register is cleared by hard
reset.
Note that although the 601 does not implement a link stack register, one may be
implemented in subsequent PowerPC processors. For compatibility, use of the link register
should be controlled following the description in Section 3.6.1.5, “Branch Conditional to
Link Register Address Mode.”
[Figure: count register (CTR), bits 0–31.]
Fetching instructions along the target path is also possible provided the count register is
loaded sufficiently ahead of the branch instruction.
The count register can be accessed by the mtspr and mfspr instructions by specifying the
SPR number 9. In branch conditional instructions, the BO field specifies the conditions
under which the branch is taken. The first four bits of the BO field specify how the branch
is affected by or affects the condition register and the count register. The encoding for the
BO field is shown in Table 3-25. The count register is cleared by hard reset.
[Figure: machine state register (MSR) layout. Bits 0–15 are reserved; bits 16–31 are shown individually.]
0–15 — Reserved*
17 PR Privilege level
0 The processor can execute both user- and supervisor-level instructions.
1 The processor can only execute user-level instructions.
18 FP Floating-point available
0 The processor prevents dispatch of floating-point instructions, including floating-point
loads, stores, and moves.
1 The processor can execute floating-point instructions, and can take floating-point
enabled exception type program exceptions.
25 EP Exception prefix. The setting of this bit specifies whether an exception vector offset is
prepended with Fs or 0s. In the following description, nnnnn is the offset of the exception. See
Table 5-2.
0 Exceptions are vectored to the physical address x'000n_nnnn'.
1 Exceptions are vectored to the physical address x'FFFn_nnnn'.
28–29 — Reserved
*These reserved bits may be used by other PowerPC processors. Attempting to change these bits does not
affect the operation of the 601. These bit positions always return a zero value when read. Note that bits 15 and
31 (ELE and LE) are defined by the PowerPC architecture to control little- and big-endian mode.
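The effect of MSR[EP] on the exception vector can be expressed as a small calculation. The sketch below is illustrative; exception_vector is a hypothetical name, and the offset is the five-digit hexadecimal vector offset (nnnnn) described for the EP bit.

```python
def exception_vector(offset, ep):
    """Physical address of an exception vector for a given MSR[EP] setting.

    offset: five-hex-digit vector offset, for example 0x00300
    ep:     value of MSR[EP] (0 or 1)
    """
    if not 0 <= offset <= 0xFFFFF:
        raise ValueError("vector offset must fit in five hexadecimal digits")
    # EP = 0 prepends 0s (x'000n_nnnn'); EP = 1 prepends Fs (x'FFFn_nnnn').
    prefix = 0xFFF00000 if ep else 0x00000000
    return prefix | offset
```

With EP set, an offset of 0x00300 yields 0xFFF00300; with EP cleared, the same offset yields 0x00000300.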
The floating-point exception mode bits are interpreted as shown in Table 2-10. For further
details, see Section 5.4.7.1, “Floating-Point Enabled Program Exceptions.” Note that these
bits are logically ORed, so that if either is set the processor operates in precise mode.
Table 2-10. Floating-Point Exception Mode Bits
FE0 FE1 Mode
Table 2-11 indicates the state of the MSR after a hard reset.
Table 2-11. State of MSR at Power Up
Bit Description
0–15 0 (Reserved)
16–18 0
19 1
20–24 0
25 1
26–27 0
28–31 0 (Reserved)
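As a worked example, the rows of Table 2-11 can be combined into a single 32-bit MSR value. The sketch assumes bit 0 is the most significant bit; the names are hypothetical. Bit 25 is the EP bit described earlier, and bit 19 is assumed here to be the machine check enable bit (MSR[ME]) of the PowerPC architecture, which this table does not name.

```python
# Bits that Table 2-11 lists as 1 after a hard reset; all others are 0.
MSR_BITS_SET_AT_RESET = (19, 25)


def ppc_mask(n):
    """Mask for bit n of a 32-bit register, where bit 0 is the MSB."""
    return 1 << (31 - n)


def msr_reset_value():
    """Combine the set bits from Table 2-11 into one 32-bit value."""
    value = 0
    for n in MSR_BITS_SET_AT_RESET:
        value |= ppc_mask(n)
    return value
```

The result is 0x00001040. Because bit 25 (EP) is set, exceptions taken immediately after a hard reset are vectored to addresses of the form x'FFFn_nnnn'.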
[Figure: segment register format (T = 0). T (bit 0), Ks (1), Ku (2), reserved 00000 (3–7), VSID (8–31).]
Segment registers can be accessed by using the mtsr and mtsrin instructions. Segment
register bit settings when T = 0 are described in Table 2-12.
Table 2-12. Segment Register Bit Settings (T = 0)
Bits Name Description
3–7 — Reserved
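The T = 0 segment register format can be unpacked as sketched below. The function names are hypothetical, the field positions are taken from the register layout given for T = 0 (T in bit 0, Ks in bit 1, Ku in bit 2, VSID in bits 8–31), and bit 0 is assumed to be the most significant bit.

```python
def ppc_field(value, first, last):
    """Extract bits first..last (inclusive) of a 32-bit value, bit 0 = MSB."""
    width = last - first + 1
    return (value >> (31 - last)) & ((1 << width) - 1)


def decode_segment_register(sr):
    """Decode a segment register value that uses the T = 0 format."""
    if ppc_field(sr, 0, 0) != 0:
        raise ValueError("T = 1: the I/O controller interface format applies")
    return {
        "T": 0,
        "Ks": ppc_field(sr, 1, 1),     # supervisor-state protection key
        "Ku": ppc_field(sr, 2, 2),     # user-state protection key
        "VSID": ppc_field(sr, 8, 31),  # 24-bit virtual segment ID
    }
```

For example, decode_segment_register(0x60ABCDEF) returns Ks = 1, Ku = 1, and VSID = 0xABCDEF.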
[Figure: segment register format (T = 1), with field boundaries at bits 0, 1, 2, 3–11, 12–27, and 28–31.]
The bits in the segment register when T = 1 are described in Table 2-13.
Table 2-13. Segment Register Bit Settings (T = 1)
Bits Name Description
28–31 Packet 1(0–3) This field contains address bits 0–3 of the
packet 1 cycle (address-only).
FPSCR Zero
SDR1 Zero
HID0 Zero
HID15 Zero
In some cases, not all of a register’s bits are implemented in hardware. For example, the
RTCL register is defined to be 32 bits, but in the 601 only the 23 most significant bits exist
in hardware. Similarly, the DEC register is defined as having 32 bits, but only the 25 most
significant bits are implemented in hardware. In both cases, the unimplemented bits are
returned as zeros when they are read by the mfspr instruction.
The RTCU and RTCL registers can be written only in supervisor mode, and the mtspr instruction requires a
different SPR encoding. For the mtspr instruction, RTCU is SPR20 and RTCL is SPR21.
When the 601 detects SPR encodings other than those defined in this document, it either
takes a program exception (if bit 0 of the SPR encoding is set) or it treats the instruction as
a no-op (if bit 0 of the SPR encoding is clear).
tlbie
1 Context-synchronizing event Context-synchronizing event
1 The context-synchronizing event (most likely an isync instruction) prior to the tlbie instruction ensures that all
previously issued memory access instructions have completed to a point where they will no longer cause an
exception. The context-synchronizing event following the tlbie instruction ensures that subsequent memory
access instructions will not use the TLB entry being invalidated. To ensure that all memory accesses previously
translated by the TLB entry being invalidated have completed with respect to memory and that reference and
change bit updates associated with those memory accesses have completed, a sync instruction rather than a
context-synchronizing event is required after the tlbie instruction. Multiprocessor systems have other
requirements to synchronize TLB invalidation.
Note that the sync instruction, although not defined as context-synchronizing in the
PowerPC architecture, can sometimes be used to provide the required synchronization.
When a sync instruction is encountered, the 601 processor synchronizes updates to the CR,
CTR, LR, MSR, FPSCR, and XER registers.
In general, context-synchronization is required when writes to registers that affect
addressing are preceded or followed by load or store instructions. Specifically, a context-
synchronizing operation or a sync instruction must precede a modification of the BAT or
segment registers when the corresponding address translations are enabled. A sync
instruction must precede the modification of SDR1 when the corresponding (data accesses
with MSR[DT] = 1 or instruction fetches with MSR[IT] = 1) address translations are
enabled, guaranteeing that the reference and change bits are updated in the correct context.
If the corresponding address translations are enabled, a context synchronization operation
must follow the modification of any of the above registers.
When several of the registers listed above are modified with no intervening instructions that
are affected by the changes, context synchronization or sync instructions are not required
between the alterations. However, instructions fetched and/or executed after the alteration
but before the context synchronizing operation may be fetched and/or executed in either the
context that existed before the alteration or the context established by the alteration.
For synchronization within a sequence of instructions, the isync instruction can be used as
shown in the first example.
DSISR
0 31
DAR
0 31
The effective address generated by a memory access instruction is placed in the DAR if the
access causes an exception (I/O controller interface error, or alignment exception). For
information, see Section 5.4.3, “Data Access Exception (x'00300'),” and Section 5.4.6,
“Alignment Exception (x'00600').”
DEC
0 31
Reserved
0–15 HTABORG The high-order 16 bits of the 32-bit physical address of the page table
16–22 — Reserved
The HTABORG field in SDR1 contains the high-order 16 bits of the 32-bit physical address
of the page table. Therefore, the page table is constrained to lie on a 2^16-byte (64-Kbyte)
boundary at a minimum. At least 10 bits from the hash function are used to index into the
page table. The page table must consist of at least 64 Kbytes (2^10 PTEGs of 64 bytes each).
The page table can be any size 2^n bytes, where 16 ≤ n ≤ 25. As the table size is increased, more
bits are used from the hash to index into the table and the value in HTABORG must have
more of its low-order bits equal to 0. The HTABMASK field in SDR1 contains a mask value
that determines how many bits from the hash are used in the page table index. This mask
must be of the form b'00...011...1'; that is, a string of 0 bits followed by a string of 1 bits.
The 1 bits determine how many additional bits (beyond the minimum of 10) from the hash are
used in the index; HTABORG must have this same number of low-order bits equal to 0. See
Figure 6-21.
The number of low-order 0 bits in HTABORG must be at least the number of 1 bits in
HTABMASK so that the final 32-bit physical address can be formed by logically ORing the
various components.
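The ORing of components described above can be sketched as follows. This is an illustration under stated assumptions, not the 601's translation logic: it assumes a 19-bit hash value (10 bits always used, plus up to 9 selected by HTABMASK, which matches the 2^25-byte maximum table size) and a 9-bit HTABMASK. The function names are hypothetical.

```python
def page_table_size(htabmask):
    """Page table size in bytes implied by a 9-bit HTABMASK (assumed width)."""
    return (htabmask + 1) << 16


def pteg_address(htaborg, htabmask, hash_value):
    """Form a 32-bit PTEG physical address by ORing the components together."""
    if not 0 <= htaborg <= 0xFFFF or not 0 <= htabmask <= 0x1FF:
        raise ValueError("HTABORG is 16 bits; HTABMASK is assumed to be 9 bits")
    if htabmask & (htabmask + 1):
        raise ValueError("HTABMASK must be 0 bits followed by 1 bits")
    if htaborg & htabmask:
        raise ValueError("HTABORG low-order bits covered by HTABMASK must be 0")
    # Upper 16 address bits: HTABORG ORed with the masked high-order hash bits.
    upper16 = htaborg | ((hash_value >> 10) & htabmask)
    # The low-order 10 hash bits always select a 64-byte PTEG.
    index10 = hash_value & 0x3FF
    return (upper16 << 16) | (index10 << 6)
```

With HTABMASK = b'000000011', two extra hash bits are used, so the table occupies 256 Kbytes and HTABORG must have its two low-order bits clear.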
When an exception occurs, SRR0 is set to point to an instruction such that all prior
instructions have completed execution and no subsequent instruction has begun execution.
The instruction addressed by SRR0 may not have completed execution, depending on the
exception type. SRR0 addresses either the instruction causing the exception or the
immediately following instruction. The instruction addressed can be determined from the
exception type and status bits.
The SRR0 is cleared by hard reset.
For information on how specific exceptions affect SRR0, refer to the descriptions of
individual exceptions in Chapter 5, “Exceptions.”
SRR1
0 15 16 31
In general, when an exception occurs, bits 0–15 of SRR1 are loaded with exception-specific
information and bits 16–31 of MSR are placed into bits 16–31 of SRR1.
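As a rough illustration of the general case only (individual exceptions may differ, as Chapter 5 describes), the SRR1 value can be modeled as two 16-bit halves. The function name `build_srr1` is hypothetical.

```python
def build_srr1(exception_info, msr):
    """Model the general case: SRR1[0-15] = exception info, SRR1[16-31] = MSR[16-31]."""
    # Bits 0-15 are the most significant half of the 32-bit register.
    return ((exception_info & 0xFFFF) << 16) | (msr & 0xFFFF)
```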
The SRR1 is cleared by hard reset.
For information on how specific exceptions affect SRR1, refer to the individual exceptions
in Chapter 5, “Exceptions.”
Reserved
E 000000000000000000000000000 RID
0 1 27 28 31
This register is provided to support the External Control Input Word Indexed (eciwx) and
External Control Output Word Indexed (ecowx) instructions, which are described in
Chapter 10, “Instruction Set.” Although access to the EAR is privileged, the operating
system can determine which tasks are allowed to issue external access instructions and
when they are allowed to do so. The bit settings for the EAR are described in Table 2-18.
Interpretation of the physical address transmitted by the eciwx and ecowx instructions and
the 32-bit value transmitted by the ecowx instruction is not prescribed by the PowerPC
architecture but is determined by the target device.
For example, if the external control facility is used to support a graphics adapter, the ecowx
instruction could be used to send the translated physical address of a buffer containing
graphics data to the graphics device. The eciwx instruction could be used to load status
information from the graphics adapter.
0 E Enable bit
1 Enabled
0 Disabled
If this bit is set, the eciwx and ecowx instructions can perform the
specified external operation. If the bit is cleared, an eciwx or ecowx
instruction causes a data access exception.
1–27 — Reserved
The EAR can also be accessed with the mtspr and mfspr instructions, using the SPR
number 282 (b'01000 11010'). Synchronization requirements for the EAR are shown in
Table 2-15 and Table 2-16.
The EAR is cleared by hard reset.
Version Revision
0 15 16 31
0 14 15 24 25 27 28 29 30 31
BLPI 0000000000 WIM Ks Ku PP
Reserved
Reserved
Register             Bits   Name  Description
Upper BAT registers  0–14   BLPI  Block logical page index. This field is compared with bits 0–14 of the
                                  logical address to determine if there is a hit in that BAT array entry.
                     15–24  —     Reserved
                     28     Ks    Supervisor mode key. This bit interacts with MSR[PR] and the PP field to
                                  determine the protection for the block. For more information, see
                                  Section 6.4, “General Memory Protection Mechanism.”
                     29     Ku    User mode key. This bit also interacts with MSR[PR] and the PP field to
                                  determine the protection for the block. For more information, see
                                  Section 6.4, “General Memory Protection Mechanism.”
                     30–31  PP    Protection bits for block. This field interacts with MSR[PR] and the Ks or
                                  Ku to determine the protection for the block as described in Section 6.4,
                                  “General Memory Protection Mechanism.”
Lower BAT registers  0–14   PBN   Physical block number. This field is used in conjunction with the BSM
                                  field to generate bits 0–14 of the physical address of the block.
                     15–24  —     Reserved
                     26–31  BSM   Block size mask (0...5). BSM is a mask that encodes the size of the block.
                                  Values for this field are listed in Table 2-20.
1 Mbyte 00 0111
2 Mbytes 00 1111
4 Mbytes 01 1111
8 Mbytes 11 1111
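The visible rows suggest a simple pattern: each additional low-order 1 bit in BSM doubles the block size. A minimal sketch of that relationship follows; the 128-Kbyte unit is inferred from the rows shown here (1 Mbyte for b'00 0111') and should be checked against the full Table 2-20. The function name is hypothetical.

```python
def bat_block_size(bsm):
    """Return the block size in bytes for a 6-bit BSM value (pattern inferred from the table)."""
    if not 0 <= bsm <= 0x3F:
        raise ValueError("BSM is a 6-bit field")
    if bsm & (bsm + 1):
        raise ValueError("BSM must be a run of low-order 1 bits")
    # Assumption: the unit is 128 Kbytes, so b'00 0111' gives 8 * 128 Kbytes = 1 Mbyte.
    return (bsm + 1) * 128 * 1024
```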
1008 11111 10000 Checkstop sources and enables register (HID0) Supervisor
For additional information about the mtspr and mfspr instructions, refer to Chapter 10,
“Instruction Set.”
2.3.3.13.1 Checkstop Sources and Enables Register—HID0
The checkstop sources and enables register (HID0), shown in Figure 2-25, is a supervisor-
level register that defines enable and monitor bits for each of the checkstop sources in the
601. The SPR number for HID0 is 1008.
CE S M TD CD SH DT BA BD CP IU PP 000 ES EM LM
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
DRF
DRL
PAR
Reserved EMC EHP
Table 2-22 defines the bits in HID0. The enable bits (bits 15–31) can be used to mask
individual checkstop sources, although these are provided primarily to mask off any false
reports of such conditions for debugging purposes. Bit 0 (HID0[CE]) is a master checkstop
enable; if it is cleared, all checkstop conditions are disabled; if it is set, individual
conditions can be enabled separately. HID0[EM] (bit 16) enables and disables machine
check checkstops; clearing this bit masks machine check checkstop conditions that occur
when MSR[ME] is cleared. Bits 1–11 are the checkstop source bits, and can be used to
determine the specific cause of a checkstop condition.
Table 2-22. Checkstop Sources and Enables Register (HID0) Definition
Bit Name Description
0 CE Master checkstop enable. Enabled if set. If this bit is cleared and the TEA signal is asserted,
a machine check exception is taken, regardless of the setting of MSR[ME].
12–14 — Reserved
16 EM Enable machine check checkstop. Disabled by hard reset. Enabled if set. If this bit is cleared
and the TEA signal is asserted, a machine check exception is taken, regardless of the setting
of MSR[ME].
19 ESH Enable sequencer time out checkstop. Disabled by hard reset. Enabled if set.
20 EDT Enable dispatch time out checkstop. Disabled by hard reset. Enabled if set.
21 EBA Enable bus address parity checkstop. Disabled by hard reset. Enabled if set.
22 EBD Enable bus data parity checkstop. Disabled by hard reset. Enabled if set.
23 ECP Enable cache parity checkstop. Disabled by hard reset. Enabled if set.
24 EIU Enable for invalid ucode instruction checkstop. Enabled by hard reset. Enabled if set.
25 EPP Enable for I/O controller interface access protocol checkstop. Disabled by hard reset.
Enabled if set.
31 EHP 0 The HP_SNP_REQ signal is disabled. Use of the WRS queue position is restricted to a
snoop hit that occurs when a read is pending. That is, its address tenure is complete but
the data tenure has not begun.
1 The HP_SNP_REQ signal is enabled. Use of the WRS queue position is restricted to a
snoop hit on an address tenure that had HP_SNP_REQ asserted.
All enable bits except 15 and 24 are disabled at start up. The operating system should enable
these checkstop conditions before the power-on reset sequence is complete.
Checkstop enable bits can be set or cleared without restriction. If a checkstop source bit is
set, it can be cleared; however, if the corresponding checkstop condition is still present on
the next clock, the bit will be set again. A checkstop source bit can only be set when the
corresponding checkstop condition occurs and the checkstop enable bit is set; it cannot be
set via an mtspr instruction. That is, you cannot manually cause a checkstop.
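When diagnosing a checkstop, software that has read HID0 might decode the source bits as sketched below. The field names for bits 1–11 are taken from the order of the labels in the HID0 register figure and should be treated as an assumption; the helper names are hypothetical.

```python
# Assumed names for checkstop source bits 1-11, in register-figure order.
SOURCE_BITS = {
    1: "S", 2: "M", 3: "TD", 4: "CD", 5: "SH", 6: "DT",
    7: "BA", 8: "BD", 9: "CP", 10: "IU", 11: "PP",
}


def hid0_bit(hid0, n):
    """Return bit n of HID0, where bit 0 is the most significant bit."""
    return (hid0 >> (31 - n)) & 1


def master_checkstop_enabled(hid0):
    """HID0[CE] (bit 0) is the master checkstop enable."""
    return bool(hid0_bit(hid0, 0))


def checkstop_sources(hid0):
    """List the names of the checkstop source bits that are set."""
    return [name for bit, name in sorted(SOURCE_BITS.items()) if hid0_bit(hid0, bit)]
```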
HID1
Reserved
Table 2-23 shows bit settings for the HID1 register. Note that if both the single instruction
step option is specified for the M field (b'100') and the trap to run mode exception option is
specified in the RM field (b'10'), the processor iterates in an infinite loop.
Table 2-23. HID1 Register Definition
Bit Name Description
0 — Reserved
4–7 — Reserved
17 TL When set, this bit disables the broadcast of the tlbie instruction.
HID2
Reserved
CEA 00
0 29 30 31
Table 2-24 lists HID2 register definitions. The HID2 register is cleared by the hard reset
operation.
The SPR number for HID2 is 1010.
Table 2-24. HID2 Register Definition
Bit Name Description
HID5
Reserved
DAB 0 SA
0 28 29 30 31
0–28 DAB Data address breakpoint (EA). This field is set to the double-word EA to compare with
enabled load or store EAs.
29 — Reserved, although on an mfspr (DABR), the value returned is the value last written.
Load instructions
    If any part of the load access touches the double word specified in the DABR, and the appropriate
    enable bit is set, then the DAE occurs. In this case, the memory read operation is inhibited and
    register rD is not updated. If the operation is a load with update, the update to register rA is also
    inhibited.
Store instructions
    If any part of the store access touches the double word specified in the DABR and the appropriate
    enable bit is set, the DAE occurs and the memory access is inhibited.
    If the operation is a store with update, then the update to register rA is also inhibited.
    If the operation is a Store Conditional instruction and the reservation bit is not set at the time of the
    DABR compare (at the end of execution as soon as the EA is calculated), the DAE is not taken.
Load and store string and multiple instructions
    These instructions are sequenced one register (one word) at a time through the IU for EA
    calculation. Each access is checked against the DABR as it is presented to the ATU. If a match
    occurs, the instruction is aborted and a DAE is taken.
    If the initial EA for the string or multiple is not word-aligned, some individual accesses may cross a
    double-word boundary. If either double word hits in the DABR, the access is inhibited and the DAE
    occurs.
lscbx instruction
    This instruction is not supported by the DABR feature. No DAE occurs, even if the EA matches.
Cache control instructions
    These instructions are not supported by the DABR feature. No DAE occurs even if the EA
    matches.
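The "any part of the access touches the double word" condition amounts to an overlap test between the accessed byte range and one 8-byte-aligned region. A minimal sketch follows; it models only the address comparison, takes the enable decision as a plain boolean rather than decoding the SA field, and ignores address wraparound. The function name is hypothetical.

```python
def dabr_hit(dab_field, ea, length, enabled=True):
    """Return True if an access of `length` bytes at `ea` touches the DABR double word."""
    if not enabled or length <= 0:
        return False
    breakpoint_dw = (dab_field << 3) & 0xFFFFFFFF   # DAB holds EA bits 0-28
    first_dw = ea & ~0x7                            # double word of the first byte
    last_dw = (ea + length - 1) & ~0x7              # double word of the last byte
    return first_dw <= breakpoint_dw <= last_dw
```

A misaligned word at x'0FFE' spans the double words at x'0FF8' and x'1000', so it hits a breakpoint set on either one.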
PIR
Reserved
00000000000000000000000000000 PID
0 27 28 31
The concept of alignment is also applied more generally to data in memory. For example,
a 12-byte block of data is said to be word-aligned if its address is a multiple of four.
Some instructions require their memory operands to have certain alignment. In addition,
alignment may affect performance. For single-register memory access instructions, the best
performance is obtained when memory operands are aligned. Additional effects of data
placement on performance are described in Chapter 7, “Instruction Timing.”
Instructions are four bytes long and word-aligned.
2.4.2.2 Atomicity
All aligned accesses are atomic. Instructions causing multiple accesses (for example,
load/store multiple and move assist instructions) are not atomic.
MSB
0 1 2 n
Big-Endian Bit Ordering
The same instruction sequence can be used to go from little- to big-endian mode by clearing
HID0[28].
8 No change
The modified physical address is passed to the data cache or the main memory and the
specified width of the data is transferred between a GPR or FPR and the (as modified)
addressed memory locations. Although the data is stored using big-endian byte ordering
(but not in the same bytes within double words as with LM = 0), the modification of the EA
makes it appear to the processor that it is stored in little-endian mode.
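The address modification can be modeled as an exclusive-OR of the low-order three EA bits with a value that depends on the operand length. Only the 8-byte "No change" row survives in the table above; the XOR values for 1-, 2-, and 4-byte operands used below are the usual PowerPC little-endian values and should be read as an assumption here. The function name is hypothetical.

```python
# XOR value applied to the low-order three EA bits, keyed by operand length in bytes.
# The entries for 1, 2, and 4 are assumed; 8 ("No change") is from the table above.
_XOR_BY_LENGTH = {1: 0b111, 2: 0b110, 4: 0b100, 8: 0b000}


def modified_address(ea, length):
    """Return the address presented to memory for an aligned scalar when LM = 1."""
    return ea ^ _XOR_BY_LENGTH[length]
```

For instance, a word that the little-endian program addresses at x'00' is actually accessed at x'04', which is consistent with the word 11 12 13 14 appearing in the second half of the first double word of the structure mapping.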
The structure S would be placed in memory as shown in Figure 2-34.
11 12 13 14
00
00 01 02 03 04 05 06 07
21 22 23 24 25 26 27 28
08
08 09 0A 0B 0C 0D 0E 0F
‘D’ ‘C’ ‘B’ ‘A’ 31 32 33 34
10
10 11 12 13 14 15 16 17
(*) (*) 51 52 (*) ‘G’ ‘F’ ‘E’
18
18 19 1A 1B 1C 1D 1E 1F
(*) (*) (*) (*) 61 62 63 64
20
20 21 22 23 24 25 26 27
11 12 13 14
00
07 06 05 04 03 02 01 00
21 22 23 24 25 26 27 28
08
0F 0E 0D 0C 0B 0A 09 08
‘D’ ‘C’ ‘B’ ‘A’ 31 32 33 34
10
17 16 15 14 13 12 11 10
(*) (*) 51 52 (*) ‘G’ ‘F’ ‘E’
18
1F 1E 1D 1C 1B 1A 19 18
61 62 63 64
20
23 22 21 20
Note that as seen by the program executing in the processor, the mapping for the structure
S is identical to the little-endian mapping shown in Figure 2-33. From outside of the
processor, the addresses of the bytes making up the structure S are as shown in Figure 2-34.
These addresses match neither the big-endian mapping of Figure 2-32 nor the little-endian
mapping of Figure 2-33. This must be taken into account when performing I/O operations
in little-endian mode; this is discussed in Section 2.4.7, “PowerPC Input/Output in Little-
Endian Mode.”
Note that the misaligned word in this example spans two double words. The two parts of
the misaligned word are not contiguous in the big-endian addressing space.
2.4.5.3 Non-Scalars
The PowerPC architecture has two types of instructions that handle non-scalars (multiple
instances of scalars). Neither type can deal with the modified EAs required in little-endian
mode and both types cause alignment exceptions.
2.4.5.3.1 String Operations
The load and store string instructions, listed in Table 2-29, cause alignment exceptions
when they are executed in little-endian mode (HID0[LM] = 1).
Table 2-29. Load/Store String Instructions that Take Alignment Exceptions if LM = 1
Mnemonic Description
String accesses are inherently byte-based operations, which, for improved performance, the
601 handles as a series of word-aligned accesses.
Note that the system software must determine whether to emulate the excepting instruction
or treat it as an illegal operation. Because little-endian mode programs are new with respect
to the PowerPC architecture—that is, they are not POWER binaries—having the compiler
generate these instructions in little-endian mode would be slower than processing the string
in-line or by using a subroutine call.
Although the words addressed by these instructions are on word boundaries, each word is
in the half of its containing double word opposite from where it would be in big-endian
mode.
Note that the system software must determine whether to emulate the excepting instruction
or treat it as an illegal operation. Because little-endian mode programs are new with respect
to the PowerPC architecture—that is, they are not POWER binaries—having the compiler
generate these instructions in little-endian mode would be slower than processing the string
in-line or by using a subroutine call.
Assuming the program starts at address 0, these instructions are mapped into memory for
big-endian execution as shown in Figure 2-38.
If this same program is assembled for and executed in little-endian mode, the mapping seen
by the processor appears as shown in Figure 2-39.
Each machine instruction appears in memory as a 32-bit integer containing the value
described in the instruction description, regardless of whether LM is set. This is because
scalars are always mapped in memory in big-endian byte order.
When little-endian mapping is used, all references to the instruction stream must follow
little-endian addressing conventions, including addresses saved in system registers when
the exception is taken, return addresses saved in the link register, and branch displacements
and addresses.
• An instruction address placed in the link register by branch and link, or an
instruction address saved in an SPR when an exception is taken is the address that a
program executing in little-endian mode would use to access the instruction as a
word of data using a load instruction.
• An offset in a relative branch instruction reflects the difference between the
addresses of the instructions, where the addresses used are those that a program
executing in little-endian mode would use to access the instructions as data words
using a load instruction.
G R X                  Interpretation
0 0 0                  IR is exact
0 0 1, 0 1 0, 0 1 1    IR closer to NL
1 0 1, 1 1 0, 1 1 1    IR closer to NH
The significand of the intermediate result is made up of the L bit, the FRACTION, and the
G, R, and X bits.
The infinitely precise intermediate result of an operation is the result normalized in bits L,
FRACTION, G, R, and X of the floating-point accumulator.
Before results are stored into an FPR, the significand is rounded if necessary, using the
rounding mode specified by FPSCR[RN]. If rounding causes a carry into C, the significand
is shifted right one position and the exponent is incremented by one. This may yield an
inexact result and possibly exponent overflow. Fraction bits to the left of the bit position
used for rounding are stored into the FPR, and low-order bit positions, if any, are set to zero.
Four rounding modes are provided which are user-selectable through FPSCR[RN] as
described in Section 2.5.6, “Rounding.” For rounding, the conceptual guard, round, and
sticky bits are defined in terms of accumulator bits.
Table 2-32 shows the positions of the guard, round, and sticky bits for double-precision and
single-precision floating-point numbers.
Table 2-32. Location of the Guard, Round and Sticky Bits
Format Guard Round Sticky
Rounding can be treated as though the significand were shifted right, if required, until the
least significant bit to be retained is in the low-order bit position of the FRACTION. If any
of the guard, round, or sticky bits are nonzero, the result is inexact.
S EXP FRACTION
0 1 8 9 31
S EXP FRACTION
0 1 11 12 63
The exponent is expressed as an 8-bit value for single-precision numbers or an 11-bit value
for double-precision numbers. These bits hold the biased exponent; the true value of the
exponent can be determined by subtracting 127 for single-precision numbers and 1023 for
double-precision values. This is shown in Figure 2-42. Note that using a bias eliminates the
need for a separate exponent sign bit: the highest-order exponent bit both contributes to the
value and acts as an implicit sign bit. Note also that two values are reserved—all bits set indicates that the
number is an infinity or NaN and all bits cleared indicates that the number is either zero or
denormalized.
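The field extraction and bias subtraction for the single-precision format can be sketched as follows. This uses the host's IEEE 754 single-precision encoding through Python's `struct` module as a stand-in for a value in memory; the function name is illustrative.

```python
import struct


def decode_single(value):
    """Split a single-precision value into (sign, biased_exp, fraction, kind, true_exp)."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31                 # bit 0 (most significant)
    biased_exp = (bits >> 23) & 0xFF  # bits 1-8
    fraction = bits & 0x7FFFFF        # bits 9-31
    if biased_exp == 0xFF:            # all exponent bits set: reserved
        kind = "infinity" if fraction == 0 else "NaN"
    elif biased_exp == 0:             # all exponent bits cleared: reserved
        kind = "zero" if fraction == 0 else "denormalized"
    else:
        kind = "normalized"
    # The bias of 127 applies to normalized numbers only.
    true_exp = biased_exp - 127 if kind == "normalized" else None
    return sign, biased_exp, fraction, kind, true_exp
```

The double-precision case is analogous, with an 11-bit exponent field, a 52-bit fraction, and a bias of 1023.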
. . .
10. . . . .00 1 1
01. . . . .10 –1 –1
. . .
Negative . . .
. . .
Tiny Tiny
0 0 0 Nonzero +Denormalized
0 0 0 Zero +0
1 0 0 Zero –0
1 0 0 Nonzero –Denormalized
SIGN OF MANTISSA, 0 OR 1
The ranges covered by the magnitude (M) of a normalized floating-point number are
approximately equal to the following:
Single-precision format:
1.2 × 10^–38 ≤ M ≤ 3.4 × 10^38
Double-precision format:
2.2 × 10^–308 ≤ M ≤ 1.8 × 10^308
Zero format: sign of mantissa 0 or 1, exponent (biased) = 0, fraction = 0.
Infinity format: sign of mantissa 0 or 1, exponent (biased) = maximum, fraction = 0.
The fraction value is zero. Infinities are used to approximate values greater in magnitude
than the maximum normalized value. Infinity arithmetic is defined as the limiting case of
real arithmetic, with restricted operations defined between numbers and infinities. Infinities
and the reals can be related by ordering in the affine sense:
–∞ < every finite number < +∞
Arithmetic using infinite numbers is always exact and does not signal any exception, except
when an exception occurs due to the invalid operations as described in Section 5.4.7.2,
“Invalid Operation Exception Conditions.”
Signaling NaNs signal exceptions when they are specified as arithmetic operands.
0 111...1 1000....0
Bit 35
S EXP x x x x x x x x x xx x x x x x x x x x x x x 00000000000000000000000000000
0 1 11 12 63
The frspx instruction allows conversion from double- to single-precision with appropriate
exception checking and rounding. This instruction should be used to convert double-
precision floating-point values (produced by double-precision load and arithmetic
instructions) to single-precision values before storing them into single-format memory
elements or using them as operands for single-precision arithmetic instructions. Values
produced by single-precision load and arithmetic instructions can be stored directly, or used
directly as operands for single-precision arithmetic instructions, without preceding the
store, or the arithmetic instruction, by frspx.
A single-precision value can be used in double-precision arithmetic operations. The reverse
is true only if the double-precision value can be represented in single-precision format.
Some implementations may execute single-precision arithmetic instructions faster than
double-precision arithmetic instructions. Therefore, if double-precision accuracy is not
required, using single-precision data and instructions can speed operations.
2.5.6 Rounding
All arithmetic instructions defined by the PowerPC architecture produce an intermediate
result considered infinitely precise. This result must then be written with a precision of
finite length into an FPR. After normalization or denormalization, if the infinitely precise
intermediate result cannot be represented in the precision required by the instruction, it is
rounded before being placed into the target FPR.
The instructions that potentially round their result are the arithmetic, multiply-add, and
rounding and conversion instructions. As shown in Figure 2-51, whether rounding occurs
depends on the source values.
Yes
FI = 1
Fraction No
Incremented FR = 0
Yes
FI = 1
Each of these instructions sets FPSCR bits FR and FI, according to whether rounding
occurs (FI) and whether the fraction was incremented (FR). If rounding occurs, FI is set to
one and FR may be either zero or one. If rounding does not occur, both FR and FI are
cleared. Other floating-point instructions do not alter FR and FI. Four modes of rounding
are provided that are user-selectable through the floating-point rounding control field in the
FPSCR. See Section 2.2.3, “Floating-Point Status and Control Register (FPSCR).” The
encodings are listed in Table 2-35.
Table 2-35. FPSCR Bit Settings—RN Field
RN Rounding Mode
00 Round to nearest
Let Z be the infinitely precise intermediate arithmetic result or the operand of a conversion
operation. If Z can be represented exactly in the target format, no rounding occurs and the
result in all rounding modes is equivalent to truncation of Z. If Z cannot be represented
exactly in the target format, let Z1 and Z2 be the next larger and next smaller numbers
representable in the target format that bound Z; then Z1 or Z2 can be used to approximate
the result in the target format.
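The choice between Z1 and Z2 can be illustrated with a toy target format in which only integers are representable, so that Z2 is the floor of Z and Z1 is Z2 + 1. This is a sketch of the selection rule only, not of the 601's floating-point datapath; the mode names are descriptive strings rather than RN encodings, and "even" here means an even integer, standing in for a result whose least significant bit is 0.

```python
import math
from fractions import Fraction


def round_to_integer_grid(z, mode):
    """Round an exact value z to the integer grid using one of four rounding modes."""
    z = Fraction(z)
    z2 = math.floor(z)          # next smaller representable value
    if z == z2:
        return z2               # exactly representable: no rounding occurs
    z1 = z2 + 1                 # next larger representable value
    if mode == "toward_minus_infinity":
        return z2
    if mode == "toward_plus_infinity":
        return z1
    if mode == "toward_zero":
        return z2 if z > 0 else z1
    if mode == "nearest":
        if z1 - z < z - z2:
            return z1
        if z - z2 < z1 - z:
            return z2
        return z1 if z1 % 2 == 0 else z2   # tie: choose the even value
    raise ValueError("unknown rounding mode: " + mode)
```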
By incrementing LSB of Z
Infinitely precise value
By truncating after LSB
Z2 Z1 0 Z2 Z1
Z Z
Negative values Positive values
Does Z fit
Yes
Rounding = Truncation
target format?
No
Z1 ≤ Z ≤ Z2
No
Yes
Round
Choose Z2
toward –∞?
No
Yes
Round
Choose Z1
toward +∞?
No
Yes
Round
Choose Z1
toward 0?
No
Round Choose best approxi-
to nearest mation (Z1 or Z2)
if tie
Choose even value (Z1
or Z2 whose lsb is 0)
2.7 Reset
The following sections describe hard reset and soft reset in the 601 processor. For more
information about the reset exception see Section 5.4.1, “Reset Exceptions (x'00100').”
SRR0 00000000
Notes: 1 In the earliest release of the 601 (DD1), this is 00010000. Later versions of the hardware may be different.
2 Master checkstop enable on; sequencer GPR self-test checkstop and invalid microcode instruction checkstop on.
3 Note that if an external clock is connected to RTC for the 601, then the RTCL, RTCU, and DEC can change from
their initial value of 0s without receiving instructions to load those registers.
This chapter describes instructions and address modes supported by the PowerPC 601
microprocessor. These instructions are divided into the following categories:
• Integer instructions—These include arithmetic and logical instructions.
• Floating-point instructions—These include floating-point arithmetic instructions, as
well as instructions that affect the floating-point status and control register.
• Load/store instructions—These include integer and floating-point load and store
instructions.
• Flow control instructions—These include branching instructions, condition register
logical instructions, trap instructions, and other instructions that affect the
instruction flow.
• Processor control instructions—These instructions are used for synchronizing
memory accesses and management of caches, TLBs, and the segment registers.
This grouping of the instructions does not necessarily indicate the execution unit that
processes a particular instruction or group of instructions. This information, which is useful
in taking full advantage of the 601’s superscalar parallel instruction execution, is provided
in Chapter 10, “Instruction Set.”
Integer instructions operate on byte, half-word, and word operands. Floating-point
instructions operate on single-precision and double-precision floating-point operands. The
PowerPC architecture uses instructions that are four bytes long and word-aligned. It
provides for byte, half-word, and word operand fetches and stores between memory and a
set of 32 general-purpose registers (GPRs). It also provides for word and double-word
operand fetches and stores between memory and a set of 32 floating-point registers (FPRs).
Arithmetic and logical instructions do not read or modify memory. To use the contents of a
memory location in a computation and then modify the same or another memory location,
the memory contents must be loaded into a register, modified, and then written back to the
target location using load or store instructions.
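The integer arithmetic summaries that follow express subtraction as a sum of the form ¬(rA) + (rB) + 1, and the extended forms add XER[CA] instead of the constant 1. A minimal 32-bit model of that arithmetic is sketched below. The helper names are illustrative; the sketch returns the carry out of bit 0 alongside the result, on the understanding that the carrying and extended forms record it in XER[CA], and it does not model CR or OV updates.

```python
MASK32 = 0xFFFFFFFF


def add_with_carry(a, b, carry_in=0):
    """Return (32-bit sum, carry out) for a + b + carry_in."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def subfc(ra, rb):
    """Subtract from carrying: ¬(rA) + (rB) + 1, that is, rB - rA."""
    return add_with_carry(~ra & MASK32, rb, 1)


def subfe(ra, rb, ca):
    """Subtract from extended: ¬(rA) + (rB) + XER[CA]."""
    return add_with_carry(~ra & MASK32, rb, ca)
```

A 64-bit subtraction can then be composed from the low words with `subfc` and the high words with `subfe`, passing the carry from the first step into the second.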
Add Immediate    addi rD,rA,SIMM
    The sum (rA|0) + SIMM is placed into register rD.
Add Immediate Shifted    addis rD,rA,SIMM
    The sum (rA|0) + (SIMM || x'0000') is placed into register rD.
Add    add, add., addo, addo. rD,rA,rB
    The sum (rA) + (rB) is placed into register rD.
    add     Add
    add.    Add with CR Update. The dot suffix enables the update of the condition register.
    addo    Add with Overflow Enabled. The o suffix enables the overflow bit (OV) in the XER.
    addo.   Add with Overflow and CR Update. The o. suffix enables the update of the condition
            register and enables the overflow bit (OV) in the XER.
Subtract from    subf, subf., subfo, subfo. rD,rA,rB
    The sum ¬ (rA) + (rB) + 1 is placed into rD.
    subf    Subtract from
    subf.   Subtract from with CR Update. The dot suffix enables the update of the condition register.
    subfo   Subtract from with Overflow Enabled. The o suffix enables the overflow bit (OV) in
            the XER.
    subfo.  Subtract from with Overflow and CR Update. The o. suffix enables the update of the
            condition register and enables the overflow bit (OV) in the XER.
Add Immediate Carrying    addic rD,rA,SIMM
    The sum (rA) + SIMM is placed into register rD.
Add Immediate Carrying and Record    addic. rD,rA,SIMM
    The sum (rA) + SIMM is placed into rD. The condition register is updated.
Subtract from Immediate Carrying    subfic rD,rA,SIMM
    The sum ¬ (rA) + SIMM + 1 is placed into register rD.
Add Carrying    addc, addc., addco, addco. rD,rA,rB
    The sum (rA) + (rB) is placed into register rD.
    addc    Add Carrying
    addc.   Add Carrying with CR Update. The dot suffix enables the update of the condition register.
    addco   Add Carrying with Overflow Enabled. The o suffix enables the overflow bit (OV) in the XER.
    addco.  Add Carrying with Overflow and CR Update. The o. suffix enables the update of the
            condition register and enables the overflow bit (OV) in the XER.
Subtract from Carrying    subfc, subfc., subfco, subfco. rD,rA,rB
    The sum ¬ (rA) + (rB) + 1 is placed into register rD.
    subfc    Subtract from Carrying
    subfc.   Subtract from Carrying with CR Update. The dot suffix enables the update of the
             condition register.
    subfco   Subtract from Carrying with Overflow. The o suffix enables the overflow bit (OV) in the XER.
    subfco.  Subtract from Carrying with Overflow and CR Update. The o. suffix enables the update
             of the condition register and enables the overflow bit (OV) in the XER.
Add adde rD,rA,rB The sum (rA) + (rB) + XER[CA] is placed into register rD.
Extended adde.
adde Add Extended
addeo
adde. Add Extended with CR Update. The dot suffix enables the
addeo.
update of the condition register.
addeo Add Extended with Overflow. The o suffix enables the
overflow bit (OV) in the XER.
addeo. Add Extended with Overflow and CR Update. The o. suffix
enables the update of the condition register and enables
the overflow bit (OV) in the XER.
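The carrying and extended forms are normally paired for multiword arithmetic: addc produces the low-order word and records the carry in XER[CA], and adde consumes that carry for the next word. The following Python sketch models this pairing under stated assumptions. The function names, the (result, CA) return convention, and the add64 helper are illustrative only and are not part of the 601 definition.

```python
MASK32 = 0xFFFFFFFF

def addc(ra, rb):
    """Model of addc rD,rA,rB: returns (rD, CA) for (rA) + (rB)."""
    total = (ra & MASK32) + (rb & MASK32)
    return total & MASK32, total >> 32

def adde(ra, rb, ca):
    """Model of adde rD,rA,rB: returns (rD, CA) for (rA) + (rB) + XER[CA]."""
    total = (ra & MASK32) + (rb & MASK32) + ca
    return total & MASK32, total >> 32

def add64(a_hi, a_lo, b_hi, b_lo):
    """Hypothetical helper: 64-bit add built from one addc and one adde."""
    lo, ca = addc(a_lo, b_lo)      # low words, carry out recorded
    hi, _ = adde(a_hi, b_hi, ca)   # high words plus the carry in
    return hi, lo
```

With these definitions, add64(0, 0xFFFFFFFF, 0, 1) returns (1, 0): the low-order add wraps to zero and the carry propagates into the high-order word.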
Subtract subfe rD,rA,rB The sum ¬ (rA) + (rB) + XER[CA] is placed into register rD.
from subfe.
subfe Subtract from Extended
Extended subfeo
subfe. Subtract from Extended with CR Update. The dot suffix
subfeo.
enables the update of the condition register.
subfeo Subtract from Extended with Overflow. The o suffix
enables the overflow bit (OV) in the XER.
subfeo. Subtract from Extended with Overflow and CR Update.
The o. suffix enables the update of the condition register
and enables the overflow (OV) bit in the XER.
Add to addme rD,rA The sum (rA) + XER[CA] + x'FFFFFFFF' is placed into register rD.
Minus One addme.
addme Add to Minus One Extended
Extended addmeo
addme. Add to Minus One Extended with CR Update. The dot
addmeo.
suffix enables the update of the condition register.
addmeo Add to Minus One Extended with Overflow. The o suffix
enables the overflow bit (OV) in the XER.
addmeo. Add to Minus One Extended with Overflow and CR
Update. The o. suffix enables the update of the condition
register and enables the overflow (OV) bit in the XER.
Subtract subfme rD,rA The sum ¬ (rA) + XER[CA] + x'FFFFFFFF' is placed into register rD.
from Minus subfme.
subfme Subtract from Minus One Extended
One subfmeo
subfme. Subtract from Minus One Extended with CR Update. The
Extended subfmeo.
dot suffix enables the update of the condition register.
subfmeo Subtract from Minus One Extended with Overflow. The o
suffix enables the overflow bit (OV) in the XER.
subfmeo. Subtract from Minus One Extended with Overflow and CR
Update. The o. suffix enables the update of the condition
register and enables the overflow bit (OV) in the XER.
Add to Zero addze rD,rA The sum (rA) + XER[CA] is placed into register rD.
Extended addze.
addze Add to Zero Extended
addzeo
addze. Add to Zero Extended with CR Update. The dot suffix
addzeo.
enables the update of the condition register.
addzeo Add to Zero Extended with Overflow. The o suffix enables
the overflow bit (OV) in the XER.
addzeo. Add to Zero Extended with Overflow and CR Update. The
o. suffix enables the update of the condition register and
enables the overflow bit (OV) in the XER.
Subtract subfze rD,rA The sum ¬ (rA) + XER[CA] is placed into register rD.
from Zero subfze.
subfze Subtract from Zero Extended
Extended subfzeo
subfze. Subtract from Zero Extended with CR Update. The dot
subfzeo.
suffix enables the update of the condition register.
subfzeo Subtract from Zero Extended with Overflow. The o suffix
enables the overflow bit (OV) in the XER.
subfzeo. Subtract from Zero Extended with Overflow and CR
Update. The o. suffix enables the update of the condition
register and enables the overflow bit (OV) in the XER.
Negate neg rD,rA The sum ¬ (rA) + 1 is placed into register rD.
neg.
neg Negate
nego
neg. Negate with CR Update. The dot suffix enables the update
nego.
of the condition register.
nego Negate with Overflow. The o suffix enables the overflow bit
(OV) in the XER.
nego. Negate with Overflow and CR Update. The o. suffix
enables the update of the condition register and enables
the overflow bit (OV) in the XER.
Multiply mulli rD,rA,SIMM The low-order 32 bits of the 48-bit product (rA)∗SIMM are placed into
Low register rD. The low-order 32 bits of the product are the correct 32-bit
Immediate product. The low-order bits are independent of whether the operands
are treated as signed or unsigned integers. However, XER[OV] is set
based on the result interpreted as a signed integer.
The high-order bits are lost. This instruction can be used with
mulhwx to calculate a full 64-bit product.
Multiply mullw rD,rA,rB The low-order 32 bits of the 64-bit product (rA)∗(rB) are placed into
Low mullw. register rD. The low-order 32 bits of the product are the correct 32-bit
mullwo product. The low-order bits are independent of whether the operands
mullwo. are treated as signed or unsigned integers. However, XER[OV] is set
based on the result interpreted as a signed integer.
The high-order bits are lost. This instruction can be used with
mulhwx to calculate a full 64-bit product. This instruction may execute
faster if rB contains the operand having the smaller absolute value.
mullw Multiply Low
mullw. Multiply Low with CR Update. The dot suffix enables the
update of the condition register.
mullwo Multiply Low with Overflow. The o suffix enables the
overflow bit (OV) in the XER.
mullwo. Multiply Low with Overflow and CR Update. The o. suffix
enables the update of the condition register and enables
the overflow bit (OV) in the XER.
Multiply mulhw rD,rA,rB The contents of rA and rB are interpreted as 32-bit signed integers.
High Word mulhw. The 64-bit product is formed. The high-order 32 bits of the 64-bit
product are placed into rD.
Both operands and the product are interpreted as signed integers.
This instruction may execute faster if rB contains the operand having
the smaller absolute value.
mulhw Multiply High Word
mulhw. Multiply High Word with CR Update. The dot suffix enables
the update of the condition register.
Multiply mulhwu rD,rA,rB The contents of rA and of rB are extracted and interpreted as 32-bit
High Word mulhwu. unsigned integers. The 64-bit product is formed. The high-order 32
Unsigned bits of the 64-bit product are placed into rD.
Both operands and the product are interpreted as unsigned integers.
This instruction may execute faster if rB contains the operand having
the smaller absolute value.
mulhwu Multiply High Word Unsigned
mulhwu. Multiply High Word Unsigned with CR Update. The dot
suffix enables the update of the condition register.
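The multiply entries above note that mullw can be combined with mulhwx to obtain a full 64-bit product. The Python sketch below models that combination. The function names and the to_signed32 helper are assumptions made for illustration; timing effects and XER updates are not modeled.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mullw(ra, rb):
    """Low-order 32 bits of the product; identical for signed and unsigned operands."""
    return ((ra & MASK32) * (rb & MASK32)) & MASK32

def mulhwu(ra, rb):
    """High-order 32 bits of the unsigned 64-bit product."""
    return ((ra & MASK32) * (rb & MASK32)) >> 32

def mulhw(ra, rb):
    """High-order 32 bits of the signed 64-bit product."""
    return ((to_signed32(ra) * to_signed32(rb)) >> 32) & MASK32

def full_product_unsigned(ra, rb):
    """Hypothetical helper: concatenate the mulhwu and mullw results."""
    return (mulhwu(ra, rb) << 32) | mullw(ra, rb)
```

For the operands x'FFFFFFFF' and 2, mullw yields x'FFFFFFFE' in both interpretations, while mulhwu yields 1 and mulhw yields x'FFFFFFFF', which shows that only the high-order word depends on signedness.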
Divide Word divw rD,rA,rB The dividend is the signed value of (rA). The divisor is the signed
divw. value of (rB). The quotient is placed into rD. The remainder is not
divwo supplied as a result.
divwo. Both operands are interpreted as signed integers. The quotient is the
unique signed integer that satisfies the following:
dividend = (quotient * divisor) + r
where 0 ≤ r < |divisor| if the dividend is non-negative, and –|divisor| <
r ≤ 0 if the dividend is negative.
If an attempt is made to perform any of the divisions
x'8000_0000' /–1
or
<anything> / 0
Divide divwu rD,rA,rB The dividend is the value of (rA). The divisor is the value of (rB). The
Word divwu. 32-bit quotient is placed into rD. The remainder is not supplied as a
Unsigned divwuo result.
divwuo. Both operands are interpreted as unsigned integers. The quotient is
the unique unsigned integer that satisfies the following:
dividend = (quotient * divisor) + r
where 0 ≤ r < divisor.
If an attempt is made to perform the division
<anything> / 0
the contents of register rD are undefined, as are the contents of the
LT, GT, and EQ bits of the condition register field CR0 if the
instruction has the condition register updating enabled. In these
cases, if instruction overflow is enabled, then XER[OV] is set.
The 32-bit unsigned remainder of dividing (rA) by (rB) can be
computed as follows:
divwu rD,rA,rB rD = quotient
mullw rD,rD,rB rD = quotient * divisor
subf rD,rD,rA rD = remainder
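The three-instruction remainder sequence above can be checked with a small Python model. This is a sketch under stated assumptions: the function names mirror the mnemonics for readability, remainder_u32 is a hypothetical helper, and division by zero (which leaves rD undefined on the processor) is rejected rather than modeled.

```python
MASK32 = 0xFFFFFFFF

def divwu(ra, rb):
    """Unsigned 32-bit quotient; the divide-by-zero case is undefined in hardware."""
    if rb & MASK32 == 0:
        raise ZeroDivisionError("rD is undefined when the divisor is zero")
    return (ra & MASK32) // (rb & MASK32)

def mullw(ra, rb):
    """Low-order 32 bits of the product."""
    return ((ra & MASK32) * (rb & MASK32)) & MASK32

def subf(ra, rb):
    """NOT(rA) + (rB) + 1, that is, (rB) - (rA) truncated to 32 bits."""
    return ((~ra & MASK32) + (rb & MASK32) + 1) & MASK32

def remainder_u32(ra, rb):
    rd = divwu(ra, rb)   # rD = quotient
    rd = mullw(rd, rb)   # rD = quotient * divisor
    rd = subf(rd, ra)    # rD = dividend - quotient * divisor
    return rd
```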
Difference dozi rD,rA,SIMM This is a POWER instruction, and is not part of the PowerPC
or Zero architecture. This instruction will not be supported by other
Immediate PowerPC implementations.
The sum ¬ (rA) + SIMM + 1 is placed into register rD if greater than 0;
if the sum is less than or equal to 0, register rD is cleared to 0.
This instruction is specific to the 601.
Difference doz rD,rA,rB This is a POWER instruction, and is not part of the PowerPC
or Zero doz. architecture. This instruction will not be supported by other
dozo PowerPC implementations.
dozo.
The sum ¬ (rA) + (rB) + 1 is placed into register rD. If the value in
register rA is algebraically greater than the value in register rB,
register rD is cleared.
If the instruction has condition register updating enabled, condition
register field CR0 is set to reflect the result placed in register rD (i.e.,
if register rD is set to zero, EQ is set to 1).
If the instruction has overflow enabled, XER[OV] is only set on
positive overflows.
doz Difference or Zero
doz. Difference or Zero with CR Update. The dot suffix enables
the update of the condition register.
dozo Difference or Zero with Overflow. The o suffix enables the
overflow bit (OV) in the XER.
dozo. Difference or Zero with Overflow and CR Update. The o.
suffix enables the update of the condition register and
enables the overflow bit (OV) in the XER.
This instruction is specific to the 601.
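The clamping behavior of doz can be illustrated with a short Python model of the description above. The function names are illustrative assumptions; condition register and XER[OV] updates are omitted.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def doz(ra, rb):
    """Model of doz rD,rA,rB: (rB) - (rA), or zero if (rA) > (rB) as signed values."""
    if to_signed32(ra) > to_signed32(rb):
        return 0
    return ((~ra & MASK32) + (rb & MASK32) + 1) & MASK32
```

In this model doz(3, 10) gives 7 and doz(10, 3) gives 0, so the result never represents a negative difference.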
Absolute abs rD,rA This is a POWER instruction, and is not part of the PowerPC
abs. architecture. This instruction will not be supported by other
abso PowerPC implementations.
abso.
The absolute value |(rA)| is placed into register rD. If register rA
contains the most negative number (i.e., x'80000000'), the result of
the instruction is the most negative number and sets the XER[OV] bit
if enabled.
abs Absolute
abs. Absolute with CR Update. The dot suffix enables the
update of the condition register.
abso Absolute with Overflow. The o suffix enables the overflow
bit (OV) in the XER.
abso. Absolute with Overflow and CR Update. The o. suffix
enables the update of the condition register and enables
the overflow bit (OV) in the XER.
This instruction is specific to the 601.
Negative nabs rD,rA This is a POWER instruction, and is not part of the PowerPC
Absolute nabs. architecture. This instruction will not be supported by other
nabso PowerPC implementations.
nabso.
The negative absolute value –|(rA)| is placed into register rD.
Note: nabs never overflows. If the instruction is overflow enabled,
then XER[OV] is cleared to zero and XER[SO] is not changed.
nabs Negative Absolute
nabs. Negative Absolute with CR Update. The dot suffix enables
the update of the condition register.
nabso Negative Absolute with Overflow. The o suffix enables the
overflow bit (OV) in the XER.
nabso. Negative Absolute with Overflow and CR Update. The o.
suffix enables the update of the condition register and
enables the overflow bit (OV) in the XER.
This instruction is specific to the 601.
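The edge case for the most negative number is easiest to see in a model. The Python sketch below follows the abs and nabs descriptions above; the names abs32 and nabs32 are hypothetical, and the XER and condition register side effects are not modeled.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def abs32(ra):
    """Model of abs rD,rA: |(rA)| truncated to 32 bits."""
    value = to_signed32(ra)
    return (-value if value < 0 else value) & MASK32

def nabs32(ra):
    """Model of nabs rD,rA: -|(rA)|, which always fits in 32 bits."""
    value = to_signed32(ra)
    return (value if value < 0 else -value) & MASK32
```

Because +2^31 cannot be represented in 32 bits, abs32(0x80000000) wraps back to x'80000000', which matches the overflow case described for abs. The nabs32 model has no such case.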
Multiply mul rD,rA,rB This is a POWER instruction, and is not part of the PowerPC
mul. architecture. This instruction will not be supported by other
mulo PowerPC implementations.
mulo.
Bits 0–31 of the product (rA)∗(rB) are placed into register rD. Bits
32–63 of the product (rA)∗(rB) are placed into the MQ register.
If the condition register updating is enabled, then LT, GT, and EQ
reflect the result in the low-order 32 bits (contents of MQ register). If
the instruction is overflow enabled, then the XER[SO] and XER[OV]
bits are set to one if the product cannot be represented in 32 bits.
mul Multiply
mul. Multiply with CR Update. The dot suffix enables the update
of the condition register.
mulo Multiply with Overflow. The o suffix enables the overflow
bit (OV) in the XER.
mulo. Multiply with Overflow and CR Update. The o. suffix
enables the update of the condition register and enables
the overflow bit (OV) in the XER.
This instruction is specific to the 601.
Divide div rD,rA,rB This is a POWER instruction, and is not part of the PowerPC
div. architecture. This instruction will not be supported by other
divo PowerPC implementations.
divo.
The quotient [(rA) || (MQ)]/(rB) is placed into register rD. The
remainder is placed in the MQ register. The remainder has the same
sign as the dividend, except that a zero quotient or a zero remainder
is always positive. The results obey the equation:
dividend = (divisor ∗ quotient) + remainder
where dividend is the original (rA) || (MQ), divisor is the original (rB),
quotient is the final (rD), and remainder is the final (MQ).
If the condition register updating is enabled, condition register field
CR0 bits LT, GT, and EQ reflect the remainder. If the instruction is
overflow enabled, then the XER[SO] and XER[OV] bits are set to one
if the quotient cannot be represented in 32 bits.
For the case of –2^31/–1, the MQ register is cleared to zero and –2^31 is
placed in register rD. For all other overflows, (MQ), (rD), and
condition register field CR0 (if condition register updating is enabled)
are undefined.
div Divide
div. Divide with CR Update. The dot suffix enables the update
of the condition register.
divo Divide with Overflow. The o suffix enables the overflow bit
(OV) in the XER.
divo. Divide with Overflow and CR Update. The o. suffix enables
the update of the condition register and enables the
overflow bit (OV) in the XER.
This instruction is specific to the 601.
Divide Short divs rD,rA,rB This is a POWER instruction, and is not part of the PowerPC
divs. architecture. This instruction will not be supported by other
divso PowerPC implementations.
divso.
The quotient (rA)/(rB) is placed into register rD. The remainder is
placed in MQ. The remainder has the same sign as the dividend,
except that a zero quotient or a zero remainder is always positive.
The results obey the equation:
dividend = (divisor ∗ quotient) + remainder
where the dividend is the original (rA), divisor is the original (rB),
quotient is the final (rD), and remainder is the final (MQ).
If the condition register updating is enabled, then the condition
register field CR0 bits LT, EQ, and GT reflect the remainder. If the
instruction is overflow enabled, then the XER[SO] and XER[OV] bits
are set to one if the quotient cannot be represented in 32 bits (e.g., as
is the case when the divisor is zero, or the dividend is –2^31 and the
divisor is –1). For the case of –2^31/–1, the MQ register is cleared and
–2^31 is placed in register rD. For all other overflows, (MQ), (rD), and
condition register field CR0 (if condition register updating is enabled)
are undefined.
divs Divide Short
divs. Divide Short with CR Update. The dot suffix enables the
update of the condition register.
divso Divide Short with Overflow. The o suffix enables the
overflow bit (OV) in the XER.
divso. Divide Short with Overflow and CR Update. The o. suffix
enables the update of the condition register and enables
the overflow bit (OV) in the XER.
This instruction is specific to the 601.
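The sign rules for divs can be modeled in a few lines of Python. The sketch assumes that the quotient is truncated toward zero, which is what the stated equation and remainder sign rule imply. The function name, the (rD, MQ) return convention, and the use of an exception for the undefined divide-by-zero case are illustrative choices.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def divs(ra, rb):
    """Model of divs rD,rA,rB: returns (rD, MQ) as 32-bit patterns."""
    dividend, divisor = to_signed32(ra), to_signed32(rb)
    if divisor == 0:
        raise ZeroDivisionError("rD and MQ are undefined for a zero divisor")
    if dividend == -(1 << 31) and divisor == -1:
        return 0x80000000, 0            # documented overflow case
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient            # truncate toward zero
    remainder = dividend - quotient * divisor
    return quotient & MASK32, remainder & MASK32
```

Dividing –7 by 2 in this model gives a quotient of –3 and a remainder of –1, so the remainder carries the sign of the dividend.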
In addition to supporting all of the PowerPC integer arithmetic instructions, the 601
supports the POWER arithmetic instructions summarized in Table 3-1 and Table 3-2 and
described in detail in Chapter 10, “Instruction Set.” To remain fully compatible with
future PowerPC implementations, software must either emulate these operations in the
program exception handler or avoid them entirely.
absx Absolute
mulx Multiply
divx Divide
Compare cmpi crfD,L,rA,SIMM The contents of register rA is compared with the sign-extended
Immediate value of the SIMM operand, treating the operands as signed
integers. The result of the comparison is placed into the CR field
specified by operand crfD.
Compare cmp crfD,L,rA,rB The contents of register rA is compared with register rB, treating
the operands as signed integers. The result of the comparison is
placed into the CR field specified by operand crfD.
Compare cmpli crfD,L,rA,UIMM The contents of register rA is compared with x'0000' || UIMM,
Logical treating the operands as unsigned integers. The result of the
Immediate comparison is placed into the CR field specified by operand crfD.
Compare cmpl crfD,L,rA,rB The contents of register rA is compared with register rB, treating
Logical the operands as unsigned integers. The result of the comparison is
placed into the CR field specified by operand crfD.
While the PowerPC architecture specifies that the value in the L field determines whether
the operands are treated as 32- or 64-bit values, the 601 ignores the value in the L field and
treats the operands as 32-bit values. The simplified mnemonics for integer compare
instructions as shown in Table 3-4 correctly clear the L value in the instruction rather than
requiring it to be coded as a numeric operand.
The following examples demonstrate the use of the simplified word compare mnemonics:
• Compare 32 bits in register rA with immediate value 100 and place result in
condition register field CR0.
cmpwi rA,100 (equivalent to cmpi 0,0,rA,100)
• Same as the first example, but place the result in condition register field CR4.
cmpwi cr4,rA,100 (equivalent to cmpi 4,0,rA,100)
• Compare registers rA and rB as logical 32-bit quantities and place result in
condition register field CR0.
cmplw rA,rB (equivalent to cmpl 0,0,rA,rB)
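The difference between the signed and logical compares in the examples above can be sketched in Python. The function names follow the simplified mnemonics, and returning the name of the CR bit that would be set (LT, GT, or EQ) is a simplifying assumption. The SO bit and the choice of CR field are not modeled.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def _relation(a, b):
    if a < b:
        return "LT"
    if a > b:
        return "GT"
    return "EQ"

def cmpwi(ra, simm):
    """Signed compare of (rA) with an immediate given as a Python integer."""
    return _relation(to_signed32(ra), simm)

def cmplw(ra, rb):
    """Unsigned (logical) compare of (rA) with (rB)."""
    return _relation(ra & MASK32, rb & MASK32)
```

The same bit pattern can compare differently under the two interpretations: x'FFFFFFFF' is less than 100 as a signed value but greater than 100 as an unsigned value.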
AND andi. rA,rS,UIMM The contents of rS is ANDed with x'0000' || UIMM and the result is
Immediate placed into rA.
AND andis. rA,rS,UIMM The contents of rS is ANDed with UIMM || x'0000' and the result is
Immediate placed into rA.
Shifted
OR ori rA,rS,UIMM The contents of rS is ORed with x'0000' || UIMM and the result is
Immediate placed into rA.
The preferred no-op is ori 0,0,0.
OR oris rA,rS,UIMM The contents of rS is ORed with UIMM || x'0000' and the result is
Immediate placed into rA.
Shifted
XOR xori rA,rS,UIMM The contents of rS is XORed with x'0000' || UIMM and the result is
Immediate placed into rA.
XOR xoris rA,rS,UIMM The contents of rS is XORed with UIMM || x'0000' and the result is
Immediate placed into rA.
Shifted
AND and rA,rS,rB The contents of rS is ANDed with the contents of register rB and the
and. result is placed into rA.
and AND
and. AND with CR Update. The dot suffix enables the update of
the condition register.
OR or rA,rS,rB The contents of rS is ORed with the contents of rB and the result is
or. placed into rA.
or OR
or. OR with CR Update. The dot suffix enables the update of the
condition register.
XOR xor rA,rS,rB The contents of rS is XORed with the contents of rB and the result is
xor. placed into register rA.
xor XOR
xor. XOR with CR Update. The dot suffix enables the update of
the condition register.
NAND nand rA,rS,rB The contents of rS is ANDed with the contents of rB and the one’s
nand. complement of the result is placed into register rA.
nand NAND
nand. NAND with CR Update. The dot suffix enables the update of
the condition register.
NAND with rS = rB can be used to obtain the one's complement of (rS).
NOR nor rA,rS,rB The contents of rS is ORed with the contents of rB and the one’s
nor. complement of the result is placed into register rA.
nor NOR
nor. NOR with CR Update. The dot suffix enables the update of
the condition register.
NOR with rS = rB can be used to obtain the one's complement of (rS).
Equivalent eqv rA,rS,rB The contents of rS is XORed with the contents of rB and the
eqv. complemented result is placed into register rA.
eqv Equivalent
eqv. Equivalent with CR Update. The dot suffix enables the
update of the condition register.
AND with andc rA,rS,rB The contents of rS is ANDed with the complement of the contents of
Complement andc. rB and the result is placed into rA.
andc AND with Complement
andc. AND with Complement with CR Update. The dot suffix
enables the update of the condition register.
OR with orc rA,rS,rB The contents of rS is ORed with the complement of the contents of rB
Complement orc. and the result is placed into rA.
orc OR with Complement
orc. OR with Complement with CR Update. The dot suffix
enables the update of the condition register.
Extend Sign extsb rA,rS Bits rS[24–31] are placed into rA[24–31]. Bit 24 of rS is placed
Byte extsb. into rA[0–23].
extsb Extend Sign Byte
extsb. Extend Sign Byte with CR Update. The dot suffix enables the
update of the condition register.
Extend Sign extsh rA,rS Bits rS[16–31] are placed into rA[16–31]. Bit 16 of rS is placed
Half Word extsh. into rA[0–15].
extsh Extend Sign Half Word
extsh. Extend Sign Half Word with CR Update. The dot suffix
enables the update of the condition register.
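The two sign-extension instructions can be modeled as follows. This Python sketch uses the manual's bit numbering, in which bit 0 is the most significant bit, so bit 24 is the sign of the low-order byte and bit 16 is the sign of the low-order half word. The function names are illustrative.

```python
def extsb(rs):
    """Model of extsb rA,rS: replicate bit 24 of rS into rA[0-23]."""
    byte = rs & 0xFF
    return byte | 0xFFFFFF00 if byte & 0x80 else byte

def extsh(rs):
    """Model of extsh rA,rS: replicate bit 16 of rS into rA[0-15]."""
    half = rs & 0xFFFF
    return half | 0xFFFF0000 if half & 0x8000 else half
```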
Count cntlzw rA,rS A count of the number of consecutive zero bits starting at bit 0 of rS is placed into rA.
Leading cntlzw. This number ranges from 0 to 32, inclusive.
Zeros Word
cntlzw Count Leading Zeros Word
cntlzw. Count Leading Zeros Word with CR Update. The dot suffix
enables the update of the condition register.
When the Count Leading Zeros Word instruction has condition
register updating enabled, the LT field is cleared to zero in CR0.
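A one-line Python model reproduces the 0 to 32 range of cntlzw. The function name is illustrative, and the model relies on int.bit_length, which returns the position of the highest set bit.

```python
MASK32 = 0xFFFFFFFF

def cntlzw(rs):
    """Model of cntlzw rA,rS: zero bits counted from bit 0 (the most significant bit)."""
    return 32 - (rs & MASK32).bit_length()
```

Because the result is never negative, the LT bit of CR0 can never be set by the record form, which is consistent with the note above.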
Extract Select a field of n bits starting at bit position b in the source register, right or left justify this field in the
target register, and clear all other bits of the target register to zero.
Insert Select a field of n bits in the source register, insert this field starting at bit position b of the target
register, and leave other bits of the target register unchanged. (No simplified mnemonic is provided for
insertion of a field when operating on double words; such an insertion requires more than one
instruction.)
Rotate Rotate the contents of a register right or left n bits without masking.
Shift Shift the contents of a register right or left n bits, clearing vacated bits to 0 (logical shift).
Clear left and shift left Clear the leftmost b bits of a register, then shift the register left by n bits. This
operation can be used to scale a known non-negative array index by the width of an element.
The IU performs rotation operations on data from a GPR and returns the result, or a portion
of the result, to a GPR. Rotation operations rotate a 32-bit quantity left by a specified
number of bit positions. Bits that exit from position 0 enter at position 31. A rotate right
operation can be accomplished by specifying a rotation of 32-n bits, where n is the right
rotation amount.
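The equivalence between a right rotation by n and a left rotation by 32-n can be checked with a short Python model. The names rotl32 and rotr32 are hypothetical helpers, not mnemonics.

```python
MASK32 = 0xFFFFFFFF

def rotl32(value, n):
    """Rotate a 32-bit quantity left by n positions (n taken modulo 32)."""
    n &= 31
    value &= MASK32
    return ((value << n) | (value >> (32 - n))) & MASK32

def rotr32(value, n):
    """Rotate right by n, expressed as a left rotation of 32 - n."""
    return rotl32(value, 32 - n)
```

For example, rotl32(0x80000001, 1) moves the bit leaving position 0 into position 31 and gives x'00000003'; rotr32 of that result by 1 restores the original value.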
Rotate and shift instructions employ a mask generator. The mask is 32 bits long and consists
of “1” bits from a start bit, MB, through and including a stop bit, ME, and “0” bits
elsewhere. The values of MB and ME range from 0 to 31. If MB > ME, the “1” bits wrap
around from position 31 to position 0. Thus the mask is formed as follows:
if MB ≤ ME then
    mask[MB–ME] = ones
    mask[all other bits] = zeros
else
    mask[MB–31] = ones
    mask[0–ME] = ones
    mask[all other bits] = zeros
It is not possible to specify an all-zero mask. The use of the mask is described in the
following sections.
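The mask generation rule can be expressed compactly in Python. This sketch assumes the manual's numbering, where bit 0 is the most significant bit. The function name mask32 is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def mask32(mb, me):
    """Ones from bit MB through bit ME inclusive, wrapping if MB > ME."""
    ones_from_mb = MASK32 >> mb                  # bits MB..31
    ones_to_me = (MASK32 << (31 - me)) & MASK32  # bits 0..ME
    if mb <= me:
        return ones_from_mb & ones_to_me
    return ones_from_mb | ones_to_me             # wrapped mask
```

Every combination of MB and ME selects at least one bit, which is why an all-zero mask cannot be specified. For instance, mask32(28, 3) wraps around and gives x'F000000F'.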
If condition register updating is enabled, rotate and shift instructions set condition register
field CR0 according to the contents of rA at the completion of the instruction. Rotate and
shift instructions do not change the values of XER[OV] and XER[SO] bits. Rotate and shift
instructions, except algebraic right shifts, do not change the XER[CA] bit.
Rotate Left rlwinm rA,rS,SH,MB,ME The contents of register rS are rotated left by the number of bits
Word rlwinm. specified by operand SH. A mask is generated having “1” bits from
Immediate the bit specified by operand MB through the bit specified by
then AND operand ME and “0” bits elsewhere. The rotated data is ANDed
with Mask with the generated mask and the result is placed into register rA.
rlwinm Rotate Left Word Immediate then AND with Mask
rlwinm. Rotate Left Word Immediate then AND with Mask with
CR Update. The dot suffix enables the update of the
condition register.
Simplified mnemonics:
extlwi rA,rS,n,b rlwinm rA,rS,b,0,n-1
srwi rA,rS,n rlwinm rA,rS,32-n,n,31
clrrwi rA,rS,n rlwinm rA,rS,0,0,31-n
Note: The rlwinm instruction can be used for extracting, clearing
and shifting bit fields using the methods shown below:
To extract an n-bit field that starts at bit position b in register rS,
right-justified into rA (clearing the remaining 32-n bits of rA), set
SH=b+n, MB=32-n, and ME=31.
To extract an n-bit field that starts at bit position b in rS,
left-justified into rA, set SH=b, MB=0, and ME=n–1.
To rotate the contents of a register left (right) by n bits, set SH=n
(32–n), MB=0, and ME=31.
To shift the contents of a register right by n bits, set SH=32-n,
MB=n, and ME=31.
To clear the high-order b bits of a register and then shift the result
left by n bits, set SH=n, MB=b–n and ME=31-n.
To clear the low-order n bits of a register, set SH=0, MB=0, and
ME=31–n.
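The operand recipes listed for rlwinm can be verified against a Python model. The sketch is self-contained and makes several assumptions: rotl32 and mask32 are hypothetical helpers, bit 0 is the most significant bit, and the wrapper names other than extlwi, srwi, and clrrwi are descriptive labels chosen for this example.

```python
MASK32 = 0xFFFFFFFF

def rotl32(value, n):
    n &= 31
    value &= MASK32
    return ((value << n) | (value >> (32 - n))) & MASK32

def mask32(mb, me):
    head = MASK32 >> mb
    tail = (MASK32 << (31 - me)) & MASK32
    return head & tail if mb <= me else head | tail

def rlwinm(rs, sh, mb, me):
    """Rotate left by SH, then AND with the mask MB..ME."""
    return rotl32(rs, sh) & mask32(mb, me)

def extlwi(rs, n, b):               # extract n bits at b, left-justified
    return rlwinm(rs, b, 0, n - 1)

def extract_right(rs, n, b):        # extract n bits at b, right-justified
    return rlwinm(rs, b + n, 32 - n, 31)

def srwi(rs, n):                    # logical shift right by n
    return rlwinm(rs, 32 - n, n, 31)

def clrrwi(rs, n):                  # clear the low-order n bits
    return rlwinm(rs, 0, 0, 31 - n)

def clear_left_shift_left(rs, b, n):  # clear high-order b bits, shift left n
    return rlwinm(rs, n, b - n, 31 - n)
```

With rS = x'12345678', extracting the 8-bit field at bit position 8 gives x'34' right-justified and x'34000000' left-justified.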
Rotate Left rlwnm rA,rS,rB,MB,ME The contents of rS are rotated left by the number of bits specified
Word then rlwnm. by rB[27–31]. A mask is generated having “1” bits from the bit
AND with specified by operand MB through the bit specified by operand ME
Mask and “0” bits elsewhere. The rotated data is ANDed with the
generated mask and the result is placed into rA.
rlwnm Rotate Left Word then AND with Mask
rlwnm. Rotate Left Word then AND with Mask with CR Update.
The dot suffix enables the update of the condition
register.
Simplified mnemonics:
rotlw rA,rS,rB rlwnm rA,rS,rB,0,31
Note: The rlwnm instruction can be used to extract and rotate bit
fields using the methods shown below:
To extract an n-bit field that starts at the variable bit position b in
the register specified by operand rS, right-justified into rA (clearing
the remaining 32-n bits of rA), set rB[27–31]=b+n, MB=32-n, and
ME=31.
To extract an n-bit field that starts at variable bit position b in the
register specified by operand rS, left-justified into rA (clearing the
remaining 32-n bits of rA), set rB[27–31]=b, MB=0, and ME=n-1.
To rotate the contents of the low-order 32 bits of a register left
(right) by variable n bits, set rB[27–31]=n (32-n), MB=0, and
ME=31.
Rotate Left rlwimi rA,rS,SH,MB,ME The contents of rS are rotated left by the number of bits specified
Word rlwimi. by operand SH. A mask is generated having “1” bits from the bit
Immediate specified by MB through the bit specified by ME and “0” bits
then Mask elsewhere. The rotated data is inserted into rA under control of the
Insert generated mask.
rlwimi Rotate Left Word Immediate then Mask Insert
rlwimi. Rotate Left Word Immediate then Mask Insert with CR
Update. The dot suffix enables the update of the
condition register.
Simplified mnemonic:
inslwi rA,rS,n,b rlwimi rA,rS,32-b,b,b+n-1
Note: The rlwimi instruction can be used to insert a bit field into the
contents of the register specified by operand rA using the methods
shown below:
To insert an n-bit field that is left-justified in rS into rA starting at bit
position b, set SH=32-b, MB=b, and ME=(b+n)-1.
To insert an n-bit field that is right-justified in rS into rA starting at
bit position b, set SH=32-(b+n), MB=b, and ME=(b+n)-1.
Simplified mnemonics are provided for both of these methods.
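Both insertion methods can be checked with a Python model of rlwimi. The sketch assumes that mask bits of 1 select the rotated data and mask bits of 0 preserve the existing contents of rA. rotl32, mask32, and insert_right are hypothetical helper names; inslwi follows the simplified mnemonic listed above.

```python
MASK32 = 0xFFFFFFFF

def rotl32(value, n):
    n &= 31
    value &= MASK32
    return ((value << n) | (value >> (32 - n))) & MASK32

def mask32(mb, me):
    head = MASK32 >> mb
    tail = (MASK32 << (31 - me)) & MASK32
    return head & tail if mb <= me else head | tail

def rlwimi(ra, rs, sh, mb, me):
    """Insert the rotated rS into rA wherever the mask has 1 bits."""
    mask = mask32(mb, me)
    return (rotl32(rs, sh) & mask) | (ra & ~mask & MASK32)

def inslwi(ra, rs, n, b):        # field is left-justified in rS
    return rlwimi(ra, rs, 32 - b, b, b + n - 1)

def insert_right(ra, rs, n, b):  # field is right-justified in rS
    return rlwimi(ra, rs, 32 - (b + n), b, b + n - 1)
```

Inserting the byte x'AB' at bit position 8 of a register holding x'FFFFFFFF' gives x'FFABFFFF' with either method, provided the field is justified in rS as each method expects.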
Rotate Left rlmi rA,rS,rB,MB,ME This is a POWER instruction, and is not part of the PowerPC
then Mask rlmi. architecture. This instruction will not be supported by other
Insert PowerPC implementations.
The contents of rS is rotated left the number of positions specified
by bits 27–31 of rB. The rotated data is inserted into rA under
control of the generated mask.
rlmi Rotate Left then Mask Insert
rlmi. Rotate Left then Mask Insert with CR Update. The dot
suffix enables the update of the condition register.
This instruction is specific to the 601.
Rotate rrib rA,rS,rB This is a POWER instruction, and is not part of the PowerPC
Right and rrib. architecture. This instruction will not be supported by other
Insert Bit PowerPC implementations.
Bit 0 of rS is rotated right the amount specified by bits 27–31 of rB.
The bit is then inserted into rA.
rrib Rotate Right and Insert Bit
rrib. Rotate Right and Insert Bit with CR Update. The dot
suffix enables the update of the condition register.
This instruction is specific to the 601.
Mask maskg rA,rS,rB This is a POWER instruction, and is not part of the PowerPC
Generate maskg. architecture. This instruction will not be supported by other
PowerPC implementations.
Let mstart = rS[27–31], specifying the starting point of a mask of
ones. Let mstop = rB[27–31], specifying the end point of the mask
of ones.
If mstart < mstop+1 then
MASK(mstart…mstop) = ones
MASK(all other bits) = zeros
If mstart = mstop+1 then
MASK(0-31) = ones
If mstart > mstop+1 then
MASK(mstop+1…mstart-1) = zeros
MASK(all other bits) = ones
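The three cases of the maskg rule can be modeled bit by bit in Python. The sketch assumes that bit 0 is the most significant bit and that the generated mask is the value written to rA; the entry above does not state the destination explicitly. The function name is illustrative.

```python
def maskg(rs, rb):
    """Model of maskg: build the mask from mstart = rS[27-31] and mstop = rB[27-31]."""
    mstart, mstop = rs & 31, rb & 31
    mask = 0
    for bit in range(32):
        if mstart <= mstop:
            selected = mstart <= bit <= mstop      # contiguous run of ones
        else:
            selected = not (mstop < bit < mstart)  # zeros strictly between, ones elsewhere
        if selected:
            mask |= 1 << (31 - bit)
    return mask
```

When mstart equals mstop+1 no bit falls strictly between the two positions, so the model returns all ones, in agreement with the second case.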
Mask maskir rA,rS,rB This is a POWER instruction, and is not part of the PowerPC
Insert from maskir. architecture. This instruction will not be supported by other
Register PowerPC implementations.
Register rS is inserted into rA under control of the mask in rB.
maskir Mask Insert from Register
maskir. Mask Insert from Register with CR Update. The dot
suffix enables the update of the condition register.
This instruction is specific to the 601.
Shift Left slw rA,rS,rB The contents of rS are shifted left the number of bits specified by
Word slw. rB[27–31]. Bits shifted out of position 0 are lost. Zeros are supplied to
the vacated positions on the right. The 32-bit result is placed into rA.
If rB[26] = 1, then rA is filled with zeros.
slw Shift Left Word
slw. Shift Left Word with CR Update. The dot suffix enables the
update of the condition register.
Shift Right srw rA,rS,rB The contents of rS are shifted right the number of bits specified by
Word srw. rB[27–31]. Zeros are supplied to the vacated positions on the left.
The 32-bit result is placed into rA.
If rB[26] = 1, then rA is filled with zeros.
srw Shift Right Word
srw. Shift Right Word with CR Update. The dot suffix enables
the update of the condition register.
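The role of rB[26] in the word shifts can be seen in a small Python model. In the manual's numbering rB[27–31] are the five low-order bits and rB[26] has the weight 32, so the model tests it with the constant 0x20. Function names are illustrative; CR0 updates are not modeled.

```python
MASK32 = 0xFFFFFFFF

def slw(rs, rb):
    """Model of slw rA,rS,rB."""
    if rb & 0x20:                 # rB[26] = 1: result is all zeros
        return 0
    return (rs << (rb & 0x1F)) & MASK32

def srw(rs, rb):
    """Model of srw rA,rS,rB."""
    if rb & 0x20:                 # rB[26] = 1: result is all zeros
        return 0
    return (rs & MASK32) >> (rb & 0x1F)
```

A shift count of 32 in rB therefore clears rA, while only bits 26 through 31 of rB take part in the decision.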
Shift Right srawi rA,rS,SH The contents of rS are shifted right the number of bits specified by
Algebraic srawi. operand SH. Bits shifted out of position 31 are lost. The 32-bit result
Word is sign extended and placed into rA. XER[CA] is set if rS contains a
Immediate negative number and any “1” bits are shifted out of position 31;
otherwise XER[CA] is cleared. An operand SH of zero causes rA to
be loaded with the contents of rS and XER[CA] to be cleared to 0.
srawi Shift Right Algebraic Word Immediate
srawi. Shift Right Algebraic Word Immediate with CR Update.
The dot suffix enables the update of the condition register.
Shift Right sraw rA,rS,rB The contents of rS are shifted right the number of bits specified by
Algebraic sraw. rB[27–31]. If rB[26] = 1, then rA is filled with 32 sign bits (bit 0) from
Word rS. If rB[26] = 0, then rA is filled from the left with sign bits. XER[CA]
is set to 1 if rS contains a negative number and any “1” bits are
shifted out of position 31; otherwise XER[CA] is cleared to 0. An
operand (rB) of zero causes rA to be loaded with the contents of rS,
and XER[CA] to be cleared to 0. Condition register field CR0 is set
based on the value written into rA.
sraw Shift Right Algebraic Word
sraw. Shift Right Algebraic Word with CR Update. The dot suffix
enables the update of the condition register.
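The XER[CA] rule for the algebraic shifts can be modeled as follows. This Python sketch returns the pair (rA, CA); that return convention, the helper to_signed32, and the function names are assumptions made for illustration. CR0 updates are not modeled.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def sraw(rs, rb):
    """Model of sraw rA,rS,rB: returns (rA, CA)."""
    # rB[26] = 1 fills rA with 32 copies of the sign bit.
    n = 32 if rb & 0x20 else rb & 0x1F
    value = to_signed32(rs)
    result = (value >> n) & MASK32            # Python's >> on ints is arithmetic
    shifted_out = rs & ((1 << n) - 1)         # bits that left through position 31
    ca = 1 if value < 0 and shifted_out != 0 else 0
    return result, ca

def srawi(rs, sh):
    """Model of srawi rA,rS,SH with a 5-bit immediate shift count."""
    return sraw(rs, sh & 0x1F)
```

Shifting –7 right by one gives –4 with CA set, because a 1 bit was shifted out of a negative value. Adding CA back to the result gives –3, the quotient of –7 divided by 2 truncated toward zero.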
Shift Left slq rA,rS,rB This is a POWER instruction, and is not part of the PowerPC
with MQ slq. architecture. This instruction will not be supported by other
PowerPC implementations.
Register rS is rotated left n bits where n is the shift amount specified
in bits 27–31 of register rB. The rotated word is placed in the MQ
register.
When bit 26 of register rB is a zero, a mask of 32 – n ones followed
by n zeros is generated.
When bit 26 of register rB is a one, a mask of all zeros is generated.
The logical AND of the rotated word and the generated mask is
placed into register rA.
slq Shift Left with MQ
slq. Shift Left with MQ with CR Update. The dot suffix enables
the update of the condition register.
This instruction is specific to the 601.
Shift Right srq rA,rS,rB This is a POWER instruction, and is not part of the PowerPC
with MQ srq. architecture. This instruction will not be supported by other
PowerPC implementations.
Register rS is rotated left 32 – n bits where n is the shift amount
specified in bits 27–31 of register rB. The rotated word is placed into
the MQ register. When bit 26 of register rB is a zero, a mask of n
zeros followed by 32-n ones is generated.
When bit 26 of register rB is a one, a mask of all zeros is generated.
The logical AND of the rotated word and the generated mask is
placed in rA.
srq Shift Right with MQ
srq. Shift Right with MQ with CR Update. The dot suffix
enables the update of the condition register.
This instruction is specific to the 601.
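The rotate-and-mask behavior of slq and srq can be sketched as follows. This is an illustrative model only; the function names and the (rA, MQ) return convention are assumptions, and CR0 updating for the dot forms is omitted.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n bits (n taken modulo 32)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def slq(rs, rb):
    """Model slq: return (rA, MQ)."""
    n = rb & 0x1F                                   # rB[27-31]
    rot = rotl32(rs, n)
    # rB[26] = 0: 32-n ones followed by n zeros; rB[26] = 1: all zeros
    mask = 0 if rb & 0x20 else (MASK32 << n) & MASK32
    return rot & mask, rot

def srq(rs, rb):
    """Model srq: return (rA, MQ)."""
    n = rb & 0x1F
    rot = rotl32(rs, 32 - n)
    # rB[26] = 0: n zeros followed by 32-n ones; rB[26] = 1: all zeros
    mask = 0 if rb & 0x20 else MASK32 >> n
    return rot & mask, rot
```

In both cases MQ keeps the full rotated word, including the bits that the mask removes from rA, which is what allows a following "long" shift to recover them.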
Shift Left sliq rA,rS,SH This is a POWER instruction, and is not part of the PowerPC
Immediate sliq. architecture. This instruction will not be supported by other
with MQ PowerPC implementations.
Register rS is rotated left n bits where n is the shift amount specified
by operand SH. The rotated word is placed in the MQ register. A
mask of 32 – n ones followed by n zeros is generated. The logical
AND of the rotated word and the generated mask is placed into
register rA.
sliq Shift Left Immediate with MQ
sliq. Shift Left Immediate with MQ with CR Update. The dot
suffix enables the update of the condition register.
This instruction is specific to the 601.
Shift Right sriq rA,rS,SH This is a POWER instruction, and is not part of the PowerPC
Immediate sriq. architecture. This instruction will not be supported by other
with MQ PowerPC implementations.
Register rS is rotated left 32 – n bits where n is the shift amount
specified by operand SH. The rotated word is placed into the MQ
register. A mask of n zeros followed by 32 – n ones is generated. The
logical AND of the rotated word and the generated mask is placed in
register rA.
sriq Shift Right Immediate with MQ
sriq. Shift Right Immediate with MQ with CR Update. The dot
suffix enables the update of the condition register.
This instruction is specific to the 601.
Shift Left slliq rA,rS,SH This is a POWER instruction, and is not part of the PowerPC
Long slliq. architecture. This instruction will not be supported by other
Immediate PowerPC implementations.
with MQ
Register rS is rotated left n bits where n is the shift amount specified
by SH. A mask of 32 – n ones followed by n zeros is generated. The
rotated word is then merged with the contents of MQ, under control of
the generated mask. The merged word is placed into rA. The rotated
word is placed into the MQ register.
slliq Shift Left Long Immediate with MQ
slliq. Shift Left Long Immediate with MQ with CR Update. The
dot suffix enables the update of the condition register.
This instruction is specific to the 601.
Shift Right srliq rA,rS,SH This is a POWER instruction, and is not part of the PowerPC
Long srliq. architecture. This instruction will not be supported by other
Immediate PowerPC implementations.
with MQ
Register rS is rotated left 32 – n bits where n is the shift amount
specified by operand SH. A mask of n zeros followed by 32 – n ones
is generated. The rotated word is then merged with the contents of
the MQ register, under control of the generated mask. The merged
word is placed in register rA. The rotated word is placed into the MQ
register.
srliq Shift Right Long Immediate with MQ
srliq. Shift Right Long Immediate with MQ with CR Update. The
dot suffix enables the update of the condition register.
This instruction is specific to the 601.
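The merge step used by slliq and srliq can be sketched as below. The sketch reads "merged under control of the generated mask" as taking the rotated word where the mask is 1 and the MQ contents where the mask is 0; the function names and the shift_left_64 helper are hypothetical and exist only to show how the pair sliq/slliq can carry bits between two words.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n bits (n taken modulo 32)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def sliq(rs, sh):
    """Model sliq: return (rA, MQ)."""
    rot = rotl32(rs, sh)
    return rot & ((MASK32 << sh) & MASK32), rot

def slliq(rs, sh, mq):
    """Model slliq: return (rA, new MQ)."""
    rot = rotl32(rs, sh)
    mask = (MASK32 << sh) & MASK32          # 32-n ones followed by n zeros
    ra = (rot & mask) | (mq & ~mask & MASK32)
    return ra, rot

def srliq(rs, sh, mq):
    """Model srliq: return (rA, new MQ)."""
    rot = rotl32(rs, 32 - sh)
    mask = MASK32 >> sh                     # n zeros followed by 32-n ones
    ra = (rot & mask) | (mq & ~mask & MASK32)
    return ra, rot

def shift_left_64(hi, lo, sh):
    """Hypothetical helper: shift the pair hi:lo left by sh (0-31) bits."""
    lo_out, mq = sliq(lo, sh)        # MQ now holds the bits leaving lo
    hi_out, _ = slliq(hi, sh, mq)    # merge them into the low end of hi
    return hi_out, lo_out
```

With hi = x'1122_3344', lo = x'AABB_CCDD', and a shift of 8, the helper returns x'2233_44AA' and x'BBCC_DD00'.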
Shift Left sllq rA,rS,rB This is a POWER instruction, and is not part of the PowerPC
Long with sllq. architecture. This instruction will not be supported by other
MQ PowerPC implementations.
Register rS is rotated left n bits where n is the shift amount specified
in bits 27–31 of register rB.
When bit 26 of register rB is a zero, a mask of 32 – n ones followed
by n zeros is generated. The rotated word is then merged with the
contents of the MQ register, under control of the generated mask.
When bit 26 of register rB is a one, a mask of 32 – n zeros followed
by n ones is generated. A word of zeros is then merged with the
contents of the MQ register, under control of the generated mask.
The merged word is placed in register rA. The MQ register is not
altered.
sllq Shift Left Long with MQ
sllq. Shift Left Long with MQ with CR Update. The dot suffix
enables the update of the condition register.
This instruction is specific to the 601.
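The two mask cases of sllq can be sketched as follows. As in the description above, MQ is read but not altered; the function name is an assumption, and "merged" is taken to mean that the first word is used where the mask is 1 and MQ where the mask is 0.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n bits (n taken modulo 32)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def sllq(rs, rb, mq):
    """Model sllq: return the value written to rA (MQ is unchanged)."""
    n = rb & 0x1F                       # rB[27-31]
    if rb & 0x20:                       # rB[26] = 1
        mask = (1 << n) - 1             # 32-n zeros followed by n ones
        word = 0                        # a word of zeros is merged with MQ
    else:                               # rB[26] = 0
        mask = (MASK32 << n) & MASK32   # 32-n ones followed by n zeros
        word = rotl32(rs, n)
    return (word & mask) | (mq & ~mask & MASK32)
```

When rB[26] = 1 the result depends only on MQ: its low n bits are cleared and the remaining bits pass through to rA.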
Shift Right srlq rA,rS,rB This is a POWER instruction, and is not part of the PowerPC
Long with srlq. architecture. This instruction will not be supported by other
MQ PowerPC implementations.
Register rS is rotated left 32 – n bits where n is the shift amount
specified in bits 27–31 of register rB.
When bit 26 of register rB is a zero, a mask of n zeros followed by
32 – n ones is generated. The rotated word is then merged with the
contents of the MQ register, under control of the generated mask.
When bit 26 of register rB is a one, a mask of n ones followed by
32 – n zeros is generated. A word of zeros is then merged with the
contents of the MQ register, under control of the generated mask.
The merged word is placed in register rA. The MQ register is not
altered.
srlq Shift Right Long with MQ
srlq. Shift Right Long with MQ with CR Update. The dot suffix
enables the update of the condition register.
This instruction is specific to the 601.
Shift Left sle rA,rS,rB This is a POWER instruction, and is not part of the PowerPC
Extended sle. architecture. This instruction will not be supported by other
PowerPC implementations.
Register rS is rotated left n bits where n is the shift amount specified
in bits 27–31 of register rB. The rotated word is placed in the MQ
register. A mask of 32 – n ones followed by n zeros is generated.
The logical AND of the rotated word and the generated mask is
placed in register rA.
sle Shift Left Extended
sle. Shift Left Extended with CR Update. The dot suffix
enables the update of the condition register.
This instruction is specific to the 601.
Shift Right sre rA,rS,rB This is a POWER instruction, and is not part of the PowerPC
Extended sre. architecture. This instruction will not be supported by other
PowerPC implementations.
Register rS is rotated left 32 – n bits where n is the shift amount
specified in bits 27–31 of register rB. The rotated word is placed into
the MQ register. A mask of n zeros followed by 32 – n ones is
generated.
The logical AND of the rotated word and the generated mask is
placed in register rA.
sre Shift Right Extended
sre. Shift Right Extended with CR Update. The dot suffix
enables the update of the condition register.
This instruction is specific to the 601.
Shift Left sleq rA,rS,rB This is a POWER instruction, and is not part of the PowerPC
Extended sleq. architecture. This instruction will not be supported by other
with MQ PowerPC implementations.
Register rS is rotated left n bits where n is the shift amount specified
in bits 27–31 of register rB. A mask of 32 – n ones followed by n
zeros is generated. The rotated word is then merged with the
contents of the MQ register, under control of the generated mask.
The merged word is placed in register rA. The rotated word is placed
in the MQ register.
sleq Shift Left Extended with MQ
sleq. Shift Left Extended with MQ with CR Update. The dot
suffix enables the update of the condition register.
This instruction is specific to the 601.
Shift Right sreq rA,rS,rB This is a POWER instruction, and is not part of the PowerPC
Extended sreq. architecture. This instruction will not be supported by other
with MQ PowerPC implementations.
Register rS is rotated left 32 – n bits where n is the shift amount
specified in bits 27–31 of register rB. A mask of n zeros followed by
32 – n ones is generated. The rotated word is then merged with the
contents of the MQ register, under control of the generated mask.
The merged word is placed in register rA. The rotated word is placed
into the MQ register.
sreq Shift Right Extended with MQ
sreq. Shift Right Extended with MQ with CR Update. The dot
suffix enables the update of the condition register.
This instruction is specific to the 601.
Shift Right sraiq rA,rS,SH This is a POWER instruction, and is not part of the PowerPC
Algebraic sraiq. architecture. This instruction will not be supported by other
Immediate PowerPC implementations.
with MQ
Register rS is rotated left 32 – n bits where n is the shift amount
specified by the operand SH. A mask of n zeros followed by 32 – n
ones is generated. The rotated word is placed in the MQ register.
The rotated word is then merged with a word of 32 sign bits from
register rS, under control of the generated mask. The merged word is
placed in register rA. The rotated word is ANDed with the
complement of the generated mask. This 32-bit result is ORed
together and then ANDed with bit 0 of register rS to produce
XER[CA].
Shift Right Algebraic instructions can be used for a fast divide by 2^n if
followed with addze.
sraiq Shift Right Algebraic Immediate with MQ
sraiq. Shift Right Algebraic Immediate with MQ with CR Update.
The dot suffix enables the update of the condition register.
This instruction is specific to the 601.
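The XER[CA] computation and the divide-by-power-of-two idiom mentioned above can be sketched together. This is a model under stated assumptions: the function names are hypothetical, addze is represented simply as "rA plus XER[CA]", and the quotient is rounded toward zero, which is the effect of adding the carry to a negative shifted value.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n bits (n taken modulo 32)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def sraiq(rs, sh):
    """Model sraiq: return (rA, MQ, CA)."""
    rot = rotl32(rs, 32 - sh)
    mask = MASK32 >> sh                    # n zeros followed by 32-n ones
    sign = (rs >> 31) & 1                  # bit 0 of rS
    signs = MASK32 if sign else 0          # word of 32 sign bits
    ra = (rot & mask) | (signs & ~mask & MASK32)
    # CA: OR of the bits shifted out, ANDed with the sign bit
    ca = 1 if sign and (rot & ~mask & MASK32) else 0
    return ra, rot, ca

def div_pow2(x, n):
    """Hypothetical helper: signed x / 2**n via sraiq followed by addze."""
    ra, _mq, ca = sraiq(x & MASK32, n)
    result = (ra + ca) & MASK32            # addze adds XER[CA] to rA
    return result - (1 << 32) if result & 0x80000000 else result
```

For example, div_pow2(-7, 1) gives –3, whereas the shift alone would give –4.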
Shift Right sraq rA,rS,rB This is a POWER instruction, and is not part of the PowerPC
Algebraic sraq. architecture. This instruction will not be supported by other
with MQ PowerPC implementations.
Register rS is rotated left 32 – n bits where n is the shift amount
specified in bits 27–31 of register rB. When bit 26 of register rB is a
zero, a mask of n zeros followed by 32 – n ones is generated. When
bit 26 of register rB is a one, a mask of all zeros is generated. The
rotated word is placed in the MQ register. The rotated word is then
merged with a word of 32 sign bits from register rS, under control of
the generated mask.
The merged word is placed in register rA.
The rotated word is ANDed with the complement of the generated
mask. This 32-bit result is ORed together and then ANDed with bit 0
of register rS to produce XER[CA].
Shift Right Algebraic instructions can be used for a fast divide by 2^n if
followed with addze.
sraq Shift Right Algebraic with MQ
sraq. Shift Right Algebraic with MQ with CR Update. The dot
suffix enables the update of the condition register.
This instruction is specific to the 601.
Shift Right srea rA,rS,rB This is a POWER instruction, and is not part of the PowerPC
Extended srea. architecture. This instruction will not be supported by other
Algebraic PowerPC implementations.
Register rS is rotated left 32 – n bits where n is the shift amount
specified in bits 27–31 of register rB. A mask of n zeros followed by
32 – n ones is generated. The rotated word is placed in the MQ
register.
The rotated word is then merged with a word of 32 sign bits from
register rS, under control of the generated mask.
The merged word is placed in register rA.
The rotated word is ANDed with the complement of the generated
mask. This 32-bit result is ORed together and then ANDed with bit 0
of register rS to produce XER[CA].
srea Shift Right Extended Algebraic
srea. Shift Right Extended Algebraic with CR Update. The dot
suffix enables the update of the condition register.
This instruction is specific to the 601.
Floating- fadd frD,frA,frB The floating-point operand in register frA is added to the
Point Add fadd. floating-point operand in register frB. If the most significant bit of the
resultant significand is not a one, the result is normalized. The result
is rounded to the target precision under control of the floating-point
rounding control field RN of the FPSCR and placed into register frD.
Floating-point addition is based on exponent comparison and
addition of the two significands. The exponents of the two operands
are compared, and the significand accompanying the smaller
exponent is shifted right, with its exponent increased by one for each
bit shifted, until the two exponents are equal. The two significands
are then added algebraically to form an intermediate sum. All 53 bits
in the significand as well as all three guard bits (G, R, and X) enter
into the computation.
If a carry occurs, the sum's significand is shifted right one bit position
and the exponent is increased by one.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fadd Floating-Point Add
fadd. Floating-Point Add with CR Update. The dot suffix enables
the update of the condition register.
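The alignment and carry steps described for floating-point addition can be sketched with integer significands. This is a simplified illustration for two positive operands only, and it stops before normalization and rounding; the function names are hypothetical, and the handling of the X bit as a "sticky" OR of all bits shifted past it is an assumption based on common practice rather than on the text above.

```python
SIG_BITS = 53     # significand width stated in the description
GUARD_BITS = 3    # G, R, and X

def shift_right_sticky(value, count):
    """Shift right by count, ORing every bit shifted out into the lowest (X) bit."""
    if count == 0:
        return value
    lost = value & ((1 << count) - 1)
    return (value >> count) | (1 if lost else 0)

def add_magnitudes(sig_a, exp_a, sig_b, exp_b):
    """Add two positive operands sig * 2**exp.

    Returns (extended significand including G, R, X, exponent).
    """
    a = sig_a << GUARD_BITS
    b = sig_b << GUARD_BITS
    # Shift the significand with the smaller exponent right until exponents match.
    if exp_a < exp_b:
        a, exp_a = shift_right_sticky(a, exp_b - exp_a), exp_b
    elif exp_b < exp_a:
        b = shift_right_sticky(b, exp_a - exp_b)
    total, exp = a + b, exp_a
    # On a carry out of the top significand bit, shift right once and bump the exponent.
    if total >> (SIG_BITS + GUARD_BITS):
        total, exp = shift_right_sticky(total, 1), exp + 1
    return total, exp
```

Adding two operands with significand 1 << 52 and exponent 0 produces a carry, so the sum is shifted right one position and the exponent becomes 1.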
Floating- fadds frD,frA,frB The floating-point operand in register frA is added to the
Point Add fadds. floating-point operand in register frB. If the most significant bit of the
Single- resultant significand is not a one, the result is normalized. The result
Precision is rounded to the target precision under control of the floating-point
rounding control field RN of the FPSCR and placed into register frD.
Floating-point addition is based on exponent comparison and
addition of the two significands. The exponents of the two operands
are compared, and the significand accompanying the smaller
exponent is shifted right, with its exponent increased by one for each
bit shifted, until the two exponents are equal. The two significands
are then added algebraically to form an intermediate sum. All 53 bits
in the significand as well as all three guard bits (G, R, and X) enter
into the computation.
If a carry occurs, the sum's significand is shifted right one bit position
and the exponent is increased by one.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fadds Floating-Point Add Single-Precision
fadds. Floating-Point Add Single-Precision with CR Update. The dot
suffix enables the update of the condition register.
Floating- fsub frD,frA,frB The floating-point operand in register frB is subtracted from the
Point fsub. floating-point operand in register frA. If the most significant bit of the
Subtract resultant significand is not a 1, the result is normalized. The result is
rounded to the target precision under control of the floating-point
rounding control field RN of the FPSCR and placed into register frD.
The execution of the Floating-Point Subtract instruction is identical to
that of Floating-Point Add, except that the contents of register frB
participate in the operation with the sign bit (bit 0) inverted.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fsub Floating-Point Subtract
fsub. Floating-Point Subtract with CR Update. The dot suffix
enables the update of the condition register.
Floating- fsubs frD,frA,frB The floating-point operand in register frB is subtracted from the
Point fsubs. floating-point operand in register frA. If the most significant bit of the
Subtract resultant significand is not a 1, the result is normalized. The result is
Single- rounded to the target precision under control of the floating-point
Precision rounding control field RN of the FPSCR and placed into register frD.
The execution of the Floating-Point Subtract instruction is identical to
that of Floating-Point Add, except that the contents of register frB
participate in the operation with the sign bit (bit 0) inverted.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fsubs Floating-Point Subtract Single-Precision
fsubs. Floating-Point Subtract Single-Precision with CR Update.
The dot suffix enables the update of the condition register.
Floating- fmul frD,frA,frC The floating-point operand in register frA is multiplied by the
Point fmul. floating-point operand in register frC.
Multiply
If the most significant bit of the resultant significand is not a 1, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.
Floating-point multiplication is based on exponent addition and
multiplication of the significands.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fmul Floating-Point Multiply
fmul. Floating-Point Multiply with CR Update. The dot suffix
enables the update of the condition register.
Floating- fmuls frD,frA,frC The floating-point operand in register frA is multiplied by the
Point fmuls. floating-point operand in register frC.
Multiply If the most significant bit of the resultant significand is not a 1, the
Single- result is normalized. The result is rounded to the target precision
Precision under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.
Floating-point multiplication is based on exponent addition and
multiplication of the significands.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fmuls Floating-Point Multiply Single-Precision
fmuls. Floating-Point Multiply Single-Precision with CR Update.
The dot suffix enables the update of the condition register.
Floating- fdiv frD,frA,frB The floating-point operand in register frA is divided by the
Point Divide fdiv. floating-point operand in register frB. No remainder is preserved.
If the most significant bit of the resultant significand is not a 1, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.
Floating-point division is based on exponent subtraction and division
of the significands.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1 and zero divide
exceptions when FPSCR[ZE] = 1.
fdiv Floating-Point Divide
fdiv. Floating-Point Divide with CR Update. The dot suffix
enables the update of the condition register.
Floating- fdivs frD,frA,frB The floating-point operand in register frA is divided by the
Point fdivs. floating-point operand in register frB. No remainder is preserved.
Divide If the most significant bit of the resultant significand is not a 1, the
Single- result is normalized. The result is rounded to the target precision
Precision under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.
Floating-point division is based on exponent subtraction and division
of the significands.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1 and zero divide
exceptions when FPSCR[ZE] = 1.
fdivs Floating-Point Divide Single-Precision
fdivs. Floating-Point Divide Single-Precision with CR Update.
The dot suffix enables the update of the condition register.
Floating- fmadd frD,frA,frC,frB The floating-point operand in register frA is multiplied by the
Point fmadd. floating-point operand in register frC. The floating-point operand in
Multiply- register frB is added to this intermediate result.
Add If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fmadd Floating-Point Multiply-Add
fmadd. Floating-Point Multiply-Add with CR Update. The dot suffix
enables the update of the condition register.
Floating- fmadds frD,frA,frC,frB The floating-point operand in register frA is multiplied by the
Point fmadds. floating-point operand in register frC. The floating-point operand in
Multiply- register frB is added to this intermediate result.
Add If the most significant bit of the resultant significand is not a one, the
Single- result is normalized. The result is rounded to the target precision
Precision under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fmadds Floating-Point Multiply-Add Single-Precision
fmadds. Floating-Point Multiply-Add Single-Precision with CR
Update. The dot suffix enables the update of the condition
register.
Floating- fmsub frD,frA,frC,frB The floating-point operand in register frA is multiplied by the
Point fmsub. floating-point operand in register frC. The floating-point operand in
Multiply- register frB is subtracted from this intermediate result.
Subtract If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fmsub Floating-Point Multiply-Subtract
fmsub. Floating-Point Multiply-Subtract with CR Update. The dot
suffix enables the update of the condition register.
Floating- fmsubs frD,frA,frC,frB The floating-point operand in register frA is multiplied by the
Point fmsubs. floating-point operand in register frC. The floating-point operand in
Multiply- register frB is subtracted from this intermediate result.
Subtract If the most significant bit of the resultant significand is not a one, the
Single- result is normalized. The result is rounded to the target precision
Precision under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fmsubs Floating-Point Multiply-Subtract Single-Precision
fmsubs. Floating-Point Multiply-Subtract Single-Precision with CR
Update. The dot suffix enables the update of the condition
register.
Floating- fnmadd frD,frA,frC,frB The floating-point operand in register frA is multiplied by the
Point fnmadd. floating-point operand in register frC. The floating-point operand in
Negative register frB is added to this intermediate result.
Multiply- If the most significant bit of the resultant significand is not a one, the
Add result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR, then negated and placed into register frD.
This instruction produces the same result as would be obtained by
using the floating-point multiply-add instruction and then negating the
result, with the following exceptions:
• QNaNs propagate with no effect on their sign bit.
• QNaNs that are generated as the result of a disabled invalid
operation exception have a "sign" bit of zero.
• SNaNs that are converted to QNaNs as the result of a disabled
invalid operation exception retain the "sign" bit of the SNaN.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fnmadd Floating-Point Negative Multiply-Add
fnmadd. Floating-Point Negative Multiply-Add with CR Update. The
dot suffix enables the update of the condition register.
Floating- fnmadds frD,frA,frC,frB The floating-point operand in register frA is multiplied by the
Point fnmadds. floating-point operand in register frC. The floating-point operand in
Negative register frB is added to this intermediate result.
Multiply- If the most significant bit of the resultant significand is not a one, the
Add result is normalized. The result is rounded to the target precision
Single- under control of the floating-point rounding control field RN of the
Precision FPSCR, then negated and placed into register frD.
This instruction produces the same result as would be obtained by
using the floating-point multiply-add instruction and then negating the
result, with the following exceptions:
• QNaNs propagate with no effect on their sign bit.
• QNaNs that are generated as the result of a disabled invalid
operation exception have a “sign” bit of zero.
• SNaNs that are converted to QNaNs as the result of a disabled
invalid operation exception retain the “sign” bit of the SNaN.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fnmadds Floating-Point Negative Multiply-Add Single-Precision
fnmadds. Floating-Point Negative Multiply-Add Single-Precision with
CR Update. The dot suffix enables the update of the
condition register.
Floating- fnmsub frD,frA,frC,frB The floating-point operand in register frA is multiplied by the
Point fnmsub. floating-point operand in register frC. The floating-point operand in
Negative register frB is subtracted from this intermediate result.
Multiply- If the most significant bit of the resultant significand is not a one, the
Subtract result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR, then negated and placed into register frD.
This instruction produces the same result as would be obtained by
using the floating-point multiply-subtract instruction and then negating
the result, with the following exceptions:
• QNaNs propagate with no effect on their sign bit.
• QNaNs that are generated as the result of a disabled invalid
operation exception have a sign bit of zero.
• SNaNs that are converted to QNaNs as the result of a disabled
invalid operation exception retain the sign bit of the SNaN.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fnmsub Floating-Point Negative Multiply-Subtract
fnmsub. Floating-Point Negative Multiply-Subtract with CR Update.
The dot suffix enables the update of the condition register.
Floating- fnmsubs frD,frA,frC,frB The floating-point operand in register frA is multiplied by the
Point fnmsubs. floating-point operand in register frC. The floating-point operand in
Negative register frB is subtracted from this intermediate result.
Multiply- If the most significant bit of the resultant significand is not a one, the
Subtract result is normalized. The result is rounded to the target precision
Single- under control of the floating-point rounding control field RN of the
Precision FPSCR, then negated and placed into register frD.
This instruction produces the same result as would be obtained by
using the floating-point multiply-subtract instruction and then negating
the result, with the following exceptions:
• QNaNs propagate with no effect on their "sign" bit.
• QNaNs that are generated as the result of a disabled invalid
operation exception have a "sign" bit of zero.
• SNaNs that are converted to QNaNs as the result of a disabled
invalid operation exception retain the "sign" bit of the SNaN.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fnmsubs Floating-Point Negative Multiply-Subtract Single-Precision
fnmsubs. Floating-Point Negative Multiply-Subtract
Single-Precision with CR Update. The dot suffix enables
the update of the condition register.
Floating- fctiw frD,frB The floating-point operand in register frB is converted to a 32-bit
Point fctiw. signed integer, using the rounding mode specified by FPSCR[RN],
Convert to and placed in bits 32–63 of register frD. Bits 0–31 of register frD are
Integer undefined.
Word If the operand in register frB is greater than 2^31 – 1, bits 32–63 of
register frD are set to x'7FFF_FFFF'.
If the operand in register frB is less than –2^31, bits 32–63 of register
frD are set to x'8000_0000'.
The conversion is described fully in Appendix F, “Floating-Point
Models.”
Except for trap-enabled invalid operation exceptions, FPSCR[FPRF]
is undefined. FPSCR[FR] is set if the result is incremented when
rounded. FPSCR[FI] is set if the result is inexact.
fctiw Floating-Point Convert to Integer Word
fctiw. Floating-Point Convert to Integer Word with CR Update.
The dot suffix enables the update of the condition register.
Floating- fctiwz frD,frB The floating-point operand in register frB is converted to a 32-bit
Point fctiwz. signed integer, using the rounding mode Round toward Zero, and
Convert to placed in bits 32–63 of register frD. Bits 0–31 of register frD are
Integer undefined.
Word with If the operand in frB is greater than 2^31 – 1, bits 32–63 of frD are set
Round to x'7FFF_FFFF'.
If the operand in register frB is less than –2^31, bits 32–63 of register
frD are set to x'8000_0000'.
The conversion is described fully in Appendix F, “Floating-Point
Models.”
Except for trap-enabled invalid operation exceptions, FPSCR[FPRF]
is undefined. FPSCR[FR] is set if the result is incremented when
rounded. FPSCR[FI] is set if the result is inexact.
fctiwz Floating-Point Convert to Integer Word with Round Toward
Zero
fctiwz. Floating-Point Convert to Integer Word with Round Toward
Zero with CR Update. The dot suffix enables the update of
the condition register.
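The saturating conversion performed by fctiwz can be sketched as follows. This is an illustration only: the function name is hypothetical, NaN operands are deliberately not handled because their treatment is not described in this entry, and the FPSCR side effects (FR, FI, exception bits) are omitted.

```python
import math

INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)

def fctiwz(value):
    """Return the 32-bit pattern placed in frD[32-63] for a non-NaN operand."""
    if value > INT32_MAX:
        return 0x7FFFFFFF            # saturate at the largest signed word
    if value < INT32_MIN:
        return 0x80000000            # saturate at the smallest signed word
    # Round toward zero, then keep the two's-complement 32-bit pattern.
    return math.trunc(value) & 0xFFFFFFFF
```

For example, –2.9 converts to –2 (x'FFFF_FFFE'), and 3.0e9 saturates to x'7FFF_FFFF'. The fctiw instruction differs only in taking its rounding mode from FPSCR[RN] rather than always rounding toward zero.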
Bit 2 (FE): (frA) = (frB)
The PowerPC architecture defines CR1 and the CR field specified by operand crfD as
undefined when executing the Floating-Point Compare Unordered (fcmpu) and
Floating-Point Compare Ordered (fcmpo) instructions with condition register updating
enabled.
Floating- fcmpu crfD,frA,frB The floating-point operand in register frA is compared to the
Point floating-point operand in register frB. The result of the compare is
Compare placed into CR field crfD and the FPCC.
Unordered
If an operand is a NaN, either quiet or signaling, CR field crfD and the
FPCC are set to reflect unordered. If an operand is a Signaling NaN,
VXSNAN is set.
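The four-bit compare result can be sketched as below. The bit names FL, FG, FE, and FU (less than, greater than, equal, unordered, in bit order 0 to 3) follow the usual PowerPC convention and are stated here as an assumption, since only the FE row appears nearby; the function name is hypothetical, and the VXSNAN side effect is not modeled because Python floats do not distinguish signaling from quiet NaNs.

```python
import math

def fcmpu_field(fra, frb):
    """Return the compare result as a tuple (FL, FG, FE, FU)."""
    if math.isnan(fra) or math.isnan(frb):
        return (0, 0, 0, 1)     # unordered: at least one operand is a NaN
    if fra < frb:
        return (1, 0, 0, 0)     # (frA) < (frB)
    if fra > frb:
        return (0, 1, 0, 0)     # (frA) > (frB)
    return (0, 0, 1, 0)         # (frA) = (frB)
```

The same four bits are written both to CR field crfD and to the FPCC.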
Floating- fcmpo crfD,frA,frB The floating-point operand in register frA is compared to the
Point floating-point operand in register frB. The result of the compare is
Compare placed into CR field crfD and the FPCC.
Ordered
If an operand is a NaN, either quiet or signaling, CR field crfD and
the FPCC are set to reflect unordered. If an operand is a Signaling
NaN, VXSNAN is set, and if invalid operation is disabled (VE = 0)
then VXVC is set. Otherwise, if an operand is a Quiet NaN, VXVC is
set.
Move from mffs frD The contents of the FPSCR are placed into bits 32–63 of register frD.
FPSCR mffs. In the 601, bits 0–31 of floating-point register frD are set to the value
x'FFFF_FFFF'.
mffs Move from FPSCR
mffs. Move from FPSCR with CR Update. The dot suffix enables
the update of the condition register.
Move to mcrfs crfD,crfS The contents of FPSCR field specified by operand crfS are copied to
Condition the CR field specified by operand crfD. All exception bits copied are
Register cleared to zero in the FPSCR.
from FPSCR
Move to mtfsfi crfD,IMM The value of the IMM field is placed into FPSCR field crfD. All other
FPSCR mtfsfi. FPSCR fields are unchanged.
Field mtfsfi Move to FPSCR Field Immediate
Immediate mtfsfi. Move to FPSCR Field Immediate with CR Update. The dot
suffix enables the update of the condition register.
When FPSCR[0–3] is specified, bits 0 (FX) and 3 (OX) are set to the
values of IMM[0] and IMM[3] (that is, even if this instruction causes
OX to change from 0 to 1, FX is set from IMM[0] and not by the usual
rule that FX is set to 1 when an exception bit changes from 0 to 1).
Bits 1 and 2 (FEX and VX) are set according to the usual rule
described in Section 2.2.3, “Floating-Point Status and Control
Register (FPSCR),” and not from IMM[1–2].
Move to mtfsf FM,frB Bits 32–63 of register frB are placed into the FPSCR under control of
FPSCR mtfsf. the field mask specified by FM. The field mask identifies the 4-bit
Fields fields affected. Let i be an integer in the range 0–7. If FM[i] = 1, then
FPSCR field i (FPSCR bits 4∗i through 4∗i+3) is set to the contents
of the corresponding field of the low-order 32 bits of register frB.
mtfsf Move to FPSCR Fields
mtfsf. Move to FPSCR Fields with CR Update. The dot suffix
enables the update of the condition register.
In other PowerPC implementations, the mtfsf instruction may
perform more slowly when only a portion of the fields are updated.
This is not the case in the 601.
When FPSCR[0–3] is specified, bits 0 (FX) and 3 (OX) are set to the
values of frB[32] and frB[35] (that is, even if this instruction causes
OX to change from 0 to 1, FX is set from frB[32] and not by the usual
rule that FX is set to 1 when an exception bit changes from 0 to 1).
Bits 1 and 2 (FEX and VX) are set according to the usual rule
described in Section 2.2.3, “Floating-Point Status and Control
Register (FPSCR),” and not from frB[33–34].
Move to FPSCR Bit 0    mtfsb0 (mtfsb0.) crbD
The bit of the FPSCR specified by operand crbD is cleared to 0.
Bits 1 and 2 (FEX and VX) cannot be explicitly reset.
mtfsb0   Move to FPSCR Bit 0
mtfsb0.  Move to FPSCR Bit 0 with CR Update. The dot suffix
         enables the update of the condition register.
Move to FPSCR Bit 1    mtfsb1 (mtfsb1.) crbD
The bit of the FPSCR specified by operand crbD is set to 1.
Bits 1 and 2 (FEX and VX) cannot be explicitly set.
mtfsb1   Move to FPSCR Bit 1
mtfsb1.  Move to FPSCR Bit 1 with CR Update. The dot suffix
         enables the update of the condition register.
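The masked field copy performed by mtfsf can be sketched in Python as follows. This is only an illustrative model, not the 601 implementation: the function name mtfsf and its arguments are hypothetical, FM bit 0 is assumed to be the most significant bit of the 8-bit mask (matching the manual's big-endian bit numbering), and FEX and VX are simply left untouched rather than recomputed by the usual rule.

```python
def mtfsf(fpscr: int, fm: int, frb_low: int) -> int:
    """Return the new 32-bit FPSCR after an mtfsf-style masked field copy.

    fm is the 8-bit field mask. FM bit i (bit 0 = most significant)
    selects FPSCR field i, that is, FPSCR bits 4*i through 4*i+3.
    frb_low holds the low-order 32 bits of frB.
    """
    for i in range(8):
        if (fm >> (7 - i)) & 1:
            shift = 28 - 4 * i          # field 0 is the most significant nibble
            mask = 0xF << shift
            if i == 0:
                # Only FX (bit 0) and OX (bit 3) are taken from frB.
                # Assumption: FEX and VX (bits 1 and 2) stay as they were;
                # real hardware derives them from the other FPSCR bits.
                mask = 0x9 << shift
            fpscr = (fpscr & ~mask) | (frb_low & mask)
    return fpscr & 0xFFFFFFFF
```

For example, a mask of 0b00000001 replaces only FPSCR field 7 (bits 28–31) and leaves the other seven fields unchanged.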
[Figure: register indirect with immediate index addressing. Instruction encoding fields: Opcode, rD/rS, rA, d. The sign-extended d is added to GPR (rA), or to 0 when rA = 0, to form the effective address; GPR (rD/rS) is stored to or loaded from memory at that address.]
[Figure: register indirect with index addressing. Instruction encoding fields: Opcode, rD/rS, rA, rB, Subopcode, and a reserved bit. GPR (rB) is added to GPR (rA), or to 0 when rA = 0, to form the effective address; GPR (rD/rS) is stored to or loaded from memory at that address.]
[Figure: register indirect addressing. Instruction encoding fields: Opcode, rD/rS, rA, NB, Subopcode, and a reserved bit. The effective address is GPR (rA), or 0 when rA = 0; GPR (rD/rS) is stored to or loaded from memory at that address.]
Load Byte lbz rD,d(rA) The effective address is the sum (rA|0)+d. The byte in memory
and Zero addressed by the EA is loaded into register rD[24–31]. The remaining
bits in register rD are cleared to 0.
Load Byte lbzx rD,rA,rB The effective address is the sum (rA|0)+(rB). The byte in memory
and Zero addressed by the EA is loaded into register rD[24–31]. The remaining
Indexed bits in register rD are cleared to 0.
Load Byte lbzu rD,d(rA) The effective address (EA) is the sum (rA|0)+d. The byte in memory
and Zero addressed by the EA is loaded into register rD[24–31]. The remaining
with Update bits in register rD are cleared to 0. The EA is placed into register rA. If
operand rA = 0 the 601 does not update r0, or if rA = rD the load data
is loaded into register rD and the register update is suppressed.
Although the PowerPC architecture defines load with update
instructions with operand rA = 0 or rA = rD as invalid forms, the 601
allows these cases.
Load Byte and Zero with Update Indexed    lbzux rD,rA,rB
The effective address (EA) is the sum (rA|0)+(rB). The byte
addressed by the EA is loaded into register rD[24–31]. The remaining
bits in register rD are cleared to 0. The EA is placed into register rA. If
operand rA = 0 the 601 does not update register r0, or if rA = rD the
load data is loaded into register rD and the register update is
suppressed. Although the PowerPC architecture defines load with
update instructions with operand rA = 0 or rA = rD as invalid forms,
the 601 allows these cases.
Load lhz rD,d(rA) The effective address is the sum (rA|0)+d. The half word in memory
Half Word addressed by the EA is loaded into register rD[16–31]. The remaining
and Zero bits in rD are cleared to 0.
Load lhzx rD,rA,rB The effective address is the sum (rA|0)+(rB). The half word in
Half Word memory addressed by the EA is loaded into register rD[16–31]. The
and Zero remaining bits in register rD are cleared.
Indexed
Load lhzu rD,d(rA) The effective address is the sum (rA|0)+d. The half word in memory
Half Word addressed by the EA is loaded into register rD[16–31]. The remaining
and Zero bits in register rD are cleared.
with Update
The EA is placed into register rA.
If operand rA = 0 the 601 does not update register r0, or if rA = rD the
load data is loaded into register rD and the register update is
suppressed. Although the PowerPC architecture defines load with
update instructions with operand rA = 0 or rA = rD as invalid forms,
the 601 allows these cases.
Load lhzux rD,rA,rB The effective address is the sum (rA|0)+(rB). The half word in
Half Word memory addressed by the EA is loaded into register rD[16–31]. The
and Zero remaining bits in register rD are cleared. The EA is placed into
with register rA. Although the PowerPC architecture defines load with
Update update instructions with operand rA = 0 or rA = rD as invalid forms,
Indexed the 601 allows these cases.
Load lha rD,d(rA) The effective address is the sum (rA|0)+d. The half word in memory
Half Word addressed by the EA is loaded into register rD[16–31]. The remaining
Algebraic bits in register rD are filled with a copy of the most significant bit of
the loaded half word.
Load lhax rD,rA,rB The effective address is the sum (rA|0)+(rB). The half word in
Half Word memory addressed by the EA is loaded into register rD[16–31]. The
Algebraic remaining bits in register rD are filled with a copy of the most
Indexed significant bit of the loaded half word.
Load lhau rD,d(rA) The effective address is the sum (rA|0)+d. The half word in memory
Half Word addressed by the EA is loaded into register rD[16–31]. The remaining
Algebraic bits in register rD are filled with a copy of the most significant bit of
with Update the loaded half word. The EA is placed into register rA. If operand
rA = 0 the 601 does not update register r0, or if rA = rD the load data
is loaded into register rD and the register update is suppressed.
Although the PowerPC architecture defines load with update
instructions with operand rA = 0 or rA = rD as invalid forms, the 601
allows these cases.
Load Half Word Algebraic with Update Indexed    lhaux rD,rA,rB
The effective address is the sum (rA|0)+(rB). The half word in
memory addressed by the EA is loaded into register rD[16–31]. The
remaining bits in register rD are filled with a copy of the most
significant bit of the loaded half word. The EA is placed into register
rA. If operand rA = 0 the 601 does not update r0, or if rA = rD the load
data is loaded into register rD and the register update is suppressed.
Although the PowerPC architecture defines load with update
instructions with operand rA = 0 or rA = rD as invalid forms, the 601
allows these cases.
Load Word lwz rD,d(rA) The effective address is the sum (rA|0)+d. The word in memory
and Zero addressed by the EA is loaded into register rD[0–31].
Load Word lwzx rD,rA,rB The effective address is the sum (rA|0)+(rB). The word in memory
and Zero addressed by the EA is loaded into register rD[0–31].
Indexed
Load Word lwzu rD,d(rA) The effective address is the sum (rA|0)+d. The word in memory
and Zero addressed by the EA is loaded into register rD[0–31]. The EA is
with Update placed into register rA. If operand rA = 0 the 601 does not update
register r0, or if rA = rD the load data is loaded into register rD and
the register update is suppressed. Although the PowerPC
architecture defines load with update instructions with operand rA = 0
or rA = rD as invalid forms, the 601 allows these cases.
Load Word lwzux rD,rA,rB The effective address is the sum (rA|0)+(rB). The word in memory
and Zero addressed by the EA is loaded into register rD[0–31]. The EA is
with placed into register rA. If operand rA = 0 the 601 does not update
Update register r0, or if rA = rD the load data is loaded into register rD and
Indexed the register update is suppressed. Although the PowerPC
architecture defines load with update instructions with operand rA = 0
or rA = rD as invalid forms, the 601 allows these cases.
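The effective address calculation and the zero, algebraic, and update variants of the integer loads described above can be modeled with a short Python sketch. Everything here is an illustrative assumption rather than the 601's actual logic: gpr is a list of 32 integers, mem is a bytes-like object indexed by EA, and the names sign_extend16, effective_address, and load are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(d: int) -> int:
    """Interpret the 16-bit d field as a signed displacement."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def effective_address(gpr, ra: int, d: int) -> int:
    """EA = (rA|0) + d: operand rA = 0 contributes the value 0, not GPR 0."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + sign_extend16(d)) & MASK32

def load(gpr, mem, rd, ra, d, size, algebraic=False, update=False) -> int:
    """Model lbz/lhz/lha/lwz and their update forms; returns the EA used."""
    ea = effective_address(gpr, ra, d)
    value = int.from_bytes(mem[ea:ea + size], "big")   # zero-filled high bits
    if algebraic and value & (1 << (8 * size - 1)):
        # Fill the remaining high-order bits with copies of the sign bit.
        value |= MASK32 & ~((1 << (8 * size)) - 1)
    gpr[rd] = value
    # 601 behavior per the text: no update when rA = 0 or rA = rD.
    if update and ra != 0 and ra != rd:
        gpr[ra] = ea
    return ea
```

With gpr[1] = x'10' and d = x'FFFE', the sketch forms EA = x'0E', and an update form then writes x'0E' back into r1 unless rA = 0 or rA = rD.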
Store Byte stb rS,d(rA) The effective address is the sum (rA|0) + d. Register rS[24–31] is
stored into the byte in memory addressed by the EA.
Store Byte stbx rS,rA,rB The effective address is the sum (rA|0) + (rB). rS[24–31] is stored
Indexed into the byte in memory addressed by the EA.
Store Byte stbu rS,d(rA) The effective address is the sum (rA|0) + d. rS[24–31] is stored into
with Update the byte in memory addressed by the EA. The EA is placed into
register rA.
Store Byte stbux rS,rA,rB The effective address is the sum (rA|0) + (rB). rS[24–31] is stored
with into the byte in memory addressed by the EA. The EA is placed into
Update register rA.
Indexed
Store sth rS,d(rA) The effective address is the sum (rA|0) + d. rS[16–31] is stored into
Half Word the half word in memory addressed by the EA.
Store Half Word Indexed    sthx rS,rA,rB
The effective address (EA) is the sum (rA|0) + (rB). rS[16–31] is
stored into the half word in memory addressed by the EA.
Store sthu rS,d(rA) The effective address is the sum (rA|0) + d. rS[16–31] is stored into
Half Word the half word in memory addressed by the EA. The EA is placed into
with Update register rA.
Store sthux rS,rA,rB The effective address is the sum (rA|0) + (rB). rS[16–31] is stored
Half Word into the half word in memory addressed by the EA. The EA is placed
with into register rA.
Update
Indexed
Store Word stw rS,d(rA) The effective address is the sum (rA|0) + d. Register rS is stored into
the word in memory addressed by the EA.
Store Word stwx rS,rA,rB The effective address is the sum (rA|0) + (rB). rS is stored into the
Indexed word in memory addressed by the EA.
Store Word stwu rS,d(rA) The effective address is the sum (rA|0) + d.
with Update Register rS is stored into the word in memory addressed by the EA.
The EA is placed into register rA.
Store Word stwux rS,rA,rB The effective address is the sum (rA|0) + (rB). Register rS is stored
with into the word in memory addressed by the EA. The EA is placed into
Update register rA.
Indexed
Load lhbrx rD,rA,rB The effective address is the sum (rA|0) + (rB). Bits 0–7 of the half
Half Word word in memory addressed by the EA are loaded into rD[24–31].
Byte- Bits 8–15 of the half word in memory addressed by the EA are
Reverse loaded into rD[16–23]. The rest of the bits in rD are cleared to 0.
Indexed
Load Word lwbrx rD,rA,rB The effective address is the sum (rA|0) + (rB). Bits 0–7 of the
Byte- word in memory addressed by the EA are loaded into rD[24–31].
Reverse Bits 8–15 of the word in memory addressed by the EA are loaded
Indexed into rD[16–23]. Bits 16–23 of the word in memory addressed by
the EA are loaded into rD[8–15]. Bits 24–31 of the word in
memory addressed by the EA are loaded into rD[0–7].
Store sthbrx rS,rA,rB The effective address is the sum (rA|0) + (rB). rS[24–31] are
Half Word stored into bits 0–7 of the half word in memory addressed by the
Byte- EA. rS[16–23] are stored into bits 8–15 of the half word in
Reverse memory addressed by the EA.
Indexed
Store Word stwbrx rS,rA,rB The effective address is the sum (rA|0) + (rB). rS[24–31] are
Byte- stored into bits 0–7 of the word in memory addressed by EA.
Reverse rS[16–23] are stored into bits 8–15 of the word in memory
Indexed addressed by the EA. rS[8–15] are stored into bits 16–23 of the
word in memory addressed by the EA. rS[0–7] are stored into bits
24–31 of the word in memory addressed by the EA.
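The byte-reverse loads and stores amount to reading or writing the operand in the opposite byte order. A minimal Python sketch follows; the function names simply mirror the mnemonics, and passing the memory operand as a bytes object is an assumption made for illustration.

```python
def lhbrx(half: bytes) -> int:
    """Memory bits 0-7 go to rD[24-31], bits 8-15 to rD[16-23]; rest is 0."""
    return int.from_bytes(half[:2], "little")

def lwbrx(word: bytes) -> int:
    """Memory bits 0-7 go to rD[24-31], ..., bits 24-31 go to rD[0-7]."""
    return int.from_bytes(word[:4], "little")

def sthbrx(rs: int) -> bytes:
    """rS[24-31] is stored first, then rS[16-23]; rS[0-15] is ignored."""
    return (rs & 0xFFFF).to_bytes(2, "little")

def stwbrx(rs: int) -> bytes:
    """rS[24-31] is stored first and rS[0-7] last."""
    return (rs & 0xFFFFFFFF).to_bytes(4, "little")
```

Because the first byte in memory lands in the low-order byte of the register, these instructions are a convenient way to access little-endian data from big-endian code.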
Load String and Compare Byte Indexed    lscbx (lscbx.) rD,rA,rB
This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.
The EA is the sum (rA|0)+(rB). XER[25–31] contains the byte count.
Register rD is the starting register. n = XER[25–31], which is the
number of bytes to be loaded. nr = CEIL(n/4), which is the number of
registers to receive data. Starting with the leftmost byte in rD,
consecutive bytes in memory addressed by the EA are loaded into rD
through rD+nr-1, wrapping around back through GPR 0 if required,
until either a byte match is found with XER[16–23] or n bytes have
been loaded. If a byte match is found, that byte is also loaded.
Bytes are always loaded left to right in the register. In the case when
a match was found before n bytes were loaded, the contents of the
rightmost byte(s) not loaded of that register and the contents of all
succeeding registers up to and including rD+nr-1 are undefined. Also,
no reference is made to memory after the matched byte is found. In
the case when a match was not found, the contents of the rightmost
byte(s) not loaded of rD+nr-1 are undefined.
When XER[25–31]=0, the content of rD is unchanged. The count of
the number of bytes loaded up to and including the matched byte, if a
match was found, is placed in XER[25–31].
lscbx    Load String and Compare Byte Indexed
lscbx.   Load String and Compare Byte Indexed with CR Update.
         The dot suffix enables the update of the condition register.
This instruction is specific to the 601.
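The register-filling loop that lscbx performs can be sketched as follows. This is a hedged model only: the function name and argument list are hypothetical, mem is assumed to be a bytes-like object indexed by EA, and bytes that the manual calls undefined are simply left at their previous values, which is one permitted outcome rather than the 601's actual behavior.

```python
def lscbx(gpr, n, match_byte, mem, ea, rd) -> int:
    """Load up to n bytes starting at ea into rD, rD+1, ..., stopping after
    a byte equal to match_byte. Returns the number of bytes loaded, which
    is the value that would be placed in XER[25-31].
    """
    loaded = 0
    for i in range(n):
        byte = mem[ea + i]
        reg = (rd + i // 4) % 32          # wrap from GPR 31 back to GPR 0
        shift = 24 - 8 * (i % 4)          # fill each register left to right
        gpr[reg] = (gpr[reg] & ~(0xFF << shift) & 0xFFFFFFFF) | (byte << shift)
        loaded += 1
        if byte == match_byte:            # the matching byte is also loaded
            break
    return loaded
```

Starting at r31 with a byte count of 8 and the string "ABCDEFGH", a match byte of "E" stops the load after five bytes: four in r31 and one in the leftmost byte of r0.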
Load string and store string instructions may involve operands that are not word-aligned.
As described in Section 5.4.6, “Alignment Exception (x'00600'),” a misaligned string
operation suffers a performance penalty compared to an aligned operation of the same type.
A non-word-aligned string operation that crosses a 4-Kbyte boundary, or a word-aligned
string operation that crosses a 256-Mbyte boundary always causes an alignment exception.
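The boundary rule in the preceding paragraph can be expressed as a small predicate. The sketch below is an assumption-laden illustration: the function name is hypothetical, 4 Kbytes and 256 Mbytes are taken as 2^12 and 2^28 bytes, and address wraparound at the top of the 32-bit address space is ignored.

```python
def string_op_alignment_exception(ea: int, nbytes: int) -> bool:
    """True if a string operation of nbytes bytes starting at ea crosses a
    boundary that, per the text, always causes an alignment exception."""
    if nbytes == 0:
        return False
    last = ea + nbytes - 1
    if ea % 4 != 0:
        # Non-word-aligned: crossing a 4-Kbyte boundary causes the exception.
        return (ea >> 12) != (last >> 12)
    # Word-aligned: only crossing a 256-Mbyte boundary causes it.
    return (ea >> 28) != (last >> 28)
```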
Enforce In-Order Execution of I/O    eieio
The eieio instruction provides an ordering function for the effects of
load and store instructions executed by a given processor. Executing
an eieio instruction ensures that all memory accesses previously
initiated by the given processor are complete with respect to main
memory before allowing any memory accesses subsequently initiated
by the given processor to access main memory.
The eieio instruction orders load and store operations to cache-inhibited
memory, and store operations to write-through cache memory.
The eieio instruction performs the same function as a sync
instruction when executed by the 601.
Instruction Synchronize    isync
This instruction waits for all previous instructions to complete, and
then discards any fetched instructions, causing subsequent
instructions to be fetched (or refetched) from memory and to execute
in the context established by the previous instructions. This
instruction has no effect on other processors or on their caches.
Load Word and Reserve Indexed    lwarx rD,rA,rB
The effective address is the sum (rA|0)+(rB). The word in memory
addressed by the EA is loaded into register rD.
This instruction creates a reservation for use by an stwcx. instruction.
An address computed from the EA is associated with the reservation,
and replaces any address previously associated with the reservation.
The EA must be a multiple of 4. If it is not, the alignment exception
handler will be invoked if the word loaded crosses a page boundary,
or the results may be undefined.
Store Word Conditional Indexed    stwcx. rS,rA,rB
The effective address is the sum (rA|0)+(rB).
If a reservation exists, register rS is stored into the word in memory
addressed by the EA and the reservation is cleared.
If a reservation does not exist, the instruction completes without
altering memory or the contents of the cache.
The EQ bit in the condition register field CR0 is modified to reflect
whether the store operation was performed (that is, whether a
reservation existed when the stwcx. instruction began execution). If
the store was completed successfully, the EQ bit is set to one.
The EA must be a multiple of 4; otherwise, the alignment exception
handler will be invoked if the word stored crosses a page boundary,
or the results may be undefined.
Synchronize sync Executing a sync instruction ensures that all instructions previously
initiated by the given processor appear to have completed before any
subsequent instructions are initiated by the given processor. When
the sync instruction completes, all memory accesses initiated by the
given processor prior to the sync will have been performed with
respect to all other mechanisms that access memory. The sync
instruction can be used to ensure that the results of all stores into a
data structure, performed in a “critical section” of a program, are seen
by other processors before the data structure is seen as unlocked.
The eieio instruction may be more appropriate than sync for cases in
which the only requirement is to control the order in which memory
references are seen by I/O devices.
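The way lwarx and stwcx. cooperate can be illustrated with a toy single-processor model. This is a sketch under stated assumptions, not a description of the 601 hardware: the class ReservationModel and its methods are hypothetical, memory is a dictionary of words keyed by EA, and lose_reservation stands in for whatever outside event (not detailed in this section) cancels a reservation.

```python
class ReservationModel:
    """Toy model of the reservation used by lwarx and stwcx."""

    def __init__(self):
        self.mem = {}            # word-addressed memory, keyed by EA
        self.reservation = None  # address associated with the reservation

    def lwarx(self, ea: int) -> int:
        # Creating a reservation replaces any previous one.
        self.reservation = ea
        return self.mem.get(ea, 0)

    def stwcx(self, ea: int, value: int) -> bool:
        """Returns the value of CR0[EQ]: True if the store was performed."""
        if self.reservation is None:
            return False         # memory is left unaltered
        self.mem[ea] = value & 0xFFFFFFFF
        self.reservation = None  # the reservation is cleared
        return True

    def lose_reservation(self):
        # Hypothetical hook modeling an event that cancels the reservation.
        self.reservation = None

    def atomic_increment(self, ea: int) -> int:
        # Typical retry loop: repeat until the conditional store succeeds.
        while True:
            old = self.lwarx(ea)
            if self.stwcx(ea, old + 1):
                return old + 1
```

The retry loop in atomic_increment mirrors the usual assembly idiom of branching back to the lwarx when the EQ bit in CR0 is not set after the stwcx. instruction.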
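[Figure: register indirect with immediate index addressing for floating-point loads and stores. Instruction encoding fields: Opcode, frD/frS, rA, d. The sign-extended d is added to GPR (rA), or to 0 when rA = 0, to form the effective address; the 64-bit FPR (frD/frS) is stored to or loaded from memory at that address.]
[Figure: register indirect with index addressing for floating-point loads and stores. GPR (rB) is added to GPR (rA), or to 0 when rA = 0, to form the effective address; the 64-bit FPR (frD/frS) is stored to or loaded from memory at that address.]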
The PowerPC architecture defines floating-point load and store with update instructions
(lfsu, lfsux, lfdu, lfdux, stfsu, stfsux, stfdu, stfdux) with operand rA = 0 as invalid forms
of the instructions, but the POWER architecture does not. To maintain compatibility with
the POWER architecture, the 601 accesses memory for these cases but inhibits the update
of the integer register r0.
In addition, the PowerPC architecture defines floating-point load and store instructions with
the condition register update option enabled to be an invalid form. For compatibility with
the POWER architecture, the 601 executes the instruction normally, but also writes an
undefined value into the condition register field CR1.
The PowerPC architecture defines that the FPSCR[UE] bit should not be used to determine
whether denormalization should be performed on floating-point stores. The 601 complies
with this definition, although this is different from some POWER architecture
implementations.
Load Floating-Point Single-Precision Indexed    lfsx frD,rA,rB
The effective address is the sum (rA|0)+(rB).
The word in memory addressed by the EA is interpreted as a
floating-point single-precision operand. This word is converted to
floating-point double-precision and placed into register frD.
Load Floating-Point Single-Precision with Update Indexed    lfsux frD,rA,rB
The effective address is the sum (rA|0)+(rB).
The word in memory addressed by the EA is interpreted as a
floating-point single-precision operand. This word is converted to
floating-point double-precision (see Section 3.5.9.1,
“Double-Precision Conversion for Floating-Point Load Instructions”)
and placed into register frD.
The EA is placed into the register specified by rA.
Load Floating-Point Double-Precision Indexed    lfdx frD,rA,rB
The effective address is the sum (rA|0)+(rB).
The double word in memory addressed by the EA is placed into
register frD.
Load Floating-Point Double-Precision with Update Indexed    lfdux frD,rA,rB
The effective address is the sum (rA|0)+(rB).
The double word in memory addressed by the EA is placed into
register frD.
The EA is placed into the register specified by rA.
Floating-Point Move Register    fmr (fmr.) frD,frB
The contents of register frB are placed into frD.
fmr      Floating-Point Move Register
fmr.     Floating-Point Move Register with CR Update. The dot
         suffix enables the update of the condition register.
Floating-Point Negate    fneg (fneg.) frD,frB
The contents of register frB with bit 0 inverted are placed into register
frD.
fneg     Floating-Point Negate
fneg.    Floating-Point Negate with CR Update. The dot suffix
         enables the update of the condition register.
Floating-Point Absolute Value    fabs (fabs.) frD,frB
The contents of frB with bit 0 cleared to 0 are placed into frD.
fabs     Floating-Point Absolute Value
fabs.    Floating-Point Absolute Value with CR Update. The dot
         suffix enables the update of the condition register.
Floating-Point Negative Absolute Value    fnabs (fnabs.) frD,frB
The contents of frB with bit 0 set to one are placed into frD.
fnabs    Floating-Point Negative Absolute Value
fnabs.   Floating-Point Negative Absolute Value with CR Update.
         The dot suffix enables the update of the condition register.
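Because the floating-point move instructions only copy the operand and adjust bit 0 (the sign bit of the 64-bit register), they can be modeled as plain bit operations. The Python sketch below works on 64-bit integer images of double-precision values; the helper names bits and value are hypothetical conveniences, and fabs_ carries a trailing underscore only to avoid looking like a library function.

```python
import struct

SIGN = 1 << 63                   # bit 0 in the manual's big-endian numbering
MASK64 = (1 << 64) - 1

def bits(x: float) -> int:
    """64-bit register image of a double-precision value."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def value(b: int) -> float:
    """Double-precision value of a 64-bit register image."""
    return struct.unpack(">d", struct.pack(">Q", b))[0]

def fmr(frb: int) -> int:
    return frb                   # plain copy

def fneg(frb: int) -> int:
    return frb ^ SIGN            # bit 0 inverted

def fabs_(frb: int) -> int:
    return frb & ~SIGN & MASK64  # bit 0 cleared to 0

def fnabs(frb: int) -> int:
    return frb | SIGN            # bit 0 set to 1
```

Since no arithmetic is involved, the same operations apply unchanged to NaNs, infinities, and denormalized values.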
[Figure: branch relative addressing. Instruction encoding fields: 18, LI, AA, LK. The sign-extended LI field, concatenated with b'00', is added to the current instruction address to form the branch target address.]
[Figure: branch conditional relative addressing. Instruction encoding fields: 16, BO, BI, BD, AA, LK. If the condition is met, the sign-extended BD field, concatenated with b'00', is added to the current instruction address to form the branch target address; otherwise, execution continues at the next sequential instruction address.]
[Figure: branch to absolute address. Instruction encoding fields: 18, LI, AA, LK, with AA = 1. The sign-extended LI field, concatenated with b'00', is the branch target address.]
[Figure: branch conditional to absolute address. Instruction encoding fields: 16, BO, BI, BD, AA, LK, with AA = 1. If the condition is met, the sign-extended BD field, concatenated with b'00', is the branch target address; otherwise, execution continues at the next sequential instruction address.]
[Figure: branch conditional to link register. If the condition is met, the branch target address is LR || b'00'; otherwise, execution continues at the next sequential instruction address.]
[Figure: branch conditional to count register. If the condition is met, the branch target address is CTR || b'00'; otherwise, execution continues at the next sequential instruction address.]
0000y Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is
FALSE.
0001y Decrement the CTR, then branch if the decremented CTR = 0 and the condition is
FALSE.
0100y Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is
TRUE.
0101y Decrement the CTR, then branch if the decremented CTR = 0 and the condition is
TRUE.
The z indicates a bit that must be zero; otherwise, the instruction form is invalid.
The y bit provides a hint about whether a conditional branch is likely to be taken and is used by the
601 to improve performance. Other implementations may ignore the y bit.
The “branch always” encoding of the BO operand does not have a “y” bit.
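How the BO operand combines the CTR test and the condition test can be sketched in Python. The general formula used here (bit 2 disables the CTR decrement and test, bit 3 selects CTR = 0, bit 0 disables the condition test, bit 1 selects TRUE) is the standard PowerPC definition as we understand it, stated as an assumption; only the four encodings listed above and the bdnz encoding are exercised. The function name branch_conditional is hypothetical.

```python
def branch_conditional(bo: int, cr_bit: int, ctr: int):
    """Return (taken, new_ctr) for a 5-bit BO operand.

    cr_bit is the value (0 or 1) of the CR bit selected by BI.
    BO bits are numbered 0 (most significant) through 4 (the y bit).
    """
    b = [(bo >> (4 - i)) & 1 for i in range(5)]
    if not b[2]:
        ctr = (ctr - 1) & 0xFFFFFFFF          # decrement the CTR first
    # CTR test: skipped when BO[2] = 1; BO[3] selects CTR = 0 versus CTR != 0.
    ctr_ok = bool(b[2]) or ((ctr != 0) != bool(b[3]))
    # Condition test: skipped when BO[0] = 1; BO[1] is the required CR bit value.
    cond_ok = bool(b[0]) or (cr_bit == b[1])
    return ctr_ok and cond_ok, ctr
```

For the 0000y encoding with the CTR equal to 2 and the tested CR bit equal to 0, the sketch decrements the CTR to 1 and reports the branch as taken.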
Setting the “y” bit to 0 indicates a predicted behavior for the branch instruction:
• For bcx with a negative value in the displacement operand, the branch is taken.
• In all other cases (bcx with a non-negative value in the displacement operand, bclrx,
or bcctrx), the branch is not taken.
Setting the “y” bit to 1 reverses the preceding indications.
The sign of the displacement operand is used as described above even if the target is an
absolute address. The default value for the “y” bit should be 0, and should only be set to 1
if software has determined that the prediction corresponding to “y” = 1 is more likely to be
correct than the prediction corresponding to “y” = 0. Software that does not compute branch
predictions should set the “y” bit to zero.
In most cases, the branch should be predicted to be taken if the value of the following
expression is 1, and to fall through if the value is 0.
((BO[0] & BO[2]) | S) ⊕ BO[4]
In the expression above, S (bit 16 of the branch conditional instruction coding) is the sign
bit of the displacement operand if the instruction has a displacement operand and is 0 if the
operand is reserved. BO[4] is the “y” bit, or 0 for the “branch always” encoding of the BO
operand. (Advantage is taken of the fact that, for bclrx and bcctrx, bit 16 of the instruction
is part of a reserved operand and therefore must be 0.)
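The prediction expression above translates directly into code. In this minimal sketch the function name predict_taken is hypothetical, bo is the 5-bit BO operand with bit 0 as its most significant bit, and s is bit 16 of the instruction (the displacement sign bit, or 0 when that operand is reserved).

```python
def predict_taken(bo: int, s: int) -> bool:
    """Evaluate ((BO[0] & BO[2]) | S) xor BO[4] for a conditional branch."""
    bo0 = (bo >> 4) & 1      # 1 = do not test the condition
    bo2 = (bo >> 2) & 1      # 1 = do not decrement and test the CTR
    bo4 = bo & 1             # the y bit (0 for the branch always encoding)
    return bool(((bo0 & bo2) | s) ^ bo4)
```

With y = 0, a backward bc (S = 1) is predicted taken and a forward bc (S = 0) is predicted not taken; setting y = 1 reverses both predictions.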
The 5-bit BI operand in branch conditional instructions specifies which of the 32 bits in the
CR represents the condition to test.
When the branch instructions contain immediate addressing operands, the target addresses
can be computed sufficiently ahead of the branch instruction that instructions can be
fetched along the target path. If the branch instructions use the link and count registers,
instructions along the target path can be fetched if the link or count register is loaded
sufficiently ahead of the branch instruction.
Branch if condition true bt bta btlr btctr btl btla btlrl btctrl
Table 3-26 provides the abbreviated set of simplified mnemonics for the most commonly
performed conditional branches. Unusual cases of conditional branches can be coded using
a basic branch conditional mnemonic (bc, bclr, bcctr) with the condition to be tested
specified as a numeric first operand.
Instructions using a mnemonic from Table 3-26 that tests a condition specify the condition
as the first operand of the instruction. Table 3-27 summarizes the mnemonic symbols and
the equivalent numeric values used to interpret a condition register CR field during a branch
conditional instruction compare operation.
Symbol Value Meaning
lt 0 Less than
gt 1 Greater than
eq 2 Equal
so 3 Summary overflow
un 3 Unordered (after floating-point comparison)
Table 3-28 summarizes the mnemonic symbols and the equivalent numeric values used to
identify the condition register CR field to be evaluated by the compare operation.
Table 3-28. Condition Register CR Field Identification Symbols
Symbol Value Meaning
cr0 0 CR0
cr1 4 CR1
cr2 8 CR2
cr3 12 CR3
cr4 16 CR4
cr5 20 CR5
cr6 24 CR6
cr7 28 CR7
The simplified branch mnemonics and the symbols in Table 3-27 and Table 3-28 are
combined in an expression that identifies the bit (0–31) of CR to be tested, as follows:
Examples:
1. Decrement CTR and branch if it is still nonzero (closure of a loop controlled by a
count loaded into CTR).
bdnz target (equivalent to bc 16,0,target)
2. Same as (1) but branch only if CTR is nonzero and condition in CR0 is “equal.”
bdnzt eq,target (equivalent to bc 8,2,target)
3. Same as (2), but “equal” condition is in CR5.
bdnzt 4*cr5+eq,target (equivalent to bc 8,22,target)
lt Less than
eq Equal
gt Greater than
ne Not equal
so Summary overflow
These codes are reflected in the simplified mnemonics shown in Table 3-30.
Branch if less than blt blta bltlr bltctr bltl bltla bltlrl bltctrl
Branch if less than or equal ble blea blelr blectr blel blela blelrl blectrl
Branch if equal beq beqa beqlr beqctr beql beqla beqlrl beqctrl
Branch if greater than or bge bgea bgelr bgectr bgel bgela bgelrl bgectrl
equal
Branch if greater than bgt bgta bgtlr bgtctr bgtl bgtla bgtlrl bgtctrl
Branch if not less than bnl bnla bnllr bnlctr bnll bnlla bnllrl bnlctrl
Branch if not equal bne bnea bnelr bnectr bnel bnela bnelrl bnectrl
Branch if not greater than bng bnga bnglr bngctr bngl bngla bnglrl bngctrl
Branch if summary bso bsoa bsolr bsoctr bsol bsola bsolrl bsoctrl
overflow
Branch if not summary bns bnsa bnslr bnsctr bnsl bnsla bnslrl bnsctrl
overflow
Branch if unordered bun buna bunlr bunctr bunl bunla bunlrl bunctrl
Branch if not unordered bnu bnua bnulr bnuctr bnul bnula bnulrl bnuctrl
Instructions using the mnemonics in Table 3-30 specify the condition register field in an
optional first operand. If the CR field being tested is CR0, this operand need not be
specified. Otherwise, one of the CR field symbols listed in Table 3-28 is coded as the first
operand.
Examples:
1. Branch if CR0 reflects condition “not equal.”
bne target (equivalent to bc 4,2,target)
2. Same as (1), but condition is in CR3.
bne cr3,target (equivalent to bc 4,14,target)
3. Branch to an absolute target if CR4 specifies “greater than,” setting the link register.
This is a form of conditional “call,” as the return address is saved in the link register.
bgtla cr4,target (equivalent to bcla 12,17,target)
4. Same as (3), but target address is in the count register.
bgtctrl cr4 (equivalent to bcctrl 12,17)
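The mapping from a simplified mnemonic to the numeric BO and BI operands can be sketched as a table lookup. This is an illustrative reconstruction, not an assembler specification: the names CR_BIT, COND, bi_operand, and expand are hypothetical, the BO values 12 (branch if true) and 4 (branch if false) are taken from the examples above, and the pairing of each two-letter code with a tested bit (for instance, le as "not greater than") is an inference from the condition names. The CR field is given by its number n (0–7), so the tested bit is 4*n plus the bit offset within the field.

```python
# Bit offset of each condition within a 4-bit CR field.
CR_BIT = {"lt": 0, "gt": 1, "eq": 2, "so": 3, "un": 3}

# Two-letter code -> (BO value, CR bit tested).
# BO = 12: branch if the bit is 1; BO = 4: branch if the bit is 0.
COND = {
    "lt": (12, "lt"), "le": (4, "gt"), "eq": (12, "eq"), "ge": (4, "lt"),
    "gt": (12, "gt"), "nl": (4, "lt"), "ne": (4, "eq"), "ng": (4, "gt"),
    "so": (12, "so"), "ns": (4, "so"), "un": (12, "un"), "nu": (4, "un"),
}

def bi_operand(cr_field: int, cond: str) -> int:
    """CR bit number (0-31) for condition cond in CR field number cr_field."""
    return 4 * cr_field + CR_BIT[cond]

def expand(code: str, cr_field: int = 0):
    """Return the (BO, BI) pair for a mnemonic such as bne or bgt."""
    bo, bit = COND[code]
    return bo, bi_operand(cr_field, bit)
```

For instance, expand("ne", 3) yields the operand pair used in bc 4,14,target, matching the bne cr3,target example.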
Branch Conditional    bc (bca, bcl, bcla) BO,BI,target_addr
The BI operand specifies the bit in the condition register (CR) to be
used as the condition of the branch. The BO operand is used as
described in Table 3-25.
bc       Branch Conditional. Branch conditionally to the address
         computed as the sum of the immediate address and the
         address of the current instruction.
bca      Branch Conditional Absolute. Branch conditionally to the
         absolute address specified.
bcl      Branch Conditional then Link. Branch conditionally to the
         address computed as the sum of the immediate address
         and the address of the current instruction. The instruction
         address following this instruction is placed into the link
         register.
bcla     Branch Conditional Absolute then Link. Branch
         conditionally to the absolute address specified. The
         instruction address following this instruction is placed into
         the link register.
Branch Conditional to Link Register    bclr (bclrl) BO,BI
The BI operand specifies the bit in the condition register to be used
as the condition of the branch. The BO operand is used as described
in Table 3-25.
bclr     Branch Conditional to Link Register. Branch conditionally
         to the address in the link register.
bclrl    Branch Conditional to Link Register then Link. Branch
         conditionally to the address specified in the link register.
         The instruction address following this instruction is then
         placed into the link register.
Branch Conditional to Count Register    bcctr (bcctrl) BO,BI
The BI operand specifies the bit in the condition register to be used
as the condition of the branch. The BO operand is used as described
in Table 3-25.
bcctr    Branch Conditional to Count Register. Branch
         conditionally to the address specified in the count register.
bcctrl   Branch Conditional to Count Register then Link. Branch
         conditionally to the address specified in the count register.
         The instruction address following this instruction is placed
         into the link register.
Note: If the “decrement and test CTR” option is specified (BO[2]=0),
the instruction form is invalid. In this case, the 601 tests the
decremented count register for zero and branches based on this test,
but directs instruction fetching to the address specified by the
nondecremented version of the count register. Use of this invalid form
of the instruction is not recommended.
Condition crand crbD,crbA,crbB The bit in the condition register specified by crbA is ANDed with
Register the bit in the condition register specified by crbB. The result is
AND placed into the condition register bit specified by crbD.
Condition cror crbD,crbA,crbB The bit in the condition register specified by crbA is ORed with
Register OR the bit in the condition register specified by crbB. The result is
placed into the condition register bit specified by crbD.
Condition crxor crbD,crbA,crbB The bit in the condition register specified by crbA is XORed with
Register the bit in the condition register specified by crbB. The result is
XOR placed into the condition register bit specified by crbD.
Condition crnand crbD,crbA,crbB The bit in the condition register specified by crbA is ANDed with
Register the bit in the condition register specified by crbB. The
NAND complemented result is placed into the condition register bit
specified by crbD.
Condition crnor crbD,crbA,crbB The bit in the condition register specified by crbA is ORed with
Register the bit in the condition register specified by crbB. The
NOR complemented result is placed into the condition register bit
specified by crbD.
Condition Register Equivalent    creqv crbD,crbA,crbB
The bit in the condition register specified by crbA is XORed with
the bit in the condition register specified by crbB. The
complemented result is placed into the condition register bit
specified by crbD.
Condition Register AND with Complement    crandc crbD,crbA,crbB
The bit in the condition register specified by crbA is ANDed with
the complement of the bit in the condition register specified by
crbB and the result is placed into the condition register bit
specified by crbD.
Condition Register OR with Complement    crorc crbD,crbA,crbB
The bit in the condition register specified by crbA is ORed with
the complement of the bit in the condition register specified by
crbB and the result is placed into the condition register bit
specified by crbD.
Move mcrf crfD,crfS The contents of crfS are copied into crfD. No other condition
Condition register fields are changed.
Register
Field
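The condition register logical instructions and mcrf can be modeled as single-bit and single-field operations on a 32-bit value. The following Python sketch is illustrative only; the helper names are hypothetical, and CR bit 0 is assumed to be the most significant bit, in keeping with the manual's bit numbering.

```python
def cr_get(cr: int, n: int) -> int:
    """Value of CR bit n, where bit 0 is the most significant bit."""
    return (cr >> (31 - n)) & 1

def cr_set(cr: int, n: int, v: int) -> int:
    """Copy of cr with bit n replaced by v."""
    return (cr & ~(1 << (31 - n)) & 0xFFFFFFFF) | (v << (31 - n))

CR_OPS = {
    "crand":  lambda a, b: a & b,
    "cror":   lambda a, b: a | b,
    "crxor":  lambda a, b: a ^ b,
    "crnand": lambda a, b: 1 - (a & b),
    "crnor":  lambda a, b: 1 - (a | b),
    "creqv":  lambda a, b: 1 - (a ^ b),
    "crandc": lambda a, b: a & (1 - b),
    "crorc":  lambda a, b: a | (1 - b),
}

def cr_logical(cr: int, op: str, crbd: int, crba: int, crbb: int) -> int:
    """Apply one CR logical instruction and return the new CR."""
    result = CR_OPS[op](cr_get(cr, crba), cr_get(cr, crbb))
    return cr_set(cr, crbd, result)

def mcrf(cr: int, crfd: int, crfs: int) -> int:
    """Copy CR field crfs into field crfd; other fields are unchanged."""
    field = (cr >> (28 - 4 * crfs)) & 0xF
    shift = 28 - 4 * crfd
    return (cr & ~(0xF << shift) & 0xFFFFFFFF) | (field << shift)
```

Two common idioms fall out of these definitions: crxor with all three operands equal clears a bit, and creqv with all three operands equal sets it.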
System Call sc — When executed, the effective address of the instruction following the
sc instruction is placed into SRR0. Bits 16–31 of the MSR are placed
into bits 16–31 of SRR1, and bits 0–15 of SRR1 are set to undefined
values. Then a system call exception is generated. The exception
causes the MSR to be altered as described in Section 5.4, “Exception
Definitions.”
The exception causes the next instruction to be fetched from offset
x'C00' from the base physical address indicated by the new setting of
MSR[IP]. For a discussion of POWER compatibility with respect to
instruction bits 16–29, refer to Appendix B, Section B.10, “System
Call/Supervisor Call.” To ensure compatibility with future versions of
the PowerPC architecture, bits 16–29 should be coded as zero and
bit 30 should be coded as a 1. The PowerPC architecture defines bit
31 as reserved, and thereby cleared to 0; in order for the 601 to
maintain compatibility with the POWER architecture, the execution of
an sc instruction with bit 31 (the LK bit) set to 1 will cause an update
of the Link register with the address of the instruction following the sc
instruction.
This instruction is context synchronizing.
Return from Interrupt    rfi
Bits 16–31 of SRR1 are placed into bits 16–31 of the MSR, then the next instruction is
fetched, under control of the new MSR value, from the address SRR0[0–29] || b'00'.
This instruction is a supervisor-level instruction and is context synchronizing.
Trap Word Immediate    twi TO,rA,SIMM
The contents of rA are compared with the sign-extended SIMM operand. If any bit in the TO
operand is set to 1 and its corresponding condition is met by the result of the
comparison, then the system trap handler is invoked.
Trap Word    tw TO,rA,rB
The contents of rA are compared with the contents of rB. If any bit in the TO operand is
set to 1 and its corresponding condition is met by the result of the comparison, then
the system trap handler is invoked.
0 Less than
1 Greater than
2 Equal
A standard set of codes has been adopted for the most common combinations of trap
conditions, as shown in Table 3-36. The mnemonics defined in Table 3-37 are variations of
the trap instructions, with the most useful values of the trap instruction TO operand
represented as a mnemonic rather than specified as a numeric operand.
Code      Description      TO Encoding   <   >   =   <U   >U
lt        Less than        16            1   0   0   0    0
eq        Equal            4             0   0   1   0    0
gt        Greater than     8             0   1   0   0    0
ne        Not equal        24            1   1   0   0    0
(none)    Unconditional    31            1   1   1   1    1
Examples:
• Trap if rA, considered as a 32-bit quantity, is logically greater than x'7FF'.
twlgti rA, x'7FF' (equivalent to twi 1,rA, x'7FF')
• Trap unconditionally
trap (equivalent to tw 31,0,0)
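The way the TO operand selects trap conditions can be sketched in a few lines of Python. This is an illustrative model only, not the 601's implementation. The weights for lt (16), gt (8), and eq (4) come from the table above, and the weight 1 for "logically greater than" matches the twi 1 example. The weight 2 for "logically less than" is assumed from the PowerPC definition of the TO field. The function name trap_taken is hypothetical.

```python
# TO operand bit weights. lt, gt, eq, and lgt follow the text above; the
# weight for "logically less than" (2) is an assumption from the PowerPC
# definition of the TO field.
TO_LT, TO_GT, TO_EQ, TO_LLT, TO_LGT = 16, 8, 4, 2, 1

MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def trap_taken(to, a, b):
    """Return True if any condition selected by TO holds when a is compared with b."""
    sa, sb = to_signed32(a), to_signed32(b)   # signed view
    ua, ub = a & MASK32, b & MASK32           # unsigned ("logical") view
    conditions = 0
    if sa < sb:
        conditions |= TO_LT
    if sa > sb:
        conditions |= TO_GT
    if ua == ub:
        conditions |= TO_EQ
    if ua < ub:
        conditions |= TO_LLT
    if ua > ub:
        conditions |= TO_LGT
    return (to & conditions) != 0
```

For twi, the second argument would be the sign-extended SIMM value; for tw, it would be the contents of rB.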
Move to Condition Register Fields    mtcrf CRM,rS
The contents of rS are placed into the condition register under control of the field mask
specified by operand CRM. The field mask identifies the 4-bit fields affected. Let i be an
integer in the range 0–7. If CRM(i) = 1, then CR field i (CR bits 4*i through 4*i+3) is set
to the contents of the corresponding field of rS.
In some PowerPC implementations, this instruction may perform more slowly when only a
portion of the fields are updated as opposed to all of the fields. This is not true for
the 601.
Move to Condition Register from XER    mcrxr crfD
The contents of XER[0–3] are copied into the condition register field designated by crfD.
All other fields of the condition register remain unchanged. XER[0–3] is cleared to 0.
Move from Condition Register    mfcr rD
The contents of the condition register are placed into rD.
Move from Machine State Register    mfmsr rD
The contents of the MSR are placed into rD. This is a supervisor-level instruction.
Move to Special Purpose Register    mtspr SPR,rS
The SPR field denotes a special purpose register, encoded as shown in Table 3-40. The
contents of rS are placed into the designated SPR.
Simplified mnemonic examples:
mtxer rA    mtspr 1,rA
mtlr rA     mtspr 8,rA
mtctr rA    mtspr 9,rA
Move from Special Purpose Register    mfspr rD,SPR
The SPR field denotes a special purpose register, encoded as shown in Table 3-40. The
contents of the designated SPR are placed into rD.
Simplified mnemonic examples:
mfxer rA    mfspr rA,1
mflr rA     mfspr rA,8
mfctr rA    mfspr rA,9
For mtspr and mfspr instructions, the SPR number coded in assembly language does not
appear directly as a 10-bit binary number in the instruction. The number coded is split into
two 5-bit halves that are reversed in the instruction, with the high-order 5 bits appearing in
bits 16–20 of the instruction and the low-order 5 bits in bits 11–15.
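The half-swap described above can be illustrated with a short Python sketch. It assumes the usual PowerPC bit numbering in which bit 0 is the most significant bit of the 32-bit instruction word, so that the 10-bit SPR field occupies bits 11–20 and bit 20 sits 11 positions above the least significant bit. The function names are hypothetical.

```python
def spr_field(spr):
    """Return the 10-bit SPR field as it appears in instruction bits 11-20.

    The low-order 5 bits of the coded SPR number go to bits 11-15 and the
    high-order 5 bits go to bits 16-20, so the two halves are swapped.
    """
    if not 0 <= spr < 1024:
        raise ValueError("SPR number must fit in 10 bits")
    high5 = (spr >> 5) & 0x1F
    low5 = spr & 0x1F
    return (low5 << 5) | high5


def spr_field_in_word(spr):
    """Position the swapped field at bits 11-20 of a 32-bit instruction word.

    Assumes bit 0 is the most significant bit, so bit 20 is shifted left by 11.
    """
    return spr_field(spr) << 11
```

Because the operation only swaps two 5-bit halves, applying spr_field twice returns the original SPR number, so the same helper can be used when decoding an instruction.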
Table 3-40 summarizes the SPR encodings to which the 601 permits user-level access.
Table 3-40. User-Level SPR Encodings
Decimal Value in rD    SPR[0–4]    SPR[5–9]    Register Name    Description
If the SPR field contains any value other than one of the values shown in Table 3-40, the
instruction form is invalid. For an invalid instruction form in which SPR[0]=1, the system
supervisor-level instruction error handler will be invoked if the instruction is executed by a
user-level program. If the instruction is executed by a supervisor-level program, the result
is a no-op.
SPR[0]=1 if and only if writing the register is supervisor-level. Execution of this instruction
specifying a defined and supervisor-level register when MSR[PR]=1 results in a privilege
violation type program exception.
SPR encodings for the DEC, MQ, RTCL, and RTCU registers are not part of the PowerPC
architecture. For forward compatibility with other members of the PowerPC
microprocessor family, the mftb instruction should be used to obtain the contents of the
RTCL and RTCU registers. The mftb instruction is a PowerPC instruction unimplemented
by the 601, and will be trapped by the illegal instruction exception handler, which can then
issue the appropriate mfspr instructions for reading the RTCL and RTCU registers.
The PVR (processor version register) is a read-only register.
SPR encodings shown in Table 3-40 can also be used while at the supervisor level.
The mtspr and mfspr instructions specify a special purpose register (SPR) as a numeric
operand. Simplified mnemonics are provided that represent the SPR in the mnemonic rather
than requiring it to be coded as an operand. Table 3-42 below specifies the simplified
mnemonics provided on the 601 for SPR operations.
Table 3-42. SPR Simplified Mnemonics
Special Purpose Register    Move to SPR Simplified Mnemonic    Move to SPR Instruction    Move from SPR Simplified Mnemonic    Move from SPR Instruction
BAT register, upper    mtibatu n,rS    mtspr 528+(2*n),rS    mfibatu rD,n    mfspr rD,528+(2*n)
BAT register, lower    mtibatl n,rS    mtspr 529+(2*n),rS    mfibatl rD,n    mfspr rD,529+(2*n)
Cache Line Compute Size    clcs rD,rA
This is a POWER instruction, and is not part of the PowerPC architecture. This instruction
will not be supported by other PowerPC implementations.
This instruction places the cache line size specified by operand rA into register rD. The
rA operand is encoded as follows:
Move to Segment Register    mtsr SR,rS
The contents of rS are placed into the segment register specified by operand SR.
This is a supervisor-level instruction.
Move to Segment Register Indirect    mtsrin rS,rB
The contents of rS are copied to the segment register selected by bits 0–3 of rB.
This is a supervisor-level instruction.
Move from Segment Register    mfsr rD,SR
The contents of the segment register specified by operand SR are placed into rD.
This is a supervisor-level instruction.
Move from Segment Register Indirect    mfsrin rD,rB
The contents of the segment register selected by bits 0–3 of rB are copied into rD.
This is a supervisor-level instruction.
Translation Lookaside Buffer Invalidate Entry    tlbie rB
The effective address is the contents of rB. If the TLB contains an entry corresponding to
the EA, that entry is removed from the TLB. The TLB search is done regardless of the
settings of MSR[IT] and MSR[DT]. Also, a TLB invalidate operation is broadcast on the
system bus unless disabled by setting bit 17 in HID1.
Block address translation for the EA, if any, is ignored.
Because the 601 supports broadcast of TLB entry invalidate
operations, the following must be observed:
• The tlbie instruction must be contained in a critical section of
memory controlled by software locking, so that the tlbie is issued
on only one processor at a time.
• A sync instruction must be issued after every tlbie and at the end
of the critical section. This causes hardware to wait for the effects
of the preceding tlbie instruction(s) to propagate to all
processors.
A processor detecting a TLB invalidate broadcast does the following:
1. Prevents execution of any new load, store, cache control or tlbie
instructions and prevents any new reference or change bit
updates
2. Waits for completion of any outstanding memory operations
(including updates to the reference and change bits associated
with the entry to be invalidated)
3. Invalidates the two entries (both associativity classes) in the UTLB
indexed by the matching address
4. Resumes normal execution
This is a supervisor-level instruction.
Nothing is guaranteed about instruction fetching in other processors if
tlbie deletes the page in which another processor is executing.
Because the presence, absence, and exact semantics of the translation lookaside buffer
management instruction are implementation dependent, system software should encapsulate
uses of the instruction into subroutines to minimize the impact of migrating from one
implementation to another.
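Several of the descriptions above select a field by bit number, for example bits 0–3 of rB for mtsrin and mfsrin. The TLB invalidate bus operation described in Chapter 4 likewise transmits bits 12–19 of the EA. The following Python sketch shows one way to extract such fields, assuming the manual's numbering in which bit 0 is the most significant bit of a 32-bit value. The helper names are hypothetical.

```python
def bits(value, first, last, width=32):
    """Extract bits first..last (inclusive), where bit 0 is the most significant bit."""
    if not 0 <= first <= last < width:
        raise ValueError("bit range out of bounds")
    length = last - first + 1
    shift = width - 1 - last          # distance of bit 'last' from the LSB
    return (value >> shift) & ((1 << length) - 1)


def segment_register_index(rb):
    """Segment register selected by bits 0-3 of rB (mtsrin, mfsrin)."""
    return bits(rb, 0, 3)


def tlb_broadcast_index(ea):
    """Bits 12-19 of the EA, as transmitted by the TLB invalidate bus operation."""
    return bits(ea, 12, 19)
```

For instance, an rB value of x'A0001234' selects segment register 10, because its four most significant bits are b'1010'.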
The PowerPC 601 microprocessor contains a 32-Kbyte, eight-way set associative, unified
(instruction and data) cache. The cache line size is 64 bytes, divided into two eight-word
sectors, each of which can be snooped, loaded, cast-out, or invalidated independently. The
cache is designed to adhere to a write-back policy, but the 601 allows control of
cacheability, write policy, and memory coherency at the page and block level. The cache
uses a least recently used (LRU) replacement policy.
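The cache dimensions given above imply 64 sets: 32 Kbytes divided by (8 ways × 64 bytes per line). The Python sketch below works through that arithmetic and splits an address into tag, set, sector, and byte-in-sector fields. The split assumes conventional indexing (low-order bits for the byte offset, the next bits for the set). The exact address bits that the 601 uses are not stated in this section, so that part is an assumption, and the names are hypothetical.

```python
CACHE_BYTES = 32 * 1024   # 32-Kbyte unified cache
WAYS = 8                  # eight-way set associative
LINE_BYTES = 64           # cache line size
SECTOR_BYTES = 32         # each line holds two eight-word sectors

SETS = CACHE_BYTES // (WAYS * LINE_BYTES)   # 64 sets


def decompose(address):
    """Split an address into (tag, set index, sector, byte within sector).

    Assumes conventional indexing; the 601's actual bit assignment may differ.
    """
    offset = address % LINE_BYTES
    sector = offset // SECTOR_BYTES
    byte_in_sector = offset % SECTOR_BYTES
    set_index = (address // LINE_BYTES) % SETS
    tag = address // (LINE_BYTES * SETS)
    return tag, set_index, sector, byte_in_sector
```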
The 601’s on-chip cache is nonblocking. Burst operations to the cache are the result of a
cache sector reload caused by a cache miss, and are buffered such that the cache update is
reduced to two single-cycle operations of four words. That is, the results of the first two and
the last two beats are buffered and written to the cache in single cycles apiece. This frees
the cache to perform lower priority operations in the meantime.
System operations, including cache operations, connect to the system interface through the
memory unit, which includes a two-element read queue and a three-element write queue.
As shown in Figure 1-1, the cache provides an eight-word interface to the instruction
fetcher and load/store unit. The surrounding logic selects, organizes, and forwards the
requested information to the requesting unit. Write operations to the cache can be
performed on a byte basis, and a complete read-modify-write operation to the cache can
occur in each cycle.
The cache unit and the memory unit coordinate cache reload and cast-out operations so that
a cache miss does not block the use of the cache for other operations during the next cycle.
Cache reload operations always occur on a sector basis, with the option of reloading the
additional sector as a low-priority operation. On load operations and fetch operations, the
critical data is forwarded to the requesting unit without waiting for the entire cache line to
be loaded.
The 601 maintains cache coherency in hardware by coordinating activity between the
cache, the memory unit, and the bus interface logic. As bus operations are performed on the
bus by other processors, the 601 bus snooping logic monitors the addresses that are
referenced. These addresses are compared with the addresses resident in the cache. The
cache unit uses a second port into its tag directory to check for a matching entry and the
If the requested address is in double word A or B, the address placed on the bus is that of
quad-word A, and the four data beats are ordered in the following manner:
Beat
0 1 2 3
A B C D
If the requested address is in double word C or D, the address placed on the bus is that
of quad-word C, and the four data beats are ordered in the following manner:
Beat
0 1 2 3
C D A B
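The two beat orderings above can be expressed as a small Python function. It is a sketch under the assumption that a sector is 32 bytes made up of four 8-byte double words A through D, with quad-word A covering double words A and B and quad-word C covering C and D. The function name is hypothetical.

```python
SECTOR_BYTES = 32
DOUBLE_WORD_BYTES = 8
QUAD_WORD_BYTES = 16


def burst_order(address):
    """Return (address placed on the bus, order of the four double-word beats)."""
    sector_base = address - (address % SECTOR_BYTES)
    double_word = (address % SECTOR_BYTES) // DOUBLE_WORD_BYTES  # 0..3 maps to A..D
    if double_word < 2:
        # Request falls in A or B: bus address is quad-word A, beats A B C D.
        return sector_base, ["A", "B", "C", "D"]
    # Request falls in C or D: bus address is quad-word C, beats C D A B.
    return sector_base + QUAD_WORD_BYTES, ["C", "D", "A", "B"]
```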
Modified (M) The addressed sector is valid in the cache and in only this cache. The sector is modified with
respect to system memory—that is, the modified data in the sector has not been written back to
memory.
Exclusive (E) The addressed sector is in this cache only. The data in this sector is consistent with system
memory.
Shared (S) The addressed sector is valid in the cache and in at least one other cache. This sector is always
consistent with system memory. That is, the shared state is shared-unmodified; there is no shared-
modified state.
Invalid (I) This state indicates that the addressed sector is not resident in the cache.
[Figure: MESI cache coherency state diagram, showing transitions among the INVALID, SHARED,
EXCLUSIVE, and MODIFIED states. On a miss, the replaced line is first cast out to memory if
modified.]
Transition labels used in the figure:
RH = Read Hit
RMS = Read Miss, Shared
RME = Read Miss, Exclusive
WH = Write Hit
WM = Write Miss
SHR = Snoop Hit on a Read
SHW = Snoop Hit on a Write or Read-with-Intent-to-Modify
Bus transactions shown in the figure: snoop push, invalidate transaction,
read-with-intent-to-modify, and read.
000 Set 0
001 Set 1
010 Set 2
011 Set 3
100 Set 4
101 Set 5
110 Set 6
111 Set 7
Noncacheable cases are not part of this table. The first three cases also involve selecting a
replacement class and casting-out modified data that may have resided in that replacement
class.
Table 4-4 provides an overview of memory coherency actions on store operations. This
table does not include noncacheable or write-through cases nor does it completely describe
the exact mechanisms for the operations described. It describes generally what happens
within the chip. The read-with-intent-to-modify (RWITM) examples involve selecting a
replacement class and casting-out modified data that may have resided in that replacement
class.
Clean block The clean operation is an address-only bus transaction, initiated by executing a dcbst
instruction. This operation affects only sectors marked as modified (M). Assuming the
GBL signal is asserted, modified sectors are pushed out to memory, changing the state
to E.
Flush block The flush operation is an address-only bus transaction initiated by executing a dcbf
instruction. Assuming the GBL signal is asserted, the flush block operation results in the
following:
• If the addressed sector is shared or exclusive, an additional snoop action is
generated internally that invalidates the addressed sector.
• If the addressed sector is in the M state, ARTRY is asserted and an additional
internally generated snoop action is initiated that pushes the modified sector out of the
cache and invalidates the sector.
• If HID0[31] = 0, and any bus read operation is pending during this snoop operation,
the write-back of the modified sector is considered to be a high-priority bus operation
that may be enveloped within the pending load operation.
• If HID0[31] = 1, and the snoop flush was presented with HP_SNP_REQ asserted, the
write-back of the modified sector is considered to be a high-priority bus operation that
may be enveloped within the pending load operation.
• If the addressed sector hits any of the three entries in the write queue, that entry is
tagged as a high-priority push, after which it can be loaded from memory.
Write with flush Write-with-flush and write-with-flush-atomic operations occur after the processor issues
Write with flush atomic a store or stwcx. instruction, respectively.
• If the addressed sector is in the shared or exclusive state, the address snoop forces
the state of the addressed sector to invalid.
• If the addressed sector is in the modified state, the address snoop causes ARTRY to be
asserted, initiates a push of the modified sector out of the cache, and changes the
state of the sector to invalid.
• If HID0[31] = 0, and any bus read operation is pending during this snoop operation,
the write-back of the modified sector is considered to be a high-priority bus operation
that may be enveloped within the pending load operation.
• If HID0[31] = 1, and the snoop write was presented with HP_SNP_REQ asserted, the
write-back of the modified sector is considered to be a high-priority bus operation that
may be enveloped within the pending load operation.
• If the addressed sector hits any of the three entries in the write queue, that entry is
tagged as a high-priority push operation.
Kill block The kill-block operation is an address-only bus transaction initiated when one of the
following occurs:
• a dcbi instruction is executed
• a dcbz operation to a block marked S or I is executed
• a write operation to a block marked S occurs
If a snoop hit occurs, an additional snoop is initiated internally and the sector is forced to
the I state, effectively killing any modified data that may have been in the sector. The
three-entry write queue is also snooped, and if a queue entry hits, it is purged.
Write with kill In a write-with-kill operation, the processor snoops the cache for a copy of the
addressed sector. If one is found, an additional snoop action is initiated internally and
the sector is forced to the I state, killing modified data that may have been in the sector.
In addition to snooping the cache, the three-entry write queue is also snooped. A kill
operation that hits an entry in the write queue purges that entry from the queue.
Read The read operation is used by most single-beat and burst read operations on the bus. A
Read atomic read on the bus with the GBL signal asserted causes the following responses:
• If the addressed sector is in the cache but is invalid, the 601 takes no action.
• If the sector is in the shared state, the 601 asserts the shared snoop status indicator.
• If the sector is in the E state, the 601 asserts the shared snoop status indicator and
initiates an additional snoop action to change the state of that sector from E to S.
• If the sector is in the cache in the M state, the 601 asserts both the ARTRY and the
SHD snoop status signals. It also initiates an additional snoop action to push the
modified sector out of the chip and to mark that cache sector as shared.
Read atomic operations appear on the bus in response to lwarx instructions and
generate the same snooping responses as read operations.
Read with intent to modify An RWITM operation is issued to acquire exclusive use of a memory location for the
(RWITM) purpose of modifying it.
RWITM atomic • If the addressed sector is in the I state, the 601 takes no action.
• If the addressed sector is in the cache and in the S or E state, the 601 initiates an
additional snoop action to change the state of the cache sector to I.
• If the addressed sector is in the cache and in the M state, the 601 asserts both the
ARTRY and the SHD snoop status signals. It also initiates an additional snoop action
to push the modified sector out of the chip and to change the state of that sector in the
cache from M to I.
The RWITM atomic operations appear on the bus in response to stwcx. instructions
and are snooped like RWITM instructions.
sync The sync instruction causes an address-only bus transaction. The 601 asserts the
ARTRY snoop status if there are any TLB-related snoop operations pending in the chip.
This transaction is also generated by the eieio instruction on the 601.
TLB invalidate A TLB invalidation operation is caused by executing a tlbie instruction. This instruction
transmits the 601’s TLB index (bits 12–19 of the EA) onto the system bus. Other
processors on the bus invalidate TLB entries associated with EAs that match those bits.
I/O reply The I/O reply operation is part of the I/O controller interface operation. It serves as the
final bus operation in the series of bus operations that service an I/O controller interface
operation.
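The snoop responses listed above for the read and RWITM operations can be summarized as a lookup table. The Python sketch below is a simplified model built only from those two entries. It assumes GBL is asserted and ignores the write queue, the HID0[31] priority cases, and the other bus operations. The names are hypothetical, and the "shared snoop status indicator" is represented as SHD.

```python
# (snooped operation, current MESI state) ->
#     (next state, snoop status signals asserted, modified sector pushed?)
SNOOP_RESPONSES = {
    ("read", "I"): ("I", set(), False),
    ("read", "S"): ("S", {"SHD"}, False),
    ("read", "E"): ("S", {"SHD"}, False),
    ("read", "M"): ("S", {"ARTRY", "SHD"}, True),
    ("rwitm", "I"): ("I", set(), False),
    ("rwitm", "S"): ("I", set(), False),
    ("rwitm", "E"): ("I", set(), False),
    ("rwitm", "M"): ("I", {"ARTRY", "SHD"}, True),
}


def snoop(operation, state):
    """Look up the response to a snooped 'read' or 'rwitm' for a sector in 'state'."""
    next_state, signals, push = SNOOP_RESPONSES[(operation, state)]
    return next_state, frozenset(signals), push
```

Read atomic and RWITM atomic operations are snooped in the same way as their non-atomic forms, so they can reuse the "read" and "rwitm" entries.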
dcbf I, S, E I Flush —
dcbz S M Kill —
Table 4-6 does not include noncacheable or write-through cases, nor does it completely
describe the mechanisms for the operations described. For more information, see
Section 4.11, “MESI State Transactions.”
Chapter 3, “Addressing Modes and Instruction Set Summary,” and Chapter 10, “Instruction
Set,” describe the cache control instructions in detail. Several of the cache control
instructions broadcast onto the 601 interface so that all processors in a multiprocessor
system can take appropriate actions. The 601 contains snooping logic to monitor the bus for
these commands and the control logic required to keep the cache and the memory queues
coherent. For additional details about the specific bus operations performed by the 601, see
Chapter 9, “System Interface Operation.”
SYSTEM INTERFACE
The other two elements in the write queue are used for store operations and writing back
modified sectors that have been deallocated by updating the queue; that is, when a cache
sector is full, the least-recently used cache sector is deallocated by first being copied into
the write queue and from there to system memory if it is modified. Note that snooping can
occur after a sector has been pushed out into the write queue and before the data has been
written to system memory. Therefore, to maintain a coherent memory, the write queue
elements are compared to snooped addresses in the same way as the cache tags. If a snoop
hits a write queue element, the data is first stored in system memory before it can be loaded
into the cache of the snooping bus master. Full coherency checking between the cache and
the write queue prevents dependency conflicts.
For a detailed discussion about the retry signals and bus operations pertaining to snooping,
see Chapter 9, “System Interface Operation.”
Execution of a load or store instruction is considered complete when the associated address
translation completes, guaranteeing that the instruction has completed to the point where it
is known that it will not generate an internal exception. However, after address translation
is complete, a read or write operation can generate an external exception.
Load and store instructions are always issued and translated in program order with respect
to other load and store instructions. However, a load or store operation that hits in the cache
can complete ahead of those that miss in the cache; additionally, loads and stores that miss
the cache can be reordered as they arbitrate for the system bus.
If a load or store misses in the cache, the operation is managed by the memory unit, which
prioritizes accesses to the system bus. Read requests, such as loads, RWITMs, and
instruction fetches, have priority over single-beat write operations. The priorities for
accessing the system bus are listed in Section 4.10.2, “Memory Unit Queuing Priorities.”
The 601 ensures memory consistency by comparing target addresses and prohibiting
Load or Fetch Read No x0x I Same 1 Cast out of modified Write with kill
(T = 0) sector 1 (as required)
Load or Fetch Read No x0x S,E,M Same Read data from cache —
(T = 0)
Load or Fetch Read No x1x M I CRTRY read (push Write with kill
T = 0 or Load sector to write queue)
(T = 1,
BUID = x'7F')
lwarx Read Acts like other reads but bus operation uses special encoding
Store Write No 00x I Same 1 Cast out of modified Write with kill
(T = 0) sector (if necessary)
Store ≠ stwcx. Write No 10x I Same Pass single-beat write Write with
(T = 0) to memory queue flush
stwcx. Conditional If the reserved bit is set, this operation is like other writes except the bus operation
write uses a special encoding.
dcbf Data cache No xxx M I Push sector to write Write with kill
block flush queue
dcbst Data cache No xxx M E Push sector to write Write with kill
block store queue
dcbt Data cache No x0x I Same 1 Cast out of modified Write with kill
block touch sector (as required)
3 No action —
3 No action —
Note that dcbt is presented to the cache as a load operation. The instructions tlbie and sync/eieio cause no
state transitions and are not cache operations but are included in the table to show how they are performed
by the memory unit queueing mechanism.
Note also that single-beat writes are not snooped in the write queue.
The PowerPC exception mechanism allows the processor to change to supervisor state as a
result of external signals, errors, or unusual conditions arising in the execution of
instructions. When exceptions occur, information about the state of the processor is saved
to certain registers and the processor begins execution at an address (exception vector)
predetermined for each exception. The exception handler at the specified vector is then
processed with the processor in supervisor mode.
Although multiple exception conditions can map to a single exception vector, a more
specific condition may be determined by examining a register associated with the
exception—for example, the DAE/source instruction service register (DSISR) and the
floating-point status and control register (FPSCR). Additionally, some exception conditions
can be explicitly enabled or disabled by software.
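As an illustration of examining such a register, the DSISR bit settings that Table 5-2 lists for the data access exception can be decoded with a small Python helper. The sketch assumes the manual's bit numbering (bit 0 is the most significant bit of the 32-bit register). The descriptions are abbreviated from the table, and the function name is hypothetical.

```python
# DSISR bit number -> abbreviated cause, taken from the data access
# exception entry in Table 5-2.
DSISR_CAUSES = {
    1: "translation not found in primary or secondary HTEG or BAT range",
    4: "access not permitted by page or BAT protection",
    5: "eciwx/ecowx/lwarx/stwcx./lscbx to I/O segment, or eciwx/ecowx to write-through address",
    6: "store operation (cleared for a load)",
    9: "EA matched the DABR",
    11: "eciwx or ecowx used with EAR[E] cleared",
}


def decode_dsisr(dsisr):
    """Return a list of (bit number, cause) pairs for the bits set in DSISR."""
    result = []
    for bit in sorted(DSISR_CAUSES):
        mask = 1 << (31 - bit)        # bit 0 is the most significant bit
        if dsisr & mask:
            result.append((bit, DSISR_CAUSES[bit]))
    return result
```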
Except for the catastrophic asynchronous exceptions (machine check and system reset), the
PowerPC 601 microprocessor exception model is precise, defined as follows:
• The exception handler is given the address of the excepting instruction (or the next
instruction to execute in the case of asynchronous, precise exceptions).
• All instructions prior to the excepting instruction in the instruction stream have
completed execution and have written back their results.
• No instructions subsequent to the excepting instruction in the instruction stream
have been issued.
A detailed description of how the instruction flow is handled in a precise fashion is provided
in Section 7.3.1.4.4, “Synchronization Tags for the Precise Exception Model.”
The PowerPC architecture requires that exceptions be handled in program order; therefore,
although a particular implementation may recognize exception conditions out of order, they
are presented strictly in order. When an instruction-caused exception is recognized, any
unexecuted instructions that appear earlier in the instruction stream, including any that have
not yet entered the execute state, are required to complete before the exception is taken. Any
exceptions caused by those instructions are handled first. Likewise, exceptions that are
asynchronous and precise are recognized when they occur, but are not handled until all
instructions currently in the execute stage successfully complete execution and report their
results.
Although exceptions have other characteristics as well, such as whether they are maskable
or nonmaskable, the distinctions shown in Table 5-1 define categories of exceptions that the
601 recognizes. Note that Table 5-1 includes no synchronous imprecise exceptions. While
the PowerPC architecture supports imprecise floating-point exceptions, they do not occur
in the 601.
Exceptions, and conditions that cause them, are listed in Table 5-2.
Reserved 00000 —
System reset 00100 A system reset is caused by the assertion of either SRESET or HRESET.
Machine check 00200 A machine check is caused by the assertion of the TEA signal during a data bus
transaction.
Data access 00300 The cause of a data access exception can be determined by the bit settings in
the DSISR, listed as follows:
1 Set if the translation of an attempted access is not found in the primary
hash table entry group (HTEG), or in the rehashed secondary HTEG, or in
the range of a BAT register; otherwise cleared.
4 Set if a memory access is not permitted by the page or BAT protection
mechanism described in Chapter 6, “Memory Management Unit”; otherwise
cleared.
5 Set if the access was to an I/O segment (SR[T] =1) by an eciwx, ecowx,
lwarx, stwcx., or lscbx instruction; otherwise cleared. Set by an eciwx or
ecowx instruction if the access is to an address that is marked as write-
through.
6 Set for a store operation and cleared for a load operation.
9 Set if an EA matches the address in the DABR while in one of the three
compare modes.
11 Set if eciwx or ecowx is used and EAR[E] is cleared.
Instruction 00400 An instruction access exception is caused when an instruction fetch cannot be
access performed for any of the following reasons:
• The effective (logical) address cannot be translated. That is, there is a page
fault for this portion of the translation, so an instruction access exception
must be taken to retrieve the translation from a storage device such as a
hard disk drive.
• The fetch access is to an I/O segment.
• The fetch access violates memory protection. If the key bits (Ks and Ku) in
the segment register and the PP bits in the PTE or BAT are set to prohibit
read access, instructions cannot be fetched from this location.
External 00500 An external interrupt occurs when the INT signal is asserted.
interrupt
Alignment 00600 An alignment exception is caused when the 601 cannot perform a memory
access for any of several reasons, such as when the operand of a floating-point
load or store operation is in an I/O segment (SR[T] = 1) or a scalar load/store
operand crosses a page boundary. Specific exception sources are described in
Section 5.4.6, “Alignment Exception (x'00600').”
Program 00700 A program exception is caused by one of the following exception conditions,
which correspond to bit settings in SRR1 and arise during execution of an
instruction:
• Floating-point enabled exception—A floating-point enabled exception
condition is generated when the following condition is met:
(MSR[FE0] | MSR[FE1]) & FPSCR[FEX] is 1.
FPSCR[FEX] is set by the execution of a floating-point instruction that
causes an enabled exception or by the execution of a “move to FPSCR”
instruction that results in both an exception condition bit and its
corresponding enable bit being set in the FPSCR.
• Illegal instruction—An illegal instruction program exception is generated
when execution of an instruction is attempted with an illegal opcode or illegal
combination of opcode and extended opcode fields (including PowerPC
instructions not implemented in the 601), or when execution of an optional
instruction not provided in the 601 is attempted (these do not include those
optional instructions that are treated as no-ops). The PowerPC instruction
set is described in Chapter 3, “Addressing Modes and Instruction Set
Summary.”
• Privileged instruction—A privileged instruction type program exception is
generated when the execution of a supervisor instruction is attempted and
the MSR register user privilege bit, MSR[PR], is set. In the 601, this
exception is generated for mtspr or mfspr with an invalid SPR field if
SPR[0] = 1 and MSR[PR] = 1. This may not be true for all PowerPC
processors.
• Trap—A trap type program exception is generated when any of the
conditions specified in a trap instruction is met.
Decrementer 00900 The decrementer exception occurs when the most significant bit of the
decrementer (DEC) register transitions from 0 to 1. The exception must also be enabled
with the MSR[EE] bit.
I/O controller 00A00 An I/O controller interface error exception is taken only when an operation to an
interface error I/O controller interface segment fails (such a failure is indicated to the 601 by a
particular bus reply packet). If an I/O controller interface exception is taken on a
memory access directed to an I/O segment, the SRR0 contains the address of
the instruction following the offending instruction. Note that this exception is not
implemented in other PowerPC processors.
Reserved 00B00 —
System call 00C00 A system call exception occurs when a System Call (sc) instruction is executed.
Reserved 00D00 Other PowerPC processors may use this vector for trace exceptions.
Reserved 00E00 The 601 does not generate an interrupt to this vector. Other PowerPC
processors may use this vector for floating-point assist exceptions.
Reserved 00E10–00FFF —
Run mode/ 02000 The run mode exception is taken depending on the settings of the HID1 register
trace exception and the MSR[SE] bit.
The following modes correspond with bit settings in the HID1 register:
• Normal run mode—No address breakpoints are specified, and the 601
executes from zero to three instructions per cycle.
• Single instruction step mode—One instruction is processed at a time. The
appropriate break action is taken after an instruction is executed and the
processor quiesces.
• Limited instruction address compare—The 601 runs at full speed (in parallel)
until the EA of the instruction being decoded matches the EA contained in
HID2. Addresses for branch instructions and floating-point instructions may
never be detected.
• Full instruction address compare mode—Processing proceeds out of IQ0.
When the EA in HID2 matches the EA of the instruction in IQ0, the
appropriate break action is performed. Unlike the limited instruction address
compare mode, all instructions pass through the IQ0 in this mode. That is,
instructions cannot be folded out of the instruction stream.
The following mode is taken when the MSR[SE] bit is set.
• MSR[SE] trace mode—Note that in other PowerPC implementations, the
trace exception is a separate exception with its own vector x'00D00'.
Reserved 02001–03FFF —
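The vector offsets in the table above combine with a base physical address selected by MSR[IP] to give the address of each exception handler. The Python sketch below illustrates the calculation. The offsets are those listed in the table. The two base addresses (x'00000000' for MSR[IP] = 0 and x'FFF00000' for MSR[IP] = 1) are assumed from the PowerPC architecture rather than stated in this section, and the names are hypothetical.

```python
# Vector offsets as listed in Table 5-2 (reserved entries omitted).
VECTOR_OFFSETS = {
    "system_reset": 0x00100,
    "machine_check": 0x00200,
    "data_access": 0x00300,
    "instruction_access": 0x00400,
    "external_interrupt": 0x00500,
    "alignment": 0x00600,
    "program": 0x00700,
    "decrementer": 0x00900,
    "io_controller_interface_error": 0x00A00,
    "system_call": 0x00C00,
    "run_mode_trace": 0x02000,
}

# Assumed base physical addresses selected by MSR[IP].
VECTOR_BASE = {0: 0x00000000, 1: 0xFFF00000}


def vector_address(exception, msr_ip):
    """Return the physical address of the handler for the named exception."""
    return VECTOR_BASE[msr_ip] + VECTOR_OFFSETS[exception]
```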
Asynchronous, 1 System reset—The system reset exception has the highest priority of all exceptions.
imprecise If this exception exists, the exception mechanism ignores all other exceptions and
generates a system reset exception. Instructions issued before the generation of a
system reset exception cannot generate a nonmaskable exception.
Asynchronous, 4 External interrupt—The external interrupt mechanism waits for instructions currently
precise dispatched to complete execution. After all dispatched instructions are executed, and
any exceptions caused by those instructions are handled, the exception mechanism
generates this exception if no higher priority exception exists. This exception is
delayed while MSR[EE] is cleared.
Program Exception
(Illegal/Privileged Instruction)
Precise Mode FP Enabled2 Data Access I/O Cont I/F Data Access
Error
Floating Point1
External Interrupt
Decrementer
1 Not all floating-point instructions can cause enabled exceptions.
2 If the MSR bits FE0 and FE1 are set such that precise mode floating-point enabled exceptions are
enabled and the FPSCR[FEX] bit is set, a program exception results.
3 Floating-point precise exceptions are taken only when either MSR[FE0] or MSR[FE1] is set.
When an exception occurs, SRR0 is set to point to an instruction such that all prior
instructions have completed execution and no subsequent instruction has begun execution.
The instruction addressed by SRR0 may not have completed execution, depending on the
exception type. SRR0 addresses either the instruction causing the exception or the
immediately following instruction. The instruction addressed can be determined from the
exception type bits.
The SRR1 is a 32-bit register used to save machine status on exceptions and to restore
machine status when rfi is executed. The SRR1 is shown in Figure 5-3.
In general, when an exception occurs, bits 0–15 of SRR1 are loaded with exception-specific
information and bits 16–31 of the machine state register (MSR) are placed into bits 16–31
of SRR1. The machine state register is shown in Figure 5-4.
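The general SRR1 loading rule above can be sketched in a few lines. This is an illustrative model only, not a description of the hardware: the function name `build_srr1` and its parameter names are hypothetical, and the sketch assumes PowerPC bit numbering, in which bit 0 is the most significant bit of the 32-bit register, so bits 16–31 are the low-order halfword.

```python
def build_srr1(exception_info: int, msr: int) -> int:
    """Model the general SRR1 value saved when an exception is taken.

    exception_info: the 16 exception-specific bits destined for SRR1[0-15]
                    (hypothetical parameter, passed as a 16-bit value).
    msr:            the 32-bit machine state register at the time of the exception.
    """
    if not 0 <= exception_info <= 0xFFFF:
        raise ValueError("exception_info must fit in 16 bits")
    # SRR1[0-15] (high halfword) <- exception-specific information
    # SRR1[16-31] (low halfword) <- MSR[16-31]
    return ((exception_info << 16) | (msr & 0xFFFF)) & 0xFFFFFFFF


# Example: SRR1[1] set, with an MSR whose low halfword is x'1040'.
print(hex(build_srr1(0x4000, 0x00001040)))
```

Individual exceptions may load SRR1[0–15] differently, so the sketch covers only the general case described in the text.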
[Figure 5-4. Machine state register (MSR): bits 0–15 are reserved; bits 16–31 are individual fields, described in the bit settings that follow.]
0–15 — Reserved
17 PR Privilege level
0 The processor can execute both user and supervisor instructions.
1 The processor can only execute user-level instructions.
18 FP Floating-point available
0 The processor prevents dispatch of floating-point instructions, including floating-point
loads, stores, and moves. Floating-point enabled program exceptions can still occur and
the FPRs can still be accessed.
1 The processor can execute floating-point instructions, and can take floating-point
enabled exception type program exceptions.
25 EP Exception prefix. The setting of this bit specifies whether an exception vector offset is
prepended with Fs or 0s. In the following description, nnnnn is the offset of the exception. See
Table 5-2.
0 Exceptions are vectored to the physical address x'000n_nnnn'.
1 Exceptions are vectored to the physical address x'FFFn_nnnn'.
28–29 — Reserved
*These reserved bits may be used by other PowerPC processors. Attempting to change these bits does
not affect the operation of the processor. These bit positions always return a zero value when read.
MSR bits 16–31 are guaranteed to be written to SRR1 when the first instruction of the
exception handler is encountered.
The data address register (DAR) is a 32-bit register used by several exceptions (data access,
I/O controller interface error, and alignment) to identify the address of a memory element.
Soft reset 0 0 0 — 0 0 0 — 0 0
Machine check 0 0 0 0 0 0 0 — 0 0
Data access 0 0 0 — 0 0 0 — 0 0
Instruction access 0 0 0 — 0 0 0 — 0 0
External 0 0 0 — 0 0 0 — 0 0
Alignment 0 0 0 — 0 0 0 — 0 0
Program 0 0 0 — 0 0 0 — 0 0
Floating-point 0 0 0 — 0 0 0 — 0 0
unavailable
Decrementer 0 0 0 — 0 0 0 — 0 0
System call 0 0 0 — 0 0 0 — 0 0
0 Bit is cleared
1 Bit is set
— Bit is not altered
Reserved bits are read as if written as 0.
The setting of the exception prefix (EP) bit in the MSR determines how exceptions are
vectored. If the bit is cleared, exceptions are vectored to the physical address x'000n_nnnn'
(where nnnnn is the vector offset); if EP is set, exceptions are vectored to the physical
address x'FFFn_nnnn'. Table 5-2 shows the exception vector offset of the first instruction
of the exception handler routine for each exception type.
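As a minimal sketch of the vectoring rule, the following computes the physical address of the first handler instruction from MSR[EP] and a vector offset. The function name `exception_vector` is hypothetical; the offsets used in the example (x'00300' for data access, x'00C00' for system call) are the ones given in this chapter.

```python
def exception_vector(ep: int, offset: int) -> int:
    """Return the physical address of the first instruction of a handler.

    ep:     value of MSR[EP] (0 or 1).
    offset: five-hex-digit vector offset, for example 0x00300.
    """
    if ep not in (0, 1):
        raise ValueError("EP must be 0 or 1")
    if not 0 <= offset <= 0xFFFFF:
        raise ValueError("offset must fit in 20 bits")
    # EP = 0 prepends 0s (x'000n_nnnn'); EP = 1 prepends Fs (x'FFFn_nnnn').
    base = 0xFFF00000 if ep else 0x00000000
    return base | offset


print(hex(exception_vector(0, 0x00300)))  # data access, EP cleared
print(hex(exception_vector(1, 0x00C00)))  # system call, EP set
```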
SRR0 Set to the effective address of the instruction that the processor would have attempted to execute next
if no exception conditions were present.
MSR EE 0 SE 0
PR 0 FE1 0
FP 0 EP —
ME — IT 0
FE0 0 DT 0
When a soft reset exception is taken, instruction execution resumes at offset x'00100' from
the physical base address indicated by MSR[EP].
Before returning to the main program, the exception handler should do the following:
1. Load SRR0 and SRR1 with the values to be used by the rfi instruction.
2. Execute rfi.
It is not guaranteed that execution is recoverable. Other registers and the MSR are not reset
by hardware.
GPRs All 0s
FPRs All 0s
FPSCR 00000000
CR All 0s
SRs All 0s
MSR 00001040
MQ 00000000
XER 00000000
RTCU 00000000
RTCL 00000000³
LR 00000000
CTR 00000000
DSISR 00000000
DAR 00000000
DEC 00000000
SDR1 00000000
SRR0 00000000
SRR1 00000000
SPRGs 00000000
EAR 00000000
PVR 00010001¹
BATs All 0s
HID0 80010080²
HID1 00000000
HID2 00000000
HID5 00000000
HID15 00000000
TLBs All 0s
Cache All 0s
Tag directory All 0s. (However, LRU bits are initialized so each side of the cache has a unique LRU value.)
1 Early releases (DD1) of the 601 hardware set this to x'00010000'. Other versions of silicon may be different
(see Section 2.3.3.11, “Processor Version Register (PVR)” for setting information).
3 Note that if external clock is connected to RTC for the 601, then the RTCL, RTCU, and DEC registers can
change from their initial value of 0s without receiving instructions to load those registers.
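To read the reset value of the MSR listed above (x'00001040'), a small decoder can report which bits are set. This is a sketch under stated assumptions: only the PR, FP, and EP positions (bits 17, 18, and 25) appear in this section; the remaining positions in `MSR_BITS` are assumed from the 601 MSR layout described in Chapter 2, and the function name is hypothetical.

```python
# PowerPC bit numbering: bit 0 is the most significant bit of the register.
# Positions other than PR (17), FP (18), and EP (25) are assumptions taken
# from the 601 MSR layout, not from this section.
MSR_BITS = {
    "EE": 16, "PR": 17, "FP": 18, "ME": 19, "FE0": 20,
    "SE": 21, "FE1": 23, "EP": 25, "IT": 26, "DT": 27,
}


def msr_set_bits(msr: int) -> list:
    """Return the names of the MSR bits that are set, in bit order."""
    return [name for name, bit in MSR_BITS.items() if msr & (1 << (31 - bit))]


# Reset value from the table above.
print(msr_set_bits(0x00001040))
```

Under these assumptions the reset value decodes to ME and EP set, which is consistent with exceptions being vectored to x'FFFn_nnnn' after a hard reset.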
SRR0 Set to the address of the next instruction that would have been executed in the interrupted
instruction stream. Neither this instruction nor any others beyond it will have been executed. All
preceding instructions will have been completed.
MSR EE 0
PR 0
FP 0
ME 0
Note that when a machine check exception is taken, the exception handler should set MSR[ME]
as soon as it is practical to handle another TEA assertion. Otherwise, subsequent TEA assertions
cause the processor to automatically enter the checkstop state.
FE0 0
SE 0
FE1 0
EP Value is not altered
IT 0
DT 0
When a machine check exception is taken, instruction execution resumes at offset x'00200'
from the physical base address indicated by MSR[EP].
Before returning to the main program, the exception handler should do the following:
1. Load SRR0 and SRR1 with the values to be used by the rfi instruction.
2. Execute rfi.
SRR0 Set to the effective address of the instruction that caused the exception.
MSR EE 0 PR 0
FP 0 ME Value is not altered
FE0 0 SE 0
FE1 0 EP Value is not altered
IT 0 DT 0
DSISR 0 Reserved on the 601. The PowerPC architecture uses this bit for I/O controller interface error
exceptions, which are vectored to x'00A00' on the 601.
1 Set if the translation of an attempted access is not found in the primary hash table entry group
(HTEG), or in the rehashed secondary HTEG, or in the range of a BAT register; otherwise
cleared.
2–3 Cleared
4 Set if a memory access is not permitted by the page or BAT protection mechanism; otherwise
cleared.
5 Set if the eciwx, ecowx, lwarx, stwcx., or lscbx instruction is attempted to I/O controller
interface space, or if the lwarx or stwcx. instruction is used with addresses that are marked as
write-through.
6 Set for a store operation and cleared for a load operation.
7–8 Cleared
9 Set if an EA matches the address in the DABR while in one of the three compare modes.
10 Cleared.
11 Set if the instruction was an eciwx or ecowx and EAR[E] = 0.
12–31 Cleared
DAR Set to the effective address of a memory element as described in the following list:
• A byte in the first word accessed in the page that caused the data access exception, for a byte, half
word, or word memory access.
• A byte in the first double word accessed in the page that caused the data access exception, for a
double-word memory access.
When a data access exception is taken, instruction execution resumes at offset x'00300'
from the physical base address indicated by MSR[EP].
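A data access exception handler typically starts by examining the DSISR. The following sketch decodes the DSISR bits listed above into short descriptions. The table and function names are hypothetical, the descriptions are abbreviated from the bit definitions, and bit 0 is taken to be the most significant bit.

```python
# DSISR bits defined for the data access exception (bit 0 is the MSB).
DSISR_FLAGS = {
    1: "translation not found in HTEGs or BAT range",
    4: "access not permitted by page or BAT protection",
    5: "instruction not supported for I/O controller interface or write-through space",
    6: "store operation",
    9: "EA matched the DABR",
    11: "eciwx or ecowx with EAR[E] = 0",
}


def decode_dsisr(dsisr: int) -> list:
    """Return descriptions of the DSISR bits that are set, in bit order."""
    return [
        text
        for bit, text in sorted(DSISR_FLAGS.items())
        if dsisr & (1 << (31 - bit))
    ]


# Example: a store that missed in the page tables (bits 1 and 6 set).
for line in decode_dsisr(0x42000000):
    print(line)
```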
The architecture permits certain instructions to be partially executed when they cause a data
access exception. These are as follows:
• Load multiple or load string instructions—Some registers in the range of registers to
be loaded may have been loaded. On the 601, all of the first page is accessed and
none of the second page is accessed.
• Store multiple or store string instructions—Some bytes of memory in the range
addressed may have been updated. On the 601, all of the first page is accessed and
none of the second page is accessed.
In the cases above, the number of registers and the amount of memory altered are
instruction- and boundary-dependent. However, memory protection is not violated.
Furthermore, if some of the data accessed is in memory-forced I/O controller interface
space (SR[T] = 1) and BUID = x'7F', and the instruction is not supported for I/O controller
interface accesses, the locations in I/O controller interface space are not accessed.
For update forms, the update register (rA) is not altered.
SRR0 Set to the effective address of the instruction that the processor would have attempted to execute next
if no exception conditions were present (if the exception occurs on attempting to fetch a branch target,
SRR0 is set to the branch target address).
SRR1 0 Cleared
1 Set if the translation of an attempted access is not found in the primary hash table entry group
(HTEG), or in the rehashed secondary HTEG, or in the range of a BAT register; otherwise
cleared.
2 Cleared
3 Cleared. Note that the PowerPC architecture defines this as set if the fetch access was to an I/O
controller interface segment (SR[T]=1). Note that this condition causes SRR1[0–15] to be
cleared in the 601.
4 Set if a memory access is not permitted by the page or BAT protection mechanism, described in
Chapter 6, “Memory Management Unit”; otherwise cleared.
5–9 Cleared
10 Set if the page table search fails to find a translation for the effective address; otherwise cleared.
11–15 Cleared
16–31 Loaded from bits 16–31 of the MSR
MSR EE 0 SE 0
PR 0 FE1 0
FP 0 EP Value is not altered
ME Value is not altered IT 0
FE0 0 DT 0
SRR0 Set to the effective address of the instruction that the processor would have attempted to execute next
if no interrupt conditions were present.
MSR EE 0 SE 0
PR 0 FE1 0
FP 0 EP Value is not altered
ME Value is not altered IT 0
FE0 0 DT 0
SRR0 Set to the effective address of the instruction that caused the exception.
MSR EE 0 SE 0
PR 0 FE1 0
FP 0 EP Value is not altered
ME Value is not altered IT 0
FE0 0 DT 0
DAR Set to the EA of the data access as computed by the instruction causing the
alignment exception.
00 0 0010 stw
00 0 0100 lhz
00 0 0101 lha
00 0 0110 sth
00 0 0111 lmw
00 0 1000 lfs
00 0 1001 lfd
00 0 1010 stfs
00 0 1011 stfd
00 1 0000 lwzu
00 1 0010 stwu
00 1 0100 lhzu
00 1 0101 lhau
00 1 0110 sthu
00 1 0111 stmw
00 1 1000 lfsu
00 1 1001 lfdu
00 1 1010 stfsu
00 1 1011 stfdu
01 0 0101 lwax
01 0 1000 lswx
01 0 1001 lswi
01 0 1010 stswx
01 0 1011 stswi
10 0 0010 stwcx.
10 0 1000 lwbrx
10 0 1010 stwbrx
10 0 1100 lhbrx
10 0 1110 sthbrx
10 1 1111 dcbz
11 0 0000 lwzx
11 0 0010 stwx
11 0 0100 lhzx
11 0 0101 lhax
11 0 0110 sthx
11 0 1000 lfsx
11 0 1001 lfdx
11 0 1010 stfsx
11 0 1011 stfdx
11 1 0000 lwzux
11 1 0010 stwux
11 1 0100 lhzux
11 1 0101 lhaux
11 1 0110 sthux
11 1 1000 lfsux
11 1 1001 lfdux
11 1 1010 stfsux
11 1 1011 stfdux
1 The instructions lwz and lwarx give the same DSISR bits (all zero). But if lwarx causes an alignment
exception, it is an invalid form, so it need not be emulated in any precise way. It is adequate for the
alignment exception handler to simply emulate the instruction as if it were an lwz. It is important that the
emulator use the address in the DAR, rather than computing it from rA/rB/D, because lwz and lwarx use
different addressing modes.
MSR EE 0 PR 0
FP 0 ME Value is not altered
FE0 0 SE 0
FE1 0 EP Value is not altered
IT 0 DT 0
When a program exception is taken, instruction execution resumes at offset x'00700' from
the physical base address indicated by MSR[EP].
FPSCR
Reserved
VXIDI VXZDZ VXSOFT
VXISI VXIMZ VXSQRT
VXSNAN VXVC VXCVI
FX FEX VX OX UX ZX XX FR FI FPRF VE OE UE ZE XE RN
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 19 20 21 22 23 24 25 26 27 28 29 30 31
0 Floating-point exception summary (FX). Every floating-point instruction implicitly sets FPSCR[FX] if that
instruction causes any of the floating-point exception bits in the FPSCR to transition from 0 to 1. The
mcrfs instruction implicitly clears FPSCR[FX] if the FPSCR field containing FPSCR[FX] is copied. The
mtfsf, mtfsfi, mtfsb0, and mtfsb1 instructions can set or clear FPSCR[FX] explicitly. This is a sticky bit.
1 Floating-point enabled exception summary (FEX). This bit signals the occurrence of any of the enabled
exception conditions. It is the logical OR of all the floating-point exception bits masked with their
respective enables. The mcrfs instruction implicitly clears FPSCR[FEX] if the result of the logical OR
described above becomes zero. The mtfsf, mtfsfi, mtfsb0, and mtfsb1 instructions cannot set or clear
FPSCR[FEX] explicitly. This is not a sticky bit.
2 Floating-point invalid operation exception summary (VX). This bit signals the occurrence of any invalid
operation exception. It is the logical OR of all of the invalid operation exceptions. The mcrfs implicitly
clears FPSCR[VX] if the result of the logical OR described above becomes zero. The mtfsf, mtfsfi,
mtfsb0, and mtfsb1 instructions cannot set or clear FPSCR[VX] explicitly. This is not a sticky bit.
7 Floating-point invalid operation exception for SNaN (VXSNAN). This is a sticky bit.
8 Floating-point invalid operation exception for ∞-∞ (VXISI). This is a sticky bit.
9 Floating-point invalid operation exception for ∞/∞ (VXIDI). This is a sticky bit.
10 Floating-point invalid operation exception for 0/0 (VXZDZ). This is a sticky bit.
11 Floating-point invalid operation exception for ∞*0 (VXIMZ). This is a sticky bit.
12 Floating-point invalid operation exception for invalid compare (VXVC). This is a sticky bit.
13 Floating-point fraction rounded (FR). The last floating-point instruction that potentially rounded the
intermediate result incremented the fraction.
14 Floating-point fraction inexact (FI). The last floating-point instruction that potentially rounded the
intermediate result produced an inexact fraction or a disabled exponent overflow.
15–19 Floating-point result flags (FPRF). This field is based on the value placed into the target register even if
that value is undefined. Refer to Table 2-2 for specific bit settings.
15 Floating-point result class descriptor (C). Floating-point instructions other than the compare
instructions may set this bit with the FPCC bits, to indicate the class of the result.
16–19 Floating-point condition code (FPCC). Floating-point compare instructions always set one of
the FPCC bits to one and the other three FPCC bits to zero. Other floating-point instructions
may set the FPCC bits with the C bit, to indicate the class of the result. Note that in this case the
high-order three bits of the FPCC retain their relational significance indicating that the value is
less than, greater than, or equal to zero.
16 Floating-point less than or negative (FL or <)
17 Floating-point greater than or positive (FG or >)
18 Floating-point equal or zero (FE or =)
19 Floating-point unordered or NaN (FU or ?)
20 Reserved
21 Floating-point invalid operation exception for software request (VXSOFT). This bit can be altered only by
the mcrfs, mtfsfi, mtfsf, mtfsb0, or mtfsb1 instructions. The purpose of VXSOFT is to allow software to
cause an invalid operation condition for a condition that is not necessarily associated with the execution of
a floating-point instruction. For example, it might be set by a program that computes a square root if the
source operand is negative. This is a sticky bit.
22 Floating-point invalid operation exception for invalid square root (VXSQRT). This is a sticky bit. It
allows software to simulate fsqrt and frsqrte and provides a consistent interface for handling
exceptions caused by square-root operations.
23 Floating-point invalid operation exception for invalid integer convert (VXCVI). This is a sticky bit. See
Section 5.4.7.2, “Invalid Operation Exception Conditions.”
26 Floating-point underflow exception enable (UE). This bit should not be used to determine whether
denormalization should be performed on floating-point stores.
29 Reserved. This bit may be implemented as the non-IEEE mode bit (NI) in other PowerPC implementations.
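The two non-sticky summary bits, VX and FEX, are defined above as logical ORs of other FPSCR bits. The sketch below recomputes them from a 32-bit FPSCR image. It is illustrative only: the helper names are hypothetical, and the enable positions for VE, OE, ZE, and XE (bits 24, 25, 27, and 28) are read from the bit-position row of the FPSCR figure rather than from the bit descriptions, which state only UE (bit 26) explicitly.

```python
def bit(value: int, n: int) -> int:
    """Return bit n of a 32-bit value, PowerPC numbering (bit 0 is the MSB)."""
    return (value >> (31 - n)) & 1


# Invalid operation exception bits: VXSNAN, VXISI, VXIDI, VXZDZ, VXIMZ,
# VXVC, VXSOFT, VXSQRT, VXCVI.
VX_SOURCES = (7, 8, 9, 10, 11, 12, 21, 22, 23)
# Status bit positions for OX, UX, ZX, XX.
STATUS = {"OX": 3, "UX": 4, "ZX": 5, "XX": 6}
# Enable bit positions VE, OE, UE, ZE, XE (assumed from the figure row).
ENABLES = {"VX": 24, "OX": 25, "UX": 26, "ZX": 27, "XX": 28}


def summary_bits(fpscr: int) -> tuple:
    """Return (VX, FEX) recomputed from the other FPSCR bits."""
    vx = int(any(bit(fpscr, n) for n in VX_SOURCES))
    status = {"VX": vx}
    status.update({name: bit(fpscr, pos) for name, pos in STATUS.items()})
    # FEX is the OR of each exception bit masked with its enable.
    fex = int(any(status[name] and bit(fpscr, ENABLES[name]) for name in ENABLES))
    return vx, fex


# VXZDZ (bit 10) set with VE (bit 24) enabled.
print(summary_bits(0x00200080))
```

Because both bits are derived, software cannot set or clear them directly; changing the source or enable bits is the only way to change the result.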
The following conditions that can cause program exceptions are detected by the processor.
These conditions may occur during execution of floating-point arithmetic instructions. The
corresponding bits set in the FPSCR are indicated in parentheses.
• Invalid floating-point operation exception condition (VX)
— SNaN condition (VXSNAN)
— Infinity–infinity condition (VXISI)
— Infinity/infinity condition (VXIDI)
— Zero/zero condition (VXZDZ)
— Infinity*zero condition (VXIMZ)
— Illegal compare condition (VXVC)
These exception conditions are described in Section 5.4.7.2, “Invalid Operation
Exception Conditions.”
• Software request condition (VXSOFT). These exception conditions are described in
Section 5.4.7.2, “Invalid Operation Exception Conditions.”
• Illegal integer convert condition (VXCVI). These exception conditions are
described in Section 5.4.7.2, “Invalid Operation Exception Conditions.”
0 0 Ignore exceptions mode—Floating-point exceptions do not cause the program exception error
handler to be invoked.
0 1 Imprecise nonrecoverable mode—This mode is not applicable to the 601. FE0 and FE1 are ORed, so
setting either bit results in running the processor in precise mode. Note that in PowerPC processors
that support this mode, the system floating-point enabled exception error handler is invoked at some
point at or beyond the instruction that caused the enabled exception. The state of the processor may
include conditions and data affected by the exception (that is, hazards are not avoided). It may not be
possible to identify the excepting instruction or the data that caused the exception (that is, the data is
not recoverable).
1 0 Imprecise recoverable mode—This mode is not applicable to the 601. FE0 and FE1 are ORed, so
setting either bit results in running the processor in precise mode. Note that in PowerPC processors
that support this mode, the system floating-point enabled exception error handler is invoked at some
point at or beyond the instruction that caused the enabled exception. Sufficient information is
provided to the system floating-point enabled exception error handler that it can identify the excepting
instruction and the operands, and correct the result. All hazards caused by the exception are avoided
(for example, use of the data that would have been produced by the excepting instruction).
1 1 Precise mode—The system floating-point enabled exception error handler is invoked precisely at the
instruction that caused the enabled exception.
Note that in the 601, FE0 and FE1 are ORed; therefore, unless both FE0 and FE1 are
cleared, the 601 operates in precise mode. Whether a floating-point result is stored, and what
value is stored, are determined by the FPSCR exception enable bits, as described in
subsequent sections, and are not affected by any MSR bit settings.
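The difference between the architected FE0/FE1 encodings and the 601 behavior can be summarized in a short sketch. The function and table names are hypothetical; the mode names come from the table above.

```python
# Architected floating-point exception modes, keyed by (FE0, FE1).
ARCHITECTED_MODES = {
    (0, 0): "ignore exceptions",
    (0, 1): "imprecise nonrecoverable",
    (1, 0): "imprecise recoverable",
    (1, 1): "precise",
}


def fp_exception_mode_601(fe0: int, fe1: int) -> str:
    """Return the mode the 601 actually runs in: FE0 and FE1 are ORed."""
    return "precise" if (fe0 | fe1) else "ignore exceptions"


for (fe0, fe1), architected in ARCHITECTED_MODES.items():
    print(fe0, fe1, architected, "->", fp_exception_mode_601(fe0, fe1))
```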
Whenever the system floating-point enabled exception error handler is invoked, the
microprocessor ensures that all instructions logically residing before the excepting
instruction have completed, and no instruction after that instruction has been executed.
If exceptions are ignored, an FPSCR instruction can be used to force any exceptions, due
to instructions initiated before the FPSCR instruction, to be recorded in the FPSCR. A sync
instruction can also be used to force exceptions, but is likely to degrade performance more
than an FPSCR instruction.
SRR0 Set to the effective address of the instruction that caused the exception.
MSR EE 0 SE 0
PR 0 FE1 0
FP 0 EP Value is not altered
ME Value is not altered IT 0
FE0 0 DT 0
SRR0 Set to the effective address of the instruction that the processor would have attempted to execute next
if no exception conditions were present.
MSR EE 0 SE 0
PR 0 FE1 0
FP 0 EP Value is not altered
ME Value is not altered IT 0
FE0 0 DT 0
SRR0 Set to the effective address of the instruction following the load or store that caused the exception.
This and subsequent instructions have not been executed.
MSR EE 0 SE 0
PR 0 FE1 0
FP 0 EP Value is not altered
ME Value is not altered IT 0
FE0 0 DT 0
DSISR Unchanged.
DAR For scalar (nonmultiple, nonstring) loads and stores, the DAR points to the first byte of the operand,
regardless of the alignment. For multiple and string loads and stores, the DAR points to the first byte in
the last word.
GPRs • On update form loads/stores, rA contains the updated EA. If rA=0, then R0 is not updated. If
rA = rD, the register gets the target data instead of the updated EA.
• On simple loads, the target register rD will have been updated with the word (or bytes) received
from the I/O controller interface operation.
• On simple stores the source register rS will have been sent to the I/O controller as store data.
Whether the store actually occurred at the I/O device depends on the controller implementation
and perhaps the specific type of error detected.
• On load multiples and load strings, all of the target registers will have been updated with data
received from the I/O controller. However, the addressing registers are not updated if they are in
the range of target registers specified by the instruction.
• On store multiples and store strings, all source registers will have been presented to the I/O
controller via normal extended transfer bus protocols.
When an I/O controller interface error exception is taken, instruction execution resumes at
offset x'00A00' from the physical base address indicated by MSR[EP].
SRR0 Set to the effective address of the instruction following the sc instruction
MSR EE 0 SE 0
PR 0 FE1 0
FP 0 EP Value is not altered
ME Value is not altered IT 0
FE0 0 DT 0
When a system call exception is taken, instruction execution resumes at offset x'00C00'
from the physical base address indicated by MSR[EP].
SRR0 Set to the address of the instruction that causes the run mode exception
MSR EE 0 SE 0
PR 0 FE1 0
FP 0 EP Value is not altered
ME Value is not altered IT 0
FE0 0 DT 0
The run mode is determined by the settings of HID1[1–3]. These settings are defined in
Table 5-24.
Table 5-24. Run Mode Exception Actions
HID1[1–3] Setting  Mode  Description
000 Normal Run Mode No address break points are specified and the 601 processes zero to three
instructions per cycle.
010 Limited Instruction The 601 runs at full speed until the EA of the instruction in the lowest position
Address Compare in the instruction queue (IQ0) matches the one specified in HID2. At this point
Mode the appropriate break action is performed. This is a limited compare in that
branches and floating-point operations and the addresses associated with
them may never be detected.
100 Single Instruction If you clear HID1[1:3] and set HID1[8:9] to 10, the processor branches to
Step Mode offset x'02000' and enters an infinite loop, executing the instruction at
x'02000'.
Unless the user needs this mode specifically, the trace exception should be
used.
110 Full Instruction In full instruction address compare mode, processing proceeds out of IQ0.
Address Compare When the EA in HID2 matches the EA of the instruction in IQ0, the
Mode appropriate break action is performed. Unlike the limited instruction address
compare mode, all instructions pass through the IQ0 in this mode. That is,
instructions cannot be folded out of the instruction stream.
111 Full Branch Target This mode is similar to full instruction address compare mode except that the
Address Compare branch target is compared against HID2. When addresses match, the
Mode appropriate break action is taken. This allows the programmer to see how a
program got to an address. This mode can be used with b, bc, bcr, and bcc
instructions.
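The encodings in Table 5-24 can be looked up from a 32-bit HID1 image as sketched below. The names `RUN_MODES` and `run_mode` are hypothetical, and the sketch assumes bit 0 is the most significant bit, so HID1[1–3] occupies the three bits immediately below it. Encodings not listed in the table are reported as such rather than guessed.

```python
# HID1[1-3] settings from Table 5-24.
RUN_MODES = {
    0b000: "normal run mode",
    0b010: "limited instruction address compare mode",
    0b100: "single instruction step mode",
    0b110: "full instruction address compare mode",
    0b111: "full branch target address compare mode",
}


def run_mode(hid1: int) -> str:
    """Return the run mode selected by HID1[1-3] (bit 0 is the MSB)."""
    field = (hid1 >> 28) & 0b111  # bits 1-3 of the 32-bit register
    return RUN_MODES.get(field, "not listed in Table 5-24")


print(run_mode(0x60000000))  # HID1[1-3] = 110
```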
SRR0 Set to the address of the next instruction to be executed in the program for which the trace
exception was generated.
MSR EE 0 SE 0
PR 0 FE1 0
FP 0 EP Value is not altered
ME Value is not altered IT 0
FE0 0 DT 0
When a run mode or trace exception is taken, instruction execution resumes at offset
x'02000' from the base address indicated by MSR[EP].
This chapter describes the PowerPC 601 microprocessor’s memory management unit
(MMU). The primary functions of the MMU are to translate logical (effective) addresses to
physical addresses for memory accesses, I/O accesses (most I/O accesses are assumed to
be memory-mapped), and I/O controller interface accesses, and to provide access
protection on a block or page basis.
There are three types of accesses generated by the 601 that require address translation:
instruction accesses, data accesses to memory generated by load and store instructions, and
I/O controller interface accesses generated by load and store instructions.
The 601 MMU provides 4 Gbytes of logical address space accessible to supervisor and user
programs with a 4-Kbyte page size and 256-Mbyte segment size. Block sizes range from
128 Kbyte to 8 Mbyte and are software selectable. In addition, the 601 uses an interim
52-bit virtual address and hashed page tables in the generation of 32-bit physical addresses.
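The sizes quoted above imply a fixed split of the 32-bit logical address: 4 bits select one of sixteen 256-Mbyte segments, 16 bits index a 4-Kbyte page within the segment, and 12 bits give the byte offset. The sketch below shows that arithmetic. The function names are hypothetical, and the 24-bit virtual segment ID (VSID) is an assumption here: it is what makes 24 + 16 + 12 = 52 bits, but the segment register format is described elsewhere in the manual.

```python
def split_logical_address(ea: int) -> tuple:
    """Split a 32-bit logical address into (segment, page index, byte offset)."""
    if not 0 <= ea <= 0xFFFFFFFF:
        raise ValueError("logical address must fit in 32 bits")
    segment = ea >> 28               # 4 bits: selects one of 16 segment registers
    page_index = (ea >> 12) & 0xFFFF  # 16 bits: page within the 256-Mbyte segment
    offset = ea & 0xFFF               # 12 bits: byte within the 4-Kbyte page
    return segment, page_index, offset


def virtual_address(vsid: int, ea: int) -> int:
    """Form the interim 52-bit virtual address from an assumed 24-bit VSID."""
    return ((vsid & 0xFFFFFF) << 28) | (ea & 0x0FFFFFFF)


seg, page, off = split_logical_address(0x12345678)
print(seg, hex(page), hex(off))
print(hex(virtual_address(0xABCDEF, 0x12345678)))
```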
The MMU contains three translation lookaside buffers (TLBs). There is a 256-entry,
two-way set-associative unified (instruction and data address) TLB (UTLB) for storing
recently used address translations, and a four-entry fully-associative first-level instruction TLB
(ITLB) that is used only by instruction accesses for storing recently used instruction
address translations. Additionally, there is a four-entry block address translation array (BAT
array) that stores the available block address translations (for instruction or data addresses).
BAT array entries are implemented as the block address translation (BAT) registers that are
accessible as supervisor special-purpose registers (SPRs). UTLB entries are generated
automatically by the 601 hardware via a search of the page tables in memory. The 601
maintains all the segment information on-chip in 16 supervisor-level segment registers.
This chapter describes the MMU address translation mechanisms, the MMU conditions
that cause 601 exceptions, the instructions used to program the MMU, and the
corresponding registers.
The 601 MMU relies on the exception processing mechanism for the implementation of the
paged virtual memory environment and for enforcing protection of designated memory
areas. Exception processing is described in Chapter 5, “Exceptions.” Section 2.3.1,
“Machine State Register (MSR),” describes the MSR of the 601, which controls some of
the critical functionality of the MMU.
[Figure: MMU block diagram. Labels recoverable from the original: integer unit, BPU, FPU; logical address bits LA0–LA19 and A20–A31; ITLB and ITLB miss; BAT array; UTLB; cache tags; compare; MMU table search logic; SDR1 (SPR25); cache hit/miss; physical address bits PA0–PA19 and PA0–PA31.]
[Figure: address translation types. Labels recoverable from the original: virtual address (bits 0–51); I/O controller interface translation; 32-bit I/O controller interface address; 32-bit physical address.]
I/O controller interface address translation is enabled when the I/O controller interface
translation control bit (T-bit) in the selected segment register (segment register selected by
the highest-order address bits) is set. In this case, the remaining information in the segment
register is interpreted as identifier information that is used with the remaining logical
address bits to generate the packets used in an I/O controller interface access on the external
interface; additionally, no UTLB lookup or page table search is performed and the BAT
array lookup results are ignored. For more information about the I/O controller interface
operations, see Section 9.6, “Memory- vs. I/O-Mapped I/O Operations.”
A special case of I/O controller interface address translation (not shown in Figure 6-2) is
supported that forces an I/O controller interface address translation to be interpreted as a
memory access (that is, it uses the usual memory access protocol rather than the I/O
Both user/supervisor √ √ √ √
Each of these options is enforced at the block or page level. Thus, the supervisor-only
option allows only read and write operations generated while the 601 is operating in
supervisor mode (corresponding to MSR[PR] = 0) to use the selected address translation
(block or page). User accesses that map into these blocks or pages cause an exception to be
taken.
As shown in the table, the supervisor-write-only option allows both user and supervisor
accesses to read from the selected area of memory but only supervisor programs can update
(write to) that area. There is also an option that allows both supervisor and user programs
read and write access (both user/supervisor option), and finally, there is an option to
[Figure: block/page address translation flow. Labels recoverable from the original: compare address with BAT array (BAT registers); perform table search operation; load UTLB entry; access permitted (continue access to cache); access protected (access faulted).]
If the UTLB misses, the 601 automatically searches the page tables in memory. If the page
table entry (PTE) is successfully read, a new UTLB entry (and an ITLB entry for the
instruction access case) is created and the page translation is once again attempted. This
time, the UTLB (and ITLB for instruction access case) is guaranteed to hit. If the PTE is
not found by the table search operation, an instruction access or data access exception is
generated.
0 0 0
0 0 1
1 0 0
1 0 1
0 1 0
0 1 1
Condition                          Description                                          Exception
Page fault                         No matching PTE found in page tables                 I access: instruction access exception, SRR1[1] = 1
Instruction access to I/O          Attempt to fetch instruction when SR[T] = 1,         Instruction access exception; causes no SRR1 bits
controller interface space         SR[BUID] ≠ x'07F'                                    to be set*
lwarx, stwcx., lscbx instruction   Reservation instruction or load string and compare   Data access exception, DSISR[5] = 1
to I/O controller interface space  byte instruction when SR[T] = 1, SR[BUID] ≠ x'07F'
Instruction breakpoint match       Instruction address matches the address in HID2      Run mode exception
Data breakpoint match              Data address matches the address in HID5             Data access exception, DSISR[9] = 1
* This is only true for the 601; other PowerPC processors will set SRR1[3] for this case.
Table 6-5 summarizes the registers that the operating system uses to program the MMU.
These registers are accessible to supervisor-level software only. These registers are
described in detail in Chapter 2, “Registers and Data Types.”
Table 6-5. MMU Registers
Register | Description
Segment registers (SR0–SR15) | The sixteen 32-bit segment registers are present only in 32-bit implementations of the PowerPC architecture. Figure 6-13 shows the format of a segment register. The fields in the segment register are interpreted differently depending on the value of bit 0. The segment registers are accessed by the mtsr, mtsrin, mfsr, and mfsrin instructions.
BAT registers (BAT0U–BAT3U and BAT0L–BAT3L) | The 601 includes eight block-address translation registers (BATs), organized as four pairs (BAT0U–BAT3U and BAT0L–BAT3L). Figure 6-6 and Figure 6-7 show the format of the upper and lower BAT registers. These are special-purpose registers that are accessed by the mtspr and mfspr instructions.
Table search description register 1 (SDR1) | The 32-bit table search description register 1 (SDR1) specifies the variables used in accessing the page tables in memory. This is a special-purpose register that is accessed by the mtspr and mfspr instructions.
If the system software maps the same physical page with multiple page table entries that
have different W, I, or M values, the results of the translation are undefined.
Key (1) | PP (2) | Access
0 | 00 | Read/write
0 | 01 | Read/write
0 | 10 | Read/write
0 | 11 | Read only
1 | 00 | No access
1 | 01 | Read only
1 | 10 | Read/write
1 | 11 | Read only
(1) Ks or Ku selected by state of MSR[PR]
(2) PP protection option bits in BAT array entry or PTE
Table 6-8 lists the conditions that cause a protection violation. Any access (read or
write) attempted when key = 1 and PP = 00 results in a protection violation exception
condition. When key = 1 and PP = 01, an attempt to perform a write access causes
a protection violation exception condition. When PP = 10, all accesses are allowed, and
when PP = 11, write accesses always cause an exception. The 601 takes either the
instruction access exception or the data access exception (for an instruction or data access,
respectively) when there is an attempt to violate the memory protection.
Table 6-8. Exception Conditions for Key and PP Combinations
Key | PP | Prohibited Accesses
1 | 00 | Read/write
1 | 01 | Write
x | 10 | None
x | 11 | Write
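The key selection and the key/PP rules above can be sketched in a few lines. This is a minimal illustration and not a description of the 601 hardware; the function names `select_key` and `access_allowed` are hypothetical.

```python
def select_key(ks: int, ku: int, msr_pr: int) -> int:
    """Key is Ks in supervisor mode (MSR[PR] = 0) and Ku in user mode (MSR[PR] = 1)."""
    return ku if msr_pr else ks


def access_allowed(key: int, pp: int, is_write: bool) -> bool:
    """Return True if the access is permitted for this key and 2-bit PP value."""
    if pp == 0b10:            # PP = 10: all accesses allowed
        return True
    if pp == 0b11:            # PP = 11: writes always prohibited
        return not is_write
    if key == 0:              # key = 0 with PP = 00 or 01: read/write
        return True
    if pp == 0b00:            # key = 1, PP = 00: no access
        return False
    return not is_write       # key = 1, PP = 01: read only
```

With Ks = 0 and Ku = 1 the key simply follows MSR[PR], so a page with PP = 00 is accessible to supervisor software only.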
Although any combination of the Ks, Ku and PP bits is allowed, the Ks and Ku bits can be
programmed so that the value of the key bit for Table 6-7 directly matches the MSR[PR]
bit for the access. In this case, the encoding of Ks = 0 and Ku = 1 is used for the BAT array
entry or the PTE, and the PP bits then enforce the protection options shown in Table 6-9.
10 Both user/supervisor √ √ √ √
However, if the setting Ks = 1 is used, supervisor accesses are treated as user reads and
writes with respect to Table 6-9. Likewise, if the setting Ku = 0 is used, user accesses to the
block or page are treated as supervisor accesses in relation to Table 6-9. Therefore, by
modifying one of the key bits (in either the BAT register or the segment register), the way
the 601 interprets accesses (supervisor or user) in a particular block or segment can easily
be changed. Note, however, that only supervisor programs can modify the key bits for the
block or the segment as access to the BAT registers and the segment registers is privileged.
When the memory protection mechanism prohibits a reference, one of the following occurs,
depending on the type of access that was attempted:
• For data accesses, a data access exception is generated and bit 4 of DSISR is set. If
the access is a store, bit 6 of DSISR is also set.
• For instruction accesses, an instruction access exception is generated and bit 4 of
SRR1 is set.
See Chapter 5, “Exceptions,” for more information about these exceptions.
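Because PowerPC registers number bit 0 as the most-significant bit, the status bits named above do not map to the masks a programmer might first expect. The sketch below shows the conversion; the helper names are hypothetical, and the returned mask covers only the status bits discussed here.

```python
def ppc_bit_mask(bit: int, width: int = 32) -> int:
    """Mask for a bit numbered PowerPC-style (bit 0 is the most significant)."""
    if not 0 <= bit < width:
        raise ValueError("bit out of range")
    return 1 << (width - 1 - bit)


def protection_violation_status(is_instruction: bool, is_store: bool = False):
    """Return (exception, register, mask) for an access the protection check prohibits."""
    if is_instruction:
        return ("instruction access", "SRR1", ppc_bit_mask(4))
    mask = ppc_bit_mask(4)            # DSISR[4]: protection violation
    if is_store:
        mask |= ppc_bit_mask(6)       # DSISR[6]: access was a store
    return ("data access", "DSISR", mask)
```

For example, a prohibited store yields a DSISR mask of 0x0A000000.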
[Figure: BAT array compare logic, showing BAT3U and BAT3L (SPR535).]
Each pair of BAT registers defines the starting address of a block in the logical address
space, the size of the block, and the start of the corresponding block in physical address
space. If a logical address is within the range defined by a pair of BAT registers, its physical
address is defined as the starting physical address of the block plus the lower order logical
address bits.
Blocks are restricted to a finite set of sizes, from 128 Kbytes (2^17 bytes) to 8 Mbytes (2^23
bytes). The starting address of a block in both logical address space and physical address
space is defined as a multiple of the block size.
Additionally, a block can be defined to overlay part of a segment such that the block portion
is non-paged although the rest of the segment is pageable. This allows non-paged areas to
be specified within a segment, and PTEs for the part of the segment overlaid by the block
are not required.
[Figure: upper BAT register format. BLPI (bits 0–14), reserved zeros (bits 15–24), WIM (bits 25–27), Ks (bit 28), Ku (bit 29), PP (bits 30–31).]
The BAT registers contain the logical to physical address mappings for blocks of memory.
This mapping information includes the logical address bits that are compared with the
logical address of the access, the memory/cache access mode bits (WIM) and the protection
bits for the block. In addition, the size of the block and the starting address of the block are
defined by the block page number and block size mask fields.
Table 6-11 describes the bits in the upper and lower BAT registers.
Register | Bits | Name | Description
Upper BAT registers | 0–14 | BLPI | Block logical page index. This field is compared with bits 0–14 of the logical address to determine if there is a hit in that BAT array entry.
Upper BAT registers | 15–24 | - | Reserved
Upper BAT registers | 28 | Ks | Supervisor mode key. This bit interacts with MSR[PR] and the PP field to determine the protection for the block. For more information, see Section 6.4, “General Memory Protection Mechanism.”
Upper BAT registers | 29 | Ku | User mode key. This bit also interacts with MSR[PR] and the PP field to determine the protection for the block. For more information, see Section 6.4, “General Memory Protection Mechanism.”
Upper BAT registers | 30–31 | PP | Protection bits for block. This field interacts with MSR[PR] and the Ks or Ku to determine the protection for the block as described in Section 6.4, “General Memory Protection Mechanism.”
Lower BAT registers | 0–14 | PBN | Physical block number. This field is used in conjunction with the BSM field to generate bits 0–14 of the physical address of the block.
Lower BAT registers | 15–24 | - | Reserved
Lower BAT registers | 26–31 | BSM | Block size mask (0...5). BSM is a mask that encodes the size of the block. Values for this field are listed in Table 6-12.
The BSM field in the lower BAT register is a mask that encodes the size of the block.
Table 6-12 defines the bit encodings for the BSM field of the lower BAT register. Note that
the range of block sizes is a subset of that defined by the PowerPC architecture.
Table 6-12. Lower BAT Register Block Size Mask Encodings
Block Size BSM Encoding
1 Mbyte 00 0111
2 Mbytes 00 1111
4 Mbytes 01 1111
8 Mbytes 11 1111
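The encodings listed in Table 6-12 follow a simple pattern: the 6-bit mask equals the block size divided by 128 Kbytes, minus one. The sketch below uses that pattern and then shows one plausible reading of how BLPI, PBN, and BSM combine during block translation. The function names are hypothetical, and the masking of PBN by the inverted BSM is an assumption (block-aligned values make it a no-op), not a statement of the hardware logic.

```python
MIN_BLOCK = 1 << 17   # 128 Kbytes, the smallest block size


def bsm_for_size(size_bytes: int) -> int:
    """6-bit block size mask for a power-of-two block of 128 Kbytes to 8 Mbytes."""
    n = size_bytes // MIN_BLOCK
    if size_bytes % MIN_BLOCK or n & (n - 1) or not 1 <= n <= 64:
        raise ValueError("unsupported block size")
    return n - 1


def bat_translate(la: int, blpi: int, pbn: int, bsm: int):
    """Return the physical address for a BAT hit, or None on a miss.

    blpi and pbn are the 15-bit fields (address bits 0-14); bsm is 6 bits.
    """
    la_hi = la >> 17                       # logical address bits 0-14
    keep = ~bsm & 0x7FFF                   # bits that must match BLPI
    if (la_hi & keep) != (blpi & keep):
        return None
    pa_hi = (pbn & keep) | (la_hi & bsm)   # masked LA bits ORed into the PBN
    return (pa_hi << 17) | (la & 0x1FFFF)  # low 17 bits pass through unchanged
```

For a 1-Mbyte block mapping logical 0x00300000 to physical 0x01000000, `bat_translate(0x00345678, 0x18, 0x80, 7)` gives 0x01045678.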
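[Figure: block physical address generation. A 6-bit field (bits 9–14) is masked according to the block size and combined by OR; the 17-bit field (bits 15–31) is carried through.]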
Figure 6-10 further expands on the determination of a memory protection violation and the
subsequent actions taken by the processor in this case. Note that in the case of a memory
protection violation for the attempted execution of a dcbt or dcbtst instruction, the
translation is aborted and the instruction executes as a no-op (no violation is reported).
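[Figure 6-10: memory protection violation flow. For a dcbt/dcbtst instruction the access is aborted; otherwise an instruction access takes the instruction access exception and a data access takes the data access exception.]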
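[Figure: UTLB organization. LA0–LA3 select one of the segment registers (T, VSID). LA13–LA19 select an entry in each of Set 0 and Set 1. The segment register VSID and LA4–LA12 are compared with each selected valid (V) entry, and a multiplexer supplies PA0–PA19.]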
In the case of a UTLB miss, the table search hardware in the MMU automatically searches
for the required PTE in the page tables in memory. The MMU then automatically loads the
UTLB with the PTE and the address translation is performed. Note that for an instruction
access, the required PTE is also loaded into the ITLB for future use.
If the table search operations fail to locate the required PTE, then the appropriate exception
(instruction access exception or data access exception) is taken. See Section 6.9.2, “Page
Table Search Operation” for more information on the context for these exception
conditions.
0 Memory segment
The types of address translation used by the 601 MMU are shown in the flow diagram of
Figure 6-4.
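[Figure 6-12: page address translation. The segment registers supply the 24-bit virtual segment ID (VSID, bits 0–23) of the 52-bit virtual address; the 16-bit page index occupies bits 24–39 and the 12-bit byte offset bits 40–51. The VSID and page index together form the virtual page number (VPN). The UTLB or page table PTE supplies the 20-bit physical page number (PPN), which is concatenated with the 12-bit byte offset to form the 32-bit physical address.]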
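[Figure: segment register format for page address translation. T (bit 0), Ks (bit 1), Ku (bit 2), reserved zeros (bits 3–7), VSID (bits 8–31).]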
Table 6-14 provides the definitions of the segment register bits for page address translation.
3–7 — Reserved
The Ks and Ku bits partially define the access protection for the pages within the segment.
The page protection provided in the 601 is described in Section 6.8.5, “Page Memory
Protection.” The virtual segment ID field is used as the high-order bits of the virtual page
number (VPN) as shown in Figure 6-12.
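The construction of the 52-bit virtual address can be sketched as follows. The field positions follow the segment register format (T in bit 0, Ks in bit 1, Ku in bit 2, VSID in bits 8–31); the function names are hypothetical, and the values in the usage note are those of the worked PTEG example later in this chapter.

```python
def decode_segment_register(sr: int) -> dict:
    """Split a 32-bit segment register (T = 0 format) into its fields."""
    return {
        "T": (sr >> 31) & 1,
        "Ks": (sr >> 30) & 1,
        "Ku": (sr >> 29) & 1,
        "VSID": sr & 0xFFFFFF,
    }


def virtual_address(la: int, segment_registers: list) -> int:
    """Form the 52-bit virtual address: VSID || page index || byte offset."""
    fields = decode_segment_register(segment_registers[la >> 28])  # LA0-LA3 select
    if fields["T"]:
        raise ValueError("segment is mapped to the I/O controller interface")
    page_index = (la >> 12) & 0xFFFF      # LA4-LA19
    byte_offset = la & 0xFFF              # LA20-LA31
    return (fields["VSID"] << 28) | (page_index << 12) | byte_offset
```

With segment register 0 holding VSID x'CA701C', the logical address x'00FF A01B' becomes the virtual address x'CA701C0FFA01B'.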
The segment registers are programmed with 601-specific instructions that implicitly
reference the segment registers. The 601 segment register instructions are summarized in
Table 6-15. These instructions are privileged in that they are executable only while
operating in supervisor mode. See Section 2.3.3.1, “Synchronization for Supervisor-Level
SPRs and Segment Registers” for information about the synchronization requirements
when modifying the segment registers. See Chapter 10, “Instruction Set,” for more detail
on the encodings of these instructions.
Table 6-15. Segment Register Instructions
Instruction Description
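[Figure: PTE format. Word 0: V (bit 0), VSID (bits 1–24), H (bit 25), API (bits 26–31). Word 1: PPN (bits 0–19), reserved zeros (bits 20–22), R (bit 23), C (bit 24), WIM (bits 25–27), reserved zeros (bits 28–29), PP (bits 30–31).]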
Table 6-16 lists the bit definitions for each word in a PTE.
Table 6-16. PTE Bit Definitions
Word Bit Name Description
20–22 — Reserved
23 R Reference bit
24 C Change bit
28–29 — Reserved
Note that the processor updates the C bit based only on the status of the C bit in the UTLB
entry in the case of a UTLB hit (the R bit is assumed to be set in the page tables if there is
a UTLB hit). Therefore, when software clears the R and C bits in the page tables in memory,
it must invalidate the UTLB entries associated with the pages whose reference and change
bits were cleared. See Section 6.9.3, “Page Table Updates,” for all of the constraints
imposed on the software when updating the reference and change bits in the page tables.
Neither the R bit nor the C bit for a page is set by the execution of the Data Cache Block
Touch instructions (dcbt or dcbtst).
[Figure: page address translation flow. The 52-bit virtual address is generated from the segment register and compared with the UTLB entries. UTLB hit case: LA13–LA19 select a UTLB entry for each set (Set 0 and Set 1); segment register [VSID] = VSID in UTLB entry; LA4–LA12 = PI in UTLB entry; UTLB entry [V] = 1. Select key: if MSR[PR] = 0, key = Ks; if MSR[PR] = 1, key = Ku. On a hit, the access continues to the cache with WIM from the UTLB entry; otherwise a table search operation is performed. The flow also shows an alignment exception branch.]
A given PTE can reside in one of two possible PTEGs. For each PTEG address, there is a
complementary PTEG address—one is the primary PTEG and the other is the secondary
PTEG. Additionally, a given PTE can reside in any of the PTE locations within an
addressed PTEG. Thus, a given PTE may reside in one of 16 possible locations within the
page table. If a given PTE is not resident within either the primary or secondary PTEG, a
page table miss occurs, corresponding to a page fault condition.
A table search operation is defined as the search for a PTE within a primary and secondary
PTEG. When a table search operation commences, a primary hashing function is performed
on the virtual address. The output of the hashing function is then concatenated with bits
(some of them masked) programmed into the SDR1 register by the operating system to
create the physical address of the primary PTEG. The PTEs in the PTEG are then checked,
one by one, to see if there is a hit within the PTEG. If the PTE is not located in
this PTEG, a secondary hashing function is performed, a new physical address is generated
Reserved
HTABORG
16–22 — Reserved
The HTABORG field in SDR1 contains the high-order 7–16 bits of the 32-bit physical
address of the page table. Therefore, the beginning of the page table lies on a 2^16 byte (64
Kbyte) boundary at a minimum.
A page table can be any size 2^n where 16 ≤ n ≤ 25. The HTABMASK field in SDR1
contains a mask value that determines how many bits from the output of the hashing
function are used as the page table index. This mask must be of the form b'00...011...1' (a
string of 0 bits followed by a string of 1 bits). As the table size increases, more bits are
used from the output of the hashing function to index into the table. The 1 bits in
HTABMASK determine how many additional bits (beyond the minimum of 10) from the
hash are used as the index; the HTABORG field must have the same number of lower-order
bits equal to 0 as the HTABMASK field has lower-order bits equal to 1.
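The two constraints on SDR1 stated above lend themselves to a quick software check. The sketch below assumes the SDR1 layout used in this chapter (HTABORG in bits 0–15, HTABMASK in bits 23–31) and 64-byte PTEGs; the function names are hypothetical.

```python
def decode_sdr1(sdr1: int):
    """Return (HTABORG, HTABMASK) from a 32-bit SDR1 value."""
    return (sdr1 >> 16) & 0xFFFF, sdr1 & 0x1FF


def page_table_size(sdr1: int) -> int:
    """Validate SDR1 and return the page table size in bytes."""
    htaborg, htabmask = decode_sdr1(sdr1)
    # A mask of the form 0...01...1 shares no bits with itself plus one.
    if htabmask & (htabmask + 1):
        raise ValueError("HTABMASK must be a string of 0s followed by 1s")
    # HTABORG needs a 0 wherever HTABMASK has a 1.
    if htaborg & htabmask:
        raise ValueError("HTABORG is not aligned to the page table size")
    ptegs = (htabmask + 1) << 10   # 10 hash bits are always used
    return ptegs * 64              # each PTEG occupies 64 bytes
```

An HTABMASK of all zeros gives the minimum 64-Kbyte table, and an HTABMASK of nine ones gives the maximum 32-Mbyte table.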
Memory size | Page table size | PTEs | PTEGs | HTABORG (low-order 9 bits) | HTABMASK
8 Mbytes (2^23) | 64 Kbytes (2^16) | 2^13 | 2^10 | x xxxx xxxx | 0 0000 0000
16 Mbytes (2^24) | 128 Kbytes (2^17) | 2^14 | 2^11 | x xxxx xxx0 | 0 0000 0001
32 Mbytes (2^25) | 256 Kbytes (2^18) | 2^15 | 2^12 | x xxxx xx00 | 0 0000 0011
64 Mbytes (2^26) | 512 Kbytes (2^19) | 2^16 | 2^13 | x xxxx x000 | 0 0000 0111
128 Mbytes (2^27) | 1 Mbyte (2^20) | 2^17 | 2^14 | x xxxx 0000 | 0 0000 1111
256 Mbytes (2^28) | 2 Mbytes (2^21) | 2^18 | 2^15 | x xxx0 0000 | 0 0001 1111
512 Mbytes (2^29) | 4 Mbytes (2^22) | 2^19 | 2^16 | x xx00 0000 | 0 0011 1111
1 Gbyte (2^30) | 8 Mbytes (2^23) | 2^20 | 2^17 | x x000 0000 | 0 0111 1111
2 Gbytes (2^31) | 16 Mbytes (2^24) | 2^21 | 2^18 | x 0000 0000 | 0 1111 1111
4 Gbytes (2^32) | 32 Mbytes (2^25) | 2^22 | 2^19 | 0 0000 0000 | 1 1111 1111
As an example, if the physical memory size is 2^29 bytes (512 Mbytes), then there are
2^29 ÷ 2^12 (4-Kbyte page size) = 2^17 (128K) total page frames. If this number of page frames
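[Figure: hashing functions. Primary hash: the page index (virtual address bits 24–39) is XORed with low-order VSID bits to give the 19-bit hash value 1 (bits 0–8 and 9–18). Secondary hash: derived from hash value 1 (bits 0–18).]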
When the secondary hashing function is required, the output of the primary hashing
function is complemented with one’s complement arithmetic, to provide hash value 2.
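[Figure: generation of PTEG addresses. SDR1 holds HTABORG (bits 0–15, used as a 7-bit part and a 9-bit part), reserved bits 16–22, and HTABMASK (bits 23–31). The 19-bit output of the hash function (bits 0–18) is combined with these fields to form the 32-bit physical address of the PTEG: a 7-bit field (bits 0–6), a 9-bit field (bits 7–15, formed with an OR), a 10-bit field (bits 16–25), and six zero bits (bits 26–31). Each PTEG is 64 bytes and holds eight 8-byte PTEs (PTE0–PTE7). In the selected PTE, word 0 holds V, VSID (24-bit), H, and API (6-bit); word 1 holds PPN (20-bit), R, C, WIM, and PP. The PPN (20-bit) concatenated with the byte offset (12-bit) forms the 32-bit physical address.]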
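[Figure: example page table structure. Given SDR1 = 1010 0110 0000 0000 0000 0000 0000 0011 (HTABORG in bits 0–15, HTABMASK in bits 23–31), the page table starts at the base address and extends through PTEG4095. Two PTEG addresses are shown with bit positions 0, 14, 25, and 31 marked.]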
Two example PTEG addresses are shown in the figure as PTEGaddr1 and PTEGaddr2. Bits
14–25 of each PTEG address in this example page table are derived from the output of the
hashing function (bits 26–31 are zero to start with PTE0 of the PTEG). In this example, the
'b' bits in PTEGaddr2 are the one’s complement of the 'a' bits in PTEGaddr1. The 'm' bits
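[Figure: example primary PTEG address generation. Given SDR1 = 0000 1111 1001 1000 0000 0000 0000 0111 and LA = x'00FF A01B' (0000 0000 1111 1111 1010 0000 0001 1011), the segment register selected by LA bits 0–3 supplies VSID = x'CA701C'. The resulting 52-bit virtual address is 1100 1010 0111 0000 0001 1100 0000 1111 1111 1010 0000 0001 1011, whose VSID and page index bits are XORed by the primary hashing function.]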
Figure 6-22 shows the generation of the secondary PTEG address for this example. If the
secondary PTEG is required, the secondary hash function is performed and the lower order
13 bits of hash value 2 are then concatenated with the higher order 13 bits of HTABORG,
defining the address of the secondary PTEG (x'0F98 0640').
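The PTEG address computation for this example can be reproduced in software. The sketch assumes the primary hash is the XOR of the low-order 19 bits of the VSID with the zero-extended 16-bit page index, which is how the figure fragments in this section read; the secondary hash is the one's complement stated in the text. The function names are hypothetical.

```python
HASH_MASK = 0x7FFFF   # hash values are 19 bits wide


def primary_hash(vsid: int, page_index: int) -> int:
    """Assumed primary hash: low 19 VSID bits XOR the 16-bit page index."""
    return (vsid & HASH_MASK) ^ (page_index & 0xFFFF)


def secondary_hash(hash1: int) -> int:
    """One's complement of the primary hash, kept to 19 bits."""
    return ~hash1 & HASH_MASK


def pteg_address(sdr1: int, hash_value: int) -> int:
    """Combine HTABORG, HTABMASK, and a hash value into a PTEG physical address."""
    htaborg = (sdr1 >> 16) & 0xFFFF
    htabmask = sdr1 & 0x1FF
    high9 = (hash_value >> 10) & htabmask   # upper hash bits, masked
    low10 = hash_value & 0x3FF              # lower 10 hash bits, always used
    return ((htaborg | high9) << 16) | (low10 << 6)
```

With SDR1 = x'0F98 0007', VSID = x'CA701C', and page index x'0FFA', the secondary PTEG address comes out as x'0F98 0640', matching the value given above.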
As described in Figure 6-19, the 10 lower-order bits of the page index field are always used
in the generation of a PTEG address (through the hashing function). This is why only the
abbreviated page index (API) is defined for a PTE (the entire page index field does not need
to be checked). For a given logical address, the lower order 10 bits of the page index (at
least) contribute to the PTEG address (both primary and secondary) where the
corresponding PTE may reside in memory. Therefore, if the higher order 6 bits (the API
field) of the page index match with the API field of a PTE within the specified PTEG, the
PTE mapping is guaranteed to be the unique PTE required.
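The comparison performed on each PTE in a PTEG can be sketched as below. Field positions follow the PTE format in this chapter (V in bit 0, VSID in bits 1–24, H in bit 25, API in bits 26–31 of word 0). Treating H = 0 as the primary hash and H = 1 as the secondary hash is an assumption not spelled out in this section, and the helper names are hypothetical.

```python
def make_pte_word0(vsid: int, api: int, h: int = 0, v: int = 1) -> int:
    """Pack word 0 of a PTE from its fields."""
    return (v << 31) | ((vsid & 0xFFFFFF) << 7) | ((h & 1) << 6) | (api & 0x3F)


def decode_pte_word0(word0: int) -> dict:
    """Unpack word 0 of a PTE into its fields."""
    return {
        "V": (word0 >> 31) & 1,
        "VSID": (word0 >> 7) & 0xFFFFFF,
        "H": (word0 >> 6) & 1,
        "API": word0 & 0x3F,
    }


def pte_matches(word0: int, vsid: int, page_index: int, secondary: bool) -> bool:
    """True if the PTE is valid and maps the given VSID and page index."""
    f = decode_pte_word0(word0)
    api = (page_index >> 10) & 0x3F   # high-order 6 bits of the page index
    return (f["V"] == 1 and f["VSID"] == vsid
            and f["H"] == int(secondary) and f["API"] == api)
```

Only the API needs to be stored and compared because the low-order 10 bits of the page index are already accounted for by the PTEG address itself.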
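[Figure 6-22: secondary PTEG address generation. Hash value 2 is the one's complement of hash value 1 (shown as 9-bit and 10-bit parts); the page table extends through PTEG8191.]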
Note that a given PTEG address does not map back to a unique logical address. Not only
can a given PTEG be considered both a primary and a secondary PTEG (as described in
Section 6.9.1.5.1, “Page Table Structure Example”), but in this example, bits 24–26 of the
page index field of the virtual address are not used to generate the PTEG address. Therefore,
any of the eight combinations of these bits will map to the same primary PTEG address.
(However, these bits are part of the API and are therefore compared for each PTE within
the PTEG to determine if there is a hit.) Furthermore, a logical address can select a different
segment register with a different value such that the output of the primary (or secondary)
hashing function happens to equal the hash values shown in the example. Thus these logical
addresses would also map to the same PTEG addresses shown.
[Figure: primary and secondary table search flows. The PTEs in a PTEG are checked in turn. When PA26–PA31 = 56 (last PTE in the PTEG) and no match has been found, the primary search proceeds to the secondary table search (see Figure 6-23), and a failed secondary search results in a page fault. On a table search hit with PTE[R] = 0, PTE[R] ← 1 and R_Flag ← 1, and a byte write updates PTE[R] in memory. The PTE is written into the UTLB, and the access is checked for a memory protection violation.]
0 1 2 3 11 12 27 28 31
Table 6-20 shows the bit definitions for the segment registers when the T bit is set.
Table 6-20. Segment Register Bit Definitions for I/O Controller Interface
Bit Name Description
28–31 Packet 1(0–3) This field contains address bits 0–3 of the
packet 1 cycle (address-only).
[Figure: I/O controller interface translation flow (T = 1). The flow distinguishes an alignment exception case, lwarx, stwcx., or lscbx instructions (which set DSISR[5] = 1), and cache instructions (dcbt, dcbtst, dcbf, dcbi, dcbst, or dcbz); other accesses continue with the I/O controller interface translation.]
* No SRR1 bits are set for this case; this differs from the PowerPC architecture, which specifies that SRR1[3]
is set for this condition.
This chapter describes how instructions flow through the PowerPC 601 microprocessor. A
logical model of the 601 pipeline is presented as a framework for understanding the
functionality and performance of the hardware. While this pipeline model is an abstraction
of the hardware implementation, it can yield accurate instruction timing information.
Pipeline stages ID
IE
IC
IWA
The first row in the timing tables indicates the number of cycles an instruction spends in
each pipeline stage. The second row shows the pipeline stages. For integer instructions, the
stages are integer decode (ID), integer execute (IE), integer completion (IC), integer
writeback for ALU operations (IWA), integer writeback for load operations (IWL), and
cache access (CACC). Subrows are included, with a different pipeline stage in each
subrow. Some instructions simultaneously occupy multiple stages, while some instructions
spend several cycles in the same stage. The classic RISC instruction flow is shown in
Table 7-1: the instruction moves from ID to IE to IWA, spending one cycle in each stage.
The third row in the tables shows which resources are required nonexclusively. Typically
these resources are registers that are read by the instruction. The fourth row shows
[Figure: pipeline stages of the 601. Cache (memory subsystem): CARB, CACC. Data access queueing unit: ISB, FPSB. Dispatch unit: IQ7–IQ0 (instructions in the IQ are said to be in the dispatch stage (DS)). Branch stages: BE, MR, BW. Integer stages: ID (see note 1), IE, IC, IWA, IWL. Floating-point unit (FPU): F1, FD, FPM, FPA, FWA, FWL. Lines between stages mark cycle boundaries.]
1 An integer instruction can be passed to the ID stage in the same cycle in which it enters IQ0.
FA Fetch Arbitration—During fetch arbitration, the address of the next instructions (fetch group) to be fetched
is generated and sent to the memory subsystem (the cache arbitration stage). Any instructions that are
going to arrive at the dispatch stage as a result of the cache access associated with the address
generated during the FA stage are considered to be in the FA stage. All instructions must pass through the
FA stage.
CARB Cache Arbitration—For most operations, the CARB stage of the cache is overlapped with one or more
other stages. The CARB and CACC stages may be used by the memory subsystem for cache reload
operations and for some snoop operations. These relationships are shown in the multiple instruction timing
diagrams in Appendix I.
CACC Cache Access—The cache is the only interface point between the memory subsystem and the processor
core; if the data being accessed by the instruction in the CACC stage is in the cache, it is passed to the
processor core during that cycle (that is, a single-cycle cache access). If the data is not in the cache, no
data is passed to the processor core. The 601 cache is nonblocking, that is, once an instruction misses in
the cache, the CACC stage is free to service another instruction. For more information, see Section 7.2.2,
“Memory Subsystem.” The CARB and CACC stages may be used by the memory subsystem for cache
reload operations, and some snoop operations. During a cache reload, the data being reloaded is brought
in from the memory system and is available for use in the processor core on the next cycle.
DS Dispatch—The dispatch stage is associated with the eight-entry instruction queue (IQ0–IQ7). The 601 can
dispatch as many as three instructions on every clock cycle, one to each of the processing units (the IU,
the BPU, and the FPU) from the same dispatch stage. As many as eight instructions can be in the dispatch
stage (instruction queue), but instructions can be dispatched only from IQ0–IQ3. Note that only three of
the first four (in program order) can be dispatched on a given cycle.
Note that the bottom element of the instruction queue (IQ0) can be viewed as part of the integer decode
(ID) stage.
Table 7-3 describes the stages in the integer pipeline. Not all integer instructions pass
through every IU stage.
ID Integer Decode—In the ID stage, integer instructions are decoded and the operands are fetched from the
GPRs. Note that an integer instruction typically enters the decode stage when it enters IQ0; when an
instruction stalls in the ID stage, a new instruction may move into IQ0; however, IQ0 and ID will be the
same after the instruction is no longer stalled in the ID stage.
IE Integer Execute—In the IE stage, integer ALU operations are executed, the EA for memory access
instructions is calculated and translated, and the request is made to the memory subsystem (that is, the
instruction simultaneously occupies the CARB and CACC stages).
There is feed-forwarding for ALU operations in the IE stage. This means that the results calculated in the
IE stage are available as sources to the instruction that enters IE stage in the next cycle. This eliminates
stalls due to data dependencies between consecutive integer ALU instructions. There is also feed-
forwarding from the CACC stage to the IE stage resulting in only a one-cycle stall for a dependent
operation directly following a load instruction. For more information, see Section 7.3.3, “Integer Pipeline
Stages.”
IC Integer Completion—In the IC stage, results of instructions are made available for use unless
synchronous exceptions are detected. Tags for branch and floating-point instructions must pass through
this stage.
IWA Integer Arithmetic Writeback—In the IWA stage, the general purpose registers (GPRs) are updated with
the results from integer arithmetic operations.
IWL Integer Load Writeback: In the IWL stage, the GPRs are updated with the data returned to integer load
operations from the cache or from memory.
The data access queueing unit provides a buffer between the IU and the memory subsystem.
It contains two stages—the floating-point store buffer stage (FPSB), and the integer store
buffer stage (ISB). The stages in this unit are described in Table 7-4. The data access
queueing unit is described with the IU in Section 7.3.3, “Integer Pipeline Stages.”
Table 7-4. PowerPC 601 Microprocessor Pipeline Stages—Data Access Queueing
Unit
Stage Description
FPSB Floating-Point Store Buffer— The FPSB stage is used for floating-point store instructions that have been
committed (or are being committed) but for which the floating-point data is not yet available. All floating-
point store instructions must pass through this stage. This stage allows the instruction to free up the IE
stage. An instruction remains in this stage until it completes the CACC stage. The data in this stage is kept
memory-coherent by the processor.
ISB Integer Store Buffer—The ISB stage is used to buffer data accesses that were not arbitrated into the
cache due to a higher priority access (such as a cache reload). This stage allows the instruction to free up
the IE stage. An instruction remains in this stage until it completes the CACC stage. The buffer used in this
stage is kept memory-coherent by the processor.
The floating-point pipeline contains six stages below the DS stage. These stages are
described in Table 7-5. Note that all floating-point arithmetic instructions must pass
through each of these stages (with the exception of F1); details of how each type of
instruction makes use of each floating-point stage are provided in Section 7.3.4,
“Floating-Point Pipeline Stages.”
FD Floating-Point Decode—In the FD stage, instructions are decoded and operands are fetched from the
FPRs.
FPM Floating-Point Multiply—In the FPM stage, operands are fed through a multiplier that performs the first part
of a multiply operation. The multiplier performs a single-precision multiply with a throughput of one per
cycle or a double-precision multiply with a throughput of one per two cycles.
FPA Floating-Point Add—In the FPA stage, an addition is performed for add instructions or to complete multiply
or accumulate instructions.
FWA Floating-Point Arithmetic Writeback—In the FWA stage, normalization and rounding occur, FPRs are
updated, store data is sent to the memory subsystem, and bits are set in the FPSCR. Data written to the
FPRs during the FWA stage is available to the FD stage in the following cycle.
FWL Float Load Writeback—During the FWL stage, load data is written into the FPRs. Load data that is being
written to the FPRs is also available to the FD stage in the same cycle.
The BPU pipeline contains three stages below the DS stage—the branch execute stage
(BE), the mispredict recovery stage (MR), and the branch writeback stage (BW). These are
described in Table 7-6.
Table 7-6. PowerPC 601 Microprocessor Pipeline Stages—Branch Pipeline
Stage Description
BE Branch Execute—In the BE stage, the target address of a branch is calculated and the branch direction is
either determined or predicted depending on the state of the condition register (CR) and the type of branch
instruction. Note that the BE stage is parallel with the FA stage of the target instructions of a taken branch.
MR Mispredict Recovery—Conditional branches also go into the MR stage (in parallel with entering the BE
stage) and stay there until the branch is resolved (the CR is coherent). If a branch was predicted
incorrectly, the MR stage logic allows the fetcher to recover and start executing the correct path.
BW Branch Writeback—In the BW stage, branch instructions that update the link register (LR) or count register
(CTR) do so. Note that many branches can be in the BW stage at any given cycle, but that no more than
two can write back in one cycle. An infinite stream of taken branches that hit in the cache has a throughput
of one branch every two cycles.
Some integer and floating-point instructions must repeat stages in their respective pipelines,
which causes subsequent instructions in the pipeline to stall. In addition to the multicycle
operations, data dependencies can cause stalls as described in Section 7.3.3, “Integer
Pipeline Stages,” and Section 7.3.4, “Floating-Point Pipeline Stages.”
Synchronization is handled relative to the integer pipeline. To ensure that LR and CR are
updated in order, neither the BPU nor the FPU can perform their write-back operations until
their tags complete the IC stage in the integer pipeline. Instructions in the IU cannot get
ahead of instructions in the BPU, but may get ahead of instructions in the FPU by as many
Nonconditional/nonupdating No No Always*
* Conditions that can prevent branch folding are described in Section 7.3.1.4.5, “Dispatch Considerations Related
to IU/FPU Synchronization.”
All branch instructions use the BE stage, during which the target address for the branch is
calculated. If the branch is either resolved to be taken or predicted to be taken, the address
for the branch is passed to the FA stage (in zero cycles).
Branches that are conditional on the CR are predicted in the BE stage (unless they can
already be resolved). The address of the nonpredicted path is stored in the MR stage (see
below) in case the prediction is incorrect. The 601 uses a static prediction scheme.
Branches that are conditional on the CR also enter the MR stage in parallel with BE stage.
The branch stays in the MR stage until it is resolved. If the prediction is incorrect, the
mispredict address (stored in the MR stage) is sent to the FA stage. The MR stage can hold
only one mispredicted branch address, so the 601 can only handle one unresolved
SYSTEM INTERFACE
The read queues may hold any combination of the following three cache misses—one fetch,
one load, and one cacheable write-back store operation.
For better overall processor performance, the queueing allows high priority operations
(such as read misses) to bypass low priority operations (such as cache write-back
operations). This prioritizing is discussed in Section 7.2.2.2.3, “Bus Interface Arbitration.”
The bus interface features include pipelining of up to two operations, data forwarding (after
two data beats have been received), bus parking, and optional loading of the adjacent sector
in a cache line to decrease the latency of memory accesses. This is described in detail in
Chapter 9, “System Interface Operation.”
The bus interface allows zero wait state data to stream into or out of the chip, providing a
maximum data bus bandwidth of 320 Mbytes/second (at 50 MHz). This high bandwidth is
achieved by using the 601’s zero wait state capability and the ability to burst data, to
pipeline two addresses onto the bus and overlap slave access time for the second tenure with
that of the first.
7.2.2.2.1 Write Queue
For each position in the write queue, there is space for the physical address and for the
associated data (each capable of holding a sector). One position, marked snoop in
Figure 7-2, is provided for high-priority bus operations when a snoop hits a modified sector.
This is described in Section 9.10, “Using DBWO (Data Bus Write Only).”
Burst writes are always sector aligned.
7.2.2.2.2 Read Queue
For burst read operations, the address is modified to be the quadword address of the
requested data. By requesting the quadword, the hardware can forward data to the internal
target after only two beats of the data have been received. On the other hand, the memory
system need only be able to provide two orderings of data coming back (first quadword of
a sector then second or second quadword then first).
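The address adjustment for burst reads can be illustrated with a short sketch. It assumes a 32-byte sector made of two 16-byte quadwords (four double-word beats, two per quadword); the function name is hypothetical.

```python
SECTOR_BYTES = 32
QUADWORD_BYTES = 16


def burst_read_order(addr: int):
    """Return the quadword addresses of a sector in the order they are requested.

    The quadword holding the requested data comes first, followed by the
    other quadword of the same sector.
    """
    first = addr & ~(QUADWORD_BYTES - 1)   # quadword address of the requested data
    second = first ^ QUADWORD_BYTES        # the remaining quadword of the sector
    return first, second
```

A request for address 0x1018 therefore returns the quadword at 0x1010 first and the one at 0x1000 second, so the requested data is available after the first two beats.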
[Figure residue: BR, BG, and TS waveforms for a processor that is not parked and for a parked processor (BR inactive, BG active).]
Figure 7-3. Bus Timing for Parked and Nonparked Bus Masters
For more information about bus parking refer to Section 9.3.1, “Address Bus Arbitration.”
Figure 7-4. Cache Hit Timing for a Load Access and a Fetch Access
If a cache hit is detected when a write access is arbitrated into the cache, the cache is
accessed and written in the same cycle. If necessary, the tag directory is also updated in the
same cycle (cycle 1 in Figure 7-4). Store operations have the same arbitration priority as
load operations. As long as they hit in the cache, a read access can immediately follow a
write access and a write access can immediately follow a read access unless there are data
dependencies. Data dependencies are shown in the examples in Appendix I, “Instruction
Timing Examples.”
7.2.2.3.3 Cache Miss Timing
The cache miss timing is a function of several variables, including bus speed, memory
access time and interleaving structure, and bus arbitration schemes. This section considers
the fastest possible system, then provides formulas for calculating numbers for any real
system. For the rest of this section, the terms processor cycle and bus cycle are used to
differentiate the actual amount of time elapsing. Bus cycles are integer multiples of
processor cycles, and the multiple is system-dependent.
7.2.2.3.4 Timings When the Processor Clock Frequency Equals the Bus
Clock Frequency
The following discussion assumes a 1:1 bus clock to processor clock frequency ratio. For
information about 601 clock operation, refer to Section 8.2.11, “Clock Signals.” Figure 7-5
shows the best-case timing for a cache miss with the processor and bus operating at the
same frequency. This example assumes a one bus cycle access time to main memory, bus
running at full speed, no delay between the data beats (that is, the transfer acknowledge, TA,
signal is asserted without delay for each beat), no address retry (ARTRY) on the address
tenure, and the processor is parked on the bus.
Cycle-by-cycle events shown in Figure 7-5:
Clock 0: Arbitrate load into the cache (CARB).
Clock 1: Cache miss. Receive bus grant (BG).
Clock 2: Load in memory queue, assert TS, receive data bus grant (DBG).
Clock 3: 1st data beat returned.
Clock 4: 2nd data beat returned.
Clock 5: 3rd data beat returned; arbitrate reload dump into cache.
Clock 6: 4th data beat; first two beats of data written to cache and forwarded to IE stage and IQ.
Clock 7: Dependent operation executes; arbitrate reload dump 2 into cache.
Clock 8: Reload dump 2; cache tags validated for sector.
Clock 9: New sector usable in cache.
Figure 7-5. Cache Miss Timing When the Processor Clock Equals the Bus Clock
(Best-Case Timing)
When a cache miss is detected it is immediately arbitrated onto the bus if nothing of higher
priority is queued (Clock 0). The bus grant signal (BG) is asserted in the next clock cycle
(Clock 1) if there is no higher priority transaction on the bus. However, if the processor is
parked on the bus (BG is already asserted), a transfer begins on the next bus cycle. In this
example, the transfer begins in Clock 2. The operation is also placed in the read queue
(shown in Figure 7-2), so the data can be pipelined back to the processor and the cache.
Sometime after the transfer has begun (perhaps as soon as the next bus cycle as shown here
in Clock 3), data begins to come into the processor. Data arrives in four double-word beats
(for a cache sector of data). After the second beat of data arrives (Clock 4), the bus interface
unit makes a request (Clock 5) to write the four words into the cache (a “reload dump”).
The first four words contain the critical word.
During the next processor cycle (Clock 6), data is written to the cache and forwarded to the
processor core if required (either to the register or to the IQ, depending on whether the
action was a load or a code fetch). After the last two beats of data arrive, a second request
to write to the cache is made (Clock 7). During Clock 8, the cache receives the remaining
four words of data (beats three and four) and the cache tag is validated. In Clock 9, if an
access to this sector is arbitrated into the cache, the cache tags signal a cache hit, as shown
in Figure 7-5.
Figure 7-6 shows a formula for calculating the number of processor cycles from cache miss
(that is, cycle 1 in Figure 7-5) to dependent operation execution (cycle 7 in Figure 7-5)
assuming a 1:1 processor-to-bus-clock ratio.
Figure 7-7. Formula for Calculating Total Time from Cache Miss to Data Available
for the Next Cache Hit
7.2.2.3.5 Timings When the Processor Clock Frequency Does Not Equal the Bus
Clock Frequency
This section describes timings when the bus clock frequency does not equal the processor
clock. Information regarding configuring the clock signals is given in Section 8.2.11,
“Clock Signals.” Figure 7-8 shows the timing for a cache miss when the processor clock is
twice as fast as the bus clock.
Sequence of events shown in Figure 7-8, in order:
• Load in cache, cache miss.
• Load in memory queue, arbitrate to the bus, BG asserted.
• Assert TS.
• Receive DBG.
• First data beat returned.
• Second data beat returned.
• Third data beat returned and arbitrate reload dump into cache.
• Data written to cache and forwarded to IE stage or IQ.
• Fourth data beat returned, dependent operation executes.
Note: 50% of the time, this clock is not wasted because the miss occurs on the bus transition cycle
instead of the nontransition cycle as shown here.
Figure 7-8. Cache Miss Timing when the Processor Clock Frequency is Twice the
Bus Clock Frequency (Best-Case)
Processor cycles are indicated by both dashed and solid lines, while bus cycles are indicated
by the solid lines only.
Note that typically there is a one-half processor cycle penalty for synchronizing to the bus
clock when the processor clock frequency is twice the bus clock frequency (half the time
the cache miss occurs one processor clock before a bus transition processor clock, half the
time it occurs in a bus transition processor clock). This penalty is larger for larger processor
clock to bus clock ratios, as the average number of processor cycles of delay increases.
Also, all bus related parameters now have to be multiplied by the processor clock to bus
clock ratio.
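The two scaling effects described above can be sketched numerically. The following Python fragment is an illustration only, not a formula taken from this manual: the function names are hypothetical, and the penalty calculation assumes that a cache miss is equally likely to occur in any processor cycle within a bus cycle.

```python
from fractions import Fraction


def average_sync_penalty(ratio: int) -> Fraction:
    """Average processor cycles spent waiting for the next bus transition.

    Assumption (not from the manual): the miss falls with equal probability
    in each of the `ratio` processor cycles that make up one bus cycle.
    """
    if ratio < 1:
        raise ValueError("ratio must be a positive integer")
    # A miss that occurs k processor cycles before the transition cycle
    # waits k cycles, for k = 0 .. ratio - 1.
    return Fraction(sum(range(ratio)), ratio)


def bus_to_processor_cycles(bus_cycles: int, ratio: int) -> int:
    """Express a bus-related parameter in processor cycles."""
    return bus_cycles * ratio


print(average_sync_penalty(2))        # 1/2, the one-half cycle penalty at 2:1
print(average_sync_penalty(3))        # 1
print(bus_to_processor_cycles(4, 2))  # 8
```

Under this assumption the average penalty is (ratio - 1)/2 processor cycles, which reproduces the one-half cycle figure quoted for a 2:1 ratio and grows with larger ratios.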
Figure 7-9 shows these factors in calculating total processor cycles for cache-miss-to-
dependent-operation execution when the bus clock frequency does not equal the processor
clock frequency.
Figure 7-9. Formula for Calculating Total Time from Cache Miss to Dependent
Operation Execution (Processor Clock/Bus Clock ≠ 1:1)
Figure 7-10 shows a formula for calculating the total number of processor cycles from
cache miss to data available for next cache hit on a bus that is not running at the same
frequency as the processor.
Figure 7-10. Formula for Calculating Total Time from Cache Miss to Data Available
for Next Cache Hit (Processor Clock ≠ Bus Clock)
Integer 1;
IE: — Integer 2
TAGF1; TAGB1
Integer 1;
IC: — — TAGF1; TAGB1
All floating-point and branch instructions except unconditional branch instructions that do
not update the LR generate a tag that ensures that instructions will write back in an orderly
fashion. The three types of synchronization tags are as follows:
IC: — — Integer 1
The actual synchronization between units is performed relative to the integer completion
stage (IC). As an integer instruction completes the IC stage, the completion logic checks for
any tags. When tags are detected, any dependencies are checked, and writeback is withheld
until any dependencies related to the tags are resolved. Note that instructions are not forced
to complete in strict program order; the IC stage logic ensures that dependencies (such as
the CR) implied by program order are handled properly.
When a branch is predicted (recall that the CR is not coherent on the cycle in which the
branch is in BE), the address of the nonpredicted path is held in the MR stage along with
the information required to determine whether the prediction is correct. If the branch is
resolved as correctly predicted, the MR stage is cleared and no recovery is performed.
If the branch is not resolved as predicted, the mispredict recovery address at MR is used as
the fetch address and any instructions from the nonpredicted path are purged from the IQ.
Note that the MR and BW stages are different and that a branch does not necessarily leave
the MR stage before it leaves the BW stage. Also note that only branches that are
conditional on the CR need to pass through the MR stage; unconditional branches and
branch-and-decrement branches that do not need the CR need not pass through the MR
stage and can execute even when the MR stage is busy.
When a branch is resolved as predicted, there are generally instructions behind it in
program order already in the DS stage. Because the processor could not determine whether
those instructions are needed before the branch is resolved, it waits until the target
instructions are loaded into the IQ before writing the instructions from the predicted path
over those from the nonpredicted path. This is referred to as a delayed purge. If the branch
is not resolved as predicted and resolution occurs before the target instructions are written
into the IQ, the sequential path is kept and the target instructions are not loaded into the IQ.
While the nonpredicted path is in the IQ, it is referred to as the nonpurged path. Examples
showing how these instructions are handled are provided in Appendix I, “Instruction
Timing Examples.”
The speculative instructions in the pipeline are marked with the predicted branch tag, which
marks the position of the predicted branch in program order in the integer pipeline. No
instructions can be dispatched onto a predicted branch tag (although a predicted branch tag
can be placed on top of other tags). Also, instructions cannot be dispatched onto conditional
branch instructions that are going to generate a predicted branch tag (that is, any branches
that are conditional on the CR). Speculative execution is supported for the FPU and the
BPU in the 601. The predicted branch tag is different from other tags in that it is not allowed
to progress beyond the IE stage; thus speculative integer instructions are not allowed past
the ID stage (they cannot go past the predicted branch tag in the IE stage).
CARB CARB
CACC
DS
BE
BW
a. Branch writeback is delayed n cycles until the branch’s tag completes. Note that only branches with LK=1
need to pass through the BW stage.
b. This is the FA stage for the instructions on the taken path for taken branches.
c. For branches with the link bit set (LK = 1), the new link value is stored in a link shadow register and is
written to the architected LR in the BW stage. The branch-and-link instruction uses one of the two link
shadow registers from the cycle it is dispatched until the cycle after the LR is updated (BW).
Table 7-10 shows the instruction timing for the bc, bcl, bca, and bcla instructions. For
branches that are resolved k cycles after they are executed, k may be 0, in which case the
CR is accessed in the DS/BE/MR cycle—this corresponds to the case when the condition
upon which the branch depends is already resolved.
Branch writeback is delayed until the branch tag completes (n cycles). Note that only
branches that update either the CTR or LR need to pass through the BW stage. If n is less
than k, the branch remains in the MR stage but leaves the BW stage.
CARB CARB
CACC
DS
BE
MR MR
BW BW
a. For branches that are resolved k cycles after they are executed, k may be 0, in which case the architected CR
is accessed in the DS/BE/MR cycle—this corresponds to the case when the condition upon which the branch
depends is already resolved.
b. Branch writeback is delayed until the branch tag completes (n cycles). Note that only branches that update
either the CTR or the LR need to pass through the BW stage. If n < k, the branch remains in the MR stage but
leaves the BW stage.
c. This is the FA stage for the target instructions of the branch (only for taken branches).
d. For branches with BO field specifying ‘decrement count and branch.’
e. Conditional branches access the architected CR on the last cycle of the MR stage; this access is what ends
the MR stage.
f. For branches with the link bit set (LK=1), the new link value is stored in a link shadow register and is written to
the architected LR in the BW stage. The branch-and-link instruction will use one of the two link shadow
registers from the cycle it is dispatched until the cycle after the LR is updated (BW).
CARB CARB
CACC
BE
MR MR
BW BW
a. For branches that are resolved k cycles after they are executed, k may be 0, in which case the CR is accessed
in the DS/BE/MR cycle—this corresponds to the case when the condition upon which the branch depends is
already resolved.
b. Branch writeback is delayed n cycles—until the branch tag completes. Note that only branches that update
either the CTR or the LR need to pass through the BW stage. If n < k, then the branch remains in the MR stage
but leaves the BW stage.
c. This is the FA stage for the target instructions of the branch (only for taken branches).
d. For branches with BO field specifying ‘decrement count and branch.’
e. The bclr instruction needs the most recent LR (look-ahead state) and can access the link shadow registers.
f. Conditional branches access the architected CR on the last cycle of the MR stage; this access is what ends the
MR stage.
g. For branches with the link bit set (LK = 1), the new link value is stored in a link shadow register and is written to
the architected LR in the BW stage. The branch-and-link instruction will use one of the two link shadow
registers from the cycle it is dispatched until the cycle after the LR is updated (BW).
Table 7-12 shows the instruction timing for the bcctr and bcctrl instructions.
CARB CARB
CACC
BE
MR MR
BW BW
a. For branches that are resolved k cycles after they are executed, k may be 0, in which case the architected CR is
accessed in the DS/BE/MR cycle—this corresponds to the case when the condition upon which the branch
depends is already resolved.
b. Branch writeback is delayed n cycles—until the branch tag completes. Note that only branches that update
either the CTR or the LR need to pass through the BW stage. If n < k, then the branch remains in the MR stage
but leaves the BW stage.
c. This is the FA stage for the target instructions of the branch (only for taken branches).
d. The bcctr instruction needs the most recent CTR (look-ahead state) and can access the results of previous
‘decrement count branches’ before they writeback (leave BW).
e. For branches with the link bit set (LK = 1), the new link value is stored in a link shadow register and is written to
the architected LR in the BW stage. The branch-and-link instruction will use one of the two link shadow
registers from the cycle it is dispatched until the cycle after the LR is updated (BW).
[Figure: integer unit pipeline diagram. Legend: instruction flow, data flow, tag flow, cycle boundary. Labeled elements: GPRs, feed-forwarding, IE, ISB, FPSB, CARB, CACC, IC, IWA, IWL, and an input from the floating-point unit.]
Each block in the diagram represents a different stage (except for the box labeled “GPRs”,
which represents the IU general purpose registers). There are three different flows shown
in the diagram: the instruction flow, the tag flow, and the data flow.
In most cases, it takes one cycle to go from one stage to another, as shown by the marked
cycle boundaries in the diagram. However, an instruction that wishes to access the cache
(load or store) resides in the CARB stage (cache arbitration) on the same cycle it is in IE
(integer execute). Also, instructions that are in writeback can provide data to the ID/IE
boundary through the GPRs in the same cycle. Each instruction uses a subset of the stages
Pipeline stages ID
IE
IC
IWA
The instruction spends one cycle in IE. The status bits (XER[CA], XER[OV SO], and CR0)
are all written from IE. Resource usage of the various types of instructions is summarized
in Table 7-14.
addi No If !(rA = 0) No No No No
addis No If !(rA = 0) No No No No
The multiply instructions are shown in Table 7-15 (mul, mullw, mulhw, mulhwu) and
Table 7-16 (mulli). These instructions take 5, 9, or 10 cycles in IE as described in
Section 7.3.3.2.1, “IE Stage—ALU Instruction Operation.” All of the multiply instructions
in the 601 use the MQ register, which is undefined after a mulhw or mulhwu instruction.
Table 7-15 shows the timing for the mul, mullw, mulhw, and mulhwu instructions.
Pipeline stages ID
IE
IC
IWA
As shown in Table 7-16, the mulli instruction takes only five cycles in IE.
Table 7-16. Multiply Instruction Timing (mulli)
Number of Cycles 1 5 1
Pipeline stages ID
IE
IC
IWA
Table 7-17 shows the timing for divide instructions (div, divs, divw, and divwu). Each
divide instruction takes 36 cycles in the IE stage. They all use the MQ register. PowerPC
divide instructions do not return the remainder, and the contents of the MQ after the
instruction completes are undefined. The 601 handles the MQ for POWER divide
instructions as defined in the POWER architecture. These instructions are described in
Chapter 10, “Instruction Set.”
Pipeline stages ID
IE
IC
IWA
Compare instructions spend one cycle in ID and one cycle in IE, but essentially have no
IWA stage. The compare results are written to the CR and forwarded to the BPU (for
conditional branch evaluation) in the middle of the IE cycle. The timing for the compare
instructions is shown in Table 7-18.
Table 7-18. Integer Compare Instruction Timing (cmp, cmpi, cmpl, cmpli)
Number of Cycles 1 1 1
Pipeline stages ID
IE
IC
IWA
The flow of the trap instructions (tw and twi) depends on whether they cause a program
exception. If they do not cause a program exception, one cycle is required in IWA. If they
cause a program exception, 22 cycles are required in the IWA stage. It is important to note
that this performance penalty occurs only when the trap condition is met. Typically, this is
a rare condition and performance is not critical.
Pipeline stages ID
IE
IC
IWA
a. One cycle is required if the trap is not taken; 22 cycles are required if it is taken.
b. Only the tw instruction uses rB.
Pipeline stages ID
IE
IC
IWA
Table 7-21 lists the resources required by the Boolean logic instructions.
Pipeline stages ID
IE
IC
IWA
rlimi No No No No If (RC=1)
rlinm No No No No If (RC=1)
sl Yes No No No If (RC=1)
sr Yes No No No If (RC=1)
Pipeline stages ID
IE
IC
IWA
The mtcrf instruction, shown in Table 7-25, copies the contents of a GPR into the CR on a
field-by-field basis. Although the PowerPC architecture allows performance to degrade if
more than one field is being written, the 601 has the same optimal performance regardless
of the field mask specified in the instruction. The CR is written during the IE stage.
Table 7-25. mtcrf Instruction Timing
Number of Cycles 1 1 1
Pipeline stages ID
IE
IC
IWA
a. Any combination of fields can be written as specified by the mask field in the
instruction.
Pipeline stages ID
IE
IC
IWA
The mcrf instruction, shown in Table 7-27, copies the contents of one specified field of the
CR into another. This instruction is unusual in that it reads the CR from the decode stage.
The timing for this instruction is the same regardless of which fields are specified.
Table 7-27. mcrf Timing
Number of Cycles 1 1 1
Pipeline stages ID
IE
IC
IWA
The mcrxr instruction, shown in Table 7-28, copies bits 0–3 from the XER to the specified
field in the CR. This instruction reads the XER in the ID stage and writes to the CR in the
IE stage. The performance is the same regardless of which bit field is specified.
Table 7-28. mcrxr Instruction Timing
Number of Cycles 1 1 1
Pipeline stages ID
IE
IC
IWA
Pipeline stages ID
IE
IC
IWA
The SDR1 register requires one cycle per stage, but is written from IE and can be
read from the IWA stage (see Table 7-29).
Table 7-30. mtspr Instruction Timing (SDR1)
Number of Cycles 1 1 1
Pipeline stages ID
IE
IC
IWA
• SPR accesses that require two cycles in the IE stage and one cycle in each of the
other stages—This group includes accesses to SPRGn, DSISR, DAR, RTCU, SRR0,
SRR1. These accesses require two cycles in IE and one cycle in IWA. Note that if the
next instruction is another of these instructions, it requires two additional cycles in
IE.
Pipeline stages ID
IE
IC
IWA
Pipeline stages ID
IE
IC
IWA
Pipeline stages ID
IE
IC
IWA
a. The mtmsr instruction remains in the decode stage (ID) until all previous
instructions in program order have completed processing. This latency is
variable for this reason.
b. Because the mtspr instruction is context-synchronizing, there are no other
instructions in the pipeline, so all resources are not considered to be required
exclusively or nonexclusively in the conventional sense.
Pipeline stages ID
IE
IC
IWA
Pipeline stages ID
IE
IC
IWA
• Some SPR accesses spend two cycles in IE and multiple cycles in IWA. Accesses to
the SPRGn, DSISR, DAR, HID0, HID1, HID2, RTCU, SRR0, and SRR1 registers
require four cycles in IWA; accesses to the PVR register take seven cycles. These
accesses take longer than mtspr accesses to the same registers. Timing for these
accesses is shown in Table 7-36.
Table 7-36. mfspr Instructions—Two Cycles in IE and Multiple Cycles in IWA
Number of Cycles 1 2 4 or 7 a
Pipeline stages ID
IE
IC
IWA
a. PVR is the only SPR that takes seven cycles in the IWA stage. The rest of the SPRs
listed in this table take four cycles in IWA.
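As a quick reference, the per-stage cycle counts for this group of mfspr accesses can be captured in a small lookup. This is a sketch only; the function name and the dictionary layout are hypothetical, and registers outside the group described above are rejected rather than guessed.

```python
# SPRs that the text above lists as taking four cycles in IWA
# (SPRGn is matched separately by prefix).
FOUR_CYCLE_SPRS = {"DSISR", "DAR", "HID0", "HID1", "HID2", "RTCU", "SRR0", "SRR1"}


def mfspr_stage_cycles(spr: str) -> dict:
    """Return cycles per stage for an mfspr access in this timing group."""
    name = spr.upper()
    if name == "PVR":
        iwa = 7
    elif name in FOUR_CYCLE_SPRS or (name.startswith("SPRG") and name[4:].isdigit()):
        iwa = 4
    else:
        raise ValueError(f"{spr} is not covered by this timing group")
    # One cycle in ID, two in IE, then the register-specific IWA count.
    return {"ID": 1, "IE": 2, "IWA": iwa}


print(mfspr_stage_cycles("PVR"))    # {'ID': 1, 'IE': 2, 'IWA': 7}
print(mfspr_stage_cycles("SPRG2"))  # {'ID': 1, 'IE': 2, 'IWA': 4}
```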
• As with the mtspr instruction, accesses to the SDR1 register with the mfspr
instruction have a unique timing. These accesses take one cycle in IE, where the
SDR1 is read, and one cycle in IWA, where the GPR is written. Data cannot
be forwarded for this register; therefore, a dependent operation that follows
immediately after mfspr stalls for one cycle in IE. Timing for this instruction is
shown in Table 7-37.
Pipeline stages ID
IE
IC
IWA
Pipeline stages ID
IE
IC
IWA
Pipeline stages ID
IE
IC
IWA
7.3.3.5.11 System Call (sc) and Return from Interrupt (rfi) Instruction
Timings
Both the sc and rfi instructions are context synchronizing; they establish a new context and
then branch to that target. All prefetched instructions are discarded and fetching begins
again at the new address under the new machine state. This causes at least a three-cycle stall
before the next instruction can enter the IE stage. These instructions are held in the ID stage
until all preceding instructions have completed. The sc instruction takes 16 cycles in IWA,
and the rfi instruction takes 13 cycles in IWA. Timing for these instructions is shown in
Table 7-40.
Table 7-40. rfi and sc Instruction Timings
Number of Cycles 1 2 13 or 16 a
Pipeline stages ID
IE
IC
IWA
a. The sc instruction takes 16 cycles in IWA; the rfi instruction takes 13 cycles in IWA.
Both instructions are context-synchronizing, which causes at least a three-cycle stall
before the next instruction can enter the IE stage.
Pipeline stages ID
IE
CARB
CACC
IC
a. Note that this is the best-case timing. The time in this stage may be longer for a
given system implementation. Actual timing is dependent on bus speed, timing of the
AACK signal, whether the processor is parked on the bus, and bus synchronization.
Table 7-42 shows best-case timings for the dcbf(M), dcbst(M), and dcbz(M,E)
instructions. Note that the CACC stage requires only one cycle because these instructions
do not require access to the system interface.
Pipeline stages ID
IE
CARB
CACC
IC
Pipeline stages ID
IE
CARB
CACC
IC
IWL or FWL
As shown in Table 7-44, the same load instructions (lbz, lbzx, lhz, lhzx, lha, lhax, lwz,
lwzx, lwarx, lfs, lfsx, lfd, lfdx) take two cycles in the IE and CACC stages if the operand
crosses a double-word addressing boundary; this is because the data from the two doublewords
requires separate cache accesses (splice 1 and 2). Note that in the third cycle the first access
in the cache (splice 1) overlaps with the second arbitration for the cache (splice 2). It then
takes one extra cycle for the register data to become available. For more information, see
Section 7.3.3.3.4, “Unaligned Load/Store Operations.”
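Whether a given scalar access falls into this slower case can be checked from the effective address and the operand size. The following Python sketch is illustrative only; the function names are hypothetical, and it assumes an 8-byte doubleword and an operand of at most 8 bytes.

```python
DOUBLEWORD_BYTES = 8


def crosses_doubleword(ea: int, size: int) -> bool:
    """True if `size` bytes starting at effective address `ea` span two doublewords."""
    if not 1 <= size <= DOUBLEWORD_BYTES:
        raise ValueError("size must be between 1 and 8 bytes")
    return (ea % DOUBLEWORD_BYTES) + size > DOUBLEWORD_BYTES


def splices(ea: int, size: int) -> list:
    """Split an access into (address, length) pieces, one per doubleword touched."""
    if not crosses_doubleword(ea, size):
        return [(ea, size)]
    first_len = DOUBLEWORD_BYTES - (ea % DOUBLEWORD_BYTES)
    return [(ea, first_len), (ea + first_len, size - first_len)]


print(crosses_doubleword(0x1006, 2))  # False: a halfword ending at the boundary
print(crosses_doubleword(0x1006, 4))  # True: a word straddling 0x1008
print(splices(0x1004, 8))             # [(4100, 4), (4104, 4)]
```

An access that yields two pieces corresponds to the two-splice case described above; an access that yields one piece follows the single-access timing.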
Pipeline stages ID
IE IE
CARB CARB
CACC CACC
IC
IWL or FWL
Pipeline stages ID
IE
CARB
CACC
IC
IWA
IWL or FWL
Like load operations that do not update, when these instructions (lbzu, lbzux, lhzu, lhzux,
lhau, lhaux, lwzu, lwzux, lfsu, lfsux, lfdu, and lfdux) have operands that cross a double-
Pipeline stages ID
IE IE
CARB CARB
CACC CACC
IC
IWA
IWL or FWL
7.3.3.5.15 Load Multiple Word (lmw) and Load String Word Immediate
(lswi)
Load multiple word (lmw) and load string word immediate (lswi) instructions spend one
cycle in IE for each register of data to be transferred (shown as the variable n). Timing for
these operations is shown in Table 7-47.
Table 7-47. Load Multiple Word (lmw) and Load String Word Immediate
(lswi)—Operand is Word-Aligned
Number of Cycles 1 1 1 na-2 1 1
Pipeline stages ID
IE IE IE
IC
Resources required rA
nonexclusively
Pipeline stages ID
IE IE IE IE IE
IC
Resources required rA
nonexclusively
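The variable n used for the lmw and lswi timings is the number of registers transferred. A minimal sketch of how n follows from the instruction operands is given below. The function names are hypothetical; the rules used are that lmw transfers rD through r31 and that lswi transfers NB bytes, four per register, with NB = 0 encoding 32 bytes.

```python
def lmw_register_count(rd: int) -> int:
    """Registers transferred by lmw: rD through r31."""
    if not 0 <= rd <= 31:
        raise ValueError("rD must be in the range 0..31")
    return 32 - rd


def lswi_register_count(nb: int) -> int:
    """Registers transferred by lswi: NB bytes, four bytes per register."""
    if not 0 <= nb <= 31:
        raise ValueError("NB must be in the range 0..31")
    byte_count = 32 if nb == 0 else nb
    return -(-byte_count // 4)  # ceiling division


print(lmw_register_count(29))   # 3
print(lswi_register_count(10))  # 3
print(lswi_register_count(0))   # 8
```

For a word-aligned operand, the text above states one IE cycle per register, so these counts give the number of IE cycles directly.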
7.3.3.5.16 Load String Word Indexed (lswx) and Load String and Compare
Byte Indexed (lscbx) Instruction Timing
Timing for the lswx and lscbx instructions differs from the lswi and lmw instructions in that
both rB and the XER are read in ID. Also, the lscbx instruction reads the XER throughout
the execution of the instruction and may write the XER if a byte-compare match is found.
Since the determination of when the XER is written is data-dependent, the XER must be
accessed exclusively throughout the execution of the instruction. Timing for these
instructions is shown in Table 7-49.
Pipeline stages ID
IE IE IE
IC
When the operand is not on a word boundary, these two instructions take an extra cycle for
every other register of data (that is, each time the word boundary is crossed). For more
information, see Section 7.3.3.5.15, “Load Multiple Word (lmw) and Load String Word
Immediate (lswi).” Timing is shown in Table 7-50.
Table 7-50. lswx and lscbx Instruction Timings—Operand Not on a Word Boundary
Number of Cycles 1 1 1 na-2/2 (3 cycle iteration) 1 1
Pipeline stages ID
IE IE IE IE IE
IC
Resources required rA
nonexclusively
Pipeline stages ID
IE
CARB
CACC
IC
If the operand specified by the store crosses a double-word boundary, the access is split into
two pieces (see Section 7.3.3.3.4, “Unaligned Load/Store Operations”); therefore, the store
spends two cycles in IE (as do load operations). The timings for store instructions (stb,
stbx, sth, sthx, stw, stwx, sthbrx, and stwbrx) whose operands cross a double-word
boundary are shown in Table 7-52.
Table 7-52. Integer Store Instruction Timings—Operand Crosses a Double-Word
Boundary
Number of Cycles 1 1 1 1
Pipeline stages ID
IE IE
CARB CARB
CACC CACC
IC
Pipeline stages ID
IE
CARB
CACC
IC
IWA
The timings for misaligned store operations with update (stbu, stbux, sthu, sthux, stwu,
and stwux) with operands crossing a double-word boundary are shown in Table 7-54.
Table 7-54. Update Form Store Instruction Timings—Operand Crosses a
Double-Word Boundary
Number of Cycles 1 1 1 1
Pipeline stages ID
IE IE
CARB CARB
CACC CACC
IC
IWA
Pipeline stages ID
IE
CARB
CACC
IC
FD
FPM
FPA
FW
Store accesses that cross a double-word boundary must be broken into two pieces, each of
which requires a cache access. However, because the FPSB stage is only one-element deep,
it stays full with the first request until the data is made available from the FPU, delaying
arbitration for the second cache access for three cycles. As shown in Table 7-56, this takes
five cycles in IE for the stfs, stfsx, stfd, and stfdx instructions instead of one for an aligned
floating-point store.
Pipeline stages ID
IE IE IE IE IE
CARB CARB
CACC CACC
IC
FD
FPM
FPA
FW
Pipeline stages ID
IE
CARB
CACC
IC
IWA
FD
FPM
FPA
FW
Table 7-58 shows timings for the stfsu, stfsux, stfdu, and stfdux instructions when the
operand crosses a double-word boundary.
Pipeline stages ID
IE IE IE IE IE
CARB CARB
CACC CACC
IC
IWA
FD
FPM
FPA
FW
Resources required
exclusively
7.3.3.5.21 Store Multiple Word (stmw) and Store String Word Immediate
(stswi)
The stmw and stswi instructions spend a cycle in each of the IE, CARB, and CACC stages
for each register of data to be transferred. The timings for these instructions are shown in
Table 7-59. The number of registers to be transferred in this example is 3 (9–12 bytes of
data for the stswi instruction, 12 bytes of data for the stmw instruction). If four registers of
data were specified, there would be four IE, CARB, and CACC cycles—all overlapping as
shown.
Pipeline stages ID
IE IE IE
IC
Because the accesses are made one word at a time, if the EA specified is not on a word
boundary, every other access crosses a double-word boundary. Accesses that cross a
double-word boundary are split into two pieces just as for scalar store operations. Timing
for stmw and stswi when operands are not on a word boundary is shown in Table 7-60.
Table 7-60. stmw and stswi Instruction Timing (Not Word-Aligned)
Number of Cycles 1 1 (na-1)/2 (3 cycle iteration) 1
Pipeline stages ID
IE IE IE IE
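The statement that every other access crosses a double-word boundary when the EA is not word-aligned can be checked with a short enumeration. The sketch below is illustrative only; the function names and the example addresses are hypothetical, and it assumes word (4-byte) accesses and 8-byte doublewords.

```python
def word_access_addresses(ea: int, n_registers: int) -> list:
    """Addresses of the successive word accesses made by a multiple/string store."""
    return [ea + 4 * i for i in range(n_registers)]


def crosses_doubleword(address: int, size: int = 4) -> bool:
    """True if the access at `address` spans two 8-byte doublewords."""
    return (address % 8) + size > 8


def crossing_pattern(ea: int, n_registers: int) -> list:
    """One flag per word access: True where the access must be split in two."""
    return [crosses_doubleword(a) for a in word_access_addresses(ea, n_registers)]


print(crossing_pattern(0x1002, 4))  # [False, True, False, True]
print(crossing_pattern(0x1005, 4))  # [True, False, True, False]
print(crossing_pattern(0x1004, 4))  # [False, False, False, False]
```

For any EA that is not a multiple of four, the flags alternate, which matches the every-other-access behavior described above; a word-aligned EA never produces a split access.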
Pipeline stages ID
IE IE IE
IC
When the operand is not on a word boundary, every other access takes two cycles in the IE,
the CARB, and the CACC stages. The timing for the stswx instruction when the operands
are not word-aligned is shown in Table 7-62.
Table 7-62. stswx Instruction Timing (Not Word-Aligned)
Number of Cycles 1 1 (na-1)/2 (3 cycle iteration) 1
Pipeline stages ID
IE IE IE IE
Pipeline stages ID
IE XXXa XXX
CARB
CACC
IC
IWA IWA
a. The XXX’s indicate that an instruction cannot be executing in IE while a stwcx. is in IWA. An instruction
may be present in IE, but if there is one, it will be held (stalled).
If the reservation bit is not set by the time the stwcx. instruction is executed,
only one cycle is spent in IWA. The flow with the reservation bit cleared is shown in
Table 7-64.
Table 7-64. stwcx. Instruction Timing—Reservation Cleared
Number of Cycles 1 1 1
Pipeline stages ID
IE XXXa
CARB
CACC
IC
IWA
a. The XXX’s indicate that an instruction cannot be executing in the IE stage while a stwcx. is in
the IWA stage. An instruction may be present in IE, but if there is one, it will be held (stalled).
[Figure: floating-point unit pipeline diagram. Labeled elements: load data from the memory subsystem, F1 (queue), FPR (floating-point registers), FD (decode stage), FPM (multiply stage) with feedback for double-precision multiply, FPA (add stage), normalizer and rounder, FWA (writeback), and store data to the memory subsystem. Legend: instruction flow, data flow, cycle boundary.]
Some instructions require more than one cycle in a stage, and some of those instructions
may occupy multiple stages simultaneously. This process is described in Section 7.3.1.4.3,
“Floating-Point Dispatch.”
The FPU uses the multiply/add (fmadd) instruction (D = AC + B) as the base for other
floating-point operations. Other instructions are implemented using this instruction and
setting one or more of the operands with a constant (in the FD stage). These are shown as
follows:
Pipeline stages
FD
FPM
FPA
FWA
Cache access
Table 7-66 shows the timing for a single-precision multiply instruction with no special-case
data.
Table 7-66. Single-Precision Multiply Instruction (fmuls)—No Special-Case Data
Number of Cycles 1 1 1 1
Pipeline stages
FD
FPM
FPA
FWA
Cache access
Table 7-67 shows the timing for a single-precision divide instruction with no special-case
data.
Pipeline stages
FD FD FD FD
FPA FPA
FWA FWA
Cache access
Table 7-68 shows the timing for single-precision accumulate instructions with no special-
case data.
Table 7-68. Single-Precision Accumulate Instructions (fmadds, fmsubs, fnmadds,
fnmsubs)—No Special-Case Data
Number of Cycles 1 1 1 1
Pipeline stages
FD
FPM
FPA
FWA
Cache access
Table 7-69. Double-Precision Add Instructions (fadd, fsub, frsp, fcmpu, fcmpo)—No
Special-Case Data
Number of Cycles 1 1 1 1
Pipeline stages
FD
FPM
FPA
FWA
Cache access
Table 7-70 shows the timing for a double-precision multiply instruction with no special-
case data. Note that in this example, the instruction requires two cycles in the FD, FPM, and
FPA stages; however, because the instruction is self-pipelining, there is only a one-cycle
delay before the next instruction can enter the decode stage.
Table 7-70. Double-Precision Multiply Instructions (fmul)—No Special-Case Data
Number of Cycles 1 1 1 1 1
Pipeline stages
FD FD
FPM FPM
FPA FPA
FWA
Cache access
Table 7-71 shows the timing for a double-precision divide instruction with no special-case
data. Note that the instruction occupies the decode stage and subsequent stages until the
instruction enters the last cycle of the writeback stage.
Pipeline stages
FD FD FD FD
FPA FPA
FWA FWA
Cache access
Table 7-72 shows the timing for a double-precision accumulate instruction with no
special-case data. Note that the timing for this instruction is similar to that of the fmul
instruction, in that it requires two cycles in the FD, FPM, and FPA stages. However,
because the instruction is self-pipelining, it can begin the first cycle of the FPM stage
while it enters the second cycle of the FD stage. This allows the next instruction to enter
the decode stage on the third clock cycle.
Table 7-72. Double-Precision Accumulate Instructions (fmadd, fmsub, fnmadd,
fnmsub)—No Special-Case Data
Number of Cycles 1 1 1 1 1
Pipeline stages FD FD
FPM FPM
FPA FPA
FWA
Cache access
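The overlap implied by self-pipelining can be made explicit with a small occupancy chart. The Python sketch below is an illustration, not a reproduction of Table 7-72: the function names are hypothetical, and it assumes that each stage starts one cycle after the preceding stage (as the text states for FD and FPM) and that writeback takes one cycle after the last FPA cycle.

```python
def self_pipelined_occupancy(stages=("FD", "FPM", "FPA"), cycles_per_stage=2,
                             writeback="FWA"):
    """Map each stage to the list of clock cycles (1-based) it is occupied."""
    occupancy = {}
    for index, stage in enumerate(stages):
        first_cycle = index + 1  # assumed one-cycle stagger between stages
        occupancy[stage] = list(range(first_cycle, first_cycle + cycles_per_stage))
    last_cycle = occupancy[stages[-1]][-1]
    occupancy[writeback] = [last_cycle + 1]
    return occupancy


def next_decode_cycle(occupancy, decode_stage="FD"):
    """First cycle in which a following instruction can enter decode."""
    return occupancy[decode_stage][-1] + 1


chart = self_pipelined_occupancy()
total_cycles = chart["FWA"][-1]
for stage, cycles in chart.items():
    cells = [stage.ljust(3) if c in cycles else "...".ljust(3)
             for c in range(1, total_cycles + 1)]
    print(" ".join(cells))
print("next instruction enters FD in cycle", next_decode_cycle(chart))
```

With these assumptions the instruction spans five clock cycles in total, and the following instruction enters the decode stage on the third cycle, in agreement with the description above.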
Table 7-73. Floating Point Move/Store Instructions (fmr, fabs, fneg, fnabs, stfs,
stfsu, stfsx, stfsux, stfd, stfdu, stfdx, stfdux)—No Special-Case Data
Number of Cycles 1 1 1 1
Pipeline stages FD
FPM
FPA
FWA
Cache access
Pipeline stages FD
FPM
FPA
FWA
Cache access
Pipeline stages
FD FD FD FD
FPA FPA
FWA
Cache access
Table 7-76 shows the timing for the Move From FPSCR instructions, which are
implemented in the FPU. Note that in this example, frD is the target only of the mffs
instruction and CR[BF] is the target of the mcrfs instruction.
Table 7-76. Move from FPSCR Instruction Timing (mffs, mcrfs)
Number of Cycles    1      1      1      1
Pipeline stages     FD
                           FPM
                                  FPA
                                         FWA
Cache access
Pipeline stages
FD FD FD FD
FPM FPM
FPA FPA
FW FW
Resources required
nonexclusively
Resources required
exclusively
Cache access
Table 7-78 shows the timing for single-precision accumulate instructions (fmadds,
fmsubs, fnmadds, and fnmsubs). In this example, one operand requires prenormalization.
This operand travels through the pipeline before execution can begin.
Table 7-78. Single-Precision Accumulate Instruction Timing
Number of Cycles 1 1 1 1 1 1 1
Pipeline stages
FDa FD FD FD FDb
FPM FPM
FPA FPA
FW FW
Resources required
nonexclusively
Resources required
exclusively
Cache access
a. The operand that needs prenormalization traverses the pipeline before the execution of the instruction starts.
b. The operands are now ready and the instruction can actually start execution.
Pipeline stages
FW FPW FW
Resources required
nonexclusively
Resources required
exclusively
a. The first operand that needs prenormalization traverses the pipeline before the execution of the instruction
starts.
b. The second operand that needs prenormalization traverses the pipeline (immediately behind the first operand)
before the execution of the instruction starts.
c. The operands are now ready and the instruction can actually start execution.
Pipeline stages
FD FD
FPM FPM
FPA FPA
FW FW
abs Absolute IU 1 0
add[o][.] Add IU 1 0
and[.] AND IU 1 0
cmp Compare IU 1 0
crand CR AND IU 1 0
creqv CR Equivalent IU 1 0
crnand CR NAND IU 1 0
crnor CR NOR IU 1 0
cror CR OR IU 1 0
crxor CR XOR IU 1 0
div[o][.] Divide IU 36 0
eqv[.] Equivalent IU 1 0
nand[.] NAND IU 1 0
neg[o][.] Negate IU 1 0
nor[.] NOR IU 1 0
or[.] OR IU 1 0
ori OR Immediate IU 1 0
tw Trap Word IU 15 0
xor[.] XOR IU 1 0
1. These instructions access the system bus; thus the latency may vary depending on the exact state of the
machine.
2. A delay may be incurred if the subsequent instruction requires access to the cache; for example, a store
instruction.
3. The longer latency may occur if the contents of rB are larger than 16 bits (not including sign-extension bits).
4. Shortest latency occurs if rB ≤ 16 bits. Longer latency occurs if rB > 16 bits, but the most significant bit is still 0.
Longest latency occurs if the most significant bit is 1.
This chapter describes the PowerPC 601 microprocessor’s external signals. It contains a
concise description of individual signals, showing behavior when the signal is asserted and
negated and when the signal is an input and an output.
NOTE
A bar over a signal name indicates that the signal is active
low—for example, ARTRY (address retry) and TS (transfer
start). Active-low signals are referred to as asserted (active)
when they are low and negated when they are high. Signals that
are not active-low, such as AP0–AP3 (address bus parity
signals) and TT0–TT4 (transfer type signals) are referred to as
asserted when they are high and negated when they are low.
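The assertion convention in the note above can be expressed as a small helper. This is a minimal sketch for illustration only; the function name and its boolean arguments are assumptions and are not part of the 601 documentation.

```python
def is_asserted(level_high, active_low):
    """Return True if a signal at the given logic level counts as asserted.

    level_high -- True if the pin is at a high logic level (hypothetical input)
    active_low -- True for signals drawn with an overbar, such as ARTRY and TS
    """
    # Active-low signals are asserted when low; all others are asserted when high.
    return level_high != active_low


# ARTRY (active-low) driven low is asserted.
print(is_asserted(level_high=False, active_low=True))
# TT0 (active-high) driven low is negated.
print(is_asserted(level_high=False, active_low=False))
```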
The 601 signals are grouped as follows:
• Address arbitration signals—The 601 uses these signals to arbitrate for address bus
mastership.
• Address transfer start signals—These signals indicate that a bus master has begun a
transaction on the address bus.
• Address transfer signals—These signals, which consist of the address bus, address
parity, and address parity error signals, are used to transfer the address and to ensure
the integrity of the transfer.
• Transfer attribute signals—These signals provide information about the type of
transfer, such as the transfer size and whether the transaction is bursted, write-
through, or cache-inhibited.
• Address transfer termination signals—These signals are used to acknowledge the
end of the address phase of the transaction. They also indicate whether a condition
exists that requires the address phase to be repeated.
• Data arbitration signals—The 601 uses these signals to arbitrate for data bus
mastership.
• Data transfer signals—These signals, which consist of the data bus, data parity, and
data parity error signals, are used to transfer the data and to ensure the integrity of
the transfer.
[Figure: 601 signal groupings (partial). Transfer attribute signals: TBST, CI, WT, GBL, CSE0–CSE2, HP_SNP_REQ. Address termination signals: AACK, ARTRY, SHD. Clocks: 2X_PCLK, PCLK_EN, BCLK_EN, RTC. System status signals: CKSTP_OUT, HRESET, SRESET, RSRV, SC_DRIVE. COP/scan interface and test interface signals. Test signals: SYS_QUIESC, RESUME, QUIESC_REQ. Supply: +3.6 V.]
Negated— Indicates that the 601 is not the next potential address bus
master.
Timing Comments Assertion—May occur at any time to indicate the 601 is free to use
the address bus. After the 601 assumes bus mastership, it does not
check for a qualified bus grant again until the cycle during which the
address bus tenure is completed (assuming it has another transaction
to run). The 601 does not accept a BG in the cycles between the
assertion of any TS or XATS and AACK.
Negation—May occur at any time to indicate the 601 cannot use the
bus. The 601 may still assume bus mastership on the bus clock cycle
of the negation of BG because during the previous cycle BG
indicated to the 601 that it was free to take mastership (if qualified).
Negated— Indicates that the 601 is not using the address bus. If ABB
is negated during the bus clock cycle following a qualified bus grant,
the 601 did not accept mastership, even if BR was asserted. This can
occur if a potential transaction is aborted internally before the
transaction is started.
Timing Comments Assertion—Occurs on the bus clock cycle following a qualified BG
that is accepted by the processor (see Negated).
Negation—May occur whenever the 601 can use the address bus.
Note that this signal is logically ORed with an internally generated address bus busy signal.
For more information, see Section 9.3.1, “Address Bus Arbitration.”
Negated—Indicates that the 601 has not detected a parity error (even
parity) on the address bus.
Timing Comments Assertion—Occurs on the second bus clock cycle after TS or XATS
is asserted.
High Impedance—Occurs on the third bus clock cycle after TS or
XATS is asserted.
XATC(0–7)=TT(0–3)||TBST||TSIZ(0–2).
TT4 is driven negated as an output on the 601 and is defined for
future expansion.
Timing Comments Assertion/Negation/High Impedance—The same as A0–A31.
8.2.4.1.2 Transfer Type (TT0–TT3)—Input
Following are the state meaning and timing comments for the TT0–TT3 input signals on
the 601.
State Meaning Asserted/Negated—Indicates the type of transfer in progress (see
Table 8-2). For I/O controller interface operations these signals form
part of the XATC and are snooped by the 601 if XATS is asserted.
Timing Comments Assertion/Negation—The same as A0–A31.
TT0 Special operations: This signal is asserted whenever a bus transaction is run in response to a
lwarx/stwcx. instruction pair, a TLBI (translation lookaside buffer invalidate) operation, or either an
eciwx or ecowx instruction.
TT1 Read (or write) operations: This signal indicates whether the transaction is a read (TT1 high) or a write
(TT1 low). This assumes that the transaction is not address-only.
TT2 Invalidate operations: When asserted with GBL, the TT2 output signal indicates that all other caches in
the system should invalidate the cache entry on a snoop hit. If the snoop hit is to a modified entry, the
sector should be copied back before being invalidated.
TT3 Address-only operations: This signal, when asserted, indicates that the data transfer is to/from memory.
External logic can synthesize a data bus request from the combined assertions of TS (or XATS) and TT3.
If TT3 is not asserted with the address, the associated bus transaction is considered to be a broadcast
operation that all potential bus masters must honor (or a reserved operation), except for the external
control functions (eciwx and ecowx) which require both address and data tenures.
1 0 0 0 — — Reserved
1 0 1 1 — — Reserved
1. These are the transactions the 601 produces for the given encodings, and may not be the same
transactions produced by other bus masters with the same encoding. For example, the encoding b'0001' is
a single-beat write coming from the 601, but another master may use this encoding for another type of
write transaction. Bus participants should use the TT pins in conjunction with the other transfer attribute
pins to determine the type of transaction.
2. Cache control operations resulting from explicit cache control instructions (for example, dcbf, sync, dcbz,
dcbi).
3. The signal encodings for these operations do not use the TT0 and TT3 signals in the manner described in
Table 8-1. Note that TT4 is reserved.
TC0 Depends on whether the current transaction is a read or write operation; therefore, TC0 should be
used with TT1. On a read operation, TC0 asserted indicates the transaction is an instruction fetch
operation; otherwise, the read operation is a data operation.
Asserting TC0 for write operations indicates the cache sector associated with a write is being
invalidated; TC0 negated indicates the cache sector associated with a write is not being invalidated.
TC1 TC1, when asserted, indicates that an operation to reload the other sector is queued; therefore, the
next bus transaction will likely be to the same page of memory. After the addressed sector in a cache
line is loaded from memory, the 601 attempts to load the other sector in the cache line. This is a low-
priority bus operation and may not be the next transaction. The assertion of TC1 suggests that the
next access may be to the same page; the hint may be wrong depending on the bus traffic/code
execution dynamics.
Negated—Indicates that the 601 will not make available the reserved
queue for a snoop hit push resulting from a transaction. This is the
“normal” mode.
Timing Comments Assertion/Negation—Must be valid through the entire address
tenure.
Note: This pin is a feature of the 601 only and will not be available in any other PowerPC
processors.
High Impedance—Indicates that the 601 does not need the snooped
address tenure to be retried.
Timing Comments Assertion—Occurs two bus cycles immediately following the
assertion of TS if a retry is required.
Z A Pipeline busy
Negated—Indicates that the 601 must hold off its data tenures.
Timing Comments Assertion—May occur any time to indicate the 601 is free to take
data bus mastership. It is not sampled until TS or XATS is asserted.
Negated—Indicates that the 601 must run the data bus tenures in the
same order as the address tenures.
Timing Comments Assertion—Must occur no later than a qualified DBG for a previous
write tenure. Do not assert if no data bus write tenures are
pending from previous address tenures.
Negation—May occur any time after a qualified DBG and before the
next assertion of DBG.
Data Bus Signals    Byte Lane
DH0–DH7             0
DH8–DH15            1
DH16–DH23           2
DH24–DH31           3
DL0–DL7             4
DL8–DL15            5
DL16–DL23           6
DL24–DL31           7

Parity Signal       Data Bus Signals Covered
DP0                 DH0–DH7
DP1                 DH8–DH15
DP2                 DH16–DH23
DP3                 DH24–DH31
DP4                 DL0–DL7
DP5                 DL8–DL15
DP6                 DL16–DL23
DP7                 DL24–DL31
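The byte-lane and parity assignments listed above follow a regular pattern, which the following sketch reproduces. The function name and the tuple layout it returns are illustrative assumptions; only the lane-to-signal mapping itself comes from the tables.

```python
def lane_signals(lane):
    """Map a data bus byte lane (0-7) to its data and parity signals.

    Returns (bus_half, first_bit, last_bit, parity_signal), for example
    ("DH", 0, 7, "DP0") for byte lane 0.
    """
    if not 0 <= lane <= 7:
        raise ValueError("byte lane must be in the range 0-7")
    bus_half = "DH" if lane < 4 else "DL"   # lanes 0-3 use DH, lanes 4-7 use DL
    first_bit = (lane % 4) * 8              # each lane carries eight data bits
    return bus_half, first_bit, first_bit + 7, "DP%d" % lane


for lane in range(8):
    print(lane, lane_signals(lane))
```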
Negation—Must occur after the bus clock cycle of the final (or only)
data beat of the transfer. For a burst transfer, the system can assert TA
for one bus clock cycle and then negate it to advance the burst
transfer to the next beat and insert wait states during the next beat.
Negation—Must occur during the bus clock cycle after a valid data
Negation—May occur any time after the minimum pulse width has
been met.
Note that systems that do not use this signal should tie it low.
Negated—The drive current for the six signals above will be the
same as all other signals for the 601.
Timing Comments Assertion/Negation—This is not a dynamic signal; it must not
change after HRESET is negated.
BSCAN_EN
(Boundary Scan Enable)
SCAN_CTL I This input signal should be driven high for normal operation.
SCAN_CLK I This input signal should be driven high for normal operation.
SCAN_SIN I This input signal should be driven low for normal operation.
ESP_EN I This input signal should be driven high for normal operation.
BSCAN_EN I This input signal should be driven high for normal operation.
(internal)
PCLK_EN D D OUT P_CLOCK
(IN)
Following are the state meaning and timing comments for the PCLK_EN signal.
State Meaning Asserted—Indicates that the 601 should generate the high phase of
the internal processor clock synchronized to 2X_PCLK.
Negated—Indicates that the 601 should generate the low phase of the
internal processor clock synchronized to 2X_PCLK.
Timing Comments Assertion—May occur one 2X_PCLK cycle after the negation of
PCLK_EN with appropriate setup to the falling edge of 2X_PCLK.
Negated—Indicates that the 601 outputs must not change state, and
the inputs will not be sampled. This signal can be treated as a
synchronous enable for the bus clock.
Timing Comments Assertion/Negation—With appropriate setup and hold time to the
2X_PCLK provided the rising edge of the internal processor clock
coincides with the 2X_PCLK.
Figure 8-4 through Figure 8-8 illustrate how the 601 clocking signals can be used to
generate a logical bus clock. Note that the resulting logical bus clock is represented as an
arrow coincident with the rising edge of the resulting signal. It should not be inferred that
the duty cycle of the bus clock signal is 50 percent.
Figure 8-4 shows how the clock inputs can be used to control the 601. Note that the signal
IN is the output of the inverter shown in Figure 8-3.
[Timing diagram: 2X_PCLK, PCLK_EN, IN, and P_CLK over cycles 0–7. Asterisks mark the delay of the inverter output.]
Figure 8-5 shows a simple 601 clock implementation with the frequency of the logical bus
clock equal to that of the P_CLK.
[Timing diagram: PCLK_EN, IN, P_CLK, and BCLK_EN. Asterisks mark the delay of the inverter output.]
Figure 8-6 shows the generation of the logical bus clock at one-half the frequency of the
P_CLK.
[Timing diagram: 2X_PCLK, PCLK_EN, IN, P_CLK, BCLK_EN, and the resulting bus transitions over cycles 0–11. Asterisks mark the delay of the inverter output.]
Figure 8-7 shows the generation of the logical bus clock at one-third the frequency of the
P_CLK.
[Timing diagram: PCLK_EN, IN, P_CLK, BCLK_EN, and the resulting bus transitions. Asterisks mark the delay of the inverter output.]
Figure 8-8 shows how the PCLK_EN signal can be manipulated to perform cycle stretching
on the 601.
[Timing diagram: 2X_PCLK, PCLK_EN, IN, P_CLK, BCLK_EN, and the resulting bus transitions over cycles 0–11.]
In this document, processor clock refers to the internal P_CLOCK signal; bus clock refers
to the clock that causes the bus transitions.
Figure 8-5 and Figure 8-6 show two examples of the generation of bus transitions. In the
first example, BCLK_EN is grounded (always asserted) and the bus clock period is
equivalent to the P_CLOCK cycle period. In the second example, the BCLK_EN input is
driven by a clock switching at PCLK_EN/2 frequency. This allows the 601 bus interface to
run at half the frequency of the CPU P_CLOCK, easing system design constraints. Note
that the BCLK_EN input can be divided further (with respect to PCLK_EN), allowing an
even greater ratio between the clock- and bus-cycle frequencies.
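The relationship between BCLK_EN and the logical bus clock can be sketched as follows. The model treats BCLK_EN as a synchronous enable sampled once per processor clock cycle, as described above; the function names, the list-of-booleans representation, and the choice of cycle 0 as the first enabled cycle are assumptions made for illustration.

```python
def divided_enable(ratio, cycles):
    """BCLK_EN pattern that is asserted once every `ratio` processor clock cycles.

    ratio = 1 models BCLK_EN grounded (always asserted).
    The phase (assertion on cycle 0) is an arbitrary choice for this sketch.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return [cycle % ratio == 0 for cycle in range(cycles)]


def bus_transitions(bclk_en_asserted):
    """Return the processor clock cycles on which a bus transition occurs."""
    return [cycle for cycle, enabled in enumerate(bclk_en_asserted) if enabled]


# Bus clock equal to the processor clock, then one-half and one-third of it.
for ratio in (1, 2, 3):
    print(ratio, bus_transitions(divided_enable(ratio, 6)))
```

With a ratio of 2 the sketch yields a bus transition on every second processor clock cycle, which corresponds to the half-frequency example described above.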
This section describes the PowerPC 601 microprocessor bus interface and its operation. It
shows how the 601 signals, defined in Chapter 8, “Signal Descriptions,” interact to perform
address and data transfers.
[Figure: PowerPC 601 block diagram. Blocks shown: RTC (RTCU, RTCL); eight-word instruction queue and instruction issue logic; IU (GPR file, XER), BPU (CTR, CR, LR), and FPU (FPR file, FPSCR); MMU (UTLB, ITLB, BAT array); 32-Kbyte instruction and data cache with tags; memory unit (read queue, write queue, snoop address); system interface. Data paths are labeled 1, 2, 4, and 8 words wide.]
qual BG_ 601 internal signal (inaccessible to the user, but used in
diagrams to clarify operations)
ADDRESS TENURE
DATA TENURE
Figure 9-3. Overlapping Tenures on the PowerPC 601 Microprocessor Bus for a
Single-Beat Transfer
The basic functions of the address and data tenures are as follows:
• Address tenure
— Arbitration: During arbitration, address bus arbitration signals are used to gain
mastership of the address bus.
— Transfer: After the 601 is the address bus master, it transfers the address on the
address bus. The address signals and the transfer attribute signals control the
address transfer. The address parity and address parity error signals ensure the
integrity of the address transfer.
— Termination: After the address transfer, the system signals that the address tenure
is complete or that it must be repeated.
• Data tenure
— Arbitration: To begin the data tenure, the 601 arbitrates for mastership of the data
bus.
— Transfer: After the 601 is the data bus master, it samples the data bus for read
operations or drives the data bus for write operations. The data parity and data
parity error signals ensure the integrity of the data transfer.
— Termination: Data termination signals are required after each data beat in a data
transfer. Note that in a single-beat transaction, the data termination signals also
indicate the end of the tenure, while in burst accesses, the data termination
signals apply to individual beats and indicate the end of the tenure only after the
final data beat.
[Timing diagram: Logical Bus Clock, need_bus, BR, bg, abb, artry, qual BG, and ABB over bus clock cycles -1 to 1.]
External arbiters must allow only one device at a time to be address bus master. In
implementations in which no other device can be a master, BG can be grounded (always
asserted) to continually grant mastership of the address bus to the 601.
[Timing diagram: need_bus, BR, bg, abb, artry, qual BG, and ABB over bus clock cycles -1 to 1.]
When the 601 receives a qualified bus grant, it assumes address bus mastership by asserting
ABB and negating the BR output signal. Meanwhile, the 601 drives the address for the
requested access onto the address bus and asserts TS to indicate the start of a new
transaction.
When designing external bus arbitration logic, note that the 601 may assert BR without
using the bus after it receives the qualified bus grant. For example, in a system using bus
snooping, if the 601 asserts BR to perform a replacement copy-back operation, another
device can invalidate that sector before the 601 is granted mastership of the bus. Once the
601 is granted the bus, it no longer needs to perform the copy-back operation; therefore, the
601 does not assert ABB and does not use the bus for the copy-back operation. Note that
the 601 asserts BR for at least one clock cycle in these instances.
[Timing diagram: qual BG, TS, ABB, ADDR+, aack, and artry_in over bus clock cycles 0–4.]
[Timing diagram: bg, TS, ABB, ADDR+, TT3, aack, and artry_in over bus clock cycles 0–4.]
The basic coherency size of the bus is defined to be 32 bytes (corresponding to one cache
sector). Data transfers that cross an aligned, 32-byte boundary either must present a new
address onto the bus at that boundary (for coherency consideration) or must operate as
noncoherent data with respect to the 601.
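The 32-byte coherency rule above amounts to a simple address check. The following is a minimal sketch; the function and constant names are hypothetical.

```python
SECTOR_BYTES = 32  # basic coherency size of the bus (one cache sector)


def crosses_sector_boundary(address, length):
    """Return True if the byte range [address, address + length) spans more
    than one aligned 32-byte block."""
    if length <= 0:
        raise ValueError("length must be positive")
    first_block = address // SECTOR_BYTES
    last_block = (address + length - 1) // SECTOR_BYTES
    return first_block != last_block


print(crosses_sector_boundary(0x1000, 32))  # aligned, exactly one sector
print(crosses_sector_boundary(0x101C, 8))   # starts at byte 28 of a sector
```

A transfer for which the check returns True either needs a new address presented at the boundary or must be treated as noncoherent with respect to the 601.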
Transfer    TSIZ(0–2)   Address Offset   Data Bus Byte Lanes
Size                    (binary)         0  1  2  3  4  5  6  7
Byte        0 0 1       000              √  —  —  —  —  —  —  —
            0 0 1       001              —  √  —  —  —  —  —  —
            0 0 1       010              —  —  √  —  —  —  —  —
            0 0 1       011              —  —  —  √  —  —  —  —
            0 0 1       100              —  —  —  —  √  —  —  —
            0 0 1       101              —  —  —  —  —  √  —  —
            0 0 1       110              —  —  —  —  —  —  √  —
            0 0 1       111              —  —  —  —  —  —  —  √
Half word   0 1 0       010              —  —  √  √  —  —  —  —
            0 1 0       100              —  —  —  —  √  √  —  —
            0 1 0       110              —  —  —  —  —  —  √  √
Word        1 0 0       000              √  √  √  √  —  —  —  —
            1 0 0       100              —  —  —  —  √  √  √  √
Notes:
√ The byte portions of the requested operand that are read or written during that bus transaction.
— These entries are not required and are ignored during read transactions and are driven with undefined
data during all write transactions (except noncacheable write transfers, in which data is mirrored on both
word lanes if the transfer does not exceed four bytes).
Data bus byte lane 0 corresponds to DH0-DH7, byte lane 7 corresponds to DL24-DL31.
The 601 also supports misaligned memory operations. These transfers address memory that
is not aligned to the size of the data being transferred (such as a word read from an odd byte
address). Although most of these operations hit in the primary cache (or generate burst
memory operations if they miss), the 601 interface supports misaligned transfers within a
double-word (64-bit aligned) boundary, as shown in Table 9-3. Note that the three-byte
transfer in Table 9-3 is only one example of misalignment. As long as the attempted transfer
does not cross a double-word boundary, the 601 can transfer the data on the misaligned
address (for example, a word read from an odd byte-aligned address, or a seven-byte read
from an odd byte-aligned address).
An attempt to address data that crosses a double-word boundary requires two bus transfers
to access the data. This is illustrated in the last example of a three-byte transfer in Table 9-3.
The transfer requires two accesses—the first for the last two bytes of one double-word
011 001 — A A A — — — —
011 010 — — A A A — — —
011 011 — — — A A A — —
011 100 — — — — A A A —
011 101 — — — — — A A A
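The double-word rule for misaligned transfers can be sketched as follows. The code splits an access of one to eight bytes into pieces that each stay inside one aligned double word and reports the byte lanes each piece occupies. The function name and return format are assumptions; the sketch models only the boundary rule and does not attempt to reproduce how the 601 encodes each piece on TSIZ0–TSIZ2.

```python
DOUBLE_WORD_BYTES = 8


def split_transfer(address, length):
    """Split an access into bus transfers that do not cross a double-word
    boundary.

    Returns a list of (start_address, byte_count, byte_lanes) tuples, where
    byte_lanes lists the data bus byte lanes (0-7) used by that transfer.
    """
    if not 1 <= length <= DOUBLE_WORD_BYTES:
        raise ValueError("this sketch handles accesses of 1 to 8 bytes")
    transfers = []
    remaining = length
    while remaining > 0:
        offset = address % DOUBLE_WORD_BYTES
        count = min(remaining, DOUBLE_WORD_BYTES - offset)
        transfers.append((address, count, list(range(offset, offset + count))))
        address += count
        remaining -= count
    return transfers


print(split_transfer(0x101, 3))  # stays inside one double word
print(split_transfer(0x106, 3))  # crosses a boundary: two transfers
```

For the three-byte access starting at offset 6, the first transfer covers the last two bytes of one double word and the second covers the first byte of the next.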
TC1 Asserted The next access is likely to be on the same page. A sector has been loaded, and a
low-priority load of the adjacent sector is queued.
Negated The next access is not likely to be on the same page; an optional low-priority
load of an adjacent sector is not queued.
[Timing diagram: ts, abb, addr, aack, ARTRY, SHD, qualBG, and ABB over bus clock cycles 1–7.]
When the data tenure begins before the address tenure is complete, if the 601 has asserted
DBB, assertion of ARTRY causes the 601 to terminate the data bus transaction and retry
both the address and data tenures later. If the transfer is a single-beat transfer and TA occurs
as early as the AACK window, there is no indication of an early data bus termination.
However, if a burst transaction is in progress, the 601 negates DBB early in response to
ARTRY. The system logic does not need to assert TA for four bus clock cycles in this case.
High impedance High impedance Exclusive. No snoop hit. Pipeline not busy.
If the SHD and ARTRY inputs are not asserted for a cache-sector fill operation, the sector
is marked as exclusive (see Section 9.4.4, “Memory Coherency—MESI Protocol”). If the
SHD input is asserted without ARTRY, the sector is marked as shared.
Note: If the invalidate (TT2) output signal is asserted for the transaction, the sector is
marked exclusive regardless of the state of the SHD signal. If ARTRY is asserted without
SHD, a device cannot service the address transaction currently (because of queuing
constraints) and the transaction is retried later. The 601 reacts to the assertion of ARTRY
the same way, regardless of the state of SHD. The timing of the SHD input is the same as
the timing for ARTRY.
One or more devices can indicate a queuing retry condition by asserting ARTRY while one
or more devices separately indicate the snoop-hit shared condition by asserting SHD. This
condition appears as a snoop hit modified condition on the bus, since both SHD and
ARTRY are asserted. This is not a problem for the 601 since ARTRY is not qualified by
SHD (that is, SHD is a don't care if ARTRY is asserted to the 601).
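The way the 601 interprets SHD and ARTRY for a cache-sector fill, as described above, can be summarized in a short decision function. This is a sketch only; the function name, argument names, and returned strings are illustrative assumptions.

```python
def fill_response(shd, artry, tt2_invalidate=False):
    """Classify the outcome of a cache-sector fill from the sampled inputs.

    shd, artry     -- True if the corresponding input is asserted
    tt2_invalidate -- True if the 601 asserted TT2 (invalidate) for the transaction
    """
    if artry:
        # ARTRY is not qualified by SHD: the transaction is retried either way.
        return "retry"
    if tt2_invalidate:
        # With TT2 asserted the sector is marked exclusive regardless of SHD.
        return "exclusive"
    return "shared" if shd else "exclusive"


print(fill_response(shd=False, artry=False))
print(fill_response(shd=True, artry=False))
print(fill_response(shd=True, artry=True))
```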
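[Timing diagram: TS, dbg, dbb, drtry, qual DBG, and DBB over bus clock cycles 0–3.]
[Timing diagram: TS, qual DBG, DBB, data, ta, drtry, TT2, and AACK over bus clock cycles 0–4.]
[Timing diagram: TS, qual DBG, DBB, data, ta, drtry, TT2, and AACK over bus clock cycles 0–3.]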
Normal termination of a burst transfer occurs when TA is asserted during four bus clock
cycles, as shown in Figure 9-12. The bus clock cycles need not be consecutive, thus
allowing pacing of the data transfer beats. For read bursts to terminate successfully, TEA
and DRTRY must remain negated during the transfer. For write bursts, TEA must remain
negated during the transfer. DRTRY is ignored during data writes.
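The normal-termination rule for bursts can be sketched as a scan over per-cycle samples. The inputs are lists of booleans (True meaning asserted), one entry per bus clock cycle; this representation and the function name are assumptions. The sketch only distinguishes normal termination from everything else and does not model what DRTRY or TEA actually do to the transfer.

```python
def burst_end_cycle(ta, tea, drtry, is_read=True, beats=4):
    """Return the index of the bus clock cycle in which the final beat is
    acknowledged, or None if the burst does not terminate normally."""
    acknowledged = 0
    for cycle, ta_asserted in enumerate(ta):
        if tea[cycle]:
            return None              # TEA must remain negated
        if is_read and drtry[cycle]:
            return None              # DRTRY must remain negated for reads
        if ta_asserted:
            acknowledged += 1        # TA cycles need not be consecutive
            if acknowledged == beats:
                return cycle
    return None


quiet = [False] * 6
print(burst_end_cycle([True, False, True, True, False, True], quiet, quiet))
```

In the usage line, TA is negated in two cycles to pace the transfer, so the fourth beat completes in the sixth sampled cycle (index 5).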
[Timing diagram: TS, qual DBG, DBB, data, ta, and drtry over bus clock cycles 1–7.]
[Timing diagram: TS, qual DBG, DBB, data, ta, and drtry over bus clock cycles 1–5.]
Figure 9-14 shows the effect of using DRTRY during a burst read. It also shows the effect
of using TA to pace the data transfer rate. Notice that in bus clock cycle 3 of Figure 9-14,
TA is negated for the second data beat. The 601 data pipeline does not proceed until bus
clock cycle 4 when the TA is reasserted.
Note that DRTRY is useful for systems that implement speculative forwarding of data such
as those with direct-mapped, second-level caches where hit/miss is determined on the
following bus clock cycle, or for parity- or ECC-checked memory systems.
Note that DRTRY may not be implemented on other PowerPC processors.
[Timing diagram: TS, qual DBG, DBB, data, ta, and drtry.]
Assertion of the TEA signal causes a machine-check exception (and possibly a check-stop
condition within the 601). For more information, see Section 5.4.2, “Machine Check
Exception (x'00200').” However, assertion of TEA does not invalidate data entering the GPR
or the cache; therefore, the 601 may act on invalid code/data (although the exception will
eventually be recognized, if enabled). Additionally, the corresponding address of the access
that caused TEA to be asserted is not latched by the 601. To recover, the 601 must be reset;
therefore, this function should only be used to flag fatal system conditions to the processor
(such as parity or uncorrectable ECC errors).
After the 601 has committed to run a transaction, that transaction must eventually complete.
Address retry causes the transaction to be restarted; TA wait states and DRTRY assertion
for reads delay termination of individual data beats. Eventually, however, the system must
either terminate the transaction or assert the TEA signal to put the 601 into checkstop mode.
For this reason, care must be taken to check for the end of physical memory and the location
of certain system facilities.
Note that TEA generates a machine-check exception depending on the ME bit in the MSR.
Setting the checkstop enable control bits properly leads to a true checkstop condition.
Note also that the 601 does not implement a synchronous error capability for memory
accesses (see Section 9.6, “Memory- vs. I/O-Mapped I/O Operations”). This means that the
exception instruction pointer does not point to the memory operation that caused the
assertion of TEA, but to the instruction about to be executed (perhaps several instructions
later).
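[Figure: MESI cache coherency state diagram with the states Invalid, Shared, Modified, and Exclusive. Transitions are labeled RH, RMS, RME, WH, WM, SHR, and SHW; transitions that involve bus transactions (including burst) are marked. On a miss, the old line is first invalidated and copied back if M.]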
000 Set 0
001 Set 1
010 Set 2
011 Set 3
100 Set 4
101 Set 5
110 Set 6
111 Set 7
[Timing diagram, bus clock cycles 1–12: BR, BG, ABB, TS, TBST, GBL, AACK, ARTRY, DBG, DBB, D0–D63 (three data-in beats), TA, DRTRY, TEA.]
[Timing diagram, bus clock cycles 1–12: BR, BG, ABB, TS, TBST, GBL, AACK, ARTRY, DBG, DBB, TA, DRTRY, TEA.]
Figure 9-18 shows three ways to delay single-beat reads using the data-delay controls:
• The TA hold-off can be used to insert wait states in clock cycles 3 and 4.
• For the second access, DBG could have been asserted in clock cycle 6.
• In the third access, DRTRY is asserted in clock cycle 11 to flush the previous data.
[Timing diagram, bus clock cycles 1–14: BR, BG, ABB, TS, TBST, GBL, AACK, ARTRY, DBG, DBB, D0–D63 (beats labeled In, In, Bad, In), TA, DRTRY, TEA.]
[Timing diagram, bus clock cycles 1–12: BR, BG, ABB, TS, TBST, GBL, AACK, ARTRY, DBG, DBB, TA, DRTRY, TEA.]
[Timing diagram, bus clock cycles 1–12, two processors: CPU A BR, CPU A BG, CPU B BR, CPU B BG, ABB, TS, TBST, GBL, AACK, ARTRY, CPU A DBG, CPU B DBG, DBB, D0–D63 (beats labeled In, Out, In), TA, DRTRY, TEA.]
[Timing diagram, bus clock cycles 1–20: BR, BG, ABB, TS, TBST, GBL, AACK, ARTRY, DBG, DBB, TA, DRTRY, TEA.]
[Timing diagram, bus clock cycles 1–17: BR, BG, ABB, TS, TBST, GBL, AACK, ARTRY, DBG, DBB, TA, DRTRY, TEA.]
For the first beat of the address bus, the extended address transfer code (XATC) contains
the I/O opcode as shown in Table 9-7; the opcode is formed by concatenating the transfer
type, transfer burst, and transfer size signals defined as follows:
XATC = TT(0–3)||TBST||TSIZ(0–2)
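The concatenation above can be illustrated with a pair of packing helpers. The sketch assumes the usual PowerPC bit numbering, in which bit 0 is the most significant bit, so TT0 lands in the most significant position of the 8-bit code. The function names are hypothetical, and the values are logical bit values rather than pin voltage levels.

```python
def pack_xatc(tt, tbst, tsiz):
    """Form XATC(0-7) = TT(0-3) || TBST || TSIZ(0-2).

    tt   -- 4-bit value, TT0 as the most significant bit
    tbst -- single bit
    tsiz -- 3-bit value, TSIZ0 as the most significant bit
    """
    if not (0 <= tt <= 0xF and tbst in (0, 1) and 0 <= tsiz <= 0x7):
        raise ValueError("field value out of range")
    return (tt << 4) | (tbst << 3) | tsiz


def unpack_xatc(xatc):
    """Split an 8-bit XATC value back into (tt, tbst, tsiz)."""
    if not 0 <= xatc <= 0xFF:
        raise ValueError("XATC is an 8-bit value")
    return (xatc >> 4) & 0xF, (xatc >> 3) & 0x1, xatc & 0x7


code = pack_xatc(0b1010, 1, 0b011)
print(format(code, "08b"), unpack_xatc(code))
```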
[Figures: bit-field layouts of the XATC (bits 0–7) and the address bus (bits 0–31) for I/O controller interface operations. Labeled fields include the I/O opcode, BUID, PID, key bit, reserved bits, SR(28–31), the bus address, the byte count, and a BUC-specific field.]
0–1 Reserved. These bits should be set to zero for compatibility with future PowerPC microprocessors.
3–11 BUID. Sender tag of a reply operation. Corresponds with bits 3–11 of one of the 601 segment
registers.
12–27 Address bits 12–27 are BUC-specific and are ignored by the 601.
28–31 PID (receiver tag). The 601 effectively snoops operations on the bus and, on reply operations,
compares this field to bits 28–31 of the PID register to determine if it should recognize this I/O reply.
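The field positions listed above can be extracted with a generic bit-field helper. The sketch assumes the PowerPC convention that bit 0 is the most significant bit of the 32-bit address; the function names and the dictionary keys are illustrative assumptions.

```python
def field(word, first_bit, last_bit):
    """Extract bits first_bit..last_bit (bit 0 = most significant) of a
    32-bit word."""
    width = last_bit - first_bit + 1
    return (word >> (31 - last_bit)) & ((1 << width) - 1)


def decode_reply_address(address):
    """Split the first address beat of an I/O reply into its fields."""
    return {
        "buid": field(address, 3, 11),           # sender tag
        "buc_specific": field(address, 12, 27),  # ignored by the 601
        "pid": field(address, 28, 31),           # receiver tag
    }


def reply_matches_pid(address, pid_register):
    """Compare the receiver tag with bits 28-31 of the PID register."""
    return field(address, 28, 31) == field(pid_register, 28, 31)


sample = (0x1A5 << 20) | (0xBEEF << 4) | 0x7   # made-up address value
print(decode_reply_address(sample), reply_matches_pid(sample, 0x7))
```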
The second beat of the address bus is reserved; the XATC and address buses should be
driven to zero to preserve compatibility with future protocol enhancements.
The following sequence occurs when the 601 detects an error bit set on an I/O reply
operation:
1. The 601 completes the instruction that initiated the access.
2. If the instruction is a load, the data is forwarded onto the register file(s)/sequencer.
3. An I/O controller interface error exception is generated, which transfers 601 control
to the I/O controller interface error exception handler to recover from the error. Refer
to Section 5.4.10, “I/O Controller Interface Error Exception (x'00A00'),” for more
information.
If the error bit is not set, the 601 instruction that initiated the access completes and
instruction execution resumes.
[Timing diagram: ABB, XATS, ADDR and XATC (three PKT 0/PKT 1 pairs followed by Reply and Rsrvd), DBB, DH0–DH31, TA.]
Figure 9-28 shows an I/O store access, comprised of three I/O controller interface
operations in this example. As with the example in Figure 9-27, notice that data is
transferred only on the 32 bits of the DH bus. As opposed to Figure 9-27, there is no request
operation since the 601 has the data ready for the BUC.
The TEA signal may be asserted on any I/O controller interface operation. If it is asserted,
the processor enters a checkstop condition if MSR[ME] is cleared, or it will queue a
machine check exception if ME is set. After TEA is asserted, it must be reasserted for all
tenures associated with the current I/O controller interface operation until the load last or
store last operation occurs. When the operation occurs, the execution unit is released to take
the machine check exception. If the TEA signal is asserted for an I/O controller interface
operation, the reply operations (store reply or load reply) must not occur. If one does, it causes
a checkstop condition. If the TEA signal is not asserted with each tenure of a given I/O
controller interface operation, the result of the assertion of TEA is unpredictable. The 601
may take a machine check exception or cause a checkstop condition.
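The TEA rules for I/O controller interface operations described above can be condensed into a decision sketch. The per-tenure list of booleans, the function name, and the returned strings are assumptions. The sketch also assumes that "all tenures" means every tenure from the first TEA assertion onward.

```python
def io_tea_outcome(tea_per_tenure, msr_me, reply_occurred=False):
    """Classify the result of TEA assertions during one I/O controller
    interface operation.

    tea_per_tenure -- list of booleans, True where TEA was asserted
    msr_me         -- value of MSR[ME]
    reply_occurred -- True if a load reply or store reply followed anyway
    """
    asserted_at = [i for i, tea in enumerate(tea_per_tenure) if tea]
    if not asserted_at:
        return "no error signaled"
    if not all(tea_per_tenure[asserted_at[0]:]):
        # TEA was dropped before the load last or store last operation.
        return "unpredictable"
    if reply_occurred:
        return "checkstop"
    return "machine check exception" if msr_me else "checkstop"


print(io_tea_outcome([True, True], msr_me=True))
print(io_tea_outcome([True, False], msr_me=True))
```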
[Timing diagram: ABB, XATS, ADDR and XATC (two PKT 0/PKT 1 pairs followed by Reply and Rsrvd), DBB, DH0–DH31, TA.]
9.7.2 Checkstops
The 601 has two checkstop signals, an input (CKSTP_IN) and an output (CKSTP_OUT).
If CKSTP_IN is asserted, the 601 halts operations by gating off all internal clocks. The 601
does not assert CKSTP_OUT if CKSTP_IN is asserted.
If CKSTP_OUT is asserted, the 601 has checkstopped internally. The CKSTP_OUT signal
can be asserted for various reasons including receiving a TEA signal, as the result of the
lack of an instruction dispatch, or internal and external parity errors. For more information
on checkstop state, refer to Section 5.4.2.2, “Checkstop State (MSR[ME] = 0).”
Note that checkstop conditions can be disabled by setting bits in the HID0 register. For
information, see Section 2.3.3.13.1, “Checkstop Sources and Enables Register—HID0.”
SCAN_OUT TDO 78
The BSCAN_EN input pin must be driven low to enable the boundary-scan test mode.
Additionally, the BSCAN_EN input must be pulled up when boundary-scan testing is not
being performed. The addition of this pin is a deviation from the IEEE 1149.1 specification
which only defines five pins for the test interface.
[Timing diagram: PCLK_EN shown against three TCK waveforms labeled Normal, –20%, and +20%.]
Typical PCLK_EN frequency will be 50–66 MHz if supplied by the clock circuits on the
unit under test. This allows TCK frequencies of 10–13 MHz. In most IEEE 1149.1 testing
scenarios, the TCK frequency is likely to be much less than 20% of the PCLK_EN
frequency. When PCLK_EN and 2X_PCLK are provided by the automatic test equipment,
it is acceptable to run these clocks at a much lower frequency than they would normally run.
For example, if PCLK_EN = 5 MHz and 2X_PCLK = 10 MHz, then TCK must be ≤ 1
MHz.
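The frequency figures above can be checked with a short calculation. The following is a minimal sketch, assuming the limit is TCK ≤ 20% of the PCLK_EN frequency, as the example values in the text imply. The function name `max_tck_hz` is hypothetical and is not part of any 601 tool.

```python
def max_tck_hz(pclk_en_hz: float, ratio: float = 0.20) -> float:
    """Upper bound on the TCK frequency for a given PCLK_EN frequency.

    Assumption: TCK must not exceed 20% of PCLK_EN (ratio = 0.20).
    """
    return pclk_en_hz * ratio


# Frequencies mentioned in the text: 5 MHz (test equipment), 50 and 66 MHz (typical).
for pclk_en in (5e6, 50e6, 66e6):
    limit = max_tck_hz(pclk_en)
    print(f"PCLK_EN = {pclk_en / 1e6:g} MHz -> TCK <= {limit / 1e6:g} MHz")
```

With PCLK_EN supplied by slow test equipment, the same ratio applies, so the TCK limit scales down with it.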
The timing relationships between the IEEE 1149.1 signals are shown in Figure 9-30 and the
signal timing requirements are given in Table 9-10. All the timing parameters are specified
as a portion of the PCLK_EN cycle time.
[Figure 9-30: IEEE 1149.1 signal timing diagram. Signals shown: TCK, TDI or TMS, TDO. Labeled intervals: 3, 4, 5.]
Table 9-10 provides signal timing requirements for the IEEE 1149.1 interface for the
signals shown in Figure 9-30.
Table 9-10. IEEE 1149.1 Signal Timing Requirements
Label Characteristic Minimum Maximum
3 Input setup time for TDI and TMS One PCLK_EN period
4 Input hold time for TDI and TMS One PCLK_EN period
BYPASS 111
EXTEST 000
SAMPLE/PRELOAD 101
When using the EXTEST instruction, note that no stable logic levels will be held on the
outputs while in the SHIFT DR state. The 601 outputs are forced to the high impedance
state while the TAP controller is in the SHIFT DR state. The 601 outputs will be enabled if
a valid instruction is in the instruction register and the TAP controller is in the UPDATE DR
or UPDATE IR state. This is a deviation from the IEEE 1149.1 standard, which requires
outputs to be held valid while in the SHIFT DR state.
18 A0 I/O 141/108
19 A1 I/O 142/109
21 A2 I/O 143/110
22 A3 I/O 144/111
23 A4 I/O 145/112
26 A5 I/O 146/113
27 A6 I/O 147/114
28 A7 I/O 148/115
30 A8 I/O 149/116
31 A9 I/O 150/117
Note that the internal signal JTAGEN shown at the end of Table 9-12 is used to control the
direction of all the bidirectional pins of the 601. JTAGEN is bit position 420 in the
boundary-scan chain and needs to be set by the user when EXTEST operation is desired.
Setting the JTAGEN bit to 1 places all the 601 bidirectional I/O pins in the output enabled
(drive) mode. When JTAGEN is cleared to 0 all the 601 bidirectional I/O pins are set to the
input (receive) mode.
[Figure: two-processor bus timing diagram, columns numbered 1–18. Signals shown: CPU A BR, CPU A BG, CPU B BR, CPU B BG, ABB, TS, TBST, GBL, AACK, ARTRY, CPU A DBG, CPU A DBWO, CPU B DBG, TA, DRTRY, TEA.]
* Indicates the 601 flushed this data due to address retry
BD (16–29) Immediate field specifying a 14-bit signed two's complement branch displacement that is
concatenated on the right with b'00' and sign-extended to 32 bits.
BI (11–15) Field used to specify a bit in the CR to be used as the condition of a branch conditional
instruction
BO (6–10) Field used to specify options for the branch conditional instructions. The encoding is described in
Section 3.6.2, “Conditional Branch Control.”
crbD (6–10) Field used to specify a bit in the CR or in the FPSCR as the destination of the result of an
instruction
crfD (6–8) Field used to specify one of the CR fields or one of the FPSCR fields as a destination
crfS (11–13) Field used to specify one of the CR fields or one of the FPSCR fields as a source
CRM (12–19) Field mask used to identify the CR fields that are to be updated by the mtcrf instruction
d (16–31) Immediate field specifying a 16-bit signed two's complement integer that is sign-extended to 32
bits
FM (7–14) Field mask used to identify the FPSCR fields that are to be updated by the mtfsf instruction
IMM (16–19) Immediate field used as the data to be placed into a field in the FPSCR
LI (6–29) Immediate field specifying a 24-bit, signed two's complement integer that is concatenated on the
right with b'00' and sign-extended to 32 bits
MB (21–25) and Fields used in rotate instructions to specify a 32-bit mask consisting of “1” bits from bit MB+32
ME (26–30) through bit ME+32 inclusive, and “0” bits elsewhere, as described in Section 3.3.4, “Integer
Rotate and Shift Instructions”.
NB (16–20) Field used to specify the number of bytes to move in an immediate string load or store
OE (21) Used for extended arithmetic to enable setting OV and SO in the XER
SPR (11–20) Field used to specify a special purpose register for the mtspr and mfspr instructions. The
encoding is described in Section 3.7.2, “Move to/from Special-Purpose Register Instructions.”
TO (6–10) Field used to specify the conditions on which to trap. The encoding is described in Section 3.6.9,
“Trap Instructions and Mnemonics.”
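The field positions above use PowerPC bit numbering, in which bit 0 is the most significant bit of the 32-bit instruction word. The following is a minimal sketch of how the BD and LI fields could be extracted and turned into byte displacements. The helper names (`field`, `sign_extend`, `branch_displacement`, `li_displacement`) are hypothetical and chosen only for illustration.

```python
def field(word: int, start: int, end: int) -> int:
    """Extract bits start..end (inclusive) of a 32-bit word, bit 0 being the MSB."""
    width = end - start + 1
    return (word >> (31 - end)) & ((1 << width) - 1)


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement integer."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def branch_displacement(word: int) -> int:
    """BD (16-29) concatenated on the right with b'00', then sign-extended."""
    return sign_extend(field(word, 16, 29) << 2, 16)


def li_displacement(word: int) -> int:
    """LI (6-29) concatenated on the right with b'00', then sign-extended."""
    return sign_extend(field(word, 6, 29) << 2, 26)


# Hypothetical instruction word: primary opcode 16 with BD = all ones.
word = (16 << 26) | (0x3FFF << 2)
print(field(word, 0, 5), branch_displacement(word))
```

Appending b'00' before sign extension keeps every displacement a multiple of four bytes, which matches the word alignment of instructions.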
← Assignment
∗ Multiplication
+ Two’s-complement addition
|| Used to describe the concatenation of two values (i.e., 010 || 111 is the same as 010111)
(rA|0) The contents of rA if the rA field has the value 1–31, or the value 0 if the rA field is 0
. (period) As the last character of an instruction mnemonic, a period (.) means that the instruction
updates the condition register field.
DOUBLE(x) Result of converting x from floating-point single format to floating-point double format.
ROTL[32](x, y) Result of rotating the 64-bit value x||x left y positions, where x is 32 bits long
SINGLE(x) Result of converting x from floating-point double format to floating-point single format
(n)x The replication of x, n times (i.e., x concatenated to itself n-1 times). (n)0 and (n)1 are
special cases
Undefined An undefined value. The value may vary from one implementation to another, and from
one execution to another on the same implementation.
Characterization Reference to the setting of status bits, in a standard way that is explained in the text
CIA Current instruction address, which is the 32-bit address of the instruction being
described by a sequence of pseudocode. Used by relative branches to set the next
instruction address (NIA). Does not correspond to any architected register.
NIA Next instruction address, which is the 32-bit address of the next instruction to be
executed (the branch destination) after a successful branch. In pseudocode, a
successful branch is indicated by assigning a value to NIA. For instructions which do not
branch, the next instruction address is CIA + 4.
Do Do loop, indenting shows range. “To” and/or “by” clauses specify incrementing an
iteration variable, and “while” and/or “until” clauses give termination conditions, in the
usual manner.
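Several of the notation entries above map directly onto integer operations. The following is a minimal sketch of ROTL[32](x, y), the concatenation operator ||, and the replication operator (n)x, using Python integers with explicit bit widths. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x: int, y: int) -> int:
    """ROTL[32](x, y): rotate the 32-bit value x left by y positions."""
    x &= MASK32
    y %= 32
    return ((x << y) | (x >> (32 - y))) & MASK32


def concat(a: int, b: int, b_bits: int) -> int:
    """a || b, where b is b_bits wide."""
    return (a << b_bits) | b


def replicate(x: int, x_bits: int, n: int) -> int:
    """(n)x: the x_bits-wide value x repeated n times."""
    result = 0
    for _ in range(n):
        result = (result << x_bits) | x
    return result


print(bin(concat(0b010, 0b111, 3)))   # the 010 || 111 example from the table
print(hex(rotl32(0x12345678, 8)))
print(bin(replicate(1, 1, 4)))
```

Rotating x||x as a 64-bit value and keeping one 32-bit half gives the same result as a plain 32-bit rotate, which is what `rotl32` computes.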
∗, ÷ Left to right
|| Left to right
| Left to right
– (range) None
← None
Pseudocode description of instruction operation:
rD ← (rA) + (rB)
Text description of instruction operation:
The sum (rA) + (rB) is placed into rD.
Registers altered by instruction:
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
• XER:
Affected: SO, OV (if OE=1)
Note in Figure 10-1 that the execution unit that executes the instruction may not be the same
for other PowerPC processors.
Reserved
31 D A 00000 OE 360 Rc
0 5 6 10 11 15 16 20 21 22 30 31
31 D A B OE 266 Rc
0 5 6 10 11 15 16 20 21 22 30 31
rD ← (rA) + (rB)
The sum (rA) + (rB) is placed into rD.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
• XER:
Affected: SO, OV (if OE=1)
31 D A B OE 10 Rc
0 5 6 10 11 15 16 20 21 22 30 31
rD ← (rA) + (rB)
The sum (rA) + (rB) is placed into rD.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
• XER:
Affected: CA
Affected: SO, OV (if OE=1)
31 D A B OE 138 Rc
0 5 6 10 11 15 16 20 21 22 30 31
addi rD,rA,SIMM
[POWER mnemonic: cal]
14 D A SIMM
0 5 6 10 11 15 16 31
addic rD,rA,SIMM
[POWER mnemonic: ai]
12 D A SIMM
0 5 6 10 11 15 16 31
rD ← (rA) + EXTS(SIMM)
The sum (rA) + SIMM is placed into rD.
Other registers altered:
• XER:
Affected: CA
addic. rD,rA,SIMM
[POWER mnemonic: ai.]
13 D A SIMM
0 5 6 10 11 15 16 31
rD ← (rA) + EXTS(SIMM)
The sum (rA) + SIMM is placed into rD.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO
• XER:
Affected: CA
addis rD,rA,SIMM
[POWER mnemonic: cau]
15 D A SIMM
0 5 6 10 11 15 16 31
Reserved
31 D A 00000 OE 234 Rc
0 5 6 10 11 15 16 20 21 22 30 31
rD ← (rA) + XER[CA] - 1
The sum (rA)+XER[CA]+x'FFFFFFFF' is placed into rD.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
• XER:
Affected: CA
Affected: SO, OV (if OE=1)
Reserved
31 D A 00000 OE 202 Rc
0 5 6 10 11 15 16 20 21 22 30 31
rD ← (rA) + XER[CA]
The sum (rA)+XER[CA] is placed into rD.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
• XER:
Affected: CA
Affected: SO, OV (if OE=1)
31 S A B 28 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A B 60 Rc
0 5 6 10 11 15 16 20 21 30 31
rA ← (rS) & ¬ (rB)
The contents of rS is ANDed with the one’s complement of the contents of rB and the result
is placed into rA.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
andi. rA,rS,UIMM
[POWER mnemonic: andil.]
28 S A UIMM
0 5 6 10 11 15 16 31
andis. rA,rS,UIMM
[POWER mnemonic: andiu.]
29 S A UIMM
0 5 6 10 11 15 16 31
18 LI AA LK
0 5 6 29 30 31
16 BO BI BD AA LK
0 5 6 10 11 15 16 29 30 31
Reserved
19 BO BI 00000 528 LK
0 5 6 10 11 15 16 20 21 30 31
Reserved
19 BO BI 00000 16 LK
0 5 6 10 11 15 16 20 21 30 31
clcs rD,rA
Reserved
31 D A 00000 531 Rc
0 5 6 10 11 15 16 20 21 30 31
00xxx Undefined
010xx Undefined
1xxxx Undefined
cmp crfD,L,rA,rB
Reserved
31 crfD 0 L A B 0 0
0 5 6 8 9 10 11 15 16 20 21 30 31
a ← (rA)
b ← (rB)
if a < b then c ← b'100'
else if a > b then c ← b'010'
else c ← b'001'
CR[4∗crfD–4∗crfD+3] ← c || XER[SO]
The contents of rA is compared with the contents of rB, treating the operands as signed
integers. The result of the comparison is placed into CR Field crfD.
The L operand controls whether the instruction operands are treated as 64- or 32-bit
operands, with L=0 indicating 32-bit operands and L=1 indicating 64-bit operands. The
state of the L operand does not affect the operation of the 601.
Other registers altered:
• Condition Register (CR Field specified by operand crfD):
Affected: LT, GT, EQ, SO
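The compare pseudocode above can be traced with a small model. The following is a minimal sketch, assuming registers are held as unsigned 32-bit Python integers and the condition register as a 32-bit integer with CR field 0 in the most significant nibble. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Interpret a 32-bit register image as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def cmp_field(ra: int, rb: int, xer_so: int) -> int:
    """Return the 4-bit value c || XER[SO] for a signed compare."""
    a, b = to_signed32(ra), to_signed32(rb)
    if a < b:
        c = 0b100      # LT
    elif a > b:
        c = 0b010      # GT
    else:
        c = 0b001      # EQ
    return (c << 1) | (xer_so & 1)


def write_cr_field(cr: int, crfd: int, value: int) -> int:
    """CR[4*crfD .. 4*crfD+3] <- value, with bit 0 as the MSB of CR."""
    shift = 28 - 4 * crfd
    return ((cr & ~(0xF << shift)) | ((value & 0xF) << shift)) & MASK32


# x'FFFFFFFF' is -1 as a signed operand, so it compares less than 1.
cr = write_cr_field(0, 7, cmp_field(0xFFFFFFFF, 1, 0))
print(f"{cr:#010x}")
```

Treating the same operands as unsigned would reverse this result, which is the difference between the signed and logical compare instructions.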
cmpi crfD,L,rA,SIMM
Reserved
11 crfD 0 L A SIMM
0 5 6 8 9 10 11 15 16 31
a ← (rA)
if a < EXTS(SIMM) then c ← b'100'
else if a > EXTS(SIMM) then c ← b'010'
else c ← b'001'
CR[4∗crfD–4∗crfD+3] ← c || XER[SO]
The contents of rA is compared with the sign-extended value of the SIMM field, treating
the operands as signed integers. The result of the comparison is placed into CR Field crfD.
The L operand controls whether the instruction operands are treated as 64- or 32-bit
operands, with L=0 indicating 32-bit operands and L=1 indicating 64-bit operands. The
state of the L operand does not affect the operation of the 601.
Other registers altered:
• Condition Register (CR Field specified by operand crfD):
Affected: LT, GT, EQ, SO
cmpl crfD,L,rA,rB
Reserved
31 crfD 0 L A B 32 0
0 5 6 8 9 10 11 15 16 20 21 30 31
a ← (rA)
b ← (rB)
if a <U b then c ← b'100'
else if a >U b then c ← b'010'
else c ← b'001'
CR[4∗crfD–4∗crfD+3] ← c || XER[SO]
The contents of rA is compared with the contents of rB, treating the operands as unsigned
integers. The result of the comparison is placed into CR Field crfD.
The L operand controls whether the instruction operands are treated as 64- or 32-bit
operands, with L=0 indicating 32-bit operands and L=1 indicating 64-bit operands. The
state of the L operand does not affect the operation of the 601.
Other registers altered:
• Condition Register (CR Field specified by operand crfD):
Affected: LT, GT, EQ, SO
cmpli crfD,L,rA,UIMM
Reserved
10 crfD 0 L A UIMM
0 5 6 8 9 10 11 15 16 31
a ← (rA)
if a <U ((16)0 || UIMM) then c ← b'100'
else if a >U ((16)0 || UIMM) then c ← b'010'
else c ← b'001'
CR[4∗crfD–4∗crfD+3] ← c || XER[SO]
The contents of rA is compared with x'0000' || UIMM, treating the operands as unsigned
integers. The result of the comparison is placed into CR Field crfD.
The L operand controls whether the instruction operands are treated as 64- or 32-bit
operands, with L=0 indicating 32-bit operands and L=1 indicating 64-bit operands. The
state of the L operand does not affect the operation of the 601.
Other registers altered:
• Condition Register (CR Field specified by operand crfD):
Affected: LT, GT, EQ, SO
Reserved
31 S A 00000 26 Rc
0 5 6 10 11 15 16 20 21 30 31
n←0
do while n < 32
if rS[n]=1 then leave
n ← n+1
rA ← n
A count of the number of consecutive zero bits starting at bit 0 of rS is placed into rA. This
number ranges from 0 to 32, inclusive.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
For count leading zeros instructions, if Rc=1 then LT is cleared to zero in the CR0 field.
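The count-leading-zeros loop above translates almost line for line. The following is a minimal sketch, assuming rS is passed as an unsigned 32-bit integer and that bit 0 is the most significant bit. The function name `cntlz32` is hypothetical.

```python
def cntlz32(rs: int) -> int:
    """Count consecutive zero bits starting at bit 0 (the MSB) of a 32-bit value."""
    rs &= 0xFFFFFFFF
    n = 0
    while n < 32:
        if (rs >> (31 - n)) & 1:   # rS[n] = 1, so leave the loop
            break
        n += 1
    return n


for value in (0x00000000, 0x00000001, 0x00010000, 0x80000000):
    print(f"{value:#010x} -> {cntlz32(value)}")
```

Because the result is always in the range 0 to 32, it is never negative, which is consistent with LT being cleared when Rc=1.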
crand crbD,crbA,crbB
Reserved
crandc crbD,crbA,crbB
Reserved
creqv crbD,crbA,crbB
Reserved
crnand crbD,crbA,crbB
Reserved
crnor crbD,crbA,crbB
Reserved
cror crbD,crbA,crbB
Reserved
crorc crbD,crbA,crbB
Reserved
crxor crbD,crbA,crbB
Reserved
dcbf rA,rB
Reserved
31 00000 A B 86 0
0 5 6 10 11 15 16 20 21 30 31
dcbi rA,rB
Reserved
31 00000 A B 470 0
0 5 6 10 11 15 16 20 21 30 31
dcbst rA,rB
Reserved
31 00000 A B 54 0
0 5 6 10 11 15 16 20 21 30 31
dcbt rA,rB
Reserved
31 00000 A B 278 0
0 5 6 10 11 15 16 20 21 30 31
dcbtst rA,rB
Reserved
31 00000 A B 246 0
0 5 6 10 11 15 16 20 21 30 31
dcbz rA,rB
[POWER mnemonic: dclz]
Reserved
31 00000 A B 1014 0
0 5 6 10 11 15 16 20 21 30 31
31 D A B OE 331 Rc
0 5 6 10 11 15 16 20 21 22 30 31
31 D A B OE 363 Rc
0 5 6 10 11 15 16 20 21 22 30 31
31 D A B OE 491 Rc
0 5 6 10 11 15 16 20 21 22 30 31
dividend ←(rA)
divisor ←(rB)
rD ← dividend ÷ divisor
Register rA is the 32-bit dividend. Register rB is the 32-bit divisor. A 32-bit quotient is
formed and placed into rD. The remainder is not supplied as a result.
Both operands are interpreted as signed integers. The quotient is the unique signed integer
that satisfies the following:
dividend=(quotient times divisor)+r
where
0≤ r < |divisor|
if the dividend is non-negative, and
–|divisor| < r ≤ 0
if the dividend is negative.
If an attempt is made to perform any of the divisions
x'8000 0000' / –1
<anything> / 0
then the contents of rD are undefined as are (if Rc=1) the contents of the LT, GT, and EQ
bits of the CR0 field. In these cases, if OE=1 then OV is set to 1.
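The remainder conditions above describe a quotient that is truncated toward zero. The following is a minimal sketch of that rule, assuming registers are unsigned 32-bit Python integers. It returns `None` for the two cases the text calls undefined, which is a modeling choice for this sketch and not 601 behavior. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Interpret a 32-bit register image as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def signed_divide32(ra: int, rb: int):
    """Signed 32-bit quotient, truncated toward zero; None where rD is undefined."""
    dividend, divisor = to_signed32(ra), to_signed32(rb)
    if divisor == 0 or (dividend == -(1 << 31) and divisor == -1):
        return None   # x'8000 0000' / -1 and <anything> / 0
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    return quotient & MASK32


q = signed_divide32(-7 & MASK32, 2)
r = -7 - to_signed32(q) * 2
print(to_signed32(q), r)
```

Python's own `//` operator rounds toward negative infinity, so the sketch divides magnitudes and restores the sign afterward to satisfy the remainder bounds given in the text.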
31 D A B OE 459 Rc
0 5 6 10 11 15 16 20 21 22 30 31
dividend ← (rA)
divisor ← (rB)
rD ← dividend ÷ divisor
The dividend is the contents of rA. The divisor is the contents of rB. A 32-bit quotient is
formed and placed into rD. The remainder is not supplied as a result.
Both operands are interpreted as unsigned integers. The quotient is the unique unsigned
integer that satisfies the following:
dividend=(quotient ∗ divisor)+r
where
0≤ r < divisor.
If an attempt is made to divide by zero, the contents of rD are undefined as are (if Rc=1)
the contents of the LT, GT, and EQ bits of the CR0 field. In this case, if OE=1 then OV is
set to 1.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
• XER:
Affected: SO, OV (if OE=1)
The 32-bit signed remainder of dividing rA by rB can be computed as follows, except in
the case that rA=–2^31 and rB=–1.
divwu rD,rA,rB # rD=quotient
mull rD,rD,rB # rD=quotient∗divisor
subf rD,rD,rA # rD=remainder
31 D A B OE 264 Rc
0 5 6 10 11 15 16 20 21 22 30 31
dozi rD,rA,SIMM
9 D A SIMM
0 5 6 10 11 15 16 31
eciwx rD,rA,rB
Reserved
31 D A B 310 0
0 5 6 10 11 15 16 20 21 30 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+(rB)
if EAR[E]=1 then
paddr ← address translation of EA
send load request for paddr to device identified by EAR[RID]
rD ← word from device
else
DSISR[11] ← 1
generate data access exception
EA is the sum (rA|0)+(rB).
If EAR[E]=1, a load request for the physical address corresponding to EA is sent to the
device identified by EAR[RID], bypassing the cache. The word returned by the device is
placed in rD. The EA sent to the device must be word aligned, or the results will be
boundedly undefined.
If EAR[E]=0, a data access exception is taken, with bit 11 of DSISR set to 1.
The eciwx instruction is supported for effective addresses that reference ordinary
(SR[T]=0) segments, and for EAs mapped by the BAT registers. The eciwx instruction
supports EAs generated when MSR[DT]=0 and MSR[DT]=1 when executed by the 601,
while the PowerPC architecture only supports EAs generated when MSR[DT]=1. The
instruction is treated as a no-op for EAs that correspond to I/O controller interface
(SR[T]=1) segments.
The access caused by this instruction is treated as a load from the location addressed by EA
with respect to protection and reference and change recording.
This instruction is defined as an optional instruction by the PowerPC architecture, and may
not be available in all PowerPC implementations.
Other registers altered:
• None
ecowx rS,rA,rB
Reserved
31 S A B 438 0
0 5 6 10 11 15 16 20 21 30 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+(rB)
if EAR[E]=1 then
paddr ← address translation of EA
send store request for paddr to device identified by EAR[RID]
send rS to device
else
DSISR[11] ← 1
generate data access exception
EA is the sum (rA|0)+(rB).
If EAR[E]=1, a store request for the physical address corresponding to EA and the contents
of rS are sent to the device identified by EAR[RID], bypassing the cache. The EA sent to
the device must be word aligned, or the results will be boundedly undefined.
If EAR[E]=0, a data access exception is taken, with bit 11 of DSISR set to 1.
The ecowx instruction is supported for effective addresses that reference ordinary
(SR[T]=0) segments, and for EAs mapped by the BAT registers. The ecowx instruction
supports EAs generated when MSR[DT]=0 and MSR[DT]=1 when executed by the 601,
while the PowerPC architecture only supports EAs generated when MSR[DT]=1. The
instruction is treated as a no-op for EAs that correspond to I/O controller interface
(SR[T]=1) segments. The access caused by this instruction is treated as a store to the
location addressed by EA with respect to protection and reference and change recording.
This instruction is defined as an optional instruction by the PowerPC architecture, and may
not be available in all PowerPC implementations.
Other registers altered:
• None
Reserved
The eieio instruction provides an ordering function for the effects of load and store
instructions executed by a given processor. Executing an eieio instruction ensures that all
memory accesses previously initiated by the given processor are complete with respect to
main memory before any memory accesses subsequently initiated by the given processor
access main memory.
The synchronize (sync) and the enforce in-order execution of I/O (eieio) instructions are
handled in the same manner internally to the 601. These instructions delay execution of
subsequent instructions until all previous instructions have completed to the point that they
can no longer cause an exception, all previous memory accesses are performed globally,
and the sync or eieio operation is broadcast onto the 601 bus interface.
eieio orders loads/stores to caching inhibited memory and stores to write-through required
memory.
Other registers altered:
• None
The eieio instruction is intended for use only in performing memory-mapped I/O
operations and to prevent load/store combining operations in main memory. It can be
thought of as placing a barrier into the stream of memory accesses issued by a processor,
such that any given memory access appears to be on the same side of the barrier to both the
processor and the I/O device.
The eieio instruction may complete before previously initiated memory accesses have been
performed with respect to other processors and mechanisms.
31 S A B 284 Rc
0 5 6 10 11 15 16 20 21 30 31
rA ← ((rS) ≡ (rB))
The contents of rS is XORed with the contents of rB and the complemented result is placed
into rA.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
Reserved
31 S A 00000 954 Rc
0 5 6 10 11 15 16 20 21 30 31
S ← rS[24]
rA[24–31] ← rS[24–31]
rA[0–23] ← (24)S
The contents of rS[24–31] are placed into rA[24–31]. Bit 24 of rS is placed into rA[0–23].
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
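The byte sign extension above replicates bit 24 of rS into the upper 24 bits of the result. The following is a minimal sketch, assuming rS is an unsigned 32-bit Python integer. The function name `extend_sign_byte` is hypothetical.

```python
def extend_sign_byte(rs: int) -> int:
    """rA[24-31] <- rS[24-31]; rA[0-23] <- (24)S, where S = rS[24]."""
    low = rs & 0xFF                 # rS[24-31], the least significant byte
    s = (low >> 7) & 1              # rS[24], the sign bit of that byte
    upper = 0xFFFFFF00 if s else 0  # (24)S placed in rA[0-23]
    return upper | low


print(f"{extend_sign_byte(0x00000080):#010x}")
print(f"{extend_sign_byte(0x1234567F):#010x}")
```

The upper 24 bits of rS do not affect the result. Only the low byte is used.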
Reserved
31 S A 00000 922 Rc
0 5 6 10 11 15 16 20 21 30 31
S ← rS[16]
rA[16–31]← rS[16–31]
rA[0–15] ← (16)S
The contents of rS[16–31] are placed into rA[16–31]. Bit 16 of rS is placed into rA[0–15].
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
Reserved
63 D 00000 B 264 Rc
0 5 6 10 11 15 16 20 21 30 31
The contents of frB with bit 0 cleared to zero is placed into frD.
Other registers altered:
• Condition Register (CR1 Field):
Affected: FX, FEX, VX, OX (if Rc=1)
Reserved
63 D A B 00000 21 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
The floating-point operand in frA is added to the floating-point operand in frB. If the most
significant bit of the resultant significand is not a one, the result is normalized. The result
is rounded to the target precision under control of the floating-point rounding control field
RN of the FPSCR and placed into frD.
Floating-point addition is based on exponent comparison and addition of the two
significands. The exponents of the two operands are compared, and the significand
accompanying the smaller exponent is shifted right, with its exponent increased by one for
each bit shifted, until the two exponents are equal. The two significands are then added
algebraically to form an intermediate sum. All 53 bits in the significand as well as all three
guard bits (G, R, and X) enter into the computation.
If a carry occurs, the sum's significand is shifted right one bit position and the exponent is
increased by one. FPSCR[FPRF] is set to the class and sign of the result, except for invalid
operation exceptions when FPSCR[VE]=1.
Other registers altered:
• Condition Register (CR1 Field):
Affected: FX, FEX, VX, OX (if Rc=1)
• Floating-point Status and Control Register:
Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI
Reserved
59 D A B 00000 21 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
The floating-point operand in frA is added to the floating-point operand in frB. If the most
significant bit of the resultant significand is not a one, the result is normalized. The result
is rounded to the target precision under control of the floating-point rounding control field
RN of the FPSCR and placed into frD.
Floating-point addition is based on exponent comparison and addition of the two
significands. The exponents of the two operands are compared, and the significand
accompanying the smaller exponent is shifted right, with its exponent increased by one for
each bit shifted, until the two exponents are equal. The two significands are then added
algebraically to form an intermediate sum. All 53 bits in the significand as well as all three
guard bits (G, R, and X) enter into the computation.
If a carry occurs, the sum's significand is shifted right one bit position and the exponent is
increased by one. FPSCR[FPRF] is set to the class and sign of the result, except for invalid
operation exceptions when FPSCR[VE]=1.
Other registers altered:
• Condition Register (CR1 Field):
Affected: FX, FEX, VX, OX (if Rc=1)
• Floating-point Status and Control Register:
Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI
fcmpo crfD,frA,frB
Reserved
63 crfD 00 A B 32 0
0 5 6 8 9 10 11 15 16 20 21 30 31
The floating-point operand in frA is compared to the floating-point operand in frB. The
result of the compare is placed into CR Field crfD and the FPCC.
If one of the operands is a NaN, either quiet or signaling, then CR Field crfD and the FPCC
are set to reflect unordered. If one of the operands is a signaling NaN, then VXSNAN is set,
and if invalid operation is disabled (VE=0) then VXVC is set. Otherwise, if one of the
operands is a QNaN then VXVC is set.
Other registers altered:
• Condition Register (CR Field specified by operand crfD):
Affected: FPCC, FX, VXSNAN, VXVC
fcmpu crfD,frA,frB
Reserved
63 crfD 00 A B 0 0
0 5 6 8 9 10 11 15 16 20 21 30 31
Reserved
63 D 00000 B 14 Rc
0 5 6 10 11 15 16 20 21 30 31
The floating-point operand in register frB is converted to a 32-bit signed integer, using the
rounding mode specified by FPSCR[RN], and placed in bits 32–63 of frD. Bits 0–31 of frD
are undefined.
If the contents of frB is greater than 2^31 – 1, bits 32–63 of frD are set to x'7FFF_FFFF'.
If the contents of frB is less than –2^31, bits 32–63 of frD are set to x'8000_0000'.
The conversion is described fully in Section F.2, “Conversion from Floating-Point Number
to Unsigned Fixed-Point Integer Word.”
Except for trap-enabled invalid operation exceptions, FPSCR[FPRF] is undefined.
FPSCR[FR] is set if the result is incremented when rounded. FPSCR[FI] is set if the result
is inexact.
Other registers altered:
• Condition Register (CR1 Field):
Affected: FX, FEX, VX, OX (if Rc=1)
• Floating-point Status and Control Register:
Affected: FPRF (undefined), FR, FI, FX, XX, VXSNAN, VXCVI
Reserved
63 D 00000 B 15 Rc
0 5 6 10 11 15 16 20 21 30 31
The floating-point operand in register frB is converted to a 32-bit signed integer, using the
rounding mode round toward zero, and placed in bits 32–63 of frD. Bits 0–31 of frD are
undefined.
If the operand in frB is greater than 2^31 – 1, bits 32–63 of frD are set to x'7FFF_FFFF'.
If the operand in frB is less than –2^31, bits 32–63 of frD are set to x'8000_0000'.
The conversion is described fully in Section F.2, “Conversion from Floating-Point Number
to Unsigned Fixed-Point Integer Word.”
Except for trap-enabled invalid operation exceptions, FPSCR[FPRF] is undefined.
FPSCR[FR] is set if the result is incremented when rounded. FPSCR[FI] is set if the result
is inexact.
Other registers altered:
• Condition Register (CR1 Field):
Affected: FX, FEX, VX, OX (if Rc=1)
• Floating-point Status and Control Register:
Affected: FPRF (undefined), FR, FI, FX, XX, VXSNAN, VXCVI
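The saturation rules above for conversion with round toward zero can be sketched as follows. This is a minimal model that returns only the 32-bit value placed in bits 32–63 of frD. It does not model FPSCR updates. NaN operands are not covered by the description above, so the sketch rejects them rather than guessing a result. The function name is hypothetical.

```python
import math


def convert_to_int_word_toward_zero(value: float) -> int:
    """Return the 32-bit image placed in bits 32-63 of frD (round toward zero)."""
    if math.isnan(value):
        # Assumption: NaN handling is outside the scope of this sketch.
        raise ValueError("NaN operands are not modeled")
    if value > 2**31 - 1:
        return 0x7FFFFFFF           # saturate at the largest signed word
    if value < -(2**31):
        return 0x80000000           # saturate at the smallest signed word
    return math.trunc(value) & 0xFFFFFFFF


for operand in (1.9, -1.9, 3.0e9, -3.0e9):
    print(operand, f"{convert_to_int_word_toward_zero(operand):#010x}")
```

Infinities fall into the two saturating branches because they compare greater or less than every finite bound.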
Reserved
63 D A B 00000 18 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
Reserved
59 D A B 00000 18 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
63 D A B C 29 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
frD ← [(frA)∗(frC)]+(frB)
59 D A B C 29 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
frD ← [(frA)∗(frC)]+(frB)
Reserved
63 D 00000 B 72 Rc
0 5 6 10 11 15 16 20 21 30 31
63 D A B C 28 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
59 D A B C 28 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
Reserved
63 D A 00000 C 25 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
Reserved
59 D A 00000 C 25 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
Reserved
63 D 00000 B 136 Rc
0 5 6 10 11 15 16 20 21 30 31
The contents of register frB with bit 0 set to one is placed into frD.
Other registers altered:
• Condition Register (CR1 Field):
Affected: FX, FEX, VX, OX (if Rc=1)
Reserved
63 D 00000 B 40 Rc
0 5 6 10 11 15 16 20 21 30 31
The contents of register frB with bit 0 inverted is placed into frD.
Other registers altered:
• Condition Register (CR1 Field):
Affected: FX, FEX, VX, OX (if Rc=1)
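The sign-manipulation operations described in this section (clear bit 0, set bit 0, invert bit 0) act only on the sign bit of the 64-bit double-precision image, because bit 0 is the most significant bit. The following is a minimal sketch using the host's IEEE 754 double format. The function names are hypothetical.

```python
import math
import struct

SIGN_BIT = 1 << 63          # bit 0 in PowerPC numbering
MASK64 = (1 << 64) - 1


def to_bits(value: float) -> int:
    """Return the 64-bit image of a double."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]


def from_bits(bits: int) -> float:
    """Return the double whose 64-bit image is bits."""
    return struct.unpack(">d", struct.pack(">Q", bits & MASK64))[0]


def clear_sign(value: float) -> float:      # bit 0 cleared to zero
    return from_bits(to_bits(value) & ~SIGN_BIT)


def set_sign(value: float) -> float:        # bit 0 set to one
    return from_bits(to_bits(value) | SIGN_BIT)


def invert_sign(value: float) -> float:     # bit 0 inverted
    return from_bits(to_bits(value) ^ SIGN_BIT)


print(clear_sign(-2.5), set_sign(2.5), invert_sign(2.5))
print(math.copysign(1.0, invert_sign(0.0)))
```

Because only one bit changes, these operations apply equally to zeros and infinities and involve no rounding.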
63 D A B C 31 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
59 D A B C 31 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
63 D A B C 30 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
59 D A B C 30 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
Reserved
63 D 00000 B 12 Rc
0 5 6 10 11 15 16 20 21 30 31
Reserved
63 D A B 00000 20 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
The floating-point operand in register frB is subtracted from the floating-point operand in
register frA. If the most significant bit of the resultant significand is not a one, the result is
normalized. The result is rounded to the target precision under control of the floating-point
rounding control field RN of the FPSCR and placed into frD.
The execution of the floating-point subtract instruction is identical to that of floating-point
add, except that the contents of frB participates in the operation with its sign bit (bit 0)
inverted.
FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation
exceptions when FPSCR[VE]=1.
Other registers altered:
• Condition Register (CR1 Field):
Affected: FX, FEX, VX, OX (if Rc=1)
• Floating-point Status and Control Register:
Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI
Reserved
59 D A B 00000 20 Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
The floating-point operand in register frB is subtracted from the floating-point operand in
register frA. If the most significant bit of the resultant significand is not a one, the result is
normalized. The result is rounded to the target precision under control of the floating-point
rounding control field RN of the FPSCR and placed into frD.
The execution of the floating-point subtract instruction is identical to that of floating-point
add, except that the contents of frB participates in the operation with its sign bit (bit 0)
inverted.
FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation
exceptions when FPSCR[VE]=1.
Other registers altered:
• Condition Register (CR1 Field):
Affected: FX, FEX, VX, OX (if Rc=1)
• Floating-point Status and Control Register:
Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI
icbi rA,rB
Reserved
31 00000 A B 982 0
0 5 6 10 11 15 16 20 21 30 31
isync
Reserved
This instruction waits for all previous instructions to complete and then discards any
fetched instructions, causing subsequent instructions to be fetched (or refetched) from
memory and to execute in the context established by the previous instructions. This
instruction has no effect on other processors or on their caches.
This instruction is context synchronizing.
Other registers altered:
• None
lbz rD,d(rA)
34 D A d
0 5 6 10 11 15 16 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+EXTS(d)
rD ← (24)0 || MEM(EA, 1)
The effective address is the sum (rA|0) + d. The byte in memory addressed by EA is loaded
into rD[24–31]. Bits rD[0–23] are cleared to 0.
Other registers altered:
• None
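The effective-address calculation above, including the (rA|0) rule, can be traced with a small model. The following is a minimal sketch, assuming the general-purpose registers are a list of 32 unsigned integers and memory is a `bytes` object indexed by EA. Both are simplifications for illustration, and the function names are hypothetical.

```python
def sign_extend16(d: int) -> int:
    """EXTS(d) for a 16-bit displacement field."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def effective_address(gprs: list, ra: int, d: int) -> int:
    """EA <- (rA|0) + EXTS(d), wrapped to 32 bits."""
    b = 0 if ra == 0 else gprs[ra]      # an rA field of 0 means the value 0, not r0
    return (b + sign_extend16(d)) & 0xFFFFFFFF


def load_byte_zero(gprs: list, memory: bytes, rd: int, ra: int, d: int) -> None:
    """rD <- (24)0 || MEM(EA, 1)."""
    ea = effective_address(gprs, ra, d)
    gprs[rd] = memory[ea]               # a byte value is already zero-extended


gprs = [0] * 32
gprs[0] = 0x100                         # ignored when the rA field is 0
gprs[3] = 0x10
memory = bytes(range(256))
load_byte_zero(gprs, memory, rd=5, ra=3, d=0xFFFC)   # d = -4, so EA = 0x0C
print(hex(gprs[5]))
```

The update forms differ only in that EA is also written back to rA, which is why an rA field of 0 is treated specially there.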
lbzu rD,d(rA)
35 D A d
0 5 6 10 11 15 16 31
EA ← (rA)+EXTS(d)
rD ← (24)0 || MEM(EA, 1)
rA ← EA
EA is the sum (rA|0) + d. The byte in memory addressed by EA is loaded into rD[24–31].
Bits rD[0–23] are cleared to 0.
EA is placed into rA.
If operand rA=0, the 601 does not update register r0. If rA=rD, the load data is loaded
into register rD and the register update is suppressed. The PowerPC architecture defines
load with update instructions with operand rA=0 or rA=rD as invalid forms.
Other registers altered:
• None
lbzux rD,rA,rB
Reserved
31 D A B 119 0
0 5 6 10 11 15 16 20 21 30 31
EA ← (rA)+(rB)
rD ← (24)0 || MEM(EA, 1)
rA ← EA
EA is the sum (rA|0) + (rB). The byte addressed by EA is loaded into rD[24–31]. Bits
rD[0–23] are set to 0.
EA is placed into rA.
If operand rA=0, the 601 does not update register r0. If rA=rD, the load data is loaded
into register rD and the register update is suppressed. The PowerPC architecture defines
load with update instructions with operand rA=0 or rA=rD as invalid forms.
Other registers altered:
• None
lbzx rD,rA,rB
Reserved
31 D A B 87 0
0 5 6 10 11 15 16 20 21 30 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+(rB)
rD ← (24)0 || MEM(EA, 1)
EA is the sum (rA|0) + (rB). The byte in memory addressed by EA is loaded into
rD[24–31].
Bits rD[0–23] are set to 0.
Other registers altered:
• None
lfd frD,d(rA)
50 D A d
0 5 6 10 11 15 16 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+EXTS(d)
frD ← MEM(EA, 8)
EA is the sum (rA|0) + d.
The double word in memory addressed by EA is placed into frD.
Other registers altered:
• None
lfdu frD,d(rA)
51 D A d
0 5 6 10 11 15 16 31
EA ← (rA)+EXTS(d)
frD ← MEM(EA, 8)
rA ← EA
EA is the sum (rA|0) + d.
The double word in memory addressed by EA is placed into frD.
EA is placed into rA.
If operand rA=0 the 601 does not update register r0. The PowerPC architecture defines load
with update instructions with operand rA=0 as an invalid form.
Other registers altered:
• None
lfdux frD,rA,rB
Reserved
31 D A B 631 0
0 5 6 10 11 15 16 20 21 30 31
EA ← (rA)+(rB)
frD ← MEM(EA, 8)
rA ← EA
EA is the sum (rA|0) + (rB).
The double word in memory addressed by EA is placed into frD.
EA is placed into rA.
If operand rA=0 the 601 does not update register r0. The PowerPC architecture defines load
with update instructions with operand rA=0 as an invalid form.
Other registers altered:
• None
lfdx frD,rA,rB
Reserved
31 D A B 599 0
0 5 6 10 11 15 16 20 21 30 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+(rB)
frD ← MEM(EA, 8)
EA is the sum (rA|0) + (rB).
The double word in memory addressed by EA is placed into frD.
Other registers altered:
• None
lfs frD,d(rA)
48 D A d
0 5 6 10 11 15 16 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+EXTS(d)
frD ← DOUBLE(MEM(EA, 4))
EA is the sum (rA|0) + d.
The word in memory addressed by EA is interpreted as a floating-point single-precision
operand. This word is converted to floating-point double-precision (see Section 3.5.9.1,
“Double-Precision Conversion for Floating-Point Load Instructions”) and placed into frD.
Other registers altered:
• None
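The single-to-double conversion performed by lfs can be illustrated with Python's struct module. This is only a sketch: lfs_value and double_bits are hypothetical helper names, and the host's IEEE 754 widening stands in for the procedure of Section 3.5.9.1, which it may not match bit for bit for every NaN encoding.

```python
import struct

def lfs_value(word):
    """Interpret a 32-bit word as a single-precision operand and widen it."""
    (value,) = struct.unpack(">f", word.to_bytes(4, "big"))
    return value                    # Python floats are double precision

def double_bits(value):
    """Return the 64-bit pattern that would be placed into frD."""
    return int.from_bytes(struct.pack(">d", value), "big")

bits = double_bits(lfs_value(0x3FC00000))   # single-precision 1.5
```

Here the single-precision pattern for 1.5 widens to the double-precision pattern 0x3FF8000000000000.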
lfsu frD,d(rA)
49 D A d
0 5 6 10 11 15 16 31
EA ← (rA)+EXTS(d)
frD ← DOUBLE(MEM(EA, 4))
rA ← EA
EA is the sum (rA|0) + d.
The word in memory addressed by EA is interpreted as a floating-point single-precision
operand. This word is converted to floating-point double-precision (see Section 3.5.9.1,
“Double-Precision Conversion for Floating-Point Load Instructions”) and placed into frD.
EA is placed into rA.
If operand rA=0 the 601 does not update register r0. The PowerPC architecture defines load
with update instructions with operand rA=0 as an invalid form.
Other registers altered:
• None
lfsux frD,rA,rB
Reserved
31 D A B 567 0
0 5 6 10 11 15 16 20 21 30 31
EA ← (rA)+(rB)
frD ← DOUBLE(MEM(EA, 4))
rA ← EA
EA is the sum (rA|0) + (rB).
The word in memory addressed by EA is interpreted as a floating-point single-precision
operand. This word is converted to floating-point double-precision (see Section 3.5.9.1,
“Double-Precision Conversion for Floating-Point Load Instructions”) and placed into frD.
EA is placed into rA.
If operand rA=0 the 601 does not update register r0. The PowerPC architecture defines load
with update instructions with operand rA=0 as an invalid form.
Other registers altered:
• None
lfsx frD,rA,rB
Reserved
31 D A B 535 0
0 5 6 10 11 15 16 20 21 30 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+(rB)
frD ← DOUBLE(MEM(EA, 4))
EA is the sum (rA|0) + (rB).
The word in memory addressed by EA is interpreted as a floating-point single-precision
operand. This word is converted to floating-point double-precision (see Section 3.5.9.1,
“Double-Precision Conversion for Floating-Point Load Instructions”) and placed into frD.
Other registers altered:
• None
lha rD,d(rA)
42 D A d
0 5 6 10 11 15 16 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+EXTS(d)
rD ← EXTS(MEM(EA, 2))
EA is the sum (rA|0) + d. The half word in memory addressed by EA is loaded into
rD[16–31]. Bits rD[0–15] are filled with a copy of the most significant bit of the loaded
half word.
Other registers altered:
• None
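The sign extension EXTS applied by lha can be sketched as follows. The helper name exts16 is an assumption made for illustration only.

```python
def exts16(half):
    """Sign-extend a 16-bit half word to 32 bits (the EXTS step of lha)."""
    half &= 0xFFFF
    # Copy the most significant bit of the half word into rD[0-15].
    return (half | 0xFFFF0000) if half & 0x8000 else half

negative = exts16(0x8001)   # 0xFFFF8001
positive = exts16(0x7FFF)   # 0x00007FFF
```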
lhau rD,d(rA)
43 D A d
0 5 6 10 11 15 16 31
EA ← (rA)+EXTS(d)
rD ← EXTS(MEM(EA, 2))
rA ← EA
EA is the sum (rA|0) + d. The half word in memory addressed by EA is loaded into rD[16–
31].
Bits rD[0–15] are filled with a copy of the most significant bit of the loaded half word.
EA is placed into rA.
If operand rA=0 the 601 does not update register r0, or if rA = rD the load data is loaded
into register rD and the register update is suppressed. The PowerPC architecture defines
load with update instructions with operand rA = 0 or rA = rD as invalid forms.
Other registers altered:
• None
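The 601 behavior for the rA=rD case can be sketched as below. The function name lhau_601 and the flat bytes memory are assumptions; the rA=0 case is deliberately left out of the model, and d is taken to be already sign-extended.

```python
def lhau_601(gpr, mem, rD, rA, d):
    """Model of lhau on the 601 for rA != 0."""
    assert rA != 0, "rA = 0 is not modeled in this sketch"
    ea = (gpr[rA] + d) & 0xFFFFFFFF
    half = int.from_bytes(mem[ea:ea + 2], "big")
    gpr[rD] = (half | 0xFFFF0000) if half & 0x8000 else half
    if rA != rD:
        gpr[rA] = ea                # update form: EA is placed into rA
    # If rA == rD, the load data stays in rD and the update is suppressed.

mem = bytes([0x00, 0x00, 0x80, 0x01])
gpr = [0] * 32
lhau_601(gpr, mem, 3, 5, 2)         # normal case: r3 loaded, r5 updated
lhau_601(gpr, mem, 4, 4, 2)         # rA = rD: r4 holds the load data
```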
lhaux rD,rA,rB
Reserved
31 D A B 375 0
0 5 6 10 11 15 16 20 21 30 31
EA ← (rA)+(rB)
rD ← EXTS(MEM(EA, 2))
rA ← EA
EA is the sum (rA|0) + (rB). The half word in memory addressed by EA is loaded into
rD[16-31]. Bits rD[0–15] are filled with a copy of the most significant bit of the loaded half
word.
EA is placed into rA.
If operand rA=0 the 601 does not update register r0, or if rA = rD the load data is loaded
into register rD and the register update is suppressed. The PowerPC architecture defines
load with update instructions with operand rA = 0 or rA = rD as invalid forms.
Other registers altered:
• None
lhax rD,rA,rB
Reserved
31 D A B 343 0
0 5 6 10 11 15 16 20 21 30 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+(rB)
rD ← EXTS(MEM(EA, 2))
EA is the sum (rA|0) + (rB). The half word in memory addressed by EA is loaded into
rD[16-31]. Bits rD[0–15] are filled with a copy of the most significant bit of the loaded half
word.
Other registers altered:
• None
lhbrx rD,rA,rB
Reserved
31 D A B 790 0
0 5 6 10 11 15 16 20 21 30 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+(rB)
rD ← (16)0 || MEM(EA+1, 1) || MEM(EA,1)
EA is the sum (rA|0) + (rB). Bits 0–7 of the half word in memory addressed by EA are
loaded into rD[24–31]. Bits 8–15 of the half word in memory addressed by EA are loaded
into rD[16–23]. Bits rD[0–15] are cleared to 0.
The PowerPC architecture cautions programmers that some implementations of the
architecture may run the lhbrx instructions with greater latency than other types of load
instructions. This is not the case in the 601. This instruction operates with the same latency
as other load instructions.
Other registers altered:
• None
lhz rD,d(rA)
40 D A d
0 5 6 10 11 15 16 31
EA is the sum (rA|0) + d. The half word in memory addressed by EA is loaded into rD[16–
31]. Bits rD[0–15] are cleared to 0.
Other registers altered:
• None
lhzu rD,d(rA)
41 D A d
0 5 6 10 11 15 16 31
EA ← (rA)+EXTS(d)
rD ← (16)0 || MEM(EA, 2)
rA ← EA
EA is the sum (rA|0) + d. The half word in memory addressed by EA is loaded into rD[16–
31]. Bits rD[0–15] are cleared to 0.
EA is placed into rA.
If operand rA=0 the 601 does not update register r0, or if rA = rD the load data is loaded
into register rD and the register update is suppressed. The PowerPC architecture defines
load with update instructions with operand rA = 0 or rA = rD as invalid forms.
Other registers altered:
• None
lhzux rD,rA,rB
Reserved
31 D A B 311 0
0 5 6 10 11 15 16 20 21 30 31
EA ← (rA)+(rB)
rD←(16)0 || MEM(EA, 2)
rA←EA
EA is the sum (rA|0) + (rB). The half word in memory addressed by EA is loaded into
rD[16–31]. Bits rD[0–15] are cleared to 0.
EA is placed into rA.
If operand rA=0 the 601 does not update register r0, or if rA = rD the load data is loaded
into register rD and the register update is suppressed. The PowerPC architecture defines
load with update instructions with operand rA = 0 or rA = rD as invalid forms.
Other registers altered:
• None
lhzx rD,rA,rB
Reserved
31 D A B 279 0
0 5 6 10 11 15 16 20 21 30 31
The effective address is the sum (rA|0) + (rB). The half word in memory addressed by EA
is loaded into rD[16–31]. Bits rD[0–15] are cleared to 0.
Other registers altered:
• None
lmw rD,d(rA)
[POWER mnemonic: lm]
46 D A d
0 5 6 10 11 15 16 31
31 D A B 277 Rc
0 5 6 10 11 15 16 20 21 30 31
lswi rD,rA,NB
[POWER mnemonic: lsi]
Reserved
31 D A NB 597 0
0 5 6 10 11 15 16 20 21 30 31
The EA is (rA|0).
lswx rD,rA,rB
[POWER mnemonic: lsx]
Reserved
31 D A B 533 0
0 5 6 10 11 15 16 20 21 30 31
EA is the sum (rA|0) + (rB). Let n = XER[25–31]; n is the number of bytes to load. Let
nr = CEIL(n/4): nr is the number of registers to receive data. If n>0, n consecutive bytes
starting at EA are loaded into GPRs rD through rD + nr – 1.
Bytes are loaded left to right in each register. The sequence of registers wraps around
through r0 if required. If the bytes of rD + nr – 1 are only partially filled, the unfilled low-
order byte(s) of that register are cleared to 0. If n=0, the content of rD is undefined.
If rA or rB is in the range of registers specified to be loaded, that register is skipped in the
load process. If operand rA = 0, register r0 is not considered to be used for addressing and
is loaded.
Under certain conditions (for example, segment boundary crossings) the alignment error
handler may be invoked. For additional information about alignment exceptions, see
Section 5.4.6, “Alignment Exception (x'00600').”
Other registers altered:
• None
In future implementations, this instruction is likely to have greater latency and take longer
to execute, perhaps much longer, than a sequence of individual load instructions that
produce the same results.
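The register filling performed by lswx can be sketched as follows. This is a simplified model under stated assumptions: lswx_model is a hypothetical name, n is passed in directly rather than read from XER[25–31], and the skipping of rA and rB is not modeled.

```python
def lswx_model(gpr, mem, rD, ea, n):
    """Load n bytes starting at ea into gpr[rD], gpr[rD+1], ..., wrapping to r0."""
    nr = (n + 3) // 4                            # CEIL(n/4) registers receive data
    for i in range(nr):
        chunk = mem[ea + 4 * i : ea + min(4 * i + 4, n)]
        chunk = chunk + bytes(4 - len(chunk))    # clear unfilled low-order bytes
        gpr[(rD + i) % 32] = int.from_bytes(chunk, "big")   # bytes fill left to right
    return nr

gpr = [0] * 32
mem = bytes(range(1, 9))
used = lswx_model(gpr, mem, 31, 0, 6)            # fills r31, then wraps to r0
```

With six bytes to load, r31 receives 0x01020304 and r0 receives 0x05060000, the last two bytes of r0 being cleared.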
lwarx rD,rA,rB
Reserved
31 D A B 20 0
0 5 6 10 11 15 16 20 21 30 31
EA is the sum (rA|0) + (rB). The word in memory addressed by EA is loaded into rD.
This instruction creates a reservation for use by a store word conditional instruction. The
physical address computed from the EA is associated with the reservation, and replaces any
address previously associated with the reservation.
The EA must be a multiple of 4. If it is not, either the alignment exception handler is invoked
(when the load crosses a page boundary) or the results are boundedly undefined.
Other registers altered:
• None
lwbrx rD,rA,rB
[POWER mnemonic: lbrx]
Reserved
31 D A B 534 0
0 5 6 10 11 15 16 20 21 30 31
EA is the sum (rA|0)+(rB). Bits 0–7 of the word in memory addressed by EA are loaded
into rD[24–31]. Bits 8–15 of the word in memory addressed by EA are loaded into
rD[16–23]. Bits 16–23 of the word in memory addressed by EA are loaded into rD[8–15].
Bits 24–31 of the word in memory addressed by EA are loaded into rD[0–7].
The PowerPC architecture cautions programmers that some implementations of the
architecture may run the lwbrx instructions with greater latency than other types of load
instructions. This is not the case in the 601. This instruction operates with the same latency
as other load instructions.
Other registers altered:
• None
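The byte reversal performed by lhbrx and lwbrx can be sketched as follows. The helper names and the flat bytes memory are assumptions for illustration.

```python
def lhbrx_value(mem, ea):
    """rD = (16)0 || MEM(EA+1, 1) || MEM(EA, 1)."""
    return mem[ea] | (mem[ea + 1] << 8)

def lwbrx_value(mem, ea):
    """The byte at EA lands in rD[24-31]; the byte at EA+3 lands in rD[0-7]."""
    return int.from_bytes(mem[ea:ea + 4], "little")

mem = bytes([0x11, 0x22, 0x33, 0x44])
half = lhbrx_value(mem, 0)      # 0x00002211
word = lwbrx_value(mem, 0)      # 0x44332211
```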
lwz rD,d(rA)
[POWER mnemonic: l]
32 D A d
0 5 6 10 11 15 16 31
EA is the sum (rA|0) + d. The word in memory addressed by EA is loaded into rD.
Other registers altered:
• None
lwzu rD,d(rA)
[POWER mnemonic: lu]
33 D A d
0 5 6 10 11 15 16 31
EA ← (rA)+EXTS(d)
rD←MEM(EA, 4)
rA←EA
EA is the sum (rA|0) + d. The word in memory addressed by EA is loaded into rD.
EA is placed into rA.
If operand rA=0 the 601 does not update register r0, or if rA = rD the load data is loaded
into rD and the register update is suppressed. The PowerPC architecture defines load with
update instructions with operand rA = 0 or rA = rD as invalid forms.
Other registers altered:
• None
lwzux rD,rA,rB
[POWER mnemonic: lux]
Reserved
31 D A B 55 0
0 5 6 10 11 15 16 20 21 30 31
EA ← (rA)+(rB)
rD←MEM(EA, 4)
rA←EA
EA is the sum (rA|0)+(rB). The word in memory addressed by EA is loaded into rD.
EA is placed into rA.
If operand rA=0 the 601 does not update register r0, or if rA = rD the load data is loaded
into register rD and the register update is suppressed. The PowerPC architecture defines
load with update instructions with operand rA = 0 or rA = rD as invalid forms.
Other registers altered:
• None
lwzx rD,rA,rB
[POWER mnemonic: lx]
Reserved
31 D A B 23 0
0 5 6 10 11 15 16 20 21 30 31
EA is the sum (rA|0) + (rB). The word in memory addressed by EA is loaded into rD.
Other registers altered:
• None
31 S A B 29 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A B 541 Rc
0 5 6 10 11 15 16 20 21 30 31
mcrf crfD,crfS
Reserved
CR[4∗crfD–4∗crfD+3] ← CR[4∗crfS–4∗crfS+3]
The contents of condition register field crfS are copied into condition register field crfD.
All other condition register fields remain unchanged.
Note that if the link bit (bit 31) is set for this instruction, the PowerPC architecture considers
the instruction to be of an invalid form. On the 601, this instruction executes and
the link register is left in an undefined state.
Note: Use of invalid instruction forms is not recommended. This description is provided
for informational purposes only.
Other registers altered:
• Condition Register (CR field specified by operand crfD):
Affected: LT, GT, EQ, SO
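The field copy performed by mcrf can be sketched as follows, treating the condition register as a 32-bit integer whose bit 0 is the most significant bit. The function name and integer representation are assumptions.

```python
def mcrf(cr, crfD, crfS):
    """Copy CR field crfS into CR field crfD; other fields are unchanged."""
    shift_s = 28 - 4 * crfS              # field i occupies CR bits 4*i to 4*i+3
    shift_d = 28 - 4 * crfD
    field = (cr >> shift_s) & 0xF
    cleared = cr & ~(0xF << shift_d) & 0xFFFFFFFF
    return cleared | (field << shift_d)

result = mcrf(0x12345678, 0, 7)          # CR7 (0x8) copied into CR0
```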
mcrfs crfD,crfS
Reserved
The contents of FPSCR field crfS are copied to CR Field crfD. All exception bits copied
are reset to zero in the FPSCR.
Other registers altered:
• Condition Register (CR Field specified by operand crfS):
Affected: FX, OX (if crfS=0)
Affected: UX, ZX, XX, VXSNAN (if crfS=1)
Affected: VXISI, VXIDI, VXZDZ, VXIMZ (if crfS=2)
Affected: VXVC (if crfS=3)
Affected: VXSOFT, VXSQRT, VXCVI (if crfS=5)
mcrxr crfD
Reserved
CR[4∗crfD–4∗crfD+3]←XER[0–3]
XER[0–3]← b'0000'
The contents of XER[0–3] are copied into the condition register field designated by crfD.
All other fields of the condition register remain unchanged. XER[0–3] is cleared to zero.
Other registers altered:
• Condition Register (CR Field specified by crfD operand):
Affected: LT, GT, EQ, SO
• XER[0–3]
mfcr rD
Reserved
31 D 00000 00000 19 0
0 5 6 10 11 15 16 20 21 30 31
rD← CR
Reserved
0 5 6 10 11 15 16 20 21 30 31
The contents of the FPSCR are placed into bits 32–63 of register frD. Bits 0–31 of register
frD are undefined.
Other registers altered:
• Condition Register (CR1 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
POWER Compatibility Note: The PowerPC architecture defines bits 0–31 of floating-
point register frD as undefined. In the 601, these bits take on the value x'FFF8_0000'.
mfmsr rD
Reserved
31 D 00000 00000 83 0
0 5 6 10 11 15 16 20 21 30 31
rD← MSR
Reserved
31 D SPR 339 0
0 5 6 10 11 20 21 30 31
n←SPR[5–9] || SPR[0–4]
rD← SPR(n)
The SPR field denotes a special purpose register, encoded as shown in Table 10-4. The
contents of the designated special purpose register are placed into rD.
The value of SPR[0] is 1 if and only if reading the register is a supervisor-level operation.
Execution of this instruction specifying a supervisor-level register when MSR[PR]=1 will
result in a supervisor-level instruction type program exception.
If the SPR field contains a value that is not valid for the 601, the instruction is treated as a
no-op. For an invalid instruction form in which SPR[0]=1, if MSR[PR]=1 a supervisor-
level instruction type program exception will occur instead of a no-op.
Other registers altered:
• None
Table 10-4. SPR Encodings for mfspr
SPR1
Register Name Access
Decimal SPR[5–9] SPR[0–4]
1 Note that the order of the two 5-bit halves of the SPR number is reversed compared with actual instruction
coding. If the SPR field contains any value other than one of these implementation-specific values or one of
the values shown in Table 3-40, the instruction form is invalid. SPR[0]=1 if and only if the register is being
accessed at the supervisor level. Execution of this instruction specifying a defined and supervisor-level
register when MSR[PR]=1 results in a privilege violation type program exception.
For mtspr and mfspr instructions, the SPR number coded in assembly language does not appear directly as a
10-bit binary number in the instruction. The number coded is split into two 5-bit halves that are reversed in
the instruction, with the high-order 5 bits appearing in bits 16–20 of the instruction and the low-order 5 bits
in bits 11–15.
SPR encodings for DEC, MQ, RTCL, and RTCU are not part of the PowerPC architecture.
2 On the 601, the mfspr instruction for the RTCU and RTCL registers must use these encodings (SPR4 and
SPR5, respectively) regardless of whether the processor is in supervisor or user mode. The mtspr instruction,
which is supervisor-only for the RTCU and RTCL registers, must use the SPR20 and SPR21 encodings,
respectively.
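The reversal of the two 5-bit halves of the SPR number can be sketched as follows. The helper names are hypothetical. The example assumes that SPR number 8 designates the link register, which is the usual PowerPC assignment but is not listed in the table rows shown here.

```python
def spr_field(spr_number):
    """Return the 10-bit value placed in instruction bits 11-20."""
    high = (spr_number >> 5) & 0x1F
    low = spr_number & 0x1F
    # Low-order half goes to bits 11-15, high-order half to bits 16-20.
    return (low << 5) | high

def encode_mfspr(rD, spr_number):
    """Assemble mfspr rD,SPR: primary opcode 31, extended opcode 339."""
    return (31 << 26) | (rD << 21) | (spr_field(spr_number) << 11) | (339 << 1)

word = encode_mfspr(3, 8)       # mfspr r3,8 under the assumption above
```

Under that assumption the assembled word is 0x7C6802A6.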
mfsr rD,SR
Reserved
31 D 0 SR 00000 595 0
0 5 6 10 11 12 15 16 20 21 30 31
rD←SEGREG(SR)
mfsrin rD,rB
Reserved
31 D 00000 B 659 0
0 5 6 10 11 15 16 20 21 30 31
rD←SEGREG(rB[0–3])
The contents of the segment register selected by bits 0–3 of rB are copied into rD.
This is a supervisor-level instruction.
This instruction is defined only for 32-bit implementations. Using it on a 64-bit
implementation causes an illegal instruction exception.
Other registers altered:
• None
mtcrf CRM,rS
Reserved
31 S 0 CRM 0 144 0
0 5 6 10 11 12 19 20 21 30 31
The contents of rS are placed into the condition register under control of the field mask
specified by CRM. The field mask identifies the 4-bit fields affected. Let i be an integer in
the range 0–7. If CRM(i) = 1, CR Field i (CR bits 4∗i through 4∗i+3) is set to the contents
of the corresponding field of rS.
Other registers altered:
• CR fields selected by mask
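The effect of the CRM field mask can be sketched as follows, with the condition register and rS modeled as 32-bit integers. The function name and argument order are assumptions for illustration.

```python
def mtcrf(cr, crm, rs):
    """Return the new CR after mtcrf with 8-bit field mask crm."""
    mask = 0
    for i in range(8):
        if crm & (0x80 >> i):               # CRM(i), bit 0 being the leftmost bit
            mask |= 0xF << (28 - 4 * i)     # select CR bits 4*i through 4*i+3
    return (rs & mask) | (cr & ~mask & 0xFFFFFFFF)

new_cr = mtcrf(0x11111111, 0x81, 0xABCDEF12)   # update CR0 and CR7 only
```

With CRM = 0x81 only fields 0 and 7 are taken from rS, giving 0xA1111112.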
Reserved
0 5 6 10 11 15 16 20 21 30 31
Reserved
0 5 6 10 11 15 16 20 21 30 31
Reserved
63 0 FM 0 frB 711 Rc
0 5 6 7 14 15 16 20 21 30 31
Bits 32–63 of register frB are placed into the FPSCR under control of the field mask
specified by FM. The field mask identifies the 4-bit fields affected. Let i be an integer in the
range 0–7. If FM(i) = 1, FPSCR Field i (FPSCR bits 4∗i through 4∗i+3) is set to the contents
of the corresponding field of the low-order 32 bits of register frB.
In other PowerPC implementations, the move to FPSCR fields (mtfsf) instruction may
perform more slowly when only a portion of the fields are updated.
Other registers altered:
• Condition Register (CR1 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
• Floating-point Status and Control Register:
FPSCR fields selected by mask
Updating fewer than all eight fields of the FPSCR may have substantially poorer
performance on some implementations than updating all the fields.
When FPSCR[0–3] is specified, bits 0 (FX) and 3 (OX) are set to the values of frB[32] and
frB[35] (that is, even if this instruction causes OX to change from 0 to 1, FX is set from
frB[32] and not by the usual rule that FX is set to 1 when an exception bit changes from 0
to 1). Bits 1 and 2 (FEX and VX) are set according to the usual rule and not from
frB[33–34].
Reserved
0 5 6 8 9 10 11 12 15 16 19 20 21 30 31
The value of the IMM field is placed into FPSCR field crfD.
Other registers altered:
• Condition Register (CR1 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
• Floating-point Status and Control Register:
FPSCR field crfD
When FPSCR[0–3] is specified, bits 0 (FX) and 3 (OX) are set to the values of IMM[0] and
IMM[3] (that is, even if this instruction causes OX to change from 0 to 1, FX is set from
IMM[0] and not by the usual rule that FX is set to 1 when an exception bit changes from 0
to 1). Bits 1 and 2 (FEX and VX) are set according to the usual rule, given in Section 2.2.3,
“Floating-Point Status and Control Register (FPSCR)” and not from IMM[1–2].
mtmsr rS
Reserved
0 5 6 10 11 15 16 20 21 30 31
MSR←rS[0–31]
mtspr SPR,rS
Reserved
31 S SPR 467 0
0 5 6 10 11 20 21 30 31
n←SPR[5–9] || SPR[0–4]
SPREG(n)←rS[0–31]
The SPR field denotes a special purpose register, encoded as shown in Table 10-5. The
contents of rS are placed into the designated special purpose register.
The value of SPR[0] is 1 if and only if writing the register is a supervisor-level operation.
Execution of this instruction specifying a defined and supervisor-level register when
MSR[PR]=1 results in a supervisor-level instruction exception.
If the SPR field contains an invalid value, the instruction is treated as a no-op. For an invalid
instruction form in which SPR[0]=1, if MSR[PR]=1 a supervisor-level instruction
exception will occur instead of a no-op.
Other registers altered:
• None
Table 10-5 lists the SPR encodings for mtspr on the 601.
Table 10-5. SPR Encodings for mtspr
SPR1
Register Name Access
Decimal SPR[5–9] SPR[0–4]
1 Note that the order of the two 5-bit halves of the SPR number is reversed compared with actual instruction
coding. If the SPR field contains any value other than one of these implementation-specific values or one of
the values shown in Table 3-40, the instruction form is invalid. SPR[0]=1 if and only if the register is being
accessed at the supervisor level. Execution of this instruction specifying a defined and supervisor-level
register when MSR[PR]=1 results in a privilege violation type program exception.
For mtspr and mfspr instructions, the SPR number coded in assembly language does not appear directly as a
10-bit binary number in the instruction. The number coded is split into two 5-bit halves that are reversed in the
instruction, with the high-order 5 bits appearing in bits 16–20 of the instruction and the low-order 5 bits in bits 11–15.
SPR encodings for DEC, MQ, RTCL, and RTCU are not part of the PowerPC architecture.
2 On the 601, the mfspr instruction for the RTCU and RTCL registers must use these encodings (SPR4 and
SPR5, respectively) regardless of whether the processor is in supervisor or user mode. The mtspr instruction,
which is supervisor-only for the RTCU and RTCL registers, must use the SPR20 and SPR21 encodings,
respectively.
3 Read access to the DEC register is supervisor-only in the PowerPC architecture, using SPR22. However,
the POWER architecture allows user-level read access using SPR6. Note that the SPR6 encoding for the
DEC will not be supported by other PowerPC processors.
mtsr SR,rS
Reserved
31 S 0 SR 00000 210 0
0 5 6 10 11 12 15 16 20 21 30 31
SEGREG(SR)←(rS)
mtsrin rS,rB
[POWER mnemonic: mtsri]
Reserved
31 S 00000 B 242 0
0 5 6 10 11 15 16 20 21 30 31
SEGREG(rB[0–3])←(rS)
The contents of rS are copied to the segment register selected by bits 0–3 of rB.
This is a supervisor-level instruction.
This instruction is defined only for 32-bit implementations. Using it on a 64-bit
implementation causes an illegal instruction exception.
Other registers altered:
• None
31 D A B OE 107 Rc
0 5 6 10 11 15 16 20 21 22 30 31
Reserved
31 D A B 0 75 Rc
0 5 6 10 11 15 16 20 21 22 30 31
prod[0–63]←rA[32–63]∗rB[32–63]
rD[32–63]←prod[0–31]
rD[0–31]←undefined
The contents of rA and of rB are interpreted as 32-bit signed integers. They are multiplied
to form a 64-bit signed integer product. The high-order 32 bits of the 64-bit product are
placed into rD.
If the smaller absolute value of the two multipliers is placed in rB, the instruction may
complete execution more quickly. See Chapter 7, “Instruction Timing,” for additional
information about instruction performance.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
Reserved
31 D A B 0 11 Rc
0 5 6 10 11 15 16 20 21 22 30 31
prod[0–63]←rA[32–63]∗rB[32–63]
rD[32–63]←prod[0–31]
rD[0–31]←undefined
The contents of rA and of rB are extracted and interpreted as 32-bit unsigned integers. They
are multiplied to form a 64-bit unsigned integer product. The high-order 32 bits of the 64-
bit product are placed into rD.
If the smaller absolute value of the two multipliers is placed in rB, the instruction may
complete execution more quickly. See Chapter 7, “Instruction Timing,” for additional
information about instruction performance.
This instruction causes the contents of the MQ to become undefined.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
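The difference between the signed and unsigned high-order multiplies described above can be sketched as follows. The function names are chosen for illustration, and only the value placed into rD is modeled, not CR0 or the MQ register.

```python
def to_signed32(x):
    """Interpret a 32-bit pattern as a signed integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def mul_high_signed(a, b):
    """High-order 32 bits of the 64-bit signed product."""
    return ((to_signed32(a) * to_signed32(b)) >> 32) & 0xFFFFFFFF

def mul_high_unsigned(a, b):
    """High-order 32 bits of the 64-bit unsigned product."""
    return ((a & 0xFFFFFFFF) * (b & 0xFFFFFFFF)) >> 32

hi_s = mul_high_signed(0xFFFFFFFF, 2)     # (-1) * 2 = -2, high word 0xFFFFFFFF
hi_u = mul_high_unsigned(0xFFFFFFFF, 2)   # 0x1FFFFFFFE, high word 1
```

The same operand patterns give different high-order words, which is why separate signed and unsigned forms exist.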
mulli rD,rA,SIMM
[POWER mnemonic: muli]
07 D A SIMM
0 5 6 10 11 15 16 31
prod[0–47]←(rA)∗SIMM
rD←prod[16–47]
The low-order 32 bits of the 48-bit product (rA)∗SIMM are placed into rD. The low-order
32 bits of the product are independent of whether the operands are treated as signed or
unsigned integers.
Other registers altered:
• None
31 D A B OE 235 Rc
0 5 6 10 11 15 16 20 21 22 30 31
rD←rA[32–63]∗rB[32–63]
The low-order 32 bits of the 64-bit product (rA)∗(rB) are placed into rD. The low-order
32 bits of the product are independent of whether the operands are treated as signed or
unsigned integers. However, OV is set based on the result interpreted as a signed integer.
If the smaller absolute value of the two multipliers is placed in rB, the instruction may
complete execution more quickly. See Chapter 7, “Instruction Timing,” for additional
information about instruction performance.
If OE=1, then OV is set to one if the product cannot be represented in 32 bits.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
• XER:
Affected: SO, OV (if OE=1)
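The low-order multiply and its overflow condition can be sketched as follows. The function name is hypothetical, and the sketch returns the OV condition as a Boolean rather than updating XER.

```python
def to_signed32(x):
    """Interpret a 32-bit pattern as a signed integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def mul_low(a, b):
    """Return (low-order 32 bits of the product, overflow condition)."""
    prod = to_signed32(a) * to_signed32(b)
    overflow = not (-(1 << 31) <= prod < (1 << 31))   # not representable in 32 bits
    return prod & 0xFFFFFFFF, overflow

low, ov = mul_low(0xFFFFFFFF, 3)          # (-1) * 3 = -3
unsigned_low = (0xFFFFFFFF * 3) & 0xFFFFFFFF
```

The low-order word is 0xFFFFFFFD whether the operands are read as signed or unsigned; only the overflow test depends on the signed interpretation.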
Reserved
31 D A 00000 OE 488 Rc
0 5 6 10 11 15 16 20 21 22 30 31
31 S A B 476 Rc
0 5 6 10 11 15 16 20 21 30 31
The contents of rS are ANDed with the contents of rB and the one’s complement of the
result is placed into rA.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
NAND with rS=rB can be used to obtain the one's complement of rS.
Reserved
31 D A 00000 OE 104 Rc
0 5 6 10 11 15 16 20 21 22 30 31
rD← ¬ (rA) + 1
31 S A B 124 Rc
0 5 6 10 11 15 16 20 21 30 31
The contents of rS are ORed with the contents of rB and the one’s complement of the result
is placed into rA.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
or rA,rS,rB (Rc=0)
or. rA,rS,rB (Rc=1)
31 S A B 444 Rc
0 5 6 10 11 15 16 20 21 30 31
rA←(rS) | (rB)
The contents of rS are ORed with the contents of rB and the result is placed into rA.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
31 S A B 412 Rc
0 5 6 10 11 15 16 20 21 30 31
rA ← (rS) | ¬ (rB)
The contents of rS are ORed with the complement of the contents of rB and the result is
placed into rA.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
ori rA,rS,UIMM
[POWER mnemonic: oril]
24 S A UIMM
0 5 6 10 11 15 16 31
The contents of rS are ORed with x'0000' || UIMM and the result is placed into rA.
The preferred "no-op" (an instruction that does nothing) is:
ori 0,0,0
Other registers altered:
• None
oris rA,rS,UIMM
[POWER mnemonic: oriu]
25 S A UIMM
0 5 6 10 11 15 16 31
The contents of rS are ORed with UIMM || x'0000' and the result is placed into rA.
Other registers altered:
• None
Reserved
MSR[16–31]←SRR1[16–31]
NIA←iea SRR0[0–29] || b'00'
Bits 16–31 of SRR1 are placed into bits 16–31 of the MSR, then the next instruction is
fetched, under control of the new MSR value, from the address SRR0[0–29] || b'00'.
This is a supervisor-level instruction and is context synchronizing.
Other registers altered:
• MSR
22 S A B MB ME Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
20 S A SH MB ME Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
n←SH
r←ROTL(rS, n)
m←MASK(MB, ME)
rA←(r&m) | (rA & ¬ m)
The contents of rS are rotated left SH bits. A mask is generated having 1-bits from bit MB
through bit ME and 0-bits elsewhere. The rotated data is inserted into rA under control of
the generated mask.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
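The rotate-and-insert operation above can be sketched as follows. The helper names rotl32, mask, and rotate_insert are assumptions; the mask helper also handles the wrap-around case MB > ME, which the text above does not discuss.

```python
def rotl32(x, n):
    """ROTL: rotate a 32-bit value left by n bits."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def mask(mb, me):
    """MASK(MB, ME): 1-bits from bit mb through bit me, bit 0 being the MSB."""
    from_mb = 0xFFFFFFFF >> mb
    to_me = (0xFFFFFFFF << (31 - me)) & 0xFFFFFFFF
    return (from_mb & to_me) if mb <= me else (from_mb | to_me)

def rotate_insert(ra, rs, sh, mb, me):
    """rA <- (r & m) | (rA & ~m)."""
    r, m = rotl32(rs, sh), mask(mb, me)
    return (r & m) | (ra & ~m & 0xFFFFFFFF)

merged = rotate_insert(0xAAAAAAAA, 0x000000FF, 8, 16, 23)
```

Only bits 16–23 of the destination are replaced, so the result is 0xAAAAFFAA.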
21 S A SH MB ME Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
n←SH
r←ROTL(rS, n)
m←MASK(MB, ME)
rA←r & m
The contents of rS are rotated left SH bits. A mask is generated having 1-bits from bit MB
through bit ME and 0-bits elsewhere. The rotated data is ANDed with the generated mask
and the result is placed into rA.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
The rlwinm instruction can be used to extract an n-bit field that starts at bit position b in
rS[0–31], right-justified into rA (clearing the remaining 32 – n bits of rA), by setting
SH = b + n, MB = 32 – n, and ME = 31. It can be used to extract an n-bit field that starts at
bit position b in rS[0–31], left-justified into rA (clearing the remaining 32 – n bits of rA),
by setting SH = b, MB = 0, and ME = n – 1. It can be used to rotate the contents of a register
left (or right) by n bits, by setting SH = n (32 – n), MB = 0, and ME = 31. It can be used to
shift the contents of a register right by n bits, by setting SH = 32 – n, MB = n, and ME = 31.
It can be used to clear the high-order b bits of a register and then shift the result left by
n bits, by setting SH = n, MB = b – n, and ME = 31 – n. It can be used to clear the low-order
n bits of a register, by setting SH = 0, MB = 0, and ME = 31 – n.
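The parameter recipes listed above can be checked with a small model. This is a sketch under stated assumptions: rotl32, mask, and the Python function rlwinm are illustrative helpers, and only the value written to rA is modeled.

```python
def rotl32(x, n):
    """ROTL: rotate a 32-bit value left by n bits."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def mask(mb, me):
    """MASK(MB, ME): 1-bits from bit mb through bit me, bit 0 being the MSB."""
    from_mb = 0xFFFFFFFF >> mb
    to_me = (0xFFFFFFFF << (31 - me)) & 0xFFFFFFFF
    return (from_mb & to_me) if mb <= me else (from_mb | to_me)

def rlwinm(rs, sh, mb, me):
    """rA <- ROTL(rS, SH) & MASK(MB, ME)."""
    return rotl32(rs, sh) & mask(mb, me)

rs = 0x12345678
b, n = 8, 8                                    # the field in bits 8-15 is 0x34
right_justified = rlwinm(rs, b + n, 32 - n, 31)
left_justified = rlwinm(rs, b, 0, n - 1)
shifted_right_4 = rlwinm(rs, 32 - 4, 4, 31)
low_8_cleared = rlwinm(rs, 0, 0, 31 - 8)
```

For this input the four results are 0x00000034, 0x34000000, 0x01234567, and 0x12345600.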
23 S A B MB ME Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
n←rB[27-31]
r←ROTL(rS, n)
m←MASK(MB, ME)
rA←r & m
The contents of rS are rotated left the number of bits specified by rB[27–31]. A mask is
generated having 1-bits from bit MB through bit ME and 0-bits elsewhere. The rotated data
is ANDed with the generated mask and the result is placed into rA.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
The rlwnm instruction can be used to extract an n-bit field that starts at variable bit position b
in rS[0–31], right-justified into rA (clearing the remaining 32 – n bits of rA), by setting
rB[27–31] = b + n, MB = 32 – n, and ME = 31. It can be used to extract an n-bit field that
starts at variable bit position b in rS[0–31], left-justified into rA (clearing the remaining
32 – n bits of rA), by setting rB[27–31] = b, MB = 0, and ME = n – 1. It can be used to
rotate the contents of a register left (or right) by variable n bits, by setting rB[27–31] =
n (32 – n), MB = 0, and ME = 31.
Equivalent mnemonics are provided for some of these uses.
31 S A B 537 Rc
0 5 6 10 11 15 16 20 21 30 31
Reserved
This instruction calls the operating system to perform a service. When control is returned
to the program that executed the system call, the content of the registers depends on the
register conventions used by the program providing the system service.
This instruction is context synchronizing, as described in Section 3.1.2, “Context
Synchronization”. Although the PowerPC architecture considers sc to be a branch
processor instruction, it is executed by the integer processor in the 601.
Other registers altered:
• Dependent on the system service
POWER Compatibility Note: The PowerPC sc instruction is substantially different from
the POWER svc instruction. The following aspects of these instructions were considered
with respect to POWER compatibility:
The PowerPC architecture defines the sc instruction with the “LK” bit set to be an invalid
form. POWER architecture defines the svc instruction (same opcode as PowerPC sc
instruction) with the “LK” bit set as a valid form which places the address of the instruction
following the svc into the link register. In the case of the 601, an sc instruction with the
“LK” bit set will execute correctly (as defined in the PowerPC architecture) and will update
the link register with the address of the instruction following the sc instruction.
The PowerPC architecture defines the sc instruction in such a manner that requires bit 30
of the instruction to be b'1' (when bit 30 is b'0', the instruction is considered reserved). The
POWER architecture svc instruction does not have such a restriction, and uses this bit to
define an alternate form of the svc instruction. Although the 601 does not support this
alternate form of the svc instruction, it does ignore the state of bit 30 of the instruction
during decode and execution.
As a result of executing an sc instruction, the PowerPC architecture defines bits 0–15 of
register SRR1 to be undefined. In the case of the 601, execution of the sc instruction will
cause bits 16–31 of the instruction to be placed into bits 0–15 of register SRR1.
The effective (logical) address of the instruction following the system call instruction is
placed into SRR0. Bits 16–31 of the MSR are placed into bits 16–31 of SRR1.
31 S A B 153 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A B 217 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A SH 184 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A SH 248 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A B 216 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A B 152 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A B 24 Rc
0 5 6 10 11 15 16 20 21 30 31
n←rB[27-31]
rA←ROTL(rS, n)
If rB[26]=0, the contents of rS are shifted left the number of bits specified by rB[27–31].
Bits shifted out of position 0 are lost. Zeros are supplied to the vacated positions on the
right. The 32-bit result is placed into rA. If rB[26]=1, 32 zeros are placed into rA.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
31 S A SH 952 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A B 920 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A B 792 Rc
0 5 6 10 11 15 16 20 21 30 31
n←rB[27-31]
rA←ROTL(rS, n)
If rB[26]=0, then the contents of rS are shifted right the number of bits specified by
rB[27–31]. Bits shifted out of position 31 are lost. The result is padded on the left with sign
bits before being placed into rA. If rB[26]=1, then rA is filled with 32 sign bits (bit 0) from rS.
CR0 is set based on the value written into rA.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
• XER:
Affected: CA
31 S A SH 824 Rc
0 5 6 10 11 15 16 20 21 30 31
n←SH
rA←ROTL(rS, 32-n)
The contents of rS are shifted right SH bits. Bits shifted out of position 31 are lost. The
shifted value is sign extended before being placed in rA. The 32-bit result is placed into rA.
XER[CA] is set to 1 if rS contains a negative number and any 1-bits are shifted out of
position 31; otherwise XER[CA] is cleared to 0. A shift amount of zero causes XER[CA]
to be cleared to 0.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
• XER:
Affected: CA
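The XER[CA] rule above is easy to get wrong, so a small Python sketch may help. It is a model under stated assumptions: the name srawi_model is hypothetical, registers are plain integers, and the function returns the pair (rA, CA).

```python
MASK32 = 0xFFFFFFFF

def srawi_model(rs, sh):
    """Hypothetical model of shift right algebraic immediate: (rA, CA)."""
    rs &= MASK32
    sh &= 0x1F
    negative = bool(rs >> 31)
    shifted_out = rs & ((1 << sh) - 1)     # bits that leave through position 31
    ra = rs >> sh
    if negative and sh:
        ra |= (MASK32 << (32 - sh)) & MASK32   # sign-fill vacated positions
    ca = 1 if negative and shifted_out else 0  # shift of zero always gives CA = 0
    return ra, ca
```

With this model, shifting -1 (0xFFFFFFFF) right by one sets CA because a 1-bit is lost, while shifting -2 (0xFFFFFFFE) right by one leaves CA clear.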
31 S A B 665 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A B 921 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A B 729 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A SH 696 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A SH 760 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A B 728 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A B 664 Rc
0 5 6 10 11 15 16 20 21 30 31
31 S A B 536 Rc
0 5 6 10 11 15 16 20 21 30 31
n←rB[27-31]
rA←ROTL(rS, 32-n)
If rB[26]=0, the contents of rS are shifted right the number of bits specified by rB[27–31].
Bits shifted out of position 31 are lost. Zeros are supplied to the vacated positions on the
left. The 32-bit result is placed into rA.
If rB[26]=1, then rA is filled with zeros.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
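The pseudocode expresses the right shift as a left rotate by 32-n. The Python sketch below shows why that works once a mask clears the wrapped-around bits; the mask step, the helper rotl32, and the name srw_model are assumptions added for illustration.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n (taken modulo 32)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def srw_model(rs, rb):
    """Hypothetical model: rA result of a 32-bit logical right shift."""
    if rb & 0x20:                  # rB[26] = 1: rA is filled with zeros
        return 0
    n = rb & 0x1F                  # rB[27-31]
    mask = MASK32 >> n             # keeps only the bits that did not wrap
    return rotl32(rs & MASK32, 32 - n) & mask
```

For every count from 0 to 31 the result equals an ordinary logical right shift of the 32-bit source value.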
stb rS,d(rA)
38 S A d
0 5 6 10 11 15 16 31
if rA = 0 then b←0
else b←(rA)
EA←b + EXTS(d)
MEM(EA, 1)←rS[24-31]
EA is the sum (rA|0)+d. Register rS[24–31] is stored into the byte in memory addressed
by EA. Register rS is unchanged.
Other registers altered:
• None
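The effective-address rule (rA|0)+d used by the D-form stores can be sketched as follows. This is a minimal Python model, assuming a list of 32 integers for the GPRs and a dictionary for byte-addressed memory; the names exts16 and stb_model are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def exts16(d):
    """Sign-extend a 16-bit displacement field (EXTS in the pseudocode)."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def stb_model(gpr, mem, rs, ra, d):
    """Hypothetical model of a byte store; returns the effective address."""
    base = 0 if ra == 0 else gpr[ra]       # (rA|0): register number 0 means zero
    ea = (base + exts16(d)) & MASK32
    mem[ea] = gpr[rs] & 0xFF               # rS[24-31], the low-order byte
    return ea
```

Note that when the rA field is 0 the contents of GPR 0 are ignored, so the displacement alone forms the address.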
stbu rS,d(rA)
39 S A d
0 5 6 10 11 15 16 31
EA←(rA) + EXTS(d)
MEM(EA, 1)←rS[24-31]
rA←EA
EA is the sum (rA|0)+d. Register rS[24–31] is stored into the byte in memory addressed
by EA.
EA is placed into rA.
While the PowerPC architecture defines the instruction form as invalid if rA=0, the 601
supports execution with rA=0 as shown above.
Other registers altered:
• None
stbux rS,rA,rB
Reserved
31 S A B 247 0
0 5 6 10 11 15 16 20 21 30 31
EA←(rA) + (rB)
MEM(EA, 1)←rS[24-31]
rA←EA
EA is the sum (rA|0)+(rB). Register rS[24–31] is stored into the byte in memory addressed
by EA.
EA is placed into rA.
While the PowerPC architecture defines the instruction form as invalid if rA=0, the 601
supports execution with rA=0 as shown above.
Other registers altered:
• None
stbx rS,rA,rB
Reserved
31 S A B 215 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b←0
else b←(rA)
EA←b + (rB)
MEM(EA, 1) ← rS[24-31]
EA is the sum (rA|0)+(rB). Register rS[24–31] is stored into the byte in memory addressed
by EA. Register rS is unchanged.
Other registers altered:
• None
stfd frS,d(rA)
54 frS A d
0 5 6 10 11 15 16 31
if rA = 0 then b←0
else b←(rA)
EA←b + EXTS(d)
MEM(EA, 8)←(frS)
stfdu frS,d(rA)
55 frS A d
0 5 6 10 11 15 16 31
if rA = 0 then b←0
else b←(rA)
EA←b + EXTS(d)
MEM(EA, 8)←(frS)
rA←EA
stfdux frS,rA,rB
Reserved
31 frS A B 759 0
0 5 6 10 11 15 16 20 21 30 31
EA←(rA) + (rB)
MEM(EA, 8)←(frS)
rA←EA
stfdx frS,rA,rB
Reserved
31 frS A B 727 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b ←0
else b←(rA)
EA←b + (rB)
MEM(EA, 8)←(frS)
stfs frS,d(rA)
52 frS A d
0 5 6 10 11 15 16 31
if rA = 0 then b←0
else b←(rA)
EA←b + EXTS(d)
MEM(EA, 4)←SINGLE(frS)
stfsu frS,d(rA)
53 frS A d
0 5 6 10 11 15 16 31
EA←(rA) + EXTS(d)
MEM(EA, 4)←SINGLE(frS)
rA←EA
stfsux frS,rA,rB
Reserved
31 frS A B 695 0
0 5 6 10 11 15 16 20 21 30 31
EA←(rA) + (rB)
MEM(EA, 4)←SINGLE(frS)
rA←EA
stfsx frS,rA,rB
Reserved
31 frS A B 663 0
0 5 6 10 11 15 16 20 21 30 31
sth rS,d(rA)
44 S A d
0 5 6 10 11 15 16 31
if rA = 0 then b←0
else b←(rA)
EA←b + EXTS(d)
MEM(EA, 2)←rS[16-31]
EA is the sum (rA|0) + d. Register rS[16–31] is stored into the half word in memory
addressed by EA.
Other registers altered:
• None
sthbrx rS,rA,rB
Reserved
31 S A B 918 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b←0
else b←(rA)
EA←b + (rB)
MEM(EA, 2)←rS[24-31] || rS[16-23]
EA is the sum (rA|0)+(rB). The contents of rS[24–31] are stored into bits 0–7 of the half
word in memory addressed by EA. Bits rS[16–23] are stored into bits 8–15 of the half word
in memory addressed by EA.
Other registers altered:
• None
sthu rS,d(rA)
45 S A d
0 5 6 10 11 15 16 31
EA←(rA) + EXTS(d)
MEM(EA, 2)←rS[16-31]
rA←EA
EA is the sum (rA|0)+d. The contents of rS[16–31] are stored into the half word in memory
addressed by EA.
EA is placed into rA.
While the PowerPC architecture defines the instruction form as invalid if rA=0, the 601
supports execution with rA=0 as shown above.
Other registers altered:
• None
sthux rS,rA,rB
Reserved
31 S A B 439 0
0 5 6 10 11 15 16 20 21 30 31
EA←(rA) + (rB)
MEM(EA, 2)←rS[16-31]
rA←EA
EA is the sum (rA|0)+(rB). Register rS[16–31] is stored into the half word in memory
addressed by EA.
EA is placed into rA.
While the PowerPC architecture defines the instruction form as invalid if rA=0, the 601
supports execution with rA=0 as shown above.
Other registers altered:
• None
sthx rS,rA,rB
Reserved
31 S A B 407 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b←0
else b←(rA)
EA←b + (rB)
MEM(EA, 2)←rS[16-31]
EA is the sum (rA|0) + (rB). Register rS[16–31] is stored into the half word in memory
addressed by EA.
Other registers altered:
• None
stmw rS,d(rA)
[POWER mnemonic: stm]
47 S A d
0 5 6 10 11 15 16 31
if rA = 0 then b←0
else b←(rA)
EA←b + EXTS(d)
r←rS
do while r ≤ 31
MEM(EA, 4) ← GPR(r)
r←r + 1
EA← EA + 4
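The store-multiple loop above can be modeled in a few lines of Python. This is a sketch under assumptions: GPRs are a list of 32 integers, memory is a dictionary keyed by word address, and the names exts16 and stmw_model are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def exts16(d):
    """Sign-extend a 16-bit displacement field."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def stmw_model(gpr, mem, rs, ra, d):
    """Hypothetical model: store GPR rS through GPR 31 to consecutive words."""
    base = 0 if ra == 0 else gpr[ra]
    ea = (base + exts16(d)) & MASK32
    stored = 0
    for r in range(rs, 32):            # do while r <= 31
        mem[ea] = gpr[r] & MASK32      # MEM(EA, 4) <- GPR(r)
        ea = (ea + 4) & MASK32
        stored += 1
    return stored                      # number of words written
```

Starting at rS=29, for instance, three words are written, taken from GPR 29, GPR 30, and GPR 31 in that order.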
stswi rS,rA,NB
[POWER mnemonic: stsi]
Reserved
31 S A NB 725 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then EA←0
else EA←(rA)
if NB = 0 then n←32
else n←NB
r←rS-1
i←0
do while n>0
if i = 0 then r←r+1 (mod 32)
MEM(EA, 1)←GPR(r)[i–i+7]
i←i+8
if i = 32 then i←0
EA←EA+1
n←n-1
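The byte-by-byte loop above, including the wrap from GPR 31 to GPR 0 and the NB=0 case, can be sketched in Python. The model assumes a list of 32 integers for the GPRs and a dictionary for byte-addressed memory; store_string and stswi_model are hypothetical names.

```python
MASK32 = 0xFFFFFFFF

def store_string(gpr, mem, rs, ea, n):
    """Store n bytes, most significant byte of each register first."""
    r = (rs - 1) % 32
    i = 0                                   # bit offset within GPR(r)
    while n > 0:
        if i == 0:
            r = (r + 1) % 32                # next register, wrapping 31 -> 0
        mem[ea] = (gpr[r] >> (24 - i)) & 0xFF   # GPR(r)[i..i+7]
        i = (i + 8) % 32
        ea = (ea + 1) & MASK32
        n -= 1

def stswi_model(gpr, mem, rs, ra, nb):
    """Hypothetical model of the immediate form: NB = 0 means 32 bytes."""
    ea = 0 if ra == 0 else gpr[ra]
    store_string(gpr, mem, rs, ea, 32 if nb == 0 else nb)
```

The same store_string helper would serve the indexed form if the byte count were taken from XER[25-31] instead of NB.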
stswx rS,rA,rB
[POWER mnemonic: stsx]
Reserved
31 S A B 661 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b←0
else b←(rA)
EA←b+(rB)
n←XER[25-31]
r←rS-1
i←0
do while n>0
if i = 0 then r←r+1 (mod 32)
MEM(EA, 1)←GPR(r)[i–i+7]
i←i+8
if i = 32 then i←0
EA←EA+1
n←n-1
stw rS,d(rA)
[POWER mnemonic: st]
36 S A d
0 5 6 10 11 15 16 31
if rA = 0 then b←0
else b←(rA)
EA←b + EXTS(d)
MEM(EA, 4)←rS
EA is the sum (rA|0) + d. The contents of rS are stored into the word in memory addressed
by EA.
Other registers altered:
• None
stwbrx rS,rA,rB
[POWER mnemonic: stbrx]
Reserved
31 S A B 662 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b←0
else b←(rA)
EA←b + (rB)
MEM(EA, 4)←rS[24-31] || rS[16-23] || rS[8-15] || rS[0-7]
EA is the sum (rA|0) + (rB). The contents of rS[24–31] are stored into bits 0–7 of the word
in memory addressed by EA. Bits rS[16–23] are stored into bits 8–15 of the word in
memory addressed by EA. Bits rS[8–15] are stored into bits 16–23 of the word in memory
addressed by EA. Bits rS[0–7] are stored into bits 24–31 of the word in memory addressed
by EA.
Other registers altered:
• None
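The byte-reversed stores can be illustrated with Python's struct module. This sketch assumes that bits 0–7 of the addressed word or half word correspond to the lowest byte address (big-endian numbering); the function names are hypothetical.

```python
import struct

def stwbrx_bytes(rs):
    """Bytes written by a byte-reversed word store, lowest address first."""
    # Little-endian packing puts rS[24-31] first and rS[0-7] last.
    return struct.pack("<I", rs & 0xFFFFFFFF)

def sthbrx_bytes(rs):
    """Bytes written by a byte-reversed half-word store, lowest address first."""
    return struct.pack("<H", rs & 0xFFFF)
```

For rS = 0x11223344 the word form writes 0x44, 0x33, 0x22, 0x11 to ascending addresses, and the half-word form writes 0x44, 0x33.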
stwcx. rS,rA,rB
31 S A B 150 1
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b←0
else b←(rA)
EA←b + (rB)
if RESERVE then
MEM(EA, 4)←rS
RESERVE←0
CR0←0b00 || 0b1 || XER[SO]
else
CR0←0b00 || 0b0 || XER[SO]
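The conditional store above can be modeled as a small state machine. The Python sketch below is an assumption-laden illustration: the Cpu class, its fields, and stwcx_model are hypothetical, and, like the pseudocode, it tracks only a single reservation flag rather than a reserved address.

```python
MASK32 = 0xFFFFFFFF

class Cpu:
    """Minimal hypothetical processor state for the sketch."""
    def __init__(self):
        self.reserve = False   # RESERVE bit
        self.xer_so = 0        # XER[SO]
        self.mem = {}          # word-addressed memory

def stwcx_model(cpu, ea, value):
    """Returns the 4-bit CR0 field (LT, GT, EQ, SO)."""
    if cpu.reserve:
        cpu.mem[ea] = value & MASK32   # store is performed
        cpu.reserve = False
        eq = 1                         # CR0[EQ] = 1 reports success
    else:
        eq = 0                         # no store; CR0[EQ] = 0
    return (eq << 1) | cpu.xer_so
```

Software typically tests the EQ bit after the store and retries the load-and-reserve sequence when it is clear.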
stwu rS,d(rA)
[POWER mnemonic: stu]
37 S A d
0 5 6 10 11 15 16 31
EA←(rA) + EXTS(d)
MEM(EA, 4)←rS
rA←EA
EA is the sum (rA|0)+d. The contents of rS are stored into the word in memory addressed
by EA.
EA is placed into rA.
While the PowerPC architecture defines the instruction form as invalid if rA=0, the 601
supports execution with rA=0 as shown above.
Other registers altered:
• None
stwux rS,rA,rB
[POWER mnemonic: stux]
Reserved
31 S A B 183 0
0 5 6 10 11 15 16 20 21 30 31
EA←(rA) + (rB)
MEM(EA, 4)←rS
rA←EA
EA is the sum (rA|0)+(rB). The contents of rS are stored into the word in memory
addressed by EA.
EA is placed into rA.
While the PowerPC architecture defines the instruction form as invalid if rA=0, the 601
supports execution with rA=0 as shown above.
Other registers altered:
• None
stwx rS,rA,rB
[POWER mnemonic: stx]
Reserved
31 S A B 151 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b←0
else b←(rA)
EA←b + (rB)
MEM(EA, 4)←rS
EA is the sum (rA|0)+(rB). The contents of rS are stored into the word in memory
addressed by EA.
Other registers altered:
• None
31 D A B OE 40 Rc
0 5 6 10 11 15 16 20 21 22 30 31
rD ← ¬ (rA) + (rB) + 1
The sum ¬ (rA) + (rB) +1 is placed into rD.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
• XER:
Affected: SO, OV (if OE=1)
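The sum ¬(rA) + (rB) + 1 is the two's-complement form of (rB) minus (rA). A brief Python check, with the hypothetical name subf_sum and the overflow (OV, SO) updates omitted, is:

```python
MASK32 = 0xFFFFFFFF

def subf_sum(ra, rb):
    """Hypothetical model: 32-bit value of NOT(rA) + rB + 1."""
    return ((~ra & MASK32) + (rb & MASK32) + 1) & MASK32
```

For example, subf_sum(3, 10) gives 7, and subf_sum(10, 3) gives 0xFFFFFFF9, the 32-bit representation of -7.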
31 D A B OE 8 Rc
0 5 6 10 11 15 16 20 21 22 30 31
31 D A B OE 136 Rc
0 5 6 10 11 15 16 20 21 22 30 31
subfic rD,rA,SIMM
[POWER mnemonic: sfi]
08 D A SIMM
0 5 6 10 11 15 16 31
Reserved
31 D A 00000 OE 232 Rc
0 5 6 10 11 15 16 20 21 22 30 31
Reserved
31 D A 00000 OE 200 Rc
0 5 6 10 11 15 16 20 21 22 30 31
Reserved
The sync instruction provides an ordering function for the effects of all instructions
executed by a given processor. Executing a sync instruction ensures that all instructions
previously initiated by the given processor appear to have completed before any subsequent
instructions are initiated by the given processor. When the sync instruction completes, all
external accesses initiated by the given processor prior to the sync will have been
performed with respect to all other mechanisms that access memory.
The sync instruction can be used to ensure that the results of all stores into a data structure,
performed in a “critical section” of a program, are seen by other processors before the data
structure is seen as unlocked. The eieio instruction may be more appropriate than sync for
cases in which the only requirement is to control the order in which external references are
seen by I/O devices.
Other registers altered:
• None
tlbie rB
[POWER mnemonic: tlbi]
Reserved
VPI←rB[4–19]
Identify TLB entries corresponding to VPI
each such TLB entry←invalid
EA is the contents of rB. The translation lookaside buffer (TLB) entries
corresponding to the EA are made invalid (i.e., removed from the TLB).
Additionally, a TLB invalidate operation is broadcast on the system interface. The TLB
search is done regardless of the settings of MSR[IT] and MSR[DT]. Block address
translation for EA, if any, is ignored.
Because the 601 supports broadcast of TLB entry invalidate operations, the following must
be observed:
• The tlbie instruction(s) must be contained in a critical section, controlled by
software locking, so that tlbie is issued on only one processor at a time.
• A sync instruction must be issued after every tlbie and at the end of the critical
section. This causes the hardware to wait for the effects of the preceding tlbie
instruction(s) to propagate to all processors.
A processor detecting a TLB invalidate broadcast performs the following:
1. Prevents execution of any new load, store, cache control or tlbie instructions and
prevents any new reference or change bit updates
2. Waits for completion of any outstanding memory operations (including updates to
the reference and change bits associated with the entry to be invalidated)
3. Invalidates the two entries (both associativity classes) in the UTLB indexed by the
matching address
4. Resumes normal execution
This is a supervisor-level instruction. It is optional in the PowerPC architecture.
Nothing is guaranteed about instruction fetching in other processors if the tlbie instruction
deletes the page in which some other processor is currently executing.
Other registers altered:
• None
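The VPI extraction and the invalidation of both associativity classes can be sketched in Python. This is illustrative only: the number of sets (NUM_SETS), the use of the low-order VPI bits as the set index, and the list-of-pairs TLB layout are assumptions made for the sketch, not details taken from the text above.

```python
NUM_SETS = 128   # assumed set count, for illustration only

def vpi_from_ea(ea):
    """Extract rB[4-19]: drop the 12-bit byte offset, keep 16 bits."""
    return (ea >> 12) & 0xFFFF

def tlbie_model(tlb, ea):
    """Invalidate both entries of the set selected by the address."""
    index = vpi_from_ea(ea) % NUM_SETS   # assumed indexing scheme
    tlb[index] = [None, None]            # both associativity classes
    return index
```

In a multiprocessor system the real instruction must still be wrapped in a software lock and followed by sync, as listed above; the sketch models only the local invalidation.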
tw TO,rA,rB
[POWER mnemonic: t]
Reserved
31 TO A B 4 0
0 5 6 10 11 15 16 20 21 30 31
a← EXTS(rA)
b← EXTS(rB)
if (a < b) & TO[0] then TRAP
if (a > b) & TO[1] then TRAP
if (a = b) & TO[2] then TRAP
if (a <U b) & TO[3] then TRAP
if (a >U b) & TO[4] then TRAP
The contents of rA are compared with the contents of rB. If any bit in the TO field is set to
1 and its corresponding condition is met by the result of the comparison, then the system
trap handler is invoked.
Other registers altered:
• None
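The five trap conditions combine signed and unsigned comparisons of the same operands. A minimal Python sketch, assuming the TO field is passed as a 5-bit integer whose most significant bit is TO[0], is shown below; to_signed and tw_traps are hypothetical names.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit value as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def tw_traps(to, ra, rb):
    """Return True if any enabled TO condition is met."""
    a_s, b_s = to_signed(ra), to_signed(rb)
    a_u, b_u = ra & MASK32, rb & MASK32
    conds = [a_s < b_s, a_s > b_s, a_s == b_s, a_u < b_u, a_u > b_u]
    # TO[i] is bit (4 - i) of the integer 'to'
    return any(c and (to >> (4 - i)) & 1 for i, c in enumerate(conds))
```

With rA = 0xFFFFFFFF and rB = 1, the signed less-than condition (TO[0]) and the unsigned greater-than condition (TO[4]) are both met, while the unsigned less-than condition (TO[3]) is not.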
twi TO,rA,SIMM
[POWER mnemonic: ti]
03 TO A SIMM
0 5 6 10 11 15 16 31
a← EXTS(rA)
if (a < EXTS(SIMM)) & TO[0] then TRAP
if (a > EXTS(SIMM)) & TO[1] then TRAP
if (a = EXTS(SIMM)) & TO[2] then TRAP
if (a <U EXTS(SIMM)) & TO[3] then TRAP
if (a >U EXTS(SIMM)) & TO[4] then TRAP
The contents of rA are compared with the sign-extended SIMM field. If any bit in the TO
field is set to 1 and its corresponding condition is met by the result of the comparison, then
the system trap handler is invoked.
Other registers altered:
• None
31 S A B 316 Rc
0 5 6 10 11 15 16 20 21 30 31
rA←(rS) ⊕ (rB)
The contents of rS are XORed with the contents of rB and the result is placed into rA.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
xori rA,rS,UIMM
[POWER mnemonic: xoril]
26 S A UIMM
0 5 6 10 11 15 16 31
xoris rA,rS,UIMM
[POWER mnemonic: xoriu]
27 S A UIMM
0 5 6 10 11 15 16 31
Table 10-7 provides a list of 32-bit SPR encodings that are not implemented by the 601.
Table 10-7. 32-Bit SPR Encodings Not Implemented by the PowerPC 601 Microprocessor
SPR Decimal   SPR[5–9]   SPR[0–4]   Register Name   Access
Table 10-9 provides the 64-bit SPR encoding that is not implemented by the 601.
Table 10-9. 64-Bit SPR Encoding Not Implemented by the PowerPC 601 Microprocessor
SPR Decimal   SPR[5–9]   SPR[0–4]   Register Name   Access
This document contains information on a new product under development by Motorola and IBM. Motorola and IBM reserve the right to change or
discontinue this product without notice. Information in this document is provided solely to enable system and software implementers to use PowerPC
microprocessors. There are no express or implied copyright or patent licenses granted hereunder by Motorola or IBM to design, modify the design of, or
fabricate circuits based on the information in this document.
The PowerPC 601 microprocessor embodies the intellectual property of Motorola and of IBM. However, neither Motorola nor IBM assumes any
responsibility or liability as to any aspects of the performance, operation, or other attributes of the microprocessor as marketed by the other party or by
any third party. Neither Motorola nor IBM is to be considered an agent or representative of the other, and neither has assumed, created, or granted hereby
any right or authority to the other, or to any third party, to assume or create any express or implied obligations on its behalf. Information such as data
sheets, as well as sales terms and conditions such as prices, schedules, and support, for the product may vary as between parties selling the product.
Accordingly, customers wishing to learn more information about the products as marketed by a given party should contact that party.
Both Motorola and IBM reserve the right to modify this manual and/or any of the products as described herein without further notice. NOTHING IN THIS
MANUAL, NOR IN ANY OF THE ERRATA SHEETS, DATA SHEETS, AND OTHER SUPPORTING DOCUMENTATION, SHALL BE INTERPRETED AS
THE CONVEYANCE BY MOTOROLA OR IBM OF AN EXPRESS WARRANTY OF ANY KIND OR IMPLIED WARRANTY, REPRESENTATION, OR
GUARANTEE REGARDING THE MERCHANTABILITY OR FITNESS OF THE PRODUCTS FOR ANY PARTICULAR PURPOSE. Neither Motorola nor
IBM assumes any liability or obligation for damages of any kind arising out of the application or use of these materials. Any warranty or other obligations
as to the products described herein shall be undertaken solely by the marketing party to the customer, under a separate sale agreement between the
marketing party and the customer. In the absence of such an agreement, no liability is assumed by Motorola, IBM, or the marketing party for any damages,
actual or otherwise.
“Typical” parameters can and do vary in different applications. All operating parameters, including “Typicals,” must be validated for each customer
application by customer’s technical experts. Neither Motorola nor IBM convey any license under their respective intellectual property rights nor the rights
of others. Neither Motorola nor IBM makes any claim, warranty, or representation, express or implied, that the products described in this manual are
designed, intended, or authorized for use as components in systems intended for surgical implant into the body, or other applications intended to support
or sustain life, or for any other application in which the failure of the product could create a situation where personal injury or death may occur. Should
customer purchase or use the products for any such unintended or unauthorized application, customer shall indemnify and hold Motorola and IBM and
their respective officers, employees, subsidiaries, affiliates, and distributors harmless against all claims, costs, damages, and expenses, and reasonable
attorney fees arising out of, directly or indirectly, any claim of personal injury or death associated with such unintended or unauthorized use, even if such
claim alleges that Motorola or IBM was negligent regarding the design or manufacture of the part.
Motorola and the Motorola logo are registered trademarks of Motorola, Inc. Motorola, Inc. is an Equal Opportunity/Affirmative Action Employer.
IBM and IBM logo are registered trademarks, and IBM Microelectronics is a trademark of International Business Machines Corp.
The PowerPC name, PowerPC logotype, PowerPC 601, PowerPC 603, PowerPC 603e, PowerPC Architecture, and POWER Architecture are trademarks
of International Business Machines Corp. used by Motorola under license from International Business Machines Corp. International Business Machines
Corp. is an Equal Opportunity/Affirmative Action Employer.