LatticeMico32 Processor Reference Manual: December 2012
Copyright
Copyright © 2012 Lattice Semiconductor Corporation.
Trademarks
Lattice Semiconductor Corporation, L Lattice Semiconductor Corporation (logo), L
(stylized), L (design), Lattice (design), LSC, CleanClock, Custom Mobile Device,
DiePlus, E2CMOS, Extreme Performance, FlashBAK, FlexiClock, flexiFLASH,
flexiMAC, flexiPCS, FreedomChip, GAL, GDX, Generic Array Logic, HDL Explorer,
iCE Dice, iCE40, iCE65, iCEblink, iCEcable, iCEchip, iCEcube, iCEcube2, iCEman,
iCEprog, iCEsab, iCEsocket, IPexpress, ISP, ispATE, ispClock, ispDOWNLOAD,
ispGAL, ispGDS, ispGDX, ispGDX2, ispGDXV, ispGENERATOR, ispJTAG, ispLEVER,
ispLeverCORE, ispLSI, ispMACH, ispPAC, ispTRACY, ispTURBO, ispVIRTUAL
MACHINE, ispVM, ispXP, ispXPGA, ispXPLD, Lattice Diamond, LatticeCORE,
LatticeEC, LatticeECP, LatticeECP-DSP, LatticeECP2, LatticeECP2M, LatticeECP3,
LatticeECP4, LatticeMico, LatticeMico8, LatticeMico32, LatticeSC, LatticeSCM,
LatticeXP, LatticeXP2, MACH, MachXO, MachXO2, MACO, mobileFPGA, ORCA,
PAC, PAC-Designer, PAL, Performance Analyst, Platform Manager, ProcessorPM,
PURESPEED, Reveal, SiliconBlue, Silicon Forest, Speedlocked, Speed Locking,
SuperBIG, SuperCOOL, SuperFAST, SuperWIDE, sysCLOCK, sysCONFIG, sysDSP,
sysHSI, sysI/O, sysMEM, The Simple Machine for Complex Design, TraceID,
TransFR, UltraMOS, and specific product designations are either registered
trademarks or trademarks of Lattice Semiconductor Corporation or its subsidiaries in
the United States and/or other countries. ISP, Bringing the Best Together, and More of
the Best are service marks of Lattice Semiconductor Corporation.
Other product names used in this publication are for identification purposes only and
may be trademarks of their respective companies.
Disclaimers
NO WARRANTIES: THE INFORMATION PROVIDED IN THIS DOCUMENT IS “AS IS”
WITHOUT ANY EXPRESS OR IMPLIED WARRANTY OF ANY KIND INCLUDING
WARRANTIES OF ACCURACY, COMPLETENESS, MERCHANTABILITY,
NONINFRINGEMENT OF INTELLECTUAL PROPERTY, OR FITNESS FOR ANY
PARTICULAR PURPOSE. IN NO EVENT WILL LATTICE SEMICONDUCTOR
CORPORATION (LSC) OR ITS SUPPLIERS BE LIABLE FOR ANY DAMAGES
WHATSOEVER (WHETHER DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR
CONSEQUENTIAL, INCLUDING, WITHOUT LIMITATION, DAMAGES FOR LOSS OF
PROFITS, BUSINESS INTERRUPTION, OR LOSS OF INFORMATION) ARISING
OUT OF THE USE OF OR INABILITY TO USE THE INFORMATION PROVIDED IN
THIS DOCUMENT, EVEN IF LSC HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES. BECAUSE SOME JURISDICTIONS PROHIBIT THE EXCLUSION
OR LIMITATION OF CERTAIN LIABILITY, SOME OF THE ABOVE LIMITATIONS MAY
NOT APPLY TO YOU.
Bold: Items in the user interface that you select or click. Text that you type
into the user interface.
Courier: Code examples. Messages, reports, and prompts from the software.
As systems become more complex, a growing number of L2 and
L3 protocols continue to burden the local host processor. These tend to
incrementally add processing requirements to the local processor, starving
other critical functions of processor machine cycles. To alleviate the local host
processor’s processing requirements, embedded processors are being
utilized to support the main processor in a distributed processing architecture.
These embedded processors offer localized control, OA&M functionality, and
statistics gathering and processing features, thereby saving the host
processor many unnecessary clock cycles, which can be used for higher-level
functions.
The LatticeMico32 processor provides the
performance required for a broad application set. Some of the key features of
this 32-bit processor include:
RISC architecture
32-bit data path
32-bit instructions
32 general-purpose registers
Up to 32 external interrupts
Optional instruction cache
Optional data cache
Dual WISHBONE memory interfaces (instruction and data)
Programmer’s Model
Pipeline Architecture
The LatticeMico32 processor uses a 32-bit, 6-stage pipeline, as shown in
Figure 3 on page 6. It is fully bypassed and interlocked. The bypass logic is
responsible for forwarding results back through the pipeline, allowing most
instructions to be effectively executed in a single cycle. The interlock is
responsible for detecting read-after-write hazards and stalling the pipeline
until the hazard has been resolved. This avoids the need to insert nop
directives between dependent instructions, keeping code size to a minimum,
as well as simplifying assembler-level programming.
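Figure 3: LatticeMico32 pipeline block diagram. Recoverable labels: PC; pipeline stages A, F, X, and M; instruction cache and instruction memory (WISHBONE); instruction register; bypass and interlock logic; divide; data cache and data memory (WISHBONE); align.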
Data Types
The LatticeMico32 processor supports the data types listed in Table 1.
In addition to the above, the extended data types in Table 2 can be emulated
through a compiler.
Register Architecture
This section describes the register architecture of the LatticeMico32
processor.
General-Purpose Registers
The LatticeMico32 features the following 32-bit registers:
By convention, register 0 (r0) must always hold the value 0. Both the
LatticeMico32 assembler and the C compiler require this for correct
operation. However, r0 is not hardwired to 0, so software must initialize
it to 0 after power-up.
Registers 1 through 28 are truly general purpose and can be used as the
source or destination register for any instruction. After reset, the values in
all of these registers are undefined.
Register 29 (ra) is used by the call instruction to save the return address
but is otherwise general purpose.
Register 30 (ea) is used to save the value of the Program Counter (PC)
when an exception occurs, so it should not be used by user-level
programs.
Register 31 (ba) saves the value of the Program Counter (PC) when a
breakpoint or watchpoint exception occurs, so it should not be used by
user-level programs.
After reset, the values in all of the above 32-bit registers are undefined. To
ensure that register 0 contains 0, the first instruction executed after reset
should be xor r0, r0, r0.
Table 3 lists the general-purpose registers and specifies their use by the C
compiler. In this table, the callee is the function called by the caller function.
r3 General-purpose/argument 2 Caller
r4 General-purpose/argument 3 Caller
r5 General-purpose/argument 4 Caller
r6 General-purpose/argument 5 Caller
r7 General-purpose/argument 6 Caller
r8 General-purpose/argument 7 Caller
r9 General-purpose Caller
PC No Program counter
PC – Program Counter
The PC CSR is a 32-bit register that contains the address of the instruction
currently being executed. Because all instructions are four bytes wide, the two
least significant bits of the PC are always zero. After reset, the value of the PC
CSR is h00000000.
IE – Interrupt Enable
The IE CSR contains a single-bit flag, IE, that determines whether interrupts
are enabled. This flag has priority over the IM CSR. In addition, there are two
bits, BIE and EIE, that are used to save the value of the IE field when either a
breakpoint or other exception occurs. Each interrupt also has its own mask
bit in the IM CSR, indexed by interrupt number. After reset, the value of the IE
CSR is h00000000.
IM – Interrupt Mask
The IM CSR contains an enable bit for each of the 32 interrupts. Bit 0
corresponds to interrupt 0. In order for an interrupt to be raised, both an
enable bit in this register and the IE flag in the IE CSR must be set to 1. After
reset, the value of the IM CSR is h00000000.
IP – Interrupt Pending
The IP CSR contains a pending bit for each of the 32 interrupts. A pending bit
is set when the corresponding interrupt request line is asserted low. Bit 0
corresponds to interrupt 0. Bits in the IP CSR can be cleared by writing a 1
with the wcsr instruction. Writing a 0 has no effect. After reset, the value of the
IP CSR is h00000000.
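The interaction of the IE flag with the IM and IP CSRs described above can be sketched as a small Python model. This is an illustration only, not the hardware implementation: the class and method names are hypothetical, and the IE flag is modeled as a plain boolean because the bit layout of the IE CSR is not given in this section.

```python
MASK32 = 0xFFFFFFFF


class InterruptModel:
    """Toy model of the IE flag and the IM and IP CSRs (hypothetical names)."""

    def __init__(self):
        # All three are zero after reset, as stated in the text.
        self.ie = False   # IE.IE global enable flag
        self.im = 0       # one enable bit per interrupt
        self.ip = 0       # one pending bit per interrupt

    def request(self, n):
        """Record that interrupt request line n (0 to 31) was asserted."""
        self.ip |= (1 << n) & MASK32

    def write_ip(self, value):
        """Model wcsr IP: a written 1 clears a pending bit, a 0 has no effect."""
        self.ip &= ~value & MASK32

    def raised(self):
        """Bit mask of interrupts that are pending, unmasked, and globally enabled."""
        return (self.ip & self.im) if self.ie else 0


model = InterruptModel()
model.request(3)
model.im = 1 << 3
print(model.raised())        # IE flag is still clear, so nothing is raised
model.ie = True
print(hex(model.raised()))   # interrupt 3 is now raised
model.write_ip(1 << 3)       # acknowledge interrupt 3 by writing a 1
print(hex(model.ip))
```

The model makes the priority of the IE flag over the IM CSR explicit: a set mask bit has no effect until the global flag is also set.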
I (any value): Invalidate data cache. When written, the contents of the data
cache are invalidated.
CC – Cycle Counter
The CC CSR is an optional 32-bit register that is incremented on each clock
cycle. It can be used to profile code sequences.
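A common way to use a free-running counter such as CC is to read it before and after the code of interest and subtract. The Python sketch below shows only the arithmetic; the function name is hypothetical, and it assumes the measured sequence takes fewer than 2^32 cycles, so that the 32-bit counter wraps at most once.

```python
MASK32 = 0xFFFFFFFF


def elapsed_cycles(start, end):
    """Cycles between two CC readings, tolerating one 32-bit wraparound."""
    return (end - start) & MASK32


print(elapsed_cycles(100, 350))
print(elapsed_cycles(0xFFFFFFF0, 0x00000010))   # counter wrapped in between
```

Masking the difference to 32 bits gives the correct result even when the second reading is numerically smaller than the first.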
CFG – Configuration
The CFG CSR details the configuration of a particular instance of a
LatticeMico32 processor.
U Reserved.
Table 8:
Field   Values                                        Description
DIM     0 – Data inline memory is not implemented.    Indicates whether data inline
        1 – Data inline memory is implemented.        memory is implemented.
Memory Architecture
This section describes the memory architecture of the LatticeMico32
processor.
Address Space
The LatticeMico32 processor has a flat 32-bit, byte-addressable address
space. By default, this address space is uncacheable. The designer can
configure the entire address space, or just a portion of it, to be cacheable. The
designer can also designate the entire uncacheable address space, or just a
portion of it, to be processor inline memory space. For LatticeMico32
processors with caches, the portion of the address space that is cacheable
can be configured separately for both the instruction and data cache. This
allows for the size of the cache tag RAMs to be optimized to be as small as is
required (the fewer the number of cacheable addresses, the smaller the tag
RAMs will be).
Endianness
The LatticeMico32 processor is big-endian, which means that multi-byte
objects, such as half-words and words, are stored with the most significant
byte at the lowest address.
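The byte ordering can be illustrated on a host machine with Python's standard struct module, which can pack values in big-endian order regardless of the host's own endianness. The value 0x12345678 is an arbitrary example.

```python
import struct

word = 0x12345678
memory = struct.pack(">I", word)    # ">" selects big-endian byte order

# The most significant byte (0x12) lands at the lowest address.
for offset, byte in enumerate(memory):
    print(f"address +{offset}: 0x{byte:02X}")

# A half-word read at offset 2 sees the two least significant bytes.
(half,) = struct.unpack_from(">H", memory, 2)
print(hex(half))
```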
Address Alignment
All memory accesses must be aligned to the size of the access, as shown in
Table 9. No check is performed for unaligned access. All unaligned accesses
result in undefined behavior.
Byte None
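Because the hardware performs no alignment check, software that computes addresses at run time may want to check them itself. The following Python sketch expresses the rule "aligned to the size of the access"; the function name is hypothetical, and the access sizes assume a 2-byte half-word and a 4-byte word.

```python
# Access sizes in bytes (assumed: half-word = 2 bytes, word = 4 bytes).
ACCESS_SIZE = {"byte": 1, "half-word": 2, "word": 4}


def is_aligned(address, kind):
    """True if the address is a multiple of the access size."""
    return address % ACCESS_SIZE[kind] == 0


print(is_aligned(0x1003, "byte"))       # bytes have no alignment requirement
print(is_aligned(0x1002, "half-word"))
print(is_aligned(0x1002, "word"))       # not a multiple of 4
```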
Stack Layout
Figure 12 shows the conventional layout of a stack frame. The stack grows
toward lower memory as data is pushed onto it. The stack pointer (sp) points
to the first unused location, and the frame pointer (fp) points at the first
location used in the active frame. In many cases, a compiler may be able to
eliminate the frame pointer, because data can often be accessed by using a
positive displacement from the stack pointer, freeing up the frame pointer for
use as a general-purpose register.
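Figure 12: Stack frame layout. From higher to lower addresses: incoming arguments, locals (fp marks the start of the active frame), callee saves, outgoing arguments, and free memory (sp marks the first unused location).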
Caches
A cache is a fast memory (single-cycle access) that stores a copy of a limited
subset of the data held in main memory, which may take the CPU several
cycles to access. A cache helps improve overall performance by exploiting
the fact that the same data is typically accessed several times in a short
interval. By storing a local copy of the data in the processor’s cache, the
multiple cycles required to access the data can be reduced to just a single
cycle for all subsequent accesses once the data is loaded into the cache.
Cache Architecture
When a program accesses a data item, it is also likely to access data at adjacent
addresses (such as with arrays or structures). The caches exploit this by loading
data in lines. A line can consist of 4, 8, or 16 adjacent bytes, as specified by the
BYTES_PER_LINE option.
Associativity 1, 2
Bytes-per-line 4, 8, 16
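To illustrate how line-based caching partitions an address, the following Python sketch splits a byte address into a tag, a set index, and a byte offset within the line. The field order (offset in the low bits, then set index, then tag) is the conventional arrangement and is an assumption here, because this section does not give the exact bit assignment. The function name and the default of 512 sets are likewise illustrative choices.

```python
def split_address(address, sets=512, bytes_per_line=16):
    """Split a byte address into (tag, set_index, offset) for a line-based cache.

    Assumes the conventional layout: offset in the low bits, then set index,
    then tag. This is not taken from the processor documentation.
    """
    offset = address % bytes_per_line
    set_index = (address // bytes_per_line) % sets
    tag = address // (bytes_per_line * sets)
    return tag, set_index, offset


tag, set_index, offset = split_address(0x00012344)
print(tag, set_index, offset)

# Two addresses in the same 16-byte line share a tag and a set index.
print(split_address(0x00012340)[:2] == split_address(0x0001234F)[:2])
```

Under this layout, fewer cacheable addresses mean fewer distinct tag values, which is why restricting the cacheable region allows smaller tag RAMs.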
The contents of the data cache can similarly be invalidated by writing to the
DCC CSR as follows:
wcsr DCC, r0
The LatticeMico32 caches are not kept consistent with respect to each other.
This means that if a store instruction writes to an area of memory that is
currently cached by the instruction cache, the instruction cache will not be
automatically updated to reflect the store. It is your responsibility to invalidate
the instruction cache after the write has taken place, if necessary.
Similarly, the caches do not snoop bus activity to monitor for writes by
peripherals (by DMA for example) to addresses that are cached. It is again
your responsibility to ensure that the cache is invalidated before reading
memory that may have been written by a peripheral.
Inline Memories
The LatticeMico32 processor enables you to optionally connect to on-chip
memory, through instruction and data ports, by using a local bus rather than
the WISHBONE interface. Memory connected to the CPU in such a manner is
referred to as inline memory. Figure 14 shows a functional block diagram of
the LatticeMico32 processor with inline memories. The addresses occupied
by inline memories are not cacheable.
Note
The Instruction Inline Memory is also connected to the Data Port of the
LatticeMico32 CPU in order to facilitate loading of the memory image of
the software application through the command line lm32-elf-gdb or
through the C/C++ SPE Debugger.
This diagram compares the number of cycles it takes the inline memory and the
WISHBONE-based on-chip EBR to service a read access from the LatticeMico32
CPU. A read access to inline memory completes in the next cycle, whereas a
read access to EBR takes four cycles. Writes initiated by the LatticeMico32
CPU behave similarly. This shows that deploying program code or data to inline
memory can provide at least a 3x speedup over WISHBONE-based memories.
Exceptions
Exceptions are events either inside or outside of the processor that cause a
change in the normal flow of program execution. The LatticeMico32 processor
can raise eight types of exceptions, as shown in Table 11. The exceptions are
listed in a decreasing order of priority, so if multiple exceptions occur
simultaneously, the exception with the highest priority is raised.
Exception Processing
Exceptions occur in the execute pipeline stage. It is possible to have two
exceptions occur simultaneously. In this situation the exception with the
highest priority is handled. The sequence of operations performed by the
processor after an exception depends on the type of the highest priority
exception that has occurred. Before the exception is handled, all instructions in
the Memory and Writeback pipeline stages are allowed to complete. Also, all
instructions in the Execute, Address, Fetch, and Decode pipeline stages are
squashed to ensure that they do not modify the processor state.
Non-Debug Exceptions
The Reset, Instruction Bus Error, Data Bus Error, Divide-By-Zero, Interrupt,
and System Call exceptions are classified as non-debug exceptions. The
following sequence of events occurs in one atomic operation:
ea = PC
IE.EIE = IE.IE
IE.IE = 0
PC = (DC.RE ? DEBA : EBA) + (ID * 32)
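The vector address computation in the last step can be worked through in Python. This is a sketch of the formula above only; the function name and the example base addresses are hypothetical, and the exception ID 6 used below is an illustrative value rather than one taken from Table 11.

```python
def exception_vector(exception_id, eba, deba, dc_re=False):
    """Handler address for a non-debug exception: base + ID * 32.

    The base is DEBA when DC.RE is set and EBA otherwise. Each vector slot is
    32 bytes, which is room for eight 4-byte instructions.
    """
    base = deba if dc_re else eba
    return (base + exception_id * 32) & 0xFFFFFFFF


# Example values (hypothetical): EBA = 0, DEBA = 0x10000000, exception ID 6.
print(hex(exception_vector(6, eba=0x00000000, deba=0x10000000)))
print(hex(exception_vector(6, eba=0x00000000, deba=0x10000000, dc_re=True)))
```

The 32-byte spacing is the reason each handler stub must occupy exactly eight instruction slots, padded with nop instructions where needed.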
Debug Exceptions
Whether the EBA or DEBA is used as the base address depends upon the
type of the exception that occurred, whether DC.RE is set, and whether
dynamic mapping of EBA to DEBA is enabled via the 'at_debug' input pin to
the processor. Having two different base addresses for the exception table
allows a debug monitor to exist in a different memory from the main program
code. For example, the debug monitor may exist in an on-chip ROM, whereas
the main program code may be in a DDR or SRAM. The DC.RE flag and
at_debug pin allow either interrupts to run at full speed during debugging or
the debugger to take complete control and handle all exceptions. When an
exception occurs, the only state that is automatically saved by the CPU is the
PC, which is saved in either ea or ba, and the interrupt enable flag, IE.IE,
which is saved in either IE.EIE or IE.BIE. It is the responsibility of the
exception handler to save and restore any other registers that it uses, if it
returns to the previously executing code.
The piece of code in Figure 16 shows how the exception handlers can be
implemented. The nops are required to ensure that the next exception handler
is aligned at the correct address. To ensure that this code is at the correct
address, it is common practice to place it in its own section. Place the
following assembler directive at the start of the code:
_reset_handler:
xor r0, r0, r0
bi _crt0
nop
nop
nop
nop
nop
nop
_breakpoint_handler:
sw (sp+0), ra
calli save_all
mvi r1, SIGTRAP
calli raise
bi restore_all_and_bret
nop
nop
nop
_instruction_bus_error_handler:
sw (sp+0), ra
calli save_all
mvi r1, SIGSEGV
calli raise
bi restore_all_and_eret
nop
nop
nop
_watchpoint_handler:
sw (sp+0), ra
calli save_all
mvi r1, SIGTRAP
calli raise
bi restore_all_and_bret
nop
nop
nop
_data_bus_error_handler:
sw (sp+0), ra
calli save_all
mvi r1, SIGSEGV
calli raise
bi restore_all_and_eret
nop
nop
nop
_divide_by_zero_handler:
sw (sp+0), ra
calli save_all
mvi r1, SIGFPE
calli raise
bi restore_all_and_eret
nop
nop
nop
_interrupt_handler:
sw (sp+0), ra
calli save_all
mvi r1, SIGINT
calli raise
bi restore_all_and_eret
nop
nop
nop
_system_call_handler:
sw (sp+0), ra
calli save_all
mv r1, sp
calli handle_scall
bi restore_all_and_eret
nop
nop
nop
Then in the linker script, place the code at the reset value of EBA or DEBA, as
shown in Figure 17.
SECTIONS
{
.boot : { *(.boot) } > ram
}
Nested Exceptions
Because different registers are used to save a state when a debug-related
exception occurs (ba and IE.BIE instead of ea and IE.EIE), limited nesting of
exceptions is possible, allowing the interrupt handler code to be debugged.
Any further nesting of exceptions requires software support.
To enable nested exceptions, an exception handler must save all the state
that is modified when an exception occurs, including the ea and ba registers,
as well as the IE CSR. These registers can simply be saved on the stack.
When returning from the exception handler, these registers must be
restored from the values saved on the stack.
Reset Summary
During reset, the following occurs:
All CSRs are set to their reset values as listed in “Control and Status
Registers” on page 9.
Interrupts are disabled.
All hardware breakpoints and watchpoints are disabled.
If implemented, the contents of the caches are invalidated.
A reset exception is raised, which causes the PC to be set to the value in
the EBA CSR, where program execution starts. The PC can be optionally
set to the value in the DEBA CSR by enabling dynamic mapping of
exception handlers to Debugger (i.e., mapping EBA to DEBA) and
asserting the at_debug pin.
The register file is not reset, so it is the responsibility of the reset exception
handler to set register 0 to 0. This should be achieved by executing the
following instruction: xor r0, r0, r0.
Using Breakpoints
The LatticeMico32 architecture supports both software and hardware
breakpoints. Software breakpoints should be used for setting breakpoints in
code that resides in volatile memory, such as DDR or SRAM, while hardware
breakpoints should be used for setting breakpoints in code that resides in
non-volatile memory, such as FLASH or ROM.
breakpoint is enabled (by the LSB being set to 1), a breakpoint exception will
be raised. As with software breakpoints, the address of the instruction that
caused the breakpoint is saved in the ba register. If the breakpoint exception
handler wishes to resume program execution, it must clear the enable bit in
the relevant BP CSR; otherwise, the breakpoint exception is raised as soon
as execution resumes.
Using Watchpoints
The LatticeMico32 architecture supports hardware watchpoints. Watchpoints
are a mechanism by which a program can watch out for specific memory
accesses. For example, a program can set up a watchpoint that will cause a
watchpoint exception to be raised every time the address 0 is accessed
(something that is useful for tracking down null pointer errors in C programs).
Debug Architecture
This section describes the debug architecture of the LatticeMico32 processor.
DC – Debug Control
The DC CSR contains flags that control debugging facilities. After reset, the
value of the DC CSR is h00000000. This CSR is only implemented if
DEBUG_ENABLED equals TRUE.
BPn – Breakpoint
The BPn CSRs hold an instruction breakpoint address and a control bit that
determines whether the breakpoint is enabled. Because instructions are
always word-aligned, only the 30 most significant bits of the breakpoint
address are needed. After reset, the value of the BPn CSRs is h00000000.
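As an illustration of how a word-aligned address and an enable bit can share one 32-bit register, the following Python sketch packs and unpacks a BPn value. It assumes the address occupies bits 31:2 and the enable flag is the least significant bit; the function names are hypothetical, and the remaining bit is simply left at zero.

```python
ADDRESS_MASK = 0xFFFFFFFC   # bits 31:2 (assumed position of the address field)
ENABLE_BIT = 0x1            # least significant bit (assumed enable flag)


def encode_bp(address, enable=True):
    """Combine a word-aligned instruction address with the enable flag."""
    if address & 0x3:
        raise ValueError("instruction addresses must be word-aligned")
    return (address & ADDRESS_MASK) | (ENABLE_BIT if enable else 0)


def decode_bp(value):
    """Return (address, enabled) from a BPn register value."""
    return value & ADDRESS_MASK, bool(value & ENABLE_BIT)


value = encode_bp(0x00001040)
print(hex(value))
print(decode_bp(value))

# Clearing the enable bit keeps the address but disables the breakpoint.
print(decode_bp(value & ~ENABLE_BIT))
```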
WPn – Watchpoint
The WPn CSRs hold data watchpoint addresses. After reset, the value of the
WPn CSRs is h00000000. These CSRs are only implemented if
DEBUG_ENABLED equals TRUE.
Instructions ending with the letter “i” use an immediate value instead of a
register. Instructions ending with “hi” use a 16-bit immediate and the high 16
bits from a register. Instructions ending with the letter “u” treat the data as
unsigned integers.
Arithmetic
The instruction set includes the standard 32-bit integer arithmetic operations.
Support for the multiply and divide instructions is optional.
Add: add, addi
Subtract: sub
Multiply: mul, muli
Divide and modulus: divu, modu
There are also instructions to sign-extend byte and half-word data to word
size. Support for these instructions is optional.
Sign-extend: sextb, sexth
Logic
The instruction set includes the standard 32-bit bitwise logic operations. Most
of the logic instructions also have 16-bit immediate or high 16-bit versions.
AND: and, andi, andhi
OR: or, ori, orhi
Exclusive-OR: xor, xori
Complement: not
NOR: nor, nori
Exclusive-NOR: xnor, xnori
Comparison
The instruction set has basic comparison instructions in register-to-register
and register-to-16-bit-immediate versions, with signed and unsigned variants
where applicable. The instructions return 1 if true and 0 if false.
Equal: cmpe, cmpei
Not equal: cmpne, cmpnei
Greater: cmpg, cmpgi, cmpgu, cmpgui
Greater or equal: cmpge, cmpgei, cmpgeu, cmpgeui
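The difference between the signed and unsigned comparisons shows up when the most significant bit of an operand is set. The following Python sketch models cmpg and cmpgu on 32-bit values; it is a behavioral illustration written for this purpose, not a description of the hardware, and the helper to_signed is a hypothetical name.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def cmpg(a, b):
    """Signed greater-than: returns 1 if true and 0 if false."""
    return int(to_signed(a) > to_signed(b))


def cmpgu(a, b):
    """Unsigned greater-than: returns 1 if true and 0 if false."""
    return int((a & MASK32) > (b & MASK32))


# 0xFFFFFFFF is -1 when signed but the largest value when unsigned.
print(cmpg(0xFFFFFFFF, 1))
print(cmpgu(0xFFFFFFFF, 1))
```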
Shift
The instruction set supports left and right shifting of data in general-purpose
registers. The number of bits to shift can be given through a register or a 5-bit
immediate. The right shift instruction has signed and unsigned versions (also
known as arithmetic and logical shifting). Support for shift instructions is
optional.
Left shift: sl, sli
Right shift: sr, sri, sru, srui
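The two right-shift flavors differ in what fills the vacated upper bits: sr copies the sign bit (arithmetic), while sru fills with zeros (logical). The Python sketch below models both on 32-bit values. Masking the shift count to 5 bits for register operands is an assumption made here for the model, by analogy with the 5-bit immediate.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def sl(value, count):
    """Left shift, result truncated to 32 bits."""
    return (value << (count & 31)) & MASK32


def sr(value, count):
    """Arithmetic (signed) right shift: the sign bit is replicated."""
    return (to_signed(value) >> (count & 31)) & MASK32


def sru(value, count):
    """Logical (unsigned) right shift: zeros are shifted in."""
    return (value & MASK32) >> (count & 31)


print(hex(sr(0x80000000, 4)))
print(hex(sru(0x80000000, 4)))
print(hex(sl(0x00000001, 31)))
```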
Data Transfer
Data transfer includes instructions that move data of byte, half-word, and
word sizes between memory and registers. Memory addresses are relative
and given as the sum of a general-purpose register and a signed 16-bit
immediate, for example, (r2+32).
Load register from memory: lb, lbu, lh, lhu, lw
Byte and half-word values are either sign-extended or zero-extended to fill
the register.
Store register to memory: sb, sh, sw
Byte and half-word values are taken from the lowest order part of the
register.
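The addressing and extension rules above can be modeled in a few lines of Python. This is a behavioral sketch with hypothetical function names; memory is represented as a plain dictionary from byte address to byte value purely for illustration.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to a Python integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def effective_address(base, imm16):
    """Register plus signed 16-bit immediate, wrapped to 32 bits."""
    return (base + sign_extend(imm16, 16)) & MASK32


def load_byte(memory, address, signed):
    """Model lb (signed=True) and lbu (signed=False) on a dict of bytes."""
    byte = memory[address] & 0xFF
    return sign_extend(byte, 8) & MASK32 if signed else byte


memory = {0x1020: 0x80}
address = effective_address(0x1000, 32)        # corresponds to (r2+32) with r2 = 0x1000
print(hex(address))
print(hex(load_byte(memory, address, signed=True)))    # sign-extended, as for lb
print(hex(load_byte(memory, address, signed=False)))   # zero-extended, as for lbu
print(hex(effective_address(0x1000, 0xFFFC)))          # immediate encodes -4
```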
There are also instructions for moving data from one register to another,
including general-purpose and control and status registers.
Move between general-purpose registers: mv
Move immediate to high 16 bits of register: mvhi
Read and write control and status register: rcsr, wcsr
This chapter describes possible configuration options that you can use for the
LatticeMico32 processor. You are expected to use the Lattice Mico System
Builder (MSB) tool to configure the LatticeMico32 processor. Use the
processor's configuration GUI, located in the MSB, to specify the Verilog
parameters of the processor's RTL. For more information on the processor's
configuration GUI, refer to LatticeMico32 online Help.
Configuration Options
Table 17 describes the Verilog parameters for the LatticeMico32 processor.
Parameter                     Values                         Default   Description
ICACHE_SETS                   128, 256, 512, 1024            512       Specifies the number of sets in the instruction cache.
DCACHE_SETS                   128, 256, 512, 1024            512       Specifies the number of sets in the data cache.
EBA_RESET                     Any 256-byte aligned address   0         Specifies the reset value of the EBA CSR.
EBR_POSEDGE_REGISTER_FILE     TRUE, FALSE                    FALSE     Use EBR to implement register file instead of distributed RAM (LUTs).
CFG_FAST_DOWNLOAD_ENABLED     TRUE, FALSE                    FALSE     Enable 16-entry FIFO in the debug core to improve download speeds for large applications.
EBR Use
The following details of embedded block RAM (EBR) use with different
configurations are based on the LatticeECP family of FPGAs.
Software-based debugging (DEBUG_ENABLED) requires two EBRs.
The instruction and data caches (ICACHE_ENABLED and
DCACHE_ENABLED, respectively) require EBR based on the size of the
cache:
cache size = sets × bytes per cache line × associativity
number of EBR = cache size/EBR_Size
For example, the default LatticeMico32 processor in the MSB has software-
based debugging, an instruction cache, and a data cache. Both caches have
512 sets, 16 bytes per cache line, and an associativity of 1.
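The two formulas can be worked through for the default configuration with a short Python sketch. The function names are hypothetical. The EBR capacity is device-dependent and is not given in this section, so it is left as a parameter, and the value passed in the example is a placeholder only. Rounding up to a whole number of EBRs is also an assumption of this sketch.

```python
def cache_size_bytes(sets, bytes_per_line, associativity):
    """cache size = sets x bytes per cache line x associativity"""
    return sets * bytes_per_line * associativity


def ebrs_needed(cache_bytes, ebr_size_bytes):
    """number of EBR = cache size / EBR_Size, rounded up to a whole block."""
    return -(-cache_bytes // ebr_size_bytes)


# Default caches: 512 sets, 16 bytes per line, associativity of 1.
size = cache_size_bytes(sets=512, bytes_per_line=16, associativity=1)
print(size)

# Placeholder EBR capacity for illustration; check the device data sheet.
EXAMPLE_EBR_BYTES = 1024
print(ebrs_needed(size, EXAMPLE_EBR_BYTES))
```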
WISHBONE Interconnect Architecture
CTI_IO( )
The cycle-type identifier CTI_IO( ) address tag provides additional information
about the current cycle. The master sends this information to the slave. The
slave can use this information to prepare the response for the next cycle.
011 Reserved
100 Reserved
101 Reserved
110 Reserved
BTE_IO( )
The burst-type extension BTE_IO( ) address tag provides additional
information about the current burst. The master sends this information to the
slave. This information is only relevant for incrementing bursts. In the future,
other burst types may use these signals. See Table 19 for BTE_IO(1:0) signal
incrementing and decrementing bursts.
00 Linear burst
Component Signals
In Mico System Builder (MSB), you define which components are in the
platform and what needs to communicate with what. When the platform
generator is run in MSB, it uses this information to build the WISHBONE-
The prefixes used in the port and signal naming are not described in this
section.
The port and signal descriptions that follow refer to the port or signal that ends
with the string in the title.
ADR_O [31:2]
The address output array ADR_O( ) is used to pass a binary address.
ADR_O( ) actually has a full 32 bits. But, because all addressing is on
DWORD (4-byte) boundaries, the lowest two bits are always zero.
DAT_O [31:0]
The data output array DAT_O( ) is used to store a binary value for output.
DAT_I [31:0]
The data input array DAT_I( ) is used to store a binary value for input.
SEL_O [3:0]
The Select Output array SEL_O( ) indicates where valid data is expected on
the DAT_I( ) signal array during READ cycles and where it is placed on the
DAT_O( ) signal array during WRITE cycles. The array boundaries are
determined by the granularity of a port.
WE_O
The write enable output WE_O indicates whether the current local bus cycle
is a READ or WRITE cycle. The signal is negated during READ cycles and is
asserted during WRITE cycles.
ACK_I
This signal is called the acknowledge input ACK_I. When asserted, the signal
indicates the normal termination of a bus cycle by the slave. Also see the
ERR_I and RTY_I signal descriptions.
ERR_I
The error input ERR_I indicates an abnormal cycle termination by the slave.
The source of the error and the response generated by the master depend
on the master functionality. Also see the ACK_I and RTY_I signal
descriptions.
RTY_I
The retry input RTY_I indicates that the interface is not ready to accept or
send data, so the cycle should be retried. The core functionality defines when
and how the cycle is retried. Also see the ACK_I and ERR_I signal
descriptions.
CTI_O [2:0]
For descriptions of the cycle-type identifier CTI_O( ), see “CTI_IO( )” on
page 40.
BTE_O [1:0]
For descriptions of the burst-type extension BTE_O( ), see “BTE_IO( )” on
page 41.
LOCK_O
The lock output LOCK_O, when asserted, indicates that the current bus cycle
cannot be interrupted. Lock is asserted to request complete ownership of the
bus. After the transfer starts, the INTERCON does not grant the bus to any
other master until the current master negates LOCK_O or CYC_O.
CYC_O
The cycle output CYC_O, when asserted, indicates that a valid bus cycle is in
progress. The signal is asserted for the duration of all bus cycles. For
example, during a BLOCK transfer cycle there can be multiple data transfers.
The CYC_O signal is asserted during the first data transfer and remains
asserted until the last data transfer. The CYC_O signal is useful for interfaces
with multi-port interfaces, such as dual-port memories. In these cases, the
CYC_O signal requests the use of a common bus from an arbiter.
STB_O
The strobe output STB_O indicates a valid data transfer cycle. It is used to
qualify various other signals on the interface, such as SEL_O( ). The slave
asserts either the ACK_I, ERR_I, or RTY_I signals in response to every
assertion of the STB_O signal.
ADR_I [31:2]
The address input array ADR_I( ) is used to pass a binary address. ADR_I( )
actually has a full 32 bits. But, because all addressing is on DWORD (4-byte)
boundaries, the lowest two bits are always zero.
DAT_I [31:0]
The data input array DAT_I( ) is used to store a binary value for input.
DAT_O [31:0]
The data output array DAT_O( ) is used to store a binary value for output.
SEL_I [3:0]
The select input array SEL_I( ) indicates where valid data is placed on the
DAT_I( ) signal array during WRITE cycles and where it should be present on
the DAT_O( ) signal array during READ cycles. The array boundaries are
determined by the granularity of a port.
WE_I
The write enable Input WE_I indicates whether the current local bus cycle is a
READ or WRITE cycle. The signal is negated during READ cycles and is
asserted during WRITE cycles.
ACK_O
The acknowledge output ACK_O, when asserted, indicates the termination of
a normal bus cycle by the slave. Also see the ERR_O and RTY_O signal
descriptions.
ERR_O
The error output ERR_O indicates an abnormal cycle termination by the
slave. The source of the error and the response generated by the master
depend on the master functionality. Also see the ACK_O and RTY_O signal
descriptions.
RTY_O
The retry output RTY_O indicates that the slave interface is not ready to
accept or send data, so the cycle should be retried. The core functionality
defines when and how the cycle is retried. Also see the ACK_O and ERR_O
signal descriptions.
CTI_I [2:0]
For descriptions of the cycle-type identifier CTI_I( ), see “CTI_IO( )” on
page 40.
BTE_I [1:0]
For descriptions of the burst-type extension BTE_I( ), see “BTE_IO( )” on
page 41.
LOCK_I
The lock input LOCK_I, when asserted, indicates that the current bus cycle is
uninterruptible. A slave that receives the LOCK_I signal is accessed by
a single master only until either LOCK_I or CYC_I is negated.
CYC_I
The cycle input CYC_I, when asserted, indicates that a valid bus cycle is in
progress. The signal is asserted for the duration of all bus cycles. For
example, during a BLOCK transfer cycle there can be multiple data transfers.
The CYC_I signal is asserted during the first data transfer and remains
asserted until the last data transfer.
STB_I
The strobe input STB_I, when asserted, indicates a valid data transfer cycle.
A slave responds to other WISHBONE signals only when this STB_I is
asserted, except for the RST_I signal, to which it should always respond. The
slave asserts either the ACK_O, ERR_O, or RTY_O signals in response to
every assertion of the STB_I signal.
Arbitration Schemes
MSB supports the following arbitration schemes for platform generation:
Shared-bus arbitration schemes
Slave-side fixed arbitration schemes
Slave-side round-robin arbitration schemes
Shared-Bus Arbitration
The shared-bus arbitration scheme is shown in Figure 23.
Figure 23: Shared-bus arbitration (shared-bus arbiter on the WISHBONE bus).
In the shared-bus arbitration scheme, one or more bus masters and bus
slaves connect to a shared bus. A single arbiter controls the bus, that is, the
path between masters and slaves. Each bus master requests control of the
bus from the arbiter, and the arbiter grants access to a single master at a time.
Once a master has control of the bus, it performs transfers with a bus slave. If
multiple masters attempt to access the bus at the same time, the arbiter
allocates the bus resources to a single master according to fixed arbitration
rules, forcing all other masters to wait.
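The grant decision of a fixed-priority shared-bus arbiter can be sketched in Python as follows. This is a behavioral illustration only: the function name is hypothetical, and treating the lowest-numbered master as the highest priority is an assumption, since the text says only that the arbitration rules are fixed.

```python
def fixed_priority_grant(requests):
    """Return the index of the master granted the bus, or None if idle.

    `requests` is a list of booleans, one per master. Index 0 is assumed
    to have the highest priority.
    """
    for master, requesting in enumerate(requests):
        if requesting:
            return master
    return None


# Masters 1 and 2 request at the same time: master 1 wins, master 2 waits.
print(fixed_priority_grant([False, True, True]))
print(fixed_priority_grant([False, False, False]))
```

Because only one master is granted at a time, every other requesting master stalls, even when it targets a different slave.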
Slave-Side Arbitration
Slave-side arbitration is shown in Figure 24.
Figure 24: Slave-side arbitration (Master 1 and Master 2 connected through an arbiter).
In slave-side arbitration, each multi-master slave has its own arbiter. A master
port never waits to access a slave port, unless a different master port attempts
to access the same slave port at the same time. As a result, multiple master
ports can transfer data simultaneously, provided that each accesses a
different slave port.
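The following sketch models one cycle of slave-side arbitration. The names (slave_side_grants, "m1", "sram", and so on) are hypothetical, and the per-slave rule used here (the alphabetically first master wins) is an assumption made for illustration; it stands in for whichever fixed or round-robin rule the slave's arbiter applies.

```python
def slave_side_grants(targets):
    """Model one cycle of slave-side arbitration.

    targets: dict mapping master name -> slave name it wants to access.
    Returns (grants, waiting): grants maps slave -> granted master, and
    waiting lists the masters that lost arbitration this cycle.
    """
    grants = {}
    waiting = []
    for master in sorted(targets):  # assumed priority order, illustrative
        slave = targets[master]
        if slave in grants:
            waiting.append(master)  # same slave already granted: must wait
        else:
            grants[slave] = master  # independent slave: no waiting
    return grants, waiting


# Different slaves: both masters proceed in the same cycle.
print(slave_side_grants({"m1": "sram", "m2": "uart"}))
# Same slave: one master waits.
print(slave_side_grants({"m1": "sram", "m2": "sram"}))
```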
Instruction Set
Instruction Formats
All LatticeMico32 instructions are 32 bits wide. They are in four basic formats,
as shown in Figure 25 through Figure 28.
Pseudo-Instructions
To make assembler programs easier to read, the LatticeMico32 assembler
implements a variety of pseudo-instructions. Table 21 lists these
pseudo-instructions and the actual instructions to which they are mapped.
Disassemblers show the actual implementation.
Mnemonic          Implementation        Description
mvhi rX, imm16    orhi rX, r0, imm16    Moves the 16-bit, left-shifted immediate into rX.
not rX, rY        xnor rX, rY, r0       Takes the bitwise complement of the value in rY
                                        and stores the result in rX.
mvi               addi rd, r0, imm16    Adds the 16-bit immediate to r0 and stores the
                                        result in rd.
Note: The GCC compiler tool chain expects the contents of r0 to be zero.
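The mapping in Table 21 amounts to a textual rewrite, which the sketch below illustrates. It is not the assembler's implementation: the expand function and PSEUDO table are hypothetical, and the operand form "mvi rd, imm16" is assumed from the addi mapping because the table entry does not show mvi's operands.

```python
# Pseudo-instruction -> function producing the actual instruction text.
PSEUDO = {
    "mvhi": lambda rx, imm: f"orhi {rx}, r0, {imm}",
    "not": lambda rx, ry: f"xnor {rx}, {ry}, r0",
    "mvi": lambda rd, imm: f"addi {rd}, r0, {imm}",  # operands assumed
}


def expand(line):
    """Rewrite one assembly line, leaving actual instructions unchanged."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [op.strip() for op in rest.split(",")] if rest else []
    if mnemonic in PSEUDO:
        return PSEUDO[mnemonic](*operands)
    return line.strip()


print(expand("mvhi r1, 0x1234"))
print(expand("not r2, r3"))
```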
Instruction Descriptions
Some of the following tables include these parameters:
Syntax – Describes the assembly language syntax for the instruction.
Issue – The number of cycles that the microprocessor takes to place this
instruction in the pipeline. For example, if the issue is 1 cycle, the next
instruction is introduced into the pipeline in the very next cycle. If the
issue is 4 cycles, the next instruction is delayed by a further three cycles.
Branches and calls have an issue of 4 cycles, which means that the pipeline
stalls for the next three cycles.
Semantics – Describes how the instruction creates a result from the inputs
and where it puts the result. The Semantics feature refers to terms used in
the assembly language syntax for the instruction.
The Semantics feature also uses the following terms:
gpr – Refers to a general-purpose register.
PC – Refers to the program counter.
csr – Refers to a control and status register.
IE.BIE – Refers to the BIE bit of the IE (interrupt enable) register.
IE.IE – Refers to the IE bit of the IE (interrupt enable) register.
IE.EIE – Refers to the EIE bit of the IE (interrupt enable) register.
EBA – See “EBA – Exception Base Address” on page 12.
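As a rough illustration of the Issue parameter, the sketch below computes the cycle in which each instruction of a straight-line sequence enters the pipeline. The issue values are taken from the tables in this section; the issue_schedule function is hypothetical, and the model deliberately ignores result latency, interlocks, and memory wait states.

```python
# Issue cycles as listed in the instruction tables of this section.
ISSUE_CYCLES = {"add": 1, "addi": 1, "lw": 1, "b": 4, "bi": 4, "call": 4, "divu": 34}


def issue_schedule(mnemonics):
    """Return (mnemonic, cycle at which it enters the pipeline) pairs."""
    cycle = 0
    schedule = []
    for mnemonic in mnemonics:
        schedule.append((mnemonic, cycle))
        cycle += ISSUE_CYCLES[mnemonic]  # the next instruction waits this long
    return schedule


# The instruction after bi enters four cycles after bi (three stall cycles).
print(issue_schedule(["add", "bi", "add"]))
```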
add
Figure 29: add Instruction
Description Adds the value in rY to the value in rZ, storing the result in rX.
Result 1 cycle
Issue 1 cycle
addi
Figure 30: addi Instruction
Description Adds the value in rY to the sign-extended immediate, storing the result
in rX.
Result 1 cycle
Issue 1 cycle
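The effect of sign-extending the immediate can be modelled as follows. This is a sketch under the assumption that the 32-bit result wraps modulo 2^32; the helper names are illustrative.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16):
    """Interpret a 16-bit field as a signed value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def addi(ry_value, imm16):
    """Model of addi: rX = rY + sign_extend(imm16), wrapped to 32 bits."""
    return (ry_value + sign_extend16(imm16)) & MASK32


# An immediate of 0xFFFF is -1 after sign extension.
print(addi(10, 0xFFFF))
```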
and
Figure 31: and Instruction
Description Bitwise AND of the value in rY with the value in rZ, storing the result in
rX.
Result 1 cycle
Issue 1 cycle
See Also andi, AND with immediate; andhi, AND with high 16 bits
andhi
Figure 32: andhi Instruction
Description Bitwise AND of the value in rY with the 16-bit, left-shifted immediate,
storing the result in rX.
Result 1 cycle
Issue 1 cycle
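A short model of the left-shifted immediate may help here. The sketch assumes that "left-shifted" means the 16-bit immediate occupies the upper half of the 32-bit operand, with zeros in the lower half; the function name is illustrative.

```python
MASK32 = 0xFFFFFFFF


def andhi(ry_value, imm16):
    """Model of andhi: rX = rY AND (imm16 << 16)."""
    return ry_value & ((imm16 & 0xFFFF) << 16) & MASK32


# Only bits 31..16 can survive; the low half-word is always cleared.
print(hex(andhi(0x12345678, 0xFF00)))
```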
andi
Figure 33: andi Instruction
Result 1 cycle
Issue 1 cycle
See Also and, AND between registers; andhi, AND with high 16 bits
b
Figure 34: b Instruction
Syntax b rX
Example b r3
Semantics PC = gpr[rX]
Issue 4 cycles
be
Figure 35: be Instruction
Description Compares the value in rX with the value in rY, branching to the address
given by the sum of the PC and the sign-extended immediate if the
values are equal.
bg
Figure 36: bg Instruction
Description Compares the value in rX with the value in rY, branching to the address
given by the sum of the PC and the sign-extended immediate if the
value in rX is greater than the value in rY. The values in rX and rY are
treated as signed integers.
bge
Figure 37: bge Instruction
Description Compares the value in rX with the value in rY, branching to the address
given by the sum of the PC and the sign-extended immediate if the
value in rX is greater than or equal to the value in rY. The values in rX
and rY are treated as signed integers.
bgeu
Figure 38: bgeu Instruction
Description Compares the value in rX with the value in rY, branching to the address
given by the sum of the PC and the sign-extended immediate if the
value in rX is greater than or equal to the value in rY. The values in rX
and rY are treated as unsigned integers.
bgu
Figure 39: bgu Instruction
Description Compares the value in rX with the value in rY, branching to the address
given by the sum of the PC and the sign-extended immediate if the
value in rX is greater than the value in rY. The values in rX and rY are
treated as unsigned integers.
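The signed branches (bg, bge) and the unsigned branches (bgu, bgeu) differ only in how the same 32-bit patterns are interpreted. The sketch below models the branch condition for bg and bgu; the function names are illustrative.

```python
def to_signed32(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def bg_taken(rx_value, ry_value):
    """bg: branch if rX > rY, both treated as signed."""
    return to_signed32(rx_value) > to_signed32(ry_value)


def bgu_taken(rx_value, ry_value):
    """bgu: branch if rX > rY, both treated as unsigned."""
    return (rx_value & 0xFFFFFFFF) > (ry_value & 0xFFFFFFFF)


# 0xFFFFFFFF is -1 when signed but the largest value when unsigned.
print(bg_taken(0xFFFFFFFF, 1), bgu_taken(0xFFFFFFFF, 1))
```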
bi
Figure 40: bi Instruction
Description Unconditional branch to the address given by the sum of the PC and the
sign-extended immediate.
Syntax bi imm26
Example bi label
Issue 4 cycles
bne
Figure 41: bne Instruction
Description Compares the value in rX with the value in rY, branching to the address
given by the sum of the PC and the sign-extended immediate if the
values are not equal.
break
Figure 42: break Instruction
Syntax break
Example break
Semantics gpr[ba] = PC
IE.BIE = IE.IE
IE.IE = 0
PC = DEBA + ID * 32
Issue 4 cycles
bret
Figure 43: bret Instruction
Syntax bret
Example bret
Semantics PC = gpr[ba]
IE.IE = IE.BIE
Issue 4 cycles
call
Figure 44: call Instruction
Description Adds 4 to the PC, storing the result in ra, then unconditionally branches
to the address in rX.
Syntax call rX
Example call r3
Semantics gpr[ra] = PC + 4
PC = gpr[rX]
Result 1 cycle
Issue 4 cycles
See Also calli, call with immediate; ret, return from call
calli
Figure 45: calli Instruction
Description Adds 4 to the PC, storing the result in ra, then unconditionally branches
to the address given by the sum of the PC and the sign-extended
immediate.
Semantics gpr[ra] = PC + 4
PC = PC + sign_extend(imm26 << 2)
Result 1 cycle
Issue 4 cycles
See Also call, call from register; ret, return from call
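The calli semantics above can be traced with a small model. The sketch follows the two Semantics lines literally; treating the shifted immediate as a 28-bit signed quantity and wrapping addresses to 32 bits are assumptions, and the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def calli(pc, imm26):
    """Return (ra, new_pc) after calli executes at address pc."""
    ra = (pc + 4) & MASK32                      # gpr[ra] = PC + 4
    offset = sign_extend(imm26 << 2, 28)        # sign_extend(imm26 << 2)
    return ra, (pc + offset) & MASK32           # PC = PC + offset


# Forward call by four words, then a backward call by one word.
print([hex(v) for v in calli(0x100, 4)])
print([hex(v) for v in calli(0x100, 0x3FFFFFF)])
```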
cmpe
Figure 46: cmpe Instruction
Description Compares the value in rY with the value in rZ, storing 1 in rX if they are
equal, otherwise 0.
Result 2 cycles
Issue 1 cycle
cmpei
Figure 47: cmpei Instruction
Result 2 cycles
Issue 1 cycle
cmpg
Figure 48: cmpg Instruction
Description Compares the value in rY with the value in rZ, storing 1 in rX if the value
in rY is greater than the value in rZ, 0 otherwise. Both operands are
treated as signed integers.
Result 2 cycles
Issue 1 cycle
See Also cmpgi, compare greater with immediate; cmpgu, compare greater,
unsigned; cmpgui, compare greater with immediate, unsigned
cmpgi
Figure 49: cmpgi Instruction
Result 2 cycles
Issue 1 cycle
See Also cmpg, compare greater between registers; cmpgu, compare greater,
unsigned; cmpgui, compare greater with immediate, unsigned
cmpge
Figure 50: cmpge Instruction
Description Compares the value in rY with the value in rZ, storing 1 in rX if the value
in rY is greater than or equal to the value in rZ, 0 otherwise. Both
operands are treated as signed integers.
Result 2 cycles
Issue 1 cycle
cmpgei
Figure 51: cmpgei Instruction
Result 2 cycles
Issue 1 cycle
cmpgeu
Figure 52: cmpgeu Instruction
Description Compares the value in rY with the value in rZ, storing 1 in rX if the value
in rY is greater than or equal to the value in rZ, 0 otherwise. Both
operands are treated as unsigned integers.
Result 2 cycles
Issue 1 cycle
See Also cmpge, compare between registers; cmpgei, compare with immediate;
cmpgeui, compare with immediate, unsigned
cmpgeui
Figure 53: cmpgeui Instruction
Result 2 cycles
Issue 1 cycle
See Also cmpge, compare between registers; cmpgei, compare with immediate;
cmpgeu, compare, unsigned
cmpgu
Figure 54: cmpgu Instruction
Description Compares the value in rY with the value in rZ, storing 1 in rX if the value
in rY is greater than the value in rZ, 0 otherwise. Both operands are
treated as unsigned integers.
Result 2 cycles
Issue 1 cycle
See Also cmpg, compare greater, signed; cmpgi, compare greater with
immediate; cmpgui, compare greater with immediate, unsigned
cmpgui
Figure 55: cmpgui Instruction
Result 2 cycles
Issue 1 cycle
See Also cmpg, compare greater, signed; cmpgi, compare greater with
immediate; cmpgu, compare greater, unsigned
cmpne
Figure 56: cmpne Instruction
Description Compares the value in rY with the value in rZ, storing 1 in rX if they are
not equal, 0 otherwise.
Result 2 cycles
Issue 1 cycle
cmpnei
Figure 57: cmpnei Instruction
Result 2 cycles
Issue 1 cycle
divu
Figure 58: divu Instruction
Description Divides the value in rY by the value in rZ, storing the quotient in rX. Both
operands are treated as unsigned integers.
Available only if the processor was configured with the
DIVIDE_ENABLED option.
Result 34 cycles
Issue 34 cycles
eret
Figure 59: eret Instruction
Syntax eret
Example eret
Semantics PC = gpr[ea]
IE.IE = IE.EIE
Result
Issue 3 cycles
lb
Figure 60: lb Instruction
Description Loads a byte from memory at the address specified by the sum of the
value in rY and the sign-extended immediate, storing the
sign-extended result into rX.
Result 3 cycles
Issue 1 cycle
See Also lbu, load byte, unsigned; lh, load half-word, signed; lhu, load half-word,
unsigned; lw, load word
lbu
Figure 61: lbu Instruction
Description Loads a byte from memory at the address specified by the sum of the
value in rY and the sign-extended immediate, storing the
zero-extended result into rX.
Result 3 cycles
Issue 1 cycle
See Also lb, load byte, signed; lh, load half-word, signed; lhu, load half-word,
unsigned; lw, load word
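The difference between lb and lbu lies only in how the loaded byte is widened to 32 bits. The sketch below models both, with memory represented as a hypothetical Python dict that maps addresses to byte values; the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16):
    """Interpret a 16-bit field as a signed value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def load_byte(memory, ry_value, imm16, signed):
    """Model of lb (signed=True) and lbu (signed=False)."""
    address = (ry_value + sign_extend16(imm16)) & MASK32
    byte = memory[address] & 0xFF
    if signed and byte & 0x80:
        return byte | 0xFFFFFF00  # lb: copy bit 7 into bits 31..8
    return byte                   # lbu: bits 31..8 are zero


memory = {0x108: 0x80}
print(hex(load_byte(memory, 0x100, 8, True)), hex(load_byte(memory, 0x100, 8, False)))
```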
lh
Figure 62: lh Instruction
Description Loads a half-word from memory at the address specified by the sum of
the value in rY and the sign-extended immediate, storing the
sign-extended result into rX.
Result 3 cycles
Issue 1 cycle
See Also lb, load byte, signed; lbu, load byte, unsigned; lhu, load half-word,
unsigned; lw, load word
lhu
Figure 63: lhu Instruction
Description Loads a half-word from memory at the address specified by the sum of
the value in rY and the sign-extended immediate, storing the
zero-extended result into rX.
Result 3 cycles
Issue 1 cycle
See Also lb, load byte, signed; lbu, load byte, unsigned; lh, load half-word, signed;
lw, load word
lw
Figure 64: lw Instruction
Description Loads a word from memory at the address specified by the sum of the
value in rY and the sign-extended immediate, storing the result in rX.
Result 3 cycles
Issue 1 cycle
See Also lb, load byte, signed; lbu, load byte, unsigned; lh, load half-word, signed;
lhu, load half-word, unsigned
modu
Figure 65: modu Instruction
Description Divides the value in rY by the value in rZ, storing the remainder in rX.
Both operands are treated as unsigned integers.
Available only if the processor was configured with the
DIVIDE_ENABLED option.
Result 34 cycles
Issue 34 cycles
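The relationship between divu and modu can be checked with a small model. The function names are illustrative. In this sketch a zero divisor raises Python's ZeroDivisionError; that is only a stand-in for the processor's own behavior, which is covered under the DivideByZero exception elsewhere in this manual.

```python
MASK32 = 0xFFFFFFFF


def divu(ry_value, rz_value):
    """Model of divu: unsigned quotient of rY / rZ."""
    return (ry_value & MASK32) // (rz_value & MASK32)


def modu(ry_value, rz_value):
    """Model of modu: unsigned remainder of rY / rZ."""
    return (ry_value & MASK32) % (rz_value & MASK32)


# 0xFFFFFFFE is a large positive number here, not -2.
print(hex(divu(0xFFFFFFFE, 2)), modu(7, 3))
```

For any nonzero divisor, divu(a, b) * b + modu(a, b) equals a.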
mul
Figure 66: mul Instruction
Description Multiplies the value in rY by the value in rZ, storing the low 32 bits of the
product in rX.
Available only if the processor was configured with either the
MC_MULTIPLY_ENABLED or PL_MULTIPLY_ENABLED option.
Result 3 cycles
Issue 1 cycle
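Keeping only the low 32 bits of the product means that the upper bits of a large product are discarded, as the sketch below shows. The function name is illustrative.

```python
MASK32 = 0xFFFFFFFF


def mul(ry_value, rz_value):
    """Model of mul: low 32 bits of rY * rZ."""
    return ((ry_value & MASK32) * (rz_value & MASK32)) & MASK32


# 0x10000 * 0x10000 = 2**32, whose low 32 bits are all zero.
print(hex(mul(0x10000, 0x10000)), hex(mul(0xFFFFFFFF, 2)))
```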
muli
Figure 67: muli Instruction
Result 3 cycles
Issue 1 cycle
mv
Feature Description
Operation Move
Syntax mv rX, rY
Example mv r4, r2
Result 1 cycle
Issue 1 cycle
mvhi
Feature Description
Result 1 cycle
Issue 1 cycle
nor
Figure 68: nor Instruction
Description Bitwise NOR of the value in rY with the value in rZ, storing the result in
rX.
Result 1 cycle
Issue 1 cycle
nori
Figure 69: nori Instruction
Result 1 cycle
Issue 1 cycle
not
Feature Description
Description Bitwise complement of the value in rY, storing the result in rX.
This is a pseudo-instruction implemented with: xnor rX, rY, r0.
Result 1 cycle
Issue 1 cycle
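The reason that xnor with r0 produces a complement can be seen in a short model. The sketch assumes that r0 holds zero, as the GCC tool chain expects; the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF


def xnor(ry_value, rz_value):
    """Model of xnor: bitwise NOT of (rY XOR rZ), limited to 32 bits."""
    return ~(ry_value ^ rz_value) & MASK32


def not_(ry_value, r0_value=0):
    """Model of the not pseudo-instruction: xnor rX, rY, r0."""
    return xnor(ry_value, r0_value)


# XOR with zero leaves rY unchanged, so the final NOT complements it.
print(hex(not_(0x0F0F0F0F)))
```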
or
Figure 70: or Instruction
Description Bitwise OR of the value in rY with the value in rZ, storing the result in rX.
Result 1 cycle
Issue 1 cycle
ori
Figure 71: ori Instruction
Result 1 cycle
Issue 1 cycle
orhi
Figure 72: orhi Instruction
Result 1 cycle
Issue 1 cycle
rcsr
Figure 73: rcsr Instruction
Description Reads the value of the specified control and status register and stores it
in rX.
Result 1 cycle
Issue 1 cycle
ret
Feature Description
Syntax ret
Example ret
Semantics PC = gpr[ra]
Result
Issue 4 cycles
See Also call, function call from register; calli, function call with immediate
sb
Figure 74: sb Instruction
Description Stores the lower byte in rY into memory at the address specified by the
sum of the value in rX and the sign-extended immediate.
Syntax sb (rX+imm16), rY
Example sb (r2+8), r4
Result
Issue 1 cycle
scall
Figure 75: scall Instruction
Syntax scall
Example scall
Semantics gpr[ea] = PC
IE.EIE = IE.IE
IE.IE = 0
PC = (DC.RE ? DEBA : EBA) + ID * 32
Result
Issue 4 cycles
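The scall semantics above can be traced step by step with the following sketch. The function and its parameters are hypothetical; in particular, the exception ID is passed in as a parameter, and the value 7 used in the example is an illustrative assumption rather than a value taken from this section.

```python
def scall(pc, ie_bit, eba, deba, dc_re, exception_id):
    """Return the state changes that scall makes, per the Semantics rows."""
    base = deba if dc_re else eba          # DC.RE ? DEBA : EBA
    return {
        "ea": pc,                          # gpr[ea] = PC
        "IE.EIE": ie_bit,                  # IE.EIE = IE.IE
        "IE.IE": 0,                        # IE.IE = 0
        "PC": base + exception_id * 32,    # each handler slot is 32 bytes
    }


state = scall(pc=0x200, ie_bit=1, eba=0x1000, deba=0x8000, dc_re=False, exception_id=7)
print(hex(state["PC"]))
```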
sextb
Figure 76: sextb Instruction
Result 1 cycle
Issue 1 cycle
sexth
Figure 77: sexth Instruction
Result 1 cycle
Issue 1 cycle
sh
Figure 78: sh Instruction
Description Stores the lower half-word in rY into memory at the address specified by
the sum of the value in rX and the sign-extended immediate.
Syntax sh (rX+imm16), rY
Example sh (r2+8), r4
Result
Issue 1 cycle
sl
Figure 79: sl Instruction
Description Shifts the value in rY left by the number of bits specified by the value in
rZ, storing the result in rX.
Available only if the processor was configured with either the
MC_BARREL_SHIFT_ENABLED or PL_BARREL_SHIFT_ENABLED
option.
Result 2 cycles
Issue 1 cycle
sli
Figure 80: sli Instruction
Description Shifts the value in rY left by the number of bits specified by the
immediate, storing the result in rX.
Available only if the processor was configured with either the
MC_BARREL_SHIFT_ENABLED or PL_BARREL_SHIFT_ENABLED
option.
Result 2 cycles
Issue 1 cycle
sr
Figure 81: sr Instruction
Description Shifts the signed value in rY right by the number of bits specified by the
value in rZ, storing the result in rX.
Available only if the processor was configured with either the
MC_BARREL_SHIFT_ENABLED or PL_BARREL_SHIFT_ENABLED
option.
Result 2 cycles
Issue 1 cycle
See Also sri, shift right with immediate; sru, shift right, unsigned; srui, shift right
with immediate, unsigned
sri
Figure 82: sri Instruction
Description Shifts the signed value in rY right by the number of bits specified by the
immediate, storing the result in rX.
Available only if the processor was configured with either the
MC_BARREL_SHIFT_ENABLED or PL_BARREL_SHIFT_ENABLED
option.
Result 2 cycles
Issue 1 cycle
See Also sr, shift right from register; sru, shift right, unsigned; srui, shift right with
immediate, unsigned
sru
Figure 83: sru Instruction
Description Shifts the unsigned value in rY right by the number of bits specified by
the value in rZ, storing the result in rX.
Available only if the processor was configured with either the
MC_BARREL_SHIFT_ENABLED or PL_BARREL_SHIFT_ENABLED
option.
Result 2 cycles
Issue 1 cycle
See Also sr, shift right from register; sri, shift right with immediate; srui, shift right
with immediate, unsigned
srui
Figure 84: srui Instruction
Description Shifts the unsigned value in rY right by the number of bits specified by
the immediate, storing the result in rX.
Available only if the processor was configured with either the
MC_BARREL_SHIFT_ENABLED or PL_BARREL_SHIFT_ENABLED
option.
Result 2 cycles
Issue 1 cycle
See Also sr, shift right from register; sri, shift right with immediate; sru, shift right,
unsigned
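The signed shifts (sr, sri) and the unsigned shifts (sru, srui) differ in the bits that are shifted in from the left. The sketch below models both; masking the shift amount to five bits is an assumption of this example, and the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value


def sr(ry_value, shift):
    """Signed (arithmetic) shift right: copies of the sign bit enter on the left."""
    return (to_signed32(ry_value) >> (shift & 31)) & MASK32


def sru(ry_value, shift):
    """Unsigned (logical) shift right: zeros enter on the left."""
    return (ry_value & MASK32) >> (shift & 31)


print(hex(sr(0x80000000, 4)), hex(sru(0x80000000, 4)))
```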
sub
Figure 85: sub Instruction
Description Subtracts the value in rZ from the value in rY, storing the result in rX.
Result 1 cycle
Issue 1 cycle
sw
Figure 86: sw Instruction
Description Stores the value in rY into memory at the address specified by the sum
of the value in rX and the sign-extended immediate.
Syntax sw (rX+imm16), rY
Example sw (r2+8), r4
Result
Issue 1 cycle
See Also sb, store byte; sh, store half-word
wcsr
Figure 87: wcsr Instruction
Result 1 cycle
Issue 1 cycle
xnor
Figure 88: xnor Instruction
Description Bitwise exclusive-NOR of the value in rY with the value in rZ, storing the
result in rX.
Result 1 cycle
Issue 1 cycle
xnori
Figure 89: xnori Instruction
Result 1 cycle
Issue 1 cycle
xor
Figure 90: xor Instruction
Description Bitwise exclusive-OR of the value in rY with the value in rZ, storing the
result in rX.
Result 1 cycle
Issue 1 cycle
xori
Figure 91: xori Instruction
Result 1 cycle
Issue 1 cycle