The document discusses stack machine design and optimizations for the Java Virtual Machine (JVM). It proposes a two-level stack cache with the top two stack elements stored in registers and the rest in a dual-port on-chip memory. This design simplifies the pipeline and avoids forwarding logic compared to a three-port memory implementation. Evaluation shows the two-level cache approach achieves higher frequency and smaller size than alternatives like a register file cache or single-level RAM cache.
The document discusses stack machine design and optimizations for the Java Virtual Machine (JVM). It proposes a two-level stack cache with the top two stack elements stored in registers and the rest in a dual-port on-chip memory. This design simplifies the pipeline and avoids forwarding logic compared to a three-port memory implementation. Evaluation shows the two-level cache approach achieves higher frequency and smaller size than alternatives like a register file cache or single-level RAM cache.
The Java Virtual Machine JVM is a stack machine All instructions access the stack 40% access local variables Stack and local variables need caching
JOP Stack Architecture 3
An Efficient Stack Machine JVM stack is a logical stack Frame for return information Local variable area Operand stack We could use independent stacks Argument-passing regulates the layout
JOP Stack Architecture 4
Parameter passing int val = foo(1, 2); ... public int foo(int a, int b) { int c = 1; return a+b+c; }
The invocation sequence:
aload_0 // Push the object reference iconst_1 // and the parameter onto the iconst_2 // operand stack. invokevirtual #2 // Invoke method foo:(II)I. istore_1 // Store the result in val.
public int foo(int,int):
iconst_1 // The constant is stored in a method istore_3 // local variable (at position 3). iload_1 // Arguments are accessed as locals iload_2 // and pushed onto the operand stack. iadd // Operation on the operand stack. iload_3 // Push c onto the operand stack. iadd ireturn // Return value is on top of stack.
JOP Stack Architecture 5
Stack Layout
JOP Stack Architecture 6
Stack Content Operand stack A=B+C*D TOS and TOS-1 Stack JVM Local variable area push B iload_1 Former op stack push C iload_2 At a deeper position push D iload_3 Saved context * imul Between locals and + iadd operand stack pop A istore_0
JOP Stack Architecture 7
Stack access Stack operation Read TOS and TOS-1 Execute Write back TOS Variable load Read from deeper stack location Write into TOS Variable store Read TOS Write into deeper stack location JOP Stack Architecture 8 Three Port Stack Memory Single cycle execution Two read ports for TOS and TOS-1 or Local variable One write port for TOS or Local variable
JOP Stack Architecture 9
Register File Stack Cache
Register file as circular Instruction fetch
buffer - small Instruction decode Automatic spill/fill RF read and execute Five access ports RF write back picoJava, aJile
JOP Stack Architecture 10
On-chip Memory Stack Cache
Large cache Instruction fetch
Three-port memory Instruction decode Additional pipeline stage Memory read Komodo, FemtoJava Execute Memory write back JOP Stack Architecture 11 JVM Stack Access Revised ALU operation A is TOS A <- A op B B is TOS-1 B <- sm[p] sm is stack array p <- p -1 p points to TOS-2 Variable load (Push) v points to local area A <- sm[v+n] n is the local offset B <- A sm[p+1] <- B op is a two operand stack operation p <- p +1 Variable store (Pop) sm[v+n] <- A A <- B B <- sm[p] p <- p -1
JOP Stack Architecture 12
Do we need a 3-port memory? Stack operation: Dual read from TOS and TOS-1 Write to TOS Variable load/store: One read port One write port TOS and TOS-1 as register Deeper locations as on-chip memory JOP Stack Architecture 13 Two-Level Stack Cache
Dual read only from TOS and Instruction fetch
TOS-1 Instruction decode Two register (A/B) Execute, load or store Dual-port memory Simpler Pipeline No forwarding logic JOP Stack Architecture 14 Stack Caches Compared Design Cache fmax Size
(LC) (bit) (MHz) (word)
ALU - - 237 -
16 register 707 0 110 16
RAM 111 8192 153 128
Two-level 112 4096 213 130
JOP Stack Architecture 15
Summary The JVM is a stack machine Stack and local variables need caching Two-level cache Two top levels as register Rest as on-chip memory (two ports) Small design Short pipeline
JOP Stack Architecture 16
Further Information JOP Thesis: p 78-93 Martin Schoeberl, Design and Implementation of an Efficient Stack Machine, In Proceedings of the 12th IEEE Reconfigurable Architecture Workshop, RAW 2005, Denver, Colorado, USA, April 2005.