0% found this document useful (0 votes)
194 views

Ee382M - Vlsi I: Spring 2009 (Prof. David Pan) Final Project

This document outlines the requirements for the final project in the EE382M - VLSI I course. Students will design an intellectual property (IP) core module. A timeline is provided, with project selection due by March 26th and the final report due on May 5th. The final report must include specifications, design, user, testing, and optimization documents as well as source code and layout. Suggested project topics are also listed, including designing a low-leakage SRAM using sleep transistors and implementing a fast Fourier transform kernel.

Uploaded by

sepritahara
Copyright
© Attribution Non-Commercial (BY-NC)
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
194 views

Ee382M - Vlsi I: Spring 2009 (Prof. David Pan) Final Project

This document outlines the requirements for the final project in the EE382M - VLSI I course. Students will design an intellectual property (IP) core module. A timeline is provided, with project selection due by March 26th and the final report due on May 5th. The final report must include specifications, design, user, testing, and optimization documents as well as source code and layout. Suggested project topics are also listed, including designing a low-leakage SRAM using sleep transistors and implementing a fast Fourier transform kernel.

Uploaded by

sepritahara
Copyright
© Attribution Non-Commercial (BY-NC)
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 13

EE382M - VLSI I

Spring 2009 (Prof. David Pan)


Final Project

1. Introduction
Your task in this project is to use the skills you have been acquiring through the lectures
and labs to design a fairly sophisticated module—an intellectual property (IP) core. The
purpose of the project is threefold:
1. It is worth a large fraction of your grade (but this should be the least important item).
2. Working on this project should be training on how to go about approaching a project.
3. The project should yield results:
• Experimental results on the efficacy of proposed VLSI architectures, and
• Suggestions for improving these architectures.
Projects can be done individually, or in groups of 2–3; naturally, I will expect more from
group projects.

2. Timeline
Project selection: Make a decision as to the projects you are interested in working on by
March 26th. You may turn in only a sheet with your topic, a paragraph description about
what you are going to implement, and your teammate at this time.
Final report: The final report should consist of the following:
• Specifications document
• Design document
• User document
• Testing strategy and results
• Optimization strategy and results
• Source code and layout
The report is due in hardcopy on May 5th, 2009 in my office.

3. Final Report Details


You may need the following information below in each section

3.1. Specifications
The specifications document should include a high-level overview of the IP block
you are implementing; a description based on a diagram or set of diagrams is the best
way to do this. It should also include the summary of the logical interface the block
presents to its environment.
In addition, the document should include the area, power, and performance numbers
you are targeting. If you base your work on an existing design, you should be able to
come up with estimates on these parameters; otherwise, back-of-the-envelope
calculations are fine. It’s not imperative that you meet the numbers in the specification
document.
The specifications document should not discuss the implementation; its focus is the
functionality that you will implement, and the cost of this functionality.

3.2. Design
The design document should include a description of how you will implement the
specification — a set of figures is the best way to convey this. The implementation
discussion should include the basic architecture and algorithms, as well as the floorplan,
and circuit technology, etc.
You should also make notes on the optimization techniques you expect to use and
their implications to your design, and the trade-offs they will entail. For example, if you
have long interconnects, you may want to state that you intend to overcome problems
resulting from crosstalk by shielding, and hence all long nets should have enough space
between them for such shielding lines.
All choices should be justified, on the basis of references to portions of the
book/research papers, and by logical arguments.
The design document should also include an overview of the tool suite you will be
using, the naming conventions for variables/modules/files, the regression control strategy,
and an issue tracking mechanism (which could be just entries in a text file).
Think of the design document as something you would give to an engineer just
joining the project to help him/her come up to speed.
(Design documents also spell out a regular system of “code reviews,” where
designers have to explain what they have done to their colleagues, at a very detailed level,
e.g., a walk-through of RTL code. We won’t have review process as it is probably too
involved for a class project.)
The specifications and design documents do not have to be exactly what you turned
in; indeed I would expect the design document to evolve as you discover problems and
find improvements with your approach.

3.3. User Document


The user document describes how end-users are to integrate the IP block into their
designs—think of it as being like the datasheet you get with a chip.
In particular, the user document should include detailed information on interfacing
to the block, i.e., the timing on the different signals. It should describe the power, area,
delay numbers at various operating points, and the loading capacitance and drive
strengths on the input-output signals.

3.4. Testing
In this section, you are to describe the set of tests you applied to your design to
check for logical errors, and your coverage metrics. Classify the bugs you encountered,
and how you corrected for them. In addition, discuss the traces you applied to determine
the critical path, and compute the delays.
For some projects it may make sense to write a high-level model in C or C++ and do
performance simulations (e.g., determine the average latency and drop rate through the
Benes fabric as a function of load, and buffering). If this is the case, include results from
these simulations.

3.5. Optimization
Include a discussion of all the steps you took to improve performance, and the
magnitude of improvements that you saw. I am particularly interested in novel techniques
that gave your better performance that the descriptions that you based your approach on.

4. Project Topics
This following list is a set of suggested topics that I would like to see you work on.
If you have other idea, you are encouraged to pursue, but please come and talk to me
before you do it.

4.1. A Sleepy SRAM


One of the major techniques used to control sub-threshold leakage is using sleep
transistors.
In essence, sleep transistors are used for power gating: Logic runs at low Vt, and the
gates are faster and leaky; Sleep transistors are high Vt. They are switched off when idle
(usually NMOS alone is used) and can save 2-1000× leakage power.
Your goal is to design a 32 kbit SRAM (128 rows, 256 columns, 8 bit words) which
uses sleep transistors to reduce leakage power. There are a number of ways you can go: a
single huge sleep transistor, a sleep transistor per cell, or a sleep transistor per 4 cells, etc.
There are power–delay tradeoffs between these, which you are to explore.
Some papers that will help:
z “A Leakage Reduction Methodology for Distributed MTCMOS.” B. H.
Calhoun, F. A. Honore. IEEE JSC, 2004
z “Sleep Transistor Sizing Using Timing Criticality and Temporal Currents”,
Proc. Asia South Pacific Design Automation Conference (ASPDAC), to appear,
Jan. 2005.
z “5-GHz 32-bit integer execution core in 130-nm dual-Vt/CMOS”, IEEE
Journal of Solid State Circuits, Nov. 2002. Pages 1421–1432.
z ”A shared-well dual-supply-voltage 64-bit ALU”, IEEE Journal of Solid State
Circuits. Mar. 2004. Pages 494–500.
z “Leakage Control Through Fine-Grained Placement and Sizing of Sleep
Transistors.” , V. Khandelwal, A, Srivastava, IEEE Transactions on
Computer-aided design of integrated circuits and systems., 2007
Mentor: Junyoung Park

4.2. Divider
Division is the hardest of the four basic arithmetic operations. As Intel famously
demonstrated, implementation of division algorithms is not always straightforward.
Nonrestoring, SRT, Newton-Raphson, and Goldschmidt are all possible methods for the
implementation of division.
For this project review all of these methods. Select at least one to implement. Provide
both hand worked examples and simulation results demonstrating the correctness of your
approach and implementation. Additionally contrast your implementation with each of
the other methods and provide the reasons for your selection, as well as the advantages
and disadvantages provided by your approach.
References:
z Computer Arithmetic, Behrooz Parhami, 2000.
z "A new class of digital division methods", J. E. Robertson, IRE Trans.
Electronic Computers, 1958.
z "Techniques of multiplication and division for automatic binary computers",
K.D. Tocher, Quarterly J. Mechanics and Applied Mathematics, 1958.
Mentor: Whitney

4.3. Fast Fourier Transform Kernel


For the FFT project, you are to finish a VLSI implementation of a 16-point FFT.
The chip shall take the time-sampled data input at a set sampling frequency of your
choice and output the correct bin counts for all the points in the FFT, within the range of
error tolerance. For real-world applications, you are encouraged to aim for 64/128-point
high precision FFT kernel which are compatible to wireless industry’s protocols.
Example Specifications (16 bit precision for both real and img parts of inputs):
Testing: You must establish your own testing benchmark structure to test your FFT
core with reasonably good coverage using all types of signals (sine waves, noise, dc) and
their random combinations, below your chosen Nyquist frequency.
On algorithm level, you may choose from radix-2, radix-4, and specialized FFT
implementations, etc. Final chip must be presented in layout level after synthesis, a code
file alone is not sufficient.
For establishing the benchmark, you may need C/C++, TCL/TK, PERL, MATLAB,
etc.
The project’s deliverables will include your FFT specification definitions, test files,
testing benchmark, test outputs reports (both simulated and physical), a Cadence layout
of the FFT hardware, code or a schematic abstracting your layout, and a report of your
algorithm (in the form of a paper or pseudo-code).
Below are several references on FFT hardware implementations:
z A radix 4 delay commutator for fast Fourier transform processor
implementation Swartzlander, E.E.; Young, W.K.W.; Joseph, S.J.; Solid-State
Circuits, IEEE Journal of ,Volume: 19 , Issue: 5 , Oct 1984 Pages:702 - 709
z A 64-point Fourier transform chip for high-speed wireless LAN application
using OFDM Maharatna, K.; Grass, E.; Jagdhold, U.; Solid-State Circuits,
IEEE Journal of ,Volume: 39 , Issue: 3 , March 2004 Pages:484 - 493
z “Design Considerations and Implementation of a DSP-Based Car-Radio IF
Processor”, IEEE Journal of Solid State Circuits. Jul. 2004. Pages 1110–1118.
z “A single-chip MPEG-2 codec based on customizable media embedded
processor”, IEEE Journal of Solid State Circuits. Mar. 2003. Pages 530–540.
z Please actively search/Google for newest ideas and build upon them!
Mentor: Whitney

4.4. All Digital PLL


Phase-lock loops (PLLs) are used to recover timing information from a signal—they
are ubiquitous in communications, and are also used for timing recovery on boards and
chips. Analog PLLs are very hard to design because they use feedback, and are very
sensitive to noise and operating parameters.
The goal of this project is to design an “all digital PLL” which is an implementation
of the PLL with all digital components and compare its performance (measured in lock
time and phase noise) and costs (in terms of area, power, delay) to a traditional analog
PLL.
References:
z www.cs.wright.edu/~jstephen/ee737/ResearchPapers/DeLong.doc
z CMOS Circuit Design, Layout, and Simulation, by R. Jacob Baker, Harry W.Li
and David E. Boyce, Published by IEEE
z R. B. Staszewski, C. Hung, K. Maggio, J. Wallberg D. Leipold, P. T. Balsara,
“All-Digital Phase-Domain TX Frequency Synthesizer for Bluetooth Radios in
0.13um CMOS
z R. B. Staszewski et al., “A First Digitally-Controlled Oscillator in a
Deep-Submicron CMOS Process for Multi-GHz Wireless Applications,” Dig.
RFIC Symp., pp. 81–84, June 2003.
z R. B. Staszewski et al., “Digitally-Controlled Oscillator (DCO)-Based
Architecture for RF Frequency Synthesis in a Deep-Submicron CMOS
Pprocess,” Trans. on Circuits and Systems II, vol. 50, no. 11, pp. 815-828, Nov.
2003.
Mentor: Jae-Seok Yang

4.5. Design of Robust CMOS Circuits for Soft-error Tolerance


With the continuing scaling of technology, lower supply voltage and increase of
operating frequency, integrated circuits become increasingly susceptible to single event
upsets (SEU) caused by transient noise or high energy particles. A SEU may cause a bit
flip in some latch or memory element, thereby altering the state of the system, leading to
a ‘soft error’. The main objective of the project is to introduce more robustness with
some redundancy in circuits to make them less susceptible to undesired errors. The focus
is to explore various circuit level as well as system level techniques to reduce the effect
of soft errors for logic and memory circuits.
Some of the references are:
z R.C. Baumann, “Soft errors in advanced computer systems,” IEEE Des. Test.
Computer., vol.22, no. 3, pp. 258-266, May/Jun. 2005
z R.C. Baumann, “Radiation-Induced Soft Errors in Advanced Semiconductor
Technologies,” IEEE Transactions on Device and Materials Reliability, Vol. 5,
No. 3, September 2005
z Abdulkadir Utku Diril, “Circuit Level Techniques for Power and Reliability
Optimization of CMOS Logic,” PhD Dissertation, Department of Electrical
and Computer Engineering, Georgia Institute of Technology, May 2005
Mentor: Whitney

4. 6. Programmable FIR Filter


For this project, you are to design and test a programmable filter core for
communication and signal processing systems.
From a practical point of view, you are encouraged to shoot for 16 bit precision
input/coefficients (X[15:0], a[15:0]), or even more ambition.
From a testing point of view, a testing benchmark must be established to perform
systematic testing with reasonably good coverage. Typical corner case testing alone is not
sufficient. Possible languages to use for benchmark establishment: C/C++, TCL/TK,
PERL, MATLAB, etc.
Suggested Functional Specification:
z The Finite Impulse Response filter core that implements the following function:
N
z Y [k ] = ∑ ai X [k − i ]
i =1

z The FIR filter core has 2 modes.


– coefficient ai is 8bit, N=50 taps
– coefficient ai is 16bit, N=25 taps
Suggested Interface Specification:
z Fig.1. illustrates the interface of the core.
z Table. 1 tabulates description of each signal.
Fig. 1. Interface diagram

Signal Direction Description

CLK input Clock

Coefficient update mode when 1. In this mode, A is read


AUP input and inserted into internal register at each rising edge of
CLK. When 0, X is read at every 5 rising edge of CLK.

Eight bits of coefficient. In 8 bit/50 tap mode, each input


A[0:7] input represents one coefficient. In 16bit/25 tap mode, 8LSB is
read first, followed by 8MSB.

X[0:7] input Eight bits of input data.

The core resets itself when RESET=1, at rising edge of


RESET input
CLK.

Y[23:0] output 24 bit output data.

When this signal is up, it means the output Y will become


O_STB output
valid at the next rising edge of CLK.

Table. 1. Signal specifications


Suggested Performance Targets:
In 0.18μm CMOS technology, operating under 2.5V power supply, this core should
have area less than 1 mm2 . The core should operate with internal clock higher clock than
200MHz and should process data at a speed higher than 50M sample/s. Power
consumption should be below 100mW.
Suggested Testing tools: C/C++, TCL/TK, PERL, MATLAB, etc.
References:
z Hatamian, M.; Parhi, K.K. “An 85-MHz fourth-order programmable IIR digital
filter chip” IEEE Journal of Solid-State Circuits, Volume: 27 , Issue: 2 , Feb.
1992
z Anantha P. Chandrakasan and Robert W. Brodersen, “Minimizing Power
Consumption in Digital CMOS Circuits”, Proceedings of the IEEE, vol. 83, No.
4 April 1995.
z Ahmed M. Shams, et al., “Performance Analysis of Low-Power 1-Bit CMOS
Full Adder Cells” IEEE Transactions on VLSI Systems, VOL. 10, NO. 1,
February 2002
z Related book chapters and most updated papers/journals.
z Please actively search/Google for newest ideas and build upon them!
Mentor: Whitney

4.7. On-Chip Interconnect Network


In this project, you will get an in-depth understanding of the VLSI design of modern
on-chip interconnection network. To begin with, the following article serves as a good
introduction: “Architectural Choices in Large Scale ATM Switches,” J. Turner and N.
Yamanaka, IEICE Transactions, 1998.
The major task of this project is to select and implement a switching architecture.
For instance, in the article above, a Batcher-Banyan based, self-routing network is
chosen.
Many new techniques have been proposed; please spend proportional time on
selecting among them. You are encouraged to invent new architectures and algorithms
and analyze their strength and drawback.
Here are some more articles that may be useful:
z A 250-Mbit/s CMOS Crosspoint Switch. Shin and Hodges, IEEE JSSC 24(2),
April 1989, pp. 478-486.
z A 250-Mb/s CMOS Crosspoint LSI for ATM Switching. Akata et al., IEEE
JSSC 25(6), December 1990, pp. 1433-1439.
z A High-speed CMOS Circuit for 1.2-Gb/s 16x16 ATM Switching. Chemarin et
al., IEEE JSSC 27(7), July 1992, pp. 1116-1120.
z A 200Mhz CMOS Broad-Band Switching Chip. O’Neill et al., IEEE JSSC
28(3), March 1993, pp. 269-275.
z Please actively search/Google for new ideas and build upon them!
Once the architecture is matured, you may employ the skills developed in our labs to
implement a prototype (physical level). In view the limited time, you may put most of the
efforts on core algorithm and structure and size down the whole system. Please consider
how to establish your testing benchmark of your switch from the very beginning. Again,
your testing benchmark should have fairly good coverage. As to the benchmark setup,
you may use C/C++, or scripture languages like TCL/TK, PERL, etc. As this is more in
the flavor of an open topic, your final grade will be based on your ideas, implementation
workload, and testing mechanisms, etc. Especially, your implementation should
demonstrate fair workload worthy of a serious project in our graduate class.
Mentor: Junyoung Park

4.8. Design of Circuits for Sub-threshold Voltages


For ultra low power and portable applications, design of digital subthreshold logic is
investigated with transistors operated in the subthreshold region (supply voltage
corresponding to Logic 1, less than the threshold voltage of the transistor). In this
technique, the subthreshold leakage current of the device is used for computation.
Standard design techniques suitable for superthreshold design can be used in the
subthreshold region. However, it has been shown that a complete co-design at all levels
of hierarchy (device, circuit, and architecture) is necessary to reduce the overall power
consumption while achieving acceptable performance (hundreds of kilohertz) in the
subthreshold regime of operation. Your goal is to choose a suitable application, such as an
adder, multiplier, FFT module etc, and implement it using sub-threshold voltage logic.
References:
z A. Wang and A. Chandrakasan, “A 180mV FFT Processor Using Subthreshold
Circuit Techniques,” in Proc. IEEE International Solid-State Circuits
Conference, Feb. 2004.
z H. Soeleman and K. Roy, "Ultra Low Power Digital Sub-Threshold Logic",
International Symp. on Low-Power Electronics and Design, pp. 94-96, August
1999.
z H. Soeleman and K. Roy, "Digital CMOS Logic Operation in the
Sub-Threshold Region", IEEE Great Lakes Symposium on VLSI, pp. 107-112,
March 2000.
z H. Soeleman, K. Roy, and B. Paul, "Robust Sub-Threshold Logic for
Ultra-Low Power Operation", IEEE Transactions on VLSI Systems, Special
issue on low-power design, pp.90-99, February 2001.
z B. Paul, H. Soeleman, and K. Roy, "An 8X8 Sub-Threshold Digital CMOS
Carry Save Array Multiplier", European Solid State Circuits Conference,
September 2001.
Mentor: Jae-Seok Yang

4.9. Implement a Sub-threshold Voltage Cell Library for 45nm


Technology
A standard cell library (SCL) contains the basic building blocks for designing an
integrated circuit. It has a fixed set of well-characterized logic blocks. Once an
integrated circuit is built using the library, the behavior of circuit will depend on
information within the individual cells from the library. This information includes
parasitic capacitance, area, and delay. In order to qualify as a standard cell library, it has
to include NAND, NOR, inverter, and D flip-flops. SCL is commonly employed by
Application Specific Integrated Circuit (ASIC) designers due to robustness and flexibility
of the library, resulting in quick turnaround times.
This purpose of this project is to build a low power standard cell library using
sub-threshold voltage in 45nm technology. You should have your report explaining in
detail how you create the library.
Reference:
z http://www.cerc.utexas.edu/~tywu/library
z B. H. Calhoun and A. Chandrakasan, “Ultra-Dynamic Voltage Scaling Using
Subthreshold Operation and Local Voltage Dithering in 90nm CMOS,” in Proc.
IEEE International Solid-State Circuits Conference,Feb. 2005.
z H. Soeleman and K. Roy, "Ultra Low Power Digital Sub-Threshold Logic",
International Symp. on Low-Power Electronics and Design, pp. 94-96, August
1999.
z H. Soeleman and K. Roy, "Digital CMOS Logic Operation in the
Sub-Threshold Region", IEEE Great Lakes Symposium on VLSI, pp. 107-112,
March 2000.
Mentor: Jae-Seok Yang

4.10. Higher arithmetic (CORDIC)


We’ll discuss the implementation of adders and multipliers in detail in this class.
However, we won’t talk about how complex functions are implemented.
So if you’ve ever wondered how a 2$ pocket calculator computes sines, cosines,
logs, etc. in the blink of an eye, this is the project for you.
It would be grossly inefficient to use Taylor series expansions for computing
transcendental functions. Instead there are much better representations, and CORDIC
makes use of one particular representation, which allows sines and cosines to be
computed with nothing more than additions, shifts, and a single multiplication (by a
constant); it very high precision with very little computational cost (O(n) work for n bits).
The text book talks about computing CORDIC in Chapter 8. And, note that the
reference article sidesteps the issue of approximability of angles by the sum Σi=0
i=N(−1)ai‫ڄ‬tan−12−i). The approach works because tan−12−i<2‫ڄ‬tan−12−(i+1)), so each successive
iteration yields an angle that’s less than half of what it was before.
Reference:
z http://www.worldserver.com/turk/computergraphics/FixedPointTrigonometry.pdf
Mentor: Whitney
4.11. Hardware accelerated Monte Carlo simulation
In its simplest form, an option gives the purchaser the right to buy an object (which
could be a stock, or a commodity, we’ll assume stock for simplicity) for a fixed price at a
given time in the future. More generally, options exist wherein the purchaser can buy the
commodity for a fixed price at any point up to a given time, or at the lowest price up to
the given time, etc.
When the purchase time is fixed, interest rates are constant, and the object price
follows Brownian motion, the Black-Scholes formula gives an analytical way to
determine the fair price of the option. This situation is rare, and analytical techniques do
not exist for general option pricing.
Monte Carlo simulation can be used to get an idea of the fair price; it is
computationally challenging, and the goal of this project is to use hardware acceleration
for pricing. It is most natural to use a finite time step for the simulation.
One approach is to derive the exact distribution of the stock price. Given a
distribution for a discrete random variable X (the stock price), and a distribution for a
discrete random variable Y (its change), the distribution for X+Y is derived by
convolving the two distributions—direct convolution can get expensive (quadratic in the
range of the two variables), and FFT-based convolution may be a good way to proceed.
You may want to consider various distributions for the increment, not just binomial, but
something with a heavy tail.
Another approach is to simulate a large number of trials, and determine a
distribution based on the trial outcomes.
Reference:
z en.wikipedia.org/wiki/Binomial_options_pricing_model
z Approximate Option Pricing. P. Chalasani, S. Jha, and I. Saias. Algorithmica. 1999.
z Mathematics for Finance: An Introduction to Financial Engineering. Capinski and
Zastawniak.
z Randomized Algorithms. Motwani and Raghavan.
Mentor: Jae-Seok Yang

4.12. Synthesis of Niagara Processor Core


The intent of this project is to do a top-down design of a synthesizable block in the
processor core and to achieve low power. One of the following blocks in SPARC-T1 core
from SUN will be used as a target module for this project: EXU (Execution Unit), LSU
(Load Store Unit) and IFU (Instruction Fetch Unit). You are expected to design it using
45nm technology. The project activities will include: Reducing leakage and dynamic
power, and estimating area.
Reference:
z http://opensparc-t1.sunsource.net/
z http://csg.csail.mit.edu/6.375/6_375_2008_www/handouts/tutorials/tut4-dc.pdf
z http://www2.informatik.uni-jena.de/~doersing/lehre/ps/sn99.10_dok/synth/print/dc
rmo.pdf
z http://users.ece.utexas.edu/~mcdermot/vlsi/main/project/web/niagara.pdf
Mentor: Junyoung Park

4.13. On-silicon delay characterization


As variability increases, there is growing interesting in making adaptive chips,
where parameters such as supply voltage and body biases can be set post-manufacturing
to overcome the effects of parametric variation.
The goal of this project is to study the cost and accuracy of on-chip delay
characterization structures. I’d like you to survey the state-of-the-art, as well as perform
your own experiments.
For example, Dhar et al. introduce an adaptive voltage scaling controller that uses an
inexpensive ring oscillator to measure speed. There could be multiple ring oscillators
placed throughout the design. The gate delay would be approximated based on the delay
of the nearest ring oscillator.
Another promising approach would be to implement delay characterization based on
Razor by Ernst et al. By using a shadow latch and comparator logic, Razor has
mechanisms to monitor when a delay error has taken place. In the context of an FPGA, a
test input could run through the CLB in successively faster clock cycles until there is a
delay error. Additionally, neighboring CLBs could perform the shadow latching and
comparator logic need for Razor testing using existing CLB resources. The test literature
(International Test Conference, Fault-Tolerant Computing) would also be a good place
review.
Reference:
z S. Dhar, D. Maksimović, and B. Kranzen. Closed-loop adaptive voltage scaling
controller for standard-cell ASICs. International Symposium on Low Power
Electronics and Design, pages 103–107, 2002.
z D. Ernst, S. Das, S. Lee, D. Blaauw, T. Austin, T. Mudge, N. Kim, and K. Flautner.
Razor: Circuit-Level Correction of Timing Errors for Low-Power Operation. IEEE
Micro, 24(6):10–20, 2004.
Mentor: Junyoung Park
 

You might also like