CENG 4480

Embedded System Development & Applications

Lecture 10: Memory 2


Bei Yu
CSE Department, CUHK
[email protected]

(Latest update: October 27, 2021)

Fall 2021
CENG4480 vs. CENG3420

CENG3420:
• architecture perspective

• memory coherence

• data addressing

CENG4480:
• more details on how data is stored

Memory Arrays

SRAM

• What if we add feedback to a pair of inverters?

[Figure: two inverters in series with node values 0, 1, 0]

• Usually drawn as a ring of cross-coupled inverters

• Stable way to store one bit of information (while power is applied)

[Figure: the two stable states of the cross-coupled inverter pair, (0, 1) and (1, 0)]

How to change the value stored?

• Replace inverter with NAND gate

• RS Latch

A  B  A nand B
0  0  1
0  1  1
1  0  1
1  1  0
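
A minimal sketch of the RS latch built from two cross-coupled NAND gates: the gate equations are iterated until the outputs settle. The active-low input names `s_n` and `r_n` and the function names are assumptions for illustration and do not come from the slides.

```python
def nand(a: int, b: int) -> int:
    """2-input NAND on 0/1 values."""
    return 0 if (a and b) else 1


def sr_latch(s_n: int, r_n: int, q: int, q_b: int) -> tuple[int, int]:
    """Settle a cross-coupled NAND latch with active-low set/reset inputs."""
    for _ in range(4):  # a few passes are enough for the feedback loop to settle
        new_q = nand(s_n, q_b)
        new_q_b = nand(r_n, new_q)
        if (new_q, new_q_b) == (q, q_b):
            break
        q, q_b = new_q, new_q_b
    return q, q_b


q, q_b = sr_latch(0, 1, 0, 1)    # pull set low: store 1
print(q, q_b)
q, q_b = sr_latch(1, 1, q, q_b)  # both inputs high: hold the stored value
print(q, q_b)
q, q_b = sr_latch(1, 0, q, q_b)  # pull reset low: store 0
print(q, q_b)
```

With both inputs high the latch behaves like the plain inverter ring and simply holds its state.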

12T SRAM Cell

• Basic building block: SRAM Cell


• Holds one bit of information, like a latch
• Must be read and written

• 12-transistor (12T) SRAM cell


• Use a simple latch connected to bitline
• 46 × 75 λ unit cell

nMOS, pMOS, Inverter

• nMOS:
• Gate = 1, transistor is ON
• A conducting path then forms between source and drain

• pMOS:
• Gate = 0, transistor is ON
• A conducting path then forms between source and drain

• Inverter:
• Q = NOT (A)

6T SRAM Cell

• Used in most commercial chips

• A pair of weak cross-coupled inverters

• Data stored in cross-coupled inverters

• Compared with 12T SRAM, 6T SRAM:


• (+) reduced area
• (-) much more complex control

6T SRAM Read

• Precharge both bitlines high

• Then turn on wordline

• One of the two bitlines will be pulled down by the cell

• Read stability
• A must not flip
• N1 >> N2
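
The read sequence can be sketched behaviorally as follows. This is a logic-level illustration only: it assumes the read-stability condition holds, so the stored value never flips, and the names `sram_read`, `bit`, and `bit_b` are hypothetical.

```python
def sram_read(a: int) -> tuple[int, int, int]:
    """Behavioral 6T read: returns (bit, bit_b, a) after the wordline pulses."""
    a_b = 1 - a
    bit, bit_b = 1, 1  # precharge both bitlines high
    # Wordline on: the node holding 0 discharges the bitline attached to it.
    if a == 0:
        bit = 0
    if a_b == 0:
        bit_b = 0
    # Read stability (N1 >> N2) is assumed, so the stored value is unchanged.
    return bit, bit_b, a


print(sram_read(0))
print(sram_read(1))
```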

EX: 6T SRAM Read

• Question 1: Given A = 0 and A_b = 1, describe the behavior during a read.

• Question 2: What is the minimum number of bitlines needed to complete a read?

6T SRAM Write

• Drive one bitline high, the other low

• Then turn on wordline

• Bitlines overpower cell with new value

• Writability
• Must overpower feedback inverter
• N4 >> P2
• N2 >> P1 (symmetry)
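
A rough way to see why the access transistor must be stronger than the pull-up is a resistive-divider sketch. Everything here is an illustrative assumption rather than the slides' model: transistors are treated as fixed on-resistances (`r_access`, `r_pullup`), and the cell is assumed to flip once the node falls below VDD/2.

```python
def node_voltage_during_write(vdd: float, r_access: float, r_pullup: float) -> float:
    """Voltage on a node storing 1 while the access nMOS pulls it toward a low bitline.

    Crude resistive-divider assumption: the pMOS pull-up (r_pullup) fights
    the access nMOS (r_access).
    """
    return vdd * r_access / (r_access + r_pullup)


def write_succeeds(vdd: float, r_access: float, r_pullup: float) -> bool:
    """Assume the cell flips once the node drops below VDD/2 (illustrative threshold)."""
    return node_voltage_during_write(vdd, r_access, r_pullup) < vdd / 2


print(write_succeeds(1.0, r_access=1.0, r_pullup=4.0))  # strong access device
print(write_succeeds(1.0, r_access=4.0, r_pullup=1.0))  # weak access device
```

A lower on-resistance stands in for a wider, stronger transistor, so the first call corresponds to N4 >> P2.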

EX: 6T SRAM Write

• Question 1: Given A = 0 and A_b = 1, describe the behavior during a write.

• Question 2: What is the minimum number of bitlines needed to complete a write?

6T SRAM Sizing

• High bitlines must not overpower inverters during reads

• But low bitlines must write new value into cell

Memory Arrays

DRAM
Dynamic RAM (DRAM)

• Basic Principle: Storage of information on capacitors

• Charge & discharge of capacitor to change stored value

• Use of transistor as "switch" to:


• Store charge
• Charge or discharge

4T DRAM Cell

Remove the two pMOS transistors from the static RAM cell to get a four-transistor dynamic RAM cell.

• Data must be refreshed regularly

• Dynamic cells must be designed very carefully

• Data stored as charge on gate capacitors (complementary nodes)

3T DRAM Cell

• No constraints on device ratios

• Reads are non-destructive

• Value stored at node X when writing a "1" = VDD − VT

3T DRAM Layout

• 576 λ² 3T DRAM cell vs. 1092 λ² 6T SRAM cell

• Further simplified

1T DRAM Cell

• Needs a sense amplifier to assist reads


1T DRAM Cell

• Read
• Pre-charge the large tank to VDD/2
• If Ts = 0, the large tank settles at VDD/2 − V1
• If Ts = 1, the large tank settles at VDD/2 + V1
• V1 is very small
• Need sense amp
1T DRAM Cell

• Write: Cs is charged or discharged by asserting WL and BL

• Read: Charge redistribution takes place between bit line and storage capacitance

• Voltage swing is small; typically around 250 mV

EX. 1T DRAM Cell

• Question: VDD = 4 V, CS = 100 pF, CBL = 1000 pF. What is the voltage swing?

• Note: $\Delta V = \frac{V_{DD}}{2} \cdot \frac{C_S}{C_S + C_{BL}}$
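
The charge-sharing formula in the note can be evaluated directly. A minimal sketch using the values from the question; the function name `dram_voltage_swing` is hypothetical.

```python
def dram_voltage_swing(vdd: float, c_s: float, c_bl: float) -> float:
    """Bitline swing from charge sharing: (VDD / 2) * CS / (CS + CBL)."""
    return (vdd / 2) * c_s / (c_s + c_bl)


delta_v = dram_voltage_swing(vdd=4.0, c_s=100e-12, c_bl=1000e-12)
print(f"{delta_v:.3f} V")
```

Because only the ratio CS / (CS + CBL) matters, a bitline capacitance ten times the storage capacitance leaves a swing of roughly 0.18 V here, which is why a sense amplifier is needed.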

SRAM vs. DRAM

• Static (SRAM)
• Data stored as long as supply is applied
• Large (6 transistors/cell)
• Fast
• Compatible with current CMOS manufacturing

• Dynamic (DRAM)
• Periodic refresh required
• Small (1-3 transistors/cell)
• Slower
• Requires additional process steps for trench capacitors

Array Architecture

• 2^n words of 2^m bits each

• Good regularity - easy to design

SRAM Memory Structure

• Latch based memory

Array Architecture

• 2^n words of 2^m bits each

• How to design if n >> m?

• Fold by 2^k into fewer rows of more columns
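
The folding arithmetic can be sketched as below. The function name `fold_array` is hypothetical; the example values correspond to a 2-kword x 16-bit array (n = 11, m = 4) folded by 2^3.

```python
def fold_array(n: int, m: int, k: int) -> tuple[int, int]:
    """Fold 2^n words of 2^m bits by 2^k: returns (rows, columns)."""
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    rows = 2 ** (n - k)
    columns = 2 ** (m + k)
    return rows, columns


print(fold_array(n=11, m=4, k=3))
```

Each physical row then holds 2^k words, so the total bit count 2^(n+m) is unchanged.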

Decoders

• n:2^n decoder consists of 2^n n-input AND gates


• One needed for each row of memory
• Build AND with NAND or NOR gates

[Figure: decoder implementations, static CMOS and using NOR gates]
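
A logic-level sketch of the n:2^n decoder: one n-input AND per row, fed by true or inverted address bits. The function name `decode` is hypothetical, and the model ignores how the AND is built from NAND or NOR gates.

```python
def decode(address: int, n: int) -> list[int]:
    """n:2^n decoder: one n-input AND per row, over true or inverted address bits."""
    bits = [(address >> i) & 1 for i in range(n)]
    wordlines = []
    for row in range(2 ** n):
        # Row `row` ANDs bit i directly if row has a 1 there, else its complement.
        literals = [b if (row >> i) & 1 else 1 - b for i, b in enumerate(bits)]
        wordlines.append(int(all(literals)))
    return wordlines


print(decode(0b10, n=2))
```

Exactly one wordline is high for any address, which is what selects a single row of the memory.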

EX. Decoder

• Question: Convert the AND-gate decoder into a NAND-gate structure

Larger Decoder

• For n > 4, NAND gates become slow


• Break large gates into multiple smaller gates

Predecoding

• Many of these gates are redundant


• Factor out common gates
• => Predecoder
• Saves area
• Same path effort

• Question: How many NANDs can be saved?
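
The idea of factoring out common gates can be sketched at the logic level for a 4-bit address. This assumes a split into two 2-bit groups, which may differ from the circuit on the slide; the function names are hypothetical.

```python
def predecode(group: int, width: int) -> list[int]:
    """One-hot predecode of a `width`-bit address group."""
    return [int(group == value) for value in range(2 ** width)]


def decode_with_predecoder(address: int) -> list[int]:
    """4:16 decode built from two shared 2:4 predecoders plus 2-input ANDs."""
    low = predecode(address & 0b11, 2)          # shared by all rows
    high = predecode((address >> 2) & 0b11, 2)  # shared by all rows
    return [high[row >> 2] & low[row & 0b11] for row in range(16)]


wordlines = decode_with_predecoder(0b0110)
print(wordlines.index(1), sum(wordlines))
```

Each predecoded line is computed once and reused by four rows, instead of every row recomputing the same partial product.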


Appendix
*Decoder Layout

• Decoders must be pitch-matched to SRAM cell


• Requires very skinny gates

*Column Circuitry

• Some circuitry is required for each column


• Bitline conditioning
• Column multiplexing
• Sense amplifiers (DRAM)

*Bitline Conditioning

• Precharge bitlines high before reads

• Equalize bitlines to minimize voltage difference when using sense amplifiers

*Twisted Bitlines

• Sense amplifiers also amplify noise


• Coupling noise is severe in modern processes
• Try to couple equally onto bit and bit_b
• Done by twisting bitlines

*SRAM Column Example

[Figure: SRAM column circuit examples for read and write]

*Column Multiplexing

• Recall that array may be folded for good aspect ratio

• Ex: 2 kword x 16 folded into 256 rows x 128 columns


• Must select 16 output bits from the 128 columns
• Requires 16 8:1 column multiplexers

*Ex: 2-way Muxed SRAM

*Tree Decoder Mux

• Column mux can use pass transistors


• Use nMOS only, precharge outputs

• One design is to use k series transistors for a 2^k:1 mux


• No external decoder logic needed
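
A behavioral sketch of the tree mux: each of the k levels halves the candidate columns according to one select bit, mirroring k pass transistors in series on the selected path. The function name `tree_mux` and the bit ordering (least significant select bit at the first level) are assumptions for illustration.

```python
def tree_mux(columns: list[int], select: int, k: int) -> int:
    """2^k:1 mux as k levels of pass transistors steered by the select bits."""
    if len(columns) != 2 ** k:
        raise ValueError("expected 2^k columns")
    level = columns
    for i in range(k):
        bit = (select >> i) & 1
        # Each level keeps the input whose pass transistor is on for this bit.
        level = [level[2 * j + bit] for j in range(len(level) // 2)]
    return level[0]


cols = [0, 1, 1, 0, 1, 0, 0, 1]
print([tree_mux(cols, s, 3) for s in range(8)])
```

The select bits drive the pass-transistor gates directly, so no separate one-hot decoder is required.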

*SRAM from ARM

Sense Amp Operation for 1T DRAM

• 1T DRAM read is destructive

• Read and refresh for 1T DRAM


*Sense Amplifiers (DRAM)

• Bitlines have many cells attached


• Ex: 32-kbit SRAM has 256 rows x 128 cols
• 256 cells on each bitline

• tpd ∝ (C/I)∆V
• Ex: Even with shared diffusion contacts, 64C of diffusion capacitance (big C)
• Discharged slowly through small transistors (small I)

• Sense amplifiers are triggered on small voltage swing (reduce ∆V)

*Differential Pair Amp

• Differential pair requires no clock

• But always dissipates static power

*Clocked Sense Amp

• Clocked sense amp saves power

• Requires sense_clk after enough bitline swing

• Isolation transistors cut off large bitline capacitance

Thank You :-)
