GPU Architectures: A CPU Perspective

Goals
- Data Parallelism: What is it, and how to exploit it?
  - Workload characteristics
- Data Parallel Execution on GPUs
Data Parallelism, Programming Models, SIMT
Graphics Workloads
- Identical, Independent, Streaming computation on pixels

[Figure: the same function =() maps each input pixel (7,0), (6,0), (5,0), (4,0) to an output pixel (0,7), (1,7), (2,7), (3,7)]
Naïve Approach
- Split independent work over multiple processors

[Figure: pixels (0,7)-(3,7) computed on CPU0-CPU3, one function application per CPU]
[Figure: MIMD execution: four CPUs (CPU0-CPU3), each with its own Fetch, Decode, Execute, Memory, and Writeback pipeline, each running its own copy of the program on its own pixel]
When work is identical (same program): Single Program Multiple Data (SPMD)

[Figure: the four CPUs now share a single program, but each CPU still has its own full Fetch/Decode/Execute/Memory/Writeback pipeline]
[Figure: SIMD execution: one shared Fetch/Decode frontend feeding four Execute/Memory/Writeback lanes, processing pixels (3,7)...(7,0)-(4,0) under a single program]
[Figure: the same SIMD pipeline, with a shared Register File across the four lanes]
[Figure: SIMT execution: one Fetch/Decode frontend and four Execute/Memory/Writeback lanes running a wavefront (WF0) over pixels (1,7)-(3,7), (7,0)-(4,0)]
Terminology Headache #1
- Multiple independent threads
- SIMD/Vector
- SIMT
Example Architectures
- MIMD/SPMD: Multicore CPUs
  - Pros: More general: supports TLP
- SIMD/Vector: x86 SSE/AVX
  - Cons: Gather/Scatter can be awkward
- SIMT: GPUs
  - Cons: Divergence kills performance
- OoO/Dynamic Scheduling: needs ILP
- Multicore/Multithreading/SMT: needs independent threads
[Figure: a GPU Device contains multiple GPU Cores]

Hardware terminology (name / alias):
- Processing Element (Lane)
- SIMD Unit (Pipeline)
- Compute Unit (Core)
- GPU Device (Device)
GPU Programming Models
OpenCL
OpenCL
- Early CPU languages were light abstractions of physical hardware
  - E.g., C
- OpenCL similarly abstracts GPU hardware:

GPU Architecture -> OpenCL Model
- GPU Device -> NDRange
- GPU Core -> Workgroup
- (within a workgroup: Wavefronts and Work-items)
NDRange
- N-Dimensional (N = 1, 2, or 3) index space
- Partitioned into workgroups, wavefronts, and work-items

[Figure: an NDRange divided into Workgroups]
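Each work-item can query its position in this index space; a small illustrative sketch (the kernel itself is hypothetical, the `get_*` calls are standard OpenCL C):

```opencl
// Illustrative kernel: how a work-item locates itself in a 1D NDRange.
__kernel void where_am_i(__global int *out)
{
    int gid = get_global_id(0);   // position in the whole NDRange
    int wg  = get_group_id(0);    // which workgroup this work-item is in
    int lid = get_local_id(0);    // position within that workgroup
    // These identities relate by: gid == wg * get_local_size(0) + lid
    out[gid] = wg * (int)get_local_size(0) + lid;   // same value as gid
}
```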
Kernel
- Run an NDRange on a kernel (i.e., a function)
- Same kernel executes for each work-item
- Smells like MIMD/SPMD... but beware, it's not!

[Figure: the kernel =() applied to each work-item (0,7)-(3,7), with work-items grouped into Workgroups]
OpenCL Code
__kernel
void flip_and_recolor(__global float3 *in_image,
                      __global float3 *out_image,
                      int img_dim_x, int img_dim_y)
{
    int x = get_global_id(0); // work-item id in dimension 0
    int y = get_global_id(1); // work-item id in dimension 1
    // Mirror both coordinates (the -1 keeps indices in bounds);
    // OpenCL buffers are flat, so 2D indices are linearized by hand
    out_image[(img_dim_x - 1 - x) * img_dim_y + (img_dim_y - 1 - y)] =
        recolor(in_image[x * img_dim_y + y]);
}
GPU Microarchitecture
AMD Graphics Core Next

[Figure: a GPU device with multiple GPU Cores sharing an L2 Cache; each GPU Core contains several SIMT units, an L1 Cache, and Local Memory]
[Figure: a Workgroup is assigned to a GPU Core containing four SIMT units, an L1 Cache, and Local Memory]
[Figure: wavefront scheduling timeline over cycles 1-12: each SIMT unit (SIMT0-SIMT3) interleaves instructions from several resident wavefronts (e.g., WF1_0..WF1_3, WF5_0..WF5_3, WF9_0.. on SIMT0), round-robin]
[Figure: a large Register File per SIMT unit: separate registers for each resident wavefront]
Address Coalescing
- Wavefront: issues 64 memory requests
- Common case: work-items in same wavefront touch same cache block
- Coalescing: merge many work-items' requests into a single cache-block request
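The difference between a coalescable and a non-coalescable pattern shows up directly in kernel indexing; a hypothetical sketch:

```opencl
// Adjacent work-items read adjacent floats: a wavefront's 64 requests fall
// into a handful of cache blocks and coalesce into a few memory accesses.
__kernel void coalesced(__global const float *in, __global float *out)
{
    int i = get_global_id(0);
    out[i] = 2.0f * in[i];
}

// Adjacent work-items read floats `stride` apart: with a large stride, each
// request touches a different cache block and nothing can be merged.
__kernel void strided(__global const float *in, __global float *out, int stride)
{
    int i = get_global_id(0);
    out[i] = 2.0f * in[i * stride];
}
```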
GPU Memory
- GPUs have caches.
GPU (GCN)
[Table: cache capacity per resident work-item: a GPU core's 16KB L1 serves 2,560 resident work-items (about 6.4 bytes of L1 each); the chip's 768KB L2 serves 81,920 work-items (about 9.6 bytes of L2 each); other listed capacities: 16KB, 16KB, 8MB, 1MB]
GPU Caches
- Maximize throughput, not hide latency
- Not there for either spatial or temporal locality
Scratchpad Memory
- GPUs have scratchpads (Local Memory)
  - Allocated to a workgroup, i.e., shared by wavefronts in workgroup
- Software must:
  - Rename addresses
  - Manage capacity: manual fill/eviction

[Figure: four SIMT units alongside the L1 Cache and Local Memory]
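A minimal sketch of explicit scratchpad management (not from the slides; assumes a workgroup size of 64): each workgroup stages a tile of the input into Local Memory, synchronizes, then reads neighbors out of the fast scratchpad instead of global memory.

```opencl
__kernel void blur_1d(__global const float *in, __global float *out)
{
    __local float tile[64];            // scratchpad, sized to the workgroup
    int gid = get_global_id(0);
    int lid = get_local_id(0);

    tile[lid] = in[gid];               // manual fill
    barrier(CLK_LOCAL_MEM_FENCE);      // wait until the whole tile is loaded

    // Interior work-items average with neighbors from the scratchpad
    if (lid > 0 && lid < 63)
        out[gid] = (tile[lid - 1] + tile[lid] + tile[lid + 1]) / 3.0f;
    else
        out[gid] = tile[lid];
}
```

Capacity management here is simply sizing `tile` to fit; there is no hardware eviction to fall back on.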
A Rose by Any Other Name
AMD/OpenCL               NVIDIA/CUDA                Generic
Processing Element       CUDA Processor             Lane
SIMD Unit                CUDA Core                  Pipeline
Compute Unit             Streaming Multiprocessor   GPU Core
GPU Device               GPU Device                 Device
OpenCL/AMD    Hennessy & Patterson               CUDA
Work-item     Sequence of SIMD Lane Operations   Thread
Wavefront     Thread of SIMD Instructions        Warp
Workgroup     Body of vectorized loop            Block
NDRange       Vectorized loop                    Grid
Recap
- Data Parallelism: Identical, Independent work over multiple data inputs
- GPU version: add streaming access pattern
Advanced Topics
GPU Limitations, Future of GPGPU
if (x <= 0)
    y = 0;
else
    y = x;

Branch Divergence
- When control flow diverges, all lanes take all paths
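For this particular branch, a common remedy is a branch-free rewrite; a sketch (not from the slides):

```opencl
__kernel void relu(__global const float *x, __global float *y)
{
    int i = get_global_id(0);
    // Equivalent to: if (x[i] <= 0) y[i] = 0; else y[i] = x[i];
    // fmax() issues the same instruction on every lane, so no lanes are
    // masked off and the wavefront never walks both sides of a branch.
    y[i] = fmax(x[i], 0.0f);
}
```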
Beware!
- Divergence isn't just a performance problem:

// acquire lock
while (test&set(lock, 1) == false) {
    // spin
}
// lock acquired; critical section follows

- If one lane acquires the lock while the other lanes in its wavefront keep spinning, the wavefront can deadlock: the winning lane cannot proceed until the spinning lanes do.
Memory Bandwidth
- Memory divergence

[Figure: SIMT lanes (Lane 0-3) feeding DRAM banks (Bank 0-3): a parallel access pattern hits all four banks at once; a sequential access pattern serializes on the banks]
Memory Divergence
- One work-item stalls -> entire wavefront must stall
- Cause: bank conflicts, cache misses
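The classic padding trick illustrates how bank conflicts arise and how to sidestep them; a sketch (not from the slides; assumes 32 four-byte-wide Local Memory banks and a 32x32 workgroup):

```opencl
#define TILE 32
__kernel void transpose_tile(__global const float *in, __global float *out)
{
    // Without the +1, every element of a column lives in the same bank, so a
    // column read serializes; the padding shifts each row by one bank.
    __local float tile[TILE][TILE + 1];
    int x = get_local_id(0);
    int y = get_local_id(1);

    tile[y][x] = in[y * TILE + x];     // row-major load: conflict-free
    barrier(CLK_LOCAL_MEM_FENCE);
    out[y * TILE + x] = tile[x][y];    // column read: padding spreads the banks
}
```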
Safety net: Fence
- "Make sure all previous accesses are visible before proceeding"
- Built-in barriers are also fences
- A wrench: GPU fences are scoped: they apply only to a subset of work-items in the system
  - E.g., a local barrier
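Scoped synchronization in OpenCL C might look like this (a hypothetical kernel; `barrier` and `mem_fence` are standard built-ins):

```opencl
__kernel void scoped_fences(__global int *g, __local int *l)
{
    int lid = get_local_id(0);
    l[lid] = lid;

    // Scoped to the workgroup: makes Local Memory writes visible to, and
    // synchronizes with, only the work-items in this workgroup.
    barrier(CLK_LOCAL_MEM_FENCE);

    g[get_global_id(0)] = l[(lid + 1) % (int)get_local_size(0)];

    // Orders this work-item's global-memory accesses, but still promises
    // nothing to work-items in other workgroups.
    mem_fence(CLK_GLOBAL_MEM_FENCE);
}
```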
GPU Coherence?
- Notice: the GPU consistency model does not require coherence
  - i.e., Single Writer, Multiple Reader
- Reliability:
  - Historically: who notices a bad pixel?
  - Future: GPU compute demands correctness
- Power: Mobile, mobile, mobile!!!