0% found this document useful (0 votes)

21 views

CPUs_GPUs_accelerators_and_memory_v1.0

Uploaded by

test test

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

21 views

CPUs_GPUs_accelerators_and_memory_v1.0

Uploaded by

test test

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 44

CPUs, GPUs, accelerators and memory

Andrea Sciabà
On behalf of the Technology Watch WG

HOW Workshop
18-22 March 2019
Jefferson Lab, Newport News
Introduction
• The goal of the presentation is to give a broad overview of the
status and prospects of compute technologies
– Intentionally, with a HEP computing bias
• Focus on processors and accelerators and volatile memory
• The wider purpose of the working group is to provide information
that can be used to optimize investments
– Market trends, price evolution
• More detailed information is already available in a document
– Soon to be added to the WG website

2
Outline
• General market trends
• CPUs
– Intel, AMD
– ARM
– Other architectures
• GPUs
• FPGAs
• Supporting technologies
• Memory technologies

3
Semiconductor device market and trends

• Global demand for semiconductors

topped 1 trillion units shipped for the
first time
• Global semiconductor sales got off to a
slow start in 2019, as year-to-year sales
decreased
• Long-term outlook remains promising,
due to the ever-increasing
semiconductor content in a range of
consumer products
• Strongest unit growth rates foreseen
for components of
– smartphones
– automotive electronics systems
– devices for deep learning applications

4
Semiconductor fabrication

• Taiwan leading all regions/countries in wafer capacity

• TSMC held 67% of Taiwan’s capacity and is leading
• Samsung and SK Hynix represent 94% of the installed IC wafer capacity in South Korea
– They are likely to influence memory prices (which are now very high)
• New manufacturing lines are expected to boost industry capacity by 8% in both 2018 and 2019
5
Process technology

• Performance scaling in process technology continues to grow Moore’s law prediction

• Embedded processors benefit the most from process manufacturing improvements
• EUV is forecast to be the dominant lithography technology in the coming years
– Already used for 7nm by TSMC for AMD, Apple, Nvidia and Qualcomm

6
Intel and AMD market share
• AMD server market share is rapidly Source: Passmark website

increasing since 2017, but from almost

nothing (statistics are voluntarily submitted
benchmark results)

– Zen architecture released in 2017

– Achieved 5% of server shipment market
share on Q4 2018, projected to 10% in one
year
• AMD always had a reasonable (20-30%)
share overall
• EPYC revenue was $58m in the second
2018 quarter vs $36m in the prior quarter

7
Internet and smart population growth and effects

• Small changes in smart population trend from 2018

• Significant increase in mobile social media usage over the past year

8
CPUS AND ACCELERATORS

9
Intel server CPU line-up
• Intel Xeon Scalable Processors
– Currently based on Skylake-SP and
coming in four flavours, up to 28 cores
• Only minor improvements foreseen for
2019
– Adding support for Optane DC Persistent
Memory and hardware security patches
• New microarchitecture (Sunny Cove) to
become available late 2019
– Several improvements benefiting both
generic and specialised applications

10
Current and future Intel server architectures
Microarchitecture Technology Launch year Highlights
Skylake-SP 14nm 2017 Improved frontend and execution units
More load/store bandwidth
Improved hyperthreading
AVX-512
Cascade Lake 14nm++ 2019 Vector Neural Network Instructions (VNNI) to improve inference performance
Support 3D XPoint-based memory modules and Optane DC
Security mitigations
Cooper Lake 14nm++ 2020 bfloat16 (brain floating point format)
Sunny Cove 10nm+ 2019 Single threaded performance
(aka Ice Lake) New instructions
Improved scalability
Larger L1, L2, μop caches and 2nd level TLB
More execution ports
Willow Cove 10nm 2020? Cache redesign
New transistor optimization
Security Features
Golden Cove 7/10nm? 2021? Single threaded performance
AI Performance
Networking/5G Performance
Security Features

11
Other Intel x86 architectures
• Xeon Phi
– Features 4-way hyperthreading and AVX-512 support
– Elicited a lot of interest in the HEP community and for deep learning applications
– Announced to be discontinued in summer 2018
• Networking processors (Xeon D)
– SoC design
– Used to accelerate networking functionality or to process encrypted data streams
– Two families, D-500 for networking and D-100 for higher performance, based on
Skylake-SP with on-package chipset
– Hewitt Lake just announced, probably based on Cascade Lake
• Hybrid CPUs
– Will be enabled by Foveros, the 3D chip stacking technology recently demonstrated

12
AMD server CPU line-up
• EPYC 7000 line-up from 2017
– Resurgence after many years of
Bulldozer CPUs thanks to the
Zen microarchitecture
• +40% in IPC, almost on par with
Intel
• 2x power efficiency vs Piledriver
– Up to 32 cores
• Already being tested and
used at some WLCG sites
13
EPYC Naples
• EPYC Naples (Zen) consists of up to 4 separate dies,
interconnected via Infinity Fabric
– Chiplets allow a significant reduction in cost and
higher yield
• Main specifications
– up to 32 cores
– 4 dies per chip (14nm), each die embedding IO and
memory controllers
– 2.0-3.1 GHz of base frequency
– 8 DDR4 memory channels with hardware encryption
– up to 128 PCI gen3 lanes per processor (64 in dual )
– TDP range: 120W-200W
• Similar per-core and per-GHz HS06 performance to
Xeon

14
EPYC Rome
• Next AMD EPYC generation (Zen 2), embeds 9 dies, including
one for I/O and memory access
– Should compete with Ice Lake
• Main specs:
– 9 dies per chip : a 14nm single IO/memory die and 8 CPU 7nm
chiplets
• +300-400 MHz for low core count CPUs
– 8 DDR4 memory channels, up to 3200 MHz
– up to 64 cores
– up to 128 PCI Gen3/4 lanes per processor
– TDP range: 120W-225W (max 190W for SP3 compatibility)
– Claimed +20% performance per-core over Zen, +75% through
the whole chip with similar TDP over Naples
– To be released during 2019

15
Recent experiences in WLCG
1. LHCb
– Using some nodes with EPYC 7301 CPUs (16 cores)
– Performance of LHCb trigger application almost equal to Xeon Silver 4114 (10 cores)
– Need to populate all 8 DIMM slots for maximum performance
– Testing it as potential hypervisor platform
– Will competitively tender with Intel next year
2. NIKHEF
– Have 93 single-socket 32 core EPYC 7551P nodes in production
– A single EPYC 7371 node (single socket, 16 cores), available for tests
3. INFN
– All WLCG sites have installed in 2018 a number of systems (40 in total) with EPYC 7351 (16 cores) in Twin Square configuration
– Experience very positive
4. BNL
– Extensive tests with several EPYC CPUs presented at HEPiX Fall 2018
– Measured performance from mid/upper range EPYC similar to mid/upper range Xeon Gold
• Caltech
– Two servers with EPYC 7551P (32 cores), soon available for benchmarking

16
ARM in the data center
• ARM is ubiquitous in the mobile and
embedded CPU word
• Data center implementations have been
relatively unsuccessful so far
– Performance/power and performance/$ not
competitive with Intel and AMD
• LHC experiments are capable of using ARM
CPUs if needed
– Some do nightly builds on ARM since years
• Only a few implementations (potentially)
relevant to the data center
– Cavium ThunderX2
– Fujitsu A64FX
– ARM Neoverse
– Ampere eMAG, Graviton

17
Marvell ThunderX2 and Fujitsu A64FX
• ThunderX2 for mainstream cloud and HPC data centers, from
2018
– Enjoys the greatest market visibility and reasonable
performance/$
• Used e.g. at CRAY XC-50 at Los Alamos and HPE Apollo 70 based Astra
HPC system at Sandia National Laboratory
– ARM V8.1 architecture
• Up to 32 cores, 4-way SMT
• Up to 8 DDR4 memory channels
• Up to 56 PCIe Gen3 lanes

• Fujitsu A64FX to be used in supercomputer at RIKEN center

– Based on the V8.2-A ISA architecture
• First to deliver scalable vector extensions (SVE)
• 48 cores
• 32 GB of HBM2 high bandwidth memory
• 7nm FinFET process
– Interesting to see what performance will achieve as it may lead
to a more competitive product
18
ARM Neoverse
• Two platforms for the data center
– N1 for cloud, E1 for throughput
• Based on the Neoverse N1 CPU
– Very similar architecture to Cortex A76 but optimized
for high clock speeds (up to 3.1 GHz)
– Two N1 cores each with L1 and L2 caches
– To be combined by licensees with memory controller,
interconnect and I/O IP
• Demonstrated the N1 Hyperscale Reference Design
– 64-128 N1 CPUs each with 1 MB of private L2
– 8x8 mesh interconnect with 64-128 MB of shared
cache
– 128x PCIe/CCIX lanes
– 8x DDR4 memory channels
• Intended to strengthen ARM’s server market share
– Not expected to be available for another 1-1.5 years Source: Anandtech
19
IBM POWER
• POWER9
– Used in Summit, the fastest supercomputer
– 4 GHz
– Available with 4-way (up to 24 cores)
– First supporting PCIe-Gen4
– CAPI 2.0 I/O to enable
• Coherent user-level access to accelerators and I/O
devices
• Access to advanced memories
– NVLink to increase bandwidth to Nvidia GPUs
– 14nm FINFET process
– Product line with full support for RHEL/CENTOS7
• POWER10
– 10nm process
– Several feature enhancements
– First to support PCIe Gen5

20
RISC-V and MIPS
• RISC-V is an open source ISA
– To be used by some companies for controllers (Nvidia and WD), for
FPGA (Microsemi), for fitness bands…
– For the time being, not targeting the data center
– Might compete with ARM in the mid term
– Completely eclipsed MIPS
• MIPS
– Considered dead

21
Discrete GPUs: current status
• GPU’s raw power follows the exponential trend on numbers of
transistors and cores
• New features appear unexpectedly, driven by market (e.g. tensor
cores)
– Tensor cores: programmable matrix-multiply-and-accumulate units
– Fast half precision multiplication and reduction in full precision
– Useful for accelerating deep learning training/inference

https://devblogs.nvidia.com/programming-tensor-cores-cuda-9/ 22
Nvidia and AMD
• Volta addressing the server market, • Vega 20
Turing the gaming market – Directly aimed at the server world (Instinct
MI50 and MI60)
Feature Volta (V100) Turing (2080 Ti) • Evolution of Vega 10 using a 7nm
Process 12nm 12nm process
CUDA cores yes yes
– more space for HBM2 memory, up to 32GB
Tensor cores yes yes
– 2x memory bandwidth
RT cores NA yes
FP16: 28 TFLOPS Same, but
– Massive FP64 gains
FP performance
FP32: 14 TFLOPS FP64: 1/32 of FP32 – PCIe Gen4
FP64: 7 TFLOPS
Tensor: 112 TFLOPS • Some improvements relevant for
Memory HBM2 GDDR6
inference scenarios
Memory bandwidth 900 GB/sec 616 GB/sec
– Support for INT8 and INT4 data types
Multi-GPU NVLink 2 NVLink 2/SLI
– Some new instructions
Applications AI, datacenter, AI, workstation,
workstation gaming
23
GPUs - Programmability
• NVIDIA CUDA:
– C++ based (supports C++14), de-facto standard
– New hardware features available with no delay in the API
• OpenCL:
– Can execute on CPUs, AMD GPUs and recently Intel FPGAs
– Overpromised in the past, with scarce popularity
• Compiler directives: OpenMP/OpenACC
– Latest GCC and LLVM include support for CUDA backend
• AMD HIP:
– Interfaces to both CUDA and AMD MIOpen, still supports only a subset of the CUDA
features
• GPU-enabled frameworks to hide complexity (Tensorflow)
• Issue is performance portability and code duplication
24
GPUs in LHC experiments software frameworks
• Alice, O2 • LHCb (online - standalone) Allen
– Tracking in TPC and ITS framework: HLT-1 reduces 5TB/s input
– Modern GPU can replace 40 CPU cores to 130GB/s:
• CMS, CMSSW – Track reconstruction, muon-id, two-tracks
vertex/mass reconstruction
– Demonstrated advantage of
heterogeneous reconstruction from RAW – GPUs can be used to accelerate the entire
to Pixel Vertices at the CMS HLT HLT-1 from RAW data
– ~10x both in speed-up and energy – Events too small, have to be batched:
efficiency wrt full Xeon socket makes the integration in Gaudi difficult
– Plans to run heterogeneous HLT during LHC • ATLAS
Run3 – Prototype for HLT track seed-finding,
calorimeter topological clustering and anti-
kt jet reconstruction
– No plans to deploy this in the trigger for
Run 3

25
FPGA
• Players: Xilinx (US), Intel (US), Lattice
Semiconductor (US), Microsemi (US),
and QuickLogic (US), TSMC (Taiwan),
Microchip Technology (US), United
Microelectronics (Taiwan),
GLOBALFOUNDRIES (US), Achronix (US),
and S2C Inc. (US)
• Market valued at USD 5 Billion in 2016 Process Technology
20 nm 16 nm 14 nm

and expected to be valued at 10 Billion Intel® Xilinx® Intel® Xilinx® Intel® Xilinx®

in 2023 Virtex®
Virtex®
UltraScale+®
Intel®
Stratix®
• Growing demand for advanced driver-
Top Performance Tier UltraScale
® Zynq®
10
UltraScale+®
assistance systems (ADAS), Intel® Arria®
Kintex
developments in IoT and reduction in Mid Performance Tier
10
UltraScale
®

time-to-market are the key driving Intel®

factors Low Performance Tier Cyclone® 10
GX

Source: https://www.intel.com/content/www/us/en/programmable/documentation/mtr1422491996806.html#qom1512594527835__fn_soc_variab_avail_xlx

26
FPGA programming
• Used as an application acceleration device • In HEP
– Targeted at specific use cases
• Neural inference engine – High Level Triggers
• MATLAB
• LabVIEW FPGA • https://cds.cern.ch/record/2647951
• OpenCL – Deep Neural Networks
– Very high level abstraction • https://arxiv.org/abs/1804.06913
– Optimized for data parallelism
• https://indico.cern.ch/event/703881/
• C / C++ / System C
– High level synthesis (HLS) – High Throughput Data Processing
– Control with compiler switches and • https://indico.cern.ch/event/669298/
configurations
• VHDL / Verilog
– Low level programming

27
Other Machine Learning processors and accelerators

• Intel Nervana AI Processor NNP-L-1000 (H2 2019-)

– Accelerates AI inference for companies with high workload
demands
– Optimized across memory, bandwidth, utilization and
power
– Spring Crest 3-4x faster training than Lake Crest,
introduced in 2017
– Supports bfloat16
• Google TPU
– Huge increase in perf/watt for ML compared to CPUs and
GPUs
• Intel Configurable Spatial Accelerator (CSA)
– Dataflow engines that explicitly map the parallelism of the
code onto an array of processing, storage and switching
elements
– Heavily customized for specific applications

28
MEMORY TECHNOLOGIES

29
Static RAM (SRAM)
• On die memory on the CPU used for
L1/L2/L3 cache
– SRAM cell size not scaling with node
• SRAM cache constitutes large fraction of
area on modern CPUs
• Power consumption is an issue
• Applications driving larger caches
• No direct replacement in sight for
L1/L2
• Alternate L3 cache technologies
– eDRAM - Used in IBM Power CPUs
– STT-MRAM - proposed as possible
replacement
https://www.sigarch.org/whats-the-future-of-technology-scaling/

30
Dynamic RAM (DRAM)
• Dominant standards continue to
evolve
– DDR4 -> DDR5
• 3200MT/s -> 6400MT/s
• 16Gb -> 32Gb chips
– GDDR5 - > GDDR5X
• 14 Gbps/pin -> 16Gbps/pin
• 8Gb -> 16Gb chips
– HBM -> HBM2
• 1 Gbps/pin -> 2.4 Gbps/pin
• 4 die stack -> 12 die stack
• 2Gb die -> 8Gb die
• Note memory latency remains mostly (Youngwoo Kim, KAIST’s Terabyte Labs)
https://www.3dincites.com/2019/02/designcon-2019-shows-board-and-system-designers-
unchanged the-benefits-of-advanced-ic-packaging/

31
DRAM Outlook
• Major vendors showing next
generation chips (DDR5/GDDR6)
• Multiple technologies being
investigated for future DRAM
• EUV lithography not needed for at
least 3 more generations (Micron)
• Contract DRAM pricing fell ~30% in
Q1 2019
• Pressure expected on DRAM prices
thru 2019 due to additional
production capacity coming online
https://www.techinsights.com/technology-intelligence/overview/technology-roadmaps/

32
Performance gaps in memory hierarchy

https://www.opencompute.org/files/OCP-GenZ-March-2018-final.pdf https://www.eetimes.com/author.asp?section_id=36&doc_id=1334088#

33
Emerging technologies
• May eventually fill the gap
– STT-MRAM between SRAM and
DRAM (work in progress)
– “Persistent Memory” in NVDIMM
package for the DRAM/NAND gap
• Low latency NAND (e.g. Z-NAND)
• 3D XPoint (aka “Optane”)
– Technologies still in the lab
• MRAM
• NRAM
• FeRAM
• PCRAM https://www.snia.org/sites/default/files/PM-
Summit/2018/presentations/14_PM_Summit_18_Analysts_Session_Oros_Final_Post_UPD
ATED_R2.pdf
• ReRAM

34
SUPPORTING TECHNOLOGIES

35
Interconnect technology
• Increasing requirements on bandwidth
and latency driving the development
– E.g. moving data between CPU and GPU
is often a bottleneck
– Several standards competing (PCIe
Gen4/5, CCIX, Gen-Z, OpenCAPI, CXL…)
• Proprietary technologies
– NVLink (GPU-to-GPU, GPU-to-POWER9)
– Ultra Path (Intel), CPU-to-CPU
– Infinity Fabric (AMD), chiplet-to-chiplet

36
Packaging technology
• Traditionally a silicon die is individually packaged,
but more and more CPUs package together more
(sometimes different) dies
• Classified according to how dies are arranged and
connected
– 2D packaging (e.g. AMD EPYC): multiple dies on a
substrate
– 2.5D packaging (e.g. Intel Kaby Lake-G, CPU+GPU):
interposer between die and substrate for higher
speed
– Intel Foveros, a 2.5D with an interposer with active
logic (Intel “Lake Field” hybrid CPU)
– 3D packaging (e.g. stacked DRAM in HBM), for lower
power, higher bandwidth and smaller footprint
• Can alleviate scaling issues with monolithic CPU
dies but at a cost, both financial and in power and
latency

37
What next?
• We do not really know what will be there in the HL-LHC era
(2026-2037)
• Some “early indicators” of what might come next
– Several nanoelectronics projects might help in
• Increasing density of memory chips
• Reducing size of transistors in IC
– Nanocrystals, silicon nanophotonics, carbon nanotubes, single-atom
thick graphene film, etc.
– https://www.understandingnano.com/nanotechnology-electronics.html

38
Conclusions
• Market trends
– Server market is increasing, AMD share as well
– EUV lithography driving 7nm mass production
• CPU, GPUs and accelerators
– AMD EPYC promising from a cost perspective
– Nvidia GPUs still dominant due to the better software support
– Recent developments for GPUs greatly favor inference workloads
– FPGA market dominated by telecom, industry and automotive but there is also some HEP usage
• Memory technologies
– SDRAM still the on-chip memory of choice, DRAM still for the main memory, no improvements in
latency
– NVDIMM – emerging memory packaging for memory between DRAM and NAND flash (see next
talk)
– Other non-volatile memory technologies in development

39
Additional resources
• All subgroups
– https://gitlab.cern.ch/hepix-techwatch-wg
• CPUs, GPUs and accelerators
– Document (link)
• Memory technologies
– Document (link)

40
Acknowledgments
• Special thanks to Shigeki Misawa, Servesh Muralidharan, Peter
Wegner, Eric Yen, Andrea Chierici, Chris Hollowell, Charles
Leggett, Michele Michelotto, Niko Neufeld, Harvey Newman,
Felice Pantaleo, Bernd Panzer-Steindel, Mattieu Puel and
Tristan Suerink

41
BACKUP SLIDES

42
Market share of technology companies
Server companies PC companies

• Worldwide server market increased 38%, year over year to $23

billion during the third quarter of 2018
43
eMAG, Graviton

• eMAG from Ampere is a V8 64 bit

single socket SoC meant to compete
with Xeon processors
– Available in 16 and 32 cores
– Eight DDR4 memory channels
– 42 PCI-E Gen3 lanes
– Using the TSMC 16nm FinFET+ process
• Graviton is available only via AWS
– Could be the beginning of a new trend
among hyperscalers, avoiding
commercially available processors
– not a good thing for HEP if it results in
higher CPU prices due to drop of sales!

04 AMD Edge AI TechDay_Singapore_2024_FrankWang
No ratings yet
04 AMD Edge AI TechDay_Singapore_2024_FrankWang
29 pages
Conectores para Cable
No ratings yet
Conectores para Cable
64 pages
MESA White Paper 52 - Smart Manufacturing - Landscape Explained
100% (1)
MESA White Paper 52 - Smart Manufacturing - Landscape Explained
50 pages
Chapter 1 (System Concepts and The Information System Environment)
100% (2)
Chapter 1 (System Concepts and The Information System Environment)
19 pages
System Administrator Checklist
No ratings yet
System Administrator Checklist
4 pages
CPUs GPUs Accelerators
No ratings yet
CPUs GPUs Accelerators
22 pages
PCWorld 01 2024
No ratings yet
PCWorld 01 2024
112 pages
48423B Fusion Whitepaper WEB
No ratings yet
48423B Fusion Whitepaper WEB
8 pages
No Exaflops For You
No ratings yet
No Exaflops For You
61 pages
Giulio Corradi Presentation PDF
No ratings yet
Giulio Corradi Presentation PDF
64 pages
Robotics Webinar Series Session 3 Slides
No ratings yet
Robotics Webinar Series Session 3 Slides
46 pages
Hyperion Research HPC and AI Processors
No ratings yet
Hyperion Research HPC and AI Processors
14 pages
Hc2024.Amd.vpeng
No ratings yet
Hc2024.Amd.vpeng
36 pages
Latest Processors Microcontrollers
No ratings yet
Latest Processors Microcontrollers
13 pages
0930 18.07.18 Neel Gala InCore Semiconductors PDF
No ratings yet
0930 18.07.18 Neel Gala InCore Semiconductors PDF
33 pages
Arm Holdings PLC Q4 FYE24 Results
No ratings yet
Arm Holdings PLC Q4 FYE24 Results
11 pages
02 AMD Tech Day AECG Portfolio Overview
No ratings yet
02 AMD Tech Day AECG Portfolio Overview
34 pages
Hot Chips - Aug 23 - BHS and Granite Rapid - Xeon - Architecture - Public
No ratings yet
Hot Chips - Aug 23 - BHS and Granite Rapid - Xeon - Architecture - Public
15 pages
EUF-DeS-T1741 Scalable Multicore QorIQ Layerscape Processors Based on 64-Bit Software Environment for Enterprise, Home and Industrial Applications
No ratings yet
EUF-DeS-T1741 Scalable Multicore QorIQ Layerscape Processors Based on 64-Bit Software Environment for Enterprise, Home and Industrial Applications
50 pages
HC31 2.6 Intel SPH 2019 v3
No ratings yet
HC31 2.6 Intel SPH 2019 v3
12 pages
Lecture 02 Intel & ADM processor
No ratings yet
Lecture 02 Intel & ADM processor
35 pages
Processor
No ratings yet
Processor
27 pages
Special_Issue_on_Contemporary_Industry_Products_2024
No ratings yet
Special_Issue_on_Contemporary_Industry_Products_2024
2 pages
Micro systems package
No ratings yet
Micro systems package
44 pages
CC Unit 1
No ratings yet
CC Unit 1
24 pages
Isolation and Virtualization Solutions For Automotive Real Time Processors
No ratings yet
Isolation and Virtualization Solutions For Automotive Real Time Processors
44 pages
Modle 01 - HPC Introduction To Pipeline
No ratings yet
Modle 01 - HPC Introduction To Pipeline
124 pages
L1.2 HPC Introduction
No ratings yet
L1.2 HPC Introduction
42 pages
Computer Architecture Slides
No ratings yet
Computer Architecture Slides
274 pages
Microprocessors and Microcontrollers_ An Overview
No ratings yet
Microprocessors and Microcontrollers_ An Overview
15 pages
Strategic_Pivot_of_Long-Standing_x86_Rivals
No ratings yet
Strategic_Pivot_of_Long-Standing_x86_Rivals
2 pages
9 - CPU vs. GPU
No ratings yet
9 - CPU vs. GPU
14 pages
60_HC2024.Intel.RomanKaplan.Gaudi3-0826
No ratings yet
60_HC2024.Intel.RomanKaplan.Gaudi3-0826
16 pages
White Paper X12
No ratings yet
White Paper X12
20 pages
14_HC2024.Intel.Xeon_6_SoC.Praveen.Mosur
No ratings yet
14_HC2024.Intel.Xeon_6_SoC.Praveen.Mosur
28 pages
Review of LSS CSC
No ratings yet
Review of LSS CSC
21 pages
L 3 GPU
No ratings yet
L 3 GPU
33 pages
AMD 2020 Premium Research 2
No ratings yet
AMD 2020 Premium Research 2
15 pages
Definition - What Does Processor Mean?
No ratings yet
Definition - What Does Processor Mean?
47 pages
Ventana HotChips23 - Final
No ratings yet
Ventana HotChips23 - Final
16 pages
Group16_ITST_PPT
No ratings yet
Group16_ITST_PPT
11 pages
Arm Future Perspective
No ratings yet
Arm Future Perspective
16 pages
PowerEdge Architecture Technical Overview
No ratings yet
PowerEdge Architecture Technical Overview
24 pages
Intel Architecture Day 2021 Presentation
No ratings yet
Intel Architecture Day 2021 Presentation
195 pages
HPC Trends For 2015
No ratings yet
HPC Trends For 2015
19 pages
Intel Xeon Processor Roadmap
No ratings yet
Intel Xeon Processor Roadmap
7 pages
Overview of CPU Architectures and Recent Trends
No ratings yet
Overview of CPU Architectures and Recent Trends
7 pages
An Arm Processor Is One of A Family of Central Processing Units
No ratings yet
An Arm Processor Is One of A Family of Central Processing Units
6 pages
2.2 Survey of Processor Technology
No ratings yet
2.2 Survey of Processor Technology
9 pages
List of CPU Architectures
0% (1)
List of CPU Architectures
3 pages
intel-entry-processors-the-right-balance-of-performance-and-power-for-the-industrial-iot-market-1709278712
No ratings yet
intel-entry-processors-the-right-balance-of-performance-and-power-for-the-industrial-iot-market-1709278712
4 pages
Lecture1 ch1 Fundamentals of Quantitative Design and Analysis
No ratings yet
Lecture1 ch1 Fundamentals of Quantitative Design and Analysis
28 pages
What Is Processor ?
No ratings yet
What Is Processor ?
8 pages
CAQA6e ch1
No ratings yet
CAQA6e ch1
31 pages
Recent Development and Future Trends On Microprocessors: TH TH
No ratings yet
Recent Development and Future Trends On Microprocessors: TH TH
1 page
Modul 10. Compute
No ratings yet
Modul 10. Compute
109 pages
Introduction To GP-GPU and CUDA: High Performance Computing Center Hanoi University of Science & Technology
No ratings yet
Introduction To GP-GPU and CUDA: High Performance Computing Center Hanoi University of Science & Technology
43 pages
Computer Architecture and Organization Case Study GROUP 6
No ratings yet
Computer Architecture and Organization Case Study GROUP 6
5 pages
IT Technology and Markets, Status and Evolution: 26. March 2018 Bernd Panzer-Steindel, CTO CERN/IT 1
No ratings yet
IT Technology and Markets, Status and Evolution: 26. March 2018 Bernd Panzer-Steindel, CTO CERN/IT 1
21 pages
An Introduction To Computer Architecture: © 2019 Arm Limited
No ratings yet
An Introduction To Computer Architecture: © 2019 Arm Limited
46 pages
Script
No ratings yet
Script
2 pages
GPU Overclocking Guide
From Everand
GPU Overclocking Guide
Alisa Turing
No ratings yet
Neo Geo Architecture: Architecture of Consoles: A Practical Analysis, #23
From Everand
Neo Geo Architecture: Architecture of Consoles: A Practical Analysis, #23
Rodrigo Copetti
No ratings yet
PLC: Programmable Logic Controller – Arktika.: EXPERIMENTAL PRODUCT BASED ON CPLD.
From Everand
PLC: Programmable Logic Controller – Arktika.: EXPERIMENTAL PRODUCT BASED ON CPLD.
MARIO FRANCO
No ratings yet
Essbase: Stable Release Operating System Type License
No ratings yet
Essbase: Stable Release Operating System Type License
9 pages
Fibro Katalog (Stamp Tool)
No ratings yet
Fibro Katalog (Stamp Tool)
18 pages
Codigos de Falla Del BHM
No ratings yet
Codigos de Falla Del BHM
14 pages
10 Advanced PC Tweaks
No ratings yet
10 Advanced PC Tweaks
3 pages
Industrial Training Project Report
No ratings yet
Industrial Training Project Report
36 pages
Handbook Dsiklavier Pianos
No ratings yet
Handbook Dsiklavier Pianos
37 pages
United Security CVD-2025 User Manual
No ratings yet
United Security CVD-2025 User Manual
2 pages
3406c Fuel Pump-Governor Drive
No ratings yet
3406c Fuel Pump-Governor Drive
8 pages
Free Form Origami Tessellations
No ratings yet
Free Form Origami Tessellations
10 pages
Pt. Recsalog Exploration: Pemanfaatan Metoda Geofisika Magnetik (Geomagnetic) Untuk Pekerjaan Eksplorasi
No ratings yet
Pt. Recsalog Exploration: Pemanfaatan Metoda Geofisika Magnetik (Geomagnetic) Untuk Pekerjaan Eksplorasi
10 pages
Introduction To The POSBANK USA ANYSHOP POS Computer
No ratings yet
Introduction To The POSBANK USA ANYSHOP POS Computer
13 pages
PCB Design Tips
No ratings yet
PCB Design Tips
29 pages
Teves Mk60 - Abs - Eds - Asr & Esp: Brake Technology
No ratings yet
Teves Mk60 - Abs - Eds - Asr & Esp: Brake Technology
2 pages
Practical File: Guru Gobind Singh Indraprastha University
No ratings yet
Practical File: Guru Gobind Singh Indraprastha University
35 pages
Run 100
No ratings yet
Run 100
65 pages
Wii Cheat Codes: Lego Indiana Jones-The Original Adventures
100% (1)
Wii Cheat Codes: Lego Indiana Jones-The Original Adventures
2 pages
Reliability Avalilability Serviceability
No ratings yet
Reliability Avalilability Serviceability
5 pages
Raspberry Pi 3 Projects Master - Timothy Short
No ratings yet
Raspberry Pi 3 Projects Master - Timothy Short
42 pages
RMF
No ratings yet
RMF
245 pages
Memristor F
No ratings yet
Memristor F
31 pages
History of Microprocessor
No ratings yet
History of Microprocessor
4 pages
Reemplazo Bateria Triconex
100% (1)
Reemplazo Bateria Triconex
2 pages
What'S New With Emc Controlcenter V6.0
No ratings yet
What'S New With Emc Controlcenter V6.0
30 pages
(Ebook) Achieving Accuracy: A Legacy of Computers and Missiles by McMurran, Marshall William ISBN 9781462810659, 1462810659 download
100% (3)
(Ebook) Achieving Accuracy: A Legacy of Computers and Missiles by McMurran, Marshall William ISBN 9781462810659, 1462810659 download
49 pages
Bosch Aoftware
No ratings yet
Bosch Aoftware
2 pages
24EC102
No ratings yet
24EC102
3 pages

CPUs_GPUs_accelerators_and_memory_v1.0

Uploaded by

CPUs_GPUs_accelerators_and_memory_v1.0

Uploaded by

CPUs, GPUs, accelerators and memory

• Global demand for semiconductors

• Taiwan leading all regions/countries in wafer capacity

• Performance scaling in process technology continues to grow Moore’s law prediction

increasing since 2017, but from almost

– Zen architecture released in 2017

• Small changes in smart population trend from 2018

• Fujitsu A64FX to be used in supercomputer at RIKEN center

time-to-market are the key driving Intel®

• Intel Nervana AI Processor NNP-L-1000 (H2 2019-)

• Worldwide server market increased 38%, year over year to $23

• eMAG from Ampere is a V8 64 bit

You might also like