May 9, 2016
Configurable, Coherent SoC Interconnect for
Heterogeneous Multiprocessing and Storage
Applications
Dr. John Bainbridge
ChipEx 2016
May 9th, 2016
May 9, 2016 2
John Bainbridge, ChipEx 2016 | © Copyright 2016 NetSpeed Systems
MARKET TRENDS AND CHALLENGES
Discontinuities in the Compute Infrastructure
New Demand Profile Driving New Architectures
• 1980s: MAINFRAMES (Compute)
• 1990s: CLIENT / SERVER (Desktop)
• 2000s: DATA CENTERS (Internet)
• Now: CLOUD COMPUTING (Software Defined Everything)
General Purpose > Workload specific
Hardware acceleration for Software
Customized SKUs
Cloud Architectures Driving Heterogeneous Computing
Increased Efficiency From Hardware Specialization
Source: ISSC Proceedings
Cloud Architectures Driving Heterogeneous Computing
Reconfigurable Computing – ASIC + FPGA
Source: Hot Chips ’14 Proceedings
Discontinuities in Chip Design Techniques
Changing Abstraction Levels
• 1980s: CUSTOM (Sea of transistors)
• 1990s: ASIC (Sea of Cells)
• 2000s: SOC (Sea of Blocks)
• Now: NEXT-GEN SOCS (Sea of IPs)
Time-to-Market Pressure
Performance Analysis from Day #1
Create Differentiated Platforms
Evolution of Coherency Solutions
Coherency Participation Has Increased
[Figure: progression of cache hierarchies, from a single CPU with L1, L2, and memory, through multi-CPU clusters with L1, L2, and L3, to CPU clusters sharing a last level cache and memory with a GPU and an accelerator]
• The number of IP blocks has exploded
• More agents are participating in coherency
• Requirements are moving from homogeneous → heterogeneous
The Need for Flexible Coherency Solutions
Flexibility Creates Opportunities
APPLICATION
• Improve latency of critical traffic
• Improve compute cluster latency
FLOORPLAN
• Reduce interconnect congestion
• Utilize unused die area with additional cache capacity
POWER
• Partition solution to enhance opportunities for power-gating
• Lower BW requirement
SOLUTION
NetSpeed Technology Overview
Built From Requirements, Optimized For Specific Applications
PERFORMANCE, POWER, AREA
WORKLOAD MODELS
COMPUTE ENGINES
NETSPEED IP AND SOC ARCHITECTURE SYNTHESIS
CUSTOMIZED SOC
PERFORMANCE OPTIMIZED
CORRECT-BY-CONSTRUCTION
CACHE COHERENT
NetSpeed Gemini Overview
Scalable, Configurable Cache Coherent NoC
CONFIGURABLE SOLUTION
LATENCY OPTIMIZED SOLUTION
SCALABLE SOLUTION
TECHNOLOGY DETAILS
Configurable Solution
Floorplan Aware Design Flow
Step 1: Specify
• Components, Floorplan
• Coherency Requirements
• PPA, SoC Level Use Cases
Step 2: Customize
• Rapid Architecture Exploration
• Real-time Customization
• Power Optimization
Step 3: Generate
• Customized SoC Interconnect IP
– Synthesizable RTL
– Verification IP
– Physical Design Collateral
– IP-XACT
– Performance stats, Programmer's Guide, etc.
Configurable Solution
Enabling Cache Hierarchy Customization
• Add last level caches to reduce critical latencies
• Use multiple coherency controllers or caches to increase bandwidth
• Improve cache utilization by controlling address ranges serviced by each cache
• Support coherency across a mix of different slave devices
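One way to picture "controlling address ranges serviced by each cache" is a small range map that gives every address at most one owning cache. The sketch below is illustrative only: the names `CacheRange`, `build_range_map`, `cache_for`, the cache instance names, and the address values are all assumptions, not NetSpeed configuration syntax.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheRange:
    base: int    # first byte address served (inclusive)
    limit: int   # end of the range (exclusive)
    cache: str   # hypothetical cache instance name


def build_range_map(ranges):
    """Sort ranges by base and reject overlaps so each address has one owner."""
    ordered = sorted(ranges, key=lambda r: r.base)
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.base < prev.limit:
            raise ValueError(f"{prev.cache} and {cur.cache} overlap")
    return ordered


def cache_for(ordered, addr):
    """Return the cache serving addr, or None if no cache covers it."""
    bases = [r.base for r in ordered]
    i = bisect_right(bases, addr) - 1
    if i >= 0 and addr < ordered[i].limit:
        return ordered[i].cache
    return None


# Illustrative address map: two last level caches with disjoint ranges.
RANGE_MAP = build_range_map([
    CacheRange(0x8000_0000, 0xC000_0000, "llc_cpu"),
    CacheRange(0x0000_0000, 0x4000_0000, "llc_gpu"),
])
```

With this map, `cache_for(RANGE_MAP, 0x8000_0000)` resolves to `"llc_cpu"`, while an address in the gap between the two ranges resolves to `None` and would bypass both caches.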
Correct-by-Construction
Formal Analysis to Build Deadlock-free NoC
• Formally Proven: Formal techniques and graph theory algorithms
• Correct-by-construction: User-driven traffic dependencies
• Robust: Handles complex topologies and routing
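The slides do not say which graph algorithms are used. A textbook example of the idea is the channel dependency graph check: if no cycle exists among the channels that can wait on one another, routing deadlock cannot occur. The sketch below shows that generic check only; the function name `has_cycle` and the channel names are assumptions, and this is not presented as NetSpeed's method.

```python
def has_cycle(deps):
    """Detect a cycle in a channel dependency graph.

    deps maps each channel to the channels a packet holding it may wait on.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the DFS stack, finished
    color = {}

    def visit(node):
        color[node] = GREY
        for nxt in deps.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:          # back edge: a cycle of waiting channels
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(deps))


# A unidirectional ring where every channel waits on the next one is cyclic.
ring = {"c0": ["c1"], "c1": ["c2"], "c2": ["c3"], "c3": ["c0"]}
# Removing the wrap-around dependency leaves an acyclic graph.
chain = {"c0": ["c1"], "c1": ["c2"], "c2": ["c3"], "c3": []}
```

Here `has_cycle(ring)` is true and `has_cycle(chain)` is false. A generator that only emits routing whose dependency graph passes such a check is one plausible reading of "correct-by-construction".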
Latency Optimized Solution
Traffic Based Rapid Reconfiguration
• Fully heterogeneous in determining channel & buffer sizes
[Figure: example topologies (Ring, Tree, Mesh, Heterogeneous); legend: NetSpeed Router, NetSpeed Auto-generated Links]
Latency Optimized Solution
Physically Distributed Coherency
[Figure: floorplan showing four CPUs, a GPU, and a DSP, with two last level caches and three CCC blocks distributed among them]
• Lower latency by placing coherency controllers and caches where they are accessed the most
• Reduce congestion by handling requests locally and using caches to reduce traffic to memory
• Adjust cache hierarchy to support floorplan requirements
• Improve die utilization by placing caches in empty die space
Latency Optimized Solution
Physically Distributed Coherency
• FastPath – Dedicated connections for latency-sensitive traffic to reduce arbitration, congestion
• MultiLayer – Isolate traffic with different performance requirements
• QoS configurability for bandwidth allocation and prioritization
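The slides do not describe how the QoS controls work. As a generic illustration of bandwidth allocation, the sketch below uses weighted round-robin arbitration, where each master receives grants in proportion to a configured weight. The function `arbitrate`, the master names, and the weights are assumptions; priority levels are not modeled here.

```python
def arbitrate(weights, pending, cycles):
    """Grant one request per cycle using weighted round-robin.

    weights: master -> positive integer share of bandwidth
    pending: master -> number of requests waiting
    Returns the list of masters granted, in order.
    """
    if any(w < 1 for w in weights.values()):
        raise ValueError("weights must be positive integers")
    credits = dict(weights)
    pending = dict(pending)      # work on a copy, leave the caller's dict alone
    grants = []
    for _ in range(cycles):
        ready = [m for m in weights if pending[m] > 0]
        if not ready:
            break
        if all(credits[m] == 0 for m in ready):
            credits = dict(weights)              # start a new arbitration round
        winner = max(ready, key=lambda m: credits[m])
        credits[winner] -= 1
        pending[winner] -= 1
        grants.append(winner)
    return grants


# Illustrative 3:1 split between a CPU cluster and a DMA engine.
GRANTS = arbitrate({"cpu": 3, "dma": 1}, {"cpu": 100, "dma": 100}, cycles=8)
```

Over the eight cycles above, the CPU receives six grants and the DMA engine two, matching the 3:1 weights while both stay backlogged.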
[Figure: two ACE masters (#0 and #1) attached through ACE interfaces to Master Bridges and FastPath™ Bridges, with the NoC and a CCC. Caption: FastPath™ reducing CPU latency]
Scalable Solution
Scalable Coherency Bandwidth
• More coherent lookups per cycle
– Increased coherency bandwidth through address-sliced coherency controllers
– Automated determination of coherency controllers
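Address slicing can be sketched as interleaving cache lines across several controllers so that independent lookups proceed in parallel. The example below assumes a 64-byte line, a power-of-two slice count, and one lookup per slice per cycle. These parameters and the names `slice_for` and `cycles_needed` are illustrative and do not come from the slides.

```python
from collections import Counter

LINE_BYTES = 64  # assumed cache line size


def slice_for(addr, num_slices):
    """Pick a coherency controller slice by interleaving on cache-line index."""
    if num_slices < 1 or num_slices & (num_slices - 1):
        raise ValueError("num_slices must be a power of two")
    return (addr // LINE_BYTES) & (num_slices - 1)


def cycles_needed(addrs, num_slices):
    """Cycles to serve all lookups if each slice handles one lookup per cycle."""
    load = Counter(slice_for(a, num_slices) for a in addrs)
    return max(load.values(), default=0)


# Eight lookups to consecutive cache lines.
BURST = [i * LINE_BYTES for i in range(8)]
```

Under these assumptions, `cycles_needed(BURST, 1)` is 8 while `cycles_needed(BURST, 4)` is 2, which is the "more coherent lookups per cycle" effect in miniature. Traffic concentrated on one slice would see less benefit.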
Scalable Solution
Importance of Filtering Snoops
• Scalability requires significant filtering of snoops
• A directory structure delivers an efficient solution in less area
[Figure: size of directory / snoop filter (MB, axis 0.0 to 3.0) plotted against filter rate (%, axis 50% to 100%)]
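To make "filter rate" concrete, the sketch below models a directory-style snoop filter that remembers which agents may hold each line and snoops only those, then compares that against broadcasting to every other agent. It is a simplified model with unlimited capacity and read requests only. The class `SnoopFilter` and the agent names are assumptions.

```python
class SnoopFilter:
    """Track possible sharers per cache line and count the snoops avoided."""

    def __init__(self, agents):
        self.agents = set(agents)
        self.sharers = {}     # line address -> agents that may hold the line
        self.sent = 0         # snoops actually issued
        self.broadcast = 0    # snoops a broadcast scheme would have issued

    def read(self, agent, line):
        """Record a read and return the agents that must be snooped."""
        holders = self.sharers.setdefault(line, set())
        targets = holders - {agent}
        self.sent += len(targets)
        self.broadcast += len(self.agents) - 1
        holders.add(agent)
        return targets

    def filter_rate(self):
        """Fraction of broadcast snoops that were filtered out."""
        if self.broadcast == 0:
            return 0.0
        return 1.0 - self.sent / self.broadcast


sf = SnoopFilter(["cpu0", "cpu1", "gpu", "dsp"])
sf.read("cpu0", 0x100)    # nobody holds the line yet: no snoops
sf.read("cpu1", 0x100)    # only cpu0 needs to be snooped
sf.read("gpu", 0x200)     # different line: no snoops
```

In this trace a broadcast scheme would send nine snoops and the filter sends one, so `sf.filter_rate()` is 8/9. A real filter has finite capacity, which is the trade-off between size and filter rate that the figure plots.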
Scalable Solution
Directory Scaling
• Directory size grows O(n²)
– Number of Entries & Number of Agents
NETSPEED SOLUTION:
– Superior directory design
– Linear growth vs. O(n²) growth in directory
[Figure: DIRECTORY drawn as rows of TAG plus Ownership Vector; the row count is the Number of Entries and the vector width is the Number of Agents]
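The O(n²) claim can be checked with simple arithmetic for a conventional full-map directory: the number of entries grows with the number of agents, and each entry carries one ownership bit per agent. The sketch below assumes every agent contributes the same number of tracked lines. The function name and all parameter values are illustrative, and it models only the conventional baseline, not the NetSpeed design, which the slides do not detail.

```python
def full_map_directory_bits(num_agents, lines_per_agent, tag_bits):
    """Return (tag_storage_bits, vector_storage_bits) for a full-map directory.

    Assumes one entry per tracked cache line and one ownership bit per agent.
    """
    entries = num_agents * lines_per_agent     # grows linearly with agents
    tag_storage = entries * tag_bits           # linear in num_agents
    vector_storage = entries * num_agents      # quadratic in num_agents
    return tag_storage, vector_storage


# Illustrative numbers: 8192 tracked lines per agent, 30-bit tags.
SIZES = {n: full_map_directory_bits(n, 8192, 30) for n in (4, 8, 16)}
```

Doubling the agent count from 4 to 8 doubles the tag storage but quadruples the ownership-vector storage, which is the quadratic term the slide refers to.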
Scalable Solution
Reducing Conflicts
[Figure axes: Power, Performance]
ONE-TO-ONE DIRECTORY
– Directory = sum(cache associativity)
– Dynamic power increases with traffic & number of agents
– O(n²)
GLOBAL SHARED DIRECTORY
– Address conflicts reduce performance
– Limited associativity reduces power
– Conflict rate increases with cache associativity
NETSPEED SOLUTION:
– Limits associativity and dynamic power
– Advanced techniques to avoid address conflicts
Summary
PROBLEM
• New architectures driving need for heterogeneous computing
• Exploding complexity with shrinking timelines
• Need configurable coherency solutions
SOLUTION
• NetSpeed Architecture Synthesis Platform
• NetSpeed Gemini: Configurable and Scalable Coherent NoC IP
BENEFITS
• Flexible, correct-by-construction IP
• Higher performance with lower latency
• Build customized, application-specific solutions

Dr. John Bainbridge, Principal Application Architect, NetSpeed