Network Algorithmics
황인욱
Atto Research
2018. 6. 15.
Outline
• Building faster routers
– Scheduling packets
• BW guarantee & DRR
• Random early detection
• Token bucket
– Traffic measurement
– Lookups
• Prefix Match
• Exact-Match: inventing bridge
• Introduction to “Network Algorithmics”
2
Router Bottlenecks
• Exact-match lookup
• Prefix-match lookup
• Switching
• QoS
3
Scheduling Packets
• Packet handling at the output queue
– Problems with FIFO with tail-drop
• What needs to be done
– BW guarantee, rate-limiting, TCP congestion control
– A and B share the network. How do we guarantee A 80% of the bandwidth with priority?
– Keep video traffic from exceeding 1 Mbps.
4
BW guarantee
• FIFO queue
• Multi queue with Round robin
[Figure: a single FIFO queue holding packets from flows A and B, and separate per-flow queues A and B served round robin; packet sizes 200, 600, 400]
5
BW guarantee
• Multi queue with Priority
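• Problems
– Priority management: finding the highest priority takes O(log n) with a heap
• Rather than going this far, meet the target share over the long term
[Figure: queues A and B with packet sizes 200, 600, 400, annotated with a timestamp]
6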
Deficit RR (DRR)
• O(1)
– Active list: the list of queues that have packets to send
– Quantum should be at least the maximum packet size, so that every visit to a queue sends at least one packet
[Figure: queues A, B, C with packet sizes 200, 600, 400, a round-robin pointer, and a deficit counter per queue (initially 0)]
7
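The DRR idea above can be sketched in a few lines of Python. This is a minimal sketch, not the slide author's code: the function name `drr_schedule`, the queue contents, and the quantum of 600 bytes are assumptions chosen for illustration.

```python
from collections import deque

def drr_schedule(queues, quantum):
    """Deficit round robin over per-flow queues of packet sizes (bytes).

    queues:  dict flow -> list of packet sizes, head of queue first
    quantum: dict flow -> credit (bytes) added on every visit; assumed > 0
    Returns the transmission order as (flow, size) tuples.
    """
    pending = {flow: deque(sizes) for flow, sizes in queues.items()}
    deficit = {flow: 0 for flow in queues}
    # Active list: only queues that currently hold packets.
    active = deque(flow for flow in queues if pending[flow])
    sent = []
    while active:
        flow = active.popleft()          # round-robin pointer
        deficit[flow] += quantum[flow]
        # Send packets while the head packet fits in the deficit counter.
        while pending[flow] and pending[flow][0] <= deficit[flow]:
            size = pending[flow].popleft()
            deficit[flow] -= size
            sent.append((flow, size))
        if pending[flow]:
            active.append(flow)          # still backlogged: back of the list
        else:
            deficit[flow] = 0            # an idle queue keeps no credit
    return sent

# Hypothetical example: A has five 200-byte packets, B has 600 and 400 bytes.
order = drr_schedule({"A": [200] * 5, "B": [600, 400]}, {"A": 600, "B": 600})
print(order)
```

Giving one flow a larger quantum than another (for example 2400 versus 600) is one way to approximate an 80%/20% split over the long term.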
DRR extension
• Class based queuing (CBQ)
– Hierarchical DRR
• One scheduler per node
• Modified DRR (Cisco/Juniper)
– VoIP gets top priority
Tenant A: 70% (Web 40%, other 60%)
Tenant B: 30% (Web 50%, other 50%)
8
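As a worked example of the hierarchy, the end-to-end share of each leaf class is the product of the percentages along its path. The sketch below assumes the tree reads as Tenant A 70% (Web 40%, other 60%) and Tenant B 30% (Web 50%, other 50%); the function name `leaf_shares` and the data layout are illustrative only.

```python
def leaf_shares(tree, parent_percent=100.0, prefix=""):
    """Return {leaf path: percent of the link} for a share hierarchy.

    tree: dict name -> (percent of the parent, dict of children)
    """
    shares = {}
    for name, (percent, children) in tree.items():
        share = parent_percent * percent / 100
        path = f"{prefix}/{name}" if prefix else name
        if children:
            shares.update(leaf_shares(children, share, path))
        else:
            shares[path] = share
    return shares

# Assumed reading of the slide's tree.
tree = {
    "Tenant A": (70, {"Web": (40, {}), "Other": (60, {})}),
    "Tenant B": (30, {"Web": (50, {}), "Other": (50, {})}),
}
for path, share in leaf_shares(tree).items():
    print(f"{path}: {share:.0f}% of the link")
```

Under that reading, Tenant A's Web class ends up with 28% of the link and each of Tenant B's classes with 15%.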
8
TCP congestion control
• Does IPv4 lack a DECbit-style bit for congestion avoidance?
– Proposal on the table for an ECN bit for IPv6
9
Random Early Detection (RED)
• Minimize TCP restarts.
– Drop packets once the output queue grows beyond a certain size.
– The drop acts as a kind of signal.
• Implemented in most routers.
– de facto standard
• Weighted RED (WRED)
– Cisco
– Different thresholds depending on the IP TOS bits
• Adaptive RED (ARED), Robust RED (RRED)
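A simplified sketch of the RED drop decision follows. It uses the commonly described form with a minimum threshold, a maximum threshold, and a drop probability that rises linearly between them; the parameter names and default values are assumptions, and refinements such as spacing drops by packet count are left out.

```python
def update_average(avg_queue, current_queue, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1 - weight) * avg_queue + weight * current_queue

def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Drop probability as a function of the average queue length."""
    if avg_queue < min_th:
        return 0.0          # short queue: never drop
    if avg_queue >= max_th:
        return 1.0          # long queue: always drop
    # In between: probability grows linearly from 0 to max_p.
    return max_p * (avg_queue - min_th) / (max_th - min_th)

# Hypothetical thresholds: 10 and 30 packets, max_p = 0.1.
for avg in (5, 20, 30):
    print(avg, red_drop_probability(avg, min_th=10, max_th=30, max_p=0.1))
```

WRED can be pictured as the same function called with a different `(min_th, max_th, max_p)` triple per traffic class.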
10
Token bucket
• When is it needed?
– Limit a flow's bandwidth to 100 Kbps
– But allow bursts of about 4 KB
OpenFlow 1.3
11
Token bucket
12
• The bucket never holds more than B tokens.
• In practice, implemented with a counter and a timer
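The counter-and-timer implementation can be sketched as below, using the 100 Kbps rate and 4 KB burst from the earlier example. The class name `TokenBucket`, the byte-based token unit, and the explicit `now` argument (instead of reading a clock) are assumptions made so that the sketch stays deterministic.

```python
class TokenBucket:
    """Token bucket policer: tokens are bytes, refilled lazily on each packet."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0       # refill rate in bytes per second
        self.capacity = burst_bytes      # B: the bucket never holds more
        self.tokens = float(burst_bytes) # the counter, starting full
        self.last = 0.0                  # the timer: time of the last update

    def allow(self, size_bytes, now):
        # Add the tokens earned since the last packet, capped at B.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True                  # conforming packet
        return False                     # out of profile

bucket = TokenBucket(rate_bps=100_000, burst_bytes=4096)
for t in (0.0, 0.0, 0.0, 0.1):
    print(t, bucket.allow(1500, now=t))
```

A policer would drop or mark the packets for which `allow` returns `False`, while a shaper would queue them until enough tokens accumulate.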
Shaping vs. policing
https://www.cisco.com/c/en/us/support/docs/quality-of-service-qos/qos-policing/19645-policevsshape.html
13
Traffic Measurement
• Traffic measurement matters
– Accounting/billing on Internet backbones
– Traffic engineering
– Capacity planning
– Network diagnostics and forensics: intrusion detection, denial-of-service attacks
– Products: NetFlow (Cisco), cflowd (Juniper), NetStream (Huawei)
• And it is an “emerging field”
14
Counting
[Figure: a set of counters, all starting at 0, that increment as packets arrive; one counter climbs to 24 while the others stay at 3 or below]
• Counting: the canonical measurement task
– Per-interface counting: easy
– Filter-based, per-prefix counting: hard
15
Hybrid DRAM-SRAM architecture
• DRAM (large) vs. SRAM (fast)
– DRAMs have access times of 50-60 ns
– SRAMs have access times of 4.5-7 ns, but capacities of only around 50-60 Mb (Micron Tech.)
16
“Expensive infeasible”
Approximate counting
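• Sacrifice accuracy to make things simpler and to reduce memory.
• Randomized counting
– Increment the counter probabilistically
• It seems enough to measure only the large flows (elephants) and ignore the small flows (mice).
– But how do we know whether a flow is an elephant?
– Counting every flow to find the elephants would defeat the purpose -> use hashing
– Use multiple hashes to reduce false positives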
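One way to picture "hashing plus multiple hashes" is a multistage filter: each stage hashes the flow to a counter, and a flow is treated as an elephant only when its counter in every stage has reached the threshold. The sketch below is an illustration under assumed names (`MultistageFilter`, `flow_memory`) and assumed sizes, not a description of any product.

```python
import hashlib

class MultistageFilter:
    """Detect large flows with a few small counter arrays instead of per-flow state."""

    def __init__(self, stages=3, buckets=64, threshold=1000):
        self.buckets = buckets
        self.threshold = threshold
        self.counters = [[0] * buckets for _ in range(stages)]
        self.flow_memory = {}            # exact byte counts for detected elephants

    def _index(self, stage, flow):
        # An independent hash per stage, derived by mixing in the stage number.
        digest = hashlib.sha256(f"{stage}:{flow}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.buckets

    def add(self, flow, size):
        if flow in self.flow_memory:     # already known elephant: count exactly
            self.flow_memory[flow] += size
            return
        passed_all = True
        for stage, row in enumerate(self.counters):
            i = self._index(stage, flow)
            row[i] += size
            if row[i] < self.threshold:
                passed_all = False       # one small counter is enough to reject
        if passed_all:
            self.flow_memory[flow] = size

f = MultistageFilter()
for i in range(9):
    f.add(f"mouse-{i}", 100)             # 900 bytes of mice in total
for _ in range(10):
    f.add("elephant", 100)               # one flow sending 1000 bytes
print(sorted(f.flow_memory))
```

A mouse is reported falsely only if it collides with heavy traffic in every stage at once, which is why adding stages lowers the false-positive rate.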
17
Overall Architecture
• Elephant traps: few, deep counters
• Mouse traps: many, shallow counters
• Status bit: indicates overflow
[Figure: flows feeding the mouse traps and elephant traps]
18
Sampling
• Basic NetFlow
– Burdens DRAM; collection overhead
– Typical sampling rates: 1/16, 1/1000
• Sampled charging
• Trajectory sampling
– All routers use the same hash
19
Trajectory sampling
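The point of using the same hash everywhere is that every router makes the same sampling decision for a given packet, so the packet's whole path can be reconstructed. A minimal sketch, assuming the hash is computed over packet bytes that do not change hop by hop; the function names and the use of CRC-32 are illustrative choices.

```python
import zlib

def is_sampled(invariant_bytes, one_in_n):
    """Same decision at every router: depends only on the packet content."""
    return zlib.crc32(invariant_bytes) % one_in_n == 0

def sample(packets, one_in_n):
    """What one router would record for the packets it forwards."""
    return {p for p in packets if is_sampled(p, one_in_n)}

packets = [f"packet-{i}".encode() for i in range(1000)]
router_1 = sample(packets, one_in_n=16)
router_2 = sample(packets, one_in_n=16)
print(router_1 == router_2)   # both routers picked exactly the same packets
```

With independent random sampling at each router, the two sets would overlap only by chance and trajectories could not be stitched together.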
Longest Prefix Matching
20
Prefix        Bit pattern                            Entry
10.0.0.0/8    00001010 XXXXXXXX XXXXXXXX XXXXXXXX    R1
128.0.0.0/9   10000000 0XXXXXXX XXXXXXXX XXXXXXXX    R2
10.0.1.0/24   00001010 00000000 00000001 XXXXXXXX    R3
0.0.0.0/0     XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX    R4
Which is the longest prefix that matches 10.0.1.5?
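The question can be checked with a naive linear scan over the four entries, using Python's standard `ipaddress` module. This is only a reference implementation for the example table; the function name `longest_prefix_match` is an assumption, and real routers use the faster structures discussed on the following slides.

```python
import ipaddress

ROUTES = {
    "10.0.0.0/8": "R1",
    "128.0.0.0/9": "R2",
    "10.0.1.0/24": "R3",
    "0.0.0.0/0": "R4",
}

def longest_prefix_match(address, routes=ROUTES):
    """Return (prefix, entry) for the longest matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, entry in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, entry)
    return best

print(longest_prefix_match("10.0.1.5"))
```

10.0.1.5 matches 10.0.0.0/8, 10.0.1.0/24, and 0.0.0.0/0; the /24 is the longest, so the answer is R3.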
Longest Prefix Matching
• Non-algorithmic
– Caching
• Map: from a 32-bit address to a next hop
• “Cache hit ratios in backbone: poor”
– TCAM Issues (Ternary Content-Addressable Memories)
• Density Scaling
• Power Scaling
• Time Scaling
• Extra Chips
• Algorithmic
– Trie: a tree data structure specialized for strings
– Binary search
• Semiconductor manufacturers
– “Betting on both sides”: algorithmic and CAM-based
21
TCAM
• Ternary Content Addressable Memory
– Matches on 0, 1, and X (don't care), hence ternary
– Give it data and it returns the address.
• Great for partial match
– Longest prefix
– Access lists
[Figure: TCAM contents]
Address  Pattern
0        1 0 1 1 1
1        1 0 1 1 X X
2        1 0 1 X X X
3        1 1 0 1 X X
4        0 0 1 0 X X
5        X 0 0 X 0 0
6        X 0 0 X 1 0
7        X X X X X X
Search key 1 0 1 1 0 0 -> match at address 1
22
Cost of TCAM relative to SRAM
Power    6x
Area     7x
Latency  4x
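The lookup in the figure can be emulated in software as a first-match scan over ternary patterns. The sketch below copies entries 1 to 7 from the slide and leaves out entry 0 because its pattern is incomplete there; the function names are assumptions. A real TCAM compares all entries in parallel and returns the lowest matching address, which the loop only imitates.

```python
# Address -> ternary pattern, copied from the slide (entry 0 omitted: incomplete).
TCAM = {
    1: "1011XX",
    2: "101XXX",
    3: "1101XX",
    4: "0010XX",
    5: "X00X00",
    6: "X00X10",
    7: "XXXXXX",
}

def ternary_match(pattern, key):
    """True if every pattern symbol is X (don't care) or equals the key bit."""
    return len(pattern) == len(key) and all(
        p == "X" or p == k for p, k in zip(pattern, key)
    )

def tcam_lookup(key, table=TCAM):
    """Return the lowest address whose pattern matches the key, or None."""
    for address in sorted(table):
        if ternary_match(table[address], key):
            return address
    return None

print(tcam_lookup("101100"))   # the search key from the slide
```

Because longer prefixes sit at lower addresses, the first match is also the longest prefix match.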
One-bit TRIE
23
DRAM access: 60 ns
Worst case for a 32-bit address: 32 accesses * 60 ns = 1.92 us per lookup
Multi-bit TRIE
Various optimizations: stride, compression, etc.
“This is enough for 10 Gbps”
24
Binary search
• Slower than a multi-bit trie
• Two reasons it is still needed
– Patents
• “Some vendors use it for this reason”
– IPv6
• 8-bit stride multibit trie: 16 accesses
• Binary search on prefix lengths: 7 accesses
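The two access counts follow from simple arithmetic on the address width: a trie needs one access per stride, while binary search over the possible prefix lengths needs about log2 of the number of lengths. A small sketch, with hypothetical function names:

```python
import math

def trie_accesses(address_bits, stride):
    """Worst-case node visits for a multibit trie with a fixed stride."""
    return math.ceil(address_bits / stride)

def length_search_accesses(address_bits):
    """Worst-case probes for binary search over all possible prefix lengths."""
    return math.ceil(math.log2(address_bits))

for name, bits in (("IPv4", 32), ("IPv6", 128)):
    print(name, trie_accesses(bits, stride=8), length_search_accesses(bits))
```

For 32-bit addresses the trie still wins (4 versus 5), which matches the first bullet; the advantage flips only at 128 bits (16 versus 7).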
25
Binary search
• Binary search on ranges
• Binary search on prefix lengths
26
[Figure: one table per prefix length (Length-1, Length-2, Length-3, …), with example entries 101, 100, 111, 110]
Exact match lookup: history of the bridge
• Late 1980s
– Limitations of Ethernet
– Ethernet needed to be extended.
• Filter repeater with learning (Mark Kempf, DEC)
– “A brilliant idea”
27
What was done to achieve wire speed
• 10Mbps
– 2 lookups per port in 51.2 usec
• Architecture
– 4-port cheap DRAM with cycle time of 100 nsec for packet buffers and lookup memory. Bus parallelism, memory bandwidth, page mode.
• Data Copying
– Ethernet chips used DMA, packets copied from one port to other by flipping
pointers.
• Control Overhead
– Interrupt overhead minimized by processor polling, staying in a loop after a
packet interrupt.
• Lookups
– Used caveats. Wrote software to verify lookup bottleneck
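The 51.2 usec figure is the time a minimum-size (64-byte) frame occupies a 10 Mbps link, and it bounds how many memory cycles each lookup may use. The arithmetic below is a sketch; it ignores preamble and inter-frame gap, and it assumes all four ports share one memory that is also used for packet buffers, so the result is an upper bound.

```python
MIN_FRAME_BITS = 64 * 8          # minimum Ethernet frame: 64 bytes
LINK_BPS = 10_000_000            # 10 Mbps
DRAM_CYCLE_NS = 100              # from the slide
PORTS = 4
LOOKUPS_PER_PORT = 2             # from the slide

frame_time_ns = MIN_FRAME_BITS * 1_000_000_000 // LINK_BPS
cycles_per_frame_time = frame_time_ns // DRAM_CYCLE_NS
lookups_per_frame_time = PORTS * LOOKUPS_PER_PORT
cycles_per_lookup = cycles_per_frame_time // lookups_per_frame_time

print(frame_time_ns / 1000, "usec per minimum-size frame")
print(cycles_per_frame_time, "memory cycles in that time")
print(cycles_per_lookup, "cycles per lookup at most")
```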
28
Scaling lookups
• 1990s
– DEC's decision: an FDDI bridge for connecting 100 Mbps Ethernet rings
– Minimum packet size: 64 bytes -> 40 bytes
– Lookup DB: 8K -> 64K
• Two approaches
– Perfect Hashing (pre-computation)
– HW parallelism
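Perfect hashing by pre-computation can be illustrated with a brute-force search: try hash seeds until the known set of addresses maps to distinct slots, so that every lookup takes exactly one probe. This is a toy sketch with assumed names (`slot`, `find_perfect_seed`) and made-up MAC addresses, not the scheme DEC actually used.

```python
import hashlib

def slot(key, seed, table_size):
    """Hash a key into a table slot; the seed selects the hash function."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % table_size

def find_perfect_seed(keys, table_size, max_seed=10_000):
    """Pre-computation: return a seed under which no two keys collide."""
    for seed in range(max_seed):
        if len({slot(k, seed, table_size) for k in keys}) == len(keys):
            return seed
    return None

# Hypothetical station addresses learned by the bridge.
macs = [f"00:1b:44:11:3a:{i:02x}" for i in range(8)]
seed = find_perfect_seed(macs, table_size=16)
table = {slot(m, seed, 16): m for m in macs}
print(seed, len(table))
```

The search cost is paid when the address database changes, which keeps the per-packet lookup path short and predictable.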
29
Network Algorithmics
30
Network algorithmics is the use of an interdisciplinary systems approach, seasoned
with algorithmic thinking, to design fast implementations of network processing tasks
at servers, routers, and other networking devices
Topics
• Endnode bottlenecks
– Data copy: DMA, programmed IO
– Context switching
• service model (process/thread/event-driven), select()
– Timer: timing wheel
– Demultiplexing
– Protocol processing
• UDP checksum, buffer management, reassembly
• Router bottlenecks
– exact match: bridge
– prefix match: the router's longest prefix match
– switching
– packet classification
• service differentiation (router)
– QoS: rate-limiting, RED
• Others
– Network Measurement: counter, trajectory sampling
– Network Security: exact/approximate string matching
31
15 implementation principles
32
Polya, “How to solve it”
33
ROUTE COMPUTATION USING DIJKSTRA’S ALGORITHM (4.3)
[Figure: step-by-step run of Dijkstra's algorithm on an example graph; the source starts at distance 0, all other nodes start at ∞, and tentative distances are lowered as nodes are settled]
34
ROUTE COMPUTATION USING DIJKSTRA’S ALGORITHM (4.3)
35
[Figure: heap holding keys 2, 5, 10, 12, 15, 16]
Replacing the heap with buckets indexed by integer cost: n log n -> n + diam*maxlinkcost
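The bucket idea is commonly known as Dial's algorithm: when link costs are small integers, tentative distances can index an array of buckets, and the heap is no longer needed. The sketch below uses a hypothetical four-node graph and, for simplicity, a linear bucket array that is scanned to the end rather than stopping at the network diameter.

```python
def dial_shortest_paths(graph, source):
    """Dijkstra with buckets instead of a heap.

    graph: dict node -> list of (neighbor, cost), cost a positive integer;
           every node must appear as a key.
    Returns dict node -> shortest distance (None if unreachable).
    """
    max_cost = max((c for edges in graph.values() for _, c in edges), default=1)
    dist = {node: None for node in graph}
    dist[source] = 0
    # No shortest path is longer than n * max_cost, so that many buckets suffice.
    buckets = [[] for _ in range(len(graph) * max_cost + 1)]
    buckets[0].append(source)
    for d, bucket in enumerate(buckets):
        for node in bucket:
            if dist[node] != d:
                continue                 # stale entry: node was improved later
            for neighbor, cost in graph[node]:
                candidate = d + cost
                if dist[neighbor] is None or candidate < dist[neighbor]:
                    dist[neighbor] = candidate
                    buckets[candidate].append(neighbor)
    return dist

# Hypothetical topology with integer link costs.
graph = {
    "A": [("B", 2), ("C", 5)],
    "B": [("A", 2), ("C", 1), ("D", 4)],
    "C": [("A", 5), ("B", 1), ("D", 1)],
    "D": [("B", 4), ("C", 1)],
}
print(dial_shortest_paths(graph, "A"))
```

This is an instance of relaxing a constraint (integer costs instead of real numbers) to allow a simpler and faster data structure.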
Updating TCAM (3.1)
[Figure: TCAM with eight ternary entries at addresses 0-7; search key 1 0 1 1 0 0 matches address 1]
36
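My own summary of the principles
• Make good use of data structures and hardware
– TCAM, trie, hash, etc.
• Optimize the common case
– Cache
• Relax constraints so that an easier algorithm applies.
– Example: integers instead of real numbers
• If that still does not work, consider sacrificing accuracy or using probabilistic methods
– Values that need not be very precise (ranking)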
– Ethernet, RED
37
Summary
• Building faster routers
– Scheduling packets
• BW guarantee & DRR
• Random early detection
• Token bucket
– Traffic measurement
– Lookups
• Prefix Match
• Exact-Match: inventing bridge
• Introduction to “Network Algorithmics”
38

