
PCI (Peripheral Component Interconnect)

Presented by
SHIJU N.P
Youzentech Technologies
Today's topics
• Introduction
• PCI and PCI-X differences
• Introduction to PCIe
• Overview of PCIe topology
  * Root complex
  * Switches
  * Bridges
Introduction

• The PCI (Peripheral Component Interconnect) bus was developed in the early 1990s to address the shortcomings of the peripheral buses used in PCs (personal computers) at the time.
• PCI uses parallel communication.
Data rate for each version
PCI Bus Transfer
• The round arrow symbol shown on the AD bus indicates that the tri-stated bus is undergoing a "turn-around cycle" as ownership of the signals changes.
• The same AD signals are multiplexed to carry both the address and the data.
PCI Bus Arbitration
• Nearly all PCI devices today are capable of acting as bus master, so they can perform both DMA and peer-to-peer transfers. In a shared-bus architecture like PCI, they have to take turns on the bus, so a device that wants to initiate transactions must first request ownership of the bus from the bus arbiter.
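The PCI spec leaves the arbitration algorithm to the chipset designer, so the sketch below is only one plausible policy (round-robin fairness), not *the* PCI arbiter. The agent count and the `arbitrate` helper are illustrative assumptions.

```python
# Illustrative round-robin bus-arbiter sketch. Real PCI arbiters are
# chipset-specific; the spec only requires some fair arbitration scheme.

def arbitrate(requests, last_granted, n_agents=4):
    """Pick the next bus master in round-robin order.

    requests: indices of agents currently asserting REQ#
    last_granted: index of the agent granted on the previous round
    """
    for offset in range(1, n_agents + 1):
        candidate = (last_granted + offset) % n_agents
        if candidate in requests:
            return candidate  # the arbiter would assert GNT# for this agent
    return None  # no one is requesting; the bus stays idle

print(arbitrate([0, 2], last_granted=0))  # 2: rotation passes agent 0 this round
```

The rotation starting point (`last_granted + 1`) is what prevents one busy master from starving the others.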
PCI Transaction Models
1. Programmed I/O
2. DMA
3. Peer-to-peer

• All of these follow an initiator-target model.
• A DMA engine handles the details of memory transfers to a peripheral on behalf of the processor, off-loading this work from the CPU.
PCI Inefficiencies
• PCI Retry Protocol
• PCI Disconnect Protocol
• PCI Interrupt Handling
• PCI Error Handling
Retry Protocol

• When a PCI master initiates a transaction to access a target device and the target device is not ready, the target signals a transaction retry.
• The target cannot transfer the data immediately. The target (an Ethernet device in this example) has two choices by which to delay the data transfer. The first is to insert wait-states in the data phase; if only a few wait-states are needed, the data is still transferred efficiently. If, however, the target device requires more time (more than 16 clocks from the beginning of the transaction), the second option is to signal a retry using a signal called STOP#.
Error Handling
• PCI has no hardware mechanism to recover from errors.
Signal Timing Problems with the Parallel PCI Bus Model beyond 66 MHz

• Operating above 66 MHz can drive inputs into a metastable state.
• With a clock period below 15 ns there is a chance of improper sampling.
• Setup-time and hold-time violations may take place.
Introducing PCI-X

• PCI-X is backward compatible with PCI in both hardware and software, but provides better performance and higher efficiency. It uses the same connector format, so PCI-X devices can be plugged into PCI slots and vice-versa. And it uses the same configuration model, so device drivers, operating systems, and applications that run on a PCI system also run on a PCI-X system.

• To achieve higher speeds without changing the PCI signaling model, PCI-X added a few tricks to improve the bus timing. First, devices implement PLL (phase-locked loop) clock generators that provide phase-shifted clocks internally. That allows the outputs to be driven a little earlier and the inputs to be sampled a little later, improving the timing on the bus. Likewise, PCI-X inputs are registered (latched) at the input pin of the target device, resulting in shorter setup times. The time gained by these means increased the time available for signal propagation on the bus and allowed higher clock frequencies.
Transaction
• The bus does not go to a wait state after the data phase, because the target device knows exactly what the initiator needs.
PCI-X Features

• Split-Transaction Model
• Transaction attributes
• Message-signaled interrupts
Split Transaction
• To help keep track of what each device is doing, the device initiating the read is now called the Requester, and the device fulfilling the read request is called the Completer.
• This avoids the wait states and retry transactions used in PCI.
• Better bus utilization is achieved.
Transaction Attributes

• PCI-X also added another phase to the beginning of each transaction, called the Attribute Phase.
• In this time slot the requester delivers information that can be used to help improve the efficiency of transactions on the bus, such as the byte count for this request and who the requester is (Bus, Device, Function number).
• In addition to those items, two new bits were added to help characterize the transaction: the "No Snoop" bit and the "Relaxed Ordering" bit.
Problems with the Common Clock Approach of PCI and PCI-X

• Signal skew
• Clock skew
• The relationship between clock period and flight time
PCI-X 2.0 Source-Synchronous Model
• The term "source synchronous" means that the device transmitting the data also provides another signal that travels the same basic path as the data. That signal in PCI-X 2.0 is called a "strobe" and is used by the receiver for latching the incoming data bits.
• This avoids the flight-time and clock-skew problems.
Introduction to PCIe

Terms and Acronyms

For the Root Complex:
1 - Downstream port (egress port)

For switch 1:
2 - Upstream port (ingress port)
3 - Downstream port (egress port)
Dual Simplex Link
• PCIe uses a bidirectional connection and is capable of sending and receiving information at the same time.
• The term for this path between the devices is a Link, and it is made up of one or more transmit and receive pairs. One such pair is called a Lane, and the spec allows a Link to be made up of 1, 2, 4, 8, 12, 16, or 32 Lanes.
• The number of lanes is called the Link Width and is represented as x1, x2, x4, x8, x16, and x32.
• Serial communication avoids the clock-skew and flight-time issues even when the clock period is very small, because the clock is embedded in the data stream.
• Signal skew is also eliminated, since only one bit is transferred per lane.
Differential Signals

• They double the pin count.
• They increase noise immunity.
• Anything that affects one signal will also affect the other by about the same amount and in the same direction. The receiver looks at the difference between them, and the noise doesn't really change that difference, so the result is that most noise affecting the signals doesn't affect the receiver's ability to accurately distinguish the bits.
Clock Recovery

• A PLL recovers the clock used at the transmitting side from the incoming data stream, starting from an internally generated reference clock.
• The PLL continuously adjusts the clock using phase comparison.
• If no transactions arrive, the recovered clock may drift to the wrong frequency, because the clock can only be extracted while a data stream is present (this is why logical idle symbols are sent on an otherwise idle link).
PCIe Topology
• A Link is a point-to-point connection, rather than the shared bus used in PCI and PCI-X.
• Flexible topology options are achieved using bridges and switches.
• Forward bridges and reverse bridges are the two types of bridges; with these, older cards can plug into a PCIe system and newer cards can plug into PCI and PCI-X systems.
• Newer PCIe devices support multiple inbound and outbound connections.
Root Complex and Switch
• The bus inside the Root Complex is internal; its actual logical design doesn't have to conform to any standard and can be vendor specific.
• The software model is similar to PCI, so compatibility is maintained.
• Internal Bus 0 is used for the internal logical connections of the root complex.
• Enumeration, the process by which configuration software discovers the system topology and assigns bus numbers and system resources, works the same way, too.
Low-Cost PCIe System
• This is a consumer desktop machine with an Intel processor.
• The memory controller and some routing logic have been integrated into the CPU package; this integrated logic is often called the "Uncore".
• The Root Complex must also be inside the CPU package.
Server PCIe System

Introduction to
Device Layers

28 26
Why do we need layer partitioning?

To achieve:

• Easier migration to new versions

• Simpler troubleshooting

• A spec that is easier to understand
PCIe Layers
• The Device Core acts as the endpoint; switch and root cores exist as well, with different functionality in each case.
• The Transaction Layer is responsible for Transaction Layer Packet (TLP) creation on the transmit side and TLP decoding on the receive side. Other duties of this layer are Quality of Service, Flow Control, and Transaction Ordering functionality.
• The Data Link Layer is responsible for Data Link Layer Packet (DLLP) creation on the transmit side and decoding on the receive side. This layer is also responsible for Link error detection and correction.
• The Physical Layer is responsible for Ordered-Set packet creation on the transmit side and Ordered-Set packet decoding on the receive side. This layer processes all three types of packets (TLPs, DLLPs and Ordered-Sets) to be transmitted on the Link and processes all types of packets received from the Link.
Switch Port Layers
• Each port of a switch needs to implement the PCIe layer architecture, because the action a component must perform is determined by the packet contents.
• The routing information is inside the packet, and routing is done by the Transaction Layer.
Detailed Block Diagram of PCI Express
Device’s Layers

Non-Posted Read
• The requester includes its own ID in the request packet so the completer knows where to return the completion (Requester ID).
• The Tag is different for each outstanding request.
Locked Reads
• Only the processor is permitted to initiate locked reads.
• They avoid race conditions on the bus through the use of semaphores.
Flow Control
• The completer needs to continuously report its available buffer space.
Ack/Nak Protocol
• This is the Data Link Layer replay mechanism.
• The replay buffer re-sends the stored packets when an error is reported.
Configuration
Overview
• This includes the space in which a Function’s
configuration registers are implemented, how a Function
is discovered, how configuration transactions are
generated and routed, the difference between PCI‐
compatible configuration space and PCIe extended
configuration space, and how software differentiates
between an Endpoint and a Bridge

What are Bus, Device and Function?
BDF

• Every PCIe Function is uniquely identified by the Device it resides within and the Bus to
which the Device connects. This unique identifier is commonly referred to as a ‘BDF’.
Configuration software is responsible for detecting every Bus, Device and Function (BDF)
within a given topology.

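The BDF described above is commonly packed into a 16-bit Requester ID: 8 bits of Bus, 5 bits of Device, 3 bits of Function (the bit widths appear later, on the ID Routing slide). A minimal sketch of that packing:

```python
# BDF packing into a 16-bit ID: 8-bit Bus, 5-bit Device, 3-bit Function.

def pack_bdf(bus, dev, fn):
    assert 0 <= bus < 256 and 0 <= dev < 32 and 0 <= fn < 8
    return (bus << 8) | (dev << 3) | fn

def unpack_bdf(bdf):
    return (bdf >> 8) & 0xFF, (bdf >> 3) & 0x1F, bdf & 0x7

rid = pack_bdf(4, 0, 0)   # Bus 4, Device 0, Function 0
print(hex(rid))           # 0x400
print(unpack_bdf(rid))    # (4, 0, 0)
```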
The System
of BDF

Specialties of BDF

• Enumeration uses a depth-first search.
• Multifunction devices exist.
• Identification of devices is simple (Device 0 versus the others).
• Multiple functions of a device operate in parallel.
Configuration Address Space
• Configuration registers are built into the card, enabling plug and play.
• Standardized configuration registers permit generic shrink-wrapped OSs to manage virtually all system resources.
• PCI contains 256 bytes of configuration space per function.
• When PCIe was introduced, there was not enough room in the original 256-byte configuration region to contain all the new capability structures needed. So the size of configuration space was expanded from 256 bytes per function to 4KB, called the Extended Configuration Space.
PCI configuration register space

4KB Configuration Space per PCI Express Function
• The extended space is only accessible through the enhanced configuration mechanism.
• There are 4 KB for each function.
Generating Configuration Transactions
• Processors are generally unable to perform configuration read and write requests directly, because they can only generate memory and IO requests. That means the Root Complex will need to translate certain of those accesses into configuration requests in support of this process. Configuration space can be accessed using either of two mechanisms:

1) The legacy PCI configuration mechanism, using IO-indirect accesses.

2) The enhanced configuration mechanism, using memory-mapped accesses.
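For the enhanced (memory-mapped) mechanism, each function's 4KB configuration space appears at a fixed offset from an ECAM base address: bus, device and function select bits 27:20, 19:15 and 14:12 of the offset. The base address below is a made-up example value; on a real platform it comes from the firmware (e.g., ACPI's MCFG table).

```python
# ECAM address computation sketch. ECAM_BASE is a hypothetical platform value.

ECAM_BASE = 0xE000_0000

def ecam_address(bus, dev, fn, offset):
    assert offset < 4096          # 4KB of config space per function
    return ECAM_BASE | (bus << 20) | (dev << 15) | (fn << 12) | offset

# Vendor ID register (offset 0) of Bus 4, Device 0, Function 0:
print(hex(ecam_address(4, 0, 0, 0)))   # 0xe0400000
```

An ordinary memory read of that address is what the Root Complex turns into a configuration request.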
Single-Host (Single-Root) System
Multi-Root System
• One bridge is active at enumeration time while the other is passive.
• Conflicts between the multiple enumeration threads are resolved by memory mapping.
Configuration Requests
Two request types, Type 0 or Type 1, may be generated by bridges in response to a configuration access:

Type 0 ----> the target bus matches the bridge's secondary bus number

Type 1 ----> the target bus does not match the secondary bus number
Field indication

Configuration Read Access

• A read from Bus 4, Device 0, Function 0 is taken as the example.
• The exact register address layout is vendor specific.
Enumeration - Discovering the Topology

• After a system reset or power up, configuration software has to scan the PCIe fabric to discover the machine topology and learn how the fabric is populated.
• The process of scanning the PCI Express fabric to discover its topology is referred to as the enumeration process.
• The configuration software executing on the processor normally discovers the existence of a Function by reading from its Vendor ID register.
• Two problems can arise here: the device is not ready, or the device is not present. Both are handled by the enumeration process.
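The Vendor ID probing described above can be sketched as a loop over every device and function on a bus. `cfg_read_vendor_id()` is a hypothetical stand-in for a real configuration read, and the sketch simplifies real enumeration (it does not check the multi-function bit in the Header Type register, and it does not recurse into bridges to assign subordinate bus numbers).

```python
# Enumeration sketch: an all-1s Vendor ID (0xFFFF) means no function responded.

def cfg_read_vendor_id(bus, dev, fn, topology):
    # Stand-in for a real config read; `topology` fakes the hardware.
    return topology.get((bus, dev, fn), 0xFFFF)

def enumerate_bus(bus, topology):
    found = []
    for dev in range(32):          # 5-bit device number
        for fn in range(8):        # 3-bit function number
            vid = cfg_read_vendor_id(bus, dev, fn, topology)
            if vid == 0xFFFF:
                if fn == 0:
                    break          # no Function 0 -> skip the whole device
                continue
            found.append((bus, dev, fn, vid))
    return found

# Fake topology: two single-function devices on Bus 0
topo = {(0, 0, 0): 0x8086, (0, 3, 0): 0x10EC}
print(enumerate_bus(0, topo))  # [(0, 0, 0, 0x8086), (0, 3, 0, 0x10EC)]
```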
Single‐Root System

Multi-Root Enumeration

Address Space and Transaction Routing
• All devices have internal registers and storage locations.
• These must be accessible from the outside world.
• Access is possible only if those locations are given addresses.
• In order to make this work, these internal locations need to be assigned addresses from one of the address spaces supported in the system.
• PCI Express supports the exact same three address spaces that were supported in PCI:
1) Configuration
2) Memory
3) IO
Memory and IO Maps
• These are not exclusive to endpoints.
• Switches and the root complex also have device-specific registers accessed via MMIO and IO addresses.
Prefetchable vs Non-Prefetchable Memory Space
• In prefetchable space, reads have no side effects, so a completer may speculatively return additional data beyond the request; that extra data can be useful in the near future, and no data is changed or lost when this takes place.
Base Address Registers (BARs)
• The address space a device requires is configured through the BAR registers in the configuration header.
• The lower bits are hardwired by the designer; the upper bits are programmed through software.
• BARs are set up sequentially, and their required sizes can also be discovered.
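The size discovery mentioned above works because the designer hardwires the low address bits to 0: software writes all 1s to the BAR, reads it back, masks off the low control bits, and the first writable bit reveals the decoded size. A minimal sketch of that read-back arithmetic (32-bit memory BAR):

```python
# BAR sizing sketch: size = NOT(readback with control bits masked) + 1.

def bar_size(readback, io_bar=False):
    mask = 0x3 if io_bar else 0xF        # low bits are type/control, not address
    base_bits = readback & ~mask & 0xFFFFFFFF
    return ((~base_bits) & 0xFFFFFFFF) + 1

# A device that hardwires the low 16 address bits to 0 decodes 64KB:
print(hex(bar_size(0xFFFF0000)))  # 0x10000
```

Software then writes the allocated base address back into the BAR.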
BAR Location in Devices

Base and Limit Registers
• These registers, in the Type 1 header, identify the range of addresses that live on the downstream side of a bridge.
• Three sets of registers are needed because there can be three separate address ranges living below a bridge:

1) Prefetchable Memory space (P-MMIO)

2) Non-Prefetchable Memory space (NP-MMIO)

3) IO space (IO)
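The routing decision a bridge makes with these registers is just a window test: forward a TLP downstream if its address falls between base and limit for the matching space. A minimal sketch (the window values are hypothetical; real base/limit registers also impose granularity, e.g. 1MB for the memory windows):

```python
# Bridge address-window check sketch.

def routes_downstream(addr, base, limit):
    """True if addr falls inside the bridge's downstream window."""
    return base <= addr <= limit

# Hypothetical NP-MMIO window 0xC000_0000 .. 0xC00F_FFFF below the bridge:
print(routes_downstream(0xC004_0000, 0xC000_0000, 0xC00F_FFFF))  # True
print(routes_downstream(0xB000_0000, 0xC000_0000, 0xC00F_FFFF))  # False
```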
Setting the Base and Limit Values
• The base and limit registers are configured for each of the three address spaces.
Prefetchable Range (P-MMIO) in the BAR Register
TLP Routing
• Since PCIe Links are point-to-point, routing is needed to deliver transactions between devices. Multi-port PCIe devices take on the duty of routing packets. If an ingress packet has no errors, the device will do one of the following:
1) Accept the traffic and use it internally
2) Forward the traffic to the appropriate outbound (egress) port
3) Reject the traffic because it is neither the intended target nor an interface to it (note that there are other reasons why traffic may be rejected)
Three Methods of TLP Routing
1) TLPs can be routed based on address (either memory or IO)
2) Based on ID (meaning Bus, Device, Function number)
3) Routed implicitly

Messages are the only TLP type that supports more than one routing method. Most of the message TLPs defined in the PCI Express spec use implicit routing; however, vendor-defined messages can use address routing or ID routing if desired. Messages were introduced in PCIe and are carried as in-band packets.
Header Fields
• The header defines the packet format and type.
• The format and type fields define the rest of the header layout as well as the routing method for the packet.
• Each TLP has a different set of values in the format and type fields.
• Encoding takes place according to these values.
Applying Routing Mechanisms
• Once the system addresses have been configured and transactions are enabled, devices examine incoming TLPs and use the corresponding configuration fields to route the packet:
1) ID routing
2) Address routing
3) Implicit routing
ID Routing
• Routing is done as a function of the BDF.
• Eight bits are given to the Bus number.
• Five bits for the Device number.
• Three bits for the Function number.
• Both 3DW and 4DW headers are possible.
Checks by Endpoints and Switches

• An endpoint makes a simple check, comparing the incoming BDF with the functions the device implements, and acts as the responder if one matches.
• Switches make two checks per port.
• This is done using the secondary and subordinate bus number values.
Address Routing
• When the Type field indicates address routing is to be used for a TLP, the Address Fields in the header are used to perform the routing check. These can be 32-bit addresses or 64-bit addresses.
• The endpoint address check is done against the BAR registers.
• An endpoint sits on a single bus, so only the Type 0 header format is taken into consideration.
Implicit Routing

• Implicit routing, used in some message packets, is based on the routing elements' awareness that the topology has upstream and downstream directions and a single Root Complex at the top.
• This is the simplest routing method; there is no need to assign a target address or ID.
• Power management, interrupt, and similar messages originate or terminate at a known source or target (typically the Root).
• Switch and endpoint checks are much simpler with implicit routing.
• Messages commonly use this method.
Transaction Layer

In PCI Express, high-level transactions originate in the device core of the transmitting device and terminate at the core of the receiving device. The Transaction Layer acts on these requests to assemble outbound TLPs in the Transmitter and interpret them at the Receiver.
Packet-Based Protocol
• Unlike the parallel buses, there are no side-band control signals on the link.
• So we can't tell what is happening on the link at a given time just by watching individual signals.
• Instead, a bit stream with an expected size and a recognizable format is used, so the receiver can interpret it.
• The packet format differs for the different types of actions performed.
Advantages

• Packet formats are well defined

• Framing symbols define packet boundaries

• A CRC protects the entire packet
TLP Assembly and Disassembly
• Based on the request, a header is generated and appended to the data payload; a Digest (ECRC) may then be created and combined with the two.
• The receiver's ability to accept the TLP is verified before it goes to the Data Link Layer.
• A sequence number is added in the Data Link Layer, and an LCRC computed over the sequence number and TLP is appended.
• The entire decoding is done in reverse on the receiver side.
Header format

Field summary

Transaction Descriptor Fields
• As transactions move between requester and completer, it's necessary to uniquely identify a transaction, since many split transactions may be queued up from the same requester at any instant.
• The key transaction attributes are not carried in a single field; collectively they are:
1) Transaction ID
2) Traffic Class
3) Transaction Attributes
There are also some additional rules concerning the data payload.
Specific TLP Formats:
1) IO Request
2) Memory Request
3) Configuration Request
4) Message Request
5) Completion

Memory Request
• The format differs for 3DW and 4DW headers.
Completion TLP
• Completions exist both with and without data.
• The Lower Address field gives the alignment of the first requested data byte.
• The Completion TLP also differs for different request types.
Message Request
• Messages replace many of the side-band signals of PCI.
• Most of the fields are not used, because the routing differs from other requests.
• A 4DW header is used.
• Depending on the signal being replaced, a different message code field value is used.
• Each message is associated with some rules.
• E.g., Power Management Messages don't have a data payload, so the Length field is reserved.
FLOW CONTROL

• Flow control is designed to ensure that transmitters never send Transaction Layer Packets (TLPs) that a receiver can't accept.
Flow Control Overview
• Transmission efficiency is improved with multiple VCs.
• PCIe supports up to 8 VCs.
• A credit-based mechanism is used.
Buffer Organization
• The flow control logic is a combination of two layers (Transaction and Data Link).
• Different request types are handled separately.
• Header credits and data credits are reported separately.
Flow Control Initialization
• This is done by the DLCMSM (Data Link Control and Management State Machine) as part of link training.
• When the link is ready, indicated by assertion of the LinkUp signal, the FSM proceeds through the two initialization states.
• Violations are treated as errors.
Flow Control Elements

• CC ---> Credits Consumed counter
• CL ---> Credit Limit counter
• Flow control gating logic decides whether the receiver has sufficient buffer space to accept the pending TLP.
• The check: CC plus the credits required by the pending TLP (PTLP) must not exceed CL.
• An error check is optional on the receiver side, done by comparing the credits received against the credits allocated.
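The CC + PTLP ≤ CL check above has to survive counter rollover, since the credit counters are small modular fields. A sketch assuming an 8-bit counter field (header credits use 8-bit fields; data credits use wider ones):

```python
# Credit-gating sketch with wrap-safe comparison over a modular counter field.

FIELD = 256  # 8-bit credit-counter space

def may_transmit(credit_limit, credits_consumed, ptlp_credits):
    # Wrap-safe test that (CC + PTLP) has not passed CL:
    # (CL - (CC + PTLP)) mod 2^n must be in the lower half of the field.
    needed = (credits_consumed + ptlp_credits) % FIELD
    return (credit_limit - needed) % FIELD < FIELD // 2

print(may_transmit(10, 8, 2))   # True: exactly reaches the limit
print(may_transmit(10, 8, 3))   # False: would overrun by one credit
```

The modular subtraction is what lets the comparison keep working after both counters wrap past 255.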
Quality of Service
• Mechanisms that support Quality of Service control the timing and bandwidth of different packets traversing the fabric. These mechanisms include application-specific software that assigns a priority value to every packet, and optional hardware that must be built into each device to enable managing transaction priority. The highest level of support is called "isochronous" service (equal time).
Traffic Classes and Virtual Channels
• The Traffic Class (TC) gives a priority to each packet. It differentiates packets and is located in the header as a 3-bit field.
• A Virtual Channel (VC) acts as a buffer that stores outgoing packets; VC0 through VC7 may exist.
• Each TC is mapped to a VC to achieve this operation (TC/VC mapping).
• The VC IDs and the number of VCs a device uses have to be configured before use.
VC Arbitration
• If a device has more than one VC and they all have a packet ready to send, VC arbitration determines the order of packet transmission.
• There are three types of VC arbitration:
1) Strict arbitration
2) Group arbitration
3) Hardware arbitration
Strict Priority
• Strict priority applies when a device has multiple VCs whose packets are ready at the same time: the higher-numbered VC always wins.
Group Arbitration
• One register setting defines the low-priority VC group and another the high-priority group.
• The high-priority group is served in strict-priority manner.
• The low-priority group uses one of two methods:
1) Hardware-based Fixed Arbitration
2) Weighted Round Robin (WRR) Arbitration
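WRR can be pictured as a rotating table in which each VC owns a number of slots proportional to its configured weight; the arbiter simply walks the table. A minimal sketch (the table-building order is an assumption, since the spec leaves the slot layout to the implementation):

```python
# Weighted Round Robin sketch for the low-priority VC group.

def build_wrr_table(weights):
    """weights: {vc_id: slot_count} -> flat arbitration table the arbiter walks."""
    table = []
    remaining = dict(weights)
    while any(remaining.values()):
        for vc in sorted(weights):       # interleave VCs for smoother service
            if remaining[vc] > 0:
                table.append(vc)
                remaining[vc] -= 1
    return table

# VC0 gets twice the bandwidth of VC1:
print(build_wrr_table({0: 2, 1: 1}))  # [0, 1, 0]
```

Each time the arbiter visits a slot, the owning VC may send one packet; empty VCs are skipped.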
Port Arbitration
• An egress port has to decide which ingress port's packet goes into a VC when multiple ports feed the same VC:
1) Asynchronous (the default)
2) Isochronous
• A vendor-specific Root Complex Register Block serves the arbitration duty for the memory target.
Port Arbitration Buffering
• Buffers are implemented between the ingress and egress ports; routing takes place according to the TC/VC mapping.
• Port arbitration is similar to VC arbitration; software takes a role here with different schemes: hardware-fixed arbitration and WRR arbitration.
Switch Arbitration
• If the indicated traffic class is not supported, the packet is treated as an error.
• The VC arbiter selects a packet only when sufficient credits are available at the receiver.
Transaction Ordering
• This is needed because different packets with the same TC may arrive in the same VC.
• Transactions with different TCs do not need this ordering mechanism (no ordering relationship exists between them).
• The ordering mechanism takes certain packet attributes into consideration.
Types of Ordering
1) Strong ordering
2) Weak ordering
3) Relaxed ordering

Strong ordering treats every TC with the same priority.

In weak ordering, wait states are implemented.
Relaxed ordering is done with the help of software, with some enhancements over weak ordering.
Ordering Rules

• The ordering rules make sure that synchronization is maintained between two TLPs while they are queued.
• The ordering based on packet type is given in the table: columns represent a previously arrived packet and rows represent a newly arrived packet. Outbound transactions are issued as per the table.
Relaxed Ordering
• Switches and the root complex apply relaxed ordering when the RO bit is seen set in a packet. This bit is controlled by software.
ID-Based Ordering (IDO)
• A TLP stream (packets from the same requester) has no ordering relationship with other streams.
DATA LINK LAYER

Roles
• Data Link Layer Packets (DLLPs) are used to support the Ack/Nak protocol, power management, the flow control mechanism, and can even be used for vendor-defined purposes.
Comparison with TLPs
• DLLPs are processed immediately at the receiver.
• There is no acknowledgement protocol for DLLPs; instead, a recovery mechanism exists.
• DLLPs arrive periodically; a missed one is simply replaced by the next update.
• DLLPs carry no routing information (they are local to the link).
Generic Format
• There are 4 types of DLLPs:
1) Ack/Nak
2) Power management
3) Flow control
4) Vendor-specific
Ack / Nak DLLPs format

Elements of the Ack/Nak Protocol

Transmitter Elements
1) NEXT_TRANSMIT_SEQ counter: a 12-bit counter that generates the sequence number assigned to each outgoing TLP.
2) LCRC Generator: generates the CRC over the sequence number, header and data.
3) Replay Buffer: stores the TLPs with their sequence number and LCRC, for retransmission if any error occurs at the receiver end.
4) REPLAY_TIMER: a watchdog timer that ensures an Ack or Nak is received for each transmitted TLP.
5) REPLAY_NUM counter: tracks the number of replay attempts.
Transmitter Elements (continued)
6) ACKD_SEQ register: holds the most recently acknowledged sequence number; comparing newly acknowledged numbers as "later than" this value (rather than numerically greater) confirms forward progress.
7) DLLP CRC Check: error checking and reporting takes place here; it is not capable of taking recovery action.
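The "later than, not greater than" comparison in element 6 matters because the 12-bit sequence counter wraps at 4095. A sketch of a wrap-aware comparison of the kind such logic needs (the helper name is illustrative):

```python
# Wrap-aware "later than" test for 12-bit sequence numbers: plain '>' fails
# once the NEXT_TRANSMIT_SEQ counter rolls over from 4095 back to 0.

MOD = 4096  # 12-bit sequence space

def is_later(new_seq, ackd_seq):
    return 0 < (new_seq - ackd_seq) % MOD < MOD // 2

print(is_later(5, 4094))    # True: 5 comes "after" 4094 across the rollover
print(is_later(4094, 5))    # False
```

Restricting the modular difference to less than half the sequence space is the standard way to keep "ahead" and "behind" unambiguous.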
Receiver Elements
• NEXT_RCV_SEQ holds the expected sequence number, and a latency timer provides some control over the timing of Ack or Nak transmission.
Ack/Nak Examples
Ack with Sequence Number Rollover
Transmitter's Response to a Nak
Physical Layer
Subdivision
• Both blocks have a separate transmitter as well as a receiver.
• The contents of the layers are conceptual and don't define precise logic blocks.
• Designers can implement them according to their needs.
• The layer is compatible with older versions:
• Gen1 - the first generation of PCIe (rev 1.x), operating at 2.5 GT/s
• Gen2 - the second generation (rev 2.x), operating at 5.0 GT/s
• Gen3 - the third generation (rev 3.x), operating at 8.0 GT/s
Transmit Logic Overview
• A buffer holds the packet stream while framing characters are added.
• Byte striping divides the character stream across the lanes.
• The scrambler is used to avoid electrical resonance on the link.
• At the bottom, the bits are converted from parallel to serial for transmission.
Receiver

• Here everything happens in reverse: the receiver undoes the work done in the transmitter.
Mux and Control Logic
• The Tx buffer contains the incoming data (TLPs and DLLPs).
• Control tokens are used to identify the boundaries of the packets.
• Ordered sets, built from control characters, are used to maintain link operation.
• Logical idles are used to keep the receiver's PLL locked to the transmitter frequency.
Byte Striping
• Striping means that each consecutive outbound character in a character stream is routed onto consecutive Lanes. The number of Lanes used is configured during the Link training process based on what is supported by both devices that share the Link.
• Packet format rules exist according to the link width in use.
• The width is set at link training time.
x4 Format Rules
• STP and SDP characters are always sent on Lane 0.
• END and EDB characters are always sent on Lane 3.
• When an ordered set such as SKIP is sent, it must appear on all lanes simultaneously.
• When Logical Idles are transmitted, they must be sent on all lanes simultaneously.
• Any violation of these rules may be reported as a Receiver Error to the Data Link Layer.
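The striping itself is just a modulo distribution of consecutive symbols across the lanes, which is why Lane 0 naturally carries the first symbol of each row. A minimal sketch:

```python
# Byte-striping sketch: consecutive characters go to consecutive lanes.

def stripe(chars, lanes=4):
    out = [[] for _ in range(lanes)]
    for i, ch in enumerate(chars):
        out[i % lanes].append(ch)   # symbol i lands on lane (i mod width)
    return out

# 8 symbols on a x4 link: lane 0 carries symbols 0 and 4, lane 1 carries 1 and 5...
print(stripe(list("ABCDEFGH")))
```

An STP placed at the start of the stream therefore always appears on Lane 0, consistent with the rules above.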
x4 Packet Format
Scrambler
• Scrambling avoids repetitive patterns by spreading energy over a wide range of frequencies.
• It reduces crosstalk between adjacent lanes.
• This is done using a scrambler algorithm.
• Scrambling can be disabled for test purposes.
• The scrambler must meet certain rules and criteria in its implementation.
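The scrambler algorithm is a 16-bit LFSR with polynomial G(X) = X^16 + X^5 + X^4 + X^3 + 1, whose output stream is XORed with the data bits. The sketch below models that idea but is not a bit-exact copy of the spec's tap and serialization ordering; note that because scrambling is a pure XOR, running the same byte through the same LFSR state twice recovers the original.

```python
# Scrambling-idea sketch with the Gen1/Gen2 polynomial (simplified ordering).

def scramble_byte(state, byte):
    """XOR one data byte with 8 successive LFSR output bits."""
    out = 0
    for i in range(8):
        out |= (((state >> 15) & 1) ^ ((byte >> i) & 1)) << i
        # Galois-form step for x^16 + x^5 + x^4 + x^3 + 1
        state = ((state << 1) ^ (0x0039 if state & 0x8000 else 0)) & 0xFFFF
    return state, out

state = 0xFFFF                       # LFSR seed value
state, scrambled = scramble_byte(state, 0x00)
# Descrambling is the same XOR against the same keystream:
_, recovered = scramble_byte(0xFFFF, scrambled)
print(hex(recovered))  # 0x0
```

The symmetry is why transmitter and receiver can run identical LFSRs and stay in step (as long as both skip the same control characters).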
8b/10b Encoding

Encoding Procedure

Control Characters
Receive Logic Details
• The receiver's work is just the opposite of the transmitter's.
• Clock recovery and the de-skew process are taken care of by the receiver.
• Code-violation and disparity checking, as well as byte un-striping, are also handled on the receiver side.
New Encoding Model in Gen3

Ordered Set Blocks
Data Block Frame Construction

• SDP
• DLLP
• IDL
These three can appear immediately after the 2-bit sync header.
• EDS
• EDB
These two mark the end of a Data Block.
Framing Tokens
• The TLP and DLLP framing have requirements on both the transmitter side and the receiver side.
• If these are not met by either side, it leads to a framing error.
• In Gen3 there is also a recovery option for framing errors.
Gen3 Physical Layer Transmit Logic
• The blocks remain the same as in Gen1 and Gen2, but the packet format has some changes; to achieve this, the blocks change their behavior slightly.
e.g., the 2-bit sync header is added by the mux logic.
Byte Striping x8 Example

• Packets simply straddle Block boundaries when necessary.
• The packet length for the next TLP is 23 DW, and that presents an interesting situation because only 20 dwords are available before the next Block boundary. When the Data Block ends, the transmitter sends the Sync header and continues TLP transmission during Symbol 0 of the next Block.
Ordered Sets
• The sync header changes to a new value to identify the block type.
• SKP Ordered Sets (SOS) need to follow some rules on both the transmitter and receiver sides.
Gen3 Physical Layer Receive Logic
• The receiver logic needs to undo the extra steps taken on the transmitter side.
• It needs to treat the sync header as a separate thing (it's not part of the data).
Link Initialization & Training
• Link initialization and training is a hardware-based (not software) process controlled by the Physical Layer. The process configures and initializes a device's link and port so that normal packet traffic proceeds on the link.
Some of the things done at link initialization and training time are:
1. Bit lock
2. Symbol lock
3. Block lock
Processes at link initialization time:
1. Link width
2. Lane reversal
3. Polarity inversion
4. Link data rate
5. Lane-to-lane de-skew
TS1 and TS2 Ordered Sets
• Link training involves exchanging ordered sets; these include Training Sequence 1 (TS1) and Training Sequence 2 (TS2).
• Their symbol formats differ for each version of the spec.
• Each symbol is assigned a specific function.
• The symbol functions also differ between TS1 and TS2.
Link Training and Status State Machine
(LTSSM)
• The LTSSM consists of 11 top‐level states: Detect, Polling,
Configuration, Recovery, L0, L0s, L1, L2, Hot Reset, Loopback, and
Disable. These can be grouped into five categories:
1. Link Training states
2. Re‐Training (Recovery) state
3. Software driven Power Management states
4. Active‐State Power Management (ASPM) states
5. Other states
137
LTSSM FSM
• Receiver detection happens in the
Detect state
• Re-training is also a kind of
recovery; it takes place when the
link changes its mode of operation
in any section of the link
• Here the receiver accomplishes some
of the work that is mandatory for
communication, eg) bit lock
• Some states are assigned to control
the power states of the link
138
State and specification
• Detect : electrically detect the receiver
• Polling : the receiver achieves bit lock and symbol lock using the
received TS1 and TS2 ordered sets
• Configuration : determines the parameters associated with the link
• L0 : the normal state of the link, in which TLPs and DLLPs are exchanged
• Recovery : the link re-trains to restore normal operation
• L1 : provides greater power savings by trading off exit latency
• L2 : main power is turned off to achieve power savings
139
State and specification
• Loopback : used for testing and validation; not needed for normal
operation
• Disable : disables the configured link
• Hot Reset : software resets the link by setting this bit
140
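The state descriptions above can be sketched as a minimal transition checker. This is a simplified, illustrative subset of legal transitions, not the spec's full LTSSM, which has many more arcs and substates.

```python
# Simplified subset of legal LTSSM transitions, for illustration only.
# The real state machine also includes L0s, Hot Reset, Loopback and
# Disable arcs, plus substates within each top-level state.
TRANSITIONS = {
    "Detect": {"Polling"},                    # receiver detected -> train
    "Polling": {"Configuration", "Detect"},   # bit/symbol lock achieved
    "Configuration": {"L0", "Detect"},        # width/lane numbers settled
    "L0": {"Recovery", "L0s", "L1", "L2"},    # normal traffic or power save
    "Recovery": {"L0", "Configuration", "Detect"},
    "L0s": {"L0"},
    "L1": {"Recovery"},                       # exit via re-training
    "L2": {"Detect"},                         # main power was off
}

def can_transition(src: str, dst: str) -> bool:
    """Check whether dst is a legal next state from src in this sketch."""
    return dst in TRANSITIONS.get(src, set())
```

For instance, L0 can enter Recovery for a speed change, but Detect can never jump straight to L0: the link must train through Polling and Configuration first.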
Detect
141
Polling
142
Configuration State
• The primary changes to the link
width and lane numbering are
done in this state (which can also
be re-entered after Recovery)
• The link and lane numbers are
shared in the TS1 and TS2
ordered sets
• The downstream port is termed
the leader and the upstream port
the follower
143
Designing Devices with
Links that can be Merged
• During Link Training, the LTSSM of each
Downstream Port determines which of
the supported connection options is
actually implemented.
144
Link configuration
example
• Here is a single Link that
implements lane widths of x4, x2, or
x1. The Lane number assignments
are fixed by the device internally
and must be sequential starting
from zero.
• The PAD symbol is used before
lane numbers are assigned
• The downstream LTSSM recognizes
that 4 lanes responded and that
they are part of the same link
145
Confirming Link and
Lane Numbers
• Using the TS2 training sequence, the
downstream port confirms that the
link and lane numbers set by each
port are equal
• Then it goes to the next state, L0
146
Lane reversal
• Here the Lane numbers of the Link
on the left match between the
Downstream and Upstream Port.
147
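The lane-reversal idea above can be sketched in one line of logic (my own illustration, not spec text): when the board wiring reverses the lane order, a port that supports lane reversal logically renumbers physical lane i as lane (width − 1 − i), so lane numbers still match across the link.

```python
def reverse_lanes(physical_lane: int, width: int) -> int:
    """Map a physical lane index to its logical lane number on a
    reversed link: lane i becomes lane (width - 1 - i)."""
    if not 0 <= physical_lane < width:
        raise ValueError("lane index out of range for this link width")
    return width - 1 - physical_lane
```

On a reversed x4 link, physical lane 0 carries logical lane 3 and physical lane 3 carries logical lane 0.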
Failed lane
• The downstream port will
wait for some time for lane 2.
• If it does not respond, the
downstream port will take action
accordingly
148
Failed lane problem solving
• The downstream port decides to
work with an x2 link; lanes 2
and 3 are treated as a second
link, and the link negotiation
process is repeated for them
• At the same time, link 1 completes
the process and is ready to
jump to the L0 state
149
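The width downgrade described above can be sketched as follows (an illustration under my own naming, not the spec's algorithm): pick the widest supported width whose lanes 0..w−1 all responded, leaving the remaining lanes free to train as a separate link.

```python
def negotiate_width(responding_lanes: set[int],
                    supported_widths=(8, 4, 2, 1)) -> int:
    """Pick the widest supported link width whose lanes 0..w-1 all
    responded during training; any leftover lanes may be trained
    as a separate link, as in the slide's x2 fallback example."""
    for w in supported_widths:
        if all(lane in responding_lanes for lane in range(w)):
            return w
    return 0  # no usable link at all
```

In the slide's scenario, lane 2 fails on a x4 candidate link, so only lanes {0, 1, 3} respond and the negotiation settles on x2.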
Detailed Configuration Substates
150
L0 State
• This is the normal, fully‐operational Link state, during which Logical
Idle, TLPs and DLLPs are exchanged between Link neighbors
• The next state after L0 is Recovery, entered if the link wants to
change speed
• The link partner can also initiate Recovery, as well as electrical
idle
151
Recovery State
• If both the TX and RX support a
higher data rate, after the L0 state
the FSM goes to the Recovery state
without changing the link and lane
numbers.
• The speed-change variable is set,
and speed negotiation restarts in
Recovery.RcvrLock
• Recovery.Equalization is used to
make the link stable at the higher
data rate
152
Link
Equalization
Overview
Link equalization is employed to mitigate
these signal impairments and ensure reliable
and accurate data transfer. It involves applying
specific algorithms and techniques to the
received signal to compensate for the effects
of attenuation and ISI. By adjusting the signal
characteristics, such as its voltage levels and
timing, link equalization helps to restore the
integrity of the transmitted data.
153
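As a generic illustration of the equalization idea (a simple 3-tap FIR filter; the tap values here are arbitrary examples, not PCIe equalization presets): each output sample is a weighted sum of the current and previous inputs, which lets the filter cancel a fraction of the inter-symbol interference (ISI) leaking from neighboring symbols.

```python
def equalize(samples: list[float], taps: list[float]) -> list[float]:
    """Apply a simple FIR equalizer: output[n] is the weighted sum of
    samples[n], samples[n-1], ... using the given tap coefficients.
    Negative pre/post-cursor taps subtract ISI from neighbors."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(taps):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out

# Illustrative pre-cursor / main-cursor / post-cursor weights
# (example values only, not from any PCIe preset table).
taps = [-0.1, 1.0, -0.2]
```

Feeding a unit impulse through the filter returns the tap values themselves, which is a handy way to see how each cursor weight shapes the transmitted symbol.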