Clocking Resources
User Guide
Overview
Virtex® UltraScale+™ devices provide the highest performance and integration capabilities
in a FinFET node, including both the highest serial I/O and signal processing bandwidth, as
well as the highest on-chip memory density. As the industry's most capable FPGA family,
the Virtex UltraScale+ devices are ideal for applications including 1+Tb/s networking and
data center and fully integrated radar/early-warning systems.
Virtex UltraScale devices provide the greatest performance and integration at 20 nm,
including serial I/O bandwidth and logic capacity. As the industry's only high-end FPGA at
the 20 nm process node, this family is ideal for applications including 400G networking,
large scale ASIC prototyping, and emulation.
Artix® UltraScale+ devices provide high serial bandwidth and signal compute density in a
cost-optimized device for critical networking applications, vision and video processing, and
secured connectivity. Coupled with the innovative InFO packaging, which provides excellent
thermal and power distribution, Artix UltraScale+ devices are perfectly suited to
applications requiring high compute density in a small footprint.
Zynq® UltraScale+ devices provide 64-bit processor scalability while combining real-time
control with soft and hard engines for graphics, video, waveform, and packet processing.
Integrating an Arm®-based system for advanced analytics and on-chip programmable
logic for task acceleration creates unlimited possibilities for applications including 5G
Wireless, next generation ADAS, and industrial Internet-of-Things.
This user guide describes the UltraScale architecture clocking resources and is part of the
UltraScale architecture documentation suite available at: www.xilinx.com/ultrascale.
Clocking Overview
This chapter provides an overview of clocking and a comparison between clocking in the
UltraScale architecture and previous FPGA generations. For detailed information on usage
of clocking resources, see Chapter 2, Clocking Resources and Chapter 3, Clock Management
Tile. For more information refer to the Clocking Guidelines section in the UltraFast Design
Methodology Guide for the Vivado Design Suite (UG949) [Ref 1].
• The device is subdivided into columns and rows of segmented clock regions (CRs). CRs
differ from previous families because they are arranged in tiles and do not span half the
width of a device. A CR contains configurable logic blocks (CLBs), DSP slices,
block RAMs, interconnect, and associated clocking. The height of a CR is 60 CLBs,
24 DSP slices, and 12 block RAMs with a horizontal clock spine (HCS) at its center. The
HCS contains the horizontal routing and distribution resources, leaf clock buffers, clock
network interconnections, and the root of the clock network. Clock buffers drive
directly into the HCS. There are 52 I/Os per bank and four gigabit transceivers (GTs)
that are pitch matched to the CRs. A core column contains configuration, System
Monitor (SYSMON), and PCIe® blocks to complete a basic device.
• Adjacent to the input/output block columns are the physical layer (PHY) blocks with
CMTs, global clock buffers, global clock multiplexing structures, and I/O logic
Each device has three types of global clock buffers: BUFGCTRL, BUFGCE, and BUFGCE_DIV. In
addition, there is a local BUFCE_LEAF clock buffer for driving leaf clocks from horizontal
distribution to various blocks in the device. BUFGCTRL has derivative software
representations of types BUFGMUX, BUFGMUX_1, BUFGMUX_CTRL, and BUFGCE_1. BUFGCE
is for glitchless clock gating and has software derivative BUFG (BUFGCE with clock enable
tied High). The global clock buffers drive routing and distribution tracks into the device
logic via HCS rows. There are 24 routing and 24 distribution tracks in each HCS row. There
is also a BUFG_GT that generates divided clocks for GT clocking. The clock buffers:
• Can be used as a clock enable circuit to enable or disable clocks either globally, locally,
or within a CR for fine-grained power control.
• Can be used as a glitch-free multiplexer to:
Chapter 2, Clocking Resources, has further details on global clocks, I/O, and GT clocking. It
also describes which clock routing resources to utilize for various applications.
CMT Overview
Each device has a CMT as part of the PHY next to each of the I/O banks. A CMT consists of
one MMCM and two PLLs. The MMCM is the primary block for frequency synthesis over a
wide range of frequencies. It also serves as a jitter filter for either external or internal clocks
and deskews clocks, among a wide range of other functions. The PLL’s primary purpose is to
provide clocking to the PHY I/Os, but it can also be used for clocking other resources in the
device in a limited fashion. The device clock input connectivity allows multiple resources to
provide the reference clock(s) to the MMCM and PLL.
MMCMs have infinite fine phase shift capability in either direction and can be used in
dynamic phase shift mode. MMCMs also have a fractional counter in either the feedback
path or in one output path, enabling further granularity of frequency synthesis capabilities.
The LogiCORE™ IP clocking wizard is available to assist in utilizing MMCMs and PLLs to
create clock networks in UltraScale architecture designs. The GUI is used to collect
clock network parameters. The clocking wizard chooses the appropriate CMT resource and
optimally configures the CMT resource and associated clock routing resources.
Chapter 3, Clock Management Tile, has further details on the CMT block features and
connectivity.
Clocking Resources
Overview
UltraScale™ architecture-based devices have several clock routing resources to support
various clocking schemes and requirements, including high fanout, short propagation
delay, and extremely low skew. To best utilize the clock routing resources, the designer must
understand how to get user clocks from the PCB to the UltraScale devices, decide which
clock routing resources are optimal, and then access those clock routing resources by
utilizing the appropriate I/O and clock buffers.
Each I/O bank is located in a single clock region and includes 52 I/O pins. Of the 52 I/O pins
in each I/O bank in every I/O column, there are four global clock input pin pairs (a total of
eight pins). Each global clock input:
Single-ended clock inputs must be assigned to the P (master) side of the GC input pin pair.
If a single-ended clock is connected to the P-side of a differential clock pin pair, the N-side
cannot be used as another single-ended clock pin—it can only be used as a user I/O. For pin
naming conventions, refer to the UltraScale Architecture Packaging and Pinout User Guide
(UG575) [Ref 2].
GC inputs can be used as regular I/O if not used as clocks. When used as regular I/O, global
clock input pins can be configured as any single-ended or differential I/O standard. GC
inputs can connect to the PHY adjacent to the banks they reside in.
Understanding the signal path for a global clock expands the understanding of the various
global clocking resources. The global clocking resources and network consist of these paths
and components:
Clock Structure
The basic device architecture is composed of blocks of CRs. CRs are organized into tiles and
thus build columns and rows. Each CR contains slices (CLBs), DSPs, and 36K block RAM
blocks. The mix of slice, DSP, and block RAM columns in each CR can be different, but are
always identical when stacked in the vertical direction, thus building columns of those
resources for the entire device. I/O and GT columns are then inserted with columns of CRs.
In addition, there is a single column that contains the configuration logic, SYSMON, and
PCIe blocks. An HCS runs horizontally through the device in the center of each row of CRs,
I/Os, and GTs. The HCS contains the horizontal routing and distribution tracks as well as leaf
clock buffers and clock network interconnects between horizontal/vertical routing and
distribution. Vertical tracks of routing and distribution connect all CRs in a column, while
vertical routing spans an entire I/O column. There are 24 horizontal routing and 24
distribution tracks (Figure 2-1), and 24 vertical routing and 24 distribution tracks
(Figure 2-2). The purpose of the clock routing resources is to route a clock from the global
clock buffers to a central point from where it is connected to the loads via the distribution
resources. This central point of the clock network is called a clock root in the UltraScale
architecture. The root can be in any CR in a device from where it is routed to the loads via
the clock distribution resources. This architecture optimizes clock skew. Routing and
distribution resources can either connect to adjacent CRs or be disconnected (isolated) at the
border of the CR as needed. This concept extends to SSI devices as well.
[Figure 2-1: Device column layout (I/O columns with PHY and clocking, CR columns, the CR column with PCIe, configuration, and SYSMON, and the GT column) crossed by an HCS carrying 24 horizontal routing tracks and 24 horizontal distribution tracks]
[Figure 2-2: The same device column layout with the vertical routing and distribution tracks]
The clocks can be distributed from their sources in one of two ways (Figure 2-3):
• The clocks can go onto routing tracks that take the clocks to a central point in a CR
without going to any loads. The clocks can then drive the distribution tracks
unidirectionally from which the clock networks fan out. In this way, the clock buffers
can drive to a specific point in the CRs from which the clocks travel vertically and
then horizontally on the distribution tracks to drive the clocking points. The clocking
points are driven via leaf clocks with clock enable (CE) in that CR and adjacent CRs, if
needed. Distribution tracks cannot drive routing tracks.
This distribution scheme is used to move the root for all the loads to be at a specific
location for improved, localized skew. Furthermore, both routing and distribution tracks
can drive into horizontally or vertically adjacent CRs in a segmented fashion. Routing
tracks can drive both routing and distribution tracks in the adjacent CRs while the
distribution tracks can drive other horizontal distribution tracks in adjacent CRs. The CR
boundary segmentation allows construction of either truly global, device-wide clock
networks or more local clock networks of variable sizes by reusing clocking tracks.
• Alternatively, clock buffers can drive straight onto the distribution tracks and distribute
the clock in that manner. This reduces the clock insertion delay.
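The drive rules stated above can be summarized as a small lookup. The following Python sketch is illustrative only: the names `ALLOWED_DRIVES` and `can_drive` are hypothetical, and the model covers only the track-to-track rules described in this section, not CR boundary segmentation or track availability.

```python
# Track-to-track drive rules as stated in the text (illustrative model only).
# The names below are hypothetical and are not part of any Vivado API.
ALLOWED_DRIVES = {
    ("routing", "routing"),            # routing can drive routing in adjacent CRs
    ("routing", "distribution"),       # routing feeds distribution at the clock root
    ("distribution", "distribution"),  # distribution can drive adjacent distribution
}


def can_drive(source: str, destination: str) -> bool:
    """Return True if a `source` track type may drive a `destination` track type."""
    return (source, destination) in ALLOWED_DRIVES


print(can_drive("routing", "distribution"))   # True
print(can_drive("distribution", "routing"))   # False
```

The only disallowed pairing is distribution driving routing, which is why a clock must reach its root on routing tracks before it fans out.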
[Figure 2-3: Clock routing and distribution within a CR. The HCS carries 24 horizontal routing and 24 horizontal distribution tracks that connect from/to the next CR or from the clock buffers. Vertical tracks connect from/to the CR below. BUFCE_LEAFs with CE drive the columns of CLBs, block RAMs, and DSPs above and below the HCS.]
• Each of the four bytes in the XIPHY BITSLICE has six connections from the HCS to its
global clocking pins. Therefore, only six BUFGs can drive the BITSLICE clocking pins in
either half of an I/O bank (a maximum of six clocks can drive any half of an I/O bank).
Clock Buffers
The PHY global clocking contains several sets of BUFGCTRLs, BUFGCEs, and BUFGCE_DIVs.
Each set can be driven by four GC pins from the adjacent bank, MMCMs, PLLs in the same
PHY, and interconnect. The clock buffers then drive the routing and distribution resources
across the entire device. Each PHY contains 24 BUFGCEs, 8 BUFGCTRLs, and 4 BUFGCE_DIVs
but only 24 of them can be used at the same time.
IMPORTANT: It is recommended to only allow the Vivado® Placer to assign all global clock buffers to
specific locations. Each CR contains 24 BUFGCEs, 8 BUFGCTRLs and 4 BUFGCE_DIVs. These clock
buffers share the 24 routing tracks and therefore collisions may occur resulting in unroutable designs.
If the design requires a number of global clock buffers to be in a certain CR then it is recommended to
attach the CLOCK_REGION property to these buffers instead of a specific LOCATION property.
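As a rough planning aid, the per-PHY buffer counts and the 24-buffer limit described above can be checked with a few lines of Python. This is a sketch under stated assumptions: the function name and the usage dictionary are hypothetical, and the check does not model track collisions, which are left to the Vivado placer.

```python
# Buffer sites per PHY and the simultaneous-use limit, as stated in the text.
SITES_PER_PHY = {"BUFGCE": 24, "BUFGCTRL": 8, "BUFGCE_DIV": 4}
MAX_USED_PER_PHY = 24


def check_phy_buffer_usage(used):
    """Return a list of problems for a hypothetical per-PHY usage dictionary."""
    problems = []
    for kind, count in used.items():
        limit = SITES_PER_PHY.get(kind)
        if limit is None:
            problems.append(f"unknown buffer type: {kind}")
        elif count > limit:
            problems.append(f"{kind}: {count} used, only {limit} sites")
    # Only known buffer types count toward the shared 24-buffer limit.
    total = sum(count for kind, count in used.items() if kind in SITES_PER_PHY)
    if total > MAX_USED_PER_PHY:
        problems.append(f"{total} buffers used, only {MAX_USED_PER_PHY} usable at once")
    return problems


print(check_phy_buffer_usage({"BUFGCE": 16, "BUFGCTRL": 4, "BUFGCE_DIV": 4}))  # []
print(check_phy_buffer_usage({"BUFGCE": 20, "BUFGCTRL": 8}))
```

The second call reports a problem because 28 buffers exceed the 24 that can be used at the same time, even though each type is within its own site count.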
In the clocking architecture, BUFGCTRL multiplexers and all derivatives can be cascaded to
adjacent clock buffers, effectively creating a ring of eight BUFGMUXes (BUFGCTRL
multiplexers). Figure 2-4 shows a simplified diagram of cascading BUFGCTRLs.
[Figure 2-4: Simplified diagram of cascading BUFGCTRLs]
BUFGCTRL
The BUFGCTRL primitive shown in Figure 2-5 can switch between two asynchronous clocks.
All other global clock buffer primitives are derived from certain configurations of
BUFGCTRL.
BUFGCTRL has four select lines, S0, S1, CE0, and CE1. It also has two additional control lines,
IGNORE0 and IGNORE1. These six control lines are used to control the inputs I0 and I1.
[Figure 2-5: BUFGCTRL primitive with clock inputs I0 and I1 and control inputs S0, S1, CE0, CE1, IGNORE0, and IGNORE1]
BUFGCTRL is designed to switch between two clock inputs without the possibility of a glitch.
When the presently selected clock transitions from High to Low after S0 and S1 change, the
output is kept Low until the other (to-be-selected) clock transitions from High to Low. Then,
the new clock starts driving the output. The default configuration for BUFGCTRL is
falling-edge sensitive and held at Low prior to the input switching. BUFGCTRL can also be
rising-edge sensitive and held at High prior to the input switching by using the INIT_OUT
attribute.
In some applications, the conditions previously described are not desirable. Asserting the
IGNORE pins causes the BUFGCTRL to bypass detection of the conditions for switching between
two clock inputs. In other words, asserting IGNORE causes the MUX to switch the inputs at
the instant the select pin changes. IGNORE0 causes the output to switch away from the I0
input immediately when the select pin changes, while IGNORE1 causes the output to switch
away from the I1 input immediately when the select pin changes.
Selection of an input clock requires a “select” pair (S0 and CE0, or S1 and CE1) to be
asserted High. If either S or clock enable (CE) is not asserted High, the desired input is not
selected. In normal operation, both S and CE pairs (all four select lines) are not expected to
be asserted High simultaneously. Typically, only one pin of a “select” pair is used as a select
line, while the other pin is tied High. The truth table is shown in Table 2-2.
Notes:
1. Old input refers to the valid input clock before this state is achieved.
2. For all other states, the output becomes the value of INIT_OUT and does not toggle.
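The selection rules described above can be sketched as a small Python function. This is an illustrative model, not a replacement for Table 2-2: the function name and return labels are hypothetical, and mapping the state with both select pairs High to the "old input" is an assumption based on Note 1. Edge sensitivity, IGNORE pins, and setup/hold timing are not modeled.

```python
def bufgctrl_select(ce0, s0, ce1, s1):
    """Return which source drives the BUFGCTRL output for a static pin state.

    A clock input is selected only when both pins of its select pair are High.
    """
    pair0 = bool(ce0) and bool(s0)   # select pair for I0
    pair1 = bool(ce1) and bool(s1)   # select pair for I1
    if pair0 and pair1:
        # Assumption: both pairs High keeps the previously valid input (Note 1).
        return "OLD_INPUT"
    if pair0:
        return "I0"
    if pair1:
        return "I1"
    # Per Note 2: all other states hold the output at INIT_OUT.
    return "INIT_OUT"


print(bufgctrl_select(ce0=1, s0=1, ce1=0, s1=1))  # I0
print(bufgctrl_select(ce0=0, s0=1, ce1=1, s1=1))  # I1
print(bufgctrl_select(ce0=0, s0=1, ce1=1, s1=0))  # INIT_OUT
```

In the typical usage described above, one pin of each pair is tied High, so the function reduces to a two-input multiplexer controlled by the remaining pins.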
Although both S and CE are used to select a desired output, only S is suggested for
glitch-free switching. This is because when using CE to switch clocks, the change in clock
selection can be faster than when using S. A violation in the setup/hold time of the CE pins
causes a glitch at the clock output. On the other hand, using the S pins allows the user to
switch between the two clock inputs without regard to setup/hold times. As a result, using
S to switch clocks does not result in a glitch. See BUFGMUX_CTRL, page 22.
The timing diagram in Figure 2-6 illustrates various clock switching conditions using the
BUFGCTRL primitives. Exact timing numbers are best found using the speed specification.
[Figure 2-6: BUFGCTRL timing diagram showing I0, I1, CE0, CE1, S0, S1, IGNORE0, and IGNORE1 across six numbered events, annotated with TBCCCK_CE and TBCCKO_O. The output starts at I0, begins I1, and then begins I0.]
• Pre-selection of the I0 and I1 inputs is made after configuration but before device
operation.
• The initial output after configuration can be selected as either High or Low.
• Clock selection using CE0 and CE1 only (S0 and S1 tied High) can change the clock
selection without waiting for a High-to-Low transition on the previously selected clock.
Notes:
1. Both PRESELECT attributes cannot be TRUE at the same time.
BUFGCE_1
BUFGCE_1 is a clock buffer with one clock input, one clock output, and a clock enable line.
This primitive is based on BUFGCTRL with some pins connected to logic High or Low.
Figure 2-7 illustrates the relationship of BUFGCE_1 and BUFGCTRL. The LOC constraint is
available for manually placing the BUFGCE_1 location. See the Vivado Design Suite User
Guide: Using Constraints (UG903) [Ref 4] for more information.
[Figure 2-7: BUFGCE_1 as BUFGCTRL, showing the BUFGCE_1 pins I, CE, and O mapped onto a BUFGCTRL whose remaining pins are tied to VDD or GND]
IMPORTANT: Because the clock enable line uses the CE pin of the BUFGCTRL, the select signal must
meet the setup time requirement. Violating this setup time can result in a glitch.
[Figure: BUFGCE_1 timing diagram showing BUFGCE_1(I), BUFGCE_1(CE), and BUFGCE_1(O), annotated with TBCCCK_CE and TBCCKO_O]
Figure 2-9 illustrates the relationship of BUFGMUX and BUFGCTRL. The LOC constraint is
available for manually placing the BUFGMUX and BUFGCTRL locations. See the Vivado
Design Suite User Guide: Using Constraints (UG903) [Ref 4] for more information.
[Figure 2-9: BUFGMUX as BUFGCTRL, showing the BUFGMUX pins I0, I1, S, and O mapped onto a BUFGCTRL whose remaining pins are tied to VDD or GND]
IMPORTANT: Because BUFGMUX uses the CE pins as select pins, when using the select, the setup time
requirement must be met. Violating this setup time can result in a glitch.
Switching conditions for BUFGMUX are the same as the CE pins on BUFGCTRL. Figure 2-10
illustrates the timing diagram for BUFGMUX.
[Figure 2-10: BUFGMUX timing diagram showing I0, I1, and O, annotated with TBCCCK_CE and TBCCKO_O. The output begins switching and then uses I1.]
BUFGMUX_1 is rising-edge sensitive and held at High prior to input switch. Figure 2-11
illustrates the timing diagram for BUFGMUX_1. The LOC constraint is available for manually
placing the BUFGMUX and BUFGMUX_1 locations. See the Vivado Design Suite User Guide:
Using Constraints (UG903) [Ref 4] for more information.
[Figure 2-11: BUFGMUX_1 timing diagram showing I0 and I1, annotated with TBCCCK_CE and TBCCKO_O]
BUFGMUX_CTRL
BUFGMUX_CTRL is a clock buffer with two clock inputs, one clock output, and a select line.
This primitive is based on BUFGCTRL with some pins connected to logic High or Low.
Figure 2-12 illustrates the relationship of BUFGMUX_CTRL and BUFGCTRL.
[Figure 2-12: BUFGMUX_CTRL as BUFGCTRL, showing the BUFGMUX_CTRL pins I0, I1, S, and O mapped onto a BUFGCTRL whose remaining pins are tied to VDD or GND]
The setup/hold requirements for S0 and S1 are with respect to the falling clock edge, not
the rising edge as for CE0 and CE1.
Switching conditions for BUFGMUX_CTRL are the same as the S pin of BUFGCTRL.
Figure 2-13 illustrates the timing diagram for BUFGMUX_CTRL.
[Figure 2-13: BUFGMUX_CTRL timing diagram showing I0, I1, and O, annotated with TBCCKO_O]
[Figure: Asynchronous MUX design example built from a BUFGCTRL with inputs I0 and I1, select S, and output O]
[Figure 2-15: Asynchronous MUX timing diagram showing I0 and I1, annotated with TBCCKO_O. The output starts at I0 and then begins I1.]
In Figure 2-15:
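[Figure: BUFGMUX_CTRL+CE design example built from a BUFGCTRL with inputs I0 and I1, select S, clock enable CE, and output O]
[Figure 2-17: BUFGMUX_CTRL+CE timing diagram showing I0, I1, S, and CE across three numbered events, annotated with TBCCCK_CE and TBCCKO_O. The output starts at I0, begins I1, and is then turned off.]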
In Figure 2-17:
[Figure: BUFGCE primitive with CE input]
[Figure: BUFGCE timing diagram showing I and CE, annotated with TBCCCK_CE and TBCCKO_O]
[Figure: BUFG primitive with input I and output O]
The BUFGCE_LEAF is documented for informational purposes only and is not user accessible in
the Vivado Design Suite (e.g., for instantiation, placement, etc.).
BUFGCE_DIV
BUFGCE_DIV is a clock buffer with one clock input (I), one clock output (O), one clear input
(CLR), and a clock enable (CE) input. BUFGCE_DIV has a single gated input and a reset, and it
can directly drive the routing and distribution resources. Its O
output is 0 when CLR is High (active). When CE is High, the I input is transferred to the O
output. CE is synchronous to the clock for glitch-free operation. CLR is asserted
asynchronously and deasserted synchronously. BUFGCE_DIV can also
divide the input clock by 1 to 8.
When CLR (reset) is deasserted, the output clock transitions from Low to High on the first
edge after the CLR is deasserted, regardless of the divide value. Therefore, BUFGCE_DIV
output clocks are always aligned, regardless of the divide value. The output clock then
toggles at the divided frequency. When CLR is asserted, the clock stops toggling after some
clock-to-out time. For an odd divide, the duty cycle is not 50% because the clock is High
one cycle less than it is Low. For example, for a divide value of 7, the clock is High for 3
cycles and Low for 4 cycles.
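The duty-cycle rule for odd divide values can be checked with a short Python sketch. The function name is hypothetical, and the sketch covers divide values 2 through 8 only. A divide value of 1 is excluded here as an assumption, because the whole-cycle High and Low counts described above do not apply to an undivided clock.

```python
def bufgce_div_high_low(divide):
    """Return (high_cycles, low_cycles) in input clock cycles for one output period.

    For an even divide the output is High and Low for divide/2 cycles each.
    For an odd divide the output is High one cycle less than it is Low.
    """
    if not 2 <= divide <= 8:
        raise ValueError("this sketch covers divide values 2 through 8 only")
    high = divide // 2        # rounds down for odd divide values
    low = divide - high       # remaining cycles of the period are Low
    return high, low


for n in (2, 7, 8):
    high, low = bufgce_div_high_low(n)
    print(f"divide by {n}: High {high} cycles, Low {low} cycles")
# divide by 7 gives High 3 cycles and Low 4 cycles, matching the example in the text.
```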
When CE is deasserted, the output stops at its current state, High or Low. When CE is
reasserted, the internal counter restarts from where it stopped. For example, if the divide
value is 8 and CE is deasserted two input clock cycles after the last output High transition,
the output stays High. Then when CE is reasserted, the output transitions Low after two
input clock cycles. If the reset input is used, upon assertion the output transitions Low
immediately if the current output is High, otherwise it stays Low.
Since reset is synchronously deasserted, when reset is deasserted in the previous example,
the output transitions High at the next input clock edge and transitions Low four input
clock cycles later.
IMPORTANT: In RFSoC devices, the ADC and DAC tiles replace the GTH transceivers that are present in
the MPSoC devices. Therefore, ADC and DAC utilize the existing BUFG_GT clock buffers to drive the
global clock trees in the device and then back into the ADC/DAC tiles from the fabric. However, the DIV
function cannot be used when connecting to the ADC/DAC clocks. Hence the BUFG_GT functions more
like a simple global clock buffer with CE and CLR.
IMPORTANT: For devices in the Zynq UltraScale+ family and selected devices in the Kintex UltraScale+ family
(XCKU9P and above), assigning the clock root in the same region as the BUFG_GT driver (X0 column)
can cause an unroutable situation and prevent the output clocks from reaching loads that are placed
in the clock regions to the right of the Zynq UltraScale+ device PS or Kintex UltraScale+ empty PL
regions in the Y0, Y1, and Y2 rows. To avoid the issue, assign the clock root one clock region
to the right, in this case the X1 column.
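[Figure: BUFG_GT and BUFG_GT_SYNC block diagram. The BUFG_GT takes the MGT or ADC/DAC clock on I, with DIV[2:0], CE, CEMASK, CLR, and CLRMASK inputs, and drives the clock output O. The BUFG_GT_SYNC takes CLK, CE (clock enable), and CLR (reset) and produces CESYNC and CLRSYNC.]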
When CLR (reset) is deasserted, the output transitions High at the next input clock edge and
transitions Low divide_value/2 input clock cycles later. Because reset is synchronously
deasserted, two clock cycles of synchronization latency need to be added to the output to
transition it to High. The next transition to Low then occurs four input clock cycles after that
(divide by 8). The output transitions to High a number of clock cycles later, determined by
the divide value specified, after which the output clock toggles at the divided frequency.
When CLR is asserted, the clock stops toggling at Low after some clock-to-out time. For an
odd divide, the duty cycle is not 50% because the clock is High one cycle less than it is Low.
For example, for a divide value of 7, the clock is High for 3 cycles and Low for 4 cycles.
When CE is deasserted, the output stops at its current state, High or Low. When CE is
reasserted, the internal counter restarts from where it stopped. For example, if the divide
value is 8 and CE is deasserted two input clock cycles after the last output High transition,
the output stays High. Then, when CE is reasserted, the output transitions Low four input
clock cycles later (two for synchronization and two to complete the High time period of the
output clock because of being a divide by 8). If the reset input is used, upon assertion the
output transitions Low immediately if the current output is High, otherwise it stays Low.
Because reset is synchronously deasserted, when reset is deasserted in the previous
example, the output transitions High two input clock cycles later due to synchronization
and transitions Low four input clock cycles after that (divide by 8).
The mask pins (CEMASK and CLRMASK) control how a specific, single BUFG_GT responds to
the CE/CLR control inputs. When a mask pin is deasserted, its respective control pin has
its normal function. When a mask pin is asserted, the respective control pin is ignored, in
effect allowing the clock to propagate through (i.e., CE is effectively High and reset is
effectively Low). The internal synchronizers phase align the clock outputs of the BUFG_GTs
that are not masked. Both edges of CE are synchronized while only the deassertion of reset
is synchronized. Assertion of reset immediately causes the output of the BUFG_GT to go
Low if it was previously High. This can cause a potential glitch or runt pulse. If this is not
acceptable, CE should be used to stop the output. A reset should then be asserted after two
input clock cycles plus half the “divide value.” This ensures that the output clock High time
(if the output clock happened to be disabled High) is no less than normal.
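The mask behavior and the recommended delay before asserting reset can be expressed in a few lines of Python. This is a simplified sketch: the function names are hypothetical, and the two-cycle synchronizers themselves are not modeled, only the static effect of the mask pins and the cycle count quoted above.

```python
def bufg_gt_effective_controls(ce, clr, cemask, clrmask):
    """Return (effective_ce, effective_clr) seen by a single BUFG_GT.

    An asserted mask makes the buffer ignore the matching control pin:
    CE is then effectively High and reset is effectively Low.
    """
    effective_ce = True if cemask else bool(ce)
    effective_clr = False if clrmask else bool(clr)
    return effective_ce, effective_clr


def cycles_before_reset(divide):
    """Input clock cycles to wait after stopping the clock with CE before reset:
    two cycles for synchronization plus half the divide value."""
    return 2 + divide / 2


print(bufg_gt_effective_controls(ce=0, clr=1, cemask=1, clrmask=1))  # (True, False)
print(bufg_gt_effective_controls(ce=0, clr=1, cemask=0, clrmask=0))  # (False, True)
print(cycles_before_reset(8))  # 6.0
```

With both masks asserted the clock propagates regardless of CE and CLR, which is how one BUFG_GT can be exempted from controls shared with others.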
IMPORTANT: While the synchronizers ensure that all BUFG_GTs driven by the same clock come out of
reset in phase, they might not be in phase with BUFG_GTs that have not been reset (i.e., that have their
reset mask asserted).
BUFG_PS
The BUFG_PS is a simple clock buffer with one clock input (I) and one clock output (O). This
clock buffer is a resource for the Zynq UltraScale+ MPSoC processor system (PS) and
provides access to the programmable logic (PL) clock routing resources for clocks from the
processor into the PL. Up to 18 PS clocks can drive the BUFG_PS. This clock buffer resides
next to the PS.
Overview
In UltraScale™ architecture-based devices, the clock management tile (CMT) includes a
mixed-mode clock manager (MMCM) and two phase-locked loops (PLLs). The main purpose
of the PLL is to generate clocking for the I/Os, but it also contains a limited subset of the
MMCM functions that can be used for general clocking purposes.
The clock input connectivity allows multiple resources to provide the reference clock(s) to
the MMCM. The number of output counters (dividers) is eight, with some of them capable
of driving out an inverted clock signal (180° phase shift). MMCMs have infinite fine phase
shift capability in either direction and can be used in dynamic phase shift mode. The
resolution of the fine phase shift depends on the voltage-controlled oscillator (VCO)
frequency. Fractional divide functionality in increments of 1/8th (0.125) for CLKFBOUT and
CLKOUT0 is available to support greater clock frequency synthesis capability. UltraScale
architecture-based devices have a spread spectrum (SS) capability. If the MMCM
spread-spectrum feature is not used, a spread spectrum on an external input clock is not
filtered and is thus passed on to the output clock.
MMCMs
UltraScale architecture-based devices contain one CMT per I/O bank. The MMCMs serve as
frequency synthesizers for a wide range of frequencies, as jitter filters for either
external or internal clocks, and to deskew clocks.
Input multiplexers select the reference and feedback clocks from either the global clock
I/Os or the clock routing or distribution resources. Each clock input has a programmable
counter divider (D). The phase-frequency detector (PFD) compares both phase and
frequency of the rising edges of both the input (reference) clock and the feedback clock. If
a minimum High/Low pulse is maintained, the duty cycle is ancillary. The PFD is used to
generate a signal proportional to the phase and frequency between the two clocks. This
signal drives the charge pump (CP) and loop filter (LF) to generate a reference voltage to
the VCO. The PFD produces an up or down signal to the charge pump and loop filter to
determine whether the VCO should operate at a higher or lower frequency. When VCO
operates at a frequency that is too high, the PFD activates a down signal causing the control
voltage to be reduced, thus decreasing the VCO operating frequency. When the VCO
operates at a frequency that is too low, an up signal increases voltage. The VCO produces
eight output phases and one variable phase for fine-phase shifting. Each output phase can
be selected as the reference clock to the output counters (Figure 3-1). Each counter can be
independently programmed for a given customer design. A special counter M is also
provided. This counter controls the feedback clock of the MMCM, allowing a wide range of
frequency synthesis.
In addition to integer divide output counters, MMCMs add a fractional counter for
CLKOUT0 and CLKFBOUT.
[Figure 3-1: MMCM block diagram. CLKIN1 and CLKIN2 pass through the D counter to the PFD, CP, LF, and VCO, with lock detect and lock monitor circuits. The VCO feeds the fractional-divide counter for CLKOUT0 and counters O1 through O6, which drive CLKOUT0 through CLKOUT6 (with inverted outputs CLKOUT0B through CLKOUT3B), and the M counter (fractional divide), which drives CLKFBOUT and CLKFBOUTB.]
[Figure: MMCME3_BASE or MMCME4_BASE and MMCME3_ADV or MMCME4_ADV primitives]
The MMCME#_BASE primitives provide access to the most frequently used features of a
stand-alone MMCM. Clock deskew, frequency synthesis, coarse phase shifting, and duty
cycle programming are available to use with the MMCME#_BASE. The ports are listed in
Table 3-1.
The MMCME#_ADV primitive provides access to all MMCME#_BASE features plus additional
ports for clock switching, access to the dynamic reconfiguration port (DRP), and dynamic
fine-phase shifting. The MMCME#_ADV ports are listed in Table 3-2.
The MMCM is a mixed-signal block designed to support clock network deskew, frequency
synthesis, and jitter reduction. These three modes of operation are discussed in more detail
in this section. The VCO operating frequency can be determined by using the following
relationship:
$$F_{VCO} = F_{CLKIN} \times \frac{M}{D} \qquad \text{(Equation 3-1)}$$
$$F_{OUT} = F_{CLKIN} \times \frac{M}{D \times O} \qquad \text{(Equation 3-2)}$$
where the M, D, and O counters are shown in Figure 3-2, page 35. The value of M
corresponds to the CLKFBOUT_MULT_F setting, the value of D to the DIVCLK_DIVIDE, and O
to the CLKOUT_DIVIDE.
The seven “O” counters can be independently programmed. For example, O0 can be
programmed to do a divide-by-two while O1 is programmed for a divide-by-three. The only
constraint is that the VCO operating frequency must be the same for all the output counters
because a single VCO drives all the counters.
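Equations 3-1 and 3-2 can be evaluated with a short Python sketch. The function name is hypothetical, and the values used (a 33 MHz reference with D = 1, M = 32, and output dividers 2, 4, 6, 8, 16, and 32) are taken from the example configuration in this section. The sketch does not check the VCO operating range, which must be taken from the device data sheet.

```python
def mmcm_frequencies(f_clkin_mhz, m, d, o_dividers):
    """Return (f_vco_mhz, [f_out_mhz, ...]) from Equations 3-1 and 3-2."""
    f_vco = f_clkin_mhz * m / d                   # Equation 3-1
    f_outs = [f_vco / o for o in o_dividers]      # Equation 3-2: F_CLKIN * M / (D * O)
    return f_vco, f_outs


f_vco, f_outs = mmcm_frequencies(33.0, m=32, d=1, o_dividers=[2, 4, 6, 8, 16, 32])
print(f"VCO: {f_vco} MHz")
for index, f_out in enumerate(f_outs):
    print(f"O{index}: {f_out} MHz")
```

Because every output is derived from the same VCO frequency, the O4 = 16 and O5 = 32 dividers yield the 66 MHz and 33 MHz interface clocks of the example.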
[Figure 3-2: Example MMCM configuration. A 33 MHz reference clock with D = 1 and M = 32 feeds the PFD, CP, LF, and VCO. Output counters: O0 = 2 (processor), O1 = 4 (gasket), O2 = 6 (CLBs), O3 = 8 (memory interface), O4 = 16 (66 MHz interface), O5 = 32 (33 MHz interface), O6 = 1 (not used).]
When using the fractional divider, the duty cycle is not programmable for outputs used in
the fractional mode.
Jitter Filter
MMCMs reduce the jitter inherent on a reference clock. The MMCM can be instantiated as
a stand-alone function to only support filtering jitter from an external clock before it is
driven into another block. As a jitter filter, it is usually assumed that the MMCM acts as a
buffer and regenerates the input frequency on the output (e.g., FIN = 100 MHz,
FOUT = 100 MHz). In general, greater jitter filtering is possible by using the MMCM attribute
BANDWIDTH set to Low. Setting the BANDWIDTH to Low can incur an increase in the static
offset of the MMCM.
Limitations
The MMCM has some restrictions that must be adhered to. These are summarized in the
MMCM electrical specifications in the UltraScale device data sheets [Ref 5]. In general, the
major limitations are VCO operation range, input frequency, duty cycle programmability,
and phase shift. In addition, there are connectivity limitations to other clocking elements
(pins, GTs, and clock buffers). Cascading MMCMs can only occur through the clock routing
network.
Phase Shift
In many cases, there needs to be a phase shift between clocks. The MMCM has multiple
options to implement phase shifting. Static phase shifting can be achieved by selecting one
of the eight VCO output phases with additional fine phase shifting available in the CLKOUT
output counters depending on the CLKOUT divide value. There is also an interpolated phase
shifting capability in either fixed or dynamic mode. The MMCM phase shifting capabilities
are very powerful, which can lead to complex scenarios. By using the Clocking Wizard, the
allowable phase shift values are determined based on the MMCM configuration settings.
The static phase shift (SPS) resolution in time units is defined as:
$$\mathrm{SPS} = \frac{1}{8 F_{VCO}} \quad \text{or} \quad \frac{D}{8 M F_{IN}} \qquad \text{(Equation 3-3)}$$
The VCO provides eight phase-shifted clocks spaced 45° apart, so settings for 0°, 45°, 90°,
135°, 180°, 225°, 270°, and 315° of phase shift are always available. The higher
the VCO frequency is, the smaller the phase shift resolution. Because the VCO has a distinct
operating range, the phase shift resolution is bounded between
$$\frac{1}{8 F_{VCOMIN}} \quad \text{and} \quad \frac{1}{8 F_{VCOMAX}}$$
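As a rough illustration of these bounds, the sketch below evaluates 1/(8 F_VCO) at both ends of a VCO range. The 600 MHz and 1600 MHz limits are example values only, and the function name is hypothetical. Consult the data sheet for the actual limits of a given device.

```python
def static_phase_shift_resolution_ps(f_vco_mhz):
    """Return the static phase shift resolution 1/(8 * F_VCO) in picoseconds."""
    vco_period_ps = 1e6 / f_vco_mhz      # period of a clock given in MHz, in ps
    return vco_period_ps / 8.0           # eight VCO phases, 45 degrees apart


# Assumed example VCO range; not a device specification.
coarsest_ps = static_phase_shift_resolution_ps(600.0)
finest_ps = static_phase_shift_resolution_ps(1600.0)
print(f"resolution between {finest_ps:.3f} ps and {coarsest_ps:.3f} ps")
```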
$$\text{Maximum Phase Shift} = \frac{63}{\mathrm{CLKOUT\_DIVIDE}} \times 360 + 7 \times \text{Phase Shift Value} \qquad \text{(Equation 3-4)}$$
It is possible to phase shift the CLKFBOUT feedback clock. In that case, all CLKOUT output
clocks are negatively phase shifted with respect to CLKIN.
The two fractional counters (CLKFBOUT and CLKOUT0) also have static phase shift
capability. A phase shift step is defined as:
$$\mathrm{SPS}_{frac} = \frac{360}{8 \times \text{fractional\_divide\_value}} \quad \text{or} \quad \frac{45}{\text{fractional\_divide\_value}} \qquad \text{(Equation 3-5)}$$
For example, if the fractional divide value is 2.125, a static phase shift step is
360/(2.125 x 8) = 21.176 degrees.
Interpolated fine phase shift (IFPS) mode in the MMCM has linear shift behavior
independent of the CLKOUT_DIVIDE value, and the phase shift resolution only depends on
the VCO frequency. In this mode, the output clocks can be rotated 360° round robin
in linear increments of $\frac{1}{56 F_{VCO}}$.
If the VCO runs at 600 MHz, the phase resolution is approximately (rounded) 30 ps, and at
1.6 GHz is approximately (rounded) 11 ps.
When using fine phase shift, no initial phase shift value can be programmed during
configuration. The phase always starts at zero and can
then be dynamically incremented or decremented. The dynamic phase shift is controlled by
the PS interface of the MMCME#_ADV. This phase shift mode equally affects all CLKOUT
output clocks that are selected for this mode by setting the USE_FINE_PS attribute to TRUE.
Regardless of whether the interpolated fine phase shift mode is fixed or dynamic, a
clock must always be connected to the PSCLK pin of the MMCM. Each
individual CLKOUT counter can independently select the interpolated phase shift, the
previously described static phase shift mode, or none. Fractional divide is not allowed in
either fixed or dynamic interpolated fine phase shift mode. Fixed or dynamic phase shifting
of the feedback path results in a negative phase shift of all output clocks with respect to
CLKIN. The dynamic phase shift interface cannot be used when the phase shift mode is set
to fixed.
The variable phase shift is controlled by the PSEN, PSINCDEC, PSCLK, and PSDONE ports
(Figure 3-4). The phase of the MMCM output clock(s) increments/decrements according to
the interaction of PSEN, PSINCDEC, PSCLK, and PSDONE from the initial or previously
performed dynamic phase shift. PSEN, PSINCDEC, and PSDONE are synchronous to PSCLK.
When PSEN is asserted for one PSCLK clock period, a phase shift increment/decrement is
initiated. When PSINCDEC is High, an increment is initiated and when PSINCDEC is Low, a
decrement is initiated. Each increment adds to the phase shift of the MMCM clock outputs
by 1/56th of the VCO period. Similarly, each decrement decreases the phase shift by 1/56th
of the VCO period. PSEN must be active for one PSCLK period. PSDONE is High for exactly
one clock period when the phase shift is complete. The number of PSCLK cycles is
deterministic (12 PSCLK cycles). After initiating the phase shift by asserting PSEN and the
completion of the phase shift signaled by PSDONE, the MMCM output clocks gradually drift
from their original phase shift to an increment/decrement phase shift in a linear fashion.
The completion of the increment or decrement is signaled when PSDONE asserts High. After
PSDONE has pulsed High, another increment/decrement can be initiated. There is no
maximum phase shift or phase shift overflow. An entire clock period (360°) can always be
phase shifted regardless of frequency. When the end of the period is reached, the phase
shift wraps around round-robin style.
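The arithmetic behind a dynamic phase shift can be sketched as follows. This is an illustrative planning helper, not part of any Xilinx tool. The function name and return layout are assumptions, and the cycle count is only a lower bound that ignores any idle cycles between PSDONE and the next PSEN pulse.

```python
PSCLK_CYCLES_PER_STEP = 12  # per the text: each shift completes in 12 PSCLK cycles


def fine_phase_shift_plan(shift_ps, f_vco_mhz):
    """Plan a dynamic phase shift of roughly shift_ps picoseconds.

    Returns (pulses, psincdec, achieved_ps, min_psclk_cycles).
    """
    step_ps = 1e6 / (56.0 * f_vco_mhz)        # one step = 1/56 of the VCO period
    pulses = round(abs(shift_ps) / step_ps)   # number of one-cycle PSEN pulses
    psincdec = 1 if shift_ps >= 0 else 0      # High = increment, Low = decrement
    achieved_ps = pulses * step_ps if psincdec else -pulses * step_ps
    return pulses, psincdec, achieved_ps, pulses * PSCLK_CYCLES_PER_STEP


# Example: request +250 ps with the VCO assumed to run at 1000 MHz.
pulses, psincdec, achieved_ps, cycles = fine_phase_shift_plan(250.0, 1000.0)
print(pulses, psincdec, round(achieved_ps, 3), cycles)
```

Because the phase wraps around after a full period, a request larger than one output clock period lands at the same phase as the request reduced modulo that period.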
[Figure 3-4: timing diagram of the dynamic phase shift interface signals PSCLK, PSEN, PSDONE, and PSINCDEC]
CLKOUT ports not affected by the CDDC change continue to function uninterrupted during
this operation and maintain their phase relationship to each other (Figure 3-5). However,
the output clocks (ports) that were changed via the CDDC procedure are not phase aligned
(synchronized) to the other output clocks not affected by CDDCREQ. Clocks affected by
CDDCREQ should not be used after the signal has been asserted because the output might
glitch and the clocks stop toggling. This feature is not available in fractional mode.
[Figure 3-5: timing diagram of a CDDC operation showing CLKOUT (no CDDC change), CLKOUT, CDDCREQ, DRDY (last DRP access done), and CDDCDONE]
MMCM Programming
Programming of the MMCM must follow a set flow to ensure configuration that guarantees
stability and performance. This section describes how to program the MMCM based on
certain design requirements. A design can be implemented in two ways: through
the GUI (the Clocking Wizard) or by instantiating the MMCM directly.
Regardless of the method selected, the following information is necessary to program the
MMCM:
As an example, consider FIN = 100 MHz, F VCO = between 600 MHz and 1600 MHz, and
FPFD = between 10 MHz and 550 MHz.
For a FPFDMIN of 10 MHz, the value of D can only be between 1 and 10.
The constraints used to determine the allowed M and D values are shown in these
equations:
$$D_{MIN} = \mathrm{roundup}\left(\frac{f_{IN}}{f_{PFDMAX}}\right) \qquad \text{(Equation 3-6)}$$
$$D_{MAX} = \mathrm{rounddown}\left(\frac{f_{IN}}{f_{PFDMIN}}\right) \qquad \text{(Equation 3-7)}$$
$$M_{MIN} = \mathrm{roundup}\left(\frac{f_{VCOMIN}}{f_{IN}} \times D_{MIN}\right) \qquad \text{(Equation 3-8)}$$
$$M_{MAX} = \mathrm{rounddown}\left(\frac{f_{VCOMAX}}{f_{IN}} \times D_{MAX}\right) \qquad \text{(Equation 3-9)}$$
$$M_{IDEAL} = \frac{D_{MIN} \times f_{VCOMAX}}{f_{IN}} \qquad \text{(Equation 3-10)}$$
The goal is to find the M value closest to the ideal operating point of the VCO. The
minimum D value is used to start the process. The goal is to make D and M values as small
as possible while keeping VCO as high as possible.
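The constraint equations can be checked numerically with a short sketch like the one below. It uses the example limits quoted earlier (FIN = 100 MHz, PFD range 10 MHz to 550 MHz, VCO range 600 MHz to 1600 MHz). The function name is hypothetical, and the result does not account for the legal attribute ranges of M and D on a particular device, which restrict the choice further.

```python
from fractions import Fraction
from math import ceil, floor


def mmcm_divider_limits(f_in, f_pfd_min, f_pfd_max, f_vco_min, f_vco_max):
    """Evaluate Equations 3-6 through 3-10 (all frequencies in MHz)."""
    f_in = Fraction(f_in)                                # exact arithmetic avoids float rounding
    d_min = ceil(f_in / f_pfd_max)                       # Equation 3-6
    d_max = floor(f_in / f_pfd_min)                      # Equation 3-7
    m_min = ceil(Fraction(f_vco_min) / f_in * d_min)     # Equation 3-8
    m_max = floor(Fraction(f_vco_max) / f_in * d_max)    # Equation 3-9
    m_ideal = Fraction(f_vco_max) * d_min / f_in         # Equation 3-10
    return d_min, d_max, m_min, m_max, m_ideal


d_min, d_max, m_min, m_max, m_ideal = mmcm_divider_limits(100, 10, 550, 600, 1600)
print(d_min, d_max, m_min, m_max, float(m_ideal))
```

With these inputs, D ranges from 1 to 10, which agrees with the statement above for an FPFDMIN of 10 MHz.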
MMCM Ports
Table 3-3 summarizes the MMCM ports.
Notes:
1. All control and status signals except PSINCDEC are active High.
TIP: The port names generated by the clocking wizard can differ from the port names used on the
primitive.
CLKIN1 can be driven by a global clock I/O directly when in the same bank adjacent to the
PHY tile.
CLKIN2 can be driven by a global clock I/O directly when in the same bank adjacent to the
PHY tile.
CLKFBIN must be connected either directly to the CLKFBOUT for internal feedback, or to the
CLKFBOUT via a BUFG for clock buffer feedback matching, or IBUFG (through a global clock
pin for external deskew) or interconnect (not recommended). For clock alignment, the
feedback path clock buffer type should match the forward clock buffer type.
IMPORTANT: The internal compensation mode setting is determined by a direct connection (wire) from
the CLKFBOUT to the CLKFBIN port in the source. Synthesis optimizes this connection away,
so the CLKFBOUT to CLKFBIN connection is removed from all subsequent representations in the
Vivado Design Suite. However, the INTERNAL compensation attribute attached to the MMCM/PLL
indicates that the compensation is still internal to the MMCM/PLL.
For possible configuration of CLKFBOUT, see MMCM Use Models, page 56. CLKFBOUT can
also drive logic if the feedback path contains a clock buffer.
CLKFBOUTB should not be used for feedback. It provides an additional, inverted CLKFBOUT
output clock. CLKFBOUTB can drive logic if the feedback path contains a clock buffer.
The CLKINSEL signal controls the state of the clock input multiplexers. High = CLKIN1, Low
= CLKIN2 (see Reference Clock Switching, page 56). The MMCM must be held in RESET
during clock switchover.
The RST signal is an asynchronous reset for the MMCM. The MMCM is synchronously
re-enabled when this signal is deasserted.
The PWRDWN signal powers down instantiated but currently unused MMCMs. This mode can be used
to save power for temporarily inactive portions of the design and/or MMCMs that are not
active in certain system configurations. No MMCM power is consumed in this mode.
The dynamic reconfiguration data input (DI) bus provides reconfiguration data. The value of
this bus is written to the configuration cells. The data is presented in the cycle that DEN and
DWE are active. The data is captured in a shadow register and written at a later time. DRDY
indicates when the DRP port is ready to accept another write. When not used, all bits must
be set to zero.
The dynamic reconfiguration write enable (DWE) input pin provides the write/read enable
control signal to write the DI data into or read the DO data from the DADDR address. When
not used, it must be tied Low.
The dynamic reconfiguration enable strobe (DEN) provides the enable control signal to
access the dynamic reconfiguration feature and enables all DRP port operations. When the
dynamic reconfiguration feature is not used, DEN must be tied Low.
The DCLK signal is the reference clock for the dynamic reconfiguration port. The rising edge
of this signal is the timing reference for all other port signals. The setup time is specified in
the UltraScale device data sheets [Ref 5]. There is no hold time requirement for the other
input signals relative to the rising edge of the DCLK. The pin can be driven by an IBUF,
IBUFG, BUFGCE, or BUFGCTRL. There are no dedicated connections to this clock input.
The PSCLK input pin provides the source clock for the dynamic phase shift interface. All other
inputs are synchronous to the positive edge of this clock. The pin can be driven by an IBUF,
IBUFG, BUFG, or BUFGCE. There are no dedicated connections to this clock input.
The PSINCDEC input signal synchronously indicates if the dynamic phase shift is an increment or
decrement operation (positive or negative phase shift). PSINCDEC is asserted High for
increment and Low for decrement. There is no phase shift overflow associated with the
dynamic phase shift operation. If 360° or more are shifted, the phase wraps around, starting
at the original phase.
RECOMMENDED: CLKOUT0 is first used to place the root clock point. In ZHOLD mode, it is used to set
the compensation. Therefore, Xilinx recommends using CLKOUT0 as the main clock.
For possible configurations, see MMCM Use Models, page 56. In the MMCM, CLKOUT0 and
CLKFBOUT can be used in fractional divide mode. All CLKOUT outputs can be used in
non-fractional mode to provide a static or dynamic phase shift. In fractional mode, only
fixed phase shift is allowed. See Static Phase Shift Mode (MMCM and PLL), page 39 for more
information.
CLKINSTOPPED is a status pin indicating that the input clock has stopped. This signal is asserted within
two CLKFBOUT clock cycles of clock stoppage. The signal is deasserted after the clock has
restarted and LOCKED is achieved, or the clock is switched to the alternate clock input and
the MMCM has re-locked.
This is a status pin indicating that the feedback clock has stopped. CLKFBSTOPPED is
asserted within one clock cycle of clock stoppage. The signal is deasserted after the
feedback clock has restarted and the MMCM has re-locked.
LOCKED
This is an output from the MMCM used to indicate when the MMCM has achieved phase
and frequency alignment of the reference clock and the feedback clock at the input pins.
Phase alignment is within a predefined window and frequency matching within a
predefined PPM range. The MMCM automatically locks after power on; no extra reset is
required. LOCKED is deasserted within one PFD clock cycle if the input clock stops, the
phase alignment is violated (e.g., input clock phase shift), or the frequency has changed.
The MMCM must be reset when LOCKED is deasserted. The clock outputs should not be
used prior to the assertion of LOCKED.
The dynamic reconfiguration output (DO) bus provides MMCM data output when using dynamic
reconfiguration. If DWE is inactive while DEN is active at the rising edge of DCLK, this bus
holds the content of the configuration cells addressed by DADDR. The DO bus must be
captured on the rising edge of DCLK when DRDY is active. The DO bus value is held until the
next DRP operation.
The dynamic reconfiguration ready output (DRDY) provides the response to the DEN signal
for the MMCM’s dynamic reconfiguration feature. This signal indicates that a DEN/DCLK
operation has completed.
The phase shift done output signal is synchronous to the PSCLK. When the current phase
shift operation is completed, the PSDONE signal is asserted for one clock cycle indicating
that a new phase shift cycle can be initiated.
CDDCREQ is a request signal for dynamically changing the output clock divide value and
therefore the frequency. When asserted High, a request is sent to all affected counters and
must stay asserted until the last change via the DRP has been completed.
CDDCDONE is an acknowledge signal from the MMCM that the output clock divide change is
complete and the output is valid.
MMCM Attributes
Table 3-4 lists the attributes for the MMCME#_BASE and MMCME#_ADV primitives.
Notes:
1. The Vivado tools round up or down to the nearest multiple of 0.125 if the value is not specified as an exact 1/8th fraction.
2. When using the variable fine phase shift, the initial phase shift value is always zero and cannot be preset to a static, initial
phase.
3. The COMPENSATION attribute values are documented for informational purpose only. The Vivado tools automatically select
the appropriate compensation based on circuit topology. Do not manually select a compensation value, leave the attribute
at the default value.
4. The specifications for the VCO frequencies MMCM_FVCOMIN/MMCM_FVCOMAX and minimum out frequency
MMCM_FOUTMIN are different for the UltraScale and UltraScale+ families. Consult the appropriate data sheets.
5. The direct source code connection (wire) from CLKFBOUT to CLKFBIN is optimized away during synthesis.
6. When using an SEM-IP in UltraScale devices only, additional noise is coupled into the VCO of the MMCM and PLL. This results in
a higher TIE jitter value as described in AR:71314. Refer to the AR for guidance and mitigation techniques. To resolve the issue
of TIE jitter for SEM-IP, there are two configurable properties: BITSTREAM.MMCM.BANDWIDTH and
BITSTREAM.PLL.BANDWIDTH. If the properties are set to POSTCRC, each MMCM instance that has the BANDWIDTH attribute
set to OPTIMIZED, and each PLL instance with an implicit BANDWIDTH=OPTIMIZED attribute, is configured with the POSTCRC
bandwidth settings. Refer to the Vivado Design Suite User Guide [Ref 13] for BITSTREAM property information.
• IBUF – Global clock input buffer. The MMCM compensates the delay of this path. IBUF
represents a global clock pin in the same region. The IBUF must be located at a global
clock pin location.
• BUFGCTRL – Internal global clock buffer. The MMCM does not compensate the delay of
this path.
• BUFGCE – Global clock buffer. The MMCM does not compensate the delay of this path.
Counter Control
The MMCM output counters provide a wide variety of synthesized clocks using a
combination of DIVIDE, DUTY_CYCLE, and PHASE. Figure 3-6 illustrates how the counter
settings impact the counter output.
[Figure 3-6: output counter waveforms relative to the counter clock input (VCO) for these settings: DIVIDE = 2, DUTY_CYCLE = 0.5, PHASE = 0; DIVIDE = 2, DUTY_CYCLE = 0.5, PHASE = 180; DIVIDE = 2, DUTY_CYCLE = 0.75, PHASE = 180; DIVIDE = 1, DUTY_CYCLE = 0.5, PHASE = 0; DIVIDE = 1, DUTY_CYCLE = 0.5, PHASE = 360; DIVIDE = 3, DUTY_CYCLE = 0.33, PHASE = 0; DIVIDE = 3, DUTY_CYCLE = 0.5, PHASE = 0]
[Figure: the eight VCO phases (0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315°) feeding the output counters O0 through O3]
If the MMCM is configured to provide a certain phase relationship and the input frequency
is changed, the absolute shift in picoseconds also changes because the VCO frequency
changes. This aspect must be considered when
designing with the MMCM. However, when an important aspect of the design is to maintain a certain
phase relationship in degrees among various clock outputs (e.g., CLK and CLK90), this relationship is
maintained regardless of the input frequency.
All O counters can be equivalent; anything O0 can do, O1 can do. The O0 counter has the
additional capability to be used in fractional divide mode. The MMCM outputs are flexible
when connecting to the global clock network because they are identical. In most cases, this
level of detail is imperceptible because the software and Clocking Wizard determine the
proper settings through the MMCM attributes and Wizard inputs.
[Figure: MMCM input clock multiplexer. CLKINSEL selects between CLKIN1 and CLKIN2; the sources shown for each input are IBUFG (CC), BUFGCTRL, BUFGCE, and local routing (not recommended).]
[Figure: MMCM clock network connections (IBUFG, BUFG, CLKIN1, CLKFBIN, CLKOUT0, CLKFBOUT) with waveforms at the numbered nodes 1 through 6]
Another more complex scenario has an input frequency of 66.66 MHz and D = 2, M = 30,
and O = 4. The VCO frequency in this case is 1000 MHz and the CLKOUT output frequency
is 250 MHz. Therefore, the feedback frequency at the PFD is 1000/30 or 33.33 MHz,
matching the 66.66 MHz/2 input clock frequency at the PFD.
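The relationship among D, M, and O in this scenario can be verified with a few lines of arithmetic. The sketch below is illustrative only and the function name is an assumption. Note that the text rounds 999.9 MHz to 1000 MHz and 249.975 MHz to 250 MHz.

```python
def mmcm_frequencies(f_in_mhz, d, m, o):
    """Return (f_pfd, f_vco, f_out) in MHz for input divider D, multiplier M, output divider O."""
    f_pfd = f_in_mhz / d    # reference frequency seen at the PFD
    f_vco = f_pfd * m       # in lock, the VCO runs at M times the PFD frequency
    f_out = f_vco / o       # each output counter divides the VCO clock
    return f_pfd, f_vco, f_out


f_pfd, f_vco, f_out = mmcm_frequencies(66.66, 2, 30, 4)
# The feedback frequency at the PFD is f_vco / M, which equals f_pfd in lock.
print(round(f_pfd, 3), round(f_vco, 3), round(f_out, 3), round(f_vco / 30, 3))
```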
The MMCM feedback can be internal to the MMCM when the MMCM is used as a
synthesizer or jitter filter, and there is no required phase relationship between the MMCM
input clock and the MMCM output clock. The MMCM performance increases because the
feedback clock is not subjected to noise on the core supply since it never passes through a
block powered by this supply. However, noise introduced on the CLKIN signal and the BUFG
is still present (Figure 3-10).
[Figure 3-10: MMCM with internal feedback, with an IBUFG on the input and a BUFG on the output]
Electromagnetic compatibility (EMC) regulations are used to control the noise or EMI that
causes these disturbances. Typical solutions for meeting EMC requirements involve adding
expensive shielding, ferrite beads, or chokes. These solutions can adversely impact the cost
of the final product by complicating PCB routing and forcing longer product development
cycles.
SSCG spreads the electromagnetic energy over a large frequency band to effectively reduce
the electrical and magnetic field strengths measured within a narrow window of
frequencies. The peak electromagnetic energy at any one frequency is reduced by
modulating the SSCG output.
The MMCME# can generate a spread-spectrum clock from a standard fixed frequency
oscillator when SS_EN is set to TRUE (see Figure 3-12). Within the MMCME#, the VCO
frequency is modulated along with CLKFBOUT and CLKOUT[6:4,1,0]. Clock outputs
CLKOUT[3:2] are used to control the modulation period and are not available for general
use. As long as the clock frequency is adjusted slowly, the spread-spectrum does not affect
the period jitter of the MMCME#.
[Figure 3-12: spread-spectrum frequency versus time, showing the modulation period and the frequency deviation relative to FIN]
[Figure: frequency versus time for the CENTER_HIGH and CENTER_LOW modes relative to FIN]
[Figure: frequency versus time for the DOWN_LOW and DOWN_HIGH modes relative to FIN]
Table 3-5: Manual SS Timing Adjustment Using Input Frequency for UltraScale Devices
Parameter                Input Frequency (MHz)   M        Input Frequency Adjustment (FIN_SS)
SS_MODE (CENTER_HIGH)    25 < FIN < 35           28       FIN_SS = FIN x 56/55
SS_MODE (CENTER_HIGH)    35 < FIN < 50           21       FIN_SS = FIN x 42/41
SS_MODE (CENTER_HIGH)    35 < FIN < 50           22       FIN_SS = FIN x 44/43
SS_MODE (CENTER_HIGH)    50 < FIN < 75           28       FIN_SS = FIN x 56/55
SS_MODE (CENTER_HIGH)    75 < FIN < 150          21       FIN_SS = FIN x 42/41
SS_MODE (CENTER_HIGH)    75 < FIN < 150          22       FIN_SS = FIN x 44/43
SS_MODE (CENTER_LOW)     25 < FIN < 35           56       FIN_SS = FIN x 112/111
SS_MODE (CENTER_LOW)     35 < FIN < 50           42       FIN_SS = FIN x 84/83
SS_MODE (CENTER_LOW)     35 < FIN < 50           44       FIN_SS = FIN x 88/87
SS_MODE (CENTER_LOW)     50 < FIN < 75           56       FIN_SS = FIN x 112/111
SS_MODE (CENTER_LOW)     75 < FIN < 150          42       FIN_SS = FIN x 84/83
SS_MODE (CENTER_LOW)     75 < FIN < 150          44       FIN_SS = FIN x 88/87
SS_MODE (DOWN_HIGH)      25 < FIN < 35           28       FIN_SS = FIN
SS_MODE (DOWN_HIGH)      35 < FIN < 50           21, 22   FIN_SS = FIN
SS_MODE (DOWN_HIGH)      50 < FIN < 75           28       FIN_SS = FIN
SS_MODE (DOWN_HIGH)      75 < FIN < 100          21, 22   FIN_SS = FIN
SS_MODE (DOWN_HIGH)      100 < FIN < 150         21, 22   FIN_SS = FIN
SS_MODE (DOWN_LOW)       25 < FIN < 35           56       FIN_SS = FIN
SS_MODE (DOWN_LOW)       35 < FIN < 50           42, 44   FIN_SS = FIN
SS_MODE (DOWN_LOW)       50 < FIN < 75           56       FIN_SS = FIN
SS_MODE (DOWN_LOW)       75 < FIN < 100          42, 44   FIN_SS = FIN
SS_MODE (DOWN_LOW)       100 < FIN < 150         42, 44   FIN_SS = FIN
Table 3-6: Manual SS Timing Adjustment Using Input Frequency for UltraScale+ Devices
Parameter                Input Frequency (MHz)   M        D    Input Frequency Adjustment (FIN_SS)
SS_MODE (CENTER_HIGH)    30 < FIN < 40           28       1    FIN_SS = FIN x 56/55
SS_MODE (CENTER_HIGH)    40 < FIN < 60           21       1    FIN_SS = FIN x 42/41
SS_MODE (CENTER_HIGH)    40 < FIN < 60           22       1    FIN_SS = FIN x 44/43
SS_MODE (CENTER_HIGH)    60 < FIN < 80           28       2    FIN_SS = FIN x 56/55
SS_MODE (CENTER_HIGH)    80 < FIN < 120          21       2    FIN_SS = FIN x 42/41
SS_MODE (CENTER_HIGH)    80 < FIN < 120          22       2    FIN_SS = FIN x 44/43
SS_MODE (CENTER_HIGH)    120 < FIN < 150         21       3    FIN_SS = FIN x 42/41
SS_MODE (CENTER_HIGH)    120 < FIN < 150         22       3    FIN_SS = FIN x 44/43
SS_MODE (CENTER_LOW)     30 < FIN < 40           56       2    FIN_SS = FIN x 112/111
SS_MODE (CENTER_LOW)     40 < FIN < 60           42       2    FIN_SS = FIN x 84/83
SS_MODE (CENTER_LOW)     40 < FIN < 60           44       2    FIN_SS = FIN x 88/87
SS_MODE (DOWN_LOW)       35 < FIN < 40           56       2    FIN_SS = FIN
SS_MODE (DOWN_LOW)       40 < FIN < 60           42, 44   2    FIN_SS = FIN
SS_MODE (DOWN_LOW)       60 < FIN < 80           56       4    FIN_SS = FIN
SS_MODE (DOWN_LOW)       80 < FIN < 120          42, 44   4    FIN_SS = FIN
SS_MODE (DOWN_LOW)       120 < FIN < 150         42, 44   6    FIN_SS = FIN
For a 25 MHz input clock, the new timing constraints would be:
For an 80 MHz input clock, the new timing constraints would be:
Table 3-5 and Table 3-6 provide information which allows the manual adjustment of timing
constraints to the frequency range of the spread-spectrum enabled clock. This is for the
generation of timing constraints in an XDC file used by Vivado tools.
Table 3-5 and Table 3-6 show that timing constraints should be modified when
spread-spectrum clocking parameter SS_MODE is set to CENTER_LOW or CENTER_HIGH.
When SS_MODE attribute is set to DOWN_LOW or DOWN_HIGH timing constraint
adjustment is not necessary.
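As an illustration of how a table entry translates into an adjusted clock period, the sketch below applies the FIN_SS ratio to an input frequency. The function name is hypothetical. The example uses the UltraScale CENTER_HIGH row for 75 < FIN < 150 with M = 21 (ratio 42/41), and the resulting period is the kind of value that would be supplied to an XDC `create_clock -period` constraint.

```python
from fractions import Fraction


def adjusted_period_ns(f_in_mhz, ratio_num, ratio_den):
    """Return the clock period in ns for FIN_SS = FIN x ratio_num/ratio_den."""
    f_in_ss_mhz = Fraction(f_in_mhz) * Fraction(ratio_num, ratio_den)
    return float(1000 / f_in_ss_mhz)    # period (ns) = 1000 / frequency (MHz)


# 80 MHz input, SS_MODE = CENTER_HIGH, M = 21: FIN_SS = FIN x 42/41
period_ns = adjusted_period_ns(80, 42, 41)
print(f"FIN_SS period = {period_ns:.4f} ns (nominal period = {1000 / 80:.4f} ns)")
```

The adjusted period is slightly shorter than the nominal one, which constrains the design to the highest frequency reached during the center spread.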
Also note that manually adjusting timing constraints is not needed because the Vivado
tools detect when spread-spectrum clocking is used in a design. Vivado tools (static timing
analysis) automatically account for any timing spread caused by the spread-spectrum
enabled clocks. When spread-spectrum clocks are used, Vivado static timing analysis adds
a spread-spectrum (SS) uncertainty value to the total uncertainty calculation formula. The
formula used by the static analysis tools is as follows:
$$\frac{\sqrt{TSJ^2 + DJ^2}}{2} + PE + SS \qquad \text{(Equation 3-12)}$$
where:
CAUTION! When using spread-spectrum clocking in a design, it is necessary to use appropriate clock
domain crossing (CDC) circuitry for all signals, data and non-data, crossing clock and spread-spectrum
clock domains, and vice versa.
Asynchronous FIFOs should be used to transfer data between two clock domains. The depth
of the FIFO depends on the modulation frequency in the clock. The slower the modulation,
the deeper the FIFO needs to be:
$$\text{FIFO depth} \propto \frac{\text{FrequencyDeviation}}{\text{ModulationFrequency}} \qquad \text{(Equation 3-13)}$$
When using spread-spectrum generation, the VCO frequency is set by the Clocking Wizard
based on the input frequency and SS_MODE. As a result, using the Clocking Wizard is
recommended to set the output frequencies for CLKOUT[6:4,1,0].
Based on the VCO frequency and SS_MOD_PERIOD, the clocking wizard also determines the
correct modulation settings to set the modulation frequency within 10% of
SS_MOD_PERIOD. Because the modulation frequency is dependent on the VCO frequency,
the modulation frequency scales as the input frequency changes for a given compilation.
CLKOUT0_PHASE = 0;
CLKOUT0_DUTY_CYCLE = 0.5;
CLKOUT0_DIVIDE = 2;
CLKOUT1_PHASE = 90;
CLKOUT1_DUTY_CYCLE = 0.5;
CLKOUT1_DIVIDE = 2;
CLKOUT2_PHASE = 0;
CLKOUT2_DUTY_CYCLE = 0.25;
CLKOUT2_DIVIDE = 4;
CLKOUT3_PHASE = 90;
CLKOUT3_DUTY_CYCLE = 0.5;
CLKOUT3_DIVIDE = 8;
CLKOUT4_PHASE = 0;
CLKOUT4_DUTY_CYCLE = 0.5;
CLKOUT4_DIVIDE = 8;
CLKOUT5_PHASE = 135;
CLKOUT5_DUTY_CYCLE = 0.5;
CLKOUT5_DIVIDE = 8;
CLKFBOUT_PHASE = 0;
CLKFBOUT_MULT_F = 8;
DIVCLK_DIVIDE = 1;
CLKIN1_PERIOD = 10.0;
[Figure: waveforms of REFCLK, VCOCLK, and CLKOUT0 through CLKOUT5 for the attribute settings above]
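To make the attribute listing above more concrete, the following sketch derives the period, phase delay, and high time of each output from the listed values. It is a back-of-the-envelope calculation under the usual relationships (VCO period = CLKIN1_PERIOD x DIVCLK_DIVIDE / CLKFBOUT_MULT_F, output period = VCO period x divide). The dictionary layout and function name are illustrative assumptions.

```python
CLKIN1_PERIOD_NS = 10.0
CLKFBOUT_MULT_F = 8
DIVCLK_DIVIDE = 1

# name: (DIVIDE, PHASE in degrees, DUTY_CYCLE), copied from the attribute listing
OUTPUTS = {
    "CLKOUT0": (2, 0, 0.5),
    "CLKOUT1": (2, 90, 0.5),
    "CLKOUT2": (4, 0, 0.25),
    "CLKOUT3": (8, 90, 0.5),
    "CLKOUT4": (8, 0, 0.5),
    "CLKOUT5": (8, 135, 0.5),
}

VCO_PERIOD_NS = CLKIN1_PERIOD_NS * DIVCLK_DIVIDE / CLKFBOUT_MULT_F  # 1.25 ns (800 MHz)


def describe_output(divide, phase_deg, duty_cycle):
    """Return (period_ns, phase_delay_ns, high_time_ns) for one output counter."""
    period_ns = VCO_PERIOD_NS * divide
    return period_ns, period_ns * phase_deg / 360.0, period_ns * duty_cycle


for name, settings in OUTPUTS.items():
    period_ns, delay_ns, high_ns = describe_output(*settings)
    print(f"{name}: period {period_ns} ns, delay {delay_ns} ns, high time {high_ns} ns")
```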
PLLs
There are two PLLs per CMT that provide clocking to the PHY logic and I/Os. In addition,
they can be used as frequency synthesizers for a wide range of frequencies, serve as jitter
filters, and provide basic phase shift capabilities and duty cycle programming. The PLLs
differ from the MMCM in number of outputs, cannot deskew clock nets, and do not have
advanced phase shift capabilities. The multipliers and input dividers have a smaller value range,
and the PLLs do not have many of the other advanced features of the MMCM.
The PLLE#_BASE primitive provides access to the most frequently used features of a
stand-alone PLL. Clock deskew, frequency synthesis, and duty cycle programming are
available to use with the PLLE#_BASE. The ports are listed in Table 3-9.
The PLLE#_ADV primitive provides access to all PLLE#_BASE features plus additional ports
for access to the DRP. The ports are listed in Table 3-10.
PLL Ports
Table 3-11 summarizes the PLL ports.
The RST signal is an asynchronous reset for the PLL. The PLL is synchronously re-enabled
when this signal is deasserted.
The PWRDWN signal powers down instantiated but currently unused PLLs. This mode can be used to
save power for temporarily inactive portions of the design and/or PLLs that are not active in
certain system configurations. No PLL power is consumed in this mode.
These are user-configurable clock outputs and can be divided versions of the VCO phase
outputs (user controllable) from 1 (bypassed) to 128. The input clock and output clocks can
be phase aligned.
For the possible configurations of CLKFBOUT, see Figure 3-17 and Figure 3-18. Unlike the
MMCM, the CLKFBOUT cannot drive logic.
[Figure 3-17: PLLE3 with feedback through the local clock network, with waveforms at the numbered nodes]
[Figure 3-18: PLLE3 ports (CLKIN, CLKOUTPHYEN, RST, PWRDWN, CLKFBIN, CLKOUT0, CLKOUT0B, CLKOUT1, CLKOUT1B, CLKOUTPHY_P, CLKOUTPHY_N, LOCKED, CLKFBOUT) with an IBUFG and a BUFG]
CLKFBIN must be connected either directly to the CLKFBOUT for internal feedback, or to the
CLKFBOUT through a BUF_IN. Using BUF_IN in the feedback path compensates for the clock
network delay in the same XIPHY bank as shown in Figure 3-17 where the nodes 1 and 5 are
phase aligned.
CLKOUTPHY is a dedicated clock output for use by the PHY byte logic and I/O. It can be 2X, 1X, or
0.5X of the VCO frequency.
CLKOUTPHYEN enables the CLKOUTPHY clock outputs. The PLL employs enable logic to
synchronize the asynchronous CLKOUTPHYEN signal from your design and controls when
the CLKOUTPHY clocks are released. After the CLKOUTPHY clock is released, the rising edge
is aligned to the rising edge of the input clock CLKIN. Glitch-free enabling and disabling of
the CLKOUTPHY output clock is assured for all configurations.
However, phase alignment between multiple PLL CLKOUTPHY clocks is only assured when
both the CLKFBOUT_MULT and CLKOUT[0:1]_DIVIDE values are set to 1, 2, 4, or 8. Rising
edges do not align for CLKFBOUT = 3, 5, 6, 7, 9,...
LOCKED
This output from the PLL indicates when the PLL has achieved frequency
alignment of the reference clock and the internal feedback, that is, frequency matching within a
predefined PPM range. The PLL
automatically locks after power on; no extra reset is required. LOCKED is deasserted within
one PFD clock cycle if the input clock stops or the frequency has changed. The PLL must be
reset when LOCKED is deasserted. The clock outputs should not be used prior to the
assertion of LOCKED.
The dynamic reconfiguration data input (DI) bus provides reconfiguration data. The value of
this bus is written to the configuration cells. The data is presented in the cycle that DEN and
DWE are active. The data is captured in a shadow register and written at a later time. DRDY
indicates when the DRP port is ready to accept another write. When not used, all bits must
be set to zero.
The dynamic reconfiguration write enable (DWE) input pin provides the write/read enable
control signal to write the DI data into or read the DO data from the DADDR address. When
not used, DWE must be tied Low.
The dynamic reconfiguration enable strobe (DEN) provides the enable control signal to
access the dynamic reconfiguration feature and enable all DRP port operations. When the
dynamic reconfiguration feature is not used, DEN must be tied Low.
DCLK is the reference clock for the dynamic reconfiguration port. The rising edge of this
signal is the timing reference for all other port signals. The setup time is specified in the
UltraScale device data sheets [Ref 5]. There is no hold time requirement for the other input
signals relative to the rising edge of DCLK. This signal can be driven by an IBUF, IBUFG,
BUFGCE, or BUFGCTRL. There are no dedicated connections to this clock input.
The dynamic reconfiguration output (DO) bus provides PLL data output when using dynamic
reconfiguration. If DWE is inactive while DEN is active at the rising edge of DCLK, this bus
holds the content of the configuration cells addressed by DADDR. The DO bus must be
captured on the rising edge of DCLK when DRDY is active. The DO bus value is held until the
next DRP operation.
The dynamic reconfiguration ready output (DRDY) provides the response to the DEN signal
for the PLL’s dynamic reconfiguration feature. This signal indicates that a DEN/DCLK
operation has completed.
PLL Attributes
Table 3-12 lists the attributes for the PLLE#_BASE and PLLE#_ADV primitives.
Notes:
1. The specifications for the VCO frequencies PLL_FVCOMIN/PLL_FVCOMAX and minimum out frequency PLL_FOUTMIN are
different for the UltraScale and UltraScale+ families. Consult the appropriate data sheets.
The DRP port provides the ability to use an MMCM and/or PLL as a dynamic element in a
design. The DRP port setup is that of a common microcontroller peripheral and gives the
user access to a set of registers in the MMCM or PLL. These registers allow the user to fully
control the MMCM or PLL. Input pins and the values that define output clocks are turned
into register bits, making it possible to use the primitives as active elements in a design.
Using the DRP port means reading and writing registers of a peripheral. When using the
Clocking Wizard, the DRP port can be enabled through an AXI-Lite controller to a hard or
soft microcontroller in the FPGA. Nevertheless, design or other requirements might make it
necessary to use the DRP port in a bare metal configuration (also selectable in the
Clocking Wizard). The DRP port can then be driven by a state machine based
design. The description of the DRP port operation provided here can help with this.
For additional DRP usage information, see MMCM and PLL Dynamic Reconfiguration
(XAPP888) [Ref 6] and the associated reference.
[Figure: DRP port signals. Inputs: DI[15:0], DADDR[n:0], DWE, DEN, and DCLK. Outputs: DOUT[15:0] and DRDY.]
Notes:
1. The width of the DADDR bus depends on the primitive that the DRP port is a part of. For both an MMCM and a PLL,
the DRP address bus is 7 bits wide (DADDR(6:0)). The DADDR port of an ADC/DAC in an
RFSoC device is 12 bits wide while the DADDR port of a GTP is 10 bits wide.
[Figure: DRP timing diagram over DCLK cycles 1, 2, 3, ..., n, n+1. Signals: DCLK, DEN, DWE, DADDR[n:0], DI[15:0],
DO[15:0], DRDY.]
[Figure: DRP read timing diagram over DCLK cycles 1, 2, 3, ..., n, n+1. Signals: DCLK, DEN, DWE, DADDR[n:0],
DI[15:0], DO[15:0], DRDY. Annotations: "Address valid" and "READ Data valid".]
Read—Write Operation
A read or write operation must always be sequenced with respect to the DRDY signal. A new
read or write operation can be initiated only after the DRDY signal pulses High. If DRDY is
not checked after a read or write operation on the DRP port, there is no guarantee that the
written bits have been set or that the bits read back represent the value of the register.
[Figure 3-22: DEN, DWE, and DRDY waveforms.]
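The DRDY handshake described above can be sketched with a small behavioral model. This is only an illustration, not the MMCM or PLL implementation: the class `DrpModel`, the helper `drp_access`, and the fixed `LATENCY` of three DCLK cycles are all hypothetical, and the sketch assumes DEN is asserted for a single DCLK cycle per operation.

```python
class DrpModel:
    """Toy behavioral model of a DRP port (hypothetical, for illustration)."""

    LATENCY = 3  # assumption: DCLK cycles between DEN and the DRDY pulse

    def __init__(self):
        self.regs = {}          # register contents keyed by DADDR
        self._pending = None    # (dwe, daddr, di) of the operation in flight
        self._countdown = 0

    def clock(self, den=0, dwe=0, daddr=0, di=0):
        """Advance one DCLK rising edge and return (do, drdy)."""
        if den:
            if self._pending is not None:
                # A new operation was started before DRDY pulsed High.
                raise RuntimeError("DEN asserted before DRDY")
            self._pending = (dwe, daddr, di & 0xFFFF)
            self._countdown = self.LATENCY
            return 0, 0
        if self._pending is None:
            return 0, 0
        self._countdown -= 1
        if self._countdown > 0:
            return 0, 0
        pending_dwe, addr, data = self._pending
        self._pending = None
        if pending_dwe:
            self.regs[addr] = data
            return 0, 1
        return self.regs.get(addr, 0), 1


def drp_access(port, daddr, di=0, write=False, timeout=64):
    """Pulse DEN for one cycle, then wait for DRDY before returning DO."""
    port.clock(den=1, dwe=int(write), daddr=daddr, di=di)
    for _ in range(timeout):
        do, drdy = port.clock()
        if drdy:
            return do
    raise TimeoutError("DRDY never asserted")


port = DrpModel()
drp_access(port, 0x4F, di=0x1234, write=True)
value = drp_access(port, 0x4F)
```

Because `drp_access` returns only after DRDY, back-to-back calls never overlap. A state machine that drives the real port would need the same wait state between operations.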
[Figure: DRP register map for the MMCM and PLL (16-bit registers, bits 15 to 0). Register addresses are in
hexadecimal.]
MMCM registers:
27: interpolator
16: divck
15, 14: ckfbout
13, 12: ckout6
11, 10: ckout4
0F, 0E: ckout3
0D, 0C: ckout2
0B, 0A: ckout1
09, 08: ckout0
07, 06: ckout5
05: compensation
04: spread spectrum
PLL registers:
16: divck
15, 14: ckfbout
0B: ckoutphy
0A: ckout1
09, 08: ckout0
05: compensation
IMPORTANT: When operating a DRP port, it is recommended that the existing contents of the register
that is going to be changed are first read. Write back the register contents where only the required bits
are modified. Modify only the colored bits and always maintain the state of the gray bits.
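The read-modify-write recommendation can be sketched as follows. The helper names `merge_bits` and `read_modify_write` are hypothetical, and the `read_reg` and `write_reg` callables stand in for whatever DRP access routine a design provides. A dictionary is used here as a stand-in register file, and the address, mask, and values are arbitrary.

```python
def merge_bits(current, mask, value):
    """Return the 16-bit word with only the masked bits replaced by value."""
    return (current & ~mask & 0xFFFF) | (value & mask)


def read_modify_write(read_reg, write_reg, addr, mask, value):
    """Read a register, change only the masked bits, and write it back."""
    current = read_reg(addr)
    updated = merge_bits(current, mask, value)
    write_reg(addr, updated)
    return updated


# Stand-in register file (hypothetical contents) used in place of a real DRP port.
regs = {0x08: 0x1ABC}
result = read_modify_write(regs.__getitem__, regs.__setitem__,
                           addr=0x08, mask=0x0FFF, value=0x0041)
```

In this sketch the upper four bits of the stand-in register keep their original state, and only the twelve masked bits change.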
MMCM Registers
Reg 4F ADDR: 0x4F
Bit  Name        Default  Access  Description
15   mc_res(3)   0        R/W     Loop filter resistor setting.
12   mc_res(2)   0        R/W     Loop filter resistor setting.
11   mc_res(1)   0        R/W     Loop filter resistor setting.
8    mc_res(0)   0        R/W     Loop filter resistor setting.
7    mc_lfhf(1)  0        R/W     Loop filter high frequency capacitor setting.
4    mc_lfhf(0)  0        R/W     Loop filter high frequency capacitor setting.
Registers 0x4F and 0x4E define the values for the loop filters. Pick the appropriate values
for these filters from the MMCM and PLL Dynamic Reconfiguration (XAPP888) [Ref 6].
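The loop filter bits of register 0x4F are not contiguous, so a write value has to be assembled bit by bit. The sketch below packs mc_res and mc_lfhf into the bit positions listed for register 0x4F. The function names are hypothetical, and the field values used are placeholders rather than recommended filter settings. The layout of register 0x4E is not covered.

```python
# Register 0x4F bit positions, least significant field bit first.
RES_BITS = (8, 11, 12, 15)   # mc_res(0) .. mc_res(3)
LFHF_BITS = (4, 7)           # mc_lfhf(0) .. mc_lfhf(1)


def scatter(value, positions):
    """Place bit i of value at register bit positions[i]."""
    word = 0
    for i, pos in enumerate(positions):
        if (value >> i) & 1:
            word |= 1 << pos
    return word


def pack_reg_4f(mc_res, mc_lfhf):
    """Build the loop filter bits of register 0x4F from the two field values."""
    return scatter(mc_res, RES_BITS) | scatter(mc_lfhf, LFHF_BITS)


# Mask of the bits that belong to the loop filter fields; all other bits
# should keep their existing state when the register is written back.
REG_4F_MASK = pack_reg_4f(0xF, 0x3)

example = pack_reg_4f(mc_res=0b0001, mc_lfhf=0b10)  # placeholder values
```

`REG_4F_MASK` can serve as the mask in a read-modify-write sequence so that the remaining bits of the register are preserved.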
Notes:
1. If any output counter uses fine phase shift, mc_interp_en[3:0] must be set to 1111; otherwise,
mc_interp_en[3:0] must be set to 0000.
2. mc_interp_en(4) is always set to 1.
3. If any output counter uses a VCO phase other than 0 or 180, if fractional division is used for a counter, or if
spread-spectrum mode is used, mc_interp_en[7:5] must be set to 111; otherwise, mc_interp_en[7:5] must be set to 000.
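The three notes above amount to a small decision rule for the eight mc_interp_en bits. A minimal sketch follows. The function and parameter names are hypothetical, and each flag is meant to be true if the condition holds for any output counter.

```python
def interp_en(fine_phase_shift, other_vco_phase, fractional_divide, spread_spectrum):
    """Return mc_interp_en[7:0] following the three notes above."""
    value = 1 << 4                      # note 2: bit 4 is always 1
    if fine_phase_shift:
        value |= 0x0F                   # note 1: bits [3:0] = 1111
    if other_vco_phase or fractional_divide or spread_spectrum:
        value |= 0xE0                   # note 3: bits [7:5] = 111
    return value


static_only = interp_en(False, False, False, False)
fractional_only = interp_en(False, False, True, False)
```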
Refer to MMCM and PLL Dynamic Reconfiguration (XAPP888) [Ref 6] to determine the values
for registers 0x1A, 0x19, and 0x18.
Registers 0x15, 0x14, and bits [15:12] of 0x13 control the fractional feedback (M counter)
shown in Figure 3-1.
Bits [10:0] of register 0x13 and register 0x12 control the CLKOUT6 counter.
Registers 0x11 and 0x10 control the output counter for CLKOUT4.
Registers 0x0F and 0x0E control the output counter for CLKOUT3.
Registers 0x0D and 0x0C control the output counter for CLKOUT2.
Registers 0x0B and 0x0A control the output counter for CLKOUT1.
The fractional output counter for CLKOUT0 is controlled by registers 0x09, 0x08, and bits
[15:12] of register 0x07.
Bits [10:0] of register 0x07 and register 0x06 control the CLKOUT5 counter.
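The register assignments above can be collected into a lookup table, which is convenient when a state machine or driver needs to know which registers a counter update touches. The table name and helper below are hypothetical. The 0xF000 and 0x07FF masks follow from the stated bit ranges [15:12] and [10:0]. The 0xFFFF masks are an assumption that simply marks the whole register, because the individual bit fields are not listed here.

```python
# (address, mask) pairs per MMCM counter, taken from the text above.
MMCM_COUNTER_REGS = {
    "CLKFBOUT": [(0x15, 0xFFFF), (0x14, 0xFFFF), (0x13, 0xF000)],
    "CLKOUT6":  [(0x13, 0x07FF), (0x12, 0xFFFF)],
    "CLKOUT4":  [(0x11, 0xFFFF), (0x10, 0xFFFF)],
    "CLKOUT3":  [(0x0F, 0xFFFF), (0x0E, 0xFFFF)],
    "CLKOUT2":  [(0x0D, 0xFFFF), (0x0C, 0xFFFF)],
    "CLKOUT1":  [(0x0B, 0xFFFF), (0x0A, 0xFFFF)],
    "CLKOUT0":  [(0x09, 0xFFFF), (0x08, 0xFFFF), (0x07, 0xF000)],
    "CLKOUT5":  [(0x07, 0x07FF), (0x06, 0xFFFF)],
}


def counters_using(addr):
    """Return the counters that own bits in the register at addr."""
    return sorted(name for name, regs in MMCM_COUNTER_REGS.items()
                  if any(a == addr for a, _ in regs))


shared_13 = counters_using(0x13)
shared_07 = counters_using(0x07)
```

Registers 0x13 and 0x07 are each shared by two counters, which is one reason a read-modify-write sequence matters when only one of those counters is being changed.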
The MMCM clock outputs are all defined by a configurable counter. The parameters
defining the clock outputs CLKOUT6 to CLKOUT1 are explained in the following table. Refer
to MMCM and PLL Dynamic Reconfiguration (XAPP888) [Ref 6] for calculation instructions
and methods.
Two of the counters, CLKFBOUT and CLKOUT0, are fractional counters. A fractional counter
is built from two non-fractional counters, an extra state, and adder logic. For this reason, a
fractional counter has two enables (one for each counter, to allow non-fractional use) and
two VCO phase selection settings. For the adder and state logic, the VCO phase selection is
further split into rising and falling settings. Additional register configuration options defining
the fractional counters are listed in the following table.
Fractional counter mode is enabled when both mc_ckout_en and mc_ckout_frac_en are set.
The two counters take different phases from the VCO outputs.
Example 1
The output starts High on a rising edge of counter A and goes Low on the second rising edge
of counter B. It goes High again on the second rising edge of counter A after that, and so on.
[Figure: Divide by 2.5 example. Two divide-by-2 counters run from VCO phase 0 (0 degrees) and VCO phase 4
(180 degrees). State and adder logic assemble a fractional clock (CLKOUT) from the two selected VCO phases using
the real divide value.]
The output starts High on a rising edge of counter A and goes Low on the second rising edge
of counter B. It goes High again on the next rising edge of counter B. Counter B then switches
to VCO phase 90, and the output goes Low again on the second rising edge of that phase, and so on.
[Figure 3-25: Divide by 2.125 example. VCO phase 0 (0 degrees), VCO phase 1 (45 degrees), VCO phase 2
(90 degrees), and VCO phase 3 (135 degrees) are used. State and adder logic assemble a fractional clock (CLKOUT)
from the two selected VCO phases using the real divide value.]
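The arithmetic behind the two examples can be checked with a short calculation. This is a simplified model, not a description of the counter hardware: it assumes eight VCO phases spaced 45 degrees apart (consistent with phase 1 at 45 degrees and phase 4 at 180 degrees in the figures) and computes which VCO phase each successive output rising edge lines up with. The function name is hypothetical.

```python
from fractions import Fraction

VCO_PHASES = 8  # assumption: eight VCO phases, 45 degrees apart


def rising_edge_phases(divide, edges):
    """Return (time in VCO periods, VCO phase index) for successive rising edges."""
    step = Fraction(divide)
    result = []
    for k in range(edges):
        t = k * step
        whole = t.numerator // t.denominator
        phase = (t - whole) * VCO_PHASES   # fractional part in units of 1/8 period
        if phase.denominator != 1:
            raise ValueError("divide value is not a multiple of 1/8")
        result.append((t, int(phase)))
    return result


half_step = [p for _, p in rising_edge_phases("2.5", 3)]
eighth_step = [p for _, p in rising_edge_phases("2.125", 4)]
```

In this model, a divide value of 2.5 alternates between phase 0 and phase 4, while a divide value of 2.125 advances by one phase (45 degrees) on every output period.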
The settings for register 4 are controlled by the SS_MODE attribute of the MMCM. Refer to
the Spread-Spectrum Clock Generation section for detailed information on
spread-spectrum clocking set up and behavior.
SS_MODE      ss_steps_init  ss_steps
DOWN_LOW     100            011
DOWN_HIGH    100            011
CENTER_LOW   100            111
CENTER_HIGH  100            111
Register 0 represents the bits that are also available as attributes of the MMCM primitive.
For the functional explanation of these bits, refer to the MMCM Attributes section.
PLL Registers
The PLL DRP register set is similar to and parallels that of the MMCM. The number of
changeable registers in the PLL DRP register set is smaller than that of the MMCM
because the PLL has only two clock outputs and does not use a selectable VCO output
multiplexer and interpolator.
3 mc_direct_path_cntrl Reserved.
2 mc_in_dly_en Compensation delay enable.
The Clocking Wizard helps to correctly set up the MMCM and PLL resources. Additionally,
the Clocking Wizard reports the jitter and supports phase and frequency synthesis. See
LogiCORE IP Clocking Wizard User Guide (PG065) [Ref 8] for more information.
Clocking Guidelines
Clocking a design involves more than applying clock buffers, instantiating an MMCM and/or
PLL, and applying one or two constraints in an XDC file. Clocking and the setup of the clocking
network need attention. To create a design that can be implemented (synthesized, placed, and
routed) using all of the Vivado Design Suite features, and that makes the FPGA function at
optimal conditions when downloaded, follow the guidelines provided in the chapters Clocking
Guidelines and Clock Domain Crossing of the UltraFast Design Methodology Guide for the
Vivado Design Suite (UG949) [Ref 1].
The UltraFast Design Methodology Guide for the Vivado Design Suite (UG949) [Ref 1] offers
a set of best practices intended to help streamline the design process for new devices. The
size and complexity of these designs require specific steps and design tasks to ensure
success at each stage of the design. Following these steps and adhering to the best
practices will help you achieve your desired design goals as quickly and efficiently as
possible. Two other documents that can be useful for designing are:
Xilinx Resources
For support resources such as Answers, Documentation, Downloads, and Forums, see Xilinx
Support.
Solution Centers
See the Xilinx Solution Centers for support on devices, software tools, and intellectual
property at all stages of the design cycle. Topics include design assistance, advisories, and
troubleshooting tips.
References
1. UltraFast Design Methodology Guide for the Vivado Design Suite (UG949)
2. UltraScale Architecture Packaging and Pinout User Guide (UG575)
3. UltraScale Architecture SelectIO Resources User Guide (UG571)
4. Vivado Design Suite User Guide: Using Constraints (UG903)
5. UltraScale and UltraScale+ device data sheets:
Revision History
The following table shows the revision history for this document.