Reference Architecture: Lenovo Client Virtualization with Citrix XenDesktop
Mike Perks
Kenny Bain
Pawan Sharma
This document describes the reference architecture for Citrix XenDesktop 7.6 and also supports the previous
versions of Citrix XenDesktop 5.6, 7.0, and 7.1. This document should be read with the Lenovo Client
Virtualization (LCV) base reference architecture document that is available at this website:
lenovopress.com/tips1275
The business problem, business value, requirements, and hardware details are described in the LCV base
reference architecture document and are not repeated here for brevity.
This document gives an architecture overview and logical component model of Citrix XenDesktop. The
document also provides the operational model of Citrix XenDesktop by combining Lenovo® hardware
platforms such as Flex System™, System x®, NeXtScale System™, and RackSwitch networking with OEM
hardware and software such as IBM Storwize and FlashSystem storage, and Atlantis Computing software. The
operational model presents performance benchmark measurements and discussion, sizing guidance, and
some example deployment models. The last section contains detailed bill of material configurations for each
piece of hardware.
[Figure: Citrix XenDesktop architecture overview. Labels: Internal Clients, Internet Clients, 3rd party VPN,
Firewall, Web Interface, Connection Broker (Delivery Controller), Stateless Virtual Desktops, Hosted Desktops
and Apps, Hypervisor, Shared Storage.]
[Figure: Citrix XenDesktop logical component model. Labels: HTTP/HTTPS, ICA, VM Agent, Shared Desktop and
Apps, PVS and MCS, License Server, XenDesktop SQL Server, vCenter Server, vCenter SQL Server, Management
Services, Directory, DNS, DHCP, Application and OS Licensing, Lenovo Thin Client Manager, Accelerator VM,
Hypervisor, Local SSD Storage, Shared Storage, NFS and CIFS.]
Desktop Studio: Desktop Studio is the main administrator GUI for Citrix XenDesktop. It is used to configure
and manage all of the main entities, including servers, desktop pools and provisioning, policy, and licensing.
Web Interface: The Web Interface provides the user interface to the XenDesktop environment. The Web
Interface brokers user authentication, enumerates the available desktops and, upon start, delivers a .ica file to
the Citrix Receiver on the user’s local device to start a connection. The Independent Computing Architecture
(ICA) file contains configuration information for the Citrix Receiver to communicate with the virtual desktop.
Because the Web Interface is a critical component, redundant servers must be available to provide fault
tolerance.
PVS and MCS: Provisioning Services (PVS) is used to provision stateless desktops at a large scale. Machine
Creation Services (MCS) is used to provision dedicated or stateless desktops in a quick and integrated
manner. For more information, see the “Citrix XenDesktop provisioning” section on page 5.
License Server: The Citrix License Server is responsible for managing the licenses for all XenDesktop
components. XenDesktop has a 30-day grace period that allows the system to function normally for 30 days if
the license server becomes unavailable. This grace period offsets the complexity of otherwise building
redundancy into the license server.
XenDesktop SQL Server: Each Citrix XenDesktop site requires an SQL Server database that is called the data
store, which is used to centralize farm configuration information and transaction logs. The data store
maintains all static and dynamic information about the XenDesktop environment. Because the XenDesktop
SQL server is a critical component, redundant servers must be available to provide fault tolerance.
vCenter Server: By using a single console, vCenter Server provides centralized management of the virtual
machines (VMs) for the VMware ESXi hypervisor. VMware vCenter can be used to perform live migration
(called VMware vMotion), which allows a running VM to be moved from one physical server to another without
downtime.
vCenter SQL Server: vCenter Server for the VMware ESXi hypervisor requires an SQL database. The vCenter
SQL server might be Microsoft® Data Engine (MSDE), Oracle, or SQL Server. Because the vCenter SQL
server is a critical component, redundant servers must be available to provide fault tolerance. Customer SQL
databases (including respective redundancy) can be used.
VDA: Each VM needs a Citrix Virtual Desktop Agent (VDA) to capture desktop data and send it to the Citrix
Receiver in the client device. The VDA also emulates keyboard and gestures sent from the Receiver. ICA is
the Citrix remote display protocol for VDI.
Hypervisor: XenDesktop has an open architecture that supports the use of several different hypervisors, such
as VMware ESXi (vSphere), Citrix XenServer, and Microsoft Hyper-V.
Accelerator VM: The optional accelerator VM in this case is Atlantis Computing. For more information, see
“Atlantis Computing” on page 7.
Shared storage: Shared storage is used to store user profiles and user data files. Depending on the
provisioning model that is used, different data is stored for VM images. For more information, see the
“Storage model” section on page 7.
For more information, see the Lenovo Client Virtualization base reference architecture document that is
available at this website: lenovopress.com/tips1275.
When the virtual disk (vDisk) master image is available from the network, the VM on a target device no longer
needs its local hard disk drive (HDD) to operate; it boots directly from the network and behaves as if it were
running from a local drive on the target device, which is why PVS is recommended for stateless virtual
desktops. PVS often is not used for dedicated virtual desktops because the write cache is not stored on shared
storage.
PVS is also used with Microsoft Roaming Profiles (MSRPs) so that the user’s profile information can be
separated out and reused. Profile data is available from shared storage.
It is a best practice to use snapshots for changes to the master VM images and also keep copies as a backup.
The following types of Image Assignment Models for MCS are available:
MCS thin provisions each desktop from a master image by using built-in technology to provide each desktop
with a unique identity. Only changes that are made to the desktop use more disk space. For this reason, MCS
is well suited to dedicated desktops.
Stateless and dedicated virtual desktops should have the following common shared storage items:
• The master VM image and snapshots are stored by using Network File System (NFS) or block I/O
shared storage.
• The paging file (or vSwap) is transient data that can be redirected to NFS storage. In general, it is
recommended to disable swapping, which reduces storage use (shared or local). The desktop memory
size should be chosen to match the user workload rather than depending on a smaller image and
swapping, which reduces overall desktop performance.
• User profiles (from MSRP) are stored by using Common Internet File System (CIFS).
• User data files are stored by using CIFS.
Dedicated virtual desktops or stateless virtual desktops that need mobility require the following items to be on
NFS or block I/O shared storage:
• Difference disks are used to store the user’s changes to the base VM image. The difference disks are per
user and can become quite large for dedicated desktops.
• Identity disks are used to store the computer name and password and are small.
Stateless desktops can use local solid-state drive (SSD) storage for the PVS write cache, which stores all
image writes. These image writes are discarded when the VM is shut down.
Atlantis software works with any type of heterogeneous storage, including server RAM, direct-attached storage
(DAS), SAN, or network-attached storage (NAS). It is provided as a VMware ESXi or Citrix XenServer
compatible VM that presents the virtualized storage to the hypervisor as a native data store, which makes
deployment and integration straightforward. Atlantis Computing also provides other utilities for managing VMs
and backing up and recovering data stores.
Atlantis provides a number of volume types suitable for virtual desktops and shared desktops. Different volume
types support different application requirements and deployment models. Table 1 compares the Atlantis
volume types.
As shown in Figure 5, hyper-converged volumes are clustered across three or more servers and have built-in
resiliency: the volume can be migrated to other servers in the cluster if a server fails or enters maintenance
mode. Hyper-converged volumes are supported for ESXi and XenServer.
A variation of the simple hybrid volume type is the simple all-flash volume that uses fast, low-latency shared
flash storage whereby very little RAM is used and all I/O requests are sent to the flash storage after the inline
de-duplication and compression are performed.
This reference architecture concentrates on the simple hybrid volume type for dedicated desktops, stateless
desktops that use local SSDs, and host-shared desktops and applications. To cover the widest variety of
shared storage, the simple all-flash volume type is not considered.
[Figure 6: Operational model scenarios. Recoverable content for the SMB (<600 users) row: compute and
management servers (x3550, x3650, nx360); hypervisor (ESXi, XenServer, Hyper-V); graphics acceleration
(NVIDIA GRID K1 or K2); shared storage (IBM Storwize V3700); networking (10GbE). One cell is marked
“Not Recommended”.]
The vertical axis is split into two halves: greater than 600 users is termed Enterprise and less than 600 is
termed SMB. The 600 user split is not exact and provides rough guidance between Enterprise and SMB. The
last column in Figure 6 (labelled “hyper-converged”) spans both halves because a hyper-converged solution
can be deployed in a linear fashion from a small number of users (100) up to a large number of users (>4000).
The horizontal axis is split into three columns. The left-most column represents traditional rack-based systems
with top-of-rack (TOR) switches and shared storage. The middle column represents converged systems where
the compute, networking, and sometimes storage are converged into a chassis, such as the Flex System. The
right-most column represents hyper-converged systems.
Converged systems are not generally recommended for the SMB space because the converged hardware
chassis can add overhead when only a few compute nodes are needed. Other compute nodes in the
converged chassis can be used for other workloads to make this hardware architecture more cost-effective.
To show the enterprise operational model for different sized customer environments, four different sizing
models are provided for supporting 600, 1500, 4500, and 10000 users.
To show the SMB operational model for different sized customer environments, four different sizing models are
provided for supporting 75, 150, 300, and 600 users.
To show the hyper-converged operational model for different sized customer environments, four different sizing
models are provided for supporting 300, 600, 1500, and 3000 users. The management server VMs for a
hyper-converged cluster can either be in a separate hyper-converged cluster or on traditional shared storage.
For stateless desktops, local SSDs can be used to store the base image and pooled VMs for improved
performance. Two replicas must be stored for each base image. Each stateless virtual desktop requires a
linked clone, which tends to grow over time until it is refreshed at log out. Two enterprise high-speed 400 GB
SSDs in a RAID 0 configuration should be sufficient for most user scenarios; however, 800 GB SSDs might be
needed. Because of the stateless nature of the architecture, there is little added value in configuring reliable
SSDs in more redundant configurations.
To use the latest version of ESXi, obtain the image from the following website and upgrade by using vCenter:
ibm.com/systems/x/os/vmware/.
For stateless desktops, local SSDs can be used to improve performance by storing the delta disk for MCS or
the write-back cache for PVS. Each stateless virtual desktop requires a cache, which tends to grow over time
until the virtual desktop is rebooted. The size of the write-back cache depends on the environment. Two
enterprise high-speed 200 GB SSDs in a RAID 0 configuration should be sufficient for most user scenarios;
however, 400 GB (or even 800 GB) SSDs might be needed. Because of the stateless nature of the architecture,
there is little added value in configuring reliable SSDs in more redundant configurations.
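The SSD sizes above can be turned into a rough per-desktop cache budget. The following is a minimal sketch, not a sizing rule: it assumes 150 stateless desktops per server (an illustrative density) and ignores file system and RAID controller overhead. The function names are hypothetical.

```python
def raid0_capacity_gb(drive_gb: float, drives: int = 2) -> float:
    """RAID 0 stripes data with no redundancy, so raw capacity is the sum of the drives."""
    return drive_gb * drives


def cache_budget_gb(drive_gb: float, desktops: int, drives: int = 2) -> float:
    """Average local SSD space available to each desktop's delta disk or write cache."""
    return raid0_capacity_gb(drive_gb, drives) / desktops


if __name__ == "__main__":
    # 150 desktops per server is an assumed density used only for illustration.
    for drive_gb in (200, 400, 800):
        budget = cache_budget_gb(drive_gb, desktops=150)
        print(f"2 x {drive_gb} GB in RAID 0: about {budget:.1f} GB per desktop")
```

If the observed cache growth between reboots exceeds the budget for the chosen drive size, that is a signal to move to the next SSD capacity.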
The Flex System x240 compute node has two 2.5” drives. It is possible to both install XenServer and store
stateless virtual desktops on the same two drives by using larger capacity SSDs. The System x3550 and
System x3650 rack servers do not have this restriction and use separate HDDs for Hyper-V and SSDs for local
stateless virtual desktops.
Dynamic memory provides the capability to overcommit system memory and allow more VMs at the expense of
paging. In normal circumstances, each compute server must have 20% extra memory to allow failover. When a
server goes down and the VMs are moved to other servers, there should be no noticeable performance
degradation to VMs that are already on those servers. The recommendation is to use dynamic memory, but it
should only affect the system when a compute server fails and its VMs are moved to other compute servers.
For Internet Explorer 11, the software rendering option must be used for flash videos to play correctly with
Login VSI 4.1.3.
• Disable Large Send Offload (LSO) by using the Disable-NetAdapterLso command on the
Hyper-V compute server.
• Disable virtual machine queue (VMQ) on all interfaces by using the Disable-NetAdapterVmq
command on the Hyper-V compute server.
• Apply registry changes as per the Microsoft article that is found at this website:
support.microsoft.com/kb/2681638
The changes apply to Windows Server 2008 and Windows Server 2012.
• Disable VMQ and Internet Protocol Security (IPSec) task offloading flags in the Hyper-V settings for
the base VM.
• By default, storage is shared as hidden Admin shares (for example, e$) on the Hyper-V compute server
and XenDesktop does not list Admin shares while adding the host. To make shared storage available
to XenDesktop, the volume should be shared on the Hyper-V compute server.
• Because the SCVMM library is large, it is recommended that it is accessed by using a remote share.
Compute servers are servers that run a hypervisor and host virtual desktops. There are several considerations
for the performance of the compute server, including the processor family and clock speed, the number of
processors, the speed and size of main memory, and local storage options.
The use of the Aero theme in Microsoft Windows® 7 or other intensive workloads has an effect on the
maximum number of virtual desktops that can be supported on each compute server. Windows 8 also requires
more processor resources than Windows 7, whereas little difference was observed between 32-bit and 64-bit
Windows 7. Although a slower processor can be used and still not exhaust the processor power, it is a good
policy to have excess capacity.
Another important consideration for compute servers is system memory. For stateless users, the typical range
of memory that is required for each desktop is 2 GB - 4 GB. For dedicated users, the range of memory for each
desktop is 2 GB - 6 GB. Designers and engineers that require graphics acceleration might need 8 GB - 16 GB
of RAM per desktop. In general, power users that require larger memory sizes also require more virtual
processors. This reference architecture standardizes on 2 GB per desktop as the minimum requirement of a
Windows 7 desktop. The virtual desktop memory should be large enough so that swapping is not needed and
vSwap can be disabled.
For more information, see “BOM for enterprise compute servers” on page 47.
Table 2 lists the Login VSI performance of E5-2600 v3 processors from Intel that use the Login VSI 4.1 office
worker workload with ESXi 6.0.
Two E5-2650 v3 2.30 GHz, 10C 105W | ESXi 6.0 | 239 users | 234 users
Two E5-2670 v3 2.30 GHz, 12C 120W | ESXi 6.0 | 283 users | 291 users
Two E5-2680 v3 2.50 GHz, 12C 120W | ESXi 6.0 | 284 users | 301 users
Two E5-2690 v3 2.60 GHz, 12C 135W | ESXi 6.0 | 301 users | 306 users
Table 3 lists the results for the Login VSI 4.1 knowledge worker workload.
Two E5-2680 v3 2.50 GHz, 12C 120W | ESXi 6.0 | 244 users | 237 users
Two E5-2690 v3 2.60 GHz, 12C 135W | ESXi 6.0 | 252 users | 246 users
Table 4 lists the results for the Login VSI 4.1 power worker workload.
Two E5-2680 v3 2.50 GHz, 12C 120W | ESXi 6.0 | 203 users | 202 users
These results indicate the comparative processor performance. The following conclusions can be drawn:
Between the Xeon E5-2650v3 (2.30 GHz, 10C 105W) and the Xeon E5-2680v3 (2.50 GHz, 12C 120W) series
processors are the Xeon E5-2660v3 (2.6 GHz 10C 105W) and the Xeon E5-2670v3 (2.3GHz 12C 120W)
series processors. The cost per user increases with each processor but with a corresponding increase in user
density. The Xeon E5-2680v3 processor has good user density, but the significant increase in cost might
outweigh this advantage. Also, many configurations are bound by memory; therefore, a faster processor might
not provide any added value. Some users require the fastest processor and for those users the Xeon
E5-2680v3 processor is the best choice. However, the Xeon E5-2650v3 processor is recommended for an
average configuration. The Xeon E5-2680v3 processor is recommended for power workers.
Previous Reference Architectures used Login VSI 3.7 medium and heavy workloads. Table 5 gives a
comparison with the newer Login VSI 4.1 office worker and knowledge worker workloads. The table shows that
Login VSI 3.7 is on average 20% to 30% higher than Login VSI 4.1.
Two E5-2650 v3 2.30 GHz, 10C 105W | 4.1 Office worker | 239 users | 234 users
Two E5-2650 v3 2.30 GHz, 10C 105W | 3.7 Medium | 286 users | 286 users
Two E5-2690 v3 2.60 GHz, 12C 135W | 4.1 Office worker | 301 users | 306 users
Two E5-2690 v3 2.60 GHz, 12C 135W | 3.7 Medium | 394 users | 379 users
Two E5-2690 v3 2.60 GHz, 12C 135W | 4.1 Knowledge worker | 252 users | 246 users
Two E5-2690 v3 2.60 GHz, 12C 135W | 3.7 Heavy | 348 users | 319 users
Table 6 compares the E5-2600 v2 and E5-2600 v3 processors with the Login VSI 3.7 medium and heavy
workloads.
Two E5-2650 v2 2.60 GHz, 8C 85W | 3.7 Medium | 204 users | 204 users
Two E5-2650 v3 2.30 GHz, 10C 105W | 3.7 Medium | 286 users | 286 users
Two E5-2690 v2 3.0 GHz, 10C 130W | 3.7 Medium | 268 users | 257 users
Two E5-2690 v3 2.60 GHz, 12C 135W | 3.7 Medium | 394 users | 379 users
Two E5-2690 v2 3.0 GHz, 10C 130W | 3.7 Heavy | 224 users | 229 users
Two E5-2690 v3 2.60 GHz, 12C 135W | 3.7 Heavy | 348 users | 319 users
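The statement that Login VSI 3.7 results are on average 20% to 30% higher than Login VSI 4.1 results can be checked against the Table 5 rows. The sketch below is illustrative only: it pairs each 4.1 row with the 3.7 row for the same processor, in the order the rows appear, and treats the two user counts per row as two independent result columns. The function names are hypothetical.

```python
from statistics import mean

# Each entry: (processor, 4.1 workload, 3.7 workload, 4.1 user counts, 3.7 user counts).
# The two counts per workload are the two result columns of Table 5, in order.
TABLE_5 = [
    ("E5-2650 v3", "Office worker", "Medium", (239, 234), (286, 286)),
    ("E5-2690 v3", "Office worker", "Medium", (301, 306), (394, 379)),
    ("E5-2690 v3", "Knowledge worker", "Heavy", (252, 246), (348, 319)),
]


def percent_higher(v37_users: int, v41_users: int) -> float:
    """How much higher the Login VSI 3.7 count is than the 4.1 count, in percent."""
    return (v37_users / v41_users - 1.0) * 100.0


def all_differences(rows) -> list:
    """Percent differences for every paired cell in the table."""
    diffs = []
    for _proc, _w41, _w37, counts_41, counts_37 in rows:
        for v41, v37 in zip(counts_41, counts_37):
            diffs.append(percent_higher(v37, v41))
    return diffs


if __name__ == "__main__":
    diffs = all_differences(TABLE_5)
    print(f"min {min(diffs):.1f}%, mean {mean(diffs):.1f}%, max {max(diffs):.1f}%")
```

Individual pairs vary more widely than the average, so the 20% to 30% figure is best read as a rule of thumb for converting older results rather than a per-processor constant.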
Table 7 lists the Login VSI performance of E5-2600 v3 processors from Intel that use the Office worker
workload with XenServer 6.5.
Two E5-2650 v3 2.30 GHz, 10C 105W | XenServer 6.5 | 225 users | 224 users
Two E5-2680 v3 2.50 GHz, 12C 120W | XenServer 6.5 | 274 users | 278 users
Table 8 shows the results for the same comparison that uses the Knowledge worker workload.
Two E5-2680 v3 2.50 GHz, 12C 120W | XenServer 6.5 | 210 users | 208 users
Table 9 lists the results for the Login VSI 4.1 power worker workload.
Two E5-2680 v3 2.50 GHz, 12C 120W | XenServer 6.5 | 181 users | 179 users
Table 10: Hyper-V Server 2012 R2 performance with Office worker workload
Processor with office worker workload | Hypervisor | MCS stateless | MCS dedicated
Two E5-2650 v3 2.30 GHz, 10C 105W | Hyper-V | 270 users | 272 users
Table 11 shows the results for the same comparison that uses the Knowledge worker workload.
Table 11: Hyper-V Server 2012 R2 performance with Knowledge worker workload
Processor with knowledge worker workload | Hypervisor | MCS stateless | MCS dedicated
Two E5-2680 v3 2.50 GHz, 12C 120W | Hyper-V | 250 users | 247 users
Table 12 lists the results for the Login VSI 4.1 power worker workload.
Table 12: Hyper-V Server 2012 R2 performance with power worker workload
Processor with power worker workload | Hypervisor | MCS stateless | MCS dedicated
Two E5-2680 v3 2.50 GHz, 12C 120W | Hyper-V | 216 users | 214 users
The default recommendation is the Xeon E5-2650v3 processor and 512 GB of system memory because this
configuration provides the best coverage for a range of users up to 3GB of memory. For users who need VMs
that are larger than 3 GB, Lenovo recommends the use of up to 768 GB and the Xeon E5-2680v3 processor.
Lenovo testing shows that 150 users per server is a good baseline and has an average of 76% usage of the
processors in the server. If a server goes down, users on that server must be transferred to the remaining
servers. For this degraded failover case, Lenovo testing shows that 180 users per server have an average of
89% usage of the processor. It is important to keep this 25% headroom on servers to cope with possible
failover scenarios. Lenovo recommends a general failover ratio of 5:1.
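The relationship between the 150-user baseline, the 180-user failover case, and the 5:1 failover ratio can be expressed as a short calculation. This is a sketch under one assumption that is not stated explicitly in the text: that 5:1 means five surviving servers absorb the users of one failed server. The function names are hypothetical, and the server counts are for normal mode only.

```python
import math


def failover_density(normal_density: int, ratio: int = 5) -> float:
    """Users per surviving server when `ratio` servers absorb one failed server's users."""
    return normal_density * (ratio + 1) / ratio


def servers_for(users: int, density: int) -> int:
    """Smallest whole number of servers that holds `users` at `density` users per server."""
    return math.ceil(users / density)


if __name__ == "__main__":
    print(f"Failover density: {failover_density(150):.0f} users per server")
    for users in (600, 1500, 4500, 10000):
        print(f"{users} users: {servers_for(users, 150)} servers at 150 users each")
```

Under this reading, a group of six servers at 150 users each holds 900 users, and losing one server leaves five servers carrying 180 users each, which matches the failover figure quoted above.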
Table 13 lists the processor usage with ESXi for the recommended user counts for normal mode and failover
mode.
Table 14 lists the recommended number of virtual desktops per server for different VM memory. The number of
users is reduced in some cases to fit within the available memory and still maintain a reasonably balanced
system of compute and memory.
Table 15 lists the approximate number of compute servers that are needed for different numbers of users and
VM sizes.
Table 15: Compute servers needed for different numbers of users and VM sizes
Desktop memory size (2 GB or 4 GB) 600 users 1500 users 4500 users 10000 users
Desktop memory size (3 GB) 600 users 1500 users 4500 users 10000 users
For power workers, the default recommendation is the Xeon E5-2680v3 processor and 384 GB of system
memory. For users who need VMs that are larger than 3 GB, Lenovo recommends up to 768 GB of system
memory.
Lenovo testing shows that 125 users per server is a good baseline and has an average of 79% usage of the
processors in the server. If a server goes down, users on that server must be transferred to the remaining
servers. For this degraded failover case, Lenovo testing shows that 150 users per server have an average of
88% usage of the processor. It is important to keep this 25% headroom on servers to cope with possible
failover scenarios. Lenovo recommends a general failover ratio of 5:1.
Table 16 lists the processor usage with ESXi for the recommended user counts for normal mode and failover
mode.
Table 17 lists the recommended number of virtual desktops per server for different VM memory. The number of
power users is reduced to fit within the available memory and still maintain a reasonably balanced system of
compute and memory.
Table 18 lists the approximate number of compute servers that are needed for different numbers of power
users and VM sizes.
Table 18: Compute servers needed for different numbers of power users and VM sizes
Desktop memory size 600 users 1500 users 4500 users 10000 users
4.3.2 Intel Xeon E5-2600 v3 processor family servers with Atlantis USX
Atlantis USX provides storage optimization by using a 100% software solution. There is a cost for processor
and memory usage while offering decreased storage usage and increased input/output operations per second
(IOPS). This section contains performance measurements for processor and memory utilization of USX simple
hybrid volumes and gives an indication of the storage usage and performance.
For persistent desktops, USX simple hybrid volumes provide acceleration to shared storage. For environments
that are not using Atlantis USX, it is recommended to use linked clones to conserve shared storage space.
However, with Atlantis USX hybrid volumes, it is recommended to use full clones for persistent desktops
because they de-duplicate more efficiently than the linked clones and can support more desktops per server.
For stateless desktops, USX simple hybrid volumes provide storage acceleration for local SSDs. USX simple
in-memory volumes are not considered for stateless desktops because they require a large amount of memory.
Table 19 lists the Login VSI performance of USX simple hybrid volumes using two E5-2680 v3 processors and
ESXi 6.0.
A comparison of performance with and without the Atlantis VM shows an increase of 20 - 30% with the
Atlantis VM. This result is to be expected and it is recommended that higher-end processors (like the
E5-2680v3) are used to maximize density.
For office workers and knowledge workers, Lenovo testing shows that 125 users per server is a good baseline
and has an average of 74% usage of the processors in the server. If a server goes down, users on that server
must be transferred to the remaining servers. For this degraded failover case, Lenovo testing shows that 150
users per server have an average of 83% usage of the processor. For power workers, Lenovo testing shows
that 100 users per server is a good baseline and has an average of 74% usage of the processors in the server.
If a server goes down, users on that server must be transferred to the remaining servers. For this degraded
failover case, Lenovo testing shows that 120 users per server have an average of 85% usage of the processor.
It is important to keep this 25% headroom on servers to cope with possible failover scenarios. Lenovo
recommends a general failover ratio of 5:1.
Table 20 lists the processor usage with ESXi for the recommended user counts for normal mode and failover
mode.
It is still a best practice to separate the user folder and any other shared folders into separate storage. This
configuration leaves all of the other possible changes that might occur in a full clone to be stored in the simple
hybrid volume. This configuration is highly dependent on the environment. Testing by Atlantis Computing
suggests that 3.5 GB of unique data per persistent desktop is sufficient.
Atlantis USX uses de-duplication and compression to reduce the storage that is required for VMs. Experience
shows that an 85% - 95% reduction is possible depending on the VMs.
Assuming each VM is 30 GB, the uncompressed storage requirement for 150 full clones is 4500 GB. The
simple hybrid volume needs to be only 10% of this size, which is 450 GB. To allow some room for growth and
the Atlantis VM, a 600 GB volume should be allocated.
Atlantis has some requirements on memory. The VM requires 5 GB, and 5% of the storage volume is used for
in-memory metadata. In the example of a 450 GB volume, this amount is 22.5 GB. The default size for the RAM
Assuming 4 GB for the hypervisor, 144 GB (22.5 + 112.5 + 5 + 4) of system memory should be reserved. It is
recommended that at least 384 GB of server memory is used for USX simple hybrid VDI deployments. Proof of
concept (POC) testing can help determine the actual amount of RAM.
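The worked example above can be reproduced with a few lines of arithmetic. The sketch below is illustrative: the parameter names are hypothetical, and the 112.5 GB term is taken directly from the sum in the text because its definition is cut off in the source (it happens to equal 25% of the 450 GB volume).

```python
def usx_memory_reservation_gb(
    desktops: int = 150,
    vm_size_gb: float = 30.0,
    volume_fraction: float = 0.10,    # volume size after de-duplication and compression
    metadata_fraction: float = 0.05,  # in-memory metadata as a share of the volume
    ram_term_gb: float = 112.5,       # taken as-is from the sum in the text
    usx_vm_gb: float = 5.0,
    hypervisor_gb: float = 4.0,
) -> dict:
    """Reproduce the storage and memory figures of the 150 full clone example."""
    raw_gb = desktops * vm_size_gb
    volume_gb = raw_gb * volume_fraction
    metadata_gb = volume_gb * metadata_fraction
    reserved_gb = metadata_gb + ram_term_gb + usx_vm_gb + hypervisor_gb
    return {
        "raw_gb": raw_gb,
        "volume_gb": volume_gb,
        "metadata_gb": metadata_gb,
        "reserved_gb": reserved_gb,
    }


if __name__ == "__main__":
    for name, value in usx_memory_reservation_gb().items():
        print(f"{name}: {value:.1f}")
```

Changing `desktops` or `vm_size_gb` shows how the reservation scales, but `ram_term_gb` is held fixed here, so the result is only meaningful for volumes close to the 450 GB example.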
Table 21 lists the recommended number of virtual desktops per server for different VM memory sizes.
Table 21: Recommended number of virtual desktops per server with USX simple hybrid volumes
Processor E5-2680v3 E5-2680v3 E5-2680v3
Table 22 lists the number of compute servers that are needed for different numbers of users and VM sizes. A
server with 384 GB system memory is used for 2 GB VMs, 512 GB system memory is used for 3 GB VMs, and
768 GB system memory is used for 4 GB VMs.
Table 22: Compute servers needed for different numbers of users with USX simple hybrid volumes
600 users 1500 users 4500 users 10000 users
As a result of the use of Atlantis simple hybrid volumes, the only read operations are to fill the cache for the first
time. For all practical purposes, the remaining reads are few and at most 1 IOPS per VM. Writes to persistent
storage are still needed for starting, logging in, remaining in steady state, and logging off, but the overall IOPS
count is substantially reduced.
Assuming the use of a fast, low-latency shared storage device, a single VM boot can take 20 - 25 seconds to
get past the display of the logon window and get all of the other services fully loaded. This process takes this
time because boot operations are mainly read operations, although the actual boot time can vary depending on
the VM. Citrix XenDesktop boots VMs in batches of 10 at a time, which reduces IOPS for most storage
systems but is actually an inhibitor for Atlantis simple hybrid volumes. Without the use of XenDesktop, a boot of
100 VMs in a single data store is completed in 3.5 minutes; that is, only 11 times longer than the boot of a
single VM and far superior to existing storage solutions.
Login time for a single desktop varies, depending on the VM image but can be extremely quick. In some cases,
the login will take less than 6 seconds. Scale-out testing across a cluster of servers shows that one new login
As the name implies, multiple desktops share a single VM; however, because of this sharing, the compute
resources often are exhausted before memory. Lenovo testing showed that 128 GB of memory is sufficient for
servers with two processors.
Other testing showed that the performance differences between four, six, or eight VMs is minimal; therefore,
four VMs are recommended to reduce the license costs for Windows Server 2012 R2.
For more information, see “BOM for hosted desktops” section on page 52.
Two E5-2650 v3 2.30 GHz, 10C 105W | ESXi 6.0 | Office Worker | 222 users
Two E5-2670 v3 2.30 GHz, 12C 120W | ESXi 6.0 | Office Worker | 255 users
Two E5-2680 v3 2.50 GHz, 12C 120W | ESXi 6.0 | Office Worker | 264 users
Two E5-2690 v3 2.60 GHz, 12C 135W | ESXi 6.0 | Office Worker | 280 users
Two E5-2680 v3 2.50 GHz, 12C 120W | ESXi 6.0 | Knowledge Worker | 231 users
Two E5-2690 v3 2.60 GHz, 12C 135W | ESXi 6.0 | Knowledge Worker | 237 users
Table 24 lists the processor performance results for different size workloads that use four Windows Server
2012 R2 VMs with the Xeon E5-2600v3 series processors and XenServer 6.5 hypervisor.
Two E5-2650 v3 2.30 GHz, 10C 105W | XenServer 6.5 | Office Worker | 225 users
Two E5-2670 v3 2.30 GHz, 12C 120W | XenServer 6.5 | Office Worker | 243 users
Two E5-2680 v3 2.50 GHz, 12C 120W | XenServer 6.5 | Office Worker | 262 users
Two E5-2690 v3 2.60 GHz, 12C 135W | XenServer 6.5 | Office Worker | 271 users
Two E5-2680 v3 2.50 GHz, 12C 120W | XenServer 6.5 | Knowledge Worker | 223 users
Two E5-2690 v3 2.60 GHz, 12C 135W | XenServer 6.5 | Knowledge Worker | 226 users
Table 25: Hyper-V Server 2012 R2 results for shared hosted desktops
Processor | Hypervisor | Workload | Hosted Desktops
Two E5-2650 v3 2.30 GHz, 10C 105W | Hyper-V | Office Worker | 337 users
Two E5-2680 v3 2.50 GHz, 12C 120W | Hyper-V | Office Worker | 295 users
Two E5-2680 v3 2.50 GHz, 12C 120W | Hyper-V | Knowledge Worker | 264 users
Lenovo testing shows that 170 hosted desktops per server is a good baseline. If a server goes down, users on
that server must be transferred to the remaining servers. For this degraded failover case, Lenovo recommends
204 hosted desktops per server. It is important to keep 25% headroom on servers to cope with possible failover
scenarios. Lenovo recommends a general failover ratio of 5:1.
Table 26 lists the processor usage for the recommended number of users.
Table 27 lists the number of compute servers that are needed for different numbers of users. Each compute
server has 128 GB of system memory for the four VMs.
Table 27: Compute servers needed for different numbers of users and VM sizes
600 users 1500 users 4500 users 10000 users
The Citrix XenDesktop VDI edition has fewer features than the other XenDesktop versions and might be
sufficient for the customer SMB environment. For more information and a comparison, see this website:
citrix.com/go/products/xendesktop/feature-matrix.html
Citrix XenServer has no additional license cost and is an alternative to other hypervisors. The performance
measurements in this section show XenServer and ESXi.
Provided that there is some kind of HA for the management server VMs, the number of compute servers that
are needed can be reduced at the cost of less user density. There is a cross-over point on the number of users
where it makes sense to have dedicated compute servers for the management VMs. That cross-over point
varies by customer, but often it is in the range of 300 - 600 users. A good assumption is a reduction in user
density of 20% for the management VMs; for example, 125 users reduces to 100 per compute server.
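As a rough illustration of the 20% assumption, the following Python sketch estimates how many compute servers are needed when the management VMs share the compute servers. The function name and the idea of applying the same reduction to every server are assumptions made for this example only.

```python
import math


def servers_with_management_vms(users: int, density: int, reduction_pct: int = 20) -> int:
    """Compute servers needed when management VMs share the compute servers.

    Assumption: density drops by `reduction_pct` percent on every server.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    reduced_density = density * (100 - reduction_pct) // 100
    return math.ceil(users / reduced_density)


# 125 users per server reduces to 100; 300 users then need 3 servers
print(servers_with_management_vms(300, 125))  # 3
```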
Shared storage is expensive. Some shared storage is needed to ensure user recovery if there is a failure and
the IBM Storwize V3700 with an iSCSI connection is recommended. Dedicated virtual desktops always must
be on shared storage in case of failure, but stateless virtual desktops can be provisioned to HDDs on the local
server. Only the user data and profile information must be on shared storage.
4.5.1 Intel Xeon E5-2600 v3 processor family servers with local storage
The performance measurements and recommendations in this section are for the use of stateless virtual
machines with local storage on the compute server. Persistent users must use shared storage for resilience;
for more information about persistent virtual desktops, see “Compute servers for virtual desktops” on page
14.
XenServer 6.5 formats a local datastore by using the LVMoHBA file system. As a consequence, XenServer 6.5
supports only thick provisioning and not thin provisioning. This means that VMs that are created with MCS
are large and take up too much local disk space. Instead, only PVS was used to provision stateless virtual
desktops.
Table 28 shows the processor performance results for the Xeon E5-2650v3 series of processors by using
stateless virtual machines on local HDDs with XenServer 6.5. This configuration used 12 or 16 300 GB 15k
RPM HDDs in a RAID 10 array. Measurements showed that 8 HDDs are not sufficient for the required IOPS and
the capacity requirements.
Processor Workload 12 HDDs 16 HDDs
Two E5-2680 v3 2.50 GHz, 12C 120W Office worker 211 users 211 users
Two E5-2680 v3 2.50 GHz, 12C 120W Knowledge worker 117 users 162 users
For Office worker, 12 HDDs have sufficient IOPS but the storage capacity used with 300 GB HDDs was 96%. If the
number of VMs is reduced to 125, there should be sufficient capacity for a default 6 GB write cache per VM with
some room left over for VM growth. For VMs that are larger than 30 GB, it is recommended that 600 GB 15k
RPM HDDs are used.
For Knowledge worker, 12 HDDs do not have sufficient IOPS. In this case, at least 16 HDDs should be used.
Because an SMB environment is used with a range of user sizes from fewer than 75 to 600, the following
configurations are recommended:
For small user counts, each server can support 75 users at most by using two E5-2650v3 processors,
256 GB of memory, and 12 HDDs. The extra memory is needed for the management server VMs.
For average user counts, each server supports 125 users at most by using two E5-2650v3 processors,
256 GB of memory, and 16 HDDs.
These configurations also need two extra HDDs for installing XenServer 6.5.
Table 29 shows the number of compute servers that are needed for different numbers of medium users.
Table 29: SMB Compute servers needed for different numbers of users and VM sizes
75 users 150 users 250 users 500 users
For more information, see “BOM for SMB compute servers” on page 47.
For more information, see “BOM for hyper-converged compute servers” on page 56.
4.6.1 Intel Xeon E5-2600 v3 processor family servers with Atlantis USX
Atlantis USX was tested by using the knowledge worker workload of Login VSI 4.1. Four Lenovo x3650 M5
servers with E5-2680v3 processors were networked together by using a 10 GbE TOR switch. Atlantis USX was
installed and four 400 GB SSDs per server were used to create an all-flash hyper-converged volume across
the four servers that were running ESXi 5.5 U2.
This configuration was tested with 500 dedicated virtual desktops on four servers and then three servers to see
the difference if one server is unavailable. Table 30 lists the processor usage for the recommended number of
users.
From these measurements, Lenovo recommends 125 users per server in normal mode and 150 users per
server in failover mode. Lenovo recommends a general failover ratio of 5:1.
Table 31 lists the recommended number of virtual desktops per server for different VM memory sizes.
Table 31: Recommended number of virtual desktops per server for Atlantis USX
Processor E5-2680 v3 E5-2680 v3 E5-2680 v3
Table 32 lists the approximate number of compute servers that are needed for different numbers of users and
VM sizes.
Table 32: Compute servers needed for different numbers of users for Atlantis USX
An important part of a hyper-converged system is the resiliency to failures when a compute server is
unavailable. Login VSI was run and then the compute server was powered off. This process was done during
the steady state phase. For the steady state phase, 114 - 120 VMs were migrated from the “failed” server to the
other three servers with each server gaining 38 - 40 VMs.
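The per-server figures follow from an even split of the failed server's VMs. The following Python sketch shows that split. It assumes an even redistribution, which is a simplification, and the function name is hypothetical.

```python
from typing import List


def redistribute(vms_on_failed_server: int, surviving_servers: int) -> List[int]:
    """Split the failed server's VMs as evenly as possible."""
    base, extra = divmod(vms_on_failed_server, surviving_servers)
    return [base + 1 if i < extra else base for i in range(surviving_servers)]


for vms in (114, 120):
    print(vms, redistribute(vms, 3))
# 114 [38, 38, 38]
# 120 [40, 40, 40]
```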
Figure 7 shows the processor usage for the four servers during the steady state phase when one of the servers
is powered off. The processor spike for the three remaining servers is noticeable.
There is an impact on performance and time lag if a hyper-converged server suffers a catastrophic failure, yet
VSAN can recover quite quickly. However, this situation is best avoided as it is important to build in redundancy
at multiple levels for all mission critical systems.
Dedicated GPU with one GPU per user which is called pass-through mode.
GPU hardware virtualization (vGPU) that partitions each GPU for 1 to 8 users.
The performance of graphics acceleration was tested on the NVIDIA GRID K1 and GRID K2 adapters by using
the Lenovo System x3650 M5 server and the Lenovo NeXtScale nx360 M5 server. Each of these servers
supports up to two GRID adapters. No significant performance differences were found between these two
servers when they were used for graphics acceleration and the results apply to both.
Because pass-through mode offers a low user density (eight for GRID K1 and four for GRID K2), it is
recommended that this mode is only used for power users, designers, engineers, or scientists that require
powerful graphics acceleration.
Lenovo recommends that a high-powered CPU, such as the E5-2680v3, is used for pass-through mode and
vGPU because accelerated graphics tends to put an extra load on the processor. For the pass-through mode,
with only four or eight users per server, 128 GB of server memory should be sufficient even for the high end
GRID K2 users who might need 16 GB or even 24 GB per VM. See also NVIDIA’s release notes on vGPU for
Citrix XenServer 6.0:
The Heaven benchmark is used to measure the per user frame rate for different GPUs, resolutions, and image
quality. This benchmark is graphics-heavy and is fairly realistic for designers and engineers. Power users or
knowledge workers usually have less intense graphics workloads and can achieve higher frame rates. All tests
were done by using the default H.264 compatibility display mode for the clients. This configuration offers the
best compromise between data size (compression) and performance.
Table 33 lists the results of the Heaven benchmark as frames per second (FPS) that are available to each user
with the GRID K1 adapter by using pass-through mode with DirectX 11.
Table 34 lists the results of the Heaven benchmark as FPS that are available to each user with the GRID K2
adapter by using pass-through mode with DirectX 11.
Table 35 lists the results of the Heaven benchmark as FPS that are available to each user with the GRID K1
adapter by using vGPU mode with DirectX 11. The K180Q profile has similar performance to the K1
pass-through mode.
Table 36 lists the results of the Heaven benchmark as FPS that are available to each user with the GRID K2
adapter by using vGPU mode with DirectX 11. The K280Q profile has similar performance to the K2
pass-through mode.
The GRID K2 GPU has more than twice the performance of the GRID K1 GPU, even with the high quality,
tessellation, and anti-aliasing options. This result is expected because of the relative performance
characteristics of the GRID K1 and GRID K2 GPUs. The frame rate decreases as the display resolution
increases.
For more information about the bill of materials (BOM) for GRID K1 and K2 GPUs for Lenovo System x3650
M5 and NeXtScale nx360 M5 servers, see the following corresponding BOMs:
PVS servers often are run natively on Windows servers. The testing showed that they can run well inside a VM,
if it is sized per Table 37. The disk space for PVS servers is related to the number of provisioned images.
Table 38 lists the number of management VMs for each size of users following the high availability and
performance characteristics. The number of vCenter servers is half of the number of vCenter clusters because
each vCenter server can handle two clusters of up to 1000 desktops.
PVS servers for stateless case only 2 (1+1) 4 (2+2) 8 (6+2) 14 (10+4)
ESXi management service VM 600 users 1500 users 4500 users 10000 users
vCenter servers 1 1 3 7
Each management VM requires a certain amount of virtual processors, memory, and disk. There is enough
capacity in the management servers for all of these VMs. Table 39 lists an example mapping of the
management VMs to the four physical management servers for 4500 users.
It is assumed that common services, such as Microsoft Active Directory, Dynamic Host Configuration Protocol
(DHCP), domain name server (DNS), and Microsoft licensing servers exist in the customer environment.
For shared storage systems that support block data transfers only, it is also necessary to provide some file I/O
servers that support CIFS or NFS shares and translate file requests to the block storage system. For high
availability, two or more Windows storage servers are clustered.
Based on the number and type of desktops, Table 40 lists the recommended number of physical management
servers. In all cases, there is redundancy in the physical management servers and the management VMs.
For more information, see “BOM for enterprise and SMB management servers” on page 60.
The Lenovo XClarity Administrator provides agent-free hardware management for Lenovo’s System x® rack
servers and Flex System™ compute nodes and components, including the Chassis Management Module
(CMM) and Flex System I/O modules. Figure 8 shows the Lenovo XClarity Administrator interface, where Flex
System components and rack servers are managed and are seen on the dashboard. Lenovo XClarity
Administrator is a virtual appliance that is quickly imported into a virtualized environment server configuration.
Experimentation with VDI infrastructures shows that the input/output operations per second (IOPS) performance
takes precedence over storage capacity. This precedence means that more slower-speed drives are needed to
match the performance of fewer higher speed drives. Even with the fastest HDDs available today (15k rpm),
there can still be excess capacity in the storage system because extra spindles are needed to provide the
IOPS performance. From experience, this extra storage is more than sufficient for the other types of data such
as SQL databases and transaction logs.
The large rate of IOPS, and therefore, large number of drives needed for dedicated virtual desktops can be
ameliorated to some extent by caching data in flash memory or SSD drives. The storage configurations are
based on the peak performance requirement, which usually occurs during the so-called “logon storm.” This is
when all workers at a company arrive in the morning and try to start their virtual desktops, all at the same time.
It is always recommended that user data files (shared folders) and user profile data are stored separately from
the user image. By default, this has to be done for stateless virtual desktops and should also be done for
dedicated virtual desktops. It is assumed that 100% of the users at peak load times require concurrent access
to user data and profiles.
Stateless virtual desktops are provisioned from shared storage using PVS. The PVS write cache is maintained
on a local SSD. Table 41 summarizes the peak IOPS and disk space requirements for stateless virtual
desktops on a per-user basis.
Table 42 summarizes the peak IOPS and disk space requirements for dedicated or shared stateless virtual
desktops on a per-user basis. Persistent virtual desktops require a high number of IOPS and a large amount of
disk space. Stateless users that require mobility and have no local SSDs also fall into this category. The last
three rows of Table 42 are the same as Table 41 for stateless desktops.
The sizes and IOPS for user data files and user profiles that are listed in Table 41 and Table 42 can vary
depending on the customer environment. For example, power users might require 10 GB and five IOPS for
user files because of the applications they use. It is assumed that 100% of the users at peak load times require
concurrent access to user data files and profiles.
Many customers need a hybrid environment of stateless and dedicated desktops for their users. The IOPS for
dedicated users outweigh those for stateless users; therefore, it is best to bias towards dedicated users in any
storage controller configuration.
The storage configurations that are presented in this section include conservative assumptions about the VM
size, changes to the VM, and user data sizes to ensure that the configurations can cope with the most
demanding user scenarios.
This reference architecture describes the following different shared storage solutions:
Block I/O to IBM Storwize® V7000 / Storwize V3700 storage using Fibre Channel (FC)
Block I/O to IBM Storwize V7000 / Storwize V3700 storage using FC over Ethernet (FCoE)
Block I/O to IBM Storwize V7000 / Storwize V3700 storage using Internet Small Computer System
Interface (iSCSI)
Block I/O to IBM FlashSystem 840 with Atlantis USX storage acceleration
The IBM Storwize V3700 storage system is somewhat similar to the Storwize V7000 storage, but is restricted
to a maximum of five expansion enclosures for a total of 120 drives. The maximum size of the cache for the
Storwize V3700 is 8 GB.
The Storwize cache acts as a read cache and a write-through cache and is useful to cache commonly used
data for VDI workloads. The read and write cache are managed separately. The write cache is divided up
across the storage pools that are defined for the Storwize storage system.
The tiered storage support of Storwize storage also allows a mixture of different disk drives. Slower drives can
be used for shared folders and profiles; faster drives and SSDs can be used for persistent virtual desktops and
desktop images.
To support file I/O (CIFS and NFS) into Storwize storage, Windows storage servers must be added, as
described in “Management servers” on page 30.
The fastest HDDs that are available for Storwize storage are 15k rpm drives in a RAID 10 array. Storage
performance can be significantly improved with the use of Easy Tier. If this performance is insufficient, SSDs or
alternatives (such as a flash storage system) are required.
For this reference architecture, it is assumed that each user has 5 GB for shared folders and profile data and
uses an average of 2 IOPS to access those files. Investigation into the performance shows that 600 GB
10k rpm drives in a RAID 10 array give the best ratio of input/output operation performance to disk space. If
users need more than 5 GB for shared folders and profile data then 900 GB (or even 1.2 TB), 10k rpm drives
can be used instead of 600 GB. If less capacity is needed, the 300 GB 15k rpm drives can be used for shared
folders and profile data.
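To make the per-user assumption concrete, the following Python sketch scales the 5 GB and 2 IOPS figures to the user counts used elsewhere in this document. It gives raw user-data totals only and does not account for RAID overhead or drive selection. The function name is an assumption made for this example.

```python
def user_data_requirements(users: int, gb_per_user: int = 5, iops_per_user: int = 2) -> dict:
    """Aggregate shared-folder and profile requirements at peak load.

    Assumes 100% of users are concurrently active, as the text does.
    """
    return {"capacity_gb": users * gb_per_user, "iops": users * iops_per_user}


for users in (600, 1500, 4500, 10000):
    print(users, user_data_requirements(users))
```

For example, 600 users need about 3000 GB and 1200 IOPS for shared folders and profile data under these assumptions.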
Persistent virtual desktops require both a high number of IOPS and a large amount of disk space for the linked
clones. The linked clones can grow in size over time as well. For persistent desktops, 300 GB 15k rpm drives
configured as RAID 10 were not sufficient and extra drives were required to achieve the necessary
performance. Therefore, it is recommended to use a mixture of both speeds of drives for persistent desktops
and shared folders and profile data.
Depending on the number of master images, one or more RAID 1 arrays of SSDs can be used to store the VM
master images. This configuration helps with the performance of provisioning virtual desktops; that is, a “boot storm”.
Each master image requires at least double its space. The actual number of SSDs in the array depends on the
number and size of images. In general, more users require more images.
Image size 30 GB 30 GB 30 GB 30 GB
400 GB SSD configuration RAID 1 (2) RAID 1 (2) Two RAID 1 arrays (4) Four RAID 1 arrays (8)
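The rule that each master image requires at least double its space gives a rough upper bound on the number of images per SSD array. The following Python sketch is an estimate under stated assumptions: a RAID 1 pair offers the capacity of one SSD, and formatting overhead is ignored. The function name is hypothetical.

```python
def images_per_raid1_pair(ssd_gb: int = 400, image_gb: int = 30, space_factor: int = 2) -> int:
    """Upper bound on master images that fit on one RAID 1 pair of SSDs.

    Assumptions: a RAID 1 pair offers the capacity of a single SSD, each
    image needs `space_factor` times its size, and formatting overhead
    is ignored.
    """
    return ssd_gb // (image_gb * space_factor)


print(images_per_raid1_pair())  # 6
```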
Table 44 lists the Storwize storage configuration that is needed for each of the stateless user counts. Only one
Storwize control enclosure is needed for a range of user counts. Based on the assumptions in Table 44, the
IBM Storwize V3700 storage system can support up to 7000 users only.
Table 45 lists the Storwize storage configuration that is needed for each of the dedicated or shared stateless
user counts. The top four rows of Table 45 are the same as for stateless desktops. Lenovo recommends
clustering the IBM Storwize V7000 storage system and the use of a separate control enclosure for every 2500
or so dedicated virtual desktops. For the 4500 and 10000 user solutions, the drives are divided equally across
all of the controllers. Based on the assumptions in Table 45, the IBM Storwize V3700 storage system can
support up to 1200 users.
Table 45: Storwize storage configuration for dedicated or shared stateless users
Dedicated or shared stateless storage 600 users 1500 users 4500 users 10000 users
Refer to the “BOM for shared storage” on page 64 for more details.
Persistent virtual desktops require the most storage space and are the best candidate for this storage device.
The device also can be used for user folders, snap clones, and image management, although these items can
be placed on other slower shared storage.
The amount of required storage for persistent virtual desktops varies and depends on the environment. Table
46 is provided for guidance purposes only.
Table 46: FlashSystem 840 storage configuration for dedicated users with Atlantis USX
Dedicated storage 1000 users 3000 users 5000 users 10000 users
2 TB flash module 4 8 12 0
4 TB flash module 0 0 0 12
Capacity 4 TB 12 TB 20 TB 40 TB
Refer to the “BOM for OEM storage hardware” on page 68 for more details.
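The capacities in Table 46 follow a simple pattern: the listed capacity equals the module size multiplied by the number of modules minus two. The following Python sketch reproduces the table on that basis. The pattern is inferred from the table values only and is not a statement of how the FlashSystem 840 allocates its modules. The function name is hypothetical.

```python
def table46_capacity_tb(modules: int, module_tb: int) -> int:
    """Capacity figure that matches Table 46 for a given module count.

    Assumption (inferred from the table only): two modules' worth of
    capacity is not counted toward the listed capacity.
    """
    return (modules - 2) * module_tb


configs = {1000: (4, 2), 3000: (8, 2), 5000: (12, 2), 10000: (12, 4)}
for users, (modules, size_tb) in configs.items():
    print(users, table46_capacity_tb(modules, size_tb), "TB")
```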
4.11 Networking
The main driver for the type of networking that is needed for VDI is the connection to shared storage. If the
shared storage is block-based (such as the IBM Storwize V7000), it is likely that a SAN that is based on 8 or
16 Gbps FC, 10 GbE FCoE, or 10 GbE iSCSI connection is needed. Other types of storage can be network
attached by using 1 Gb or 10 Gb Ethernet.
Also, there are user and management virtual local area networks (VLANs) that require 1 Gb or 10 Gb
Ethernet as described in the Lenovo Client Virtualization reference architecture, which is available at this
website: lenovopress.com/tips1275.
Automated failover and redundancy of the entire network infrastructure and shared storage is important. This
failover and redundancy is achieved by having at least two of everything and ensuring that there are dual paths
between the compute servers, management servers, and shared storage.
If only a single Flex System Enterprise Chassis is used, the chassis switches are sufficient and no other TOR
switch is needed. For rack servers, or for more than one Flex System Enterprise Chassis, TOR switches are required.
TOR switches 1
4.12 Racks
The number of racks and chassis for Flex System compute nodes depends upon the precise configuration that
is supported and the total height of all of the component parts: servers, storage, networking switches, and Flex
System Enterprise Chassis (if applicable). The number of racks for System x servers is also dependent on the
total height of all of the components. For more information, see the “BOM for racks” section on page 67.
4.13.1 Deployment example 1: Flex Solution with single Flex System chassis
As shown in Table 52, this example is for 1250 stateless users that are using a single Flex System chassis.
There are 10 compute nodes supporting 125 users in normal mode and 156 users in the failover case of up to
two nodes not being available. The IBM Storwize V7000 storage is connected by using FC directly to the Flex
System chassis.
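The normal and failover figures for this example can be reproduced by dividing the user count across the available nodes. The following Python sketch assumes that users are balanced evenly across the nodes that remain. The function name is illustrative.

```python
def users_per_node(total_users: int, nodes: int, failed_nodes: int = 0) -> float:
    """Average users per available node, assuming an even balance."""
    if failed_nodes >= nodes:
        raise ValueError("at least one node must remain available")
    return total_users / (nodes - failed_nodes)


for failed in range(3):
    print(failed, f"{users_per_node(1250, 10, failed):.2f}")
# 0 125.00
# 1 138.89
# 2 156.25
```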
Figure 9 shows the deployment diagram for this configuration. The first rack contains the compute and
management servers and the second rack contains the shared storage.
(Diagram summary: each compute server has 125 user VMs. M2 VMs: vCenter Server, vCenter SQL Server, License
Server, PVS Servers (2). M3 VMs: vCenter Server, Desktop SQL Server, Web Server, PVS Servers (2). M4 VMs:
Desktop SQL Server, Desktop controller, Web Server, PVS Servers (2).)
Figure 9: Deployment diagram for 4500 stateless users using Storwize V7000 shared storage
Figure 10 shows the 10 GbE and Fibre Channel networking that is required to connect the three Flex System
Enterprise Chassis to the Storwize V7000 shared storage. The detail is shown for one chassis in the middle
and abbreviated for the other two chassis. The 1 GbE management infrastructure network is not shown for the
purpose of clarity.
Redundant 10 GbE networking is provided at the chassis level with two EN4093R switches and at the rack
level by using two G8264R TOR switches. Redundant SAN networking is also used with two FC3171 switches
and two top of rack SAN24B-5 switches. The two controllers in the Storwize V7000 are redundantly connected
to each of the SAN24B-5 switches.
Assuming 125 users per server in the normal case and 150 users in the failover case, then 3000 users need 24
compute servers. A maximum of four compute servers can be down for a 5:1 failover ratio. Each compute
server needs at least 315 GB of RAM (150 x 2.1), not including the hypervisor. This figure is rounded up to
384 GB, which should be more than enough and can cope with up to 125 users, all with 3 GB VMs.
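The sizing arithmetic in the preceding paragraph can be sketched as follows. This Python example reads the 2.1 figure as GB of memory per VM, which is an interpretation of the "(150 x 2.1)" expression. The function name and dictionary keys are hypothetical.

```python
import math


def size_compute_servers(users: int, normal: int = 125, failover: int = 150,
                         gb_per_vm: float = 2.1) -> dict:
    """Server count, tolerated failures, and minimum RAM per server."""
    servers = math.ceil(users / normal)        # servers at normal density
    minimum = math.ceil(users / failover)      # servers at failover density
    return {
        "servers": servers,
        "max_servers_down": servers - minimum,
        "min_ram_gb": round(failover * gb_per_vm, 1),
    }


print(size_compute_servers(3000))
# {'servers': 24, 'max_servers_down': 4, 'min_ram_gb': 315.0}
```

The 384 GB configuration also covers 125 users with 3 GB VMs, because 125 x 3 = 375 GB.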
Each compute server is a System x3550 server with two Xeon E5-2650v2 series processors, 24 x 16 GB of
1866 MHz RAM, an embedded dual port 10 GbE virtual fabric adapter (A4MC), and a license for FCoE/iSCSI
(A2TE). For interchangeability between the servers, all of them have a RAID controller with 1 GB flash upgrade
and two S3700 400 GB MLC enterprise SSDs that are configured as RAID 0 for the stateless VMs.
In addition, there are three management servers. For interchangeability in case of server failure, these extra
servers are configured in the same way as the compute servers. All the servers have a USB key with ESXi.
There also are two Windows storage servers that are configured differently with HDDs in RAID 1 array for the
operating system. Some spare, preloaded drives are kept to quickly deploy a replacement Windows storage
server if one should fail. The replacement server can be one of the compute servers. The idea is to quickly get
a replacement online if the second one fails. Although there is a low likelihood of this situation occurring, it
reduces the window of failure for the two critical Windows storage servers.
All of the servers communicate with the Storwize V7000 shared storage by using FCoE through two TOR
RackSwitch G8264CS 10GbE converged switches. All 10 GbE and FC connections are configured to be fully
redundant. As an alternative, iSCSI with G8264 10GbE switches can be used.
For 300 persistent users and 2700 stateless users, a mixture of disk configurations is needed. All of the users
require space for user folders and profile data. Stateless users need space for master images and persistent
users need space for the virtual clones. Stateless users have local SSDs to cache everything else, which
substantially decreases the amount of shared storage. For stateless servers with SSDs, vMotion cannot be
used, so a server can be taken offline for maintenance only after all of its users are logged off. If a server
crashes, this issue is immaterial.
It is estimated that this configuration requires the following IBM Storwize V7000 drives:
This configuration requires 96 drives, which fit into one Storwize V7000 control enclosure and three expansion
enclosures.
Figure 11 shows the deployment configuration for this example in a single rack. Because the rack has 36 items,
it should have the capability for six power distribution units for 1+1 power redundancy, where each PDU has 12
C13 sockets.
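The PDU count can be checked with a short calculation. The following Python sketch assumes that every rack item has one power cord per feed and that PDUs are not shared between the two feeds. The function name is hypothetical.

```python
import math


def pdus_needed(items: int, sockets_per_pdu: int = 12, feeds: int = 2) -> int:
    """PDUs needed when every rack item is cabled to `feeds` power feeds.

    Assumption: each item has one power cord per feed (1+1 redundancy),
    and PDUs are not shared between feeds.
    """
    return feeds * math.ceil(items / sockets_per_pdu)


print(pdus_needed(36))  # 6
```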
Each virtual desktop is 40 GB. The total required capacity is 86 TB, assuming full clones of 1500 VMs, one
complete mirror of the data (FTT=1), space for swapping, and 30% extra capacity. However, the use of linked
clones for VSAN means that a substantially smaller capacity is required and 43 TB is more than sufficient.
This example uses six System x3550 M5 1U servers and two G8124E RackSwitch TOR switches. Each server
has two E5-2680v3 CPUs, 512 GB of memory, two network cards, two 400 GB SSDs, and six 1.2TB HDDs.
Two disk groups of 1 SSD and 3 HDDs are used for VSAN.
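The 43 TB figure appears to correspond to the raw HDD capacity of this configuration. The following Python sketch computes that capacity, assuming decimal terabytes and assuming that the SSDs act as cache and are not counted as capacity. The function name is hypothetical.

```python
def vsan_raw_hdd_capacity_gb(servers: int, disk_groups: int,
                             hdds_per_group: int, hdd_gb: int) -> int:
    """Raw HDD capacity in GB; SSDs are assumed to be cache only."""
    return servers * disk_groups * hdds_per_group * hdd_gb


raw_gb = vsan_raw_hdd_capacity_gb(servers=6, disk_groups=2, hdds_per_group=3, hdd_gb=1200)
print(raw_gb / 1000, "TB")  # 43.2 TB
```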
Figure 12: Deployment configuration for 750 dedicated users using hyper-converged System x servers
For 120 CAD users, 15 servers are required and three extra servers are used for failover in case of a hardware
problem. In total, three NeXtScale n1200 chassis are needed for these 18 servers in 18U of rack space.
Storwize V7000 is used for shared storage for virtual desktops and for the CAD data store. This example uses
a total of six Storwize enclosures, each with 22 HDDs and four SSDs. Each HDD is a 600 GB 15k rpm drive and
each SSD is 400 GB. Assuming RAID 10 for the HDDs, the six enclosures can store approximately 36 TB of
data. A total of 12 400 GB SSDs is more than the recommended 10% of the data capacity and is used for the
Easy Tier function to improve read and write performance.
The 120 VMs for the CAD users can use as much as 10 TB, but that use still leaves a reasonable 26 TB of
storage for CAD data.
The total rack space for the compute servers, storage, and TOR switches for 1 GbE management network,
10 GbE user network and 8 Gbps FC SAN is 39U, as shown in Figure 13.
Figure 13: Deployment configuration for 120 CAD users using NeXtScale servers with GPUs
The BOM lists in this appendix are not meant to be exhaustive and must always be double-checked with the
configuration tools. Any discussion of pricing, support, and maintenance options is outside the scope of this
document.
For connections between TOR switches and devices (servers, storage, and chassis), the connector cables are
configured with the device. The TOR switch configuration includes only transceivers or other cabling that is
needed for failover or redundancy.
System x3650 M5
Code Description Quantity
5462AC1 System x3650 M5 1
A5GU Intel Xeon Processor E5-2650 v3 10C 2.3GHz 25MB 2133MHz 105W 1
A5EM Intel Xeon Processor E5-2650 v3 10C 2.3GHz 25MB 2133MHz 105W 1
A5FD System x3650 M5 2.5" Base without Power Supply 1
A5EA System x3650 M5 Planar 1
A5FN System x3650 M5 PCIe Riser 1 (1 x16 FH/FL + 1 x8 FH/HL Slots) 1
A5R5 System x3650 M5 PCIe Riser 2 (1 x16 FH/FL + 1 x8 FH/HL Slots) 1
A5AX System x 550W High Efficiency Platinum AC Power Supply 2
6311 2.8m, 10A/100-250V, C13 to IEC 320-C14 Rack Power Cable 2
A1ML Lenovo Integrated Management Module Advanced Upgrade 1
A5EY System Documentation and Software-US English 1
A5GG x3650 M5 16x 2.5" HS HDD Assembly Kit (Dual RAID) 1
A3YZ ServeRAID M5210 SAS/SATA Controller for System x 1
A3Z2 ServeRAID M5200 Series 2GB Flash/RAID 5 Upgrade 1
2302 RAID Configuration 1
A2KB Primary Array - RAID 10 (minimum of 4 drives required) 1
A4TR 300GB 15K 6Gbps SAS 2.5" G3HS HDD 14
A5UT Emulex VFA5 2x10 GbE SFP+ PCIe Adapter for System x 1
9297 2U Bracket for Emulex 10GbE Virtual Fabric Adapter for System x 1
Select extra network connectivity for FCoE or iSCSI, 8Gb FC, or 16 Gb FC
A5UV Emulex VFA5 FCoE/iSCSI SW for PCIe Adapter for System x (FoD) 1
3591 Brocade 8Gb FC Dual-port HBA for System x 1
7595 2U Bracket for Brocade 8GB FC Dual-port HBA for System x 1
88Y6854 5m LC-LC fiber cable (networking) 2
A2XV Brocade 16Gb FC Dual-port HBA for System x 1
88Y6854 5m LC-LC fiber cable (networking) 2
Select amount of system memory
A5B7 16GB TruDDR4 Memory (2Rx4, 1.2V) PC4-17000 CL15 2133MHz LP RDIMM 16
A5B7 16GB TruDDR4 Memory (2Rx4, 1.2V) PC4-17000 CL15 2133MHz LP RDIMM 24
System x3550 M5
Because the Windows storage servers use a bare-metal operating system (OS) installation, they require much
less memory and can have a reduced configuration as listed below.
Table 44 and Table 45 on page 36 list the number of Storwize V7000 storage controllers and expansion units
that are needed for different user counts.
RackSwitch G8052
Code Description Quantity
7159G52 Lenovo System Networking RackSwitch G8052 (Rear to Front) 1
6201 1.5m, 10A/100-250V, C13 to IEC 320-C14 Rack Power Cable 2
3802 1.5m Blue Cat5e Cable 3
A3KP Lenovo System Networking Adjustable 19" 4 Post Rail Kit 1
RackSwitch G8124E
RackSwitch G8264
RackSwitch G8264CS
System x rack
Table 46 on page 37 shows how much FlashSystem storage is needed for different user counts.
Code Description Quantity
9840-AE1 IBM FlashSystem 840 1
AF11 4TB eMLC Flash Module 12
AF10 2TB eMLC Flash Module 0
AF1B 1 TB eMLC Flash Module 0
AF14 Encryption Enablement Pack 1
Select network connectivity for 10 GbE iSCSI, 10 GbE FCoE, 8 Gb FC, or 16 Gb FC
AF17 iSCSI Host Interface Card 2
AF1D 10 Gb iSCSI 8 Port Host Optics 2
AF15 FC/FCoE Host Interface Card 2
AF1D 10 Gb iSCSI 8 Port Host Optics 2
AF15 FC/FCoE Host Interface Card 2
AF18 8 Gb FC 8 Port Host Optics 2
3701 5 m Fiber Cable (LC-LC) 8
AF15 FC/FCoE Host Interface Card 2
AF19 16 Gb FC 4 Port Host Optics 2
3701 5 m Fiber Cable (LC-LC) 4
Version 1.1 5 May 2015 Added performance measurements and recommendations for
the Intel E5-2600 v3 processor family and added BOMs for
Lenovo M5 series servers.
Version 1.2 2 July 2015 Added more performance measurements and recommendations
for the Intel E5-2600 v3 processor family to include power
workers, Hyper-V hypervisor, SMB mode, and Atlantis USX
simple hybrid storage acceleration.