EMC NetWorker Release 7.6 Service Pack 1: Performance Optimization Planning Guide
EMC Corporation
Corporate Headquarters:
Hopkinton, MA 01748-9103
1-508-435-1000
www.EMC.com
Copyright © 1990-2010 EMC Corporation. All rights reserved.
EMC believes the information in this publication is accurate as of its publication date. The information is subject to change
without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO
REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION,
AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.
All other trademarks used herein are the property of their respective owners.
Contents
Preface
Chapter 1 Overview
Organization
NetWorker data flow
Preface
As part of an effort to improve and enhance the performance and capabilities of its product
lines, EMC periodically releases revisions of its hardware and software. Therefore, some
functions described in this document may not be supported by all versions of the software or
hardware currently in use. For the most up-to-date information on product features, refer to
your product release notes.
If a product does not function properly or does not function as described in this document,
please contact your EMC representative.
Audience
This document is part of the NetWorker documentation set and is intended for use by system administrators to identify the different hardware and software components that make up the NetWorker datazone. It discusses each component's impact on storage management tasks, and provides general guidelines for locating problems and their solutions.
Note: To access the E-lab Issue Tracker or the NetWorker Procedure Generator, go to
http://www.Powerlink.emc.com. You must have a service agreement to use this site.
Conventions used in this document
EMC uses the following conventions for special notices.
Note: A note presents information that is important, but not hazard-related.
! CAUTION
A caution contains information essential to avoid data loss or damage to the system
or equipment.
! IMPORTANT
An important notice contains information essential to operation of the software.
Typographical conventions
EMC uses the following type style conventions in this document:
<> Angle brackets enclose parameter or variable values supplied by the user
Where to get help
EMC support, product, and licensing information can be obtained as follows.
Product information — For documentation, release notes, software updates, or for
information about EMC products, licensing, and service, go to the EMC Powerlink
website (registration required) at: http://Powerlink.EMC.com
Technical support — For technical support, go to EMC Customer Service on
Powerlink. To open a service request through Powerlink, you must have a valid
support agreement. Please contact your EMC sales representative for details about
obtaining a valid support agreement or to answer any questions about your account.
Your comments
Your suggestions will help us continue to improve the accuracy, organization, and
overall quality of the user publications. Please send your opinion of this document to:
[email protected]
If you have issues, comments, or questions about specific information or procedures,
include the title and, if available, the part number, the revision (for example, A01), the
page numbers, and any other details that will help us locate the subject you are
addressing.
Chapter 1 Overview
Organization
This guide is organized into the following chapters:
◆ Chapter 2, “Size the NetWorker Environment,” provides details on how to
determine requirements.
◆ Chapter 3, “Tune Settings,” provides details on how to tune the backup
environment to optimize backup and restore performance.
◆ Chapter 4, “Test Performance,” provides details on how to test and understand
bottlenecks by using available tools.
Note: Figure 1 and Figure 2 are simplified diagrams, and not all interprocess communication is
shown. There are many other possible backup and recover data flow configurations.
Chapter 2 Size the NetWorker Environment
This chapter describes how to best determine backup and system requirements. The
first step is to understand the environment. Performance issues can often be
attributed to hardware or environmental issues. An understanding of the entire
backup data flow is important to determine the optimal performance that can be
expected from the NetWorker software.
This chapter includes the following topics:
◆ Expectations
◆ System components
◆ Components of a NetWorker environment
◆ Recovery performance factors
◆ Connectivity and bottlenecks
Expectations
This section describes backup environment performance expectations and required
backup configurations.
System components
Every backup environment has a bottleneck. It may be a fast bottleneck, but the
bottleneck will determine the maximum throughput obtainable in the system.
Backup and restore operations are only as fast as the slowest component in the
backup chain.
Performance issues are often attributed to hardware devices in the datazone. This
guide assumes that hardware devices are correctly installed and configured.
This section discusses how to determine requirements. For example:
◆ How much data must move?
◆ What is the backup window?
◆ How many drives are required?
◆ How many CPUs are required?
Devices on backup networks can be grouped into four component types. These are
based on how and where devices are used. In a typical backup network, the following
four components are present:
◆ System
◆ Storage
◆ Network
◆ Target device
System
The components that impact performance in system configurations are listed here:
◆ CPU
◆ Memory
◆ System bus (this determines the maximum available I/O bandwidth)
CPU requirements
To determine the optimal number of CPUs required, assume that 5 MHz of CPU power is needed to move 1 MB of data per second from a source device to a target device. For example, a NetWorker server or storage node backing up to a local tape drive at a rate of 100 MB per second requires 1 GHz of CPU power:
◆ 500 MHz is required to move data from the network to the NetWorker server or storage node.
◆ 500 MHz is required to move data from the NetWorker server or storage node to the backup target device.

Note: 1 GHz on one type of CPU does not directly compare to 1 GHz on a CPU from a different vendor.
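This rule of thumb can be turned into a quick sizing calculation. The following is a minimal shell sketch; the target backup rate is a hypothetical value, and the 5 MHz per MB/s factor and the doubling for inbound plus outbound data movement come from the guidelines above.

# Sketch: estimate CPU power needed on a NetWorker server or storage node
RATE_MBS=100                               # target aggregate backup rate in MB/s (assumption)
MHZ_PER_MBS=5                              # rule of thumb from this guide, per data movement leg
CPU_MHZ=$(( RATE_MBS * MHZ_PER_MBS * 2 ))  # x2: data moves in from the network and out to the device
echo "Estimated CPU requirement: ${CPU_MHZ} MHz"    # prints 1000 MHz (1 GHz) for 100 MB/s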
The CPU load of a system is impacted by many additional factors. For example:
◆ High CPU load is not necessarily a direct result of insufficient CPU power, but
can be a side effect of the configuration of the other system components.
◆ Drivers:
Memory requirements
Table 1 lists the minimum memory requirements for the NetWorker server, based on the number of clients in the datazone. This ensures that memory is not a bottleneck.

Table 1  Minimum memory requirements for the NetWorker server

Number of clients      Minimum memory
Less than 50           4 GB
51–150                 8 GB
Note: Avoid using old bus types or high-speed components optimized for old bus types, as they generate too many interrupts and cause CPU spikes during data transfers.
Note: The aggregate number of bus adapters should not exceed bus specifications.
Bus specifications
Bus specifications are listed in Table 2 on page 18.
Storage
The components that impact performance of storage configurations are listed here:
◆ Storage connectivity:
• Local versus SAN attached versus NAS attached
• Use of storage snapshots
The snapshot technology used determines the read performance.
◆ Storage replication:
Some replication technologies add significant latency to write access, which slows down storage access.
◆ Storage type:
• Serial ATA (SATA) computer bus is a storage-interface for connecting host bus
adapters to storage devices such as hard disk drives and optical drives.
• Fibre Channel (FC) is a gigabit-speed network technology primarily used for
storage networking.
• Flash is a non-volatile computer storage used for general storage and the
transfer of data between computers and other digital products.
◆ I/O transfer rate of storage:
I/O transfer rate of storage is influenced by different RAID levels, where the best
RAID level for the backup server is RAID1 or RAID5. Backup to disk should use
RAID3.
◆ Scheduled I/O:
If the target system is scheduled to perform I/O intensive tasks at a specific time,
schedule backups to run at a different time.
◆ I/O data:
• Raw data access offers the highest level of performance, but does not logically
sort saved data for future access.
• File systems with a large number of files have degraded performance due to
additional processing required by the file system.
◆ Compression:
If data on the disk is compressed by the operating system or an application, the data is decompressed before a backup. The CPU requires time to re-compress the files, and disk speed is negatively impacted.
Network
The components that impact network configuration performance are listed here:
◆ IP network
A computer network made of devices that support the Internet Protocol to
determine the source and destination of network communication.
◆ Storage network
The system on which physical storage, such as tape, disk, or file system resides.
◆ Network speed
The speed at which data travels over the network.
◆ Network bandwidth
The maximum throughput of a computer network.
◆ Network path
The communication path used for data transfer in a network.
◆ Network concurrent load
The point at which data is placed in a network to ultimately maximize
bandwidth.
◆ Network latency
The measure of the time delay for data traveling between source and target
devices in a network.
Target device
The components that impact performance in target device configurations are listed
here:
◆ Storage type:
• Raw disk versus Disk Appliance:
– Raw disk: Hard disk access at a raw, binary level, beneath the file system
level.
– Disk Appliance: A system of servers, storage nodes, and software.
Datazone
A datazone is a single NetWorker server and its client computers. Additional
datazones can be added as backup requirements increase.
Note: It is recommended to have no more than 1500 clients or 3000 client instances per
NetWorker datazone. This number reflects an average NetWorker server and is not a hard
limit.
The NMC often runs on the backup server, and adds significant load to the backup
server. For larger environments, it is recommended to install NMC on a separate
computer. A single NMC server can be used to administer multiple backup servers.
NetWorker server
NetWorker servers provide services to back up and recover data for the NetWorker
client computers in a datazone. The NetWorker server can also act as a storage node
and control multiple remote storage nodes.
Index and media management operations are some of the primary processes of the
NetWorker server:
◆ The client file index tracks the files that belong to a save set. There is one client file
index for each client.
◆ The media database tracks:
• The volume name
• The location of each saveset fragment on the physical media (file number/file
record)
• The backup dates of the save sets on the volume
• The file systems in each save set
◆ Unlike the client file indexes, there is only one media database per server.
◆ The client file indexes and media database can grow to become prohibitively large
over time and will negatively impact backup performance.
◆ The NetWorker server schedules and queues all backup operations, tracks real-time backup and restore activities, and handles all NMC communication. This information is stored for a limited amount of time in the jobsdb, which has the most critical impact on backup server performance for real-time operations.
Note: The data stored in this database is not required for restore operations.
Note: The typical NetWorker server workload consists of many small I/O operations. This is why disks with high latency perform poorly despite having high peak bandwidth. High latency is the most common bottleneck of a backup server in larger environments.
◆ Avoid additional software layers as this adds to storage latency. For example, the
antivirus software should be configured with the NetWorker databases (/nsr) in
its exclusion list.
◆ Plan the use of replication technology carefully as it significantly increases
storage latency.
◆ Ensure that there is sufficient CPU power for large servers to complete all internal
database tasks.
◆ Use fewer CPUs, as systems with fewer high performance CPUs outperform
systems with numerous lower performance CPUs.
◆ Do not attach a high number of high performance tape drives or AFTD devices
directly to a backup server.
◆ Ensure that there is sufficient memory on the server to complete all internal
database tasks.
◆ When possible, off-load backups to dedicated storage nodes rather than having clients save data directly to the backup server.
Note: The system load that results from storage node processing is significant in large
environments. For enterprise environments, the backup server should back up only its internal databases (index and bootstrap).
NetWorker client
A NetWorker client computer is any computer whose data must be backed up. The
NetWorker Console server, NetWorker servers, and NetWorker storage nodes are also
NetWorker clients. NetWorker clients hold mission critical data and are resource
intensive. Applications on NetWorker clients are the primary users of CPU, network,
and I/O resources. Only read operations performed on the client do not require
additional processing.
Client speed is determined by all active instances of a specific client backup at a point
in time.
NetWorker databases
The factors that determine the size of NetWorker databases are available in
“NetWorker database bottlenecks” on page 32.
Virtual environments
NetWorker clients can be created for virtual machines for either traditional backup or
VMware Consolidated Backup (VCB). Additionally, the NetWorker software can
automatically discover virtual environments and changes to those environments on
either a scheduled or on-demand basis and provides a graphical view of those
environments.
As illustrated in Figure 5 on page 29, when the network is upgraded from a 100BaseT network to a GigE network, the bottleneck moves to another device. The host is now unable to generate data fast enough to use the available network
bandwidth. System bottlenecks can be due to lack of CPU, memory, or other
resources.
Although the local volumes are performing at optimal speeds, they are unable to use
the available system, network, and target device resources. To improve the storage
performance, move the data volumes to high performance external RAID arrays.
As illustrated in Figure 8 on page 32, the external RAID arrays have improved the
system performance. The RAID arrays perform nearly as well as the other
components in the chain ensuring that performance expectations are met. There will
always be a bottleneck, however the impact of the bottleneck device is limited as all
devices are performing at almost the same level as the other devices in the chain.
Note: This section does not suggest that all components must be upgraded to improve
performance, but attempts to explain the concept of bottlenecks, and stresses the importance of
having devices that perform at similar speeds as other devices in the chain.
◆ For the NetWorker client file index database (nsr/index): The number of files
indexed and in the browse policy. This is normally the largest of the NetWorker
databases. For storage sizing, use this formula:
Index catalog size = (n+(i*d))*c*160*1.5
where:
n = number of files to backup
d = days in cycle (time between full backups)
i = incremental data change per day in percentages
c = number of cycles online (browse policy)
The statistical average is 160 bytes per entry in the catalog.
Multiply by 1.5 to accommodate growth and error
Note: The index database can be split over multiple locations, and the location is determined on a per-client basis.
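As a rough illustration of this formula, the following shell sketch plugs in hypothetical values: 5,000,000 files, a 30-day cycle, a 2 percent daily change rate, and three cycles kept online. The numbers are assumptions rather than recommendations, and the daily change i is expressed as a file count (the daily change percentage applied to n).

# Sketch: estimate client file index size using the formula above (hypothetical values)
n=5000000                     # files to back up
d=30                          # days in cycle (time between full backups)
c=3                           # cycles online (browse policy)
i=$(( n * 2 / 100 ))          # 2% daily change, expressed as files changed per day
bytes=$(( (n + i * d) * c * 160 * 3 / 2 ))   # 160 bytes per entry; the 1.5 growth factor written as *3/2
echo "Estimated index size: ${bytes} bytes"  # roughly 5.8 GB for these values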
Figure 9 on page 33 illustrates the overall performance degradation that occurs when the performance of the disk on which the NetWorker media database resides is a bottleneck.
Chapter 3 Tune Settings
The NetWorker software has various optimization features that can be used to tune
the backup environment and to optimize backup and restore performance.
This chapter includes the following topics:
◆ Optimize NetWorker parallelism
◆ Device performance tuning methods
◆ Network devices
◆ Network optimization
◆ Storage optimization
Server parallelism
The server parallelism attribute controls how many save streams the server accepts
simultaneously. The more save streams the server can accept, the faster the devices
and client disks run. Client disks can run at their performance limit or the limits of
the connections between them.
Server parallelism is not used to control the startup of backup jobs, but as a final limit
of sessions accepted by a backup server. The server parallelism value should be as
high as possible while not overloading the backup server itself.
Client parallelism
The best approach for client parallelism values is:
◆ For regular clients, use the lowest possible parallelism settings to best balance
between the number of save sets and throughput.
◆ For the backup server, set the highest possible client parallelism to ensure that index backups are not delayed. This ensures that groups complete as they should.
Backup delays often occur when client parallelism is set too low for the NetWorker server. The best approach to optimizing NetWorker client performance is to start with a client parallelism of 1, and then increase the parallelism based on client hardware and data configuration.
It is critical that the NetWorker server has sufficient parallelism to ensure index
backups do not impede group completion.
The client parallelism values for the client that represents the NetWorker server are:
◆ Never set parallelism to 1
◆ For small environments (under 30 servers), set parallelism to at least 8
◆ For medium environments (31–100 servers), set parallelism to at least 12
◆ For larger environments (100+ servers), set parallelism to at least 16
These recommendations assume that the backup server is a dedicated backup server.
The backup server should always be a dedicated server for optimum performance.
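Client parallelism is an attribute of the NetWorker client resource and can be inspected or changed in NMC or from the command line. The following is a minimal sketch using the interactive nsradmin utility; the server and client names are placeholders, the value of 12 follows the medium-environment guidance above, and the exact prompts can vary by release.

nsradmin -s backupserver.example.com
nsradmin> . type: NSR client; name: backupserver.example.com
nsradmin> show parallelism
nsradmin> print
nsradmin> update parallelism: 12
nsradmin> quit

Here the dot command selects the client resource that represents the backup server itself, print displays the current parallelism value, and update raises it to 12.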
Group parallelism
The best approach for group parallelism values is:
◆ Create save groups with a maximum of 50 clients with group parallelism
enforced. Large save groups with more than 50 clients can result in many
operating system processes starting at the same time causing temporary
operating system resource exhaustion.
◆ Stagger save group start times by a small amount to reduce the load on the operating system. For example, it is better to have four save groups, each with 50 clients, starting at 5-minute intervals, than to have one save group with 200 clients.
Multiplexing
The Target Sessions attribute sets the target number of simultaneous save streams
that write to a device. This value is not a limit, therefore a device might receive more
sessions than the Target Sessions attribute specifies. The more sessions specified for
Target Sessions, the more save sets that can be multiplexed (or interleaved) onto the
same volume.
“AFTD device target and max sessions” on page 40 provides additional information
on device Target Sessions.
Performance tests and evaluation can determine whether multiplexing is appropriate
for the system. Follow these guidelines when evaluating the use of multiplexing:
◆ Find the maximum rate of each device. Use the bigasm test described in “The
bigasm directive” on page 56.
◆ Find the backup rate of each disk on the client. Use the uasm test described in
“The uasm directive” on page 56.
If the sum of the backup rates from all disks in a backup is greater than the maximum
rate of the device, do not increase server parallelism. If more save groups are
multiplexed in this case, backup performance will not improve, and recovery
performance might slow down.
Built-in compression
Turn on device compression to increase effective throughput to the device. Some
devices have a built-in hardware compression feature. Depending on how
compressible the backup data is, this can improve effective data throughput, from a
ratio of 1.5:1 to 3:1.
Drive streaming
To obtain peak performance from most devices, stream the drive at its maximum
sustained throughput. Without drive streaming, the drive must stop to wait for its
buffer to refill or to reposition the media before it can resume writing. This can cause
a delay in the cycle time of a drive, depending on the device.
Network devices
If data is backed up from remote clients, the routers, network cables, and network
interface cards affect the backup and recovery operations. This section lists the
performance variables in network hardware, and suggests some basic tuning for
networks. The following items address specific network issues:
◆ Network I/O bandwidth:
The maximum data transfer rate across a network rarely approaches the
specification of the manufacturer because of network protocol overhead.
Note: The following statement concerning overall system sizing must be considered when
addressing network bandwidth.
Each attached tape drive (physical VTL or AFTD) uses available I/O bandwidth,
and also consumes CPU as data still requires processing.
◆ Network path:
Networking components such as routers, bridges, and hubs consume some
overhead bandwidth, which degrades network throughput performance.
◆ Network load:
• Do not attach a large number of high-speed NICs directly to the NetWorker server, as each IP address uses significant amounts of CPU resources. For example, a mid-size system with four 1 Gb NICs uses more than 50 percent of its resources to process TCP data during a backup.
• Other network traffic limits the bandwidth available to the NetWorker server
and degrades backup performance. As the network load reaches a saturation
threshold, data packet collisions degrade performance even more.
Data Domain
Backup to Data Domain storage can be configured by using multiple technologies:
◆ Backup to VTL:
NetWorker devices are configured as tape devices, and data transfer occurs over Fibre Channel.
Information on VTL optimization is available in “Number of virtual device drives
versus physical device drives” on page 41.
◆ Backup to AFTD over CIFS or NFS:
• Overall network throughput depends on the CIFS and NFS performance
which depends on network configuration.
“Network optimization” on page 42 provides best practices on backup to
AFTD over CIFS or NFS.
• Inefficiencies in the underlying transport limit backup performance to 70-80 percent of the link speed. For optimal performance, NetWorker release 7.5 Service Pack 2 or later is required.
◆ Backup to Data Domain by using the native device type:
• NetWorker 7.6 Service Pack 1 provides a new device type designed specifically for native communication to Data Domain storage over TCP/IP links.
Note: Regardless of the method used for backup to Data Domain storage, the aggregate backup performance is limited by the maximum ingress rate of the specific Data Domain model.
Network optimization
This section explains the following:
◆ “Advanced configuration optimization” on page 42
◆ “Operating system TCP stack optimization” on page 42
◆ “Advanced tuning” on page 43
◆ “Expected NIC throughput values” on page 43
◆ “Network latency” on page 43
◆ “Ethernet duplexing” on page 45
◆ “Firewalls” on page 45
◆ “Jumbo frames” on page 45
◆ “Congestion notification” on page 45
◆ “TCP buffers” on page 46
◆ “NetWorker socket buffer size” on page 47
◆ “IRQ balancing and CPU affinity” on page 47
◆ “Interrupt moderation” on page 48
◆ “TCP offloading” on page 48
◆ “Name resolution” on page 49
Note: It is required that all network components in the data path are able to handle jumbo
frames. Do not enable jumbo frames if this is not the case.
Advanced tuning
IRQ processing for high-speed NICs is very expensive, but performance can be enhanced by binding interrupt processing to specific CPU cores. Specific recommendations depend on the CPU architecture.
Network latency
Increased network TCP latency has a negative impact on overall throughput, regardless of the amount of available link bandwidth. Longer distances or more hops between
network hosts can result in lower overall throughput.
Network latency has a high impact on the efficiency of bandwidth use.
For example, Figure 10 on page 44 and Figure 11 on page 44 illustrate backup
throughput on the same network link, with varying latency.
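Before tuning anything else, it is worth getting a rough measurement of the latency on the backup path itself. The following is a minimal sketch for a UNIX or Linux host; the storage node name is a placeholder, and Windows hosts use ping -n and tracert instead.

ping -c 10 storagenode1.example.com    # 10 ICMP round trips; note the average round-trip time in ms
traceroute storagenode1.example.com    # count the hops between the backup client and the storage node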
Ethernet duplexing
Network links that perform in half-duplex mode cause decreased NetWorker traffic
flow performance. For example, a 100 Mb half-duplex link results in backup
performance of less than 1 MB/s.
On most operating systems, the default duplex setting is auto-negotiation, as recommended by IEEE 802.3. However, auto-negotiation requires that the following conditions are met:
◆ Proper cabling
◆ Compatible NIC adapter
◆ Compatible switch
Auto negotiation can result in a link performing as half-duplex.
To avoid issues with auto negotiation, force full-duplex settings on the NIC. Forced
full-duplex setting must be applied to both sides of the link. Forced full-duplex on
only one side of the link results in failed auto negotiation on the other side of the link.
Firewalls
The additional layer on the I/O path in a hardware firewall increases network
latency, and reduces the overall bandwidth use.
It is recommended to avoid using software firewalls on the backup server, because the server processes a large number of packets and a software firewall adds significant overhead.
Details on firewall configuration and impact are available in the technical note,
Configuring TCP Networks and Network Firewalls for EMC NetWorker.
Jumbo frames
It is recommended to use jumbo frames in environments capable of handling them.
If both the source and target computers, and all equipment in the data path, are capable of handling jumbo frames, increase the MTU to 9 KB.
These examples are for Linux and Solaris operating systems:
◆ Linux: ifconfig eth0 mtu 9000 up
◆ Solaris: nxge0 accept-jumbo 1
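To confirm that jumbo frames work end to end, and not just on the local interface, a do-not-fragment ping at the jumbo payload size is a quick check. This is a sketch for Linux; the hostname is a placeholder, and 8972 bytes of payload plus 28 bytes of headers equals a 9000-byte frame.

ping -M do -s 8972 -c 3 storagenode1.example.com    # succeeds only if every hop passes 9000-byte frames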
Congestion notification
This section describes how to disable congestion notification algorithms.
◆ Windows 2008 R2 only:
1. Disable optional congestion notification algorithms:
C:\> netsh interface tcp set global ecncapability=disabled
2. The Compound TCP (CTCP) advanced congestion algorithm provides the best results on Windows:
C:\> netsh interface tcp set global congestionprovider=ctcp
However, disable the advanced TCP algorithm if both sides of the network conversation are not capable of the negotiation.
◆ Linux:
1. Check for non-standard algorithms:
cat /proc/sys/net/ipv4/tcp_available_congestion_control
2. Disable ECN:
echo 0 >/proc/sys/net/ipv4/tcp_ecn
◆ Solaris:
Disable TCP Fusion if present:
set ip:do_tcp_fusion = 0x0
TCP buffers
For high-speed network interfaces, increase size of TCP send/receive buffers:
◆ Linux:
echo 262144 >/proc/sys/net/core/rmem_max
echo 262144 >/proc/sys/net/core/wmem_max
echo 262144 >/proc/sys/net/core/rmem_default
echo 262144 >/proc/sys/net/core/wmem_default
echo '8192 524288 2097152' >/proc/sys/net/ipv4/tcp_rmem
echo '8192 524288 2097152' >/proc/sys/net/ipv4/tcp_wmem
Set the recommended RPC value:
sunrpc.tcp_slot_table_entries = 64
Another method is to enable dynamic TCP window scaling. This requires
compatible equipment in the data path:
sysctl -w net.ipv4.tcp_window_scaling=1
◆ Solaris:
tcp_max_buf 10485760
tcp_cwnd_max 10485760
tcp_recv_hiwat 65536
tcp_xmit_hiwat 65536
◆ AIX
Modify the values of these parameters in /etc/rc.net if the current values are lower than the recommended ones. The number of bytes a system can buffer in the kernel on the receiving socket queue:
no -o tcp_recvspace=524288
The number of bytes an application can buffer in the kernel before the application
is blocked on a send call:
no -o tcp_sendspace=524288
◆ Windows:
• The default buffer sizes maintained by the Windows operating system are
sufficient.
• Set the registry entry:
AdditionalCriticalWorkerThreads: DWORD=10
• If the NIC drivers are able to create multiple buffers or queues at the
driver-level, enable it at the driver level. For example, Intel 10 Gb NIC drivers
by default have RSS Queues set to 2, and the recommended value for
optimum performance is 16.
• Windows Server 2008 introduces a method to auto-tune the TCP stack. If a
server on the LAN or a network device in the datazone such as a router or
switch does not support TCP Windows scaling, backups can fail. To avoid
failed backups, and ensure optimal NetWorker operations, apply the
Microsoft Hotfix KB958015 to the Windows 2008 Server, and set the auto
tuning level value to highlyrestricted:
1. Check the current TCP settings:
C:\> netsh interface tcp show global
2. If required, restrict the Windows TCP receive side scaling auto tuning level:
C:\> netsh interface tcp set global
autotuninglevel=highlyrestricted
Note: If the hotfix KB958015 is not applied, the autotuning level must be set to disabled
rather than highlyrestricted.
Note: The general rule is that only one core per physical CPU should handle NIC interrupts.
Use multiple cores per CPU only if there are more NICs than CPUs. However, transmitting and
receiving should always be handled by the same CPU without exception.
◆ Solaris:
Interrupt only one core per CPU. For example, for a system with 4 CPUs and 4
cores per CPU, use this command:
psradm -i 1-3 5-7 9-11 13-15
Additional tuning depends on the system architecture.
These are examples of successful settings on a Solaris system with a T1/T2 CPU
(Niagara):
ddi_msix_alloc_limit 8
tcp_squeue_wput 1
ip_soft_rings_cnt 64
ip_squeue_fanout 1
Some NIC drivers artificially limit interrupt rates to reduce peak CPU use. However,
this also limits the maximum achievable throughput. If a NIC driver is set for
"Interrupt moderation," disable it for optimal network throughput.
Interrupt moderation
On Windows, for a 10 Gb network, it is recommended to disable interrupt moderation for the network adapter to improve network performance.
TCP offloading
For systems with NICs capable of handling TCP packets at a lower level, enable TCP
offloading on the operating system to:
◆ Increase overall bandwidth utilization
◆ Decrease the CPU load on the system
Note: Not all NICs that market offloading capabilities are fully compliant with the standard.
◆ For a Windows 2008 server, use this command to enable TCP offloading:
C:\> netsh interface tcp set global chimney=enabled
◆ For a Windows 2008 R2 server, use these commands with additional properties to
enable TCP offloading:
C:\> netsh interface tcp set global dca=enabled
C:\> netsh interface tcp set global netdma=enabled
◆ Disable TCP offloading for older generation NIC cards that exhibit problems such
as backup sessions that hang, or fail with RPC errors similar to this:
Connection reset by peer
Name resolution
The NetWorker server relies heavily on the name resolution capabilities of the
operating system.
To avoid performance issues, configure low-latency access to the DNS server by using either of these:
◆ Local DNS cache
or
◆ Local non-authoritative DNS server with zone transfers from the main DNS
server
Ensure that the server name and hostnames assigned to each IP address on the
system are defined in the hosts file to avoid DNS lookups for local hostname checks.
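As an illustration, local hostname checks can be satisfied from the hosts file (/etc/hosts on UNIX and Linux, %SystemRoot%\system32\drivers\etc\hosts on Windows). The addresses and names below are hypothetical.

# Hypothetical hosts file entries for the backup server and a storage node
192.168.10.5     nwserver.example.com       nwserver
192.168.10.21    storagenode1.example.com   storagenode1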
Storage optimization
This section describes settings for NetWorker server and storage node disk optimization.
Note: Avoid using synchronous replication technologies or any other technology that
adversely impacts latency.
◆ For NDMP backups on the NetWorker server, use a separate location for
/nsr/tmp folder to accommodate large temporary file processing.
◆ Use the operating system to handle parallel file system I/O even if all mount
points are on the same physical location. The operating system handles parallel
file system I/O more efficiently than the NetWorker software.
◆ Use RAID-3 for disk storage for AFTD.
◆ For antivirus software, disable scanning of the NetWorker databases. If the
antivirus software is able to scan the /nsr folder, performance degradation,
time-outs, or NetWorker database corruption can occur because of frequent file
open/close requests. The antivirus exclude list should also include NetWorker
storage node locations used for Advanced File Type Device (AFTD).
Note: Excluding specific locations might not be effective if the antivirus software still scans all locations during file access, despite the exclude list, and only skips scanning previously accessed files. Contact the specific vendor to obtain an updated version of the antivirus software.
◆ For file caching, aggressive file system caching can cause commit issues for:
• The NetWorker server: all NetWorker databases can be impacted (nsr\res,
nsr\index, nsr\mm).
• The NetWorker storage node: When configured to use Advanced File Type
Device (AFTD).
Be sure to disable delayed write operations, and use file system driver Flush
and Write-Through commands instead.
◆ Disk latency considerations for the NetWorker server are higher than for typical server applications because NetWorker uses committed I/O: each write to the NetWorker internal databases must be acknowledged and flushed before the next write is attempted. This avoids potential data loss in the internal databases. These are considerations for /nsr in cases where storage is replicated or mirrored:
• Do not use software based replication as it adds an additional layer to I/O
throughput and causes unexpected NetWorker behavior.
• With hardware based replication, the preferred method is asynchronous
replication as it does not add latency on write operations.
• Do not use synchronous replication over long distance links, or links with
non-guaranteed latency.
• SANs limit local replication to 12 km and longer distances require special
handling.
• Do not use TCP networks for synchronous replication as they do not
guarantee latency.
• Consider the number of hops as each hardware component adds latency.
Chapter 4 Test Performance
This chapter describes how to test and understand bottlenecks by using available
tools including NetWorker programs such as bigasm and uasm. This chapter includes
the following topics:
◆ Determine symptoms
◆ Monitor performance
◆ Determine bottlenecks by using a generic FTP test
◆ Test the performance of the setup by using dd
◆ Test disk performance by using bigasm and uasm
Determine symptoms
Considerations for determining the reason for poor backup performance are listed
here:
◆ Is the performance consistent for the entire duration of the backup?
◆ Do the backups perform better when started at a different time?
◆ Is it consistent across all save sets for the clients?
◆ Is it consistent across all clients with similar system configuration using a specific
storage node?
◆ Is it consistent across all clients with similar system configuration in the same
subnet?
◆ Is it consistent across all clients with similar system configuration and
applications?
Observe how the client performs with different parameters. Inconsistent backup
speed can indicate problems with software or firmware.
For each NetWorker client, answer these questions:
◆ Is the performance consistent for the entire duration of the backup?
◆ Is there a change in performance if the backup is started at a different time?
◆ Is it consistent across all clients using specific storage node?
◆ Is it consistent across all save sets for the client?
◆ Is it consistent across all clients in the same subnet?
◆ Is it consistent across all clients with similar operating systems, service packs,
applications?
◆ Does the backup performance improve during the save or does it decrease?
These and similar questions can help to identify the specific performance issues.
Monitor performance
Monitor the I/O, disk, CPU, and network performance by using native performance
monitoring tools such as:
◆ Windows: Perfmon
◆ UNIX: iostat, vmstat, or netstat commands
Unusual activity before, during, and after backups can indicate that devices are using excessive resources. By using these tools to observe performance over a period of time, the resources consumed by each application, including NetWorker, can be clearly identified. If slow backups are found to be caused by excessive network use by other applications, this can be corrected by changing backup schedules.
Note: High CPU use is often the result of waiting for external I/O, not insufficient CPU power. This is indicated by high CPU use in system (kernel) space versus user space.
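As a starting point on UNIX and Linux hosts, the following commands expose the relevant counters during a backup window; the 5-second sampling interval is an assumption, and Perfmon provides the equivalent counters on Windows.

iostat -x 5    # extended per-device I/O statistics: utilization, queue depth, service times
vmstat 5       # memory, paging, and CPU run-queue statistics
netstat -i     # cumulative per-interface packet and error counters (run before and after the backup)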
Note: Do not use local volumes to create and transfer files for FTP tests; use backup volumes.
2. Make note of the time it takes for the file to transfer, and compare it with the
current tape performance.