
Using Performance Monitor To Troubleshoot SAN Performance-Final

A poorly performing SAN can result in user complaints, application jobs not completing in time, and could impact disaster recovery plans and regulatory compliance. By either identifying or eliminating the SAN as the cause of the performance issues, resolution will occur that much faster.

Uploaded by

flatexy
Copyright
© Attribution Non-Commercial (BY-NC)

USING PERFORMANCE MONITOR TO TROUBLESHOOT SAN

PERFORMANCE

Introduction
This document is intended to be used by storage administrators who are concerned about the
performance of their LeftHand storage system. The goal of this document is to provide the
reader with the knowledge and tools necessary to determine whether the SAN is performing as
expected, and if so, if there is room for more performance.

The performance of the SAN can dramatically affect the performance of the application that
depends on that storage. A poorly performing SAN can result in user complaints, application
jobs not completing in time, and could impact disaster recovery plans and regulatory compliance.
By either identifying or eliminating the SAN as the cause of the performance issues, resolution
will occur that much faster.

© 2009, LeftHand Networks. All rights reserved.


Contents

Performance Troubleshooting Flowchart
Best Practices for Gathering Statistics
Understanding Expected SAN Performance
FAQ
Examples
Glossary
Contacting Support

Performance Trends and Relationships
When analyzing SAN¹ performance, it is important to keep in mind several trends and
relationships between the statistics that are being monitored, as changing one aspect of a
workload can have a measurable impact on another. As a result, performance expectations can
change, and knowing how these variables impact one another can help to set realistic
expectations of SAN performance.

I/O Size
The size of the I/O to and from the SAN impacts the measurable performance statistics of the
SAN. Specifically, the smaller the I/O size, the more I/Os per second (IOPS) the SAN can
process. The trade-off is lower throughput (as measured in MB/s). Conversely, as I/O size
increases, IOPS decreases but throughput increases. Above a certain I/O size, latency also
increases, because the time required to transport each I/O grows to the point where the disk
itself is no longer the major influence on latency. The following chart shows the relationship
between I/O size, IOPS, and throughput.

Chart 1: Sample relationship between I/O Size, Throughput, and I/Os per Second
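The relationship in Chart 1 follows from simple arithmetic: throughput is IOPS multiplied by I/O size. The sketch below illustrates this; the function name and the sample points are hypothetical and are not measurements from any particular cluster.

```python
def throughput_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Throughput in MB/s for a given IOPS rate and I/O size (1 MB = 1024 KB)."""
    return iops * io_size_kb / 1024.0


# Hypothetical (I/O size in KB, IOPS) pairs: IOPS fall as the I/O size grows,
# yet throughput still rises because each I/O carries more data.
samples = [(8, 5000), (64, 1500), (256, 450)]
for io_size_kb, iops in samples:
    mb_s = throughput_mb_per_s(iops, io_size_kb)
    print(f"{io_size_kb:>4} KB x {iops:>5} IOPS = {mb_s:6.1f} MB/s")
```

The same formula works in reverse: dividing Throughput Total by IOPS Total gives the average I/O size of a workload.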

Queue Depth
Queue depth is the amount of outstanding I/O waiting for processing by the SAN. In other
words, it’s the count of how many pieces of data are stacked up waiting to get written to or read
from the SAN. If the queue depth is low (as defined below), it means that there are few (or no)
I/Os waiting on the SAN. Latency (or response time) is minimal as each I/O gets processed
immediately. IOPS are reduced because the SAN is waiting on I/O from the application. If
queue depth is high, there are outstanding I/Os waiting to be serviced by the SAN. This
increases IOPS, but adds latency because each I/O is waiting to be serviced instead of being
serviced immediately. The SAN performs optimally when there are enough I/Os outstanding to
keep the SAN busy, but not so many that each I/O has to wait longer than desired to get serviced.

¹ The terms “SAN” and “cluster” are used throughout this document. “SAN” is used when discussing topics that
apply to all storage; “cluster” is used when discussing topics specific to an HP LeftHand product.

The optimal queue depth for a cluster is one to two times the number of physical disks in the
SAN for SAS drives, and one times the number of physical disks for SATA. The difference
arises from how SAS and SATA handle the queuing of I/O². With the Virtualization SAN, the
optimal queue depth for any volume would be in the 40 to 48 range (24 physical disks * 2 = 48).

Disk Type               Disk Count
                        12     24     48     96
SAS (2x disk count)     24     48     96     192
SATA (1x disk count)    12     24     48     96

Table 1: Optimal queue depth by drive type and quantity
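The rule behind Table 1 can be written as a small helper. This is a minimal sketch: the function name is hypothetical, and it encodes only the multipliers stated above (one to two times the disk count for SAS, one times for SATA).

```python
def optimal_queue_depth(disk_count: int, drive_type: str) -> tuple[int, int]:
    """Return the (low, high) recommended cluster queue depth for a drive type."""
    kind = drive_type.upper()
    if kind == "SAS":
        return disk_count, 2 * disk_count   # 1x to 2x the disk count
    if kind == "SATA":
        return disk_count, disk_count       # 1x the disk count
    raise ValueError(f"unknown drive type: {drive_type!r}")


for disks in (12, 24, 48, 96):
    print(disks, optimal_queue_depth(disks, "SAS"), optimal_queue_depth(disks, "SATA"))
```

The upper bound returned for SAS matches the SAS row of Table 1.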

Latency
Latency is the amount of time the system is waiting for the I/O to complete by either getting the
data back from the SAN during reads or getting acknowledgements back for the writes.
Latencies and queue depth are correlated because as the SAN gets busier, it takes longer to
service an individual I/O, as shown in the chart below. As mentioned above, queue depth can
improve overall SAN performance to a certain point, until the SAN becomes saturated. This
saturation point is dependent on many variables, such as platform and disk type, workload to the
SAN, etc. Saturating the SAN leads to high latency, as illustrated in Chart 2.

Latencies matter because if an application is waiting on storage for data, then the users are
waiting on the application. Applications where response times are critical, such as email and
OLTP (Online Transaction Processing) systems, require lower latencies than applications such
as backup. As an example, imagine waiting 10 seconds for an email to open versus 10 seconds
for a backup job to get data from the SAN. The nature of the application determines the
tolerance for latency on the SAN.

² For more information, see http://www.serialstoragewire.net/Articles/2006_03/sysinsights16.html

Chart 2: Sample Relationship of IOPS and Latency to Queue Depth

SAS vs. SATA


Drives based on SAS technology offer higher rotational speeds, lower latency, and the ability to
queue commands more effectively than SATA, which makes SAS optimal for workloads such as
virtualization, email servers, OLTP, and other random database type applications. Higher
density drives, such as SATA, offer more gigabytes per dollar and are especially good for
workloads such as archive repositories, file shares, and staging areas for backing up data to tape.
Performance sensitive applications such as email and OLTP are best suited to higher
performance/lower latency SAS drives, and performance may not be satisfactory if those
applications are run on lower performing/higher capacity SATA drives.

Attributes

Disk type   Rotational Speed   Average disk latency   Max IOPS   Cost/GB   Reliability   Ideal for:
SAS         15,000 RPM         3.4 ms                 290        Low       High          Exchange, SQL Databases, Random workloads, Virtualization
SATA        7,200 RPM          11 ms                  100        Low       Low           Low-performing applications, Fileshare, DR Sites
Table 2: Approximate performance attributes of SAS and SATA drives

Analyzing SAN Performance
There are many variables that affect the performance of a SAN and the metrics that performance
is measured by. The preceding section discusses some of these variables, and moving forward it
is important to recognize that these variables can change the expected performance of the SAN.
Keep the preceding trends in mind while reading the following sections.

Make sure there are no hardware faults


Hardware issues are among the most common causes of performance problems. Common
hardware issues that can be detected from the hardware diagnostics panel include a disabled
cache on the RAID controller and drive problems such as complete or pending failures. To run
hardware diagnostics, expand the node(s) in question, select Hardware, right click, and select
“Run Diagnostic Tests”. The results will appear on the right side of the CMC. If any tests
return a result of “Fail”, contact HP LeftHand support.

Image 1: Example of a hardware fault that can impact cluster performance

Look for discrepancies between nodes


Discrepancies between performance counters on different nodes can be an indication of a
problem with a specific storage node or its connectivity to the network rather than a limitation of
the performance of the SAN as a whole. Key performance metrics to monitor and compare are:

• Collect performance data for each of the storage nodes, including IOPS Total, CPU
Utilization, and Network Bytes Total for each of the network cards. If one or more of the
storage nodes look like they are underperforming relative to other storage nodes in the
cluster, it’s an indicator that something is not optimized on that storage node. Some
simple things to check are network setup and connectivity, and the health of the
hardware (run hardware diagnostics via the CMC).
• Disk latencies that are higher on one or more nodes than on the other storage nodes in the
cluster can indicate that the storage node(s) is not in an optimal state. Possible causes
are a disk rebuild or a disabled cache on the controller.
• All nodes can host connections from the server to the volume. If too much traffic is
coming through one node, this could indicate that the network connection to the node is
the bottleneck. Look at the Network Bytes Total for that node. If it is equal to or near
the maximum limit for its network connections, performance may benefit from
reconnecting to the volumes through another node. Maximum throughput for two GigE
ports is ~224MB/s; a 10Gb network card has a maximum throughput of ~1120MB/s.
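The comparisons in the list above can be sketched as two checks over per-node counters. The function names, the 25% tolerance, and the 90% "near the limit" threshold are assumptions chosen for illustration. Only the link limits come from the text.

```python
from statistics import median

# Approximate maximum throughput per node connection, in MB/s (from the text above).
LINK_LIMIT_MB_S = {"2x1GbE": 224.0, "10GbE": 1120.0}


def underperforming_nodes(iops_by_node: dict, tolerance: float = 0.25) -> list:
    """Nodes whose IOPS are more than `tolerance` below the cluster median."""
    mid = median(iops_by_node.values())
    return sorted(n for n, v in iops_by_node.items() if v < mid * (1 - tolerance))


def network_bound_nodes(mb_s_by_node: dict, link: str = "2x1GbE",
                        threshold: float = 0.9) -> list:
    """Nodes whose network throughput is at or near the limit of their link."""
    limit = LINK_LIMIT_MB_S[link]
    return sorted(n for n, v in mb_s_by_node.items() if v >= limit * threshold)


# Hypothetical counter values copied by hand from the Performance Monitor.
print(underperforming_nodes({"node1": 1000, "node2": 980, "node3": 500}))
print(network_bound_nodes({"node1": 210.0, "node2": 90.0, "node3": 85.0}))
```

A node flagged by the first check is a candidate for hardware diagnostics. A node flagged by the second is a candidate for moving volume connections to another node.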

Are any volumes or storage nodes restriping or resynching?


There are several conditions that can lead to data being moved around on the cluster, transparent
to users. These include volumes restriping as a result of a change to volume properties, such
as replication level. Storage nodes may also have to resynchronize if a storage node has been
offline due to maintenance, power loss, etc. When the storage node comes back online, it
resynchronizes with the other nodes in the cluster, and this resynchronization can impact
performance. To determine if volumes are restriping, select “Volumes and Snapshots” from the
cluster and look at Status.

The restripe rate can be increased by selecting the management group, right clicking, and
selecting Edit Management Group. Adjust the Bandwidth Priority to maximize the rebuild rate if
desired, or minimize the rebuild rate if you would like the SAN to favor application I/O.

Image 2: Example of a volume during resynchronization

Image 3: Example of a volume during a restripe

Is the host initiator saturated?


For the volume or host initiator, check total throughput. For Windows servers, the maximum
throughput to the cluster is ~112MB/s for a gigabit initiator (1 Gb/s is 125 MB/s raw, less
protocol overhead), or ~1120MB/s for a 10Gb initiator. For other operating systems that
support bonded (teamed) initiators for iSCSI traffic, that number will increase, depending on the
bonding method used. To determine if the initiator or network is saturated, check the throughput
for the initiator in question via the performance monitor in the CMC or by using a tool such as
Windows Perfmon to monitor the initiator in question.
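As a rough illustration, the saturation check amounts to comparing observed throughput with the limit of the link. The names below and the 90% threshold are assumptions for the sketch. The two limits are the approximate figures quoted above.

```python
# Approximate maximum initiator throughput in MB/s, as quoted in the text.
INITIATOR_LIMIT_MB_S = {"1GbE": 112.0, "10GbE": 1120.0}


def initiator_utilization(observed_mb_s: float, link: str = "1GbE") -> float:
    """Fraction of the link's approximate maximum that is in use."""
    return observed_mb_s / INITIATOR_LIMIT_MB_S[link]


def is_saturated(observed_mb_s: float, link: str = "1GbE",
                 threshold: float = 0.9) -> bool:
    """True when throughput is at or above `threshold` of the link limit."""
    return initiator_utilization(observed_mb_s, link) >= threshold


# Hypothetical readings taken from the CMC or Perfmon.
for reading in (60.0, 105.0):
    print(f"{reading} MB/s on 1GbE: saturated={is_saturated(reading)}")
```

A reading far below the limit does not rule out a network problem by itself. A link that has negotiated a lower speed will saturate at a much lower figure.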

Has the workload changed?


If more applications or workloads have been added to the cluster, the cluster might have
naturally reached its maximum operating capacity. Table 4 gives estimates of the maximum
performance capacities of common cluster configurations for applications. If your workload
exceeds what is listed in Table 4, you may have outgrown your current cluster performance
capabilities and the cluster will need to grow by adding more nodes.

Is latency high?
As discussed above, latency is the amount of time the application waits for the I/O to
complete. High storage latencies are often the cause of user complaints of slow email, poor
application response time, etc. Table 5 lists some common applications and their optimal
and acceptable latencies. High latencies can often be reduced by either adding disks to the
cluster or by moving high I/O volumes to a dedicated cluster. To check latencies, add those
counters from the “Performance Monitor” screen in the CMC. More information on this process
can be found in the SAN User Manual for SAN/iQ 8.0.

Is queue depth high?


Queue depth is the amount of outstanding I/O waiting for processing by the SAN. If the queue
depth is low, that means the SAN has performance available to be used by applications. For
example, if the cluster queue depth is one, only one disk is getting I/O requests at a time and the
SAN will perform similarly to a single hard drive. Ideally, the queue depth should be equal to the
number of physical disks in the cluster for SATA, and about one to two times the number of
physical disks for SAS. Queue depth higher than the recommended maximums can lead to
higher latencies, which can impact application response time. To check queue depth, add those
counters from the “Performance Monitor” screen in the CMC. More information on this process
can be found in the SAN User Manual for SAN/iQ 8.0.

Performance Troubleshooting Flowchart
Start: The SAN is not performing as expected.

1. Is performance suspect across the entire cluster?
   Yes: Run hardware diagnostics on each NSM. Did all hardware diagnostics pass?
        No: Contact LeftHand Support.
        Yes: Continue to step 2.
   No: Continue to step 2.
2. Are any volumes restriping?
   Yes: Wait for restripe to complete.
   No: Continue to step 3.
3. Are the storage nodes resynching?
   Yes: Wait for resynch to complete.
   No: Continue to step 4.
4. Has any application workload changed?
   Yes: Is more data being processed?
        Yes: Increased completion time is expected as more data is being processed.
        No: Continue to step 5.
   No: Continue to step 5.
5. Is the host/initiator/network saturated?
   Yes: SAN is performing as expected. 10Gb on the host(s) might improve performance.
   No: Continue to step 6.
6. Collect performance data for affected hosts/volumes.
7. Is latency high?
   Yes: Is queue depth high?
        Yes: SAN might have reached maximum performance. Increase the performance of the
             SAN (cluster) by adding more nodes.
        No: SAN background operations may be taking place; contact support.
   No: Is queue depth high?
        Yes: SAN might have reached maximum performance. Increase the performance of the
             SAN (cluster) by adding more nodes.
        No: SAN is under-utilized. Application tuning may increase performance.
Chart 3: Performance Troubleshooting Flowchart

Best Practices for Gathering Statistics
In most SANs, performance is not constant over any specific period of time, such as a day or
week. There may be spikes in the morning, such as when workers arrive at work and hit email
servers harder than during the rest of the day. Heavy reporting or batch processing may take
place on the weekend, increasing load on servers and storage at peaks higher than the rest of the
week. It is important that statistics are captured for the times with peak workloads, whether
caused by user activity or system maintenance (such as backups).
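One way to make peak periods visible is to average collected samples per hour of the day rather than over the whole capture. The sketch below assumes the samples have already been exported as (hour, value) pairs. The function names and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def hourly_averages(samples):
    """samples: iterable of (hour_of_day, value). Returns {hour: mean value}."""
    buckets = defaultdict(list)
    for hour, value in samples:
        buckets[hour].append(value)
    return {hour: mean(values) for hour, values in buckets.items()}


def peak_hour(samples):
    """Return (hour, average) for the hour with the highest average value."""
    averages = hourly_averages(samples)
    hour = max(averages, key=averages.get)
    return hour, averages[hour]


# Hypothetical queue depth samples: quiet at 02:00, busy at 09:00.
queue_depth = [(2, 5), (2, 7), (9, 40), (9, 50), (14, 20)]
print("overall average:", mean(value for _, value in queue_depth))
print("peak hour and average:", peak_hour(queue_depth))
```

An overall average can look healthy while the peak hour is well above the recommended range, which is why the capture window should include the busiest periods.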

It’s important that the correct statistics be captured for the application being monitored. If the
application is a database, I/Os and volume latency will be more important than throughput. If the
application is streaming to a server (such as doing a backup to or from the SAN), throughput
would be a better statistic to measure.

Additionally, if the concern is around a specific server or volume, statistics for that item should
be of primary concern. The following guidelines will help to ensure that the proper statistics are
collected³.

Item being monitored    Performance Statistics to Monitor

SAN as a whole          Cluster statistics:
                        IOPS Total, Throughput Total, Average I/O Size, Queue Depth Total,
                        I/O Latency Total

Specific Server         Server initiator IQN:
                        IOPS Total, Throughput Total, Average I/O Size, Queue Depth Total,
                        I/O Latency Total

Specific Volume         For the volume(s) in question:
                        IOPS Total, Throughput Total, Average I/O Size, Queue Depth Total,
                        I/O Latency Total

Specific NSM            Storage Node Statistics:
                        CPU Utilization, Network Bytes Total: Motherboard: Port1,
                        Network Bytes Total: Motherboard: Port2, IOPS Total,
                        Throughput Total, Queue Depth Total, and I/O Latency Total

Table 3: Relevant Performance Monitor statistics for a specific object

³ See the SAN User Manual for SAN/iQ 8.0 for more information on how to collect performance data
(http://www.lefthandnetworks.com/document.aspx?oid=a0e00000000013yAAA)

Understanding Expected SAN Performance


Workload
Typically workloads can be defined by four characteristics: I/O size, reads vs. writes, sequential
vs. random, and queue depth. A typical application usually consists of a mix of reads and writes,
and of sequential and random I/O. For example, a typical Microsoft Exchange 2007 server has a
workload that is 8k in I/O size, 70% read, and 20% sequential. The type of workload will affect
the results of the performance measurement. For example, a cluster of three NSM 2120 storage
nodes with hardware RAID 5 and Network RAID level 2 will see less than half the throughput on
64k sequential writes vs. 64k sequential reads (assuming proper queue depth). This is because
when a write is sent to the SAN, it gets written in two places as a result of Network RAID level 2.
Writes to a hardware RAID 5 volume are also slower than reads from a RAID 5 volume, because
of the parity overhead of the RAID 5 writes. When analyzing the performance of your LeftHand
SAN, take the workload into consideration.
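The "written in two places" effect can be put in numbers. The sketch below counts only the copies made by Network RAID. It does not model hardware RAID 5 parity overhead, which the text says reduces write throughput further. The function name is hypothetical.

```python
def backend_write_mb_s(host_write_mb_s: float, network_raid_level: int = 2) -> float:
    """Data written across the storage nodes per second for a given host write rate.

    Each host write is stored `network_raid_level` times, so the cluster
    absorbs that multiple of the host's write rate.
    """
    return host_write_mb_s * network_raid_level


# A host writing 100 MB/s to a Network RAID level 2 volume.
print(backend_write_mb_s(100.0))
```

Because reads are served from a single copy while every write is stored twice, write throughput below half of read throughput is expected behavior and not by itself a sign of a fault.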

Queue Depth and Latency as an Indicator of Cluster Performance


As mentioned above, to maximize the efficiency of the cluster a balance between queue depth
and latency must be found.

As a general rule of thumb, for operations such as databases and email servers where
disks become the bottleneck, the queue depth of the workload for the cluster is
recommended to be no larger than the number of physical disk drives in the SAN for
SATA drives, and no larger than one to two times that number for SAS drives. For
example, a three node cluster of 2120 storage nodes with 36 SAS drives would have an
optimal queue depth of around 60.

If your queue depth is higher than recommended and your latency is high, that is a key indicator
that your cluster is saturated. The solution to this is to add more nodes to the cluster. If your
latency is high but your queue depth is low, it indicates some workload on the cluster that is not
related to application workload, such as a volume being restriped across a storage node that was
added. Applications that are sensitive to response time, such as OLTP and email, require lower
latencies. Operations such as backups and batch processing or report generation can tolerate
higher latencies from the cluster because there are no users actively waiting for a reply from the
system.
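The reasoning above, together with the last decision in the troubleshooting flowchart, can be summarized as a four-way classification. This is a sketch only: the function name and the returned wording are hypothetical, and the two thresholds must come from your own disk count and application tolerance.

```python
def diagnose(queue_depth: float, latency_ms: float,
             recommended_queue_depth: float, acceptable_latency_ms: float) -> str:
    """Classify cluster state from sustained queue depth and latency."""
    high_queue = queue_depth > recommended_queue_depth
    high_latency = latency_ms > acceptable_latency_ms
    if high_latency and high_queue:
        return "saturated: add nodes to the cluster"
    if high_latency:
        return "background work likely, for example a restripe"
    if high_queue:
        return "may have reached maximum performance"
    return "under-utilized: application tuning may help"


# Hypothetical daytime readings for a 24-disk SAS cluster (recommended depth up to 48).
print(diagnose(queue_depth=60, latency_ms=120,
               recommended_queue_depth=48, acceptable_latency_ms=100))
```

The inputs should be sustained values from the peak period, since brief spikes in either counter are normal.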

Application                        Virtualization SAN   Multi-Site SAN   6 Total Nodes   8 Total Nodes

Exchange (IOPS/Users)              3,400 IOPS/          6,800 IOPS/      10,200 IOPS/    13,600 IOPS/
                                   4,800 users          9,600 users      14,400 users    19,200 users

SQL (8k Random Reads)              5,200 IOPS           10,400 IOPS      15,600 IOPS     20,800 IOPS

Virtual Servers (Maximum Guests)   136                  272              408             544

File Share (Maximum Throughput)    220 MB/s             440 MB/s         660 MB/s        880 MB/s

Table 4: Expected Performance for NSM 2120-G2 SAS with Network RAID 2 in Hardware RAID 5
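The figures in Table 4 grow in equal steps from one column to the next, so a rough node count for a target workload can be estimated by scaling. The sketch below uses the SQL row and assumes that the first column describes a two-node configuration, with each later column adding two nodes. That node count is an inference from the 6 and 8 node columns, not something the table states, and the function name is hypothetical.

```python
import math

# Table 4 figure for 8k random reads on the smallest configuration.
# Assumption: that configuration is two nodes, so capacity is added in pairs.
SQL_IOPS_PER_NODE_PAIR = 5200


def nodes_for_sql_iops(target_iops: float) -> int:
    """Estimated node count needed to reach `target_iops` of 8k random reads."""
    pairs = max(1, math.ceil(target_iops / SQL_IOPS_PER_NODE_PAIR))
    return 2 * pairs


for target in (4000, 12000, 20800):
    print(target, "IOPS ->", nodes_for_sql_iops(target), "nodes")
```

Treat the result as a starting point for sizing. Real workloads mix reads and writes and will not match the 8k random read profile exactly.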

The following table offers preferred sustained response times for specific application types based
on customer experience. Keep in mind that while these numbers are based on known best
practices, your individual requirements and tolerances for latencies can vary greatly from these
numbers. If your cluster is performing to your expectations, no action or concern is required by
you. Also keep in mind that latency on the cluster can spike from time to time. Brief spikes are
not cause for concern; sustained periods of high latency for latency-sensitive applications
indicate that more nodes should be added to the cluster.

Application                            Typical optimal    Typical maximum
                                       response time      acceptable response time

Email (Exchange, Lotus Notes, etc.)    20ms               <100ms

Database/OLTP                          20ms               <100ms
(SQL Server, Oracle, etc.)

File Share                             Up to 100ms        <200ms

Batch Processing/Reporting             Up to 100ms        Varies widely

Backup                                 Varies widely      Varies widely

Table 5: Optimal disk latencies for applications
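The thresholds in Table 5 can be applied to a sustained latency reading as follows. The dictionary keys, function name, and rating labels are hypothetical. Entries the table gives as "Varies widely" are stored as None and reported as site-specific instead of being guessed.

```python
# (optimal_ms, max_acceptable_ms) from Table 5; None means "Varies widely".
LATENCY_TARGETS_MS = {
    "email": (20, 100),
    "database": (20, 100),
    "file share": (100, 200),
    "batch": (100, None),
    "backup": (None, None),
}


def rate_latency(application: str, sustained_latency_ms: float) -> str:
    """Rate a sustained latency reading against the Table 5 guidance."""
    optimal, maximum = LATENCY_TARGETS_MS[application]
    if optimal is not None and sustained_latency_ms <= optimal:
        return "optimal"
    if maximum is None:
        return "site-specific"
    return "acceptable" if sustained_latency_ms < maximum else "too high"


print(rate_latency("email", 12))
print(rate_latency("email", 120))
print(rate_latency("backup", 300))
```

Feed the function an average over a sustained period, not a single sample, since brief spikes are not a cause for concern.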

Investigating background tasks
The cluster runs background operations that can impact the performance that is available to hosts
and applications. Some examples of background operations include volume restripes as storage
nodes are added to the cluster, or the synchronization of storage nodes that have been offline and
been brought back online. These operations will be evidenced by the IOPS of the cluster being
higher than the total IOPS of the volumes on the cluster, or as cluster IOPS when there are no
volumes connected to the cluster. These are normal background tasks and should not be cause
for alarm; however, as they do consume system resources, they should factor in to your SAN
performance sizing.
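The comparison described above, cluster IOPS against the sum of volume IOPS, is a simple subtraction. A minimal sketch with a hypothetical function name and hypothetical readings:

```python
def background_iops(cluster_iops: float, volume_iops: list) -> float:
    """IOPS on the cluster that are not accounted for by any volume."""
    return max(0.0, cluster_iops - sum(volume_iops))


# Hypothetical readings: IOPS Total for the cluster and for each volume.
print(background_iops(5000, [1800, 2200]))   # work not tied to a volume
print(background_iops(300, []))              # activity with no volumes connected
```

A persistently large difference suggests a restripe or resynchronization is in progress and should be allowed for when sizing the SAN.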

Hardware RAID rebuild tasks can also impact performance, such as when a physical hard drive
is replaced in the cluster. This operation is not visible via the Performance Monitor, but can be
seen in the Centralized Management Console in the “Storage” section for that individual storage
node.

Background task             How to identify                                               Action

SAN/Node Initialization     High Storage Node IOPS                                        Wait for completion

Volume Restripe             Volume Status on Volume Details Tab in CMC                    Wait for completion

Volume Resynchronization    Volume Status on Volume Details Tab in CMC                    Wait for completion

Node Restripe               Volume Status on Volume Details Tab in CMC for all volumes    Wait for completion

Node Resynchronization      Volume Status on Volume Details Tab in CMC for all volumes    Wait for completion

RAID Rebuild                RAID Status on Disk Setup tab of Storage in CMC               Wait for completion

Table 6: Identification and actions for SAN/iQ background tasks

FAQ
I just built a new cluster, and have not created or attached any volumes. Why do I see
activity on the storage nodes?
When storage nodes are added to a management group or a new management group is created, a
background initialization process occurs on the storage nodes in the management group. This is
a one-time occurrence, and it explains the activity on the storage nodes. This activity occurs as
quickly as an individual storage node can write to its disks, and will appear on the Performance
Monitor as a throughput of up to ~200 MB/s of local throughput on each storage node,
depending on the type of drive and the hardware RAID level. This will show up as local I/O and
will not be measured over the network. If these tasks have completed, there will still be some
background activity going on, as SAN/iQ does preventive maintenance on the cluster, checking
for and repairing bad blocks on the physical disks. This is expected and should not be a cause
for concern.

I just deleted a volume/snapshot. Why do I see activity on the storage nodes?


This initialization also occurs when a volume or snapshot is deleted, as SAN/iQ writes zeros over
the deleted volume or snapshot as a security measure and a validation of the integrity of the disk.
If these tasks have completed, there will still be some background activity going on, as SAN/iQ
does preventive maintenance on the cluster, checking for and repairing bad blocks on the
physical disks. This is expected and should not be a cause for concern.

I notice that CPU usage of one storage node is higher than the rest. Is something wrong
with my storage node?
The storage node hosting the Virtual IP address or the coordinating manager might also have
higher CPU usage during certain operations. This is to be expected and not cause for concern. If
you’re not noticing any unexpected performance changes with your applications, there is likely
nothing wrong with the hardware. The cause of the higher CPU usage is probably related to day
to day SAN operations, such as Remote Snapshots or managing the iSCSI sessions to the cluster.
Unless this continues for a prolonged amount of time, there is nothing to worry about in this
example.

CPU/IOPS/Throughput spike occasionally. Why is this?


There are several possible causes for this. Some possible causes include bursty I/O from the
application or I/O directly to or from cache on the storage node. Regardless of the cause, the
important thing to understand is that temporary drops or spikes in I/O to and from the cluster are
considered a normal part of the cluster operation and should not be cause for concern.

I notice that CPU usage of storage nodes occasionally is at 100%. Is something wrong
with my cluster?
If you’re not noticing any unexpected performance changes with your applications, there is likely
nothing wrong with the cluster. The cause of the higher CPU usage is probably related to day to
day cluster operations, such as Snapshot deletes. Unless this continues for a prolonged amount
of time, there is nothing to worry about in this example.

CPU/IOPS/Throughput drop to zero occasionally. Why is this?
There are several reasons that this can happen. Some possible causes include the disk controller
flushing I/O to disk, or the application may have temporarily stopped requesting I/O from the
cluster. Regardless of the cause, the important thing to understand is that temporary drops or
spikes in I/O to and from the cluster are considered a normal part of the cluster operation and
should not be cause for concern.

Examples

Example 1: Backups take unusually long to complete


Step 1: Define the problem.

Application administrator complains that backups are taking twice as long to complete as they
did a week ago.

**Note: The problem might also appear as a slower throughput from the cluster to the backup
server.

Step 2: Identify other possible performance issues

None of the other system administrators have performance concerns

Step 3: Identify any changes in the application workload.

The application administrator says that the workload is relatively unchanged. No new data is
being backed up and the tape devices appear to be functioning properly.

**Note: If the application data being backed up had doubled in size, that also would have led to
the backup taking twice as long (twice as much data at the same speed means twice as much time
to back it up). This is why it is important to have a clear understanding of the problem.

Step 4: Check to see if the initiator is saturated.

Using the Performance Monitor, the administrator sees that throughput from the SAN to the
backup server averages 9 MB/s, with a maximum of 11 MB/s. The administrator investigates
the initiator on the backup server further and finds that the switch port the initiator is
plugged in to has auto-negotiated to 100 Mb/s, which caps throughput at about 12.5 MB/s before
protocol overhead. Changing the port to 1000 Mb/s speeds up the backup job, which completes
in its usual time frame.

Example 2: Users complain about slow email response time


Step 1: Define the problem.

With the SAS Starter SAN, users complain that opening Outlook can take up to a minute –
opening any single email can take up to 10 seconds.

Step 2: Identify other possible performance issues.

The web administrator mentions that he’s also had complaints of long load times for some pages,
and the DBA mentions that reports run during the day are starting to take longer to complete.
The backup administrator has no performance issues with the backups he runs at midnight.

Step 3: Run hardware diagnostics on every node in the cluster

As the performance issues seem to be occurring against the SAN as a whole, run hardware
diagnostics on each node to identify any node that might have hardware faults, a possible cause
of performance degradation. In this example, all tests on all nodes pass.

**Note: If any of the nodes shows a hardware fault (a diagnostic test returns a “Fail”
result), run performance counters as outlined in Table 3 against all of the nodes. Look for
discrepancies between nodes that would indicate a performance impact from the hardware fault.

Step 4: Identify any changes in the application workload.

None of the application administrators can identify any major changes to their applications or
size of their data.

Step 5: Check to see if the initiator is saturated.

None of the initiators for the servers experiencing performance problems are near their
maximum throughput. To be safe, the network configuration is checked and all initiators are set
to 1000Mb/s full-duplex. Flow control is also enabled.

Step 6: Check if queue depth on the cluster is high

The Performance Monitor is set to collect data for 48 hours with the following cluster counters:
IOPS total, Throughput Total, Average I/O Size, Queue Depth Total, I/O Latency Total. Queue
depth averages 10, but during the day, queue depth is much higher, often in the 40-50 range.
This correlates to the times of user complaints.

Step 7: Check if latency on the SAN is high

Using the performance counters collected in Step 6, latency averages 12ms, but again, during the
day, latency is much higher, with many spikes over 100ms. At night, latency is very low, except
during the time that the backup operations run.

Based on the information gathered through the Performance Monitor, all signs point to the SAN
being overloaded during the day, when latencies and queue depth are high. At night, latencies
and queue depth are high only while the backup jobs run. Because the backups are not latency
sensitive, this is not a problem. Because email and the database operations during the day are
latency sensitive, latency must be decreased. The proper solution to this is to add more nodes
(and thus more performance) to the cluster.

Glossary
Bond – The combining of two or more physical network interfaces into a single logical interface
for performance and fault tolerance. Also referred to as a “team”

CMC – The Centralized Management Console. This is the application which serves as a single
pane of glass for monitoring and managing all HP LeftHand storage nodes.

Disk – A single physical hard drive.

Cluster - A cluster is a grouping of storage nodes that create the storage pool from which you
create volumes. This can also be referred to as the SAN.

Initiator - An initiator functions as an iSCSI client. An initiator typically serves the same
purpose to a computer as a SCSI bus adapter would, except that instead of physically cabling
SCSI devices (like hard drives and tape changers), an iSCSI initiator sends SCSI commands over
an IP network.

I/O – An I/O is a read or write to a drive. I/O is usually measured in IOPS – I/Os Per Second.

Latency - In this document, latency is the amount of time the system waits for an I/O to
complete. For an individual disk drive, rotational latency is the time it takes for a particular
sector to pass under the read/write head of a disk after the head is positioned over the
appropriate disk track.

NIC - A computer hardware component designed to allow computers to communicate over a
computer network. Also called a network card, network adapter, network interface controller,
network interface card, or LAN adapter.

Perfmon – The Windows Performance Monitor console, which can be used to monitor and
collect performance statistics for a number of Microsoft objects.

Queue Depth - Amount of outstanding I/O waiting for processing by the SAN

RAID - RAID (originally redundant array of inexpensive disks, now redundant array of
independent disks) refers to a data storage scheme using multiple hard drives to share or replicate
data among the drives. The benefit of RAID is higher performance and/or availability than
stand-alone drives. The different RAID types supported by HP LeftHand are:

• RAID 0 - data striped across disk set (select platforms only)
• RAID 10 - data striped across mirrored (RAID 1) sets of disks
• RAID 5 - data blocks are distributed across all disks in a RAID set. Redundant
information is stored as parity distributed across the disks.
• RAID 50 - data striped across sets of RAID 5 disks.

Restripe - Striped data is stored across all disks in the cluster. You might change the
configuration of a volume, for example, change replication level, add a storage node, or remove
a storage node. Because of your change, the pages in the volume must be reorganized across the
new configuration. This reorganization is known as a restripe.

Resynchronization - When a storage node goes down, writes continue to a second storage
node. When the original storage node comes back up, it needs to recover the exact data
captured by the second storage node. This recovery is known as resynchronization.

Rotational Speed – The speed at which the platters in the physical hard drive spin. Higher
rotational speeds lead to higher performance because of reduced rotational latency and the
ability to pull data off the drive faster.

SAS - Short for Serial Attached SCSI, an evolution of parallel SCSI into a point-to-point serial
peripheral interface in which controllers are linked directly to disk drives. SAS is a performance
improvement over traditional SCSI; its full-duplex signal transmission supports 3.0Gb/s. In
addition, SAS drives can be hot-plugged.

SATA - Serial ATA (Serial Advanced Technology Attachment or SATA) is a new standard for
connecting hard drives into computer systems. As its name implies, SATA is based on serial
signaling technology, unlike IDE (Integrated Drive Electronics) hard drives that use parallel
signaling

Seek Time - Refers to the time a program or device takes to locate a particular piece of data

Snapshot - A point in time image of a volume for use with backup and other applications. The
data on a snapshot does not change, and a live volume can revert back to a snapshot to bring the
data back to a previous state.

Spindle – Another name for disk, a single physical hard drive.

Team – Another name for Bond, the combining of two or more physical network interfaces into
a single logical interface for performance and fault tolerance.

Volume - A logical entity that is made up of storage on one or more storage nodes. It can be
used as raw data storage or it can be formatted with a file system and used by a host or file
server.

Contacting Support
Contact LeftHand Networks Support if you have any questions on the above information:

North America:

Basic Contract Customers
1.866.LEFT-NET (1.866.533.8638)
303.217.9010
http://support.lefthandnetworks.com

Premium Contract Customers
1.888.GO-SANIQ (1.888.467.2647)
303.625.2647
http://support.lefthandnetworks.com

EMEA:

All Customers
00.800.5338.4263 (Int'l Toll-Free number)
+1.303.625.2647 (US number)
http://support.lefthandnetworks.com
