
Elastic Storage Server

Version 5.3.1

Problem Determination Guide

IBM

GC27-9272-00
Note
Before using this information and the product it supports, read the information in “Notices” on page 97.

This edition applies to version 5.3.1 of the Elastic Storage Server (ESS) for Power, to version 5 release 0 modification
1 of the following product, and to all subsequent releases and modifications until otherwise indicated in new
editions:
v IBM Spectrum Scale RAID (product number 5641-GRS)
Significant changes or additions to the text and illustrations are indicated by a vertical line (|) to the left of the
change.
IBM welcomes your comments; see the topic “How to submit your comments” on page ix. When you send
information to IBM, you grant IBM a nonexclusive right to use or distribute the information in any way it believes
appropriate without incurring any obligation to you.
© Copyright IBM Corporation 2014, 2018.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
Contents

Tables   v

About this information   vii
  Related information   vii
  Conventions used in this information   viii
  How to submit your comments   ix

Chapter 1. Drive call home in 5146 and 5148 systems   1
  Background and overview   1
  Installing the IBM Electronic Service Agent   2
  Login and activation   3
  Electronic Service Agent configuration   4
  Creating problem report   7
  Uninstalling and reinstalling the IBM Electronic Service Agent   12
  Test call home   12
  Callback Script Test   13
  Post setup activities   14

| Chapter 2. Software call home   15

Chapter 3. Re-creating the NVR partitions   17

Chapter 4. Re-creating NVRAM pdisks   19

Chapter 5. Steps to restore an I/O node   21

Chapter 6. Best practices for troubleshooting   27
  How to get started with troubleshooting   27
  Back up your data   27
  Resolve events in a timely manner   28
  Keep your software up to date   28
  Subscribe to the support notification   28
  Know your IBM warranty and maintenance agreement details   29
  Know how to report a problem   29

Chapter 7. Limitations   31
  Limit updates to Red Hat Enterprise Linux (ESS 5.3)   31

Chapter 8. Collecting information about an issue   33

Chapter 9. GUI Issues   35
  Issue with loading GUI   35

Chapter 10. Recovery Group Issues   37

Chapter 11. Contacting IBM   39
  Information to collect before contacting the IBM Support Center   39
  How to contact the IBM Support Center   41

Chapter 12. Maintenance procedures   43
  Updating the firmware for host adapters, enclosures, and drives   43
  Disk diagnosis   44
  Background tasks   45
  Server failover   46
  Data checksums   46
  Disk replacement   46
  Other hardware service   47
  Replacing failed disks in an ESS recovery group: a sample scenario   47
  Replacing failed ESS storage enclosure components: a sample scenario   52
  Replacing a failed ESS storage drawer: a sample scenario   53
  Replacing a failed ESS storage enclosure: a sample scenario   59
  Replacing failed disks in a Power 775 Disk Enclosure recovery group: a sample scenario   66
  Directed maintenance procedures available in the GUI   72
    Replace disks   72
    Update enclosure firmware   73
    Update drive firmware   73
    Update host-adapter firmware   74
    Start NSD   74
    Start GPFS daemon   74
    Increase fileset space   75
    Synchronize node clocks   75
    Start performance monitoring collector service   75
    Start performance monitoring sensor service   76

Chapter 13. References   77
  Events   77
  Messages   77
    Message severity tags   77
    IBM Spectrum Scale RAID messages   79

Notices   97
  Trademarks   98

Glossary   101

Index   107
Tables

1. Conventions   viii
2. IBM websites for help, services, and information   29
3. Background tasks   45
4. ESS fault tolerance for drawer/enclosure   54
5. ESS fault tolerance for drawer/enclosure   60
6. DMPs   72
7. IBM Spectrum Scale message severity tags ordered by priority   78
8. ESS GUI message severity tags ordered by priority   78
About this information
This information guides you in monitoring and troubleshooting the Elastic Storage Server (ESS) Version
5.x for Power® and all subsequent modifications and fixes for this release.

Related information
ESS information

The ESS 5.3.1 library consists of these information units:


v Elastic Storage Server: Quick Deployment Guide, SC27-9205
v Elastic Storage Server: Problem Determination Guide, SC27-9208
v Elastic Storage Server: Command Reference, SC27-9246
v IBM Spectrum Scale RAID: Administration, SC27-9206
v IBM ESS Expansion: Quick Installation Guide (Model 084), SC27-4627
v IBM ESS Expansion: Installation and User Guide (Model 084), SC27-4628
v IBM ESS Expansion: Hot Swap Side Card - Quick Installation Guide (Model 084), GC27-9210
v Installing the Model 024, ESLL, or ESLS storage enclosure, GI11-9921
v Removing and replacing parts in the 5147-024, ESLL, and ESLS storage enclosure
v Disk drives or solid-state drives for the 5147-024, ESLL, or ESLS storage enclosure
v For information about the DCS3700 storage enclosure, see:
– System Storage® DCS3700 Quick Start Guide, GA32-0960-04
– IBM® System Storage DCS3700 Storage Subsystem and DCS3700 Storage Subsystem with Performance
Module Controllers: Installation, User's, and Maintenance Guide, GA32-0959-07:
http://www.ibm.com/support/docview.wss?uid=ssg1S7004920
v For information about the IBM Power Systems™ EXP24S I/O Drawer (FC 5887), see IBM Knowledge
Center:
http://www.ibm.com/support/knowledgecenter/8247-22L/p8ham/p8ham_5887_kickoff.htm
For more information, see IBM Knowledge Center:

http://www-01.ibm.com/support/knowledgecenter/SSYSP8_5.3.1/sts531_welcome.html

For the latest support information about IBM Spectrum Scale™ RAID, see the IBM Spectrum Scale RAID
FAQ in IBM Knowledge Center:

http://www.ibm.com/support/knowledgecenter/SSYSP8/gnrfaq.html

Switch information
ESS release updates are independent of switch updates. Therefore, it is recommended that Ethernet and
InfiniBand switches used with the ESS cluster be at their latest switch firmware levels. Customers are
responsible for upgrading their switches to the latest switch firmware. If switches were purchased
through IBM, review the minimum switch firmware levels used in the validation of this ESS release, which
are available in the Customer networking considerations section of Elastic Storage Server: Quick Deployment Guide.



Other related information

For information about:


v IBM Spectrum Scale, see IBM Knowledge Center:
http://www.ibm.com/support/knowledgecenter/STXKQY/ibmspectrumscale_welcome.html
v IBM Spectrum Scale call home, see Understanding call home.
v IBM POWER8® servers, see IBM Knowledge Center:
http://www.ibm.com/support/knowledgecenter/POWER8/p8hdx/POWER8welcome.htm
v Extreme Cluster/Cloud Administration Toolkit (xCAT), go to the xCAT website:
http://xcat.org/
v Mellanox OFED Release Notes:
– 4.3: https://www.mellanox.com/related-docs/prod_software/Mellanox_OFED_Linux_Release_Notes_4_3-1_0_1_0.pdf
– 4.1: https://www.mellanox.com/related-docs/prod_software/Mellanox_OFED_Linux_Release_Notes_4_1-1_0_2_0.pdf

Conventions used in this information


Table 1 describes the typographic conventions used in this information. UNIX file name conventions are
used throughout this information.
Table 1. Conventions

bold
    Bold words or characters represent system elements that you must use literally, such as
    commands, flags, values, and selected menu options. Depending on the context, bold typeface
    sometimes represents path names, directories, or file names.

bold underlined
    Bold underlined keywords are defaults. These take effect if you do not specify a different
    keyword.

constant width
    Examples and information that the system displays appear in constant-width typeface.
    Depending on the context, constant-width typeface sometimes represents path names,
    directories, or file names.

italic
    Italic words or characters represent variable values that you must supply. Italics are also used
    for information unit titles, for the first use of a glossary term, and for general emphasis in text.

<key>
    Angle brackets (less-than and greater-than) enclose the name of a key on the keyboard. For
    example, <Enter> refers to the key on your terminal or workstation that is labeled with the
    word Enter.

\
    In command examples, a backslash indicates that the command or coding example continues
    on the next line. For example:
    mkcondition -r IBM.FileSystem -e "PercentTotUsed > 90" \
    -E "PercentTotUsed < 85" -m p "FileSystem space used"

{item}
    Braces enclose a list from which you must choose an item in format and syntax descriptions.

[item]
    Brackets enclose optional items in format and syntax descriptions.

<Ctrl-x>
    The notation <Ctrl-x> indicates a control character sequence. For example, <Ctrl-c> means
    that you hold down the control key while pressing <c>.

item...
    Ellipses indicate that you can repeat the preceding item one or more times.

|
    In synopsis statements, vertical lines separate a list of choices. In other words, a vertical line
    means Or. In the left margin of the document, vertical lines indicate technical changes to the
    information.

How to submit your comments


Your feedback is important in helping us to produce accurate, high-quality information. You can add
comments about this information in IBM Knowledge Center:

http://www.ibm.com/support/knowledgecenter/SSYSP8/sts_welcome.html

To contact the IBM Spectrum Scale development organization, send your comments to the following
email address:

[email protected]



Chapter 1. Drive call home in 5146 and 5148 systems
ESS version 5.x can generate call home events when a physical drive needs to be replaced in an attached
enclosure.

ESS version 5.x automatically opens an IBM Service Request with service data, such as the location and
FRU number, to carry out the service task. The drive call home feature is only supported for drives
installed in 5887, DCS3700 (1818), 5147-024, and 5147-084 enclosures in the 5146 and 5148 systems.

Background and overview


ESS 4.5 introduced ESS Management Server and I/O Server HW call home capability in ESS 5146
systems, where hardware events are monitored by the HMC managing these servers.

When a serviceable event occurs on one of the monitored servers, the Hardware Management Console
(HMC) generates a call home event. ESS 5.X provides additional Call Home capabilities for the drives in
the attached enclosures of ESS 5146 and ESS 5148 systems.

Figure 1. ESS Call Home block diagram

In ESS 5146 the HMC obtains the health status from the Flexible Service Process (FSP) of each server.
When there is a serviceable event detected by the FSP, it is sent to the HMC, which initiates a call home
event if needed. This function is not available in ESS 5148 systems.



The IBM Spectrum Scale RAID pdisk is an abstraction of a physical disk. A pdisk corresponds to exactly
one physical disk, and belongs to exactly one de-clustered array within exactly one recovery group.

The attributes of a pdisk include the following:


v The state of the pdisk
v The disk's unique worldwide name (WWN)
v The disk's field replaceable unit (FRU) code
v The disk's physical location code

When the pdisk state is ok, the pdisk is healthy and functioning normally. When the pdisk is in a
diagnosing state, the IBM Spectrum Scale RAID disk hospital is performing a diagnosis task after an
error has occurred.

The disk hospital is a key feature of the IBM Spectrum Scale RAID that asynchronously diagnoses errors
and faults in the storage subsystem. When the pdisk is in a missing state, it indicates that the IBM
Spectrum Scale RAID is unable to communicate with a disk. If a missing disk becomes reconnected and
functions properly, its state changes back to ok. For a complete list of pdisk states and further information
on pdisk configuration and administration, see IBM Spectrum Scale RAID Administration .

Any pdisk that is in the dead, missing, failing or slow state is known as a non-functioning pdisk. When
the disk hospital concludes that a disk is no longer operating effectively and the number of
non-functioning pdisks reaches or exceeds the replacement threshold of their de-clustered array, the disk
hospital adds the replace flag to the pdisk state. The replace flag indicates that the physical disk
corresponding to the pdisk must be replaced as soon as possible. When the pdisk state becomes
replace, the drive replacement callback script is run.
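
Pdisks that have reached the replace state can be listed from a recovery group server. The following is a minimal sketch, assuming the --replace filter of the mmlspdisk command described in IBM Spectrum Scale RAID: Administration:

# List all pdisks currently marked for replacement
mmlspdisk all --replace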

The callback script communicates with the Electronic Service Agent™ (ESA) over a REST API. The ESA is
installed in the ESS Management Server (EMS), and initiates a call home task. The ESA is responsible for
automatically opening a Service Request (PMR) with IBM support, and managing the end-to-end life cycle of
the problem.

Installing the IBM Electronic Service Agent


IBM Electronic Service Agent (ESA) for PowerLinux™ version 4.1 and later can monitor the ESS systems.
It is installed in the ESS Management Server (EMS) during the installation of ESS version 5.X, or when
upgrading to ESS 5.X.

The IBM Electronic Service Agent is installed when the gssinstall command is run. The gssinstall
command can be used in one of the following ways depending on the system:
v For 5146 system:
gssinstall_ppc64 -u
v For 5148 system:
gssinstall_ppc64le -u
The rpm files for the esagent are found in the /install/gss/otherpkgs/rhels7/<arch>/gss directory.

Issue the following command to verify that the rpm for the esagent is installed:
rpm -qa | grep esagent

This gives an output similar to the following:


esagent.pLinux-4.2.0-9.noarch



Login and activation
After the ESA is installed, the ESA portal can be reached by going to the following link:
https://<EMS or ip>:5024/esa

For example:
https://192.168.45.20:5024/esa

The ESA uses port 5024 by default. It can be changed by using the ESA CLI if needed. For more
information on ESA, see IBM Electronic Service Agent. On the Welcome page, log in to the IBM Electronic
Service Agent GUI. If an untrusted site certificate warning is received, accept the certificate or click Yes to
proceed to the IBM Electronic Service Agent GUI. You can get the context sensitive help by selecting the
Help option located in the upper right corner.

After you have logged in, go to Main > Activate ESA to run the activation wizard. The activation
wizard requires valid contact, location, and connectivity information.

Figure 2. ESA portal after login

The All Systems menu option shows the node where ESA is installed. For example, ems1. The node
where ESA is installed is shown as PrimarySystem in the System Info. The ESA Status is shown as
Online only on the PrimarySystem node in the System Info tab.

Note: The ESA is not activated by default. If it is not activated, you get a message similar to the
following:
[root@ems1 tmp]# gsscallhomeconf -E ems1 --show
IBM Electronic Service Agent (ESA) is not activated.
Activated ESA using /opt/ibm/esa/bin/activator -C and retry.
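
In that case, the ESA can be activated and the configuration check can be retried. The following is a minimal sketch based on the command path shown in the message above:

/opt/ibm/esa/bin/activator -C
gsscallhomeconf -E ems1 --show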



Electronic Service Agent configuration
Entities or systems that can generate events are called endpoints. The EMS, I/O Servers, and attached
enclosures can be endpoints in ESS. Only enclosure endpoints can generate events, and the only event
generated for call home is the disk replacement event. In the ESS 5146 systems, HMC can generate call
home for certain node-related events.

In ESS, the ESA is only installed on the EMS, and automatically discovers the EMS as PrimarySystem.
The EMS and I/O Servers have to be registered to ESA as endpoints. The gsscallhomeconf command is
used to perform the registration task. The command also registers enclosures attached to the I/O servers
by default.

| The software call home is registered based on the customer information given while configuring the ESA
| agent. A software call home group auto is configured by default and the EMS node acts as the software
| call home server. The weekly and daily software call home data collection configuration is also activated
| by default.

| The software call home uses the ESA network connection settings to upload the data to IBM. The ESA
| agent network setup must be complete and working for the software call home to work.

| Note: You cannot configure the software call home without configuring the ESA. For more information,
| see Chapter 2, “Software call home,” on page 15.
usage: gsscallhomeconf [-h] ([-N NODE-LIST | -G NODE-GROUP] [--show] [--prefix PREFIX] [--suffix SUFFIX]
| -E ESA-AGENT [--register {node,all}] [--no-swcallhome] [--crvpd]
[--serial SOLN-SERIAL] [--model SOLN-MODEL] [--verbose]

optional arguments:
-h, --help Show this help message and exit
-N NODE-LIST Provide a list of nodes to configure.
-G NODE-GROUP Provide name of node group.
--show Show callhome configuration details.
--prefix PREFIX Provide hostname prefix. Use = between --prefix and value if the value starts with -.
--suffix SUFFIX Provide hostname suffix. Use = between --suffix and value if the value starts with -.
-E ESA-AGENT Provide nodename for esa agent node
--register {node,all} Register endpoints(nodes, enclosure or all) with ESA.
| --no-swcallhome Do not configure software callhome while configuring hardware callhome
--crvpd Create vpd file.
--serial SOLN-SERIAL Provide ESS solution serial number.
--model SOLN-MODEL Provide ESS model.
--verbose Provide verbose output

A sample output is shown:


[root@ems1 ~]# gsscallhomeconf -E ems1 -N ems1,gss_ppc64 --suffix=-ib
2017-02-07T21:46:27.952187 Generating node list...
2017-02-07T21:46:29.108213 nodelist: ems1 essio11 essio12
2017-02-07T21:46:29.108243 suffix used for endpoint hostname: -ib
End point ems1-ib registered successfully with systemid 802cd01aa0d3fc5137f006b7c9d95c26
End point essio11-ib registered successfully with systemid c7dba51e109c92857dda7540c94830d3
End point essio12-ib registered successfully with systemid 898fb33e04f5ea12f2f5c7ec0f8516d4
End point enclosure G5CT018 registered successfully with systemid
c14e80c240d92d51b8daae1d41e90f57
End point enclosure G5CT016 registered successfully with systemid
524e48d68ad875ffbeeec5f3c07e1acf
ESA configuration for ESS Callhome is complete.
| Started configuring software callhome
| Checking for ESA is activated or not before continuing.
| Fetching customer detail from ESA.
| Customer detail has been successfully fetched from ESA.
| Setting software callhome customer detail.
| Successfully set the customer detail for software callhome.
| Enabled daily schedule for software callhome.
| Enabled weekly schedule for software callhome.
| Direct connection will be used for software calhome.
| Successfully set the direct connection settings for software callhome.



| Enabled software callhome capability.
| Creating callhome automatic group
| Created auto group for software call home and enabled it.
| Software callhome configuration completed.

The gsscallhomeconf command logs progress and error messages in the /var/log/messages file. The
--verbose option provides more details of the progress, as well as error messages. The following
example displays the type of information sent to the /var/log/messages file on the EMS by the
gsscallhomeconf command.
[root@ems1 vpd]# grep ems1 /var/log/messages | grep gsscallhomeconf

Feb 8 01:37:39 ems1 gsscallhomeconf: [I] End point ems1-ib registered successfully with
systemid 802cd01aa0d3fc5137f006b7c9d95c26
Feb 8 01:37:40 ems1 gsscallhomeconf: [I] End point essio11-ib registered successfully
with systemid c7dba51e109c92857dda7540c94830d3
Feb 8 01:37:41 ems1 gsscallhomeconf: [I] End point essio12-ib registered successfully
with systemid 898fb33e04f5ea12f2f5c7ec0f8516d4
Feb 8 01:43:04 ems1 gsscallhomeconf: [I] ESA configuration for ESS Callhome is complete.

The endpoints are visible in the ESA portal after registration, as shown in the following figure:

Figure 3. ESA portal after node registration

Name
Shows the name of the endpoints that are discovered or registered.
SystemHealth
Shows the health of the discovered endpoints. A green check mark icon indicates that the
discovered system is working fine. A red (X) icon indicates that the discovered endpoint has a
problem.
ESAStatus
Shows that the endpoint is reachable. It is updated whenever there is a communication between
the ESA and the endpoint.
SystemType
Shows the type of system being used. Following are the various ESS device types that the ESA
supports.



Figure 4. List of icons showing various ESS device types

Detailed information about the node can be obtained by selecting System Information. Here is an example
of the system information:

Figure 5. System information details

When an endpoint is successfully registered, the ESA assigns a unique system identification (system id) to
the endpoint. The system id can be viewed using the --show option.
For example:



[root@ems1 vpd]# gsscallhomeconf -E ems1 --show
System id and system name from ESA agent

{
"c14e80c240d92d51b8daae1d41e90f57": "G5CT018",
"c7dba51e109c92857dda7540c94830d3": "essio11-ib",
"898fb33e04f5ea12f2f5c7ec0f8516d4": "essio12-ib",
"802cd01aa0d3fc5137f006b7c9d95c26": "ems1-ib",
"524e48d68ad875ffbeeec5f3c07e1acf": "G5CT016"
}

When an event is generated by an endpoint, the node associated with the endpoint must provide the
system id of the endpoint as part of the event. The ESA then assigns a unique event id for the event. The
system ids of the endpoints are stored in a file called esaepinfo01.json in the /vpd directory of the EMS
and I/O servers that are registered. The following example displays a typical esaepinfo01.json file:
[root@ems1 vpd]# cat esaepinfo01.json
{
"encl": {
"G5CT016": "524e48d68ad875ffbeeec5f3c07e1acf",
"G5CT018": "c14e80c240d92d51b8daae1d41e90f57"
},
"esaagent": "ems1", "node": {
"ems1-ib": "802cd01aa0d3fc5137f006b7c9d95c26",
"essio11-ib": "c7dba51e109c92857dda7540c94830d3",
"essio12-ib": "898fb33e04f5ea12f2f5c7ec0f8516d4"
}
}

In the ESS 5146, the gsscallhomeconf command requires the ESS solution vpd file that contains the IBM
Machine Type and Model (MTM) and serial number information to be present. The vpd file is used by
the ESA in the call home event. If the vpd file is absent, the gsscallhomeconf command fails, and
displays an error message that the vpd file is missing. In this case, you can rerun the command with the
--crvpd option, and provide the serial number and model number using the --serial and --model
options. In ESS 5148, the vpd file is auto generated if not present.
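
For example, a hypothetical invocation that creates the vpd file while registering the endpoints; the serial and model values shown here are taken from the sample vpd file below:

gsscallhomeconf -E ems1 -N ems1,gss_ppc64 --suffix=-ib --crvpd --serial 219G17G --model GS2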

The system vpd information is stored in the essvpd01.json file in the EMS /vpd directory. Here is an
example of a vpd file:
[root@ems1 vpd]# cat essvpd01.json
{
"groupname": "ESSHMC", "model": "GS2",
"serial": "219G17G", "system": "ESS", "type": "5146"
}

Creating problem report


After the ESA is activated, and the endpoints for the nodes and enclosures are registered, they can send
an event request to the ESA to initiate a call home.

For example, when replace is added to a pdisk state, indicating that the corresponding physical drive
needs to be replaced, an event request is sent to the ESA with the associated system id of the enclosure
where the physical drive resides. Once the ESA receives the request it generates a call home event. Each
server in the ESS is configured to enable callback for IBM Spectrum Scale RAID related events. These
callbacks are configured during the cluster creation, and updated during the code upgrade. The ESA can
filter out duplicate events when event requests are generated from different nodes for the same physical
drive. The ESA returns an event identification value when the event is successfully processed. The ESA
portal updates the status of the endpoints. The following figure shows the status of the enclosures when



the enclosure contains one or more physical drives identified for replacement:

Figure 6. ESA portal showing enclosures with drive replacement events

The problem descriptions of the events can be seen by selecting the endpoint. You can select an endpoint
by clicking the red X. The following figure shows an example of the problem description.

Figure 7. Problem Description

Name
It is the serial number of the enclosure containing the drive to be replaced.
Description
It is a short description of the problem. It shows ESS version or generation, service task name and
location code. This field is used in the synopsis of the problem (PMR) report.
SRC
It is the Service Reference Code (SRC). An SRC identifies the system component area (for
example, DSK XXXXX) that detected the error, and additional codes describing the error
condition. It is used by the support team to perform further problem analysis, and determine
service tasks associated with the error code and event.
Time of Occurrence
It is the time when the event is reported to the ESA. The time is reported by the endpoints in the
UTC time format, which ESA displays in local format.



Service request
It identifies the problem number (PMR number).
Service Request Status
It indicates the reporting status of the problem. The status can be one of the following:
Open
No action is taken on the problem.
Pending
The system is in the process of reporting to the IBM support.
Failed
All attempts to report the problem information to the IBM support has failed. The ESA
automatically retries several times to report the problem. The number of retries can be
configured. Once failed, no further attempts are made.
Reported
The problem is successfully reported to the IBM support.
Closed
The problem is processed and closed.
Local Problem ID
It is the unique identification or event id that identifies a problem.

Problem details
Further details of a problem can be obtained by clicking the Details button. The following figure shows
an example of a problem detail.



Figure 8. Example of a problem summary

If an event is successfully reported to the ESA, and an event ID is received from the ESA, the node
reporting the event uploads additional support data to the ESA; this data is attached to the problem (PMR)
for further analysis by the IBM support team.

Figure 9. Call home event flow

The callback script logs information in the /var/log/messages file during the problem reporting episode.
The following examples display the messages logged in the /var/log/messages file generated by the
essio11 node:



v Callback script is invoked when the drive state changes to replace. The callback script sends an event
to the ESA:
Feb 8 01:57:24 essio11 gsscallhomeevent: [I] Event successfully sent
for end point G5CT016, system.id 524e48d68ad875ffbeeec5f3c07e1acf,
location G5CT016-6, fru 00LY195.
v The ESA responds by returning a unique event ID for the system ID in the json format.
Feb 8 01:57:24 essio11 gsscallhomeevent:
{#012 "status-details": "Received and ESA is processing",
#012 "event.id": "f19b46ee78c34ef6af5e0c26578c09a9",
#012 "system.id": "524e48d68ad875ffbeeec5f3c07e1acf",
#012 "last-activity": "Received and ESA is processing"
#012}

Note: Here #012 represents the new line feed \n.


v The callback script runs the ionodechdatacol.sh script to collect the support data. It collects the
mmfs.log.latest file and the last 24 hours of the kernel messages from the journal into a .tgz file.
Feb 8 01:58:15 essio11 gsscallhomeevent: [I] Callhome data collector
/opt/ibm/gss/tools/samples/ionodechdatacol.sh finished

Feb 8 01:58:15 essio11 gsscallhomeevent: [I] Data upload successful


for end point 524e48d68ad875ffbeeec5f3c07e1acf
and event.id f19b46ee78c34ef6af5e0c26578c09a9

Call home monitoring


A callback is a one-time event that is triggered when the disk state changes to replace. If the
ESA misses the event, for example if the EMS is down for maintenance, the call home event is not
generated by the ESA.

To mitigate this situation, the callhomemon.sh script is provided in the /opt/ibm/gss/tools/samples
directory of the EMS. This script checks for pdisks that are in the replace state, and sends an event to the
ESA to generate a call home event if there is no open PMR for the corresponding physical drive. This
script can be run at a periodic interval, for example, every 30 minutes.

In the EMS, create a cronjob as follows:


1. Open the crontab editor using the following command:
crontab -e
2. Set up a periodic cronjob by adding the following line:
*/30 * * * * /opt/ibm/gss/tools/samples/callhomemon.sh
3. View the cronjob using the following command:
crontab -l
[root@ems1 deploy]# crontab -l
*/30 * * * * /opt/ibm/gss/tools/samples/callhomemon.sh

The call home monitoring protects against missing a call home due to the ESA missing a callback event.
If a problem report is not already created, the call home monitoring ensures that a problem report is
created.

Note: When the call home problem report is generated by the monitoring script, as opposed to being
triggered by the callback, the problem support data is not automatically uploaded. In this scenario, the
IBM support can request support data from the customer.

Upload data
The following support data is uploaded when the system displays a drive replace notification:
v The output of mmlspdisk command for the pdisk that is in replace state.
v Additional support data is provided only when the event is initiated as a response to a callback. The
following information is supplied in a .tgz file as additional support data:



– mmfs.log.latest from the node which generates the event.
– Last 24 hours of the kernel messages (from journal) from the node which generates the event.

Note: If a PMR is created because of the periodic checking of the replaced drive state, for example, when
the callback event is missed, additional support data is not provided.
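
As a reference, the mmlspdisk output for a single pdisk can also be produced manually. The following is a hedged example, using the recovery group and pdisk names from the callback script test later in this chapter:

mmlspdisk rg_essio11-ib --pdisk e1s02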

Uninstalling and reinstalling the IBM Electronic Service Agent


The ESA is not removed when the gssdeploy -c command is run to clean up the system.

The ESA rpm files must be removed manually if needed. Issue the following command to remove the
rpm files for the esagent:
yum remove esagent.pLinux-4.2.0-9.noarch

You can issue the following command to reinstall the rpm files for the esagent. The esagent requires the
gpfs.java package to be installed. The gpfs.java package is automatically installed by the gssinstall and
gssdeploy scripts. The dependencies might still not be resolved. In that case, use the --nodeps option to
install it:
rpm -ivh --nodeps esagent.pLinux-4.1.0-12.noarch.rpm

Test call home


The configuration and setup for call home must be tested to ensure that the disk replace event can trigger
a call home.

The test is composed of three steps:


v ESA connectivity to IBM - Check connectivity from ESA to IBM network. This might not be required if
done during the activation.
/opt/ibm/esa/bin/verifyConnectivity -t
v ESA test Call Home - Test call home from the ESA portal. From the All Systems tab, check the system
health of the endpoint; it shows the button for generating a Test Problem.
v ESS call home script setup to ensure that the callback script is set up correctly.

Verify that the periodic monitoring is set up:

[root@ems1 deploy]# crontab -l
*/30 * * * * /opt/ibm/gss/tools/samples/callhomemon.sh



Figure 10. Sending a Test Problem

Callback Script Test


Verify that the system is healthy by issuing the gnrhealthcheck command. You must also verify that the
active recovery group (RG) server is the primary recovery group server for all recovery groups. For more
recovery group details, see the IBM Spectrum Scale RAID: Administration guide.
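
For example, a minimal pre-check sketch, using the recovery group names from this example setup (on some installations, gnrhealthcheck might be located under /usr/lpp/mmfs/samples/vdisk):

gnrhealthcheck
mmlsrecoverygroup rg_essio11-ib -L | grep "active recovery" -A2
mmlsrecoverygroup rg_essio12-ib -L | grep "active recovery" -A2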

To test the callback script, select a pdisk from each enclosure alternating recovery groups. The purpose of
the test call home events is to ensure that all the attached enclosures can generate call home events by
using both the I/O servers in the building block.

For example, in a GS2 system with 5887 enclosures, one can select pdisks e1s02 (left RG) and e2s20 (right
RG). You must find the corresponding recovery group and active server for these pdisks. Send a disk
event to the ESA from the active recovery group server as shown in the following steps:

Examples:
1. ssh to essio11
gsscallhomeevent --event pdReplacePdisk
--eventName "Test symptom generated by Electronic Service Agent"
--rgName rg_essio11-ib --pdName e1s02

Here the recovery group is rg_essio11-ib, and the active server is essio11-ib.
2. ssh to essio12
gsscallhomeevent --event pdReplacePdisk
--eventName "Test symptom generated by Electronic Service Agent"
--rgName rg_essio12-ib --pdName e2s20

Here the recovery group is rg_essio12-ib, and the active server is essio12-ib.

Note: Ensure that you state Test symptom generated by Electronic Service Agent in the --eventName
option. Check in the ESA that the enclosure system health is showing the event. You might have to
refresh the screen to make the event visible.

Select the event to see the details.



Figure 11. List of events

For DCS3700 enclosures, the pdisks used to test call home can be e1d1s1 and e2d5s10 (e3d1s1,
e4d5s10, and so on), alternating recovery groups. For 5148-084 enclosures, the pdisks used to test call
home can be e1d1s1 (or e1d1s1ssd) and e2d2s14 (e3d1s1, e4d2s14, and so on), alternating recovery groups.

Post setup activities


v Delete any test problems.
v If the system has a 4U enclosure (DCS3700) in the configuration, obtain the actual matching seven digit
serial number, and keep it available if needed. The IBM support will need this serial number for
handling the problem properly.




| Chapter 2. Software call home


| The software call home feature collects files, logs, traces, and details of certain system health events from
| different nodes and services in an IBM Spectrum Scale cluster.

| These details are shared with the IBM® support center for monitoring and problem determination. For
| more information on call home, see Installing call home and Understanding call home.

| Configuring hardware and software call home


| You can configure call home (hardware and software) using the gsscallhomeconf command. You can use
| the --no-swcallhome option to set up just hardware call home and skip the software call home setup.

| Hardware call home and software call home can be set up using the following command:
| [root@ems1 ~]# gsscallhomeconf -E ems1 -N ems1,gss_ppc64 --suffix=-ib

| The command gives an output similar to the following:


| 2017-02-07T21:46:27.952187 Generating node list...
| 2017-02-07T21:46:29.108213 nodelist: ems1 essio11 essio12
| 2017-02-07T21:46:29.108243 suffix used for endpoint hostname: -ib
| End point ems1-ib registered successfully with systemid 802cd01aa0d3fc5137f006b7c9d95c26
| End point essio11-ib registered successfully with systemid c7dba51e109c92857dda7540c94830d3
| End point essio12-ib registered successfully with systemid 898fb33e04f5ea12f2f5c7ec0f8516d4
| End point enclosure G5CT018 registered successfully with systemid
| c14e80c240d92d51b8daae1d41e90f57
| End point enclosure G5CT016 registered successfully with systemid
| 524e48d68ad875ffbeeec5f3c07e1acf
| ESA configuration for ESS Callhome is complete.
| Started configuring software callhome
| Checking for ESA is activated or not before continuing.
| Fetching customer detail from ESA.
| Customer detail has been successfully fetched from ESA.
| Setting software callhome customer detail.
| Successfully set the customer detail for software callhome.
| Enabled daily schedule for software callhome.
| Enabled weekly schedule for software callhome.
| Direct connection will be used for software calhome.
| Successfully set the direct connection settings for software callhome.
| Enabled software callhome capability.
| Creating callhome automatic group
| Created auto group for software call home and enabled it.
| Software callhome configuration completed.

| If you want to skip the software call home set up, use the following command:
| [root@ems3 ~]# gsscallhomeconf -E ems3 -N ems3,gss_ppc64 --suffix=-te --register=all --no-swcallhome

| The command gives an output similar to the following:


| 2017-01-23T05:34:42.005215 Generating node list...
| 2017-01-23T05:34:42.827295 nodelist: ems3 essio31 essio32
| 2017-01-23T05:34:42.827347 suffix used for endpoint hostname: -te
| End point ems3-te registered sucessfully with systemid 37e5c23f98090750226f400722645655
| End point essio31-te registered sucessfully with systemid 35ae41e0388e08fd01378ae5c9a6ffef
| End point essio32-te registered sucessfully with systemid 9ea632b549434d57baef7c999dbf9479
| End point enclosure SV50321280 registered sucessfully with systemid 600755dc0aa2014526fe5945981b0e08
| End point enclosure SV50918672 registered sucessfully with systemid 92aa6428102b44a4a1c9a293402b324c
| ESA configuration for ESS Callhome is complete.



| Important: If the software call home setup was skipped, it can be configured later. However, the user
| needs to reconfigure both hardware call home and software call home.
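
Once configured, the state of software call home can be verified on the EMS with the IBM Spectrum Scale mmcallhome command. The following is a hedged sketch; refer to the IBM Spectrum Scale documentation for the exact syntax:

mmcallhome capability list
mmcallhome group list
mmcallhome schedule list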




Chapter 3. Re-creating the NVR partitions


The Non-Volatile Random-Access Memory (NVRAM) physically resides within the IPR RAID adapter that
is installed on the EMS and each of the I/O nodes. The NVR partitions are created on the local sda drive
that is installed on the ESS I/O nodes to hold data for the log tip pdisks.

Although a total of 6 partitions are created, only 2 are actually used per I/O node, one for each NVR
pdisk. In some cases, the NVRAM partitions might need to be re-created, for example, after a
hardware or OS failure.

Before re-creating the NVR partitions, list all the existing partitions for sda. To list all partitions for sda,
run the following command:
parted /dev/sda unit KiB print

This command gives output similar to the following:


Model: IBM IPR-10 749FFB00 (scsi)
Disk /dev/sda: 557727744kiB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags:

Number Start End Size Type File system Flags


1 1024kiB 9216kiB 8192kiB primary boot, prep
2 9216kiB 521216kiB 512000kiB primary xfs
3 521216kiB 176649216kiB 176128000kiB primary xfs
4 176649216kiB 557727744kiB 381078528kiB extended
5 176651264kiB 279051264kiB 102400000kiB logical xfs
6 279052288kiB 381452288kiB 102400000kiB logical xfs
7 381453312kiB 483853312kiB 102400000kiB logical xfs
8 483854336kiB 535054336kiB 51200000kiB logical xfs
9 535055360kiB 543247360kiB 8192000kiB logical linux-swap(v1)

For optimal alignment, each partition must be exactly 2048000 KiB in size, and must be 1024 KiB apart
from each other.
In the sample output, the last end size pertains to Partition # 9, and has a value of 543247360 KiB.
To get the NVR partition's new start value, add 1024 KiB to the last end size value, and add 2048000 KiB
to the start value to determine the new end as shown:
1. NVR Partition 1 new start value = Last end size value + 1024 KiB = 543247360 KiB + 1024 KiB =
543248384 KiB
2. NVR Partition 1 new end = NVR Partition 1 new start value + 2048000 KiB = 543248384 KiB +
2048000 KiB = 545296384 KiB
To create the first NVR partition, run the following command:
parted /dev/sda mkpart logical 543248384KiB 545296384KiB

To get the new start for the second partition, you need to add 1024 KiB to the end size value of partition
1. Repeat the steps to calculate the start and end positions for the second partition as shown:
1. NVR Partition 2 new start = NVR Partition 1 end value + 1024 KiB = 545296384 KiB + 1024 KiB =
545297408 KiB
2. NVR Partition 2 new end = NVR Partition 2 new start value + 2048000 KiB = 545297408 KiB +
2048000 KiB = 547345408 KiB
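With these values, the second NVR partition can be created in the same way:

parted /dev/sda mkpart logical 545297408KiB 547345408KiB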

Repeat the above steps four times to create a total of six partitions. When complete, the partitions list for
sda will look similar to the following:



[root@ems1 ~]# parted /dev/sda unit KiB print
Model: IBM IPR-10 749FFB00 (scsi)
Disk /dev/sda: 557727744kiB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags:
Number Start End Size Type File system Flags
1 1024kiB 9216kiB 8192kiB primary boot, prep
2 9216kiB 521216kiB 512000kiB primary xfs
3 521216kiB 176649216kiB 176128000kiB primary xfs
4 176649216kiB 557727744kiB 381078528kiB extended
5 176651264kiB 279051264kiB 102400000kiB logical xfs
6 279052288kiB 381452288kiB 102400000kiB logical xfs
7 381453312kiB 483853312kiB 102400000kiB logical xfs
8 483854336kiB 535054336kiB 51200000kiB logical xfs
9 535055360kiB 543247360kiB 8192000kiB logical linux-swap(v1)
10 543248384kiB 545296384kiB 2048001kiB logical xfs
11 545297408kiB 547345408kiB 2048001kiB logical xfs
12 547346432kiB 549394432kiB 2048001kiB logical xfs
13 549395456kiB 551443456kiB 2048001kiB logical xfs
14 551444480kiB 553492480kiB 2048001kiB logical xfs
15 553493504kiB 555541504kiB 2048001kiB logical xfs



Chapter 4. Re-creating NVRAM pdisks
NVRAM pdisks are used to store the log tip data, which is eventually migrated to the log home vdisk.
Although ESS can continue to function without NVRAM pdisks, performance is impacted by their
absence. Therefore, it is important to ensure that the NVRAM pdisks are functioning at all times.

The NVRAM pdisks may stop functioning and go into a missing state. This could be due to a hardware
failure of the IPR card, or a corrupt or missing NVR OS partition caused by an OS failure. To fix this
problem, the NVRAM pdisks must be re-created.

You can find the pdisks that are in a missing state by running the mmlsrecoverygroup command.
mmlsrecoverygroup rg_gssio1 -L --pdisk | grep NVR
NVR no 1 2 0,0 1 3632 MiB 14 days inactive 0% low
n1s01 0, 0 NVR 1816 MiB missing
n2s01 0, 0 NVR 1816 MiB missing

mmlsrecoverygroup rg_gssio2 -L --pdisk | grep NVR


NVR no 1 2 0,0 1 3632 MiB 14 days inactive 0% low
n1s02 0, 0 NVR 1816 MiB missing
n2s02 0, 0 NVR 1816 MiB missing

Before re-creating the pdisks, ensure that all six NVRAM partitions exist on sda by using the following
command:
parted /dev/sda unit KiB print

Model: IBM IPR-10 749FFB00 (scsi)
Disk /dev/sda: 557727744kiB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags:
Number Start End Size Type File system Flags
1 1024kiB 9216kiB 8192kiB primary boot, prep
2 9216kiB 521216kiB 512000kiB primary xfs
3 521216kiB 176649216kiB 176128000kiB primary xfs
4 176649216kiB 557727744kiB 381078528kiB extended
5 176651264kiB 279051264kiB 102400000kiB logical xfs
6 279052288kiB 381452288kiB 102400000kiB logical xfs
7 381453312kiB 483853312kiB 102400000kiB logical xfs
8 483854336kiB 535054336kiB 51200000kiB logical xfs
9 535055360kiB 543247360kiB 8192000kiB logical linux-swap(v1)
10 543248384kiB 545296384kiB 2048001kiB logical xfs
11 545297408kiB 547345408kiB 2048001kiB logical xfs
12 547346432kiB 549394432kiB 2048001kiB logical xfs
13 549395456kiB 551443456kiB 2048001kiB logical xfs
14 551444480kiB 553492480kiB 2048001kiB logical xfs
15 553493504kiB 555541504kiB 2048001kiB logical xfs

Note: If the partitions are not present, you must re-create the six NVR partitions. For more
information, see Chapter 3, "Re-creating the NVR partitions," on page 17.

After you have verified the 6 NVR partitions, create a stanza file for each of the NVRAM devices that are
missing, and save it.



gssio1.stanza:
%pdisk: pdiskName=n1s01 device=//gssio1/dev/sda10 da=NVR rotationRate=NVRAM
%pdisk: pdiskName=n2s01 device=//gssio2/dev/sda10 da=NVR rotationRate=NVRAM


Run the mmaddpdisk command using the stanza file that was created to replace the missing pdisks.
mmaddpdisk rg_gssio1 -F gssio1.stanza --replace

The following pdisks will be formatted on the node gssio.ess.com:


v //gssio1/dev/sda10
v //gssio2/dev/sda10

Run the mmlsrecoverygroup command to confirm the current state of the pdisks.
mmlsrecoverygroup rg_gssio1 -L --pdisk | grep NVR
n1s01 1, 1 NVR 1816 MiB ok
n2s01 1, 1 NVR 1816 MiB ok

Run the mmaddpdisk command to recreate the other missing NVRAM pdisks.
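
For example, a hypothetical stanza file for the missing pdisks of rg_gssio2 might look like the following; the sda11 device paths are an assumption, so substitute the NVR partitions reserved for the second NVR pdisk on each node:

gssio2.stanza:
%pdisk: pdiskName=n1s02 device=//gssio1/dev/sda11 da=NVR rotationRate=NVRAM
%pdisk: pdiskName=n2s02 device=//gssio2/dev/sda11 da=NVR rotationRate=NVRAM

mmaddpdisk rg_gssio2 -F gssio2.stanza --replace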



Chapter 5. Steps to restore an I/O node
If an I/O node fails due to a hardware or OS problem, and the OS is no longer accessible, you must
restore the node using the existing configuration settings stored in xCAT, which typically resides on the
EMS node.

This process will restore the OS image as well as the required ESS software, drivers and firmware.

Note: For the following steps, we assume that the gssio1 node is the node that is being restored.
1. Disable the GPFS auto load using the mmchconfig command.

Note: This prevents GPFS from restarting automatically upon reboot.


[ems]# mmlsconfig autoload
autoload yes
[ems]# mmchconfig autoload=no
[ems]# mmlsconfig autoload
autoload no
2. List the recovery groups using the mmlsrecoverygroup command to verify that the replacement node
is not an active recovery group server currently.
[ems1]# mmlsrecoverygroup
recovery group vdisks vdisks servers
------------------ ----------- ------ -------
rg_gssio1 3 18 gssio1,gssio2
rg_gssio2 3 18 gssio2,gssio1
List the current active recovery group server for each recovery group.
[ems1]# mmlsrecoverygroup rg_gssio1 -L | grep "active recovery" -A2
active recovery group server servers
----------------------------------------------- -------
gssio1 gssio1,gssio2

[ems1]# mmlsrecoverygroup rg_gssio2 -L | grep "active recovery" -A2


active recovery group server servers
----------------------------------------------- -------
gssio2 gssio2,gssio1

Note: If you are restoring gssio1, the active recovery group server for rg_gssio1 should be gssio2. If it
is not set to gssio2, you need to run the mmchrecoverygroup command to change it.
[ems1]# mmchrecoverygroup rg_gssio1 --servers <NEW PRIMARY NODE>,<OLD PRIMARY NODE>
[root@gssio1 ~]# mmchrecoverygroup rg_gssio1 --servers gssio2,gssio1
[ems1]# mmlsrecoverygroup rg_gssio1 -L | grep "active recovery" -A2
active recovery group server servers
----------------------------------------------- -------
gssio2 gssio1,gssio2

[ems1]# mmlsrecoverygroup rg_gssio2 -L | grep "active recovery" -A2


active recovery group server servers
----------------------------------------------- -------
gssio2 gssio2,gssio1
3. Create a backup of the replacement node's network file.
[ems]# rm -rf /tmp/replacement_node_network_backup
[ems]# mkdir /tmp/replacement_node_network_backup
[ems]# scp <REPLACEMENT NODE>:/etc/sysconfig/network-scripts/ifcfg-*
/tmp/replacement_node_network_backup/
[ems]# scp gssio2:/etc/sysconfig/network-scripts/ifcfg-*
/tmp/replacement_node_network_backup/



Note: This is an optional step, and can only be taken when the replacement node can be accessed.
4. Check for the RHEL images available for install on the EMS node.
The RHEL image is needed in order to re-image the node that is being restored. The OS image
should be located on the EMS node under the following directory:
[ems]# ls /tftpboot/xcat/osimage/
rhels7.3-ppc64-install-gss
5. Configure the replacement node's boot state to Install for the specified OS image.
[ems]# nodeset <REPLACEMENT NODE> osimage=<OS_ISO_image>
[root@ems1 ~]# nodeset gssio2 osimage=rhels7.3-ppc64-install-gss
gssio2: install rhels7.3-ppc64-gss
6. Ensure that the remote console is properly configured on the EMS node.
[ems]# makeconservercf <REPLACEMENT NODE>
[root@ems1 ~]# makeconservercf gssio2
7. Reboot the replaced node to initiate the installation process.
[ems]# rnetboot <REPLACEMENT NODE> -V
[root@ems1 ~]# rnetboot gssio2 -V
lpar_netboot Status: List only ent adapters
lpar_netboot Status: -v (verbose debug) flag detected
lpar_netboot Status: -i (force immediate shutdown) flag detected
lpar_netboot Status: -d (debug) flag detected
node:gssio2
Node is gssio2
...
# Network boot proceeding - matched BOOTP, exiting.
# Finished.
sending commands ~. to expect
gssio2: Success
Monitor the progress of the installation, and wait for the xcatpost, yum, and related scripts to finish.
[ems]# watch "nodestat <REPLACEMENT NODE>; echo; tail /var/log/consoles/<REPLACEMENT NODE>"
[root@ems1 ~]# watch "nodestat gssio2; echo; tail /var/log/consoles/gssio2"
gssio2: noping
...
gssio2: install rhels7.3-ppc64-gss
...
gssio2: sshd
[ems]# watch -n .5 "ssh <REPLACEMENT NODE> ’ps -eaf | grep -v grep’ |
egrep ’xcatpost|yum|rpm|vpd’"
[root@ems1 ~]# watch -n .5 "ssh gssio2 ’ps -eaf | grep -v grep’ |
egrep ’xcatpost|yum|rpm|vpd’"

Note: Depending on what needs to be updated, the node might reboot one or more times. You need
to wait until there is no process output before taking the next step.
8. Verify that the upgrade files have been copied to the I/O node sync directory, /install/gss/sync/ppc64/.
[ems]# ssh <REPLACEMENT NODE> "ls /install/gss/sync/ppc64/"
[root@ems1]# ssh gssio2 "ls /install/gss/sync/ppc64/"
gssio2: mofed
Wait for the directory to sync. After the mofed directory is created, you can take the next step.
9. Copy the host files from the healthy node to the replacement node.
[ems]# scp /etc/hosts <REPLACEMENT NODE>:/etc/
[root@ems1 mofed]# scp /etc/hosts gssio2:/etc/
10. Configure the network on the replacement node.
If you had backed up the network files previously, you can copy them over to the node, and restart
the node. Verify that the names of the devices are consistent with the backed up version before
replacing the files.



You can also apply the Red Hat updates not included in the xCAT image, if necessary.
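If the backup taken in step 3 is being restored, the following is a minimal sketch; it assumes that the interface (ifcfg) names on the re-imaged node match the backed-up files, and the node can simply be rebooted instead of restarting the network service:

[ems]# scp /tmp/replacement_node_network_backup/ifcfg-* <REPLACEMENT NODE>:/etc/sysconfig/network-scripts/
[ems]# ssh <REPLACEMENT NODE> "systemctl restart network"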
11. Rebuild the GPFS kernel extensions on the replacement node.
If the kernel patches were applied, it may be necessary to rebuild the GPFS portability layer by
running the mmbuildgpl command.
[ems]# ssh <REPLACEMENT NODE> "/usr/lpp/mmfs/bin/mmbuildgpl"
[root@ems1 ~]# ssh gssio2 "/usr/lpp/mmfs/bin/mmbuildgpl"
--------------------------------------------------------
mmbuildgpl: Building GPL module begins at Wed Nov 8 17:18:21 EST 2017.
--------------------------------------------------------
Verifying Kernel Header...
kernel version = 31000514 (3.10.0-514.28.1.el7.ppc64, 3.10.0-514.28.1)
module include dir = /lib/modules/3.10.0-514.28.1.el7.ppc64/build/include
module build dir = /lib/modules/3.10.0-514.28.1.el7.ppc64/build
kernel source dir = /usr/src/linux-3.10.0-514.28.1.el7.ppc64/include
Found valid kernel header file under /usr/src/kernels/3.10.0-514.28.1.el7.ppc64/include
Verifying Compiler...
make is present at /bin/make
cpp is present at /bin/cpp
gcc is present at /bin/gcc
g++ is present at /bin/g++
ld is present at /bin/ld
Verifying Additional System Headers...
Verifying kernel-headers is installed ...
Command: /bin/rpm -q kernel-headers
The required package kernel-headers is installed
make World ...
make InstallImages ...
--------------------------------------------------------
mmbuildgpl: Building GPL module completed successfully at Wed Nov 8 17:18:39 EST 2017.
12. Restore the GPFS configuration from an existing healthy node in the cluster.
[ems]# ssh <REPLACEMENT NODE> "/usr/lpp/mmfs/bin/mmsdrrestore -p <GOOD NODE>"
[root@ems ~]# ssh gssio2 "/usr/lpp/mmfs/bin/mmsdrrestore -p ems1"
mmsdrrestore: Processing node gssio1
mmsdrrestore: Node gssio1 successfully restored.

Note: This command runs mmsdrrestore on the replacement node, and the -p option specifies an existing
healthy node.
13. Start GPFS on the recovered node, and enable the GPFS auto load.
a. Before starting GPFS, verify that the replacement node is still in DOWN state.
[ems]# mmgetstate -aL
Node number Node name Quorum Nodes up Total nodes GPFS state Remarks
------------------------------------------------------------------------------------
1 gssio1 2 2 5 active quorum node
2 gssio2 0 0 5 down quorum node
3 ems1 2 2 5 active quorum node
4 gsscomp1 2 2 5 active
5 gsscomp 2 2 5 active
b. Start GPFS on the replacement node.
[ems]# mmstartup -N <REPLACEMENT NODE>
mmstartup: Starting GPFS ...
c. Verify that the replacement node is active.
[ems]# mmgetstate -aL
Node number Node name Quorum Nodes up Total nodes GPFS state Remarks
------------------------------------------------------------------------------------
1 gssio1 2 3 5 active quorum node
2 gssio2 2 3 5 active quorum node
3 ems1 2 3 5 active quorum node
4 gsscomp1 2 3 5 active
5 gsscomp2 2 3 5 active
d. Ensure that all the file systems are mounted on the replacement node.
[ems]# mmmount all -N <REPLACEMENT NODE>
[ems]# mmlsmount all -L



e. Re-enable the GPFS auto load.
[ems]# mmlsconfig autoload
autoload no

[ems]# mmchconfig autoload=yes


mmchconfig: Command successfully completed

[ems]# mmlsconfig autoload


autoload yes
14. Verify that the recovered node is now the active recovery group server for its recovery group.
[ems1]# mmlsrecoverygroup
recovery group vdisks vdisks servers
------------------ ----------- ------ -------
rg_gssio1 3 18 gssio1,gssio2
rg_gssio2 3 18 gssio2,gssio1
View the active node for each recovery group.
[ems1]# mmlsrecoverygroup rg_gssio1 -L | grep "active recovery" -A2
active recovery group server servers
----------------------------------------------- -------
gssio1 gssio1,gssio2

[ems1]# mmlsrecoverygroup rg_gssio2 -L | grep "active recovery" -A2


active recovery group server servers
----------------------------------------------- -------
gssio2 gssio2,gssio1
The recovered node gssio1 should have automatically taken over its recovery group. If
gssio1 did not, you need to manually set it as the active recovery group server for its recovery
group.
[ems1]# mmchrecoverygroup rg_gssio1 --servers <NEW PRIMARY NODE>,<OLD PRIMARY NODE>
[root@gssio1 ~]# mmchrecoverygroup rg_gssio1 --servers gssio2,gssio1

[ems1]# mmlsrecoverygroup rg_gssio1 -L | grep "active recovery" -A2


active recovery group server servers
----------------------------------------------- -------
gssio2 gssio1,gssio2

[ems1]# mmlsrecoverygroup rg_gssio2 -L | grep "active recovery" -A2


active recovery group server servers
----------------------------------------------- -------
gssio2 gssio2,gssio1
15. Verify that the NVRAM partition exists, and ensure the following:
v There should be 11 partitions.
v Partitions 6 through 11 should be 2 GB.
v Partitions 6 through 9 are marked as xfs for the file system.
v Partitions 10 and 11 should not have a file system associated with them.
v After re-imaging, the node that was re-imaged will have an xfs file system as shown:
[ems]# ssh gssio1 "lsblk | egrep 'NAME|sda[0-9]'"
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
├─sda1 8:1 0 8M 0 part
├─sda2 8:2 0 500M 0 part /boot
├─sda3 8:3 0 246.1G 0 part /
├─sda4 8:4 0 1K 0 part
├─sda5 8:5 0 3.9G 0 part [SWAP]
├─sda6 8:6 0 2G 0 part
├─sda7 8:7 0 2G 0 part
├─sda8 8:8 0 2G 0 part
├─sda9 8:9 0 2G 0 part
├─sda10 8:10 0 2G 0 part
└─sda11 8:11 0 2G 0 part



[ems1]# ssh gssio1 "parted /dev/sda -l | egrep 'boot, prep' -B 1 -A 10"
Number Start End Size Type File system Flags
1 1049kB 9437kB 8389kB primary boot, prep
2 9437kB 534MB 524MB primary xfs
3 534MB 265GB 264GB primary xfs
4 265GB 284GB 18.9GB extended
5 265GB 269GB 4194MB logical linux-swap(v1)
6 269GB 271GB 2097MB logical xfs
7 271GB 273GB 2097MB logical xfs
8 273GB 275GB 2097MB logical xfs
9 275GB 277GB 2097MB logical xfs
10 277GB 279GB 2097MB logical
11 279GB 282GB 2097MB logical

[ems1]# ssh gssio2 "lsblk | egrep 'NAME|sda[0-9]'"


NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
├─sda1 8:1 0 8M 0 part
├─sda2 8:2 0 500M 0 part /boot
├─sda3 8:3 0 246.1G 0 part /
├─sda4 8:4 0 1K 0 part
├─sda5 8:5 0 3.9G 0 part [SWAP]
├─sda6 8:6 0 2G 0 part
├─sda7 8:7 0 2G 0 part
├─sda8 8:8 0 2G 0 part
├─sda9 8:9 0 2G 0 part
├─sda10 8:10 0 2G 0 part
└─sda11 8:11 0 2G 0 part

[ems1]# ssh gssio2 "parted /dev/sda -l | egrep 'boot, prep' -B 1 -A 10"


Number Start End Size Type File system Flags
1 1049kB 9437kB 8389kB primary boot, prep
2 9437kB 534MB 524MB primary xfs
3 534MB 265GB 264GB primary xfs
4 265GB 284GB 18.9GB extended
5 265GB 269GB 4194MB logical linux-swap(v1)
6 269GB 271GB 2097MB logical xfs
7 271GB 273GB 2097MB logical xfs
8 273GB 275GB 2097MB logical xfs
9 275GB 277GB 2097MB logical xfs
10 277GB 279GB 2097MB logical xfs
11 279GB 282GB 2097MB logical xfs

If the partitions do not exist, you need to create them. For more information, see Chapter 3,
“Re-creating the NVR partitions,” on page 17.
16. View the current NVR device status.
[ems1]# mmlsrecoverygroup rg_gssio1 -L --pdisk | egrep "n[0-9]s[0-9]"
n1s01 1, 1 NVR 1816 MiB ok
n2s01 0, 0 NVR 1816 MiB missing

[ems1]# mmlsrecoverygroup rg_gssio2 -L --pdisk | egrep "n[0-9]s[0-9]"


n1s02 1, 1 NVR 1816 MiB ok
n2s02 0, 0 NVR 1816 MiB missing

Note: The missing NVR devices must be recreated or replaced. For more information, see Chapter 4,
“Re-creating NVRAM pdisks,” on page 19.

Chapter 6. Best practices for troubleshooting
Following certain best practices make the troubleshooting process easier.

For information on IBM Spectrum Scale issues and their resolution, see the IBM Spectrum Scale: Problem
Determination Guide in the IBM Spectrum Scale Knowledge Center.

How to get started with troubleshooting


Troubleshooting the issues reported in the system is easier when you follow the process step-by-step.

When you experience some issues with the system, go through the following steps to get started with the
troubleshooting:
1. Check the events that are reported in various nodes of the cluster by using the mmhealth node
eventlog command.
2. Check the user action corresponding to the active events and take the appropriate action. For more
information on the events and corresponding user action, see “Events” on page 77.
3. Check for events that happened before the event you are trying to investigate. They might give you
an idea about the root cause of the problem. For example, if you see an nfs_in_grace event and a
node_resumed event a minute earlier, you get an idea of the root cause: NFS entered the grace
period because the node was resumed after a suspend.
4. Collect the details of the issues through logs, dumps, and traces. You can use various CLI commands
and Settings > Diagnostic Data GUI page to collect the details of the issues reported in the system.
5. Based on the type of issue, browse through the various topics that are listed in the troubleshooting
section and try to resolve the issue.
6. If you cannot resolve the issue by yourself, contact IBM Support.
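As an illustration of steps 1 and 2, the following commands (a sketch; they can be run on any cluster
node, and the output in your environment will differ) show the event log for the current node and a
summary of its component health:
mmhealth node eventlog
mmhealth node show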

Back up your data


You need to back up data regularly to avoid data loss. It is also recommended to take backups before you
start troubleshooting. IBM Spectrum Scale provides various options for creating data backups.

Follow the guidelines in the following sections to avoid any issues while creating backups:
v GPFS backup data in IBM Spectrum Scale: Concepts, Planning, and Installation Guide
v Backup considerations for using IBM Spectrum Protect™ in IBM Spectrum Scale: Concepts, Planning, and
Installation Guide
v Configuration reference for using IBM Spectrum Protect with IBM Spectrum Scale in IBM Spectrum Scale:
Administration Guide
v Protecting data in a file system using backup in IBM Spectrum Scale: Administration Guide
v Backup procedure with SOBAR in IBM Spectrum Scale: Administration Guide

The following best practices help you to troubleshoot the issues that might arise in the data backup
process:
1. Enable the most useful messages in mmbackup command by setting the MMBACKUP_PROGRESS_CONTENT
and MMBACKUP_PROGRESS_INTERVAL environment variables in the command environment prior to issuing
the mmbackup command. Setting MMBACKUP_PROGRESS_CONTENT=7 provides the most useful messages. For
more information on these variables, see mmbackup command in IBM Spectrum Scale: Command and
Programming Reference.
2. If the mmbackup process is failing regularly, enable debug options in the backup process:

Use the DEBUGmmbackup environment variable or the -d option that is available in the mmbackup
command to enable debugging features. This variable controls what debugging features are enabled. It
is interpreted as a bitmask with the following bit meanings:
0x001 Specifies that basic debug messages are printed to STDOUT. There are multiple components
that comprise mmbackup, so the debug message prefixes can vary. Some examples include:
mmbackup:mbackup.sh
DEBUGtsbackup33:
0x002 Specifies that temporary files are to be preserved for later analysis.
0x004 Specifies that all dsmc command output is to be mirrored to STDOUT.
The -d option in the mmbackup command line is equivalent to DEBUGmmbackup=1.
3. To troubleshoot problems with backup subtask execution, enable debugging in the tsbuhelper
program.
Use the DEBUGtsbuhelper environment variable to enable debugging features in the mmbackup helper
program tsbuhelper.
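For example, a minimal sketch of an mmbackup invocation with the progress and debug settings described
above might look like the following; the file system name /gpfs/fs0 is only an example:
export MMBACKUP_PROGRESS_CONTENT=7
export MMBACKUP_PROGRESS_INTERVAL=30
export DEBUGmmbackup=1
mmbackup /gpfs/fs0 -t incremental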

Resolve events in a timely manner


Resolving issues in a timely manner helps you focus attention on the new and most critical events. If a
number of alerts remain unfixed, fixing any one event might become more difficult because of the
effects of the other events. You can use either the CLI or the GUI to view the list of issues that are
reported in the system.

You can use the mmhealth node eventlog command to list the events that are reported in the system.

The Monitoring > Events GUI page lists all events reported in the system. You can also mark certain
events as read to change their status in the events view. The status icons become gray when an error or
warning is fixed or when the event is marked as read. Some issues can be resolved by running a fix
procedure; use the Run Fix Procedure action to do so. The Events page provides a recommendation for
which fix procedure to run next.

Keep your software up to date


Check for new code releases and update your code on a regular basis.

This can be done by checking the IBM support website to see if new code releases are available: IBM
Elastic Storage™ Server support website. The release notes provide information about new function in a
release plus any issues that are resolved with the new release. Update your code regularly if the release
notes indicate a potential issue.

Note: If a critical problem is detected in the field, IBM may send a flash advising you to contact IBM
for an efix. Applying the efix might resolve the issue.

Subscribe to the support notification


Subscribe to support notifications so that you are aware of best practices and issues that might affect your
system.

Subscribe to support notifications by visiting the IBM support page on the following IBM website:
http://www.ibm.com/support/mynotifications.

By subscribing, you are informed of new and updated support site information, such as publications,
hints and tips, technical notes, product flashes (alerts), and downloads.



Know your IBM warranty and maintenance agreement details
If you have a warranty or maintenance agreement with IBM, know the details that must be supplied
when you call for support.

For more information on the IBM Warranty and maintenance details, see Warranties, licenses and
maintenance.

Know how to report a problem


If you need help, service, technical assistance, or want more information about IBM products, you can
find a wide variety of sources available from IBM to assist you.

IBM maintains pages on the web where you can get information about IBM products and fee services,
product implementation and usage assistance, break and fix service support, and the latest technical
information. The following table provides the URLs of the IBM websites where you can find the support
information.
Table 2. IBM websites for help, services, and information
Website                                                         Address
IBM home page                                                   http://www.ibm.com
Directory of worldwide contacts                                 http://www.ibm.com/planetwide
Support for ESS                                                 IBM Elastic Storage Server support website
Support for IBM System Storage and IBM Total Storage products   http://www.ibm.com/support/entry/portal/product/system_storage/

Note: Available services, telephone numbers, and web links are subject to change without notice.

Before you call

Make sure that you have taken steps to try to solve the problem yourself before you call. Some
suggestions for resolving the problem before calling IBM Support include:
v Check all hardware for issues beforehand.
v Use the troubleshooting information in your system documentation. The troubleshooting section of the
IBM Knowledge Center contains procedures to help you diagnose problems.

To check for technical information, hints, tips, and new device drivers or to submit a request for
information, go to the IBM Elastic Storage Server support website.

Using the documentation

Information about your IBM storage system is available in the documentation that comes with the
product. That documentation includes printed documents, online documents, readme files, and help files
in addition to the IBM Knowledge Center.

Chapter 7. Limitations
Read this section to learn about product limitations.

Limit updates to Red Hat Enterprise Linux (ESS 5.3)


Limit errata updates to RHEL to security updates and updates requested by IBM Service.

The required Red Hat components are:


v RHEL 7.3 ISO (PPC64BE and PPC64LE)
v NetworkManager version: 1.8.0-11.el7_4
v Systemd version: 219-42.el7_4.10
v Kernel version: 3.10.0-514.44

ESS 5.3 supports Red Hat Enterprise Linux 7.3 (3.10.0-514.44.1 ppc64BE and LE). It is highly
recommended that you install only the following types of updates to RHEL:
v Security updates.
v Errata updates that are requested by IBM Service.
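For example, a minimal sketch of restricting an errata update to security fixes by using the yum security
plug-in follows; confirm with IBM Service before applying any update:
yum updateinfo list security
yum update --security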

Chapter 8. Collecting information about an issue
To begin the troubleshooting process, collect information about the issue that the system is reporting.

From the EMS, issue the following command:


gsssnap -i -g -N <IO node1>,<IO node 2>,..,<IO node X>

The system will return a gpfs.snap, an installcheck, and the data from each node.
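For example, on a building block with two I/O server nodes named gssio1 and gssio2 (illustrative names),
the command would be:
gsssnap -i -g -N gssio1,gssio2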

Chapter 9. GUI Issues
When troubleshooting GUI issues, it is recommended to view the logs that are located under
/var/log/cnlog/mgtsrv. By default, the GUI is installed on the EMS node. It is possible that the customer
installed it on another node. In such cases, the GUI logs are stored on the node where the GUI is installed.

The following logs can be viewed to troubleshoot the GUI issues:


mgtsrv-system-log-x
Logs everything that runs in background processes, such as refresh tasks. This is the most
important log for the GUI.
mgtsrv-trace-log-x
Logs everything that is directly triggered by the GUI user. For example, starting an action,
clicking a button, executing a GUI CLI command, etc.
wlp-messages.log
This log covers the underlying WebSphere Liberty server. The log is mostly relevant during the
startup phase.
gpfsgui_trc.log
Logs the issues related to incoming requests from the browser. Users must check this log if the
GUI displays the error message:
'Server was unable to process request.'
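As a quick check (a sketch that assumes the GUI is installed on the EMS node; the numeric suffix of the
log file name varies), you can list the log directory and follow the most important log while reproducing
the problem:
ls -ltr /var/log/cnlog/mgtsrv
tail -f /var/log/cnlog/mgtsrv/mgtsrv-system-log-1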

Issue with loading GUI


If there are problems in loading the GUI, you can reconfigure the GUI to see if that resolves the problem.

Follow these steps to reconfigure the GUI:


1. Run the following command to force the GUI to launch the wizard after the next login:
/usr/lpp/mmfs/gui/cli/debug enablewizard
systemctl restart gpfsgui
2. Run the following command to force the GUI to no longer display the wizard after login:
/usr/lpp/mmfs/gui/cli/debug disablewizard
systemctl restart gpfsgui
3. If the problem persists, reinstall the GUI RPM, which can be found on the EMS node at
/opt/ibm/gss/install/rhel7/ppc64/gui/gpfs.gui*, by using the following command:
rpm -Uvh /opt/ibm/gss/install/rhel7/ppc64/gui/gpfs.gui*


4. If there is a possibility that the GUI database has become corrupt or has inconsistencies that are
preventing the GUI from loading properly, take the following steps.

CAUTION: Do this only as a last resort because the GUI configuration settings will be lost after
you execute the following steps:
a. Stop the GUI service.
systemctl stop gpfsgui
b. Drop the GUI schema from the postgres database.
psql postgres postgres -c "DROP SCHEMA FSCC CASCADE"
c. Start the GUI service.

systemctl start gpfsgui



Chapter 10. Recovery Group Issues
Use the mmlsrecoverygroup command to check which recovery groups are available:
# mmlsrecoverygroup

The command will give output similar to the following:

declustered
arrays with
recovery group vdisks vdisks servers
------------------ ------- ------ -------
rg_rchgss1-hs 3 5 rchgss1-hs.gpfs.rchland.ibm.com,rchgss2-hs.gpfs.rchland.ibm.com
rg_rchgss2-hs 3 5 rchgss2-hs.gpfs.rchland.ibm.com,rchgss1-hs.gpfs.rchland.ibm.com

List the active recovery groups and their primary servers by using the following commands:
mmlsrecoverygroup rg_rchgss1-hs -L | grep active -A2

active recovery group server servers


----------------------------------- -------
rchgss1-hs.gpfs.rchland.ibm.com rchgss1-hs.gpfs.rchland.ibm.com,rchgss2-hs.gpfs.rchland.ibm.com

mmlsrecoverygroup rg_rchgss2-hs -L | grep active -A2

active recovery group server servers


----------------------------------- -------
rchgss2-hs.gpfs.rchland.ibm.com rchgss2-hs.gpfs.rchland.ibm.com,rchgss1-hs.gpfs.rchland.ibm.com

Each of the recovery groups must be served by its own server. If the server is unavailable because of
maintenance or other issues, the recovery group must be served by an available server. After a failure or
maintenance event, when the recovery group's primary server becomes active again, it should
automatically begin serving its recovery group. You will find the following information in the
/var/adm/ras/mmfs.log.latest file on the recovery group server:
v Now serving recovery group rg_rchgss1-hs.
v Reason for takeover of rg_rchgss1-hs: 'primary server became ready'.

If the recovery group is not being served by its respective server, examine the GPFS log on that server for
errors that might prevent the server from serving the recovery group. If there are no issues, you can
manually activate the recovery group. For example, to allow rchgss1-hs.gpfs.rchland.ibm.com to serve
the rg_rchgss1-hs recovery group, execute:
mmchrecoverygroup rg_rchgss1-hs --active rchgss1-hs.gpfs.rchland.ibm.com

In certain situations, if an ESS server node experiences a disk failure, the disks may be marked down
and do not start automatically. This can prevent the recovery group from becoming active. For more
information on troubleshooting disk problems, see Disk Issues in the IBM Spectrum Scale documentation.

Before troubleshooting further, ensure that GPFS is in the active state for the node in question by running
the mmgetstate command:
mmgetstate -a

The command will give output similar to the following:



Node number Node name GPFS state
-------------------------------------------
1 rchgss1-hs active
2 rchgss2-hs active
3 rchems1 active

Execute the mmlsdisk command to check the status of the disks. The -e option displays only disks
with errors.
mmlsdisk gpfs0 -e

The command will give output similar to the following:


disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------------------------ -------- ------ ---------- -------- ----- ------------- ------------ ------------
rg_rchgss1_hs_MetaData_1M_3W_1 nsd 512 30 Yes No to be emptied up system

Attention: Due to an earlier configuration change the file system may contain data that is at risk of being
lost.
In the above example, the disk is in the suspended state, hence the to be emptied status. Other disks
may be in a non-ready state, or their availability may be down, which prevents the disks from being used
by GPFS/ESS.
disk driver sector failure holds holds storage
name type size group metadata data status availability disk id pool remarks
------------------------------------ -------- ------ --------- -------- ----- ------------- ------------ -------------- --------
rg_rchgss1_hs_MetaData_1M_3W_1 nsd 512 30 Yes No ready up 1 system
rg_rchgss1_hs_Data_16M_2p_1 nsd 512 30 No Yes ready up 2 data desc
rg_rchgss2_hs_MetaData_1M_3W_1 nsd 512 30 Yes No ready up 3 system desc
rg_rchgss2_hs_Data_16M_2p_1 nsd 512 30 No Yes ready up 4 data desc

You can try to manually start the disks by running the mmchdisk command.
mmchdisk gpfs0 start -d rg_rchgss1_hs_MetaData_1M_3W_1
mmnsddiscover: Attempting to rediscover the disks. This may take a while ...
mmnsddiscover: Finished.
rchgss1-hs.gpfs.rchland.ibm.com: Rediscovered nsd server access to rg_rchgss1_hs_MetaData_1M_3W_1

If multiple disks are down, you can run the command:


mmchdisk gpfs0 start -a

Note: Depending on the number of disks that are down and their size, the mmnsddiscover command may
take a while to complete.



Chapter 11. Contacting IBM
Specific information about a problem, such as symptoms, traces, error logs, GPFS™ logs, and file system
status, is vital for IBM to resolve an IBM Spectrum Scale RAID problem.

Obtain this information as quickly as you can after a problem is detected, so that error logs will not wrap
and system parameters, which are always changing, will be captured as close to the point of failure as
possible. When a serious problem is detected, collect this information and then call IBM.

Information to collect before contacting the IBM Support Center


For effective communication with the IBM Support Center to help with problem diagnosis, you need to
collect certain information.

Information to collect for all problems related to IBM Spectrum Scale RAID

Regardless of the problem encountered with IBM Spectrum Scale RAID, the following data should be
available when you contact the IBM Support Center:
1. A description of the problem.
2. Output of the failing application, command, and so forth.
To collect the gpfs.snap data and the ESS tool logs, issue the following from the EMS:
gsssnap -g -i -N <IO node1>,<IO node2>,...,<IO nodeX>
3. A tar file generated by the gpfs.snap command that contains data from the nodes in the cluster. In
large clusters, the gpfs.snap command can collect data from certain nodes (for example, the affected
nodes, NSD servers, or manager nodes) using the -N option.
For more information about gathering data using the gpfs.snap command, see the IBM Spectrum Scale:
Problem Determination Guide.
If the gpfs.snap command cannot be run, collect these items:
a. Any error log entries that are related to the event:
v On a Linux node, create a tar file of all the entries in the /var/log/messages file from all nodes in
the cluster or the nodes that experienced the failure. For example, issue the following command
to create a tar file that includes all nodes in the cluster:
mmdsh -v -N all "cat /var/log/messages" > all.messages
v On an AIX® node, issue this command:
errpt -a
For more information about the operating system error log facility, see the IBM Spectrum Scale:
Problem Determination Guide.
b. A master GPFS log file that is merged and chronologically sorted for the date of the failure. (See
the IBM Spectrum Scale: Problem Determination Guide for information about creating a master GPFS
log file.)
c. If the cluster was configured to store dumps, collect any internal GPFS dumps written to that
directory relating to the time of the failure. The default directory is /tmp/mmfs.
d. On a failing Linux node, gather the installed software packages and the versions of each package
by issuing this command:
rpm -qa
e. On a failing AIX node, gather the name, most recent level, state, and description of all installed
software packages by issuing this command:
lslpp -l



f. File system attributes for all of the failing file systems, issue:
mmlsfs Device
g. The current configuration and state of the disks for all of the failing file systems, issue:
mmlsdisk Device
h. A copy of file /var/mmfs/gen/mmsdrfs from the primary cluster configuration server.
4. If you are experiencing one of the following problems, see the appropriate section before contacting
the IBM Support Center:
v For delay and deadlock issues, see “Additional information to collect for delays and deadlocks.”
v For file system corruption or MMFS_FSSTRUCT errors, see “Additional information to collect for
file system corruption or MMFS_FSSTRUCT errors.”
v For GPFS daemon crashes, see “Additional information to collect for GPFS daemon crashes.”

Additional information to collect for delays and deadlocks

When a delay or deadlock situation is suspected, the IBM Support Center will need additional
information to assist with problem diagnosis. If you have not done so already, make sure you have the
following information available before contacting the IBM Support Center:
1. Everything that is listed in “Information to collect for all problems related to IBM Spectrum Scale
RAID” on page 39.
2. The deadlock debug data collected automatically.
3. If the cluster size is relatively small and the maxFilesToCache setting is not high (less than 10,000),
issue the following command:
gpfs.snap --deadlock
If the cluster size is large or the maxFilesToCache setting is high (greater than 1M), issue the
following command:
gpfs.snap --deadlock --quick
For more information about the --deadlock and --quick options, see the IBM Spectrum Scale: Problem
Determination Guide .

Additional information to collect for file system corruption or MMFS_FSSTRUCT errors

When file system corruption or MMFS_FSSTRUCT errors are encountered, the IBM Support Center will
need additional information to assist with problem diagnosis. If you have not done so already, make sure
you have the following information available before contacting the IBM Support Center:
1. Everything that is listed in “Information to collect for all problems related to IBM Spectrum Scale
RAID” on page 39.
2. Unmount the file system everywhere, then run mmfsck -n in offline mode and redirect it to an output
file.

The IBM Support Center will determine when and if you should run the mmfsck -y command.

Additional information to collect for GPFS daemon crashes

When the GPFS daemon is repeatedly crashing, the IBM Support Center will need additional information
to assist with problem diagnosis. If you have not done so already, make sure you have the following
information available before contacting the IBM Support Center:
1. Everything that is listed in “Information to collect for all problems related to IBM Spectrum Scale
RAID” on page 39.
2. Make sure the /tmp/mmfs directory exists on all nodes. If this directory does not exist, the GPFS
daemon will not generate internal dumps.



3. Set the traces on this cluster and all clusters that mount any file system from this cluster:
mmtracectl --set --trace=def --trace-recycle=global
4. Start the trace facility by issuing:
mmtracectl --start
5. Recreate the problem if possible or wait for the assert to be triggered again.
6. Once the assert is encountered on the node, turn off the trace facility by issuing:
mmtracectl --off
If traces were started on multiple clusters, mmtracectl --off should be issued immediately on all
clusters.
7. Collect gpfs.snap output:
gpfs.snap

How to contact the IBM Support Center


IBM support is available for various types of IBM hardware and software problems that IBM Spectrum
Scale customers may encounter.

These problems include the following:


v IBM hardware failure
v Node halt or crash not related to a hardware failure
v Node hang or response problems
v Failure in other software supplied by IBM
If you have an IBM Software Maintenance service contract
If you have an IBM Software Maintenance service contract, contact IBM Support as follows:

Your location              Method of contacting IBM Support
In the United States       Call 1-800-IBM-SERV for support.
Outside the United States  Contact your local IBM Support Center or see the Directory of
                           worldwide contacts (www.ibm.com/planetwide).

When you contact IBM Support, the following will occur:


1. You will be asked for the information you collected in “Information to collect before
contacting the IBM Support Center” on page 39.
2. You will be given a time period during which an IBM representative will return your call. Be
sure that the person you identified as your contact can be reached at the phone number you
provided in the PMR.
3. An online Problem Management Record (PMR) will be created to track the problem you are
reporting, and you will be advised to record the PMR number for future reference.
4. You may be requested to send data related to the problem you are reporting, using the PMR
number to identify it.
5. Should you need to make subsequent calls to discuss the problem, you will also use the PMR
number to identify the problem.
If you do not have an IBM Software Maintenance service contract
If you do not have an IBM Software Maintenance service contract, contact your IBM sales
representative to find out how to proceed. Be prepared to provide the information you collected
in “Information to collect before contacting the IBM Support Center” on page 39.

For failures in non-IBM software, follow the problem-reporting procedures provided with that product.

Chapter 12. Maintenance procedures
Very large disk systems, with thousands or tens of thousands of disks and servers, will likely experience
a variety of failures during normal operation.

To maintain system productivity, the vast majority of these failures must be handled automatically
without loss of data, without temporary loss of access to the data, and with minimal impact on the
performance of the system. Some failures require human intervention, such as replacing failed
components with spare parts or correcting faults that cannot be corrected by automated processes.

You can also use the ESS GUI to perform various maintenance tasks. The ESS GUI lists various
maintenance-related events in its event log in the Monitoring > Events page. You can set up email alerts
to get notified when such events are reported in the system. You can resolve these events or contact the
IBM Support Center for help as needed. The ESS GUI includes various maintenance procedures to guide
you through the fix process.

Updating the firmware for host adapters, enclosures, and drives


After creating a GPFS cluster, you can install the most current firmware for host adapters, enclosures, and
drives.

After creating a GPFS cluster, install the most current firmware for host adapters, enclosures, and drives
only if you are instructed to do so by IBM Support, or to address issues that occur because you have not
upgraded to a later version of ESS.

You can update the firmware either manually or with the help of directed maintenance procedures (DMP)
that are available in the GUI. The ESS GUI lists events in its event log in the Monitoring > Events page if
the host adapter, enclosure, or drive firmware is not up-to-date, compared to the currently-available
firmware packages on the servers. Select Run Fix Procedure from the Action menu for the
firmware-related event to launch the corresponding DMP in the GUI. For more information on the
available DMPs, see Directed maintenance procedures in Elastic Storage Server: Problem Determination Guide.

The most current firmware is packaged as the gpfs.gss.firmware RPM. You can find the most current
firmware on Fix Central.
1. Sign in with your IBM ID and password.
2. On the Find product tab:
a. In the Product selector field, type: IBM Spectrum Scale RAID and click on the arrow to the right
b. On the Installed Version drop-down menu, select: 5.0.0
c. On the Platform drop-down menu, select: Linux 64-bit,pSeries
d. Click on Continue
3. On the Select fixes page, select the most current fix pack.
4. Click on Continue
5. On the Download options page, select the radio button to the left of your preferred download
method. Make sure the check box to the left of Include prerequisites and co-requisite fixes (you
can select the ones you need later) has a check mark in it.
6. Click on Continue to go to the Download files... page and download the fix pack files.

The gpfs.gss.firmware RPM needs to be installed on all ESS server nodes. It contains the most current
updates of the following types of supported firmware for an ESS configuration:
v Host adapter firmware

v Enclosure firmware
v Drive firmware
v Firmware loading tools.
For command syntax and examples, see mmchfirmware command in IBM Spectrum Scale RAID:
Administration.
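The following is a minimal sketch of applying the package and then updating each firmware type; the
RPM path and the gss_ppc64 node class are assumptions for illustration, and firmware should be updated
only when instructed to do so by IBM Support:
mmdsh -N gss_ppc64 "rpm -Uvh /tmp/fixcentral/gpfs.gss.firmware*.ppc64.rpm"
mmchfirmware --type storage-enclosure
mmchfirmware --type drive
mmchfirmware --type host-adapter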

Disk diagnosis
For information about disk hospital, see Disk hospital in IBM Spectrum Scale RAID: Administration.

When an individual disk I/O operation (read or write) encounters an error, IBM Spectrum Scale RAID
completes the NSD client request by reconstructing the data (for a read) or by marking the unwritten
data as stale and relying on successfully written parity or replica strips (for a write), and starts the disk
hospital to diagnose the disk. While the disk hospital is diagnosing, the affected disk will not be used for
serving NSD client requests.

Similarly, if an I/O operation does not complete in a reasonable time period, it is timed out, and the
client request is treated just like an I/O error. Again, the disk hospital will diagnose what went wrong. If
the timed-out operation is a disk write, the disk remains temporarily unusable until a pending timed-out
write (PTOW) completes.

The disk hospital then determines the exact nature of the problem. If the cause of the error was an actual
media error on the disk, the disk hospital marks the offending area on disk as temporarily unusable, and
overwrites it with the reconstructed data. This cures the media error on a typical HDD by relocating the
data to spare sectors reserved within that HDD.

If the disk reports that it can no longer write data, the disk is marked as readonly. This can happen when
no spare sectors are available for relocating in HDDs, or the flash memory write endurance in SSDs was
reached. Similarly, if a disk reports that it cannot function at all, for example not spin up, the disk
hospital marks the disk as dead.

The disk hospital also maintains various forms of disk badness, which measure accumulated errors from
the disk, and of relative performance, which compare the performance of this disk to other disks in the
same declustered array. If the badness level is high, the disk can be marked dead. For less severe cases,
the disk can be marked failing.

Finally, the IBM Spectrum Scale RAID server might lose communication with a disk. This can either be
caused by an actual failure of an individual disk, or by a fault in the disk interconnect network. In this
case, the disk is marked as missing. If the relative performance of the disk drops below 66% of the other
disks for an extended period, the disk will be declared slow.

If a disk would have to be marked dead, missing, or readonly, and the problem affects individual disks
only (not a large set of disks), the disk hospital tries to recover the disk. If the disk reports that it is not
started, the disk hospital attempts to start the disk. If nothing else helps, the disk hospital power-cycles
the disk (assuming the JBOD hardware supports that), and then waits for the disk to return online.

Before actually reporting an individual disk as missing, the disk hospital starts a search for that disk by
polling all disk interfaces to locate the disk. Only after that fast poll fails is the disk actually declared
missing.
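For example, to see which pdisks the disk hospital currently holds in a non-ok state (dead, missing,
failing, slow, and so on), you can issue:
mmlspdisk all --not-ok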

If a large set of disks has faults, the IBM Spectrum Scale RAID server can continue to serve read and
write requests, provided that the number of failed disks does not exceed the fault tolerance of either the
RAID code for the vdisk or the IBM Spectrum Scale RAID vdisk configuration data. When any disk fails,
the server begins rebuilding its data onto spare space. If the failure is not considered critical, rebuilding is
throttled when user workload is present. This ensures that the performance impact to user workload is
minimal. A failure might be considered critical if a vdisk has no remaining redundancy information, for
example three disk faults for 4-way replication and 8 + 3p or two disk faults for 3-way replication and
8 + 2p. During a critical failure, critical rebuilding will run as fast as possible because the vdisk is in
imminent danger of data loss, even if that impacts the user workload. Because the data is declustered, or
spread out over many disks, and all disks in the declustered array participate in rebuilding, a vdisk will
remain in critical rebuild only for short periods of time (several minutes for a typical system). A double
or triple fault is extremely rare, so the performance impact of critical rebuild is minimized.

In a multiple fault scenario, the server might not have enough disks to fulfill a request. More specifically,
such a scenario occurs if the number of unavailable disks exceeds the fault tolerance of the RAID code. If
some of the disks are only temporarily unavailable, and are expected back online soon, the server will
stall the client I/O and wait for the disk to return to service. Disks can be temporarily unavailable for
any of the following reasons:
v The disk hospital is diagnosing an I/O error.
v A timed-out write operation is pending.
v A user intentionally suspended the disk, perhaps it is on a carrier with another failed disk that has been
removed for service.

If too many disks become unavailable for the primary server to proceed, it will fail over. In other words,
the whole recovery group is moved to the backup server. If the disks are not reachable from the backup
server either, then all vdisks in that recovery group become unavailable until connectivity is restored.

A vdisk will suffer data loss when the number of permanently failed disks exceeds the vdisk fault
tolerance. This data loss is reported to NSD clients when the data is accessed.

Background tasks
While IBM Spectrum Scale RAID primarily performs NSD client read and write operations in the
foreground, it also performs several long-running maintenance tasks in the background, which are
referred to as background tasks. The background task that is currently in progress for each declustered
array is reported in the long-form output of the mmlsrecoverygroup command. Table 3 describes the
long-running background tasks.
Table 3. Background tasks
Task Description
repair-RGD/VCD Repairing the internal recovery group data and vdisk configuration data from the failed disk
onto the other disks in the declustered array.
rebuild-critical Rebuilding virtual tracks that cannot tolerate any more disk failures.
rebuild-1r Rebuilding virtual tracks that can tolerate only one more disk failure.
rebuild-2r Rebuilding virtual tracks that can tolerate two more disk failures.
rebuild-offline Rebuilding virtual tracks where failures exceeded the fault tolerance.
rebalance Rebalancing the spare space in the declustered array for either a missing pdisk that was
discovered again, or a new pdisk that was added to an existing array.
scrub Scrubbing vdisks to detect any silent disk corruption or latent sector errors by reading the entire
virtual track, performing checksum verification, and performing consistency checks of the data
and its redundancy information. Any correctable errors found are fixed.



Server failover
If the primary IBM Spectrum Scale RAID server loses connectivity to a sufficient number of disks, the
recovery group attempts to fail over to the backup server. If the backup server is also unable to connect,
the recovery group becomes unavailable until connectivity is restored. If the backup server had taken
over, it will relinquish the recovery group to the primary server when it becomes available again.

Data checksums
IBM Spectrum Scale RAID stores checksums of the data and redundancy information on all disks for each
vdisk. Whenever data is read from disk or received from an NSD client, checksums are verified. If the
checksum verification on a data transfer to or from an NSD client fails, the data is retransmitted. If the
checksum verification fails for data read from disk, the error is treated similarly to a media error:
v The data is reconstructed from redundant data on other disks.
v The data on disk is rewritten with reconstructed good data.
v The disk badness is adjusted to reflect the silent read error.

Disk replacement
You can use the ESS GUI for detecting failed disks and for disk replacement.

When one disk fails, the system will rebuild the data that was on the failed disk onto spare space and
continue to operate normally, but at slightly reduced performance because the same workload is shared
among fewer disks. With the default setting of two spare disks for each large declustered array, failure of
a single disk would typically not be a sufficient reason for maintenance.

When several disks fail, the system continues to operate even if there is no more spare space. The next
disk failure would make the system unable to maintain the redundancy the user requested during vdisk
creation. At this point, a service request is sent to a maintenance management application that requests
replacement of the failed disks and specifies the disk FRU numbers and locations.

In general, disk maintenance is requested when the number of failed disks in a declustered array reaches
the disk replacement threshold. By default, that threshold is identical to the number of spare disks. For a
more conservative disk replacement policy, the threshold can be set to smaller values using the
mmchrecoverygroup command.
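For example, a sketch of setting a more conservative replacement threshold of 1 for declustered array DA1
of a recovery group (the recovery group name BB1RGL is illustrative) would be:
mmchrecoverygroup BB1RGL --declustered-array DA1 --replace-threshold 1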

Disk maintenance is performed using the mmchcarrier command with the --release option, which:
v Suspends any functioning disks on the carrier if the multi-disk carrier is shared with the disk that is
being replaced.
v If possible, powers down the disk to be replaced or all of the disks on that carrier.
v Turns on indicators on the disk enclosure and disk or carrier to help locate and identify the disk that
needs to be replaced.
v If necessary, unlocks the carrier for disk replacement.
After the disk is replaced and the carrier reinserted, another mmchcarrier command with the --replace
option powers on the disks.

You can replace the disk either manually or with the help of directed maintenance procedures (DMP) that
are available in the GUI. The ESS GUI lists events in its event log in the Monitoring > Events page if a
disk failure is reported in the system. Select the gnr_pdisk_replaceable event from the list of events and
then select Run Fix Procedure from the Action menu to launch the replace disk DMP in the GUI. For
more information, see Replace disks in Elastic Storage Server: Problem Determination Guide.



Other hardware service
While IBM Spectrum Scale RAID can easily tolerate a single disk fault with no significant impact, and
failures of up to three disks with various levels of impact on performance and data availability, it still
relies on the vast majority of all disks being functional and reachable from the server. If a major
equipment malfunction prevents both the primary and backup server from accessing more than that
number of disks, or if those disks are actually destroyed, all vdisks in the recovery group will become
either unavailable or suffer permanent data loss. As IBM Spectrum Scale RAID cannot recover from such
catastrophic problems, it also does not attempt to diagnose them or orchestrate their maintenance.

In the case that an IBM Spectrum Scale RAID server becomes permanently disabled, a manual failover
procedure exists that requires recabling to an alternate server. For more information, see the
mmchrecoverygroup command in the IBM Spectrum Scale: Command and Programming Reference. If both
the primary and backup IBM Spectrum Scale RAID servers for a recovery group fail, the recovery group
is unavailable until one of the servers is repaired.

Replacing failed disks in an ESS recovery group: a sample scenario


The scenario presented here shows how to detect and replace failed disks in a recovery group built on an
ESS building block.

Detecting failed disks in your ESS enclosure

Assume a GL4 building block on which the following two recovery groups are defined:
v BB1RGL, containing the disks in the left side of each drawer
v BB1RGR, containing the disks in the right side of each drawer

Each recovery group contains the following:


v One log declustered array (LOG)
v Two data declustered arrays (DA1, DA2)

The data declustered arrays are defined according to GL4 best practices as follows:
v 58 pdisks per data declustered array
v Default disk replacement threshold value set to 2

The replacement threshold of 2 means that IBM Spectrum Scale RAID only requires disk replacement
when two or more disks fail in the declustered array; otherwise, rebuilding onto spare space or
reconstruction from redundancy is used to supply affected data. This configuration can be seen in the
output of mmlsrecoverygroup for the recovery groups, which are shown here for BB1RGL:
# mmlsrecoverygroup BB1RGL -L

declustered
recovery group arrays vdisks pdisks format version
----------------- ----------- ------ ------ --------------
BB1RGL 4 8 119 4.1.0.1

declustered needs replace scrub background activity


array service vdisks pdisks spares threshold free space duration task progress priority
----------- ------- ------ ------ ------ --------- ---------- -------- -------------------------
SSD no 1 1 0,0 1 186 GiB 14 days scrub 8% low
NVR no 1 2 0,0 1 3648 MiB 14 days scrub 8% low
DA1 no 3 58 2,31 2 50 TiB 14 days scrub 7% low
DA2 no 3 58 2,31 2 50 TiB 14 days scrub 7% low

declustered checksum
vdisk RAID code array vdisk size block size granularity state remarks
------------------ ------------------ ----------- ---------- ---------- ----------- ----- -------

ltip_BB1RGL 2WayReplication NVR 48 MiB 2 MiB 512 ok logTip
ltbackup_BB1RGL Unreplicated SSD 48 MiB 2 MiB 512 ok logTipBackup
lhome_BB1RGL 4WayReplication DA1 20 GiB 2 MiB 512 ok log
reserved1_BB1RGL 4WayReplication DA2 20 GiB 2 MiB 512 ok logReserved
BB1RGLMETA1 4WayReplication DA1 750 GiB 1 MiB 32 KiB ok
BB1RGLDATA1 8+3p DA1 35 TiB 16 MiB 32 KiB ok
BB1RGLMETA2 4WayReplication DA2 750 GiB 1 MiB 32 KiB ok
BB1RGLDATA2 8+3p DA2 35 TiB 16 MiB 32 KiB ok

config data declustered array VCD spares actual rebuild spare space remarks
------------------ ------------------ ------------- --------------------------------- ----------------
rebuild space DA1 31 35 pdisk
rebuild space DA2 31 35 pdisk

config data max disk group fault tolerance actual disk group fault tolerance remarks
---------------- -------------------------------- --------------------------------- ----------------
rg descriptor 1 enclosure + 1 drawer 1 enclosure + 1 drawer limiting fault tolerance
system index 2 enclosure 1 enclosure + 1 drawer limited by rg descriptor

vdisk max disk group fault tolerance actual disk group fault tolerance remarks
---------------- ----------------------------- --------------------------------- ----------------
ltip_BB1RGL 1 pdisk 1 pdisk
ltbackup_BB1RGL 0 pdisk 0 pdisk
lhome_BB1RGL 3 enclosure 1 enclosure + 1 drawer limited by rg descriptor
reserved1_BB1RGL 3 enclosure 1 enclosure + 1 drawer limited by rg descriptor
BB1RGLMETA1 3 enclosure 1 enclosure + 1 drawer limited by rg descriptor
BB1RGLDATA1 1 enclosure 1 enclosure
BB1RGLMETA2 3 enclosure 1 enclosure + 1 drawer limited by rg descriptor
BB1RGLDATA2 1 enclosure 1 enclosure

active recovery group server servers


----------------------------------------------- -------
c45f01n01-ib0.gpfs.net c45f01n01-ib0.gpfs.net,c45f01n02-ib0.gpfs.net

The indication that disk replacement is called for in this recovery group is the value of yes in the needs
service column for declustered array DA1.

The fact that DA1 is undergoing rebuild of its IBM Spectrum Scale RAID tracks that can tolerate one strip
failure is by itself not an indication that disk replacement is required; it merely indicates that data from a
failed disk is being rebuilt onto spare space. Only if the replacement threshold has been met will disks be
marked for replacement and the declustered array marked as needing service.

IBM Spectrum Scale RAID provides several indications that disk replacement is required:
v Entries in the Linux syslog
v The pdReplacePdisk callback, which can be configured to run an administrator-supplied script at the
moment a pdisk is marked for replacement
v The output from the following commands, which may be performed from the command line on any
IBM Spectrum Scale RAID cluster node (see the examples that follow):
1. mmlsrecoverygroup with the -L flag shows yes in the needs service column
2. mmlsrecoverygroup with the -L and --pdisk flags; this shows the states of all pdisks, which may be
examined for the replace pdisk state
3. mmlspdisk with the --replace flag, which lists only those pdisks that are marked for replacement

Note: Because the output of mmlsrecoverygroup -L --pdisk is long, this example shows only some of the
pdisks (but includes those marked for replacement).
# mmlsrecoverygroup BB1RGL -L --pdisk

declustered
recovery group arrays vdisks pdisks
----------------- ----------- ------ ------
BB1RGL 3 5 119



declustered needs replace scrub background activity
array service vdisks pdisks spares threshold free space duration task progress priority
---------- ------- ------ ------ ------ --------- ---------- -------- -------------------------
LOG no 1 3 0 1 534 GiB 14 days scrub 1% low
DA1 yes 2 58 2 2 0 B 14 days rebuild-1r 4% low
DA2 no 2 58 2 2 1024 MiB 14 days scrub 27% low

n. active, declustered user state,


pdisk total paths array free space condition remarks
---------- ----------- ----------- ---------- ----------- -------
[...]
e1d4s06 2, 4 DA1 62 GiB normal ok
e1d5s01 0, 0 DA1 70 GiB replaceable slow/noPath/systemDrain/noRGD
/noVCD/replace
e1d5s02 2, 4 DA1 64 GiB normal ok
e1d5s03 2, 4 DA1 63 GiB normal ok
e1d5s04 0, 0 DA1 64 GiB replaceable failing/noPath
/systemDrain/noRGD/noVCD/replace
e1d5s05 2, 4 DA1 63 GiB normal ok
[...]

The preceding output shows that the following pdisks are marked for replacement:
v e1d5s01 in DA1
v e1d5s04 in DA1

The naming convention used during recovery group creation indicates that these disks are in Enclosure 1
Drawer 5 Slot 1 and Enclosure 1 Drawer 5 Slot 4. To confirm the physical locations of the failed disks, use
the mmlspdisk command to list information about the pdisks in declustered array DA1 of recovery group
BB1RGL that are marked for replacement:
# mmlspdisk BB1RGL --declustered-array DA1 --replace
pdisk:
replacementPriority = 0.98
name = "e1d5s01"
device = ""
recoveryGroup = "BB1RGL"
declusteredArray = "DA1"
state = "slow/noPath/systemDrain/noRGD/noVCD/replace"
.
.
.
pdisk:
replacementPriority = 0.98
name = "e1d5s04"
device = ""
recoveryGroup = "BB1RGL"
declusteredArray = "DA1"
state = "failing/noPath/systemDrain/noRGD/noVCD/replace"
.
.
.

The physical locations of the failed disks are confirmed to be consistent with the pdisk naming
convention and with the IBM Spectrum Scale RAID component database:
--------------------------------------------------------------------------------------
Disk Location User Location
--------------------------------------------------------------------------------------
pdisk e1d5s01 SV21314035-5-1 Rack BB1 U01-04, Enclosure BB1ENC1 Drawer 5 Slot 1
--------------------------------------------------------------------------------------
pdisk e1d5s04 SV21314035-5-4 Rack BB1 U01-04, Enclosure BB1ENC1 Drawer 5 Slot 4
--------------------------------------------------------------------------------------



This shows how the component database provides an easier-to-use location reference for the affected
physical disks. The pdisk name e1d5s01 means “Enclosure 1 Drawer 5 Slot 1.” Additionally, the location
provides the serial number of enclosure 1, SV21314035, with the drawer and slot number. But the user
location that has been defined in the component database can be used to precisely locate the disk in an
equipment rack and a named disk enclosure: This is the disk enclosure that is labeled “BB1ENC1,” found
in compartments U01 - U04 of the rack labeled “BB1,” and the disk is in drawer 5, slot 1 of that enclosure.

The relationship between the enclosure serial number and the user location can be seen with the
mmlscomp command:
# mmlscomp --serial-number SV21314035

Storage Enclosure Components

Comp ID Part Number Serial Number Name Display ID


------- ----------- ------------- ------- ----------
2 1818-80E SV21314035 BB1ENC1

Replacing failed disks in a GL4 recovery group

Note: In this example, it is assumed that two new disks with the appropriate Field Replaceable Unit
(FRU) code, as indicated by the fru attribute (90Y8597 in this case), have been obtained as replacements
for the failed pdisks e1d5s01 and e1d5s04.

Replacing each disk is a three-step process:


1. Using the mmchcarrier command with the --release flag to inform IBM Spectrum Scale to locate the
disk, suspend it, and allow it to be removed.
2. Locating and removing the failed disk and replacing it with a new one.
3. Using the mmchcarrier command with the --replace flag to begin use of the new disk.

IBM Spectrum Scale RAID assigns a priority to pdisk replacement. Disks with smaller values for the
replacementPriority attribute should be replaced first. In this example, the only failed disks are in DA1
and both have the same replacementPriority.

Disk e1d5s01 is chosen to be replaced first.


1. To release pdisk e1d5s01 in recovery group BB1RGL:
# mmchcarrier BB1RGL --release --pdisk e1d5s01
[I] Suspending pdisk e1d5s01 of RG BB1RGL in location SV21314035-5-1.
[I] Location SV21314035-5-1 is Rack BB1 U01-04, Enclosure BB1ENC1 Drawer 5 Slot 1.
[I] Carrier released.

- Remove carrier.
- Replace disk in location SV21314035-5-1 with FRU 90Y8597.
- Reinsert carrier.
- Issue the following command:

mmchcarrier BB1RGL --replace --pdisk 'e1d5s01'


IBM Spectrum Scale RAID issues instructions as to the physical actions that must be taken, and
repeats the user-defined location to help find the disk.
2. To allow the enclosure BB1ENC1 with serial number SV21314035 to be located and identified, IBM
Spectrum Scale RAID will turn on the enclosure’s amber “service required” LED. The enclosure’s
bezel should be removed. This will reveal that the amber “service required” and blue “service
allowed” LEDs for drawer 5 have been turned on.
Drawer 5 should then be unlatched and pulled open. The disk in slot 1 will be seen to have its amber
and blue LEDs turned on.

Unlatch and pull up the handle for the identified disk in slot 1. Lift out the failed disk and set it
aside. The drive LEDs turn off when the slot is empty. A new disk with FRU 90Y8597 should be
lowered in place and have its handle pushed down and latched.
Since the second disk replacement in this example is also in drawer 5 of the same enclosure, leave the
drawer open and the enclosure bezel off. If the next replacement were in a different drawer, the
drawer would be closed; and if the next replacement were in a different enclosure, the enclosure bezel
would be replaced.
3. To finish the replacement of pdisk e1d5s01:
# mmchcarrier BB1RGL --replace --pdisk e1d5s01
[I] The following pdisks will be formatted on node server1:
/dev/sdmi
[I] Pdisk e1d5s01 of RG BB1RGL successfully replaced.
[I] Resuming pdisk e1d5s01#026 of RG BB1RGL.
[I] Carrier resumed.
When the mmchcarrier --replace command returns successfully, IBM Spectrum Scale RAID begins
rebuilding and rebalancing IBM Spectrum Scale RAID strips onto the new disk, which assumes the
pdisk name e1d5s01. The failed pdisk may remain in a temporary form (indicated here by the name
e1d5s01#026) until all data from it rebuilds, at which point it is finally deleted. Notice that only one
block device name is mentioned as being formatted as a pdisk; the second path will be discovered in
the background.
Disk e1d5s04 is still marked for replacement, and DA1 of BB1RGL will still need service. This is because
the IBM Spectrum Scale RAID replacement policy expects all failed disks in the declustered array to
be replaced after the replacement threshold is reached.

Pdisk e1d5s04 is then replaced following the same process.


1. Release pdisk e1d5s04 in recovery group BB1RGL:
# mmchcarrier BB1RGL --release --pdisk e1d5s04
[I] Suspending pdisk e1d5s04 of RG BB1RGL in location SV21314035-5-4.
[I] Location SV21314035-5-4 is Rack BB1 U01-04, Enclosure BB1ENC1 Drawer 5 Slot 4.
[I] Carrier released.

- Remove carrier.
- Replace disk in location SV21314035-5-4 with FRU 90Y8597.
- Reinsert carrier.
- Issue the following command:

mmchcarrier BB1RGL --replace --pdisk 'e1d5s04'


2. Find the enclosure and drawer, unlatch and remove the disk in slot 4, place a new disk in slot 4, push
in the drawer, and replace the enclosure bezel.
3. To finish the replacement of pdisk e1d5s04:
# mmchcarrier BB1RGL --replace --pdisk e1d5s04
[I] The following pdisks will be formatted on node server1:
/dev/sdfd
[I] Pdisk e1d5s04 of RG BB1RGL successfully replaced.
[I] Resuming pdisk e1d5s04#029 of RG BB1RGL.
[I] Carrier resumed.

The disk replacements can be confirmed with mmlsrecoverygroup -L --pdisk:


# mmlsrecoverygroup BB1RGL -L --pdisk

declustered
recovery group arrays vdisks pdisks
----------------- ----------- ------ ------
BB1RGL 3 5 121

declustered needs replace scrub background activity


array service vdisks pdisks spares threshold free space duration task progress priority
----------- ------ ------ ------ ------ --------- ---------- -------- -------------------------

LOG no 1 3 0 1 534 GiB 14 days scrub 1% low
DA1 no 2 60 2 2 3647 GiB 14 days rebuild-1r 4% low
DA2 no 2 58 2 2 1024 MiB 14 days scrub 27% low

n. active, declustered user state,


pdisk total paths array free space condition remarks
----------- ----------- ----------- ---------- ----------- -------
[...]
e1d4s06 2, 4 DA1 62 GiB normal ok
e1d5s01 2, 4 DA1 1843 GiB normal ok
e1d5s01#026 0, 0 DA1 70 GiB draining slow/noPath/systemDrain
/adminDrain/noRGD/noVCD
e1d5s02 2, 4 DA1 64 GiB normal ok
e1d5s03 2, 4 DA1 63 GiB normal ok
e1d5s04 2, 4 DA1 1853 GiB normal ok
e1d5s04#029 0, 0 DA1 64 GiB draining failing/noPath/systemDrain
/adminDrain/noRGD/noVCD
e1d5s05 2, 4 DA1 62 GiB normal ok
[...]

Notice that the temporary pdisks (e1d5s01#026 and e1d5s04#029) representing the now-removed physical
disks are counted toward the total number of pdisks in the recovery group BB1RGL and the declustered
array DA1. They exist until IBM Spectrum Scale RAID rebuild completes the reconstruction of the data
that they carried onto other disks (including their replacements). When rebuild completes, the temporary
pdisks disappear, and the number of disks in DA1 will once again be 58, and the number of disks in BB1RGL
will once again be 119.
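
The progress of the rebuild can be followed in the background activity columns of the declustered array
summary while a rebuild or rebalance is still active; for example, a periodic check such as the following
shows the current task and its percentage complete:

# mmlsrecoverygroup BB1RGL -L | grep -E 'rebuild|rebalance'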

Replacing failed ESS storage enclosure components: a sample scenario
The scenario presented here shows how to detect and replace failed storage enclosure components in an
ESS building block.

Detecting failed storage enclosure components

The mmlsenclosure command can be used to show you which enclosures need service along with the
specific component. A best practice is to run this command every day to check for failures.
# mmlsenclosure all -L --not-ok

needs
serial number service nodes
------------- ------- ------
SV21313971 yes c45f02n01-ib0.gpfs.net

component type serial number component id failed value unit properties


-------------- ------------- ------------ ------ ----- ---- ----------
fan SV21313971 1_BOT_LEFT yes RPM FAILED

This indicates that enclosure SV21313971 has a failed fan.
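
One way to make the daily check automatic is a cron entry on one of the I/O server nodes. The following
is a minimal sketch; the 06:00 schedule, the mail recipient, and the use of a /etc/cron.d file are
illustrative assumptions, not part of the product:

# /etc/cron.d/ess-daily-enclosure-check (illustrative)
# Run the enclosure health check every day at 06:00 and mail any output to the storage team.
0 6 * * * root /usr/lpp/mmfs/bin/mmlsenclosure all -L --not-ok 2>&1 | mail -s "ESS daily enclosure check" storage-admin@example.com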

When you are ready to replace the failed component, use the mmchenclosure command to identify
whether it is safe to complete the repair action or whether IBM Spectrum Scale needs to be shut down
first:
# mmchenclosure SV21313971 --component fan --component-id 1_BOT_LEFT

mmchenclosure: Proceed with the replace operation.

The fan can now be replaced.

Special note about detecting failed enclosure components

In the following example, only the enclosure itself is being called out as having failed; the specific
component that has actually failed is not identified. This typically means that there are drive “Service
Action Required (Fault)” LEDs that have been turned on in the drawers. In such a situation, the
mmlspdisk all --not-ok command can be used to check for dead or failing disks.
mmlsenclosure all -L --not-ok

needs
serial number service nodes
------------- ------- ------
SV13306129 yes c45f01n01-ib0.gpfs.net

component type serial number component id failed value unit properties


-------------- ------------- ------------ ------ ----- ---- ----------
enclosure SV13306129 ONLY yes NOT_IDENTIFYING,FAILED
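
As a follow-up, a quick way to reduce the stanza output of mmlspdisk to the fields that matter for this
check is, for example:

# mmlspdisk all --not-ok | grep -E 'name|state|location'

Any pdisk reported with a dead or failing state in this output is a candidate for the disk replacement
procedure.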

Replacing a failed ESS storage drawer: a sample scenario


Prerequisite information:
v IBM Spectrum Scale 4.1.1 PTF8 or 4.2.1 PTF1 is a prerequisite for this procedure to work. If you are not
at one of these levels or higher, contact IBM.
v This procedure is intended to be done as a partnership between the storage administrator and a
hardware service representative. The storage administrator is expected to understand the IBM
Spectrum Scale RAID concepts and the locations of the storage enclosures. The storage administrator is
responsible for all the steps except those in which the hardware is actually being worked on.
v The pdisks in a drawer span two recovery groups; therefore, it is very important that you examine the
pdisks and the fault tolerance of the vdisks in both recovery groups when going through these steps.
v An underlying principle is that drawer replacement should never deliberately put any vdisk into
critical state. When vdisks are in critical state, there is no redundancy and the next single sector or
IO error can cause unavailability or data loss. If drawer replacement is not possible without making
the system critical, then the ESS has to be shut down before the drawer is removed. An example of
drawer replacement will follow these instructions.

Replacing a failed ESS storage drawer requires the following steps:


1. If IBM Spectrum Scale is shut down: perform drawer replacement as soon as possible. Perform steps
4b through 4d and then restart IBM Spectrum Scale.
2. Examine the states of the pdisks in the affected drawer. If all the pdisk states are missing, dead, or
replace, then go to step 4b to perform drawer replacement as soon as possible without going through
any of the other steps in this procedure.
Assuming that you know the enclosure number and drawer number and are using standard pdisk
naming conventions, you could use the following commands to display the pdisks and their states:
mmlsrecoverygroup LeftRecoveryGroupName -L --pdisk | grep e{EnclosureNumber}d{DrawerNumber}
mmlsrecoverygroup RightRecoveryGroupName -L --pdisk | grep e{EnclosureNumber}d{DrawerNumber}
3. Determine whether online replacement is possible.
a. Consult the following table to see if drawer replacement is theoretically possible for this
configuration. The only required input at this step is the ESS model.
The table shows each possible ESS system as well as the configuration parameters for the systems.
If the table indicates that online replacement is impossible, IBM Spectrum Scale will need to be
shut down (on at least the two I/O servers involved) and you should go back to step 1. The fault
tolerance notation uses E for enclosure, D for drawer, and P for pdisk.
Additional background information on interpreting the fault tolerance values:

v For many of the systems, 1E is reported as the fault tolerance; however, this does not mean that
failure of x arbitrary drawers or y arbitrary pdisks can be tolerated. It means that the failure of
all the entities in one entire enclosure can be tolerated.
v A fault tolerance of 1E+1D or 2D implies that the failure of two arbitrary drawers can be
tolerated.
Table 4. ESS fault tolerance for drawer/enclosure

Hardware type (model name...)  DA configuration              Fault tolerance
IBM ESS  Enclosure    # Encl.  # Data DA  Disks    # Spares  RG desc  Mirrored      Parity        Is online replacement
         type                  per RG     per DA                      vdisk         vdisk         possible?
-------  -----------  -------  ---------  -------  --------  -------  ------------  ------------  -------------------------------------
GS1      2U-24        1        1          12       1         4P       3Way  2P      8+2p  2P      No drawers, enclosure impossible
                                                                      4Way  3P      8+3p  3P      No drawers, enclosure impossible
GS2      2U-24        2        1          24       2         4P       3Way  2P      8+2p  2P      No drawers, enclosure impossible
                                                                      4Way  3P      8+3p  3P      No drawers, enclosure impossible
GS4      2U-24        4        1          48       2         1E+1P    3Way  1E+1P   8+2p  2P      No drawers, enclosure impossible
                                                                      4Way  1E+1P   8+3p  1E      No drawers, enclosure impossible
GS6      2U-24        6        1          72       2         1E+1P    3Way  1E+1P   8+2p  1E      No drawers, enclosure impossible
                                                                      4Way  1E+1P   8+3p  1E+1P   No drawers, enclosure possible
GL2      4U-60 (5d)   2        1          58       2         4D       3Way  2D      8+2p  2D      Drawer possible, enclosure impossible
                                                                      4Way  3D      8+3p  1D+1P   Drawer possible, enclosure impossible
GL4      4U-60 (5d)   4        2          58       2         1E+1D    3Way  1E+1D   8+2p  2D      Drawer possible, enclosure impossible
                                                                      4Way  1E+1D   8+3p  1E      Drawer possible, enclosure impossible
GL4      4U-60 (5d)   4        1          116      4         1E+1D    3Way  1E+1D   8+2p  2D      Drawer possible, enclosure impossible
                                                                      4Way  1E+1D   8+3p  1E      Drawer possible, enclosure impossible
GL6      4U-60 (5d)   6        3          58       2         1E+1D    3Way  1E+1D   8+2p  1E      Drawer possible, enclosure impossible
                                                                      4Way  1E+1D   8+3p  1E+1D   Drawer possible, enclosure possible
GL6      4U-60 (5d)   6        1          174      6         1E+1D    3Way  1E+1D   8+2p  1E      Drawer possible, enclosure impossible
                                                                      4Way  1E+1D   8+3p  1E+1D   Drawer possible, enclosure possible

b. Determine the actual disk group fault tolerance of the vdisks in both recovery groups using the
mmlsrecoverygroup RecoveryGroupName -L command. The rg descriptor and all the vdisks must be
able to tolerate the loss of the item being replaced plus one other item. This is necessary because
the disk group fault tolerance code uses a definition of "tolerance" that includes the system
running in critical mode. But since putting the system into critical is not advised, one other
item is required. For example, all the following would be a valid fault tolerance to continue with
drawer replacement: 1E+1D, 1D+1P, and 2D.
c. Compare the actual disk group fault tolerance with the disk group fault tolerance listed in Table 4
on page 54. If the system is using a mix of 2-fault-tolerant and 3-fault-tolerant vdisks, the
comparisons must be done with the weaker (2-fault-tolerant) values. If the fault tolerance can
tolerate at least the item being replaced plus one other item, then replacement can proceed. Go to
step 4.
4. Drawer Replacement procedure.
a. Quiesce the pdisks.
Choose one of the following methods to suspend all the pdisks in the drawer.
v Using the chdrawer sample script:
/usr/lpp/mmfs/samples/vdisk/chdrawer EnclosureSerialNumber DrawerNumber --release
v Manually using the mmchpdisk command:
for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk LeftRecoveryGroupName --pdisk \
  e{EnclosureNumber}d{DrawerNumber}s{$slotNumber} --suspend ; done

for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk RightRecoveryGroupName --pdisk \
  e{EnclosureNumber}d{DrawerNumber}s{$slotNumber} --suspend ; done
Verify that the pdisks were suspended using the mmlsrecoverygroup command as shown in step
2.
b. Remove the drives; make sure to record the location of the drives and label them. You will need to
replace them in the corresponding slots of the new drawer later.
c. Replace the drawer following standard hardware procedures.
d. Replace the drives in the corresponding slots of the new drawer.
e. Resume the pdisks.
Choose one of the following methods to resume all the pdisks in the drawer.
v Using the chdrawer sample script:
/usr/lpp/mmfs/samples/vdisk/chdrawer EnclosureSerialNumber DrawerNumber --replace
v Manually using the mmchpdisk command:

for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk LeftRecoveryGroupName --pdisk \
  e{EnclosureNumber}d{DrawerNumber}s{$slotNumber} --resume ; done

for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk RightRecoveryGroupName --pdisk \
  e{EnclosureNumber}d{DrawerNumber}s{$slotNumber} --resume ; done
You can verify that the pdisks are no longer suspended using the mmlsrecoverygroup command
as shown in step 2.
5. Verify that the drawer has been successfully replaced.
Examine the states of the pdisks in the affected drawers. All the pdisk states should be ok and the
second column of the output should all be "2" indicating that 2 paths are being seen. Assuming that
you know the enclosure number and drawer number and are using standard pdisk naming
conventions, you could use the following commands to display the pdisks and their states:
mmlsrecoverygroup LeftRecoveryGroupName -L --pdisk | grep e{EnclosureNumber}d{DrawerNumber}
mmlsrecoverygroup RightRecoveryGroupName -L --pdisk | grep e{EnclosureNumber}d{DrawerNumber}

Example

The system is a GL4 with vdisks that have 4way mirroring and 8+3p RAID codes. Assume that the
drawer that contains pdisk e2d3s01 needs to be replaced because one of the drawer control modules has
failed (so that you only see one path to the drives instead of 2). This means that you are trying to replace
drawer 3 in enclosure 2. Assume that the drawer spans recovery groups rgL and rgR.

Determine the enclosure serial number:


> mmlspdisk rgL --pdisk e2d3s01 | grep -w location
location = "SV21106537-3-1"

Examine the states of the pdisks and find that they are all ok.
> mmlsrecoverygroup rgL -L --pdisk | grep e2d3
e2d3s01 1, 2 DA1 1862 GiB normal ok
e2d3s02 1, 2 DA1 1862 GiB normal ok
e2d3s03 1, 2 DA1 1862 GiB normal ok
e2d3s04 1, 2 DA1 1862 GiB normal ok
e2d3s05 1, 2 DA1 1862 GiB normal ok
e2d3s06 1, 2 DA1 1862 GiB normal ok

> mmlsrecoverygroup rgR -L --pdisk | grep e2d3


e2d3s07 1, 2 DA1 1862 GiB normal ok
e2d3s08 1, 2 DA1 1862 GiB normal ok
e2d3s09 1, 2 DA1 1862 GiB normal ok
e2d3s10 1, 2 DA1 1862 GiB normal ok
e2d3s11 1, 2 DA1 1862 GiB normal ok
e2d3s12 1, 2 DA1 1862 GiB normal ok

Determine whether online replacement is theoretically possible by consulting Table 4 on page 54.

The system is ESS GL4, so according to the last column drawer replacement is theoretically possible.

Determine the actual disk group fault tolerance of the vdisks in both recovery groups.
> mmlsrecoverygroup rgL -L
declustered
recovery group arrays vdisks pdisks format version
----------------- ------------- ------ ------ --------------
rgL 4 5 119 4.2.0.1

declustered needs replace scrub background activity


array service vdisks pdisks spares threshold free space duration task progress priority
----------- ------- ------ ------ ------ --------- ---------- -------- ----------------------------
SSD no 1 1 0,0 1 186 GiB 14 days scrub 8% low
NVR no 1 2 0,0 1 3632 MiB 14 days scrub 8% low

DA1 no 3 58 2,31 2 16 GiB 14 days scrub 5% low
DA2 no 0 58 2,31 2 152 TiB 14 days inactive 0% low

declustered checksum
vdisk RAID code array vdisk size block size granularity state remarks
----------------- ----------------- ----------- ---------- ---------- ----------- ----- -------
logtip_rgL 2WayReplication NVR 48 MiB 2 MiB 4096 ok logTip
logtipbackup_rgL Unreplicated SSD 48 MiB 2 MiB 4096 ok logTipBackup
loghome_rgL 4WayReplication DA1 20 GiB 2 MiB 4096 ok log
md_DA1_rgL 4WayReplication DA1 101 GiB 512 KiB 32 KiB ok
da_DA1_rgL 8+3p DA1 110 TiB 8 MiB 32 KiB ok

config data declustered array VCD spares actual rebuild spare space remarks
----------------- ----------------- ----------- -------------------------------- ------------------------
rebuild space DA1 31 35 pdisk
rebuild space DA2 31 36 pdisk

config data max disk group fault tolerance actual disk group fault tolerance remarks
----------------- ------------------------------ --------------------------------- ------------------------
rg descriptor 1 enclosure + 1 drawer 1 enclosure + 1 drawer limiting fault tolerance
system index 2 enclosure 1 enclosure + 1 drawer limited by rg descriptor

vdisk max disk group fault tolerance actual disk group fault tolerance remarks
----------------- ------------------------------ --------------------------------- ------------------------
logtip_rgL 1 pdisk 1 pdisk
logtipbackup_rgL 0 pdisk 0 pdisk
loghome_rgL 3 enclosure 1 enclosure + 1 drawer limited by rg descriptor
md_DA1_rgL 3 enclosure 1 enclosure + 1 drawer limited by rg descriptor
da_DA1_rgL 1 enclosure 1 enclosure

active recovery group server servers


--------------------------------------------- -------
c55f05n01-te0.gpfs.net c55f05n01-te0.gpfs.net,c55f05n02-te0.gpfs.net

.
.
.
> mmlsrecoverygroup rgR -L
declustered
recovery group arrays vdisks pdisks format version
----------------- ------------- ------ ------ --------------
rgR 4 5 119 4.2.0.1

declustered needs replace scrub background activity


array service vdisks pdisks spares threshold free space duration task progress priority
----------- ------- ------ ------ ------ --------- ---------- -------- ----------------------------
SSD no 1 1 0,0 1 186 GiB 14 days scrub 8% low
NVR no 1 2 0,0 1 3632 MiB 14 days scrub 8% low
DA1 no 3 58 2,31 2 16 GiB 14 days scrub 5% low
DA2 no 0 58 2,31 2 152 TiB 14 days inactive 0% low

declustered checksum
vdisk RAID code array vdisk size block size granularity state remarks
----------------- ----------------- ----------- ---------- ---------- ----------- ----- -------
logtip_rgR 2WayReplication NVR 48 MiB 2 MiB 4096 ok logTip
logtipbackup_rgR Unreplicated SSD 48 MiB 2 MiB 4096 ok logTipBackup
loghome_rgR 4WayReplication DA1 20 GiB 2 MiB 4096 ok log
md_DA1_rgR 4WayReplication DA1 101 GiB 512 KiB 32 KiB ok
da_DA1_rgR 8+3p DA1 110 TiB 8 MiB 32 KiB ok

config data declustered array VCD spares actual rebuild spare space remarks
----------------- ----------------- ----------- -------------------------------- ------------------------
rebuild space DA1 31 35 pdisk
rebuild space DA2 31 36 pdisk

config data max disk group fault tolerance actual disk group fault tolerance remarks
----------------- ------------------------------ --------------------------------- ------------------------
rg descriptor 1 enclosure + 1 drawer 1 enclosure + 1 drawer limiting fault tolerance
system index 2 enclosure 1 enclosure + 1 drawer limited by rg descriptor

vdisk max disk group fault tolerance actual disk group fault tolerance remarks

----------------- ------------------------------ --------------------------------- ------------------------
logtip_rgR 1 pdisk 1 pdisk
logtipbackup_rgR 0 pdisk 0 pdisk
loghome_rgR 3 enclosure 1 enclosure + 1 drawer limited by rg descriptor
md_DA1_rgR 3 enclosure 1 enclosure + 1 drawer limited by rg descriptor
da_DA1_rgR 1 enclosure 1 enclosure

active recovery group server servers


--------------------------------------------- -------
c55f05n02-te0.gpfs.net c55f05n02-te0.gpfs.net,c55f05n01-te0.gpfs.net

The rg descriptor has an actual fault tolerance of 1 enclosure + 1 drawer (1E+1D). The data vdisks have a
RAID code of 8+3P and an actual fault tolerance of 1 enclosure (1E). The metadata vdisks have a RAID
code of 4WayReplication and an actual fault tolerance of 1 enclosure + 1 drawer (1E+1D).

Compare the actual disk group fault tolerance with the disk group fault tolerance listed in Table 4 on
page 54.

The actual values match the table values exactly. Therefore, drawer replacement can proceed.

Quiesce the pdisks.

Choose one of the following methods to suspend all the pdisks in the drawer.
v Using the chdrawer sample script:
/usr/lpp/mmfs/samples/vdisk/chdrawer SV21106537 3 --release
v Manually using the mmchpdisk command:
for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk rgL --pdisk e2d3s$slotNumber --suspend ; done
for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk rgR --pdisk e2d3s$slotNumber --suspend ; done

Verify the states of the pdisks and find that they are all suspended.

> mmlsrecoverygroup rgL -L --pdisk | grep e2d3


e2d3s01 0, 2 DA1 1862 GiB normal ok/suspended
e2d3s02 0, 2 DA1 1862 GiB normal ok/suspended
e2d3s03 0, 2 DA1 1862 GiB normal ok/suspended
e2d3s04 0, 2 DA1 1862 GiB normal ok/suspended
e2d3s05 0, 2 DA1 1862 GiB normal ok/suspended
e2d3s06 0, 2 DA1 1862 GiB normal ok/suspended
> mmlsrecoverygroup rgR -L --pdisk | grep e2d3
e2d3s07 0, 2 DA1 1862 GiB normal ok/suspended
e2d3s08 0, 2 DA1 1862 GiB normal ok/suspended
e2d3s09 0, 2 DA1 1862 GiB normal ok/suspended
e2d3s10 0, 2 DA1 1862 GiB normal ok/suspended
e2d3s11 0, 2 DA1 1862 GiB normal ok/suspended
e2d3s12 0, 2 DA1 1862 GiB normal ok/suspended

Remove the drives; make sure to record the location of the drives and label them. You will need to
replace them in the corresponding slots of the new drawer later.

Replace the drawer following standard hardware procedures.

Replace the drives in the corresponding slots of the new drawer.

Resume the pdisks.


v Using the chdrawer sample script:
/usr/lpp/mmfs/samples/vdisk/chdrawer EnclosureSerialNumber DrawerNumber --replace
v Manually using the mmchpdisk command:

for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk rgL --pdisk e2d3s$slotNumber --resume ; done
for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk rgR --pdisk e2d3s$slotNumber --resume ; done

Verify that all the pdisks have been resumed.


> mmlsrecoverygroup rgL -L --pdisk | grep e2d3
e2d3s01 2, 2 DA1 1862 GiB normal ok
e2d3s02 2, 2 DA1 1862 GiB normal ok
e2d3s03 2, 2 DA1 1862 GiB normal ok
e2d3s04 2, 2 DA1 1862 GiB normal ok
e2d3s05 2, 2 DA1 1862 GiB normal ok
e2d3s06 2, 2 DA1 1862 GiB normal ok

> mmlsrecoverygroup rgR -L --pdisk | grep e2d3


e2d3s07 2, 2 DA1 1862 GiB normal ok
e2d3s08 2, 2 DA1 1862 GiB normal ok
e2d3s09 2, 2 DA1 1862 GiB normal ok
e2d3s10 2, 2 DA1 1862 GiB normal ok
e2d3s11 2, 2 DA1 1862 GiB normal ok
e2d3s12 2, 2 DA1 1862 GiB normal ok

Replacing a failed ESS storage enclosure: a sample scenario


Enclosure replacement should be rare. Online replacement of an enclosure is only possible on a GL6 and
GS6.

Prerequisite information:
v IBM Spectrum Scale 4.1.1 PTF8 or 4.2.1 PTF1 is a prerequisite for this procedure to work. If you are not
at one of these levels or higher, contact IBM.
v This procedure is intended to be done as a partnership between the storage administrator and a
hardware service representative. The storage administrator is expected to understand the IBM
Spectrum Scale RAID concepts and the locations of the storage enclosures. The storage administrator is
responsible for all the steps except those in which the hardware is actually being worked on.
v The pdisks in an enclosure span two recovery groups; therefore, it is very important that you examine the
pdisks and the fault tolerance of the vdisks in both recovery groups when going through these steps.
v An underlying principle is that enclosure replacement should never deliberately put any vdisk into
critical state. When vdisks are in critical state, there is no redundancy and the next single sector or
IO error can cause unavailability or data loss. If enclosure replacement is not possible without making
the system critical, then the ESS has to be shut down before the enclosure is removed. An example of
enclosure replacement will follow these instructions.
1. If IBM Spectrum Scale is shut down: perform the enclosure replacement as soon as possible. Perform
steps 4b through 4h and then restart IBM Spectrum Scale.
2. Examine the states of the pdisks in the affected enclosure. If all the pdisk states are missing, dead, or
replace, then go to step 4b to perform enclosure replacement as soon as possible without going through
any of the other steps in this procedure.
Assuming that you know the enclosure number and are using standard pdisk naming conventions,
you could use the following commands to display the pdisks and their states:
mmlsrecoverygroup LeftRecoveryGroupName -L --pdisk | grep e{EnclosureNumber}
mmlsrecoverygroup RightRecoveryGroupName -L --pdisk | grep e{EnclosureNumber}
3. Determine whether online replacement is possible.
a. Consult the following table to see if enclosure replacement is theoretically possible for this
configuration. The only required input at this step is the ESS model. The table shows each
possible ESS system as well as the configuration parameters for the systems. If the table indicates
that online replacement is impossible, IBM Spectrum Scale will need to be shut down (on at least
the two I/O servers involved) and you should go back to step 1. The fault tolerance notation uses
E for enclosure, D for drawer, and P for pdisk.

Additional background information on interpreting the fault tolerance values:
v For many of the systems, 1E is reported as the fault tolerance; however, this does not mean that
failure of x arbitrary drawers or y arbitrary pdisks can be tolerated. It means that the failure of
all the entities in one entire enclosure can be tolerated.
v A fault tolerance of 1E+1D or 2D implies that the failure of two arbitrary drawers can be
tolerated.
Table 5. ESS fault tolerance for drawer/enclosure

Hardware type (model name...)  DA configuration              Fault tolerance
IBM ESS  Enclosure    # Encl.  # Data DA  Disks    # Spares  RG desc  Mirrored      Parity        Is online replacement
         type                  per RG     per DA                      vdisk         vdisk         possible?
-------  -----------  -------  ---------  -------  --------  -------  ------------  ------------  -------------------------------------
GS1      2U-24        1        1          12       1         4P       3Way  2P      8+2p  2P      No drawers, enclosure impossible
                                                                      4Way  3P      8+3p  3P      No drawers, enclosure impossible
GS2      2U-24        2        1          24       2         4P       3Way  2P      8+2p  2P      No drawers, enclosure impossible
                                                                      4Way  3P      8+3p  3P      No drawers, enclosure impossible
GS4      2U-24        4        1          48       2         1E+1P    3Way  1E+1P   8+2p  2P      No drawers, enclosure impossible
                                                                      4Way  1E+1P   8+3p  1E      No drawers, enclosure impossible
GS6      2U-24        6        1          72       2         1E+1P    3Way  1E+1P   8+2p  1E      No drawers, enclosure impossible
                                                                      4Way  1E+1P   8+3p  1E+1P   No drawers, enclosure possible
GL2      4U-60 (5d)   2        1          58       2         4D       3Way  2D      8+2p  2D      Drawer possible, enclosure impossible
                                                                      4Way  3D      8+3p  1D+1P   Drawer possible, enclosure impossible
GL4      4U-60 (5d)   4        2          58       2         1E+1D    3Way  1E+1D   8+2p  2D      Drawer possible, enclosure impossible
                                                                      4Way  1E+1D   8+3p  1E      Drawer possible, enclosure impossible
GL4      4U-60 (5d)   4        1          116      4         1E+1D    3Way  1E+1D   8+2p  2D      Drawer possible, enclosure impossible
                                                                      4Way  1E+1D   8+3p  1E      Drawer possible, enclosure impossible
GL6      4U-60 (5d)   6        3          58       2         1E+1D    3Way  1E+1D   8+2p  1E      Drawer possible, enclosure impossible
                                                                      4Way  1E+1D   8+3p  1E+1D   Drawer possible, enclosure possible
GL6      4U-60 (5d)   6        1          174      6         1E+1D    3Way  1E+1D   8+2p  1E      Drawer possible, enclosure impossible
                                                                      4Way  1E+1D   8+3p  1E+1D   Drawer possible, enclosure possible

b. Determine the actual disk group fault tolerance of the vdisks in both recovery groups using the
mmlsrecoverygroup RecoveryGroupName -L command. The rg descriptor and all the vdisks must be
able to tolerate the loss of the item being replaced plus one other item. This is necessary because
the disk group fault tolerance code uses a definition of "tolerance" that includes the system
running in critical mode. But since putting the system into critical is not advised, one other
item is required. For example, all the following would be a valid fault tolerance to continue with
enclosure replacement: 1E+1D and 1E+1P.
c. Compare the actual disk group fault tolerance with the disk group fault tolerance listed in Table 5
on page 60. If the system is using a mix of 2-fault-tolerant and 3-fault-tolerant vdisks, the
comparisons must be done with the weaker (2-fault-tolerant) values. If the fault tolerance can
tolerate at least the item being replaced plus one other item, then replacement can proceed. Go to
step 4.
4. Enclosure Replacement procedure.
a. Quiesce the pdisks.
For GL systems, issue the following commands for each drawer.
for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk LeftRecoveryGroupName --pdisk \
  e{EnclosureNumber}d{DrawerNumber}s{$slotNumber} --suspend ; done

for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk RightRecoveryGroupName --pdisk \
  e{EnclosureNumber}d{DrawerNumber}s{$slotNumber} --suspend ; done
For GS systems, issue:
for slotNumber in 01 02 03 04 05 06 07 08 09 10 11 12 ; do mmchpdisk LeftRecoveryGroupName --pdisk \
  e{EnclosureNumber}s{$slotNumber} --suspend ; done

for slotNumber in 13 14 15 16 17 18 19 20 21 22 23 24 ; do mmchpdisk RightRecoveryGroupName --pdisk \
  e{EnclosureNumber}s{$slotNumber} --suspend ; done
Verify that the pdisks were suspended using the mmlsrecoverygroup command as shown in step 2.
b. Remove the drives; make sure to record the location of the drives and label them. You will need to
replace them in the corresponding slots of the new enclosure later.
c. Replace the enclosure following standard hardware procedures.
v Remove the SAS connections in the rear of the enclosure.
v Remove the enclosure.
v Install the new enclosure.
d. Replace the drives in the corresponding slots of the new enclosure.
e. Connect the SAS connections in the rear of the new enclosure.
f. Power up the enclosure.

g. Verify the SAS topology on the servers to ensure that all drives from the new storage enclosure are
present.
h. Update the necessary firmware on the new storage enclosure as needed.
i. Resume the pdisks.
For GL systems, issue:
for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk LeftRecoveryGroupName --pdisk \
  e{EnclosureNumber}d{DrawerNumber}s{$slotNumber} --resume ; done

for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk RightRecoveryGroupName --pdisk \
  e{EnclosureNumber}d{DrawerNumber}s{$slotNumber} --resume ; done
For GS systems, issue:
for slotNumber in 01 02 03 04 05 06 07 08 09 10 11 12 ; do mmchpdisk LeftRecoveryGroupName --pdisk \
  e{EnclosureNumber}s{$slotNumber} --resume ; done

for slotNumber in 13 14 15 16 17 18 19 20 21 22 23 24 ; do mmchpdisk RightRecoveryGroupName --pdisk \
  e{EnclosureNumber}s{$slotNumber} --resume ; done
Verify that the pdisks were resumed using the mmlsrecoverygroup command as shown in step 2.

Example

The system is a GL6 with vdisks that have 4way mirroring and 8+3p RAID codes. Assume that the
enclosure that contains pdisk e2d3s01 needs to be replaced. This means that you are trying to replace
enclosure 2.

Assume that the enclosure spans recovery groups rgL and rgR.

Determine the enclosure serial number:


> mmlspdisk rgL --pdisk e2d3s01 | grep -w location
location = "SV21106537-3-1"

Examine the states of the pdisks and find that they are all ok instead of missing. (Given that you have
a failed enclosure, it is unlikely that all the drives would be in an ok state, but this is just an example.)
> mmlsrecoverygroup rgL -L --pdisk | grep e2

e2d1s01 2, 4 DA1 96 GiB normal ok


e2d1s02 2, 4 DA1 96 GiB normal ok
e2d1s04 2, 4 DA1 96 GiB normal ok
e2d1s05 2, 4 DA2 2792 GiB normal ok/noData
e2d1s06 2, 4 DA2 2792 GiB normal ok/noData
e2d2s01 2, 4 DA1 96 GiB normal ok
e2d2s02 2, 4 DA1 98 GiB normal ok
e2d2s03 2, 4 DA1 96 GiB normal ok
e2d2s04 2, 4 DA2 2792 GiB normal ok/noData
e2d2s05 2, 4 DA2 2792 GiB normal ok/noData
e2d2s06 2, 4 DA2 2792 GiB normal ok/noData
e2d3s01 2, 4 DA1 96 GiB normal ok
e2d3s02 2, 4 DA1 94 GiB normal ok
e2d3s03 2, 4 DA1 96 GiB normal ok
e2d3s04 2, 4 DA2 2792 GiB normal ok/noData
e2d3s05 2, 4 DA2 2792 GiB normal ok/noData
e2d3s06 2, 4 DA2 2792 GiB normal ok/noData
e2d4s01 2, 4 DA1 96 GiB normal ok
e2d4s02 2, 4 DA1 96 GiB normal ok
e2d4s03 2, 4 DA1 96 GiB normal ok
e2d4s04 2, 4 DA2 2792 GiB normal ok/noData
e2d4s05 2, 4 DA2 2792 GiB normal ok/noData
e2d4s06 2, 4 DA2 2792 GiB normal ok/noData
e2d5s01 2, 4 DA1 96 GiB normal ok

e2d5s02 2, 4 DA1 96 GiB normal ok
e2d5s03 2, 4 DA1 96 GiB normal ok
e2d5s04 2, 4 DA2 2792 GiB normal ok/noData
e2d5s05 2, 4 DA2 2792 GiB normal ok/noData
e2d5s06 2, 4 DA2 2792 GiB normal ok/noData
> mmlsrecoverygroup rgR -L --pdisk | grep e2
e2d1s07 2, 4 DA1 96 GiB normal ok
e2d1s08 2, 4 DA1 94 GiB normal ok
e2d1s09 2, 4 DA1 96 GiB normal ok
e2d1s10 2, 4 DA2 2792 GiB normal ok/noData
e2d1s11 2, 4 DA2 2792 GiB normal ok/noData
e2d1s12 2, 4 DA2 2792 GiB normal ok/noData
e2d2s07 2, 4 DA1 96 GiB normal ok
e2d2s08 2, 4 DA1 96 GiB normal ok
e2d2s09 2, 4 DA1 94 GiB normal ok
e2d2s10 2, 4 DA2 2792 GiB normal ok/noData
e2d2s11 2, 4 DA2 2792 GiB normal ok/noData
e2d2s12 2, 4 DA2 2792 GiB normal ok/noData
e2d3s07 2, 4 DA1 94 GiB normal ok
e2d3s08 2, 4 DA1 96 GiB normal ok
e2d3s09 2, 4 DA1 96 GiB normal ok
e2d3s10 2, 4 DA2 2792 GiB normal ok/noData
e2d3s11 2, 4 DA2 2792 GiB normal ok/noData
e2d3s12 2, 4 DA2 2792 GiB normal ok/noData
e2d4s07 2, 4 DA1 96 GiB normal ok
e2d4s08 2, 4 DA1 94 GiB normal ok
e2d4s09 2, 4 DA1 96 GiB normal ok
e2d4s10 2, 4 DA2 2792 GiB normal ok/noData
e2d4s11 2, 4 DA2 2792 GiB normal ok/noData
e2d4s12 2, 4 DA2 2792 GiB normal ok/noData
e2d5s07 2, 4 DA2 2792 GiB normal ok/noData
e2d5s08 2, 4 DA1 108 GiB normal ok
e2d5s09 2, 4 DA1 108 GiB normal ok
e2d5s10 2, 4 DA2 2792 GiB normal ok/noData
e2d5s11 2, 4 DA2 2792 GiB normal ok/noData

Determine whether online replacement is theoretically possible by consulting Table 5 on page 60.

The system is ESS GL6, so according to the last column enclosure replacement is theoretically possible.

Determine the actual disk group fault tolerance of the vdisks in both recovery groups.
> mmlsrecoverygroup rgL -L
declustered
recovery group arrays vdisks pdisks format version
----------------- ------------- ------ ------ --------------
rgL 4 5 177 4.2.0.1

declustered needs replace scrub background activity


array service vdisks pdisks spares threshold free space duration task progress priority
----------- ------- ------ ------ ------ --------- ---------- -------- ----------------------------
SSD no 1 1 0,0 1 186 GiB 14 days scrub 8% low
NVR no 1 2 0,0 1 3632 MiB 14 days scrub 8% low
DA1 no 3 174 2,31 2 16 GiB 14 days scrub 5% low

declustered checksum
vdisk RAID code array vdisk size block size granularity state remarks
----------------- ----------------- ----------- ---------- ---------- ----------- ----- -------
logtip_rgL 2WayReplication NVR 48 MiB 2 MiB 4096 ok logTip
logtipbackup_rgL Unreplicated SSD 48 MiB 2 MiB 4096 ok logTipBackup
loghome_rgL 4WayReplication DA1 20 GiB 2 MiB 4096 ok log
md_DA1_rgL 4WayReplication DA1 101 GiB 512 KiB 32 KiB ok
da_DA1_rgL 8+3p DA1 110 TiB 8 MiB 32 KiB ok

config data declustered array VCD spares actual rebuild spare space remarks
----------------- ----------------- ----------- -------------------------------- ------------------------
rebuild space DA1 31 35 pdisk

config data max disk group fault tolerance actual disk group fault tolerance remarks
----------------- ------------------------------ --------------------------------- ------------------------
rg descriptor 1 enclosure + 1 drawer 1 enclosure + 1 drawer limiting fault tolerance
system index 2 enclosure 1 enclosure + 1 drawer limited by rg descriptor

vdisk max disk group fault tolerance actual disk group fault tolerance remarks
----------------- ------------------------------ --------------------------------- ------------------------
logtip_rgL 1 pdisk 1 pdisk
logtipbackup_rgL 0 pdisk 0 pdisk
loghome_rgL 3 enclosure 1 enclosure + 1 drawer limited by rg descriptor
md_DA1_rgL 3 enclosure 1 enclosure + 1 drawer limited by rg descriptor
da_DA1_rgL 1 enclosure + 1 drawer 1 enclosure + 1 drawer

active recovery group server servers


--------------------------------------------- -------
c55f05n01-te0.gpfs.net c55f05n01-te0.gpfs.net,c55f05n02-te0.gpfs.net

.
.
.
> mmlsrecoverygroup rgR -L
declustered
recovery group arrays vdisks pdisks format version
----------------- ------------- ------ ------ --------------
rgR 4 5 177 4.2.0.1

declustered needs replace scrub background activity


array service vdisks pdisks spares threshold free space duration task progress priority
----------- ------- ------ ------ ------ --------- ---------- -------- ----------------------------
SSD no 1 1 0,0 1 186 GiB 14 days scrub 8% low
NVR no 1 2 0,0 1 3632 MiB 14 days scrub 8% low
DA1 no 3 174 2,31 2 16 GiB 14 days scrub 5% low

declustered checksum
vdisk RAID code array vdisk size block size granularity state remarks
----------------- ----------------- ----------- ---------- ---------- ----------- ----- -------
logtip_rgR 2WayReplication NVR 48 MiB 2 MiB 4096 ok logTip
logtipbackup_rgR Unreplicated SSD 48 MiB 2 MiB 4096 ok logTipBackup
loghome_rgR 4WayReplication DA1 20 GiB 2 MiB 4096 ok log
md_DA1_rgR 4WayReplication DA1 101 GiB 512 KiB 32 KiB ok
da_DA1_rgR 8+3p DA1 110 TiB 8 MiB 32 KiB ok

config data declustered array VCD spares actual rebuild spare space remarks
----------------- ----------------- ----------- -------------------------------- ------------------------
rebuild space DA1 31 35 pdisk

config data max disk group fault tolerance actual disk group fault tolerance remarks
----------------- ------------------------------ --------------------------------- ------------------------
rg descriptor 1 enclosure + 1 drawer 1 enclosure + 1 drawer limiting fault tolerance
system index 2 enclosure 1 enclosure + 1 drawer limited by rg descriptor

vdisk max disk group fault tolerance actual disk group fault tolerance remarks
----------------- ------------------------------ --------------------------------- ------------------------
logtip_rgR 1 pdisk 1 pdisk
logtipbackup_rgR 0 pdisk 0 pdisk
loghome_rgR 3 enclosure 1 enclosure + 1 drawer limited by rg descriptor
md_DA1_rgR 3 enclosure 1 enclosure + 1 drawer limited by rg descriptor
da_DA1_rgR 1 enclosure + 1 drawer 1 enclosure + 1 drawer

active recovery group server servers


--------------------------------------------- -------
c55f05n02-te0.gpfs.net c55f05n02-te0.gpfs.net,c55f05n01-te0.gpfs.net

The rg descriptor has an actual fault tolerance of 1 enclosure + 1 drawer (1E+1D). The data vdisks have a
RAID code of 8+3P and an actual fault tolerance of 1 enclosure (1E). The metadata vdisks have a RAID
code of 4WayReplication and an actual fault tolerance of 1 enclosure + 1 drawer (1E+1D).

Compare the actual disk group fault tolerance with the disk group fault tolerance listed in Table 5 on
page 60.

The actual values match the table values exactly. Therefore, enclosure replacement can proceed.

Quiesce the pdisks.


for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk rgL --pdisk e2d1s$slotNumber --suspend ; done
for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk rgR --pdisk e2d1s$slotNumber --suspend ; done

for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk rgL --pdisk e2d2s$slotNumber --suspend ; done
for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk rgR --pdisk e2d2s$slotNumber --suspend ; done

for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk rgL --pdisk e2d3s$slotNumber --suspend ; done
for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk rgR --pdisk e2d3s$slotNumber --suspend ; done

for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk rgL --pdisk e2d4s$slotNumber --suspend ; done
for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk rgR --pdisk e2d4s$slotNumber --suspend ; done

for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk rgL --pdisk e2d5s$slotNumber --suspend ; done
for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk rgR --pdisk e2d5s$slotNumber --suspend ; done
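
Equivalently, the ten loops above can be collapsed into one nested loop over the five drawers of
enclosure 2; this is a sketch using the same recovery group names rgL and rgR as in this example:

for drawer in 1 2 3 4 5 ; do
    for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk rgL --pdisk e2d${drawer}s$slotNumber --suspend ; done
    for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk rgR --pdisk e2d${drawer}s$slotNumber --suspend ; done
done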

Verify the pdisks were suspended using the mmlsrecoverygroup command. You should see suspended as
part of the pdisk state.
> mmlsrecoverygroup rgL -L --pdisk | grep e2d
e2d1s01 0, 4 DA1 96 GiB normal ok/suspended
e2d1s02 0, 4 DA1 96 GiB normal ok/suspended
e2d1s04 0, 4 DA1 96 GiB normal ok/suspended
e2d1s05 0, 4 DA2 2792 GiB normal ok/suspended
e2d1s06 0, 4 DA2 2792 GiB normal ok/suspended
e2d2s01 0, 4 DA1 96 GiB normal ok/suspended
.
.
.

> mmlsrecoverygroup rgR -L --pdisk | grep e2d


e2d1s07 0, 4 DA1 96 GiB normal ok/suspended
e2d1s08 0, 4 DA1 94 GiB normal ok/suspended
e2d1s09 0, 4 DA1 96 GiB normal ok/suspended
e2d1s10 0, 4 DA2 2792 GiB normal ok/suspended
e2d1s11 0, 4 DA2 2792 GiB normal ok/suspended
e2d1s12 0, 4 DA2 2792 GiB normal ok/suspended
e2d2s07 0, 4 DA1 96 GiB normal ok/suspended
e2d2s08 0, 4 DA1 96 GiB normal ok/suspended
.
.
.

Remove the drives; make sure to record the location of the drives and label them. You will need to
replace them in the corresponding drawer slots of the new enclosure later.

Replace the enclosure following standard hardware procedures.


v Remove the SAS connections in the rear of the enclosure.
v Remove the enclosure.
v Install the new enclosure.

Replace the drives in the corresponding drawer slots of the new enclosure.

Connect the SAS connections in the rear of the new enclosure.

Power up the enclosure.

Verify the SAS topology on the servers to ensure that all drives from the new storage enclosure are
present.
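
One way to check this from an I/O server is to capture the disk topology with mmgetpdisktopology and
summarize it with the topsummary sample script; this is a sketch only, assuming the script is installed
in the default samples directory:

# mmgetpdisktopology > /tmp/server1.top
# /usr/lpp/mmfs/samples/vdisk/topsummary /tmp/server1.top

The summary should show the new enclosure with the expected number of drives before you continue.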

Update the necessary firmware on the new storage enclosure as needed.

Resume the pdisks.

for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk rgL --pdisk e2d1s$slotNumber --resume ; done
for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk rgR --pdisk e2d1s$slotNumber --resume ; done
for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk rgL --pdisk e2d2s$slotNumber --resume ; done
for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk rgR --pdisk e2d2s$slotNumber --resume ; done
for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk rgL --pdisk e2d3s$slotNumber --resume ; done
for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk rgR --pdisk e2d3s$slotNumber --resume ; done
for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk rgL --pdisk e2d4s$slotNumber --resume ; done
for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk rgR --pdisk e2d4s$slotNumber --resume ; done
for slotNumber in 01 02 03 04 05 06 ; do mmchpdisk rgL --pdisk e2d5s$slotNumber --resume ; done
for slotNumber in 07 08 09 10 11 12 ; do mmchpdisk rgR --pdisk e2d5s$slotNumber --resume ; done

Verify that the pdisks were resumed by using the mmlsrecoverygroup command.
> mmlsrecoverygroup rgL -L --pdisk | grep e2

e2d1s01 2, 4 DA1 96 GiB normal ok


e2d1s02 2, 4 DA1 96 GiB normal ok
e2d1s04 2, 4 DA1 96 GiB normal ok
e2d1s05 2, 4 DA2 2792 GiB normal ok/noData
e2d1s06 2, 4 DA2 2792 GiB normal ok/noData
e2d2s01 2, 4 DA1 96 GiB normal ok
.
.
.

> mmlsrecoverygroup rgR -L --pdisk | grep e2

e2d1s07 2, 4 DA1 96 GiB normal ok


e2d1s08 2, 4 DA1 94 GiB normal ok
e2d1s09 2, 4 DA1 96 GiB normal ok
e2d1s10 2, 4 DA2 2792 GiB normal ok/noData
e2d1s11 2, 4 DA2 2792 GiB normal ok/noData
e2d1s12 2, 4 DA2 2792 GiB normal ok/noData
e2d2s07 2, 4 DA1 96 GiB normal ok
e2d2s08 2, 4 DA1 96 GiB normal ok
.
.
.

Replacing failed disks in a Power 775 Disk Enclosure recovery group: a sample scenario
The scenario presented here shows how to detect and replace failed disks in a recovery group built on a
Power 775 Disk Enclosure.

Detecting failed disks in your enclosure

Assume a fully-populated Power 775 Disk Enclosure (serial number 000DE37) on which the following
two recovery groups are defined:
v 000DE37TOP containing the disks in the top set of carriers
v 000DE37BOT containing the disks in the bottom set of carriers

Each recovery group contains the following:


v one log declustered array (LOG)
v four data declustered arrays (DA1, DA2, DA3, DA4)
The data declustered arrays are defined according to Power 775 Disk Enclosure best practice as follows:
v 47 pdisks per data declustered array
v each member pdisk from the same carrier slot

v default disk replacement threshold value set to 2

The replacement threshold of 2 means that GNR will only require disk replacement when two or more
disks have failed in the declustered array; otherwise, rebuilding onto spare space or reconstruction from
redundancy will be used to supply affected data.
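
The threshold is a property of each declustered array and can be changed with the mmchrecoverygroup
command if a different policy is wanted. A minimal sketch, assuming the mmchrecoverygroup syntax
documented for IBM Spectrum Scale RAID and using an illustrative threshold of 1 (replace as soon as a
single disk fails):

# mmchrecoverygroup 000DE37TOP --declustered-array DA1 --replace-threshold 1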

This configuration can be seen in the output of mmlsrecoverygroup for the recovery groups, shown here
for 000DE37TOP:
# mmlsrecoverygroup 000DE37TOP -L

declustered
recovery group arrays vdisks pdisks
----------------- ----------- ------ ------
000DE37TOP 5 9 192

declustered needs replace scrub background activity


array service vdisks pdisks spares threshold free space duration task progress priority
----------- ------- ------ ------ ------ --------- ---------- -------- -------------------------
DA1 no 2 47 2 2 3072 MiB 14 days scrub 63% low
DA2 no 2 47 2 2 3072 MiB 14 days scrub 19% low
DA3 yes 2 47 2 2 0 B 14 days rebuild-2r 48% low
DA4 no 2 47 2 2 3072 MiB 14 days scrub 33% low
LOG no 1 4 1 1 546 GiB 14 days scrub 87% low

declustered
vdisk RAID code array vdisk size remarks
------------------ ------------------ ----------- ---------- -------
000DE37TOPLOG 3WayReplication LOG 4144 MiB log
000DE37TOPDA1META 4WayReplication DA1 250 GiB
000DE37TOPDA1DATA 8+3p DA1 17 TiB
000DE37TOPDA2META 4WayReplication DA2 250 GiB
000DE37TOPDA2DATA 8+3p DA2 17 TiB
000DE37TOPDA3META 4WayReplication DA3 250 GiB
000DE37TOPDA3DATA 8+3p DA3 17 TiB
000DE37TOPDA4META 4WayReplication DA4 250 GiB
000DE37TOPDA4DATA 8+3p DA4 17 TiB

active recovery group server servers


----------------------------------------------- -------
server1 server1,server2

The indication that disk replacement is called for in this recovery group is the value of yes in the needs
service column for declustered array DA3.

The fact that DA3 (the declustered array on the disks in carrier slot 3) is undergoing rebuild of its RAID
tracks that can tolerate two strip failures is by itself not an indication that disk replacement is required; it
merely indicates that data from a failed disk is being rebuilt onto spare space. Only if the replacement
threshold has been met will disks be marked for replacement and the declustered array marked as
needing service.

GNR provides several indications that disk replacement is required:


v entries in the AIX error report or the Linux syslog
v the pdReplacePdisk callback, which can be configured to run an administrator-supplied script at the
moment a pdisk is marked for replacement (see the sketch after this list)
v the POWER7® cluster event notification TEAL agent, which can be configured to send disk replacement
notices when they occur to the POWER7 cluster EMS
v the output from the following commands, which may be performed from the command line on any
GPFS cluster node (see the examples that follow):
1. mmlsrecoverygroup with the -L flag shows yes in the needs service column

2. mmlsrecoverygroup with the -L and --pdisk flags; this shows the states of all pdisks, which may be
examined for the replace pdisk state
3. mmlspdisk with the --replace flag, which lists only those pdisks that are marked for replacement
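
As an illustration of the pdReplacePdisk callback mentioned above, a notification script can be
registered with the mmaddcallback command. This is a minimal sketch: the callback identifier and the
script path are placeholders, and any parameters to be passed to the script should be taken from the
mmaddcallback documentation for this event:

# mmaddcallback pdiskReplaceNotify --command /usr/local/bin/notify-pdisk-replace.sh \
      --event pdReplacePdisk --async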

Note: Because the output of mmlsrecoverygroup -L --pdisk for a fully-populated disk enclosure is very
long, this example shows only some of the pdisks (but includes those marked for replacement).
# mmlsrecoverygroup 000DE37TOP -L --pdisk

declustered
recovery group arrays vdisks pdisks
----------------- ----------- ------ ------
000DE37TOP 5 9 192

declustered needs replace scrub background activity


array service vdisks pdisks spares threshold free space duration task progress priority
----------- ------- ------ ------ ------ --------- ---------- -------- -------------------------
DA1 no 2 47 2 2 3072 MiB 14 days scrub 63% low
DA2 no 2 47 2 2 3072 MiB 14 days scrub 19% low
DA3 yes 2 47 2 2 0 B 14 days rebuild-2r 68% low
DA4 no 2 47 2 2 3072 MiB 14 days scrub 34% low
LOG no 1 4 1 1 546 GiB 14 days scrub 87% low

n. active, declustered user state,


pdisk total paths array free space condition remarks
----------------- ----------- ----------- ---------- ----------- -------
[...]
c014d1 2, 4 DA1 62 GiB normal ok
c014d2 2, 4 DA2 279 GiB normal ok
c014d3 0, 0 DA3 279 GiB replaceable dead/systemDrain/noRGD/noVCD/replace
c014d4 2, 4 DA4 12 GiB normal ok
[...]
c018d1 2, 4 DA1 24 GiB normal ok
c018d2 2, 4 DA2 24 GiB normal ok
c018d3 2, 4 DA3 558 GiB replaceable dead/systemDrain/noRGD/noVCD/noData/replace
c018d4 2, 4 DA4 12 GiB normal ok
[...]

The preceding output shows that the following pdisks are marked for replacement:
v c014d3 in DA3
v c018d3 in DA3

The naming convention used during recovery group creation indicates that these are the disks in slot 3 of
carriers 14 and 18. To confirm the physical locations of the failed disks, use the mmlspdisk command to
list information about those pdisks in declustered array DA3 of recovery group 000DE37TOP that are
marked for replacement:
# mmlspdisk 000DE37TOP --declustered-array DA3 --replace
pdisk:
replacementPriority = 1.00
name = "c014d3"
device = "/dev/rhdisk158,/dev/rhdisk62"
recoveryGroup = "000DE37TOP"
declusteredArray = "DA3"
state = "dead/systemDrain/noRGD/noVCD/replace"
.
.
.
pdisk:
replacementPriority = 1.00
name = "c018d3"
device = "/dev/rhdisk630,/dev/rhdisk726"
recoveryGroup = "000DE37TOP"
declusteredArray = "DA3"
state = "dead/systemDrain/noRGD/noVCD/noData/replace"
.
.
.

The preceding location code attributes confirm the pdisk naming convention:

Disk            Location code             Interpretation
pdisk c014d3    78AD.001.000DE37-C14-D3   Disk 3 in carrier 14 in the disk enclosure identified by
                                          enclosure type 78AD.001 and serial number 000DE37
pdisk c018d3    78AD.001.000DE37-C18-D3   Disk 3 in carrier 18 in the disk enclosure identified by
                                          enclosure type 78AD.001 and serial number 000DE37

Replacing the failed disks in a Power 775 Disk Enclosure recovery group

Note: In this example, it is assumed that two new disks with the appropriate Field Replaceable Unit
(FRU) code, as indicated by the fru attribute (74Y4936 in this case), have been obtained as replacements
for the failed pdisks c014d3 and c018d3.

Replacing each disk is a three-step process:


1. Using the mmchcarrier command with the --release flag to suspend use of the other disks in the
carrier and to release the carrier.
2. Removing the carrier and replacing the failed disk within with a new one.
3. Using the mmchcarrier command with the --replace flag to resume use of the suspended disks and to
begin use of the new disk.
GNR assigns a priority to pdisk replacement. Disks with smaller values for the replacementPriority
attribute should be replaced first. In this example, the only failed disks are in DA3 and both have the same
replacementPriority.
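
To compare the priorities of every pdisk currently marked for replacement in the recovery group, the
stanza output of mmlspdisk can be filtered, for example:

# mmlspdisk 000DE37TOP --replace | grep -E 'name|replacementPriority'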

Disk c014d3 is chosen to be replaced first.


1. To release carrier 14 in disk enclosure 000DE37:
# mmchcarrier 000DE37TOP --release --pdisk c014d3
[I] Suspending pdisk c014d1 of RG 000DE37TOP in location 78AD.001.000DE37-C14-D1.
[I] Suspending pdisk c014d2 of RG 000DE37TOP in location 78AD.001.000DE37-C14-D2.
[I] Suspending pdisk c014d3 of RG 000DE37TOP in location 78AD.001.000DE37-C14-D3.
[I] Suspending pdisk c014d4 of RG 000DE37TOP in location 78AD.001.000DE37-C14-D4.
[I] Carrier released.

- Remove carrier.
- Replace disk in location 78AD.001.000DE37-C14-D3 with FRU 74Y4936.
- Reinsert carrier.
- Issue the following command:

mmchcarrier 000DE37TOP --replace --pdisk 'c014d3'

Repair timer is running. Perform the above within 5 minutes


to avoid pdisks being reported as missing.
GNR issues instructions as to the physical actions that must be taken. Note that disks may be
suspended only so long before they are declared missing; therefore the mechanical process of
physically performing disk replacement must be accomplished promptly.
Use of the other three disks in carrier 14 has been suspended, and carrier 14 is unlocked. The identify
lights for carrier 14 and for disk 3 are on.
2. Carrier 14 should be unlatched and removed. The failed disk 3, as indicated by the internal identify
light, should be removed, and the new disk with FRU 74Y4936 should be inserted in its place. Carrier
14 should then be reinserted and the latch closed.
3. To finish the replacement of pdisk c014d3:

# mmchcarrier 000DE37TOP --replace --pdisk c014d3
[I] The following pdisks will be formatted on node server1:
/dev/rhdisk354
[I] Pdisk c014d3 of RG 000DE37TOP successfully replaced.
[I] Resuming pdisk c014d1 of RG 000DE37TOP.
[I] Resuming pdisk c014d2 of RG 000DE37TOP.
[I] Resuming pdisk c014d3#162 of RG 000DE37TOP.
[I] Resuming pdisk c014d4 of RG 000DE37TOP.
[I] Carrier resumed.

When the mmchcarrier --replace command returns successfully, GNR has resumed use of the other 3
disks. The failed pdisk may remain in a temporary form (indicated here by the name c014d3#162) until all
data from it has been rebuilt, at which point it is finally deleted. The new replacement disk, which has
assumed the name c014d3, will have RAID tracks rebuilt and rebalanced onto it. Notice that only one
block device name is mentioned as being formatted as a pdisk; the second path will be discovered in the
background.

This can be confirmed with mmlsrecoverygroup -L --pdisk:


# mmlsrecoverygroup 000DE37TOP -L --pdisk

declustered
recovery group arrays vdisks pdisks
----------------- ----------- ------ ------
000DE37TOP 5 9 193

declustered needs replace scrub background activity


array service vdisks pdisks spares threshold free space duration task progress priority
----------- ------- ------ ------ ------ --------- ---------- -------- -------------------------
DA1 no 2 47 2 2 3072 MiB 14 days scrub 63% low
DA2 no 2 47 2 2 3072 MiB 14 days scrub 19% low
DA3 yes 2 48 2 2 0 B 14 days rebuild-2r 89% low
DA4 no 2 47 2 2 3072 MiB 14 days scrub 34% low
LOG no 1 4 1 1 546 GiB 14 days scrub 87% low

n. active, declustered user state,


pdisk total paths array free space condition remarks
----------------- ----------- ----------- ---------- ----------- -------
[...]
c014d1 2, 4 DA1 23 GiB normal ok
c014d2 2, 4 DA2 23 GiB normal ok
c014d3 2, 4 DA3 550 GiB normal ok
c014d3#162 0, 0 DA3 543 GiB replaceable dead/adminDrain/noRGD/noVCD/noPath
c014d4 2, 4 DA4 23 GiB normal ok
[...]
c018d1 2, 4 DA1 24 GiB normal ok
c018d2 2, 4 DA2 24 GiB normal ok
c018d3 0, 0 DA3 558 GiB replaceable dead/systemDrain/noRGD/noVCD/noData/replace
c018d4 2, 4 DA4 23 GiB normal ok
[...]

Notice that the temporary pdisk c014d3#162 is counted in the total number of pdisks in declustered array
DA3 and in the recovery group, until it is finally drained and deleted.

Notice also that pdisk c018d3 is still marked for replacement, and that DA3 still needs service. This is
because GNR replacement policy expects all failed disks in the declustered array to be replaced once the
replacement threshold is reached. The replace state on a pdisk is not removed when the total number of
failed disks goes under the threshold.

Pdisk c018d3 is replaced following the same process.


1. Release carrier 18 in disk enclosure 000DE37:
# mmchcarrier 000DE37TOP --release --pdisk c018d3
[I] Suspending pdisk c018d1 of RG 000DE37TOP in location 78AD.001.000DE37-C18-D1.
[I] Suspending pdisk c018d2 of RG 000DE37TOP in location 78AD.001.000DE37-C18-D2.
[I] Suspending pdisk c018d3 of RG 000DE37TOP in location 78AD.001.000DE37-C18-D3.
[I] Suspending pdisk c018d4 of RG 000DE37TOP in location 78AD.001.000DE37-C18-D4.
[I] Carrier released.

- Remove carrier.
- Replace disk in location 78AD.001.000DE37-C18-D3 with FRU 74Y4936.
- Reinsert carrier.
- Issue the following command:

mmchcarrier 000DE37TOP --replace --pdisk 'c018d3'

Repair timer is running. Perform the above within 5 minutes


to avoid pdisks being reported as missing.
2. Unlatch and remove carrier 18, remove and replace failed disk 3, reinsert carrier 18, and close the
latch.
3. To finish the replacement of pdisk c018d3:
# mmchcarrier 000DE37TOP --replace --pdisk c018d3

[I] The following pdisks will be formatted on node server1:


/dev/rhdisk674
[I] Pdisk c018d3 of RG 000DE37TOP successfully replaced.
[I] Resuming pdisk c018d1 of RG 000DE37TOP.
[I] Resuming pdisk c018d2 of RG 000DE37TOP.
[I] Resuming pdisk c018d3#166 of RG 000DE37TOP.
[I] Resuming pdisk c018d4 of RG 000DE37TOP.
[I] Carrier resumed.

Running mmlsrecoverygroup again will confirm the second replacement:


# mmlsrecoverygroup 000DE37TOP -L --pdisk

declustered
recovery group arrays vdisks pdisks
----------------- ----------- ------ ------
000DE37TOP 5 9 192

declustered needs replace scrub background activity


array service vdisks pdisks spares threshold free space duration task progress priority
----------- ------- ------ ------ ------ --------- ---------- -------- -------------------------
DA1 no 2 47 2 2 3072 MiB 14 days scrub 64% low
DA2 no 2 47 2 2 3072 MiB 14 days scrub 22% low
DA3 no 2 47 2 2 2048 MiB 14 days rebalance 12% low
DA4 no 2 47 2 2 3072 MiB 14 days scrub 36% low
LOG no 1 4 1 1 546 GiB 14 days scrub 89% low

n. active, declustered user state,


pdisk total paths array free space condition remarks
----------------- ----------- ----------- ---------- ----------- -------
[...]
c014d1 2, 4 DA1 23 GiB normal ok
c014d2 2, 4 DA2 23 GiB normal ok
c014d3 2, 4 DA3 271 GiB normal ok
c014d4 2, 4 DA4 23 GiB normal ok
[...]
c018d1 2, 4 DA1 24 GiB normal ok
c018d2 2, 4 DA2 24 GiB normal ok
c018d3 2, 4 DA3 542 GiB normal ok
c018d4 2, 4 DA4 23 GiB normal ok
[...]

Notice that both temporary pdisks have been deleted. This is because c014d3#162 has finished draining,
and because pdisk c018d3#166 had, before it was replaced, already been completely drained (as
evidenced by the noData flag). Declustered array DA3 no longer needs service and once again contains 47
pdisks, and the recovery group once again contains 192 pdisks.

Directed maintenance procedures available in the GUI
The directed maintenance procedures (DMPs) assist you to repair a problem when you select the action
Run fix procedure on a selected event from the Monitoring > Events page. DMPs are present for only a
few events reported in the system.

The following table provides details of the available DMPs and the corresponding events.
Table 6. DMPs
DMP Event ID
Replace disks gnr_pdisk_replaceable
Update enclosure firmware enclosure_firmware_wrong
Update drive firmware drive_firmware_wrong
Update host-adapter firmware adapter_firmware_wrong
Start NSD disk_down
Start GPFS daemon gpfs_down
Increase fileset space inode_error_high and inode_warn_high
Synchronize Node Clocks time_not_in_sync
Start performance monitoring collector service pmcollector_down
Start performance monitoring sensor service pmsensors_down
Activate AFM performance monitoring sensors afm_sensors_inactive
Activate NFS performance monitoring sensors nfs_sensors_inactive
Activate SMB performance monitoring sensors smb_sensors_inactive

Replace disks
The replace disks DMP assists you to replace the disks.

The following are the corresponding event details and proposed solution:
v Event name: gnr_pdisk_replaceable
v Problem: The state of a physical disk is changed to “replaceable”.
v Solution: Replace the disk.

The ESS GUI detects whether a disk is broken and needs to be replaced. In this case, launch this
DMP to guide you through replacing the broken disks. You can use this DMP to replace either one disk or
multiple disks.

The DMP automatically launches in the corresponding mode depending on the situation. You can launch this
DMP from the following pages in the GUI and follow the wizard to release one or more disks:
v Monitoring > Hardware page: Select Replace Broken Disks from the Actions menu.
v Monitoring > Hardware page: Select the broken disk to be replaced in an enclosure and then select
Replace from the Actions menu.
v Monitoring > Events page: Select the gnr_pdisk_replaceable event from the event listing and then select
Run Fix Procedure from the Actions menu.
v Storage > Physical page: Select Replace Broken Disks from the Actions menu.
v Storage > Physical page: Select the disk to be replaced and then select Replace Disk from the Actions
menu.

The system issues the mmchcarrier command to replace disks as given in the following format:



/usr/lpp/mmfs/bin/mmchcarrier <<Disk_RecoveryGroup>>
--replace|--release|--resume --pdisk <<Disk_Name>> [--force-release]

For example: /usr/lpp/mmfs/bin/mmchcarrier G1 --replace --pdisk G1FSP11

| The system uses the following command on an mmvdisk-enabled environment to release and replace the
| disk:
| mmvdisk pdisk replace [--prepare | --cancel] --recovery-group DiskRecoveryGroup --pdisk DiskName
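Before launching the DMP, you can optionally confirm which pdisks the system currently considers replaceable. A minimal check, assuming the mmlspdisk command is available on an ESS I/O server node (the option shown is illustrative; verify it against the command reference for your release):

# List all pdisks, in all recovery groups, that are marked for replacement
/usr/lpp/mmfs/bin/mmlspdisk all --replace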

Update enclosure firmware


The update enclosure firmware DMP assists to update the enclosure firmware to the latest level.

The following are the corresponding event details and the proposed solution:
v Event name: enclosure_firmware_wrong
v Problem: The reported firmware level of the environmental service module is not compliant with the
recommendation.
v Solution: Update the firmware.

If more than one enclosure is not running the newest version of the firmware, the system prompts to
update the firmware. The system issues the mmchfirmware command to update firmware as given in the
following format:
For a single enclosure: mmchfirmware --esms <<ESM_Name>> --cluster <<Cluster_Id>>

For all enclosures: mmchfirmware --esms --cluster <<Cluster_Id>>

For example, for a single enclosure:


mmchfirmware --esms 181880E-SV20706999_ESM_B --cluster 1857390657572243170

For all enclosures:


mmchfirmware --esms --cluster 1857390657572243170
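You can optionally compare the installed and available enclosure firmware levels before and after the update. A minimal check, assuming the mmlsfirmware command is available on the ESS I/O server node (the --type value shown is illustrative; verify it against the command reference for your release):

# Show the current and available firmware levels for the storage enclosures
/usr/lpp/mmfs/bin/mmlsfirmware --type storage-enclosure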

Update drive firmware


The update drive firmware DMP assists to update the drive firmware to the latest level so that the
physical disk becomes compliant.

The following are the corresponding event details and the proposed solution:
v Event name: drive_firmware_wrong
v Problem: The reported firmware level of the physical disk is not compliant with the recommendation.
v Solution: Update the firmware.

If more than one disk is not running the newest version of the firmware, the system prompts to update
the firmware. The system issues the chfirmware command to update firmware as given in the following
format:

For a single disk:


chfirmware --pdisks <<entity_name>> --cluster <<Cluster_Id>>

For example:
chfirmware --pdisks <<ENC123001/DRV-2>> --cluster 1857390657572243170

For all disks:


chfirmware --pdisks --cluster <<Cluster_Id>>



For example:
chfirmware --pdisks --cluster 1857390657572243170
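You can optionally compare the installed and available drive firmware levels before and after the update. A minimal check, assuming the mmlsfirmware command is available on the ESS I/O server node (the --type value shown is illustrative; verify it against the command reference for your release):

# Show the current and available firmware levels for the drives
/usr/lpp/mmfs/bin/mmlsfirmware --type drive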

Update host-adapter firmware


The Update host-adapter firmware DMP assists to update the host-adapter firmware to the latest level.

The following are the corresponding event details and the proposed solution:
v Event name: adapter_firmware_wrong
v Problem: The reported firmware level of the host adapter is not compliant with the recommendation.
v Solution: Update the firmware.

If more than one host-adapter is not running the newest version of the firmware, the system prompts to
update the firmware. The system issues the chfirmware command to update firmware as given in the
following format:

For a single host adapter:


chfirmware --hostadapter <<Host_Adapter_Name>> --cluster <<Cluster_Id>>

For example:
chfirmware --hostadapter <<c45f02n04_HBA_2>> --cluster 1857390657572243170

For all host adapters:


chfirmware --hostadapter --cluster <<Cluster_Id>>

For example:
chfirmware --hostadapter --cluster 1857390657572243170
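You can optionally compare the installed and available host-adapter firmware levels before and after the update. A minimal check, assuming the mmlsfirmware command is available on the ESS I/O server node (the --type value shown is illustrative; verify it against the command reference for your release):

# Show the current and available firmware levels for the host adapters
/usr/lpp/mmfs/bin/mmlsfirmware --type host-adapter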

Start NSD
The Start NSD DMP assists to start NSDs that are not working.

The following are the corresponding event details and the proposed solution:
v Event ID: disk_down
v Problem: The availability of an NSD is changed to “down”.
v Solution: Recover the NSD

The DMP provides the option to start the NSDs that are not functioning. If multiple NSDs are down, you
can select whether to recover only one NSD or all of them.

The system issues the mmchdisk command to recover NSDs as given in the following format:
/usr/lpp/mmfs/bin/mmchdisk <device> start -d <disk description>

For example: /usr/lpp/mmfs/bin/mmchdisk r1_FS start -d G1_r1_FS_data_0
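You can optionally verify the NSD availability before and after running the DMP. A minimal check, assuming the file system name from the example above (the name and option shown are illustrative; verify them against the command reference for your release):

# Show only the disks of file system r1_FS that are in an error or down state
/usr/lpp/mmfs/bin/mmlsdisk r1_FS -e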

Start GPFS daemon


When the GPFS daemon is down, GPFS functions do not work properly on the node.

The following are the corresponding event details and the proposed solution:
v Event ID: gpfs_down
v Problem: The GPFS daemon is down. GPFS is not operational on node.
v Solution: Start GPFS daemon.



The system issues the mmstartup -N command to restart the GPFS daemon as given in the following format:
/usr/lpp/mmfs/bin/mmstartup -N <Node>

For example: /usr/lpp/mmfs/bin/mmstartup -N gss-05.localnet.com
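After the daemon is restarted, you can optionally confirm its state on the node. A minimal check, assuming the node name from the example above (which is illustrative):

# Verify that the GPFS daemon reports the active state on the node
/usr/lpp/mmfs/bin/mmgetstate -N gss-05.localnet.com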

Increase fileset space


The system needs inodes to allow I/O on a fileset. If the inodes allocated to the fileset are exhausted, you
need to either increase the number of maximum inodes or delete the existing data to free up space.

The procedure helps to increase the maximum number of inodes by a percentage of the already allocated
inodes. The following are the corresponding event details and the proposed solution:
v Event ID: inode_error_high and inode_warn_high
v Problem: The inode usage in the fileset has reached an exhausted level
v Solution: Increase the maximum number of inodes

The system issues the mmchfileset command to increase the maximum number of inodes as given in the following format:
/usr/lpp/mmfs/bin/mmchfileset <Device> <Fileset> --inode-limit <inodesMaxNumber>

For example: /usr/lpp/mmfs/bin/mmchfileset r1_FS testFileset --inode-limit 2048
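You can optionally review the current inode allocation and usage of the fileset before deciding on a new limit. A minimal check, assuming the file system and fileset names from the example above (the names and options shown are illustrative; verify them against the command reference for your release):

# Show inode usage and limits for the fileset
/usr/lpp/mmfs/bin/mmlsfileset r1_FS testFileset -L -i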

Synchronize node clocks


The time must be in sync with the time set on the GUI node. If the time is not in sync, the data that is
displayed in the GUI might be wrong or might not be displayed at all. For example, the GUI does
not display the performance data if the time is not in sync.

The procedure assists in fixing the timing issue on a single node or on all nodes that are out of sync. The
following are the corresponding event details and the proposed solution:
v Event ID: time_not_in_sync
v Limitation: This DMP is not available in sudo wrapper clusters. In a sudo wrapper cluster, the user
name is different from 'root'. The system detects the user name by finding the parameter
GPFS_USER=<user name>, which is available in the file /usr/lpp/mmfs/gui/conf/gpfsgui.properties.
v Problem: The time on the node is not synchronized with the time on the GUI node. The difference is more
than 1 minute.
v Solution: Synchronize the time with the time on the GUI node.

The system issues the sync_node_time command as given in the following format to synchronize the time
in the nodes:
/usr/lpp/mmfs/gui/bin/sync_node_time <nodeName>

For example: /usr/lpp/mmfs/gui/bin/sync_node_time c55f06n04.gpfs.net
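You can optionally compare the clock of the affected node with the clock of the GUI node before and after running the DMP. A minimal check, run from the GUI node and assuming the host name from the example above (which is illustrative):

# Print the time on the GUI node and on the affected node for comparison
date; ssh c55f06n04.gpfs.net date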

Start performance monitoring collector service


The collector services on the GUI node must be functioning properly to display the performance data in
the IBM Spectrum Scale management GUI.

The following are the corresponding event details and the proposed solution:
v Event ID: pmcollector_down
v Limitation: This DMP is not available in sudo wrapper clusters when a remote pmcollector service is
used by the GUI. A remote pmcollector service is detected when a value other than localhost is
specified for the ZIMonAddress parameter in the file /usr/lpp/mmfs/gui/conf/


gpfsgui.properties. In a sudo wrapper cluster, the user name is different from 'root'. The system
detects the user name by finding the parameter GPFS_USER=<user name>, which is available in the file
/usr/lpp/mmfs/gui/conf/gpfsgui.properties.
v Problem: The performance monitoring collector service pmcollector is in inactive state.
v Solution: Issue the systemctl status pmcollector command to check the status of the collector. If the
pmcollector service is inactive, issue systemctl start pmcollector.

The system restarts the performance monitoring services by issuing the systemctl restart pmcollector
command.

The performance monitoring collector service might be on some other node of the current cluster. In this
case, the DMP first connects to that node, then restarts the performance monitoring collector service.
ssh <nodeAddress> systemctl restart pmcollector

For example: ssh 10.0.100.21 systemctl restart pmcollector
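After the restart, you can optionally confirm that the collector service is active on the node that hosts it. A minimal check (the node address is taken from the example above and is illustrative):

# Returns "active" when the collector service is running
ssh 10.0.100.21 systemctl is-active pmcollector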

In a sudo wrapper cluster, when the collector on a remote node is down, the DMP does not restart the
collector service by itself. You need to restart it manually.

Start performance monitoring sensor service


You need to start the sensor service to get the performance details in the collectors. If the sensors and
collectors are not started, the IBM Spectrum Scale management GUI and the CLI do not display the
performance data.

The following are the corresponding event details and the proposed solution:
v Event ID: pmsensors_down
v Limitation: This DMP is not available in sudo wrapper clusters. In a sudo wrapper cluster, the user
name is different from 'root'. The system detects the user name by finding the parameter
GPFS_USER=<user name>, which is available in the file /usr/lpp/mmfs/gui/conf/gpfsgui.properties.
v Problem: The performance monitoring sensor service pmsensor is not sending any data. The service
might be down or the difference between the time of the node and the node hosting the performance
monitoring collector service pmcollector is more than 15 minutes.
v Solution: Issue systemctl status pmsensors to verify the status of the sensor service. If pmsensor
service is inactive, issue systemctl start pmsensors.

The system restarts the sensors by issuing the systemctl restart pmsensors command.

For example: ssh gss-15.localnet.com systemctl restart pmsensors
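After the restart, you can optionally confirm that the sensor service is active on the affected node. A minimal check, assuming the node name from the example above (which is illustrative):

# Returns "active" when the sensor service is running
ssh gss-15.localnet.com systemctl is-active pmsensors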



Chapter 13. References
The IBM Elastic Storage Server system displays a warning or error message when it encounters an issue
that needs user attention. The message severity tags indicate the severity of the issue.

Events
The recorded events are stored in a local database on each node. The user can get a list of recorded events
by using the mmhealth node eventlog command.

The recorded events can also be displayed through the GUI.

The following sections list the RAS events that are applicable to various components of the IBM Spectrum
Scale system:

Messages
This topic contains explanations for IBM Spectrum Scale RAID and ESS GUI messages.

For information about IBM Spectrum Scale messages, see the IBM Spectrum Scale: Problem Determination
Guide.

Message severity tags


IBM Spectrum Scale and ESS GUI messages include message severity tags.

A severity tag is a one-character alphabetic code (A through Z).

For IBM Spectrum Scale messages, the severity tag is optionally followed by a colon (:) and a number,
and surrounded by an opening and closing bracket ([ ]). For example:
[E] or [E:nnn]

If more than one substring within a message matches this pattern (for example, [A] or [A:nnn]), the
severity tag is the first such matching string.

When the severity tag includes a numeric code (nnn), this is an error code associated with the message. If
this were the only problem encountered by the command, the command return code would be nnn.

If a message does not have a severity tag, the message does not conform to this specification. You can
determine the message severity by examining the text or any supplemental information provided in the
message catalog, or by contacting the IBM Support Center.

Each message severity tag has an assigned priority.

For IBM Spectrum Scale messages, this priority can be used to filter the messages that are sent to the
error log on Linux. Filtering is controlled with the mmchconfig attribute systemLogLevel. The default for
systemLogLevel is error, which means that IBM Spectrum Scale will send all error [E], critical [X], and
alert [A] messages to the error log. The values allowed for systemLogLevel are: alert, critical, error,
warning, notice, configuration, informational, detail, or debug. Additionally, the value none can be
specified so no messages are sent to the error log.
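For example, to reduce the amount of logging so that only warning and higher-priority messages are sent to the error log, you can change the attribute and then confirm the setting. A minimal sketch; the chosen level is only an example:

# Send warning, error, critical, and alert messages to the error log
mmchconfig systemLogLevel=warning

# Confirm the configured value
mmlsconfig systemLogLevel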



For IBM Spectrum Scale messages, alert [A] messages have the highest priority and debug [B] messages
have the lowest priority. If the systemLogLevel default of error is changed, only messages with the
specified severity and all those with a higher priority are sent to the error log.

The following table lists the IBM Spectrum Scale message severity tags in order of priority:
Table 7. IBM Spectrum Scale message severity tags ordered by priority

Severity tag    Type of message (systemLogLevel attribute)    Meaning
A alert Indicates a problem where action must be taken immediately. Notify the
appropriate person to correct the problem.
X critical Indicates a critical condition that should be corrected immediately. The
system discovered an internal inconsistency of some kind. Command
execution might be halted or the system might attempt to continue despite
the inconsistency. Report these errors to IBM.
E error Indicates an error condition. Command execution might or might not
continue, but this error was likely caused by a persistent condition and will
remain until corrected by some other program or administrative action. For
example, a command operating on a single file or other GPFS object might
terminate upon encountering any condition of severity E. As another
example, a command operating on a list of files, finding that one of the files
has permission bits set that disallow the operation, might continue to
operate on all other files within the specified list of files.
W warning Indicates a problem, but command execution continues. The problem can be
a transient inconsistency. It can be that the command has skipped some
operations on some objects, or is reporting an irregularity that could be of
interest. For example, if a multipass command operating on many files
discovers during its second pass that a file that was present during the first
pass is no longer present, the file might have been removed by another
command or program.
N notice Indicates a normal but significant condition. These events are unusual, but
are not error conditions, and could be summarized in an email to
developers or administrators for spotting potential problems. No immediate
action is required.
C configuration Indicates a configuration change; such as, creating a file system or removing
a node from the cluster.
I informational Indicates normal operation. This message by itself indicates that nothing is
wrong; no action is required.
D detail Indicates verbose operational messages; no action is required.
B debug Indicates debug-level messages that are useful to application developers for
debugging purposes. This information is not useful during operations.

For ESS GUI messages, error messages (E) have the highest priority and informational messages (I)
have the lowest priority.

The following table lists the ESS GUI message severity tags in order of priority:
Table 8. ESS GUI message severity tags ordered by priority
Severity tag Type of message Meaning
E Error Indicates a critical condition that should be corrected immediately. The
system discovered an internal inconsistency of some kind. Command
execution might be halted or the system might attempt to continue despite
the inconsistency. Report these errors to IBM.




W warning Indicates a problem, but command execution continues. The problem can be
a transient inconsistency. It can be that the command has skipped some
operations on some objects, or is reporting an irregularity that could be of
interest. For example, if a multipass command operating on many files
discovers during its second pass that a file that was present during the first
pass is no longer present, the file might have been removed by another
command or program.
I informational Indicates normal operation. This message by itself indicates that nothing is
wrong; no action is required.

IBM Spectrum Scale RAID messages


This section lists the IBM Spectrum Scale RAID messages.

For information about the severity designations of these messages, see “Message severity tags” on page
77.

6027-1850 [E] NSD-RAID services are not configured on node nodeName. Check the nsdRAIDTracks and nsdRAIDBufferPoolSizePct configuration attributes.
Explanation: An IBM Spectrum Scale RAID command is being executed, but NSD-RAID services are not initialized either because the specified attributes have not been set or had invalid values.
User response: Correct the attributes and restart the GPFS daemon.

6027-1851 [A] Cannot configure NSD-RAID services. The nsdRAIDBufferPoolSizePct of the pagepool must result in at least 128MiB of space.
Explanation: The GPFS daemon is starting and cannot initialize the NSD-RAID services because of the memory consideration specified.
User response: Correct the nsdRAIDBufferPoolSizePct attribute and restart the GPFS daemon.

6027-1852 [A] Cannot configure NSD-RAID services. nsdRAIDTracks is too large, the maximum on this node is value.
Explanation: The GPFS daemon is starting and cannot initialize the NSD-RAID services because the nsdRAIDTracks attribute is too large.
User response: Correct the nsdRAIDTracks attribute and restart the GPFS daemon.

6027-1853 [E] Recovery group recoveryGroupName does not exist or is not active.
Explanation: A command was issued to a RAID recovery group that does not exist, or is not in the active state.
User response: Retry the command with a valid RAID recovery group name or wait for the recovery group to become active.

6027-1854 [E] Cannot find declustered array arrayName in recovery group recoveryGroupName.
Explanation: The specified declustered array name was not found in the RAID recovery group.
User response: Specify a valid declustered array name within the RAID recovery group.

6027-1855 [E] Cannot find pdisk pdiskName in recovery group recoveryGroupName.
Explanation: The specified pdisk was not found.
User response: Retry the command with a valid pdisk name.

6027-1856 [E] Vdisk vdiskName not found.
Explanation: The specified vdisk was not found.
User response: Retry the command with a valid vdisk name.

6027-1857 [E] A recovery group must contain between number and number pdisks.
Explanation: The number of pdisks specified is not valid.
User response: Correct the input and retry the command.




6027-1858 [E] Cannot create declustered array arrayName; there can be at most number declustered arrays in a recovery group.
Explanation: The number of declustered arrays allowed in a recovery group has been exceeded.
User response: Reduce the number of declustered arrays in the input file and retry the command.

6027-1859 [E] Sector size of pdisk pdiskName is invalid.
Explanation: All pdisks in a recovery group must have the same physical sector size.
User response: Correct the input file to use a different disk and retry the command.

| 6027-1860 [E] Pdisk pdiskName must have a capacity of at least number bytes.
| Explanation: The pdisk must be at least as large as the indicated minimum size in order to be added to this declustered array.
| User response: Correct the input file and retry the command.

6027-1861 [W] Size of pdisk pdiskName is too large for declustered array arrayName. Only number of number bytes of that capacity will be used.
Explanation: For optimal utilization of space, pdisks added to this declustered array should be no larger than the indicated maximum size. Only the indicated portion of the total capacity of the pdisk will be available for use.
User response: Consider creating a new declustered array consisting of all larger pdisks.

6027-1862 [E] Cannot add pdisk pdiskName to declustered array arrayName; there can be at most number pdisks in a declustered array.
Explanation: The maximum number of pdisks that can be added to a declustered array was exceeded.
User response: None.

6027-1863 [E] Pdisk sizes within a declustered array cannot vary by more than number.
Explanation: The disk sizes within each declustered array must be nearly the same.
User response: Create separate declustered arrays for each disk size.

| 6027-1864 [E] At least one declustered array must contain number + vdisk configuration data spares or more pdisks and be eligible to hold vdisk configuration data.
| Explanation: When creating a new RAID recovery group, at least one of the declustered arrays in the recovery group must contain at least 2T+1 pdisks, where T is the maximum number of disk failures that can be tolerated within a declustered array. This is necessary in order to store the on-disk vdisk configuration data safely. This declustered array cannot have canHoldVCD set to no.
| User response: Supply at least the indicated number of pdisks in at least one declustered array of the recovery group, or do not specify canHoldVCD=no for that declustered array.

6027-1866 [E] Disk descriptor for diskName refers to an existing NSD.
Explanation: A disk being added to a recovery group appears to already be in-use as an NSD disk.
User response: Carefully check the disks given to tscrrecgroup, tsaddpdisk or tschcarrier. If you are certain the disk is not actually in-use, override the check by specifying the -v no option.

6027-1867 [E] Disk descriptor for diskName refers to an existing pdisk.
Explanation: A disk being added to a recovery group appears to already be in-use as a pdisk.
User response: Carefully check the disks given to tscrrecgroup, tsaddpdisk or tschcarrier. If you are certain the disk is not actually in-use, override the check by specifying the -v no option.

6027-1869 [E] Error updating the recovery group descriptor.
Explanation: Error occurred updating the RAID recovery group descriptor.
User response: Retry the command.

6027-1870 [E] Recovery group name name is already in use.
Explanation: The recovery group name already exists.
User response: Choose a new recovery group name using the characters a-z, A-Z, 0-9, and underscore, at most 63 characters in length.




6027-1871 [E] There is only enough free space to allocate number spare(s) in declustered array arrayName.
Explanation: Too many spares were specified.
User response: Retry the command with a valid number of spares.

6027-1872 [E] Recovery group still contains vdisks.
Explanation: RAID recovery groups that still contain vdisks cannot be deleted.
User response: Delete any vdisks remaining in this RAID recovery group using the tsdelvdisk command before retrying this command.

6027-1873 [E] Pdisk creation failed for pdisk pdiskName: err=errorNum.
Explanation: Pdisk creation failed because of the specified error.
User response: None.

6027-1874 [E] Error adding pdisk to a recovery group.
Explanation: tsaddpdisk failed to add new pdisks to a recovery group.
User response: Check the list of pdisks in the -d or -F parameter of tsaddpdisk.

6027-1875 [E] Cannot delete the only declustered array.
Explanation: Cannot delete the only remaining declustered array from a recovery group.
User response: Instead, delete the entire recovery group.

| 6027-1876 [E] Cannot remove declustered array arrayName because it is the only remaining declustered array with at least number pdisks eligible to hold vdisk configuration data.
| Explanation: The command failed to remove a declustered array because no other declustered array in the recovery group has sufficient pdisks to store the on-disk recovery group descriptor at the required fault tolerance level.
| User response: Add pdisks to another declustered array in this recovery group before removing this one.

6027-1877 [E] Cannot remove declustered array arrayName because the array still contains vdisks.
Explanation: Declustered arrays that still contain vdisks cannot be deleted.
User response: Delete any vdisks remaining in this declustered array using the tsdelvdisk command before retrying this command.

6027-1878 [E] Cannot remove pdisk pdiskName because it is the last remaining pdisk in declustered array arrayName. Remove the declustered array instead.
Explanation: The tsdelpdisk command can be used either to delete individual pdisks from a declustered array, or to delete a full declustered array from a recovery group. You cannot, however, delete a declustered array by deleting all of its pdisks -- at least one must remain.
User response: Delete the declustered array instead of removing all of its pdisks.

6027-1879 [E] Cannot remove pdisk pdiskName because arrayName is the only remaining declustered array with at least number pdisks.
Explanation: The command failed to remove a pdisk from a declustered array because no other declustered array in the recovery group has sufficient pdisks to store the on-disk recovery group descriptor at the required fault tolerance level.
User response: Add pdisks to another declustered array in this recovery group before removing pdisks from this one.

6027-1880 [E] Cannot remove pdisk pdiskName because the number of pdisks in declustered array arrayName would fall below the code width of one or more of its vdisks.
Explanation: The number of pdisks in a declustered array must be at least the maximum code width of any vdisk in the declustered array.
User response: Either add pdisks or remove vdisks from the declustered array.

6027-1881 [E] Cannot remove pdisk pdiskName because of insufficient free space in declustered array arrayName.
Explanation: The tsdelpdisk command could not delete a pdisk because there was not enough free space in the declustered array.
User response: Either add pdisks or remove vdisks from the declustered array.




6027-1882 [E] Cannot remove pdisk pdiskName; unable to drain the data from the pdisk.
Explanation: Pdisk deletion failed because the system could not find enough free space on other pdisks to drain all of the data from the disk.
User response: Either add pdisks or remove vdisks from the declustered array.

6027-1883 [E] Pdisk pdiskName deletion failed: process interrupted.
Explanation: Pdisk deletion failed because the deletion process was interrupted. This is most likely because of the recovery group failing over to a different server.
User response: Retry the command.

6027-1884 [E] Missing or invalid vdisk name.
Explanation: No vdisk name was given on the tscrvdisk command.
User response: Specify a vdisk name using the characters a-z, A-Z, 0-9, and underscore of at most 63 characters in length.

6027-1885 [E] Vdisk block size must be a power of 2.
Explanation: The -B or --blockSize parameter of tscrvdisk must be a power of 2.
User response: Reissue the tscrvdisk command with a correct value for block size.

6027-1886 [E] Vdisk block size cannot exceed maxBlockSize (number).
Explanation: The virtual block size of a vdisk cannot be larger than the value of the maxblocksize configuration attribute of the IBM Spectrum Scale mmchconfig command.
User response: Use a smaller vdisk virtual block size, or increase the value of maxBlockSize using mmchconfig maxblocksize=newSize.

6027-1887 [E] Vdisk block size must be between number and number for the specified code.
Explanation: An invalid vdisk block size was specified. The message lists the allowable range of block sizes.
User response: Use a vdisk virtual block size within the range shown, or use a different vdisk RAID code.

6027-1888 [E] Recovery group already contains number vdisks.
Explanation: The RAID recovery group already contains the maximum number of vdisks.
User response: Create vdisks in another RAID recovery group, or delete one or more of the vdisks in the current RAID recovery group before retrying the tscrvdisk command.

6027-1889 [E] Vdisk name vdiskName is already in use.
Explanation: The vdisk name given on the tscrvdisk command already exists.
User response: Choose a new vdisk name less than 64 characters using the characters a-z, A-Z, 0-9, and underscore.

6027-1890 [E] A recovery group may only contain one log home vdisk.
Explanation: A log vdisk already exists in the recovery group.
User response: None.

6027-1891 [E] Cannot create vdisk before the log home vdisk is created.
Explanation: The log vdisk must be the first vdisk created in a recovery group.
User response: Retry the command after creating the log home vdisk.

6027-1892 [E] Log vdisks must use replication.
Explanation: The log vdisk must use a RAID code that uses replication.
User response: Retry the command with a valid RAID code.

6027-1893 [E] The declustered array must contain at least as many non-spare pdisks as the width of the code.
Explanation: The RAID code specified requires a minimum number of disks larger than the size of the declustered array that was given.
User response: Place the vdisk in a wider declustered array or use a narrower code.

6027-1894 [E] There is not enough space in the declustered array to create additional vdisks.
Explanation: There is insufficient space in the declustered array to create even a minimum size vdisk with the given RAID code.




User response: Add additional pdisks to the declustered array, reduce the number of spares or use a different RAID code.

6027-1895 [E] Unable to create vdisk vdiskName because there are too many failed pdisks in declustered array declusteredArrayName.
Explanation: Cannot create the specified vdisk, because there are too many failed pdisks in the array.
User response: Replace failed pdisks in the declustered array and allow time for rebalance operations to more evenly distribute the space.

6027-1896 [E] Insufficient memory for vdisk metadata.
Explanation: There was not enough pinned memory for IBM Spectrum Scale to hold all of the metadata necessary to describe a vdisk.
User response: Increase the size of the GPFS page pool.

6027-1897 [E] Error formatting vdisk.
Explanation: An error occurred formatting the vdisk.
User response: None.

6027-1898 [E] The log home vdisk cannot be destroyed if there are other vdisks.
Explanation: The log home vdisk of a recovery group cannot be destroyed if vdisks other than the log tip vdisk still exist within the recovery group.
User response: Remove the user vdisks and then retry the command.

6027-1899 [E] Vdisk vdiskName is still in use.
Explanation: The vdisk named on the tsdelvdisk command is being used as an NSD disk.
User response: Remove the vdisk with the mmdelnsd command before attempting to delete it.

6027-3000 [E] No disk enclosures were found on the target node.
Explanation: IBM Spectrum Scale is unable to communicate with any disk enclosures on the node serving the specified pdisks. This might be because there are no disk enclosures attached to the node, or it might indicate a problem in communicating with the disk enclosures. While the problem persists, disk maintenance with the mmchcarrier command is not available.
User response: Check disk enclosure connections and run the command again. Use mmaddpdisk --replace as an alternative method of replacing failed disks.

6027-3001 [E] Location of pdisk pdiskName of recovery group recoveryGroupName is not known.
Explanation: IBM Spectrum Scale is unable to find the location of the given pdisk.
User response: Check the disk enclosure hardware.

6027-3002 [E] Disk location code locationCode is not known.
Explanation: A disk location code specified on the command line was not found.
User response: Check the disk location code.

6027-3003 [E] Disk location code locationCode was specified more than once.
Explanation: The same disk location code was specified more than once in the tschcarrier command.
User response: Check the command usage and run again.

6027-3004 [E] Disk location codes locationCode and locationCode are not in the same disk carrier.
Explanation: The tschcarrier command cannot be used to operate on more than one disk carrier at a time.
User response: Check the command usage and rerun.

6027-3005 [W] Pdisk in location locationCode is controlled by recovery group recoveryGroupName.
Explanation: The tschcarrier command detected that a pdisk in the indicated location is controlled by a different recovery group than the one specified.
User response: Check the disk location code and recovery group name.

6027-3006 [W] Pdisk in location locationCode is controlled by recovery group id idNumber.
Explanation: The tschcarrier command detected that a pdisk in the indicated location is controlled by a different recovery group than the one specified.
User response: Check the disk location code and recovery group name.




6027-3007 [E] Carrier contains pdisks from more than one recovery group.
Explanation: The tschcarrier command detected that a disk carrier contains pdisks controlled by more than one recovery group.
User response: Use the tschpdisk command to bring the pdisks in each of the other recovery groups offline and then rerun the command using the --force-RG flag.

6027-3008 [E] Incorrect recovery group given for location.
Explanation: The mmchcarrier command detected that the specified recovery group name given does not match that of the pdisk in the specified location.
User response: Check the disk location code and recovery group name. If you are sure that the disks in the carrier are not being used by other recovery groups, it is possible to override the check using the --force-RG flag. Use this flag with caution as it can cause disk errors and potential data loss in other recovery groups.

6027-3009 [E] Pdisk pdiskName of recovery group recoveryGroupName is not currently scheduled for replacement.
Explanation: A pdisk specified in a tschcarrier or tsaddpdisk command is not currently scheduled for replacement.
User response: Make sure the correct disk location code or pdisk name was given. For the mmchcarrier command, the --force-release option can be used to override the check.

6027-3010 [E] Command interrupted.
Explanation: The mmchcarrier command was interrupted by a conflicting operation, for example the mmchpdisk --resume command on the same pdisk.
User response: Run the mmchcarrier command again.

6027-3011 [W] Disk location locationCode failed to power off.
Explanation: The mmchcarrier command detected an error when trying to power off a disk.
User response: Check the disk enclosure hardware. If the disk carrier has a lock and does not unlock, try running the command again or use the manual carrier release.

6027-3012 [E] Cannot find a pdisk in location locationCode.
Explanation: The tschcarrier command cannot find a pdisk to replace in the given location.
User response: Check the disk location code.

6027-3013 [W] Disk location locationCode failed to power on.
Explanation: The mmchcarrier command detected an error when trying to power on a disk.
User response: Make sure the disk is firmly seated and run the command again.

6027-3014 [E] Pdisk pdiskName of recovery group recoveryGroupName was expected to be replaced with a new disk; instead, it was moved from location locationCode to location locationCode.
Explanation: The mmchcarrier command expected a pdisk to be removed and replaced with a new disk. But instead of being replaced, the old pdisk was moved into a different location.
User response: Repeat the disk replacement procedure.

6027-3015 [E] Pdisk pdiskName of recovery group recoveryGroupName in location locationCode cannot be used as a replacement for pdisk pdiskName of recovery group recoveryGroupName.
Explanation: The tschcarrier command expected a pdisk to be removed and replaced with a new disk. But instead of finding a new disk, the mmchcarrier command found that another pdisk was moved to the replacement location.
User response: Repeat the disk replacement procedure, making sure to replace the failed pdisk with a new disk.

6027-3016 [E] Replacement disk in location locationCode has an incorrect type fruCode; expected type code is fruCode.
Explanation: The replacement disk has a different field replaceable unit type code than that of the original disk.
User response: Replace the pdisk with a disk of the same part number. If you are certain the new disk is a valid substitute, override this check by running the command again with the --force-fru option.

6027-3017 [E] Error formatting replacement disk diskName.
Explanation: An error occurred when trying to format a replacement pdisk.
User response: Check the replacement disk.




6027-3018 [E] A replacement for pdisk pdiskName of recovery group recoveryGroupName was not found in location locationCode.
Explanation: The tschcarrier command expected a pdisk to be removed and replaced with a new disk, but no replacement disk was found.
User response: Make sure a replacement disk was inserted into the correct slot.

6027-3019 [E] Pdisk pdiskName of recovery group recoveryGroupName in location locationCode was not replaced.
Explanation: The tschcarrier command expected a pdisk to be removed and replaced with a new disk, but the original pdisk was still found in the replacement location.
User response: Repeat the disk replacement, making sure to replace the pdisk with a new disk.

6027-3020 [E] Invalid state change, stateChangeName, for pdisk pdiskName.
Explanation: The tschpdisk command received a state change request that is not permitted.
User response: Correct the input and reissue the command.

6027-3021 [E] Unable to change identify state to identifyState for pdisk pdiskName: err=errorNum.
Explanation: The tschpdisk command failed on an identify request.
User response: Check the disk enclosure hardware.

6027-3022 [E] Unable to create vdisk layout.
Explanation: The tscrvdisk command could not create the necessary layout for the specified vdisk.
User response: Change the vdisk arguments and retry the command.

6027-3023 [E] Error initializing vdisk.
Explanation: The tscrvdisk command could not initialize the vdisk.
User response: Retry the command.

6027-3024 [E] Error retrieving recovery group recoveryGroupName event log.
Explanation: Because of an error, the tslsrecoverygroupevents command was unable to retrieve the full event log.
User response: None.

6027-3025 [E] Device deviceName does not exist or is not active on this node.
Explanation: The specified device was not found on this node.
User response: None.

6027-3026 [E] Recovery group recoveryGroupName does not have an active log home vdisk.
Explanation: The indicated recovery group does not have an active log vdisk. This may be because the log home vdisk has not yet been created, because a previously existing log home vdisk has been deleted, or because the server is in the process of recovery.
User response: Create a log home vdisk if none exists. Retry the command.

6027-3027 [E] Cannot configure NSD-RAID services on this node.
Explanation: NSD-RAID services are not supported on this operating system or node hardware.
User response: Configure a supported node type as the NSD RAID server and restart the GPFS daemon.

6027-3028 [E] There is not enough space in declustered array declusteredArrayName for the requested vdisk size. The maximum possible size for this vdisk is size.
Explanation: There is not enough space in the declustered array for the requested vdisk size.
User response: Create a smaller vdisk, remove existing vdisks or add additional pdisks to the declustered array.

6027-3029 [E] There must be at least number non-spare pdisks in declustered array declusteredArrayName to avoid falling below the code width of vdisk vdiskName.
Explanation: A change of spares operation failed because the resulting number of non-spare pdisks would fall below the code width of the indicated vdisk.
User response: Add additional pdisks to the declustered array.

6027-3030 [E] There must be at least number non-spare pdisks in declustered array declusteredArrayName for configuration data replicas.
Explanation: A delete pdisk or change of spares operation failed because the resulting number of non-spare pdisks would fall below the number required to hold configuration data for the declustered array.




User response: Add additional pdisks to the declustered array. If replacing a pdisk, use mmchcarrier or mmaddpdisk --replace.

6027-3031 [E] There is not enough available configuration data space in declustered array declusteredArrayName to complete this operation.
Explanation: Creating a vdisk, deleting a pdisk, or changing the number of spares failed because there is not enough available space in the declustered array for configuration data.
User response: Replace any failed pdisks in the declustered array and allow time for rebalance operations to more evenly distribute the available space. Add pdisks to the declustered array.

6027-3032 [E] Temporarily unable to create vdisk vdiskName because more time is required to rebalance the available space in declustered array declusteredArrayName.
Explanation: Cannot create the specified vdisk until rebuild and rebalance processes are able to more evenly distribute the available space.
User response: Replace any failed pdisks in the recovery group, allow time for rebuild and rebalance processes to more evenly distribute the spare space within the array, and retry the command.

6027-3034 [E] The input pdisk name (pdiskName) did not match the pdisk name found on disk (pdiskName).
Explanation: Cannot add the specified pdisk, because the input pdiskName did not match the pdiskName that was written on the disk.
User response: Verify the input file and retry the command.

6027-3035 [A] Cannot configure NSD-RAID services. maxblocksize must be at least value.
Explanation: The GPFS daemon is starting and cannot initialize the NSD-RAID services because the maxblocksize attribute is too small.
User response: Correct the maxblocksize attribute and restart the GPFS daemon.

6027-3036 [E] Partition size must be a power of 2.
Explanation: The partitionSize parameter of some declustered array was invalid.
User response: Correct the partitionSize parameter and reissue the command.

6027-3037 [E] Partition size must be between number and number.
Explanation: The partitionSize parameter of some declustered array was invalid.
User response: Correct the partitionSize parameter to a power of 2 within the specified range and reissue the command.

6027-3038 [E] AU log too small; must be at least number bytes.
Explanation: The auLogSize parameter of a new declustered array was invalid.
User response: Increase the auLogSize parameter and reissue the command.

6027-3039 [E] A vdisk with disk usage vdiskLogTip must be the first vdisk created in a recovery group.
Explanation: The --logTip disk usage was specified for a vdisk other than the first one created in a recovery group.
User response: Retry the command with a different disk usage.

6027-3040 [E] Declustered array configuration data does not fit.
Explanation: There is not enough space in the pdisks of a new declustered array to hold the AU log area using the current partition size.
User response: Increase the partitionSize parameter or decrease the auLogSize parameter and reissue the command.

| 6027-3041 [E] Declustered array attributes cannot be changed.
| Explanation: The partitionSize, auLogSize, and canHoldVCD attributes of a declustered array cannot be changed after the declustered array has been created. They may only be set by a command that creates the declustered array.
| User response: Remove the partitionSize, auLogSize, and canHoldVCD attributes from the input file of the mmaddpdisk command and reissue the command.

6027-3042 [E] The log tip vdisk cannot be destroyed if there are other vdisks.
Explanation: In recovery groups with versions prior to 3.5.0.11, the log tip vdisk cannot be destroyed if other vdisks still exist within the recovery group.




User response: Remove the user vdisks or upgrade the version of the recovery group with mmchrecoverygroup --version, then retry the command to remove the log tip vdisk.

6027-3043 [E] Log vdisks cannot have multiple use specifications.
Explanation: A vdisk can have usage vdiskLog, vdiskLogTip, or vdiskLogReserved, but not more than one.
User response: Retry the command with only one of the --log, --logTip, or --logReserved attributes.

6027-3044 [E] Unable to determine resource requirements for all the recovery groups served by node value: to override this check reissue the command with the -v no flag.
Explanation: A recovery group or vdisk is being created, but IBM Spectrum Scale can not determine if there are enough non-stealable buffer resources to allow the node to successfully serve all the recovery groups at the same time once the new object is created.
User response: You can override this check by reissuing the command with the -v flag.

6027-3045 [W] Buffer request exceeds the non-stealable buffer limit. Check the configuration attributes of the recovery group servers: pagepool, nsdRAIDBufferPoolSizePct, nsdRAIDNonStealableBufPct.
Explanation: The limit of non-stealable buffers has been exceeded. This is probably because the system is not configured correctly.
User response: Check the settings of the pagepool, nsdRAIDBufferPoolSizePct, and nsdRAIDNonStealableBufPct attributes and make sure the server has enough real memory to support the configured values. Use the mmchconfig command to correct the configuration.

6027-3046 [E] The nonStealable buffer limit may be too low on server serverName or the pagepool is too small. Check the configuration attributes of the recovery group servers: pagepool, nsdRAIDBufferPoolSizePct, nsdRAIDNonStealableBufPct.
Explanation: The limit of non-stealable buffers is too low on the specified recovery group server. This is probably because the system is not configured correctly.
User response: Check the settings of the pagepool, nsdRAIDBufferPoolSizePct, and nsdRAIDNonStealableBufPct attributes and make sure the server has sufficient real memory to support the configured values. The specified configuration variables should be the same for the recovery group servers. Use the mmchconfig command to correct the configuration.

6027-3047 [E] Location of pdisk pdiskName is not known.
Explanation: IBM Spectrum Scale is unable to find the location of the given pdisk.
User response: Check the disk enclosure hardware.

6027-3048 [E] Pdisk pdiskName is not currently scheduled for replacement.
Explanation: A pdisk specified in a tschcarrier or tsaddpdisk command is not currently scheduled for replacement.
User response: Make sure the correct disk location code or pdisk name was given. For the tschcarrier command, the --force-release option can be used to override the check.

6027-3049 [E] The minimum size for vdisk vdiskName is number.
Explanation: The vdisk size was too small.
User response: Increase the size of the vdisk and retry the command.

6027-3050 [E] There are already number suspended pdisks in declustered array arrayName. You must resume pdisks in the array before suspending more.
Explanation: The number of suspended pdisks in the declustered array has reached the maximum limit. Allowing more pdisks to be suspended in the array would put data availability at risk.
User response: Resume one or more suspended pdisks in the array by using the mmchcarrier or mmchpdisk commands, then retry the command.

6027-3051 [E] Checksum granularity must be number or number.
Explanation: The only allowable values for the checksumGranularity attribute of a data vdisk are 8K and 32K.
User response: Change the checksumGranularity attribute of the vdisk, then retry the command.




6027-3052 [E] Checksum granularity cannot be specified for log vdisks.
Explanation: The checksumGranularity attribute cannot be applied to a log vdisk.
User response: Remove the checksumGranularity attribute of the log vdisk, then retry the command.

6027-3053 [E] Vdisk block size must be between number and number for the specified code when checksum granularity number is used.
Explanation: An invalid vdisk block size was specified. The message lists the allowable range of block sizes.
User response: Use a vdisk virtual block size within the range shown, or use a different vdisk RAID code, or use a different checksum granularity.

6027-3054 [W] Disk in location locationCode failed to come online.
Explanation: The mmchcarrier command detected an error when trying to bring a disk back online.
User response: Make sure the disk is firmly seated and run the command again. Check the operating system error log.

6027-3055 [E] The fault tolerance of the code cannot be greater than the fault tolerance of the internal configuration data.
Explanation: The RAID code specified for a new vdisk is more fault-tolerant than the configuration data that will describe the vdisk.
User response: Use a code with a smaller fault tolerance.

6027-3056 [E] Long and short term event log size and fast write log percentage are only applicable to log home vdisk.
Explanation: The longTermEventLogSize, shortTermEventLogSize, and fastWriteLogPct options are only applicable to log home vdisk.
User response: Remove any of these options and retry vdisk creation.

6027-3057 [E] Disk enclosure is no longer reporting information on location locationCode.
Explanation: The disk enclosure reported an error when IBM Spectrum Scale tried to obtain updated status on the disk location.
User response: Try running the command again. Make sure that the disk enclosure firmware is current. Check for improperly-seated connectors within the disk enclosure.

6027-3058 [A] GSS license failure - IBM Spectrum Scale RAID services will not be configured on this node.
Explanation: The Elastic Storage Server has not been installed validly. Therefore, IBM Spectrum Scale RAID services will not be configured.
User response: Install a licensed copy of the base IBM Spectrum Scale code and restart the GPFS daemon.

6027-3059 [E] The serviceDrain state is only permitted when all nodes in the cluster are running daemon version version or higher.
Explanation: The mmchpdisk command option --begin-service-drain was issued, but there are backlevel nodes in the cluster that do not support this action.
User response: Upgrade the nodes in the cluster to at least the specified version and run the command again.

6027-3060 [E] Block sizes of all log vdisks must be the same.
Explanation: The block sizes of the log tip vdisk, the log tip backup vdisk, and the log home vdisk must all be the same.
User response: Try running the command again after adjusting the block sizes of the log vdisks.

6027-3061 [E] Cannot delete path pathName because there would be no other working paths to pdisk pdiskName of RG recoveryGroupName.
Explanation: When the -v yes option is specified on the --delete-paths subcommand of the tschrecgroup command, it is not allowed to delete the last working path to a pdisk.
User response: Try running the command again after repairing other broken paths for the named pdisk, or reduce the list of paths being deleted, or run the command with -v no.

6027-3062 [E] Recovery group version version is not compatible with the current recovery group version.
Explanation: The recovery group version specified with the --version option does not support all of the features currently supported by the recovery group.
User response: Run the command with a new value for --version. The allowable values will be listed following this message.




6027-3063 [E] Unknown recovery group version version.
Explanation: The recovery group version named by the argument of the --version option was not recognized.
User response: Run the command with a new value for --version. The allowable values will be listed following this message.

6027-3064 [I] Allowable recovery group versions are:
Explanation: Informational message listing allowable recovery group versions.
User response: Run the command with one of the recovery group versions listed.

6027-3065 [E] The maximum size of a log tip vdisk is size.
Explanation: Running mmcrvdisk for a log tip vdisk failed because the size is too large.
User response: Correct the size parameter and run the command again.

6027-3066 [E] A recovery group may only contain one log tip vdisk.
Explanation: A log tip vdisk already exists in the recovery group.
User response: None.

6027-3067 [E] Log tip backup vdisks not supported by this recovery group version.
Explanation: Vdisks with usage type vdiskLogTipBackup are not supported by all recovery group versions.
User response: Upgrade the recovery group to a later version using the --version option of mmchrecoverygroup.

6027-3068 [E] The sizes of the log tip vdisk and the log tip backup vdisk must be the same.
Explanation: The log tip vdisk must be the same size as the log tip backup vdisk.
User response: Adjust the vdisk sizes and retry the mmcrvdisk command.

6027-3069 [E] Log vdisks cannot use code codeName.
Explanation: Log vdisks must use a RAID code that uses replication, or be unreplicated. They cannot use parity-based codes such as 8+2P.
User response: Retry the command with a valid RAID code.

6027-3070 [E] Log vdisk vdiskName cannot appear in the same declustered array as log vdisk vdiskName.
Explanation: No two log vdisks may appear in the same declustered array.
User response: Specify a different declustered array for the new log vdisk and retry the command.

6027-3071 [E] Device not found: deviceName.
Explanation: A device name given in an mmcrrecoverygroup or mmaddpdisk command was not found.
User response: Check the device name.

6027-3072 [E] Invalid device name: deviceName.
Explanation: A device name given in an mmcrrecoverygroup or mmaddpdisk command is invalid.
User response: Check the device name.

6027-3073 [E] Error formatting pdisk pdiskName on device diskName.
Explanation: An error occurred when trying to format a new pdisk.
User response: Check that the disk is working properly.

6027-3074 [E] Node nodeName not found in cluster configuration.
Explanation: A node name specified in a command does not exist in the cluster configuration.
User response: Check the command arguments.

6027-3075 [E] The --servers list must contain the current node, nodeName.
Explanation: The --servers list of a tscrrecgroup command does not list the server on which the command is being run.
User response: Check the --servers list. Make sure the tscrrecgroup command is run on a server that will actually serve the recovery group.

6027-3076 [E] Remote pdisks are not supported by this recovery group version.
Explanation: Pdisks that are not directly attached are not supported by all recovery group versions.
User response: Upgrade the recovery group to a later version using the --version option of mmchrecoverygroup.




6027-3077 [E] There must be at least number pdisks in recovery group recoveryGroupName for configuration data replicas.
Explanation: A change of pdisks failed because the resulting number of pdisks would fall below the needed replication factor for the recovery group descriptor.
User response: Do not attempt to delete more pdisks.

6027-3078 [E] Replacement threshold for declustered array declusteredArrayName of recovery group recoveryGroupName cannot exceed number.
Explanation: The replacement threshold cannot be larger than the maximum number of pdisks in a declustered array. The maximum number of pdisks in a declustered array depends on the version number of the recovery group. The current limit is given in this message.
User response: Use a smaller replacement threshold or upgrade the recovery group version.

6027-3079 [E] Number of spares for declustered array declusteredArrayName of recovery group recoveryGroupName cannot exceed number.
Explanation: The number of spares cannot be larger than the maximum number of pdisks in a declustered array. The maximum number of pdisks in a declustered array depends on the version number of the recovery group. The current limit is given in this message.
User response: Use a smaller number of spares or upgrade the recovery group version.

6027-3080 [E] Cannot remove pdisk pdiskName because declustered array declusteredArrayName would have fewer disks than its replacement threshold.
Explanation: The replacement threshold for a declustered array must not be larger than the number of pdisks in the declustered array.
User response: Reduce the replacement threshold for the declustered array, then retry the mmdelpdisk command.

6027-3084 [E] VCD spares feature must be enabled before being changed. Upgrade recovery group version to at least version to enable it.
Explanation: The vdisk configuration data (VCD) spares feature is not supported in the current recovery group version.
User response: Apply the recovery group version that is recommended in the error message and retry the command.

6027-3085 [E] The number of VCD spares must be greater than or equal to the number of spares in declustered array declusteredArrayName.
Explanation: Too many spares or too few vdisk configuration data (VCD) spares were specified.
User response: Retry the command with a smaller number of spares or a larger number of VCD spares.

6027-3086 [E] There is only enough free space to allocate n VCD spare(s) in declustered array declusteredArrayName.
Explanation: Too many vdisk configuration data (VCD) spares were specified.
User response: Retry the command with a smaller number of VCD spares.

6027-3087 [E] Specifying Pdisk rotation rate not supported by this recovery group version.
Explanation: Specifying the Pdisk rotation rate is not supported by all recovery group versions.
User response: Upgrade the recovery group to a later version using the --version option of the mmchrecoverygroup command. Or, don't specify a rotation rate.

6027-3088 [E] Specifying Pdisk expected number of paths not supported by this recovery group version.
Explanation: Specifying the expected number of active or total pdisk paths is not supported by all recovery group versions.
User response: Upgrade the recovery group to a later version using the --version option of the mmchrecoverygroup command. Or, don't specify the expected number of paths.

6027-3089 [E] Pdisk pdiskName location locationCode is already in use.
Explanation: The pdisk location that was specified in the command conflicts with another pdisk that is already in that location. No two pdisks can be in the same location.
User response: Specify a unique location for this pdisk.

90 ESS 5.3.1: Problem Determination Guide
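Messages 6027-3077 through 6027-3080, 6027-3085, and 6027-3086 concern per-declustered-array limits such as the number of pdisks, spares, VCD spares, and the replacement threshold. Before retrying the failing command, the current settings can be reviewed with mmlsrecoverygroup; this is a sketch in which the recovery group name rgL is a placeholder:

   mmlsrecoverygroup rgL -L

The -L option shows the detailed declustered array information, including pdisk counts, spares, and the replacement threshold.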


6027-3090 [E] Enclosure control command failed for pdisk pdiskName of RG recoveryGroupName in location locationCode: err errorNum. Examine mmfs log for tsctlenclslot, tsonosdisk and tsoffosdisk errors.
Explanation: A command used to control a disk enclosure slot failed.
User response: Examine the mmfs log files for more specific error messages from the tsctlenclslot, tsonosdisk, and tsoffosdisk commands.

6027-3091 [W] A command to control the disk enclosure failed with error code errorNum. As a result, enclosure indicator lights may not have changed to the correct states. Examine the mmfs log on nodes attached to the disk enclosure for messages from the tsctlenclslot, tsonosdisk, and tsoffosdisk commands for more detailed information.
Explanation: A command used to control disk enclosure lights and carrier locks failed. This is not a fatal error.
User response: Examine the mmfs log files on nodes attached to the disk enclosure for error messages from the tsctlenclslot, tsonosdisk, and tsoffosdisk commands for more detailed information. If the carrier failed to unlock, either retry the command or use the manual override.

6027-3092 [I] Recovery group recoveryGroupName assignment delay delaySeconds seconds for safe recovery.
Explanation: The recovery group must wait before meta-data recovery. Prior disk lease for the failing manager must first expire.
User response: None.

6027-3093 [E] Checksum granularity must be number or number for log vdisks.
Explanation: The only allowable values for the checksumGranularity attribute of a log vdisk are 512 and 4K.
User response: Change the checksumGranularity attribute of the vdisk, then retry the command.

6027-3094 [E] Due to the attributes of other log vdisks, the checksum granularity of this vdisk must be number.
Explanation: The checksum granularities of the log tip vdisk, the log tip backup vdisk, and the log home vdisk must all be the same.
User response: Change the checksumGranularity attribute of the new log vdisk to the indicated value, then retry the command.

6027-3095 [E] The specified declustered array name (declusteredArrayName) for the new pdisk pdiskName must be declusteredArrayName.
Explanation: When replacing an existing pdisk with a new pdisk, the declustered array name for the new pdisk must match the declustered array name for the existing pdisk.
User response: Change the specified declustered array name to the indicated value, then run the command again.

6027-3096 [E] Internal error encountered in NSD-RAID command: err=errorNum.
Explanation: An unexpected GPFS NSD-RAID internal error occurred.
User response: Contact the IBM Support Center.

6027-3097 [E] Missing or invalid pdisk name (pdiskName).
Explanation: A pdisk name specified in an mmcrrecoverygroup or mmaddpdisk command is not valid.
User response: Specify a pdisk name that is 63 characters or less. Valid characters are: a to z, A to Z, 0 to 9, and underscore ( _ ).

6027-3098 [E] Pdisk name pdiskName is already in use in recovery group recoveryGroupName.
Explanation: The pdisk name already exists in the specified recovery group.
User response: Choose a pdisk name that is not already in use.

6027-3099 [E] Device with path(s) pathName is specified for both new pdisks pdiskName and pdiskName.
Explanation: The same device is specified for more than one pdisk in the stanza file. The device can have multiple paths, which are shown in the error message.
User response: Specify different devices for different new pdisks, respectively, and run the command again.

6027-3800 [E] Device with path(s) pathName for new pdisk pdiskName is already in use by pdisk pdiskName of recovery group recoveryGroupName.
Explanation: The device specified for a new pdisk is already being used by an existing pdisk. The device can have multiple paths, which are shown in the error message.
User response: Specify an unused device for the pdisk and run the command again.
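Messages 6027-3093 and 6027-3094 refer to the checksumGranularity attribute of a log vdisk, which is set in the vdisk stanza file passed to mmcrvdisk with its -F option. The fragment below is a hypothetical illustration only: the vdisk, recovery group, and declustered array names are placeholders, and the remaining attribute values must be chosen to match your configuration and the mmcrvdisk documentation for your release.

   %vdisk: vdiskName=rgL_LOGTIP
     rg=rgL da=NVR
     blocksize=2m size=48m
     raidCode=2WayReplication
     diskUsage=vdiskLogTip
     checksumGranularity=512

A stanza file containing fragments like this would then be supplied as: mmcrvdisk -F /tmp/vdisk.stanza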


6027-3801 [E] The checksum granularity for log vdisks in declustered array declusteredArrayName of RG recoveryGroupName must be at least number bytes.
Explanation: Use a checksum granularity that is not smaller than the minimum value given. You can use the mmlspdisk command to view the logical block sizes of the pdisks in this array to identify which pdisks are driving the limit.
User response: Change the checksumGranularity attribute of the new log vdisk to the indicated value, and then retry the command.

6027-3802 [E] Pdisk pdiskName of RG recoveryGroupName has a logical block size of number bytes; the maximum logical block size for pdisks in declustered array declusteredArrayName cannot exceed the log checksum granularity of number bytes.
Explanation: Logical block size of pdisks added to this declustered array must not be larger than any log vdisk's checksum granularity.
User response: Use pdisks with equal or smaller logical block size than the log vdisk's checksum granularity.

6027-3803 [E] NSD format version 2 feature must be enabled before being changed. Upgrade recovery group version to at least recoveryGroupVersion to enable it.
Explanation: NSD format version 2 feature is not supported in the current recovery group version.
User response: Apply the recovery group version recommended in the error message and retry the command.

6027-3804 [W] Skipping upgrade of pdisk pdiskName because the disk capacity of number bytes is less than the number bytes required for the new format.
Explanation: The existing format of the indicated pdisk is not compatible with NSD V2 descriptors.
User response: A complete format of the declustered array is required in order to upgrade to NSD V2.

6027-3805 [E] NSD format version 2 feature is not supported by the current recovery group version. A recovery group version of at least rgVersion is required for this feature.
Explanation: NSD format version 2 feature is not supported in the current recovery group version.
User response: Apply the recovery group version recommended in the error message and retry the command.

6027-3806 [E] The device given for pdisk pdiskName has a logical block size of logicalBlockSize bytes, which is not supported by the recovery group version.
Explanation: The current recovery group version does not support disk drives with the indicated logical block size.
User response: Use a different disk device or upgrade the recovery group version and retry the command.

6027-3807 [E] NSD version 1 specified for pdisk pdiskName requires a disk with a logical block size of 512 bytes. The supplied disk has a block size of logicalBlockSize bytes. For this disk, you must use at least NSD version 2.
Explanation: Requested logical block size is not supported by NSD format version 1.
User response: Correct the input file to use a different disk or specify a higher NSD format version.

6027-3808 [E] Pdisk pdiskName must have a capacity of at least number bytes for NSD version 2.
Explanation: The pdisk must be at least as large as the indicated minimum size in order to be added to the declustered array.
User response: Correct the input file and retry the command.

6027-3809 [I] Pdisk pdiskName can be added as NSD version 1.
Explanation: The pdisk has enough space to be configured as NSD version 1.
User response: Specify NSD version 1 for this disk.

6027-3810 [W] Skipping the upgrade of pdisk pdiskName because no I/O paths are currently available.
Explanation: There is no I/O path available to the indicated pdisk.
User response: Try running the command again after repairing the broken I/O path to the specified pdisk.
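Message 6027-3801 points to the mmlspdisk command for identifying which pdisks drive the checksum granularity limit, and messages 6027-3802 and 6027-3806 through 6027-3808 also depend on pdisk properties such as logical block size and capacity. As a sketch, assuming a recovery group named rgL and a declustered array named DA1 (both placeholders), the pdisks and their properties can be displayed with:

   mmlspdisk rgL --declustered-array DA1

The output contains one section per pdisk; check the logical block size and capacity values when interpreting these messages.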


6027-3811 [E] Unable to action vdisk MDI.
Explanation: The tscrvdisk command could not create or write the necessary vdisk MDI.
User response: Retry the command.

6027-3812 [I] Log group logGroupName assignment delay delaySeconds seconds for safe recovery.
Explanation: The recovery group configuration manager must wait. Prior disk lease for the failing manager must expire before assigning a new worker to the log group.
User response: None.

6027-3813 [A] Recovery group recoveryGroupName could not be served by node nodeName.
Explanation: The recovery group configuration manager could not perform a node assignment to manage the recovery group.
User response: Check whether there are sufficient nodes and whether errors are recorded in the recovery group event log.

6027-3814 [A] Log group logGroupName could not be served by node nodeName.
Explanation: The recovery group configuration manager could not perform a node assignment to manage the log group.
User response: Check whether there are sufficient nodes and whether errors are recorded in the recovery group event log.

6027-3815 [E] Erasure code not supported by this recovery group version.
Explanation: Vdisks with 4+2P and 4+3P erasure codes are not supported by all recovery group versions.
User response: Upgrade the recovery group to a later version using the --version option of the mmchrecoverygroup command.

6027-3816 [E] Invalid declustered array name (declusteredArrayName).
Explanation: A declustered array name given in the mmcrrecoverygroup or mmaddpdisk command is invalid.
User response: Specify a declustered array name of up to 63 characters, using only the characters a-z, A-Z, 0-9, and underscore.

6027-3817 [E] Invalid log group name (logGroupName).
Explanation: A log group name given in the mmcrrecoverygroup or mmaddpdisk command is invalid.
User response: Specify a log group name of up to 63 characters, using only the characters a-z, A-Z, 0-9, and underscore.

6027-3818 [E] Cannot create log group logGroupName; there can be at most number log groups in a recovery group.
Explanation: The number of log groups allowed in a recovery group has been exceeded.
User response: Reduce the number of log groups in the input file and retry the command.

6027-3819 [I] Recovery group recoveryGroupName delay delaySeconds seconds for assignment.
Explanation: The recovery group configuration manager must wait before assigning a new manager to the recovery group.
User response: None.

6027-3820 [E] Specifying canHoldVCD not supported by this recovery group version.
Explanation: The ability to override the default decision of whether a declustered array is allowed to hold vdisk configuration data is not supported by all recovery group versions.
User response: Upgrade the recovery group to a later version using the --version option of the mmchrecoverygroup command.

6027-3821 [E] Cannot set canHoldVCD=yes for small declustered arrays.
Explanation: Declustered arrays with fewer than 9+vcdSpares disks cannot hold vdisk configuration data.
User response: Add more disks to the declustered array or do not specify canHoldVCD=yes.

6027-3822 [I] Recovery group recoveryGroupName working index delay delaySeconds seconds for safe recovery.
Explanation: Prior disk lease for the workers must expire before recovering the working index metadata.
User response: None.


6027-3823 [E] Unknown node nodeName in the recovery group configuration.
Explanation: A node name does not exist in the recovery group configuration manager.
User response: Check for damage to the mmsdrfs file.

6027-3824 [E] The defined server serverName for recovery group recoveryGroupName could not be resolved.
Explanation: The host name of the recovery group server could not be resolved by gethostbyname().
User response: Fix host name resolution.

6027-3825 [E] The defined server serverName for node class nodeClassName could not be resolved.
Explanation: The host name of the recovery group server could not be resolved by gethostbyname().
User response: Fix host name resolution.

6027-3826 [A] Error reading volume identifier for recovery group recoveryGroupName from configuration file.
Explanation: The volume identifier for the named recovery group could not be read from the mmsdrfs file. This should never occur.
User response: Check for damage to the mmsdrfs file.

6027-3827 [A] Error reading volume identifier for vdisk vdiskName from configuration file.
Explanation: The volume identifier for the named vdisk could not be read from the mmsdrfs file. This should never occur.
User response: Check for damage to the mmsdrfs file.

6027-3828 [E] Vdisk vdiskName could not be associated with its recovery group recoveryGroupName and will be ignored.
Explanation: The named vdisk cannot be associated with its recovery group.
User response: Check for damage to the mmsdrfs file.

6027-3829 [E] A server list must be provided.
Explanation: No server list is specified.
User response: Specify a list of valid servers.

6027-3830 [E] Too many servers specified.
Explanation: An input node list has too many nodes specified.
User response: Verify the list of nodes and shorten the list to the supported number.

6027-3831 [E] A vdisk name must be provided.
Explanation: A vdisk name is not specified.
User response: Specify a vdisk name.

6027-3832 [E] A recovery group name must be provided.
Explanation: A recovery group name is not specified.
User response: Specify a recovery group name.

6027-3833 [E] Recovery group recoveryGroupName does not have an active root log group.
Explanation: The root log group must be active before the operation is permitted.
User response: Retry the command after the recovery group becomes fully active.

6027-3836 [I] Cannot retrieve MSID for device: devFileName.
Explanation: Command usage message for tsgetmsid.
User response: None.

6027-3837 [E] Error creating worker vdisk.
Explanation: The tscrvdisk command could not initialize the vdisk at the worker node.
User response: Retry the command.

6027-3838 [E] Unable to write new vdisk MDI.
Explanation: The tscrvdisk command could not write the necessary vdisk MDI.
User response: Retry the command.

6027-3839 [E] Unable to write update vdisk MDI.
Explanation: The tscrvdisk command could not write the necessary vdisk MDI.
User response: Retry the command.

6027-3840 [E] Unable to delete worker vdisk vdiskName err=errorNum.
Explanation: The specified vdisk worker object could not be deleted.
User response: Retry the command with a valid vdisk name.
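Messages 6027-3824 and 6027-3825 report that a defined server name cannot be resolved. A quick, generic check (not specific to ESS) is to confirm that the name resolves on the node where the command fails; the host name below is a placeholder:

   getent hosts essio1.example.com

If the lookup returns nothing, correct the /etc/hosts entries or DNS configuration for the server and retry the command.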


6027-3841 [E] Unable to create new vdisk MDI.
Explanation: The tscrvdisk command could not create the necessary vdisk MDI.
User response: Retry the command.

6027-3843 [E] Error returned from node nodeName when preparing new pdisk pdiskName of RG recoveryGroupName for use: err errorNum
Explanation: The system received an error from the given node when trying to prepare a new pdisk for use.
User response: Retry the command.

6027-3844 [E] Unable to prepare new pdisk pdiskName of RG recoveryGroupName for use: exit status exitStatus.
Explanation: The system received an error from the tspreparenewpdiskforuse script when trying to prepare a new pdisk for use.
User response: Check the new disk and retry the command.

6027-3845 [E] Unrecognized pdisk state: pdiskState.
Explanation: The given pdisk state name is invalid.
User response: Use a valid pdisk state name.

6027-3846 [E] Pdisk state change pdiskState is not permitted.
Explanation: An attempt was made to use the mmchpdisk command either to change an internal pdisk state, or to create an invalid combination of states.
User response: Some internal pdisk state flags can be set indirectly by running other commands. For example, the deleting state can be set by using the mmdelpdisk command.

6027-3847 [E] The serviceDrain state feature must be enabled to use this command. Upgrade the recovery group version to at least version to enable it.
Explanation: The mmchpdisk command option --begin-service-drain was issued, but there are back-level nodes in the cluster that do not support this action.
User response: Upgrade the nodes in the cluster to at least the specified version and run the command again.

6027-3848 [E] The simulated dead and failing state feature must be enabled to use this command. Upgrade the recovery group version to at least version to enable it.
Explanation: The mmchpdisk command option --simulate-dead or --simulate-failing was issued, but there are back-level nodes in the cluster that do not support this action.
User response: Upgrade the nodes in the cluster to at least the specified version and run the command again.

6027-3849 [E] The pdisk pdiskName of recovery group recoveryGroupName could not be revived. Pdisk state is pdiskState.
Explanation: An mmchpdisk --revive command was unable to bring a pdisk back online.
User response: If the state is missing, restore connectivity to the disk. If the disk is in a failed state, replace the pdisk. A pdisk with the status dead, readOnly, failing, or slot is considered failed.
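Message 6027-3849 is issued when an mmchpdisk --revive attempt cannot bring a pdisk back online. After restoring connectivity (or replacing the disk, as the user response describes), the revive can be retried; in this sketch the recovery group and pdisk names are placeholders:

   mmchpdisk rgL --pdisk e1d1s01 --revive

The resulting pdisk state can then be confirmed with mmlspdisk rgL --pdisk e1d1s01.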


Notices
This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries.
Consult your local IBM representative for information on the products and services currently available in
your area. Any reference to an IBM product, program, or service is not intended to state or imply that
only that IBM product, program, or service may be used. Any functionally equivalent product, program,
or service that does not infringe any IBM intellectual property right may be used instead. However, it is
the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or
service.

IBM may have patents or pending patent applications covering subject matter described in this
document. The furnishing of this document does not grant you any license to these patents. You can send
license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.

For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property
Department in your country or send inquiries, in writing, to:

Intellectual Property Licensing
Legal and Intellectual Property Law
IBM Japan Ltd.
19-21, Nihonbashi-Hakozakicho, Chuo-ku
Tokyo 103-8510, Japan

The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law:

INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied
warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically
made to the information herein; these changes will be incorporated in new editions of the publication.
IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this
publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in
any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of
the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without
incurring any obligation to you.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the
exchange of information between independently created programs and other programs (including this
one) and (ii) the mutual use of the information which has been exchanged, should contact:

IBM Corporation
Dept. 30ZA/Building 707
Mail Station P300
2455 South Road,
Poughkeepsie, NY 12601-5400
U.S.A.

Such information may be available, subject to appropriate terms and conditions, including in some cases,
payment or a fee.

The licensed program described in this document and all licensed material available for it are provided
by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or
any equivalent agreement between us.

Any performance data contained herein was determined in a controlled environment. Therefore, the
results obtained in other operating environments may vary significantly. Some measurements may have
been made on development-level systems and there is no guarantee that these measurements will be the
same on generally available systems. Furthermore, some measurements may have been estimated through
extrapolation. Actual results may vary. Users of this document should verify the applicable data for their
specific environment.

Information concerning non-IBM products was obtained from the suppliers of those products, their
published announcements or other publicly available sources. IBM has not tested those products and
cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM
products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of
those products.

This information contains examples of data and reports used in daily business operations. To illustrate
them as completely as possible, the examples include the names of individuals, companies, brands, and
products. All of these names are fictitious and any similarity to the names and addresses used by an
actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE:

This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs
in any form without payment to IBM, for the purposes of developing, using, marketing or distributing
application programs conforming to the application programming interface for the operating platform for
which the sample programs are written. These examples have not been thoroughly tested under all
conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these
programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be
liable for any damages arising out of your use of the sample programs.

If you are viewing this information softcopy, the photographs and color illustrations may not appear.

Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be
trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at
“Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.

Intel is a trademark of Intel Corporation or its subsidiaries in the United States and other countries.

Java™ and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or
its affiliates.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.



Microsoft, Windows, and Windows NT are trademarks of Microsoft Corporation in the United States,
other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Glossary
This glossary provides terms and definitions for managers. The cluster manager is the
the ESS solution. node with the lowest node number
among the quorum nodes that are
The following cross-references are used in this operating at a particular time.
glossary:
compute node
v See refers you from a non-preferred term to the A node with a mounted GPFS file system
preferred term or from an abbreviation to the that is used specifically to run a customer
spelled-out form. job. ESS disks are not directly visible from
v See also refers you to a related or contrasting and are not managed by this type of
term. node.
CPC See central processor complex (CPC).
For other terms and definitions, see the IBM
Terminology website (opens in new window):
D
http://www.ibm.com/software/globalization/ DA See declustered array (DA).
terminology
datagram
A basic transfer unit associated with a
B
packet-switched network.
building block
DCM See drawer control module (DCM).
A pair of servers with shared disk
enclosures attached. declustered array (DA)
A disjoint subset of the pdisks in a
BOOTP
recovery group.
See Bootstrap Protocol (BOOTP).
dependent fileset
Bootstrap Protocol (BOOTP)
A fileset that shares the inode space of an
A computer networking protocol that is
existing independent fileset.
used in IP networks to automatically
assign an IP address to network devices DFM See direct FSP management (DFM).
from a configuration server.
DHCP See Dynamic Host Configuration Protocol
(DHCP).
C
direct FSP management (DFM)
CEC See central processor complex (CPC).
The ability of the xCAT software to
central electronic complex (CEC) communicate directly with the Power
See central processor complex (CPC). Systems server's service processor without
the use of the HMC for management.
central processor complex (CPC)
A physical collection of hardware that drawer control module (DCM)
consists of channels, timers, main storage, Essentially, a SAS expander on a storage
and one or more central processors. enclosure drawer.
cluster Dynamic Host Configuration Protocol (DHCP)
A loosely-coupled collection of A standardized network protocol that is
independent systems, or nodes, organized used on IP networks to dynamically
into a network for the purpose of sharing distribute such network configuration
resources and communicating with each parameters as IP addresses for interfaces
other. See also GPFS cluster. and services.
cluster manager
E
The node that monitors node status using
disk leases, detects failures, drives Elastic Storage Server (ESS)
recovery, and selects file system A high-performance, GPFS NSD solution
made up of one or more building blocks failure group
that runs on IBM Power Systems servers. A collection of disks that share common
The ESS software runs on ESS nodes - access paths or adapter connection, and
management server nodes and I/O server could all become unavailable through a
nodes. single hardware failure.
ESS Management Server (EMS) FEK See file encryption key (FEK).
An xCAT server is required to discover
file encryption key (FEK)
the I/O server nodes (working with the
A key used to encrypt sectors of an
HMC), provision the operating system
individual file. See also encryption key.
(OS) on the I/O server nodes, and deploy
the ESS software on the management file system
node and I/O server nodes. One The methods and data structures used to
management server is required for each control how data is stored and retrieved.
ESS system composed of one or more
file system descriptor
building blocks.
A data structure containing key
encryption key information about a file system. This
A mathematical value that allows information includes the disks assigned to
components to verify that they are in the file system (stripe group), the current
communication with the expected server. state of the file system, and pointers to
Encryption keys are based on a public or key files such as quota files and log files.
private key pair that is created during the
file system descriptor quorum
installation process. See also file encryption
The number of disks needed in order to
key (FEK), master encryption key (MEK).
write the file system descriptor correctly.
ESS See Elastic Storage Server (ESS).
file system manager
environmental service module (ESM) The provider of services for all the nodes
Essentially, a SAS expander that attaches using a single file system. A file system
to the storage enclosure drives. In the manager processes changes to the state or
case of multiple drawers in a storage description of the file system, controls the
enclosure, the ESM attaches to drawer regions of disks that are allocated to each
control modules. node, and controls token management
and quota management.
ESM See environmental service module (ESM).
fileset A hierarchical grouping of files managed
Extreme Cluster/Cloud Administration Toolkit
as a unit for balancing workload across a
(xCAT)
cluster. See also dependent fileset,
Scalable, open-source cluster management
independent fileset.
software. The management infrastructure
of ESS is deployed by xCAT. fileset snapshot
A snapshot of an independent fileset plus
F all dependent filesets.
failback flexible service processor (FSP)
Cluster recovery from failover following Firmware that provices diagnosis,
repair. See also failover. initialization, configuration, runtime error
detection, and correction. Connects to the
failover
HMC.
(1) The assumption of file system duties
by another node when a node fails. (2) FQDN
The process of transferring all control of See fully-qualified domain name (FQDN).
the ESS to a single cluster in the ESS
FSP See flexible service processor (FSP).
when the other clusters in the ESS fails.
See also cluster. (3) The routing of all fully-qualified domain name (FQDN)
transactions to a second controller when The complete domain name for a specific
the first controller fails. See also cluster. computer, or host, on the Internet. The
FQDN consists of two parts: the hostname
and the domain name.



G IP See Internet Protocol (IP).
GPFS cluster IP over InfiniBand (IPoIB)
A cluster of nodes defined as being Provides an IP network emulation layer
available for use by GPFS file systems. on top of InfiniBand RDMA networks,
which allows existing applications to run
GPFS portability layer
over InfiniBand networks unmodified.
The interface module that each
installation must build for its specific IPoIB See IP over InfiniBand (IPoIB).
hardware platform and Linux
ISKLM
distribution.
See IBM Security Key Lifecycle Manager
GPFS Storage Server (GSS) (ISKLM).
A high-performance, GPFS NSD solution
made up of one or more building blocks J
that runs on System x servers.
JBOD array
GSS See GPFS Storage Server (GSS). The total collection of disks and
enclosures over which a recovery group
H pair is defined.
Hardware Management Console (HMC)
K
Standard interface for configuring and
operating partitioned (LPAR) and SMP kernel The part of an operating system that
systems. contains programs for such tasks as
input/output, management and control of
HMC See Hardware Management Console (HMC).
hardware, and the scheduling of user
tasks.
I
IBM Security Key Lifecycle Manager (ISKLM) L
For GPFS encryption, the ISKLM is used
LACP See Link Aggregation Control Protocol
as an RKM server to store MEKs.
(LACP).
independent fileset
Link Aggregation Control Protocol (LACP)
A fileset that has its own inode space.
Provides a way to control the bundling of
indirect block several physical ports together to form a
A block that contains pointers to other single logical channel.
blocks.
logical partition (LPAR)
inode The internal structure that describes the A subset of a server's hardware resources
individual files in the file system. There is virtualized as a separate computer, each
one inode for each file. with its own operating system. See also
node.
inode space
A collection of inode number ranges LPAR See logical partition (LPAR).
reserved for an independent fileset, which
enables more efficient per-fileset M
functions.
management network
Internet Protocol (IP) A network that is primarily responsible
The primary communication protocol for for booting and installing the designated
relaying datagrams across network server and compute nodes from the
boundaries. Its routing function enables management server.
internetworking and essentially
management server (MS)
establishes the Internet.
An ESS node that hosts the ESS GUI and
I/O server node xCAT and is not connected to storage. It
An ESS node that is attached to the ESS can be part of a GPFS cluster. From a
storage enclosures. It is the NSD server system management perspective, it is the
for the GPFS cluster.

central coordinator of the cluster. It also node number
serves as a client node in an ESS building A number that is generated and
block. maintained by IBM Spectrum Scale as the
cluster is created, and as nodes are added
master encryption key (MEK)
to or deleted from the cluster.
A key that is used to encrypt other keys.
See also encryption key. node quorum
The minimum number of nodes that must
maximum transmission unit (MTU)
be running in order for the daemon to
The largest packet or frame, specified in
start.
octets (eight-bit bytes), that can be sent in
a packet- or frame-based network, such as node quorum with tiebreaker disks
the Internet. The TCP uses the MTU to A form of quorum that allows IBM
determine the maximum size of each Spectrum Scale to run with as little as one
packet in any transmission. quorum node available, as long as there is
access to a majority of the quorum disks.
MEK See master encryption key (MEK).
non-quorum node
metadata
A node in a cluster that is not counted for
A data structure that contains access
the purposes of quorum determination.
information about file data. Such
structures include inodes, indirect blocks,
O
and directories. These data structures are
not accessible to user applications. OFED See OpenFabrics Enterprise Distribution
(OFED).
MS See management server (MS).
OpenFabrics Enterprise Distribution (OFED)
MTU See maximum transmission unit (MTU).
An open-source software stack includes
software drivers, core kernel code,
N
middleware, and user-level interfaces.
Network File System (NFS)
A protocol (developed by Sun P
Microsystems, Incorporated) that allows
pdisk A physical disk.
any host in a network to gain access to
another host or netgroup and their file PortFast
directories. A Cisco network function that can be
configured to resolve any problems that
Network Shared Disk (NSD)
could be caused by the amount of time
A component for cluster-wide disk
STP takes to transition ports to the
naming and access.
Forwarding state.
NSD volume ID
A unique 16-digit hexadecimal number R
that is used to identify and access all
RAID See redundant array of independent disks
NSDs.
(RAID).
node An individual operating-system image
RDMA
within a cluster. Depending on the way in
See remote direct memory access (RDMA).
which the computer system is partitioned,
it can contain one or more nodes. In a redundant array of independent disks (RAID)
Power Systems environment, synonymous A collection of two or more disk physical
with logical partition. drives that present to the host an image
of one or more logical disk drives. In the
node descriptor
event of a single physical device failure,
A definition that indicates how IBM
the data can be read or regenerated from
Spectrum Scale uses a node. Possible
the other disk drives in the array due to
functions include: manager node, client
data redundancy.
node, quorum node, and non-quorum
node. recovery
The process of restoring access to file



system data when a failure has occurred. Ethernet local-area network. The basic
Recovery can involve reconstructing data function of STP is to prevent bridge loops
or providing alternative routing through a and the broadcast radiation that results
different server. from them.
recovery group (RG) SSH See secure shell (SSH).
A collection of disks that is set up by IBM
STP See Spanning Tree Protocol (STP).
Spectrum Scale RAID, in which each disk
is connected physically to two servers: a symmetric multiprocessing (SMP)
primary server and a backup server. A computer architecture that provides fast
performance by making multiple
remote direct memory access (RDMA)
processors available to complete
A direct memory access from the memory
individual processes simultaneously.
of one computer into that of another
without involving either one's operating
T
system. This permits high-throughput,
low-latency networking, which is TCP See Transmission Control Protocol (TCP).
especially useful in massively-parallel
Transmission Control Protocol (TCP)
computer clusters.
A core protocol of the Internet Protocol
RGD See recovery group data (RGD). Suite that provides reliable, ordered, and
error-checked delivery of a stream of
remote key management server (RKM server)
octets between applications running on
A server that is used to store master
hosts communicating over an IP network.
encryption keys.
RG See recovery group (RG). V
recovery group data (RGD) VCD See vdisk configuration data (VCD).
Data that is associated with a recovery
vdisk A virtual disk.
group.
vdisk configuration data (VCD)
RKM server
Configuration data that is associated with
See remote key management server (RKM
a virtual disk.
server).

S X
xCAT See Extreme Cluster/Cloud Administration
SAS See Serial Attached SCSI (SAS).
Toolkit.
secure shell (SSH)
A cryptographic (encrypted) network
protocol for initiating text-based shell
sessions securely on remote computers.
Serial Attached SCSI (SAS)
A point-to-point serial protocol that
moves data to and from such computer
storage devices as hard drives and tape
drives.
service network
A private network that is dedicated to
managing POWER8 servers. Provides
Ethernet-based connectivity among the
FSP, CPC, HMC, and management server.
SMP See symmetric multiprocessing (SMP).
Spanning Tree Protocol (STP)
A network protocol that ensures a
loop-free topology for any bridged

Index
Special characters directories
/tmp/mmfs 39
/tmp/mmfs directory 39 disks
diagnosis 44
hardware service 47
A hospital 44
array, declustered maintaining 43
background tasks 45 replacement 46
replacing failed 47, 66
DMP 72
B replace disks 72
update drive firmware 73
back up data 27 update enclosure firmware 73
background tasks 45 update host-adapter firmware 74
best practices for troubleshooting 27, 31, 33 documentation
on web vii
drive firmware
C updating 43
call home
5146 system 1
5148 System 1 E
background 1 Electronic Service Agent
overview 1 activation 3
problem report 7 configuration 4
problem report details 9 Installing 2
Call home login 3
monitoring 11 Reinstalling 12
Post setup activities 14 Uninstalling 12
test 12 enclosure components
upload data 11 replacing failed 52
checksum enclosure firmware
data 46 updating 43
commands errpt command 39
errpt 39 events 77
gpfs.snap 39
lslpp 39
mmlsdisk 40
mmlsfs 40 F
rpm 39 failed disks, replacing 47, 66
comments ix failed enclosure components, replacing 52
components of storage enclosures failover, server 46
replacing failed 52 files
contacting IBM 41 mmfs.log 39
firmware
updating 43
D
data checksum 46
declustered array G
background tasks 45 getting started with troubleshooting 27
diagnosis, disk 44 GPFS
directed maintenance procedure 72 events 77
increase fileset space 75 RAS events 77
replace disks 72 GPFS log 39
start gpfs daemon 74 gpfs.snap command 39
start NSD 74 GUI
start performance monitoring collector service 75 directed maintenance procedure 72
start performance monitoring sensor service 76 DMP 72
synchronize node clocks 75 logs 35
update drive firmware 73 logsIssues with loading GUI 35, 37
update enclosure firmware 73
update host-adapter firmware 74



H P
hardware service 47 patent information 97
hospital, disk 44 PMR 41
host adapter firmware preface vii
updating 43 problem determination
documentation 39
reporting a problem to IBM 39
I Problem Management Record 41
I/O node failure
restore 21
IBM Elastic Storage Server R
best practices for troubleshooting 31, 33 RAS events 77
IBM Spectrum Scale rebalance, background task 45
back up data 27 rebuild-1r, background task 45
best practices for troubleshooting 27, 31 rebuild-2r, background task 45
call home 1 rebuild-critical, background task 45
monitoring 11 rebuild-offline, background task 45
Post setup activities 14 recovery groups
test 12 server failover 46
upload data 11 repair-RGD/VCD, background task 45
Electronic Service Agent 2, 12 replace disks 72
ESA replacement, disk 46
activation 3 replacing failed disks 47, 66
configuration 4 replacing failed storage enclosure components 52
create problem report 7, 9 report problems 29
login 3 reporting a problem to IBM 39
problem details 9 resolve events 28
events 77 resources
RAS events 77 on web vii
troubleshooting 27, 35, 37 Restore
best practices 28, 29 I/O node 21
getting started 27 rpm command 39
warranty and maintenance 29
information overview vii
S
scrub, background task 45
L sda
license inquiries 97 NVR Partitions 17
lslpp command 39 server failover 46
service
reporting a problem to IBM 39
M service, hardware 47
severity tags
maintenance
messages 77
disks 43
submitting ix
message severity tags 77
support notifications 28
mmfs.log 39
mmlsdisk command 40
mmlsfs command 40
T
tasks, background 45
N the IBM Support Center 41
trademarks 98
node
troubleshooting
crash 41
best practices 27, 31, 33
hang 41
report problems 29
notices 97
resolve events 28
NVR Partitions 17
support notifications 28
NVRAM pdisks 19
update software 28
recreate 19
call home 1, 2, 12
call home data upload 11
call home monitoring 11
O Electronic Service Agent
overview problem details 9
of information vii problem report creation 7
ESA 2, 3, 4, 12



troubleshooting (continued)
getting started 27
Post setup activities for call home 14
testing call home 12
warranty and maintenance 29
Troubleshooting
GUI 35
Recovery Grooups 37

U
update drive firmware 73
update enclosure firmware 73
update host-adapter firmware 74

V
vdisks
data checksum 46

W
warranty and maintenance 29
web
documentation vii
resources vii

IBM®

Printed in USA

GC27-9272-00
