FAS2600 Maintain
Boot media
The boot media stores a primary and secondary set of boot image files that the system uses when it boots.
Caching module
You must replace the controller’s caching module when your system registers a single AutoSupport (ASUP)
message that the module has gone offline.
Chassis
The chassis is the physical enclosure housing all the controller components such as the controller/CPU unit,
power supply, and I/O.
Controller
A controller consists of a board, firmware, and software. It controls the drives and implements the ONTAP
functions.
DIMM
You must replace a DIMM (dual in-line memory module) when a memory mismatch is present, or you have a
failed DIMM.
Drive
A drive is a device that provides the physical storage media for data.
NVEM Battery
A battery is included with a controller and preserves cached data if the AC power fails.
Power supply
A power supply provides a redundant power source in a controller shelf.
Boot media
Overview of boot media replacement - FAS2600
The boot media stores a primary and secondary set of system (boot image) files that the
system uses when it boots. Depending on your network configuration, you can perform
either a nondisruptive or disruptive replacement.
You must have a USB flash drive, formatted to FAT32, with the appropriate amount of storage to hold the
image_xxx.tgz file.
You also must copy the image_xxx.tgz file to the USB flash drive for later use in this procedure.
• The nondisruptive and disruptive methods for replacing a boot media both require you to restore the var
file system:
◦ For nondisruptive replacement, the HA pair must be connected to a network to restore the var file
system.
◦ For disruptive replacement, you do not need a network connection to restore the var file system, but
the process requires two reboots.
• You must replace the failed component with a replacement FRU component you received from your
provider.
• It is important that you apply the commands in these steps on the correct node:
◦ The impaired node is the node on which you are performing maintenance.
◦ The healthy node is the HA partner of the impaired node.
If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy
controller shows false for eligibility and health, you must correct the issue before shutting down the impaired
controller; see Synchronize a node with the cluster.
Steps
1. Check the status of the impaired controller:
◦ If the impaired controller is at the login prompt, log in as admin.
◦ If the impaired controller is at the LOADER prompt and is part of HA configuration, log in as admin on
the healthy controller.
◦ If the impaired controller is in a standalone configuration and at LOADER prompt, contact
mysupport.netapp.com.
2. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message:
system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh
The following AutoSupport message suppresses automatic case creation for two hours:
cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h
3. Check the version of ONTAP the system is running on the impaired controller if up, or on the partner
controller if the impaired controller is down, using the version -v command:
◦ If <lno-DARE> or <1Ono-DARE> is displayed in the command output, the system does not support
NVE; proceed to shut down the controller.
◦ If <lno-DARE> is not displayed in the command output, and the system is running ONTAP 9.5 or earlier, go to
Option 1: Check NVE or NSE on systems running ONTAP 9.5 and earlier.
◦ If <lno-DARE> is not displayed in the command output, and the system is running ONTAP 9.6 or later,
go to Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later.
4. If the impaired controller is part of an HA configuration, disable automatic giveback from the healthy
controller: storage failover modify -node local -auto-giveback false or storage
failover modify -node local -auto-giveback-after-panic false
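Steps 2 and 3 above come down to building one command string and making a three-way choice. The following is a minimal Python sketch of that logic, not an ONTAP tool. It assumes the version -v output is available as text and that the ONTAP release is supplied separately as a (major, minor) tuple. The names maint_command and next_step are hypothetical.

```python
from typing import Tuple

# Markers that, per step 3, indicate the system does not support NVE.
NO_DARE_MARKERS = ("<lno-DARE>", "<1Ono-DARE>")


def maint_command(hours: int) -> str:
    """Build the AutoSupport command that suppresses case creation for `hours` hours."""
    if hours <= 0:
        raise ValueError("hours must be a positive integer")
    return f"system node autosupport invoke -node * -type all -message MAINT={hours}h"


def next_step(version_output: str, ontap_release: Tuple[int, int]) -> str:
    """Return the next action from step 3.

    version_output: text captured from `version -v` (assumed to be supplied by the operator).
    ontap_release:  (major, minor), for example (9, 5); an assumption of this sketch.
    """
    if any(marker in version_output for marker in NO_DARE_MARKERS):
        return "shut down the controller"
    if ontap_release <= (9, 5):
        return "Option 1"
    return "Option 2"


if __name__ == "__main__":
    print(maint_command(2))
    print(next_step("example output without the marker", (9, 6)))
```

The sketch only checks for the marker strings; it does not attempt to parse the rest of the version -v output.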
Option 1: Check NVE or NSE on systems running ONTAP 9.5 and earlier
Before shutting down the impaired controller, you need to check whether the system has either NetApp Volume
Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
Steps
1. Connect the console cable to the impaired controller.
2. Check whether NVE is configured for any volumes in the cluster: volume show -is-encrypted true
If any volumes are listed in the output, NVE is configured and you need to verify the NVE configuration. If
no volumes are listed, check whether NSE is configured.
Verify NVE configuration
Steps
1. Display the key IDs of the authentication keys that are stored on the key management servers: security
key-manager query
◦ If the Restored column displays yes and all key managers display available, it’s safe to shut down
the impaired controller.
◦ If the Restored column displays anything other than yes, or if any key manager displays
unavailable, you need to complete some additional steps.
◦ If you see the message This command is not supported when onboard key management is enabled,
you need to complete some other additional steps.
2. If the Restored column displayed anything other than yes, or if any key manager displayed
unavailable:
a. Retrieve and restore all authentication keys and associated key IDs: security key-manager
restore -address *
If the command fails, contact NetApp Support at mysupport.netapp.com.
b. Verify that the Restored column displays yes for all authentication keys and that all key managers
display available: security key-manager query
c. Shut down the impaired controller.
3. If you saw the message This command is not supported when onboard key management is enabled,
display the keys stored in the onboard key manager: security key-manager key show -detail
a. If the Restored column displays yes, manually back up the onboard key management information:
▪ Go to advanced privilege mode and enter y when prompted to continue: set -priv advanced
▪ Enter the command to display the OKM backup information: security key-manager backup
show
▪ Copy the contents of the backup information to a separate file or your log file. You’ll need it in
disaster scenarios where you might need to manually recover OKM.
▪ Return to admin mode: set -priv admin
▪ Shut down the impaired controller.
b. If the Restored column displays anything other than yes:
▪ Run the key-manager setup wizard: security key-manager setup -node
target/impaired node name
Enter the customer’s onboard key management passphrase at the prompt. If the
passphrase cannot be provided, contact mysupport.netapp.com.
▪ Verify that the Restored column displays yes for all authentication keys: security key-manager key show -detail
▪ Go to advanced privilege mode and enter y when prompted to continue: set -priv advanced
▪ Enter the command to display the OKM backup information: security key-manager backup
show
▪ Copy the contents of the backup information to a separate file or your log file. You’ll need it in
disaster scenarios where you might need to manually recover OKM.
▪ Return to admin mode: set -priv admin
▪ You can safely shut down the controller.
Verify NSE configuration
Steps
1. Display the key IDs of the authentication keys that are stored on the key management servers: security
key-manager query
◦ If the Restored column displays yes and all key managers display available, it’s safe to shut down
the impaired controller.
◦ If the Restored column displays anything other than yes, or if any key manager displays
unavailable, you need to complete some additional steps.
◦ If you see the message This command is not supported when onboard key management is enabled,
you need to complete some other additional steps.
2. If the Restored column displayed anything other than yes, or if any key manager displayed
unavailable:
a. Retrieve and restore all authentication keys and associated key IDs: security key-manager
restore -address *
If the command fails, contact NetApp Support at mysupport.netapp.com.
b. Verify that the Restored column displays yes for all authentication keys and that all key managers
display available: security key-manager query
c. Shut down the impaired controller.
3. If you saw the message This command is not supported when onboard key management is enabled,
display the keys stored in the onboard key manager: security key-manager key show -detail
a. If the Restored column displays yes, manually back up the onboard key management information:
▪ Go to advanced privilege mode and enter y when prompted to continue: set -priv advanced
▪ Enter the command to display the OKM backup information: security key-manager backup
show
▪ Copy the contents of the backup information to a separate file or your log file. You’ll need it in
disaster scenarios where you might need to manually recover OKM.
▪ Return to admin mode: set -priv admin
▪ Shut down the impaired controller.
b. If the Restored column displays anything other than yes:
▪ Run the key-manager setup wizard: security key-manager setup -node
target/impaired node name
Enter the customer’s OKM passphrase at the prompt. If the passphrase cannot be
provided, contact mysupport.netapp.com.
▪ Verify that the Restored column shows yes for all authentication keys: security key-manager key show -detail
▪ Go to advanced privilege mode and enter y when prompted to continue: set -priv advanced
▪ Enter the command to back up the OKM information: security key-manager backup show
Make sure that OKM information is saved in your log file. This information will be
needed in disaster scenarios where OKM might need to be manually recovered.
▪ Copy the contents of the backup information to a separate file or your log. You’ll need it in disaster
scenarios where you might need to manually recover OKM.
▪ Return to admin mode: set -priv admin
▪ You can safely shut down the controller.
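The Option 1 branches above can be summarized as a small decision function. This is a hedged Python sketch rather than a NetApp utility. It assumes the operator has already read the Restored values and key manager states from the command output and passes them in as plain strings. The name option1_plan and the placeholder <impaired node name> are hypothetical.

```python
from typing import List, Sequence

# Commands used to display the OKM backup information, as listed in the steps above.
BACKUP_STEPS = [
    "set -priv advanced",
    "security key-manager backup show",
    "set -priv admin",
]


def option1_plan(restored: Sequence[str],
                 key_managers: Sequence[str] = (),
                 onboard: bool = False) -> List[str]:
    """Return the commands to run before shutting down the impaired controller.

    restored:     values from the Restored column, one per authentication key.
    key_managers: state of each external key manager ("available" or "unavailable").
    onboard:      True if the query reported that onboard key management is enabled.
    An empty list means it is safe to shut down right away.
    """
    if not restored:
        raise ValueError("no Restored values supplied")
    all_restored = all(value == "yes" for value in restored)

    if onboard:
        # Onboard key manager: always back up; run the setup wizard first if needed.
        setup = [] if all_restored else [
            "security key-manager setup -node <impaired node name>",
            "security key-manager key show -detail",
        ]
        return setup + BACKUP_STEPS

    if all_restored and all(state == "available" for state in key_managers):
        return []
    # External key servers: restore the keys, then query again to verify.
    return ["security key-manager restore -address *", "security key-manager query"]
```

For example, option1_plan(["yes", "no"], ["available"]) returns the restore and query commands, because one key is not restored.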
Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later
Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume
Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
1. Verify whether NVE is in use for any volumes in the cluster: volume show -is-encrypted true
If any volumes are listed in the output, NVE is configured and you need to verify the NVE configuration. If
no volumes are listed, check whether NSE is configured and in use.
2. Verify whether NSE is configured and in use: storage encryption disk show
◦ If the command output lists the drive details with Mode & Key ID information, NSE is configured and in use,
and you need to verify the NSE configuration.
◦ If no disks are shown, NSE is not configured.
◦ If NVE and NSE are not configured, no drives are protected with NSE keys, and it’s safe to shut down the
impaired controller.
Verify NVE configuration
1. Display the key IDs of the authentication keys that are stored on the key management servers: security
key-manager key query
After the ONTAP 9.6 release, you may have additional key manager types. The types are
KMIP, AKV, and GCP. The process for confirming these types is the same as confirming
external or onboard key manager types.
◦ If the Key Manager type displays external and the Restored column displays yes, it’s safe to shut
down the impaired controller.
◦ If the Key Manager type displays onboard and the Restored column displays yes, you need to
complete some additional steps.
◦ If the Key Manager type displays external and the Restored column displays anything other than
yes, you need to complete some additional steps.
◦ If the Key Manager type displays onboard and the Restored column displays anything other than
yes, you need to complete some additional steps.
2. If the Key Manager type displays onboard and the Restored column displays yes, manually back up
the OKM information:
a. Go to advanced privilege mode and enter y when prompted to continue: set -priv advanced
b. Enter the command to display the key management information: security key-manager onboard
show-backup
c. Copy the contents of the backup information to a separate file or your log file. You’ll need it in disaster
scenarios where you might need to manually recover OKM.
d. Return to admin mode: set -priv admin
e. Shut down the impaired controller.
3. If the Key Manager type displays external and the Restored column displays anything other than
yes:
a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager
external restore
If the command fails, contact NetApp Support at mysupport.netapp.com.
b. Verify that the Restored column equals yes for all authentication keys: security key-manager
key query
c. Shut down the impaired controller.
4. If the Key Manager type displays onboard and the Restored column displays anything other than yes:
a. Enter the onboard security key-manager sync command: security key-manager onboard sync
b. Verify the Restored column shows yes for all authentication keys: security key-manager key
query
c. Verify that the Key Manager type shows onboard, and then manually back up the OKM information.
d. Go to advanced privilege mode and enter y when prompted to continue: set -priv advanced
e. Enter the command to display the key management backup information: security key-manager
onboard show-backup
f. Copy the contents of the backup information to a separate file or your log file. You’ll need it in disaster
scenarios where you might need to manually recover OKM.
g. Return to admin mode: set -priv admin
h. You can safely shut down the controller.
Verify NSE configuration
1. Display the key IDs of the authentication keys that are stored on the key management servers: security
key-manager key query -key-type NSE-AK
After the ONTAP 9.6 release, you may have additional key manager types. The types are
KMIP, AKV, and GCP. The process for confirming these types is the same as confirming
external or onboard key manager types.
◦ If the Key Manager type displays external and the Restored column displays yes, it’s safe to shut
down the impaired controller.
◦ If the Key Manager type displays onboard and the Restored column displays yes, you need to
complete some additional steps.
◦ If the Key Manager type displays external and the Restored column displays anything other than
yes, you need to complete some additional steps.
◦ If the Key Manager type displays onboard and the Restored column displays anything other than
yes, you need to complete some additional steps.
2. If the Key Manager type displays onboard and the Restored column displays yes, manually back up
the OKM information:
a. Go to advanced privilege mode and enter y when prompted to continue: set -priv advanced
b. Enter the command to display the key management information: security key-manager onboard
show-backup
c. Copy the contents of the backup information to a separate file or your log file. You’ll need it in disaster
scenarios where you might need to manually recover OKM.
d. Return to admin mode: set -priv admin
e. You can safely shut down the controller.
3. If the Key Manager type displays external and the Restored column displays anything other than
yes:
a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager
external restore
If the command fails, contact NetApp Support at mysupport.netapp.com.
b. Verify that the Restored column equals yes for all authentication keys: security key-manager
key query
c. You can safely shut down the controller.
4. If the Key Manager type displays onboard and the Restored column displays anything other than yes:
a. Enter the onboard security key-manager sync command: security key-manager onboard sync
Enter the customer’s 32 character, alphanumeric onboard key management passphrase at the prompt.
If the passphrase cannot be provided, contact NetApp Support at mysupport.netapp.com.
b. Verify the Restored column shows yes for all authentication keys: security key-manager key
query
c. Verify that the Key Manager type shows onboard, and then manually back up the OKM information.
d. Go to advanced privilege mode and enter y when prompted to continue: set -priv advanced
e. Enter the command to display the key management backup information: security key-manager
onboard show-backup
f. Copy the contents of the backup information to a separate file or your log file. You’ll need it in disaster
scenarios where you might need to manually recover OKM.
g. Return to admin mode: set -priv admin
h. You can safely shut down the controller.
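For ONTAP 9.6 and later, the four combinations of Key Manager type and Restored value map onto fixed command sequences, which can be written as a lookup table. The following Python sketch is illustrative only. The names PLANS and option2_plan are hypothetical, and the additional key manager types mentioned above (KMIP, AKV, GCP) are deliberately not mapped, because the text does not say which branch each one follows.

```python
from typing import Dict, List, Tuple

# Commands that display the OKM backup information so it can be copied to a log file.
BACKUP = [
    "set -priv advanced",
    "security key-manager onboard show-backup",
    "set -priv admin",
]

# (key manager type, Restored == yes) -> commands to run before shutdown.
PLANS: Dict[Tuple[str, bool], List[str]] = {
    ("external", True): [],
    ("onboard", True): BACKUP,
    ("external", False): [
        "security key-manager external restore",
        "security key-manager key query",
    ],
    ("onboard", False): [
        "security key-manager onboard sync",
        "security key-manager key query",
    ] + BACKUP,
}


def option2_plan(key_manager_type: str, restored: str) -> List[str]:
    """Return the commands for one row of `security key-manager key query` output.

    Both arguments are strings transcribed by the operator; this sketch does not
    parse real command output.
    """
    key = (key_manager_type.strip().lower(), restored.strip().lower() == "yes")
    if key not in PLANS:
        raise ValueError(f"unmapped key manager type: {key_manager_type!r}")
    return list(PLANS[key])
```

An empty list means the controller can be shut down without further key management steps.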
Shut down the impaired controller - FAS2600
After completing the NVE or NSE tasks, you need to complete the shutdown of the
impaired controller.
Steps
a. Take the impaired controller to the LOADER prompt:
◦ If the impaired controller is displaying Waiting for giveback…, press Ctrl-C, and then respond y when prompted.
◦ If the impaired controller is displaying the system prompt or password prompt (enter system password), take over
or halt the impaired controller from the healthy controller: storage failover takeover -ofnode
impaired_node_name
b. From the LOADER prompt, enter: printenv to capture all boot environmental variables. Save the output
to your log file.
This command may not work if the boot device is corrupted or non-functional.
To access components inside the controller, you must first remove the controller module from the system and
then remove the cover on the controller module.
Leave the cables in the cable management device so that when you reinstall the cable management
device, the cables are organized.
3. Remove and set aside the cable management devices from the left and right sides of the controller module.
4. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller
module from the midplane, and then, using two hands, pull the controller module out of the chassis.
5. Turn the controller module over and place it on a flat, stable surface.
6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open.
Step 2: Replace the boot media
3. Press the blue button on the boot media housing to release the boot media from its housing, and then
gently pull it straight out of the boot media socket.
Do not twist or pull the boot media straight up, because this could damage the socket or the
boot media.
4. Align the edges of the replacement boot media with the boot media socket, and then gently push it into the
socket.
5. Check the boot media to make sure that it is seated squarely and completely in the socket.
If necessary, remove the boot media and reseat it into the socket.
6. Push the boot media down to engage the locking button on the boot media housing.
7. Close the controller module cover.
You can install the system image to the replacement boot media using a USB flash drive with the image
installed on it. However, you must restore the var file system during this procedure.
• You must have a USB flash drive, formatted to FAT32, with at least 4 GB capacity.
• You must have a copy of the same ONTAP image version that the impaired controller was running. You can
download the appropriate image from the Downloads section on the NetApp Support Site.
◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download
button.
◦ If NVE is not enabled, download the image without NetApp Volume Encryption, as indicated in the
download button.
• If your system is an HA pair, you must have a network connection.
• If your system is a stand-alone system, you do not need a network connection, but you must perform an
additional reboot when restoring the var file system.
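Before starting, it can help to confirm that the downloaded image will fit on the USB flash drive. The following Python sketch checks two things: that the drive has enough free space, and that the image does not exceed the FAT32 single-file limit of 4 GiB minus one byte. The function name usb_problems is hypothetical, and the paths in the usage note are placeholders.

```python
from typing import List

# FAT32 stores file sizes in a 32-bit field, so one file can be at most 2**32 - 1 bytes.
FAT32_MAX_FILE_BYTES = 2**32 - 1


def usb_problems(image_bytes: int, free_bytes: int) -> List[str]:
    """Return a list of reasons the image cannot be copied; empty means no problem found."""
    problems = []
    if image_bytes > FAT32_MAX_FILE_BYTES:
        problems.append("image exceeds the FAT32 single-file limit")
    if image_bytes > free_bytes:
        problems.append("not enough free space on the USB flash drive")
    return problems
```

On a workstation, the two inputs could be obtained with os.path.getsize(image_path) and shutil.disk_usage(usb_mount_point).free, where image_path and usb_mount_point are whatever locations apply on that machine.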
Steps
1. Align the end of the controller module with the opening in the chassis, and then gently push the controller
module halfway into the system.
2. Reinstall the cable management device and recable the system, as needed.
When recabling, remember to reinstall the media converters (SFPs) if they were removed.
3. Insert the USB flash drive into the USB slot on the controller module.
Make sure that you install the USB flash drive in the slot labeled for USB devices, and not in the USB
console port.
4. Push the controller module all the way into the system, making sure that the cam handle clears the USB
flash drive, firmly push the cam handle to finish seating the controller module, push the cam handle to the
closed position, and then tighten the thumbscrew.
The controller begins to boot as soon as it is completely installed into the chassis.
5. Interrupt the boot process to stop at the LOADER prompt by pressing Ctrl-C when you see Starting
AUTOBOOT press Ctrl-C to abort….
If you miss this message, press Ctrl-C, select the option to boot to Maintenance mode, and then halt the
controller to boot to LOADER.
6. For systems with one controller in the chassis, reconnect the power and turn on the power supplies.
The target port you configure is the target port you use to communicate with the
impaired controller from the healthy controller during var file system restore with a
network connection. You can also use the e0M port in this command.
If you use this optional parameter, you do not need a fully qualified domain name in the netboot
server URL. You need only the server’s host name.
Other parameters might be necessary for your interface. You can enter help ifconfig at
the firmware prompt for details.
2. When prompted, either enter the name of the image or accept the default image displayed inside the
brackets on your screen.
3. Restore the var file system:
If your system has a network connection:
a. Press y when prompted to restore the backup configuration.
b. Set the healthy controller to advanced privilege level: set -privilege advanced
c. Run the restore backup command: system node restore-backup -node local -target-address
impaired_node_IP_address
d. Return the controller to admin level: set -privilege admin
e. Press y when prompted to use the restored configuration.
f. Press y when prompted to reboot the controller.
8. Give back the controller using the storage failover giveback -fromnode local command.
9. At the cluster prompt, check the logical interfaces with the net int show -is-home false command.
If any interfaces are listed as "false", revert those interfaces back to their home port using the net int
revert command.
10. Move the console cable to the repaired controller and run the version -v command to check the ONTAP
versions.
11. Restore automatic giveback if you disabled it by using the storage failover modify -node local
-auto-giveback true command.
If NSE or NVE are enabled along with Onboard Key Manager, you must restore the settings you captured at the
beginning of this procedure.
• If NSE or NVE are enabled and Onboard Key Manager is enabled, go to Option 1: Restore NVE or NSE
when Onboard Key Manager is enabled.
• If NSE or NVE are enabled for ONTAP 9.5, go to Option 2: Restore NSE/NVE on systems running ONTAP
9.5 and earlier.
• If NSE or NVE are enabled for ONTAP 9.6, go to Option 3: Restore NSE/NVE on systems running ONTAP
9.6 and later.
Steps
1. Connect the console cable to the target controller.
2. Use the boot_ontap command at the LOADER prompt to boot the controller.
3. Check the console output:
4. At the Boot Menu, enter the hidden command, recover_onboard_keymanager, and reply y at the
prompt.
5. Enter the passphrase for the onboard key manager you obtained from the customer at the beginning of this
procedure.
6. When prompted to enter the backup data, paste the backup data you captured at the beginning of this
procedure.
The data is the output of either the security key-manager backup show or the security
key-manager onboard show-backup command.
--------------------------BEGIN BACKUP--------------------------
TmV0QXBwIEtleSBCbG9iAAEAAAAEAAAAcAEAAAAAAADuD+byAAAAACEAAAAAAAAA
QAAAAAAAAABvOlH0AAAAAMh7qDLRyH1DBz12piVdy9ATSFMT0C0TlYFss4PDjTaV
dzRYkLd1PhQLxAWJwOIyqSr8qY1SEBgm1IWgE5DLRqkiAAAAAAAAACgAAAAAAAAA
3WTh7gAAAAAAAAAAAAAAAAIAAAAAAAgAZJEIWvdeHr5RCAvHGclo+wAAAAAAAAAA
IgAAAAAAAAAoAAAAAAAAAEOTcR0AAAAAAAAAAAAAAAACAAAAAAAJAGr3tJA/
LRzUQRHwv+1aWvAAAAAAAAAAACQAAAAAAAAAgAAAAAAAAACdhTcvAAAAAJ1PXeBf
ml4NBsSyV1B4jc4A7cvWEFY6lLG6hc6tbKLAHZuvfQ4rIbYAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA . . . .
H4nPQM0nrDRYRa9SCv8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAA
---------------------------END BACKUP---------------------------
8. Move the console cable to the partner controller and log in as admin.
9. Confirm the target controller is ready for giveback with the storage failover show command.
10. Give back only the CFO aggregates with the storage failover giveback -fromnode local
-only-cfo-aggregates true command.
◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in
the slot until a replacement is received.
◦ If the command fails because of an open CIFS session, check with the customer on how to close out
CIFS sessions.
◦ If the command fails because the partner is "not ready", wait 5 minutes for the NVMEMs to
synchronize.
◦ If the command fails because of an NDMP, SnapMirror, or SnapVault process, disable the process. See
the appropriate Documentation Center for more information.
11. Once the giveback completes, check the failover and giveback status with the storage failover show
and storage failover show-giveback commands.
Only the CFO aggregates (root aggregate and CFO style data aggregates) will be shown.
If the Restored column = anything other than yes, contact Customer Support.
If the Restored column = anything other than yes/true, contact Customer Support.
18. At the clustershell prompt, enter the net int show -is-home false command to list the logical
interfaces that are not on their home controller and port.
If any interfaces are listed as false, revert those interfaces back to their home port using the net int
revert -vserver Cluster -lif nodename command.
19. Move the console cable to the target controller and run the version -v command to check the ONTAP
versions.
20. Restore automatic giveback if you disabled it by using the storage failover modify -node local
-auto-giveback true command.
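The onboard key manager backup data that is pasted at the boot menu is delimited by BEGIN BACKUP and END BACKUP lines, as in the sample shown earlier in this procedure. A copy that was truncated while saving it to a log file is easy to miss, so a quick structural check before starting recovery can save time. The Python sketch below assumes the body consists only of base64-style characters, which matches the sample but is not a documented format guarantee. The function name backup_body is hypothetical.

```python
import re
from typing import List

# Marker lines: a run of dashes, the marker text, and another run of dashes.
BEGIN = re.compile(r"^-+BEGIN BACKUP-+$")
END = re.compile(r"^-+END BACKUP-+$")
# Assumption: body lines contain only base64 alphabet characters and padding.
BODY = re.compile(r"^[A-Za-z0-9+/=]+$")


def backup_body(text: str) -> List[str]:
    """Return the body lines of a saved backup block, or raise ValueError if it looks damaged."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if len(lines) < 3 or not BEGIN.match(lines[0]) or not END.match(lines[-1]):
        raise ValueError("missing BEGIN BACKUP or END BACKUP marker, or empty body")
    body = lines[1:-1]
    bad = [line for line in body if not BODY.match(line)]
    if bad:
        raise ValueError(f"{len(bad)} body line(s) contain unexpected characters")
    return body
```

The sample in this document would not pass this check as printed, because it contains an elision (". . . .") where lines were omitted; a real saved backup has no such gap.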
Steps
1. Connect the console cable to the target controller.
2. Use the boot_ontap command at the LOADER prompt to boot the controller.
3. Check the console output:
4. Move the console cable to the partner controller and give back the target controller storage using the
storage failover giveback -fromnode local -only-cfo-aggregates true command.
◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in
the slot until a replacement is received.
◦ If the command fails because of an open CIFS session, check with the customer on how to close out CIFS
sessions.
◦ If the command fails because the partner is "not ready", wait 5 minutes for the NVMEMs to synchronize.
◦ If the command fails because of an NDMP, SnapMirror, or SnapVault process, disable the process. See
the appropriate Documentation Center for more information.
5. Wait 3 minutes and check the failover status with the storage failover show command.
6. At the clustershell prompt, enter the net int show -is-home false command to list the logical
interfaces that are not on their home controller and port.
If any interfaces are listed as false, revert those interfaces back to their home port using the net int
revert -vserver Cluster -lif nodename command.
7. Move the console cable to the target controller and run the version -v command to check the ONTAP
versions.
8. Restore automatic giveback if you disabled it by using the storage failover modify -node local
-auto-giveback true command.
9. Use the storage encryption disk show command at the clustershell prompt to review the output.
This command does not work if NVE (NetApp Volume Encryption) is configured.
10. Use the security key-manager query command to display the key IDs of the authentication keys that are
stored on the key management servers.
◦ If the Restored column = yes and all key managers report in an available state, go to Complete the
replacement process.
◦ If the Restored column = anything other than yes, and/or one or more key managers is not available,
use the security key-manager restore -address command to retrieve and restore all
authentication keys (AKs) and key IDs associated with all nodes from all available key management
servers.
Check the output of the security key-manager query command again to ensure that the Restored column = yes
and all key managers report in an available state.
If the Restored column = anything other than yes, use the security key-manager setup
-node Repaired(Target)node command to restore the Onboard Key Management settings.
Rerun the security key-manager key show -detail command to verify Restored column =
yes for all authentication keys.
Steps
1. Connect the console cable to the target controller.
2. Use the boot_ontap command at the LOADER prompt to boot the controller.
3. Check the console output:
4. Move the console cable to the partner controller and give back the target controller storage using the
storage failover giveback -fromnode local -only-cfo-aggregates true command.
◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in
the slot until a replacement is received.
◦ If the command fails because of an open CIFS session, check with the customer on how to close out
CIFS sessions.
Terminating CIFS can cause loss of data.
◦ If the command fails because the partner is "not ready", wait 5 minutes for the NVMEMs to
synchronize.
◦ If the command fails because of an NDMP, SnapMirror, or SnapVault process, disable the process. See
the appropriate Documentation Center for more information.
5. Wait 3 minutes and check the failover status with the storage failover show command.
6. At the clustershell prompt, enter the net int show -is-home false command to list the logical
interfaces that are not on their home controller and port.
If any interfaces are listed as false, revert those interfaces back to their home port using the net int
revert -vserver Cluster -lif nodename command.
7. Move the console cable to the target controller and run the version -v command to check the ONTAP
versions.
8. Restore automatic giveback if you disabled it by using the storage failover modify -node local
-auto-giveback true command.
9. Use the storage encryption disk show command at the clustershell prompt to review the output.
10. Use the security key-manager key query command to display the key IDs of the authentication
keys that are stored on the key management servers.
◦ If the Restored column = yes/true, you are done and can proceed to complete the replacement
process.
◦ If the Key Manager type = external and the Restored column = anything other than yes/true,
use the security key-manager external restore command to restore the key IDs of the
authentication keys.
◦ If the Key Manager type = onboard and the Restored column = anything other than yes/true,
use the security key-manager onboard sync command to re-sync the Key Manager type.
Use the security key-manager key query command to verify that the Restored column = yes/true for all
authentication keys.
Replace the caching module - FAS2600
You must replace the caching module in the controller module when your system
registers a single AutoSupport (ASUP) message that the module has gone offline; failure
to do so results in performance degradation.
• You must replace the failed component with a replacement FRU component you received from your
provider.
You might want to erase the contents of your caching module before replacing it.
1. Although data on the caching module is encrypted, you might want to erase any data from the impaired
caching module and verify that the caching module has no data:
a. Erase the data on the caching module: system controller flash-cache secure-erase run
-node node name localhost -device-id device_number
Run the system controller flash-cache show command if you don’t know the
flashcache device ID.
b. Verify that the data has been erased from the caching module: system controller flash-cache
secure-erase show
2. If the impaired controller is part of an HA pair, disable automatic giveback from the console of the healthy
controller: storage failover modify -node local -auto-giveback false
3. Take the impaired controller to the LOADER prompt:
If the impaired controller is displaying… Then…
Waiting for giveback… Press Ctrl-C, and then respond y when prompted.
System prompt or password prompt (enter system password) Take over or halt the impaired controller:
• For an HA pair, take over the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name
4. If the system has only one controller module in the chassis, turn off the power supplies, and then unplug
the impaired controller’s power cords from the power source.
Leave the cables in the cable management device so that when you reinstall the cable management
device, the cables are organized.
3. Remove and set aside the cable management devices from the left and right sides of the controller module.
4. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller
module from the midplane, and then, using two hands, pull the controller module out of the chassis.
5. Turn the controller module over and place it on a flat, stable surface.
6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open.
Your storage system must meet certain criteria depending on your situation:
• It must have the appropriate operating system for the caching module you are installing.
• It must support the caching capacity.
• All other components in the storage system must be functioning properly; if not, you must contact technical
support.
1. Locate the caching module at the rear of the controller module and remove it.
a. Press the release tab.
b. Remove the heatsink.
If necessary, remove the caching module and reseat it into the socket.
4. Reseat and push the heatsink down to engage the locking button on the caching module housing.
5. Close the controller module cover, as needed.
Step 4: Reinstall the controller module
After you replace components in the controller module, reinstall it into the chassis.
1. If you have not already done so, replace the cover on the controller module.
2. Align the end of the controller module with the opening in the chassis, and then gently push the controller
module halfway into the system.
Do not completely insert the controller module in the chassis until instructed to do so.
If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber
optic cables.
If your system is in… Then perform these steps…
An HA pair a. With the cam handle in the open position, firmly push the
controller module in until it meets the midplane and is fully seated,
and then close the cam handle to the locked position.
b. If you have not already done so, reinstall the cable management
device.
c. Bind the cables to the cable management device with the hook
and loop strap.
A stand-alone configuration a. With the cam handle in the open position, firmly push the
controller module in until it meets the midplane and is fully seated,
and then close the cam handle to the locked position.
b. If you have not already done so, reinstall the cable management
device.
c. Bind the cables to the cable management device with the hook
and loop strap.
d. Reconnect the power cables to the power supplies and to the
power sources, then turn on the power to start the boot process.
Chassis
Overview of chassis replacement - FAS2600
To replace the chassis, you must move the power supplies, hard drives, and controller
module or modules from the impaired chassis to the new chassis, and swap out the
impaired chassis from the equipment rack or system cabinet with the new chassis of the
same model as the impaired chassis.
All other components in the system must be functioning properly; if not, you must contact technical support.
• You can use this procedure with all versions of ONTAP supported by your system.
• This procedure is written with the assumption that you are moving all drives and controller module or
modules to the new chassis, and that the chassis is a new component from NetApp.
• This procedure is disruptive. In a two-node cluster, you will have a complete service outage; in a
multi-node cluster, you will have a partial outage.
Shut down the controllers - FAS2600
This procedure is for 2-node, non-MetroCluster configurations only. If you have a system
with more than two nodes, see How to perform a graceful shutdown and power up of one
HA pair in a 4-node cluster.
Before you begin
You need:
If the system is a NetApp StorageGRID or ONTAP S3 used as FabricPool cloud tier, refer to the
Gracefully shutdown and power up your storage system Resolution Guide after performing this
procedure.
If using FlexArray array LUNs, follow the specific vendor storage array documentation for the
shutdown procedure to perform for those systems after performing this procedure.
If using SSDs, refer to SU490: (Impact: Critical) SSD Best Practices: Avoid risk of drive failure
and data loss if powered off for more than two months
As a best practice before shutdown, you should:
Steps
1. Log into the cluster through SSH or log in from any node in the cluster using a local console cable and a
laptop/console.
2. Turn off AutoSupport and indicate how long you expect the system to be off line:
system node autosupport invoke -node * -type all -message "MAINT=8h Power
Maintenance"
If you're using a console/laptop, log into the controller using the same cluster administrator credentials.
Open an SSH session to every SP/BMC connection so that you can monitor progress.
For clusters using SnapMirror synchronous operating in StrictSync mode: system node
halt -node * -skip-lif-migration-before-shutdown true
-ignore-quorum-warnings true -inhibit-takeover true
-ignore-strict-sync-warnings true
7. Enter y for each controller in the cluster when you see Warning: Are you sure you want to halt
node "cluster name-controller number"? {y|n}:
8. Wait for each controller to halt and display the LOADER prompt.
9. Turn off each PSU or unplug them if there is no PSU on/off switch.
10. Unplug the power cord from each PSU.
11. Verify that all controllers in the impaired chassis are powered down.
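Because the halt command is long, assembling it from its flags can help avoid typing errors. The following is a minimal sketch: the helper name halt_all_nodes_command is hypothetical, and it assumes that clusters not using StrictSync run the same command without the -ignore-strict-sync-warnings flag, a form this section does not show.

```python
def halt_all_nodes_command(strict_sync=False):
    """Assemble the cluster-wide halt command quoted in the steps above."""
    flags = [
        "-skip-lif-migration-before-shutdown true",
        "-ignore-quorum-warnings true",
        "-inhibit-takeover true",
    ]
    # Only SnapMirror synchronous clusters in StrictSync mode need the extra flag.
    if strict_sync:
        flags.append("-ignore-strict-sync-warnings true")
    return "system node halt -node * " + " ".join(flags)


if __name__ == "__main__":
    print(halt_all_nodes_command(strict_sync=True))
```

The script only prints the string; you still enter the command yourself and confirm the prompt for each controller.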
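Move the power supplies, hard drives, and controller module or modules from the impaired chassis to the
new chassis, and swap out the impaired chassis from the equipment rack or system cabinet with the new
chassis of the same model as the impaired chassis.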
Moving out a power supply when replacing a chassis involves turning off, disconnecting, and removing the
power supply from the old chassis and installing and connecting it on the replacement chassis.
When removing a power supply, always use two hands to support its weight.
The power supplies are keyed and can only be installed one way.
Do not use excessive force when sliding the power supply into the system. You can damage
the connector.
7. Close the cam handle so that the latch clicks into the locked position and the power supply is fully seated.
8. Reconnect the power cable and secure it to the power supply using the power cable locking mechanism.
Only connect the power cable to the power supply. Do not connect the power cable to a
power source at this time.
1. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the
system cables and SFPs (if needed) from the controller module, keeping track of where the cables were
connected.
Leave the cables in the cable management device so that when you reinstall the cable management
device, the cables are organized.
2. Remove and set aside the cable management devices from the left and right sides of the controller module.
3. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller
module from the midplane, and then, using two hands, pull the controller module out of the chassis.
4. Set the controller module aside in a safe place, and repeat these steps if you have another controller
module in the chassis.
Move the drives from each bay opening in the old chassis to the same bay opening in the new chassis.
The drive should disengage from the chassis, allowing it to slide free of the chassis.
When removing a drive, always use two hands to support its weight.
Drives are fragile. Handle them as little as possible to prevent damage to them.
3. Align the drive from the old chassis with the same bay opening in the new chassis.
4. Gently push the drive into the chassis as far as it will go.
5. Firmly push the drive the rest of the way into the chassis, and then lock the cam handle by pushing it up
and against the drive holder.
Be sure to close the cam handle slowly so that it aligns correctly with the front of the drive carrier. It clicks
when it is secure.
Step 4: Replace a chassis from within the equipment rack or system cabinet
You must remove the existing chassis from the equipment rack or system cabinet before you can install the
replacement chassis.
After you install the controller module and any other components into the new chassis, boot it.
For HA pairs with two controller modules in the same chassis, the sequence in which you install the controller
module is especially important because it attempts to reboot as soon as you completely seat it in the chassis.
1. Align the end of the controller module with the opening in the chassis, and then gently push the controller
module halfway into the system.
Do not completely insert the controller module in the chassis until instructed to do so.
2. Recable the console to the controller module, and then reconnect the management port.
3. Repeat the preceding steps if there is a second controller to install in the new chassis.
4. Complete the installation of the controller module:
If your system is in… Then perform these steps…
An HA pair a. With the cam handle in the open position, firmly push the
controller module in until it meets the midplane and is fully seated,
and then close the cam handle to the locked position.
b. If you have not already done so, reinstall the cable management
device.
c. Bind the cables to the cable management device with the hook
and loop strap.
d. Repeat the preceding steps for the second controller module in
the new chassis.
A stand-alone configuration a. With the cam handle in the open position, firmly push the
controller module in until it meets the midplane and is fully seated,
and then close the cam handle to the locked position.
b. If you have not already done so, reinstall the cable management
device.
c. Bind the cables to the cable management device with the hook
and loop strap.
d. Reinstall the blanking panel and then go to the next step.
5. Connect the power supplies to different power sources, and then turn them on.
6. Boot each controller to Maintenance mode:
a. As each controller starts booting, press Ctrl-C to interrupt the boot process when you see the
message Press Ctrl-C for Boot Menu.
If you miss the prompt and the controller modules boot to ONTAP, enter halt, and then
at the LOADER prompt enter boot_ontap, press Ctrl-C when prompted, and then
repeat this step.
b. From the boot menu, select the option for Maintenance mode.
Step 1: Verify and set the HA state of the chassis
You must verify the HA state of the chassis, and, if necessary, update the state to match your system
configuration.
1. In Maintenance mode, from either controller module, display the HA state of the local controller module and
chassis: ha-config show
2. If the displayed system state for the chassis does not match your system configuration:
a. Set the HA state for the chassis: ha-config modify chassis HA-state
The value for HA-state can be one of the following:
▪ ha
▪ non-ha
b. Confirm that the setting has changed: ha-config show
3. If you have not already done so, recable the rest of your system.
4. The next step depends on your system configuration.
Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return
& Replacements page for further information.
Controller module
Overview of controller module replacement - FAS2600
You must review the prerequisites for the replacement procedure and select the correct
one for your version of the ONTAP operating system.
• All drive shelves must be working properly.
• If your system is in an HA pair, the healthy controller must be able to take over the controller that is being
replaced (referred to in this procedure as the “impaired controller”).
• This procedure includes steps for automatically or manually reassigning drives to the replacement
controller, depending on your system’s configuration.
You should perform the drive reassignment as directed in the procedure.
• You must replace the failed component with a replacement FRU component you received from your
provider.
• You must be replacing a controller module with a controller module of the same model type. You cannot
upgrade your system by just replacing the controller module.
• You cannot change any drives or drive shelves as part of this procedure.
• In this procedure, the boot device is moved from the impaired controller to the replacement controller so
that the replacement controller will boot up in the same version of ONTAP as the old controller module.
• It is important that you apply the commands in these steps on the correct systems:
◦ The impaired controller is the controller that is being replaced.
◦ The replacement controller is the new controller that is replacing the impaired controller.
◦ The healthy controller is the surviving controller.
• You must always capture the controller’s console output to a text file.
This provides you a record of the procedure so that you can troubleshoot any issues that you might
encounter during the replacement process.
Steps
1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message:
system node autosupport invoke -node * -type all -message
MAINT=_number_of_hours_down_h
The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*>
system node autosupport invoke -node * -type all -message MAINT=2h
2. If the impaired controller is part of an HA pair, disable automatic giveback from the console of the healthy
controller: storage failover modify -node local -auto-giveback false
3. Take the impaired controller to the LOADER prompt:
If the impaired controller is displaying… Then…
System prompt or password prompt (enter system password) Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name
4. If the system has only one controller module in the chassis, turn off the power supplies, and then unplug
the impaired controller’s power cords from the power source.
To replace the controller module, you must first remove the old controller module from the chassis.
Steps
1. If you are not already grounded, properly ground yourself.
2. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the
system cables and SFPs (if needed) from the controller module, keeping track of where the cables were
connected.
Leave the cables in the cable management device so that when you reinstall the cable management
device, the cables are organized.
3. Remove and set aside the cable management devices from the left and right sides of the controller module.
4. If you left the SFP modules in the system after removing the cables, move them to the new controller
module.
5. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller
module from the midplane, and then, using two hands, pull the controller module out of the chassis.
6. Turn the controller module over and place it on a flat, stable surface.
7. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open.
You must locate the boot media and follow the directions to remove it from the old controller module and insert
it in the new controller module.
Steps
1. Locate the boot media using the following illustration or the FRU map on the controller module:
2. Press the blue button on the boot media housing to release the boot media from its housing, and then
gently pull it straight out of the boot media socket.
Do not twist or pull the boot media straight up, because this could damage the socket or the
boot media.
3. Move the boot media to the new controller module, align the edges of the boot media with the socket
housing, and then gently push it into the socket.
4. Check the boot media to make sure that it is seated squarely and completely in the socket.
If necessary, remove the boot media and reseat it into the socket.
5. Push the boot media down to engage the locking button on the boot media housing.
To move the NVMEM battery from the old controller module to the new controller module, you must perform a
specific sequence of steps.
Steps
1. Check the NVMEM LED:
◦ If your system is in an HA configuration, go to the next step.
◦ If your system is in a stand-alone configuration, cleanly shut down the controller module, and then
check the NVRAM LED identified by the NV icon.
The NVRAM LED blinks while destaging contents to the flash memory when you halt the
system. After the destage is complete, the LED turns off.
▪ If power is lost without a clean shutdown, the NVMEM LED flashes until the destage is complete,
and then the LED turns off.
▪ If the LED is on and power is on, unwritten data is stored on NVMEM.
This typically occurs during an uncontrolled shutdown after ONTAP has successfully booted.
3. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the
socket, and then unplug the battery cable from the socket.
4. Grasp the battery and press the blue locking tab marked PUSH, and then lift the battery out of the holder
and controller module.
5. Move the battery to the replacement controller module.
6. Loop the battery cable around the cable channel on the side of the battery holder.
7. Position the battery pack by aligning the battery holder key ribs to the “V” notches on the sheet metal side
wall.
8. Slide the battery pack down along the sheet metal side wall until the support tabs on the side wall hook into
the slots on the battery pack, and the battery pack latch engages and clicks into the opening on the side
wall.
Step 4: Move the DIMMs
To move the DIMMs, you must follow the directions to locate and move them from the old controller module
into the replacement controller module.
You must have the new controller module ready so that you can move the DIMMs directly from the impaired
controller module to the corresponding slots in the replacement controller module.
Steps
1. Locate the DIMMs on your controller module.
2. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement
controller module in the proper orientation.
3. Eject the DIMM from its slot by slowly pushing apart the two DIMM ejector tabs on either side of the DIMM,
and then slide the DIMM out of the slot.
Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM
circuit board.
The number and placement of system DIMMs depends on the model of your system.
The DIMM fits tightly in the slot, but should go in easily. If not, realign the DIMM with the slot and reinsert it.
Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot.
Make sure that the plug locks down onto the controller module.
To move the caching module, referred to as the M.2 PCIe card on the label on your controller, locate it in
the old controller, move it into the replacement controller, and follow the specific sequence of steps.
You must have the new controller module ready so that you can move the caching module directly from the old
controller module to the corresponding slot in the new one. All other components in the storage system must
be functioning properly; if not, you must contact technical support.
Steps
1. Locate the caching module at the rear of the controller module and remove it.
2. Gently pull the caching module straight out of the housing.
3. Move the caching module to the new controller module, and then align the edges of the caching module
with the socket housing and gently push it into the socket.
4. Verify that the caching module is seated squarely and completely in the socket.
If necessary, remove the caching module and reseat it into the socket.
5. Reseat and push the heatsink down to engage the locking button on the caching module housing.
6. Close the controller module cover, as needed.
After you install the components from the old controller module into the new controller module, you must install
the new controller module into the system chassis and boot the operating system.
For HA pairs with two controller modules in the same chassis, the sequence in which you install the controller
module is especially important because it attempts to reboot as soon as you completely seat it in the chassis.
The system might update system firmware when it boots. Do not abort this process. The
procedure requires you to interrupt the boot process, which you can typically do at any time after
prompted to do so. However, if the system updates the system firmware when it boots, you must
wait until after the update is complete before interrupting the boot process.
Steps
1. If you are not already grounded, properly ground yourself.
2. If you have not already done so, replace the cover on the controller module.
3. Align the end of the controller module with the opening in the chassis, and then gently push the controller
module halfway into the system.
Do not completely insert the controller module in the chassis until instructed to do so.
4. Cable the management and console ports only, so that you can access the system to perform the tasks in
the following sections.
You will connect the rest of the cables to the controller module later in this procedure.
If your system is in… Then perform these steps…
An HA pair The controller module begins to boot as soon as it is fully seated in
the chassis. Be prepared to interrupt the boot process.
a. With the cam handle in the open position, firmly push the
controller module in until it meets the midplane and is fully seated,
and then close the cam handle to the locked position.
b. If you have not already done so, reinstall the cable management
device.
c. Bind the cables to the cable management device with the hook
and loop strap.
d. When you see the message Press Ctrl-C for Boot Menu,
press Ctrl-C to interrupt the boot process.
A stand-alone configuration a. With the cam handle in the open position, firmly push the
controller module in until it meets the midplane and is fully seated,
and then close the cam handle to the locked position.
b. If you have not already done so, reinstall the cable management
device.
c. Bind the cables to the cable management device with the hook
and loop strap.
d. Reconnect the power cables to the power supplies and to the
power sources, turn on the power to start the boot process, and
then press Ctrl-C after you see the Press Ctrl-C for Boot
Menu message.
e. From the boot menu, select the option for Maintenance mode.
Important: During the boot process, you might see the following prompts:
◦ A prompt warning of a system ID mismatch and asking to override the system ID.
◦ A prompt warning that when entering Maintenance mode in an HA configuration you must ensure that
the healthy controller remains down. You can safely respond y to these prompts.
Step 1: Set and verify system time after replacing the controller
You should check the time and date on the replacement controller module against the healthy controller
module in an HA pair, or against a reliable time server in a stand-alone configuration. If the time and date do
not match, you must reset them on the replacement controller module to prevent possible outages on clients
due to time differences.
• The replacement node is the new node that replaced the impaired node as part of this procedure.
• The healthy node is the HA partner of the replacement node.
Steps
1. If the replacement node is not at the LOADER prompt, halt the system to the LOADER prompt.
2. On the healthy node, check the system time: cluster date show
3. At the LOADER prompt, check the date and time on the replacement node: show date
4. If necessary, set the date in GMT on the replacement node: set date mm/dd/yyyy
5. If necessary, set the time in GMT on the replacement node: set time hh:mm:ss
6. At the LOADER prompt, confirm the date and time on the replacement node: show date
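If you have a workstation with a trusted clock, the two LOADER commands in steps 4 and 5 can be prepared from the current UTC time (treated here as equivalent to GMT). This is a sketch only: the helper name loader_clock_commands is hypothetical, and the output is meant to be typed at the LOADER prompt by hand.

```python
from datetime import datetime, timezone


def loader_clock_commands(now=None):
    """Return the 'set date' and 'set time' LOADER commands for a UTC timestamp.

    now must be a timezone-aware datetime; it defaults to the workstation's
    current UTC time.
    """
    current = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return [
        current.strftime("set date %m/%d/%Y"),  # mm/dd/yyyy
        current.strftime("set time %H:%M:%S"),  # hh:mm:ss
    ]


if __name__ == "__main__":
    for command in loader_clock_commands():
        print(command)
```

Compare the result against the cluster date show output from the healthy node before setting anything, and run show date afterward to confirm.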
You must verify the HA state of the controller module and, if necessary, update the state to match your system
configuration.
1. In Maintenance mode from the new controller module, verify that all components display the same HA
state: ha-config show
2. If the displayed system state of the controller module does not match your system configuration, set the HA
state for the controller module: ha-config modify controller ha-state
The value for ha-state can be one of the following:
◦ ha
◦ non-ha
3. Confirm that the setting has changed: ha-config show
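The verify-and-set logic for the HA state is the same for the controller module and the chassis, and can be sketched as follows. The helper name ha_config_fix is hypothetical, and it accepts only the two states listed in this section.

```python
VALID_STATES = ("ha", "non-ha")  # only the values listed in this section
VALID_COMPONENTS = ("controller", "chassis")


def ha_config_fix(component, displayed, expected):
    """Return the Maintenance mode command that corrects a mismatched HA state.

    Returns None when the state shown by 'ha-config show' already matches.
    """
    if component not in VALID_COMPONENTS:
        raise ValueError(f"unknown component: {component!r}")
    if expected not in VALID_STATES:
        raise ValueError(f"unsupported HA state: {expected!r}")
    if displayed == expected:
        return None
    return f"ha-config modify {component} {expected}"
```

For example, a controller that reports non-ha in an HA pair yields ha-config modify controller ha. Run ha-config show again afterward to confirm the change.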
Steps
1. Recable the system.
2. Verify that the cabling is correct by using Active IQ Config Advisor.
a. Download and install Config Advisor.
b. Enter the information for the target system, and then click Collect Data.
c. Click the Cabling tab, and then examine the output. Make sure that all disk shelves are displayed and
all disks appear in the output, correcting any cabling issues you find.
d. Check other cabling by clicking the appropriate tab, and then examining the output from Config Advisor.
If the storage system is in an HA pair, the system ID of the new controller module is automatically assigned to
the disks when the giveback occurs at the end of the procedure. In a stand-alone system, you must manually
reassign the ID to the disks. You must use the correct procedure for your configuration.
You must confirm the system ID change when you boot the replacement controller and then verify that the
change was implemented.
1. If the replacement controller is in Maintenance mode (showing the *> prompt), exit Maintenance mode and
go to the LOADER prompt: halt
2. From the LOADER prompt on the replacement controller, boot the controller, entering y if you are prompted
to override the system ID due to a system ID mismatch: boot_ontap
3. Wait until the Waiting for giveback… message is displayed on the replacement controller console and
then, from the healthy controller, verify that the new partner system ID has been automatically assigned:
storage failover show
In the command output, you should see a message that the system ID has changed on the impaired
controller, showing the correct old and new IDs. In the following example, node2 has undergone
replacement and has a new system ID of 151759706.
4. From the healthy controller, verify that any coredumps are saved:
a. Change to the advanced privilege level: set -privilege advanced
You can respond Y when prompted to continue into advanced mode. The advanced mode prompt
appears (*>).
b. Save any coredumps: system node run -node local-node-name partner savecore
c. Wait for the savecore command to complete before issuing the giveback.
You can enter the following command to monitor the progress of the savecore command: system
node run -node local-node-name partner savecore -s
The replacement controller takes back its storage and completes booting.
If you are prompted to override the system ID due to a system ID mismatch, you should enter y.
b. After the giveback has been completed, confirm that the HA pair is healthy and that takeover is
possible: storage failover show
The output from the storage failover show command should not include the System ID changed
on partner message.
7. Verify that the disks were assigned correctly: storage disk show -ownership
The disks belonging to the replacement controller should show the new system ID. In the following
example, the disks owned by node1 now show the new system ID, 1873775277:
node1> storage disk show -ownership
8. Verify that the expected volumes are present for each controller: vol show -node node-name
9. If you disabled automatic takeover on reboot, enable it from the healthy controller: storage failover
modify -node replacement-node-name -onreboot true
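If you capture the console output to a text file, the post-giveback check on storage failover show can be approximated with a text search. This is a rough sketch: the helper name system_id_change_pending is hypothetical, and the only assumption about the output is that the phrase quoted above might be wrapped across lines.

```python
def system_id_change_pending(failover_output):
    """Return True if the captured output still reports a system ID change."""
    # Collapse line breaks and repeated spaces so a wrapped message still matches.
    flattened = " ".join(failover_output.split()).lower()
    return "system id changed on partner" in flattened
```

A True result after the giveback means the HA pair has not yet settled and is worth investigating before you continue.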
In a stand-alone system, you must manually reassign disks to the new controller’s system ID before you return
the system to normal operating condition.
Steps
1. If you have not already done so, reboot the replacement node, interrupt the boot process by pressing Ctrl-
C, and then select the option to boot to Maintenance mode from the displayed menu.
2. You must enter Y when prompted to override the system ID due to a system ID mismatch.
3. View the system IDs: disk show -a
4. You should make a note of the old system ID, which is displayed as part of the disk owner column.
*> disk show -a
Local System ID: 118065481
5. Reassign disk ownership by using the system ID information obtained from the disk show command: disk
reassign -s old system ID
For example: disk reassign -s 118073209
6. Verify that the disks were assigned correctly: disk show -a
The disks belonging to the replacement node should show the new system ID. The following example shows
that the disks owned by system-1 now have the new system ID, 118065481:
7. If your storage system has Storage or Volume Encryption configured, you must restore Storage or Volume
Encryption functionality by using one of the following procedures, depending on whether you are using
onboard or external key management:
◦ Restore onboard key management encryption keys
◦ Restore external key management encryption keys
8. Boot the node: boot_ontap
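The reassignment command in step 5 takes the old system ID, so it is worth checking the two IDs before typing it. The following sketch uses a hypothetical helper, disk_reassign_command, which takes the old system ID from the disk owner column and the Local System ID value, both read by hand from the disk show -a output.

```python
def disk_reassign_command(old_system_id, local_system_id):
    """Build the Maintenance mode command that reassigns disks from the old ID."""
    for value in (old_system_id, local_system_id):
        if not value.isdigit():
            raise ValueError(f"system ID must be numeric: {value!r}")
    # If both IDs match, the disks already belong to this controller.
    if old_system_id == local_system_id:
        raise ValueError("old and local system IDs match; nothing to reassign")
    return f"disk reassign -s {old_system_id}"
```

With the values used in this section, disk_reassign_command("118073209", "118065481") gives disk reassign -s 118073209.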
Return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
You must install new licenses for the replacement node if the impaired node was using ONTAP features that
require a standard (node-locked) license. For features with standard licenses, each node in the cluster should
have its own key for the feature.
You have a 90-day grace period in which to install the license keys. After the grace period, all old licenses are
invalidated. After a valid license key is installed, you have 24 hours to install all of the keys before the grace
period ends.
Steps
1. If you need new license keys, obtain replacement license keys on the NetApp Support Site in the My
Support section under Software licenses.
The new license keys that you require are automatically generated and sent to the email
address on file. If you fail to receive the email with the license keys within 30 days, you
should contact technical support.
2. Install each license key: system license add -license-code license-key, license-key...
3. Remove the old licenses, if desired:
a. Check for unused licenses: license clean-up -unused -simulate
b. If the list looks correct, remove the unused licenses: license clean-up -unused
Before returning the replacement node to service, you should verify that the LIFs are on their home ports,
register the serial number of the replacement node if AutoSupport is enabled, and reset automatic giveback.
Steps
1. Verify that the logical interfaces are reporting to their home server and ports: network interface show
-is-home false
If any LIFs are listed as false, revert them to their home ports: network interface revert -vserver
* -lif *
3. If an AutoSupport maintenance window was triggered, end it by using the system node autosupport
invoke -node * -type all -message MAINT=END command.
4. If automatic giveback was disabled, reenable it: storage failover modify -node local
-auto-giveback true
Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return
& Replacements page for further information.
You must replace the failed component with a replacement FRU component you received from your provider.
To shut down the impaired controller, you must determine the status of the controller and, if necessary, take
over the controller so that the healthy controller continues to serve data from the impaired controller storage.
If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy
controller shows false for eligibility and health, you must correct the issue before shutting down the impaired
controller; see Synchronize a node with the cluster.
Steps
1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message:
system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh
The following AutoSupport message suppresses automatic case creation for two hours:
cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h
2. If the impaired controller is part of an HA pair, disable automatic giveback from the console of the healthy
controller: storage failover modify -node local -auto-giveback false
3. Take the impaired controller to the LOADER prompt:
If the impaired controller is displaying… | Then…
Waiting for giveback… | Press Ctrl-C, and then respond y.
System prompt or password prompt (enter system password) | Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name
4. If the system has only one controller module in the chassis, turn off the power supplies, and then unplug
the impaired controller’s power cords from the power source.
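The commands in steps 1 through 3 can be collected into an ordered list before you open a console session. The following is a minimal sketch, assuming a hypothetical helper name and a hypothetical node name; it only formats the command strings shown above and does not contact a cluster.

```python
def shutdown_prep_commands(hours_down, impaired_node_name):
    """Return the shutdown-preparation commands in the order the steps list them.

    hours_down: whole hours for the AutoSupport maintenance window.
    impaired_node_name: name of the controller to take over (caller-supplied).
    """
    if not isinstance(hours_down, int) or hours_down < 1:
        raise ValueError("hours_down must be a positive whole number")
    return [
        # Step 1: suppress automatic case creation (only if AutoSupport is enabled).
        f"system node autosupport invoke -node * -type all -message MAINT={hours_down}h",
        # Step 2: run from the healthy controller of an HA pair.
        "storage failover modify -node local -auto-giveback false",
        # Step 3: applies when the impaired controller shows a system or password prompt.
        f"storage failover takeover -ofnode {impaired_node_name}",
    ]


if __name__ == "__main__":
    for command in shutdown_prep_commands(2, "node2"):  # "node2" is a placeholder
        print(command)
```

Whether each command applies still depends on the conditions stated in the steps, so treat the list as a checklist rather than a script to run unattended.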
Leave the cables in the cable management device so that when you reinstall the cable management
device, the cables are organized.
3. Remove and set aside the cable management devices from the left and right sides of the controller module.
4. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller
module from the midplane, and then, using two hands, pull the controller module out of the chassis.
5. Turn the controller module over and place it on a flat, stable surface.
6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open.
If you are replacing a DIMM, you need to remove it after you have unplugged the NVMEM battery from the
controller module.
You must perform a clean system shutdown before replacing system components to avoid losing unwritten
data in the nonvolatile memory (NVMEM). The LED is located on the back of the controller module. Look
for the following icon:
2. If the NVMEM LED is not flashing, there is no content in the NVMEM; you can skip the following steps and
proceed to the next task in this procedure.
3. If the NVMEM LED is flashing, there is data in the NVMEM and you must disconnect the battery to clear
the memory:
a. Locate the battery, press the clip on the face of the battery plug to release the lock clip from the plug
socket, and then unplug the battery cable from the socket.
Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM
circuit board.
The number and placement of system DIMMs depends on the model of your system.
8. Remove the replacement DIMM from the antistatic shipping bag, hold the DIMM by the corners, and align it
to the slot.
The notch among the pins on the DIMM should line up with the tab in the socket.
9. Make sure that the DIMM ejector tabs on the connector are in the open position, and then insert the DIMM
squarely into the slot.
The DIMM fits tightly in the slot, but should go in easily. If not, realign the DIMM with the slot and reinsert it.
Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot.
10. Push carefully, but firmly, on the top edge of the DIMM until the ejector tabs snap into place over the
notches at the ends of the DIMM.
11. Locate the NVMEM battery plug socket, and then squeeze the clip on the face of the battery cable plug to
insert it into the socket.
Make sure that the plug locks down onto the controller module.
1. If you have not already done so, replace the cover on the controller module.
2. Align the end of the controller module with the opening in the chassis, and then gently push the controller
module halfway into the system.
Do not completely insert the controller module in the chassis until instructed to do so.
If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber
optic cables.
a. With the cam handle in the open position, firmly push the
controller module in until it meets the midplane and is fully seated,
and then close the cam handle to the locked position.
b. If you have not already done so, reinstall the cable management
device.
c. Bind the cables to the cable management device with the hook
and loop strap.
A stand-alone configuration
a. With the cam handle in the open position, firmly push the
controller module in until it meets the midplane and is fully seated,
and then close the cam handle to the locked position.
b. If you have not already done so, reinstall the cable management
device.
c. Bind the cables to the cable management device with the hook
and loop strap.
d. Reconnect the power cables to the power supplies and to the
power sources, then turn on the power to start the boot process.
Step 5: Return the failed part to NetApp
Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return
& Replacements page for further information.
The failed drive appears in the list of failed drives. If it does not, you should wait, and then run the
command again.
Depending on the drive type and capacity, it can take up to several hours for the drive to
appear in the list of failed drives.
How you replace the disk depends on how the disk drive is being used. If SED authentication is enabled,
you must use the SED replacement instructions in the ONTAP 9 NetApp Encryption Power Guide. These
instructions describe additional steps you must perform before and after replacing an SED.
• Make sure the replacement drive is supported by your platform. See the NetApp Hardware Universe.
• Make sure all other components in the system are functioning properly; if not, you must contact technical
support.
When replacing several disk drives, you must wait one minute between the removal of each failed disk drive
and the insertion of the replacement disk drive to allow the storage system to recognize the existence of each
new disk.
Procedure
Replace the failed drive by selecting the option appropriate to the drives that your platform supports.
Option 1: Replace SSD
1. If you want to manually assign drive ownership for the replacement drive, you need to disable
automatic drive assignment, if it is enabled.
You manually assign drive ownership and then reenable automatic drive assignment
later in this procedure.
a. Verify whether automatic drive assignment is enabled: storage disk option show
If automatic drive assignment is enabled, the output shows on in the “Auto Assign” column (for
each controller module).
b. If automatic drive assignment is enabled, disable it: storage disk option modify -node
node_name -autoassign off
When a drive fails, the system logs a warning message to the system console indicating which drive
failed. Additionally, the attention (amber) LED on the drive shelf operator display panel and the failed
drive illuminate.
The activity (green) LED on a failed drive can be illuminated (solid), which indicates
that the drive has power, but should not be blinking, which indicates I/O activity. A failed
drive has no I/O activity.
Be sure to close the cam handle slowly so that it aligns correctly with the face of the drive.
When the drive’s activity LED is solid, it means that the drive has power. When the drive’s activity LED
is blinking, it means that the drive has power and I/O is in progress. If the drive firmware is
automatically updating, the LED blinks.
b. Assign each drive: storage disk assign -disk disk_name -owner owner_name
You can use the wildcard character to assign more than one drive at once.
c. Reenable automatic drive assignment if needed: storage disk option modify -node
node_name -autoassign on
10. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
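When several replacement drives need manual ownership, the assign and reenable commands above can be generated from a list. This is a minimal sketch under stated assumptions: the helper name, the drive names, and the node names are hypothetical, and the code only formats the commands shown in the steps.

```python
def drive_assignment_commands(disk_names, owner_name, reenable_on_node=None):
    """Build one `storage disk assign` command per drive.

    If reenable_on_node is given, append the command that turns automatic
    drive assignment back on for that node.
    """
    commands = [
        f"storage disk assign -disk {name} -owner {owner_name}"
        for name in disk_names
    ]
    if reenable_on_node is not None:
        commands.append(
            f"storage disk option modify -node {reenable_on_node} -autoassign on"
        )
    return commands


if __name__ == "__main__":
    # Drive and node names below are placeholders for illustration.
    for command in drive_assignment_commands(["1.0.4", "1.0.5"], "node1", "node1"):
        print(command)
```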
You manually assign drive ownership and then reenable automatic drive assignment
later in this procedure.
a. Verify whether automatic drive assignment is enabled: storage disk option show
If automatic drive assignment is enabled, the output shows on in the “Auto Assign” column (for
each controller module).
b. If automatic drive assignment is enabled, disable it: storage disk option modify -node
node_name -autoassign off
Depending on the storage system, the disk drives have the release button located at the top or on the
left of the disk drive face.
For example, the following illustration shows a disk drive with the release button located on the top of
the disk drive face:
The cam handle on the disk drive springs open partially and the disk drive releases from the
midplane.
6. Pull the cam handle to its fully open position to unseat the disk drive from the midplane.
7. Slide out the disk drive slightly and allow the disk to safely spin down, which can take less than one
minute, and then, using both hands, remove the disk drive from the disk shelf.
8. With the cam handle in the open position, insert the replacement disk drive into the drive bay, firmly
pushing until the disk drive stops.
Wait a minimum of 10 seconds before inserting a new disk drive. This allows the
system to recognize that a disk drive was removed.
If your platform drive bays are not fully loaded with drives, it is important to place the
replacement drive into the same drive bay from which you removed the failed drive.
Use two hands when inserting the disk drive, but do not place hands on the disk drive
boards that are exposed on the underside of the disk carrier.
9. Close the cam handle so that the disk drive is fully seated into the midplane and the handle clicks into
place.
Be sure to close the cam handle slowly so that it aligns correctly with the face of the disk drive.
10. If you are replacing another disk drive, repeat Steps 4 through 9.
11. Reinstall the bezel.
12. If you disabled automatic drive assignment in Step 1, manually assign drive ownership and
then reenable automatic drive assignment if needed.
a. Display all unowned drives: storage disk show -container-type unassigned
b. Assign each drive: storage disk assign -disk disk_name -owner owner_name
You can use the wildcard character to assign more than one drive at once.
c. Reenable automatic drive assignment if needed: storage disk option modify -node
node_name -autoassign on
13. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
To shut down the impaired controller, you must determine the status of the controller and, if necessary, take
over the controller so that the healthy controller continues to serve data from the impaired controller storage.
If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy
controller shows false for eligibility and health, you must correct the issue before shutting down the impaired
controller; see Synchronize a node with the cluster.
Steps
1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message:
system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh
The following AutoSupport message suppresses automatic case creation for two hours:
cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h
2. If the impaired controller is part of an HA pair, disable automatic giveback from the console of the healthy
controller: storage failover modify -node local -auto-giveback false
3. Take the impaired controller to the LOADER prompt:
If the impaired controller is displaying… | Then…
System prompt or password prompt (enter system password) | Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name
4. If the system has only one controller module in the chassis, turn off the power supplies, and then unplug
the impaired controller’s power cords from the power source.
Leave the cables in the cable management device so that when you reinstall the cable management
device, the cables are organized.
3. Remove and set aside the cable management devices from the left and right sides of the controller module.
4. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller
module from the midplane, and then, using two hands, pull the controller module out of the chassis.
5. Turn the controller module over and place it on a flat, stable surface.
6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open.
1. Check the NVMEM LED:
◦ If your system is in an HA configuration, go to the next step.
◦ If your system is in a stand-alone configuration, cleanly shut down the controller module, and then
check the NVRAM LED identified by the NV icon.
The NVRAM LED blinks while destaging contents to the flash memory when you halt the
system. After the destage is complete, the LED turns off.
▪ If power is lost without a clean shutdown, the NVMEM LED flashes until the destage is complete,
and then the LED turns off.
▪ If the LED is on and power is on, unwritten data is stored on NVMEM.
This typically occurs during an uncontrolled shutdown after ONTAP has successfully booted.
3. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the
socket, and then unplug the battery cable from the socket.
4. Remove the battery from the controller module and set it aside.
5. Remove the replacement battery from its package.
6. Loop the battery cable around the cable channel on the side of the battery holder.
7. Position the battery pack by aligning the battery holder key ribs to the “V” notches on the sheet metal side
wall.
8. Slide the battery pack down along the sheet metal side wall until the support tabs on the side wall hook into
the slots on the battery pack, and the battery pack latch engages and clicks into the opening on the side
wall.
9. Plug the battery plug back into the controller module.
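The NVMEM LED states described in step 1 can be summarized as a lookup. This is a minimal sketch for a checklist tool, assuming hypothetical state labels ("blinking", "off", "on") that a technician would enter by hand; the meanings are taken from the step text.

```python
# Meanings as described in step 1; the state labels are illustrative.
NVMEM_LED_MEANING = {
    "blinking": "Contents are being destaged to flash memory; wait until the LED turns off.",
    "off": "Destage is complete.",
    "on": "Power is on and unwritten data is stored on NVMEM.",
}


def describe_nvmem_led(state):
    """Return the documented meaning of an observed NVMEM LED state."""
    key = state.strip().lower()
    if key not in NVMEM_LED_MEANING:
        raise ValueError(f"unknown LED state: {state!r}")
    return NVMEM_LED_MEANING[key]


if __name__ == "__main__":
    print(describe_nvmem_led("blinking"))
```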
1. If you have not already done so, replace the cover on the controller module.
2. Align the end of the controller module with the opening in the chassis, and then gently push the controller
module halfway into the system.
Do not completely insert the controller module in the chassis until instructed to do so.
If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber
optic cables.
a. With the cam handle in the open position, firmly push the
controller module in until it meets the midplane and is fully seated,
and then close the cam handle to the locked position.
b. If you have not already done so, reinstall the cable management
device.
c. Bind the cables to the cable management device with the hook
and loop strap.
If your system is in… Then perform these steps…
A stand-alone configuration
a. With the cam handle in the open position, firmly push the
controller module in until it meets the midplane and is fully seated,
and then close the cam handle to the locked position.
b. If you have not already done so, reinstall the cable management
device.
c. Bind the cables to the cable management device with the hook
and loop strap.
d. Reconnect the power cables to the power supplies and to the
power sources, and turn on the power to start the boot process.
Cooling is integrated with the power supply, so you must replace the power supply within
two minutes of removal to prevent overheating due to reduced airflow. Because the chassis
provides a shared cooling configuration for the two HA nodes, a delay longer than two
minutes will shut down all controller modules in the chassis. If both controller modules do
shut down, make sure that both power supplies are inserted, turn both off for 30 seconds,
and then turn both on.
1. Identify the power supply you want to replace, based on console error messages or through the LEDs on
the power supplies.
2. If you are not already grounded, properly ground yourself.
3. Turn off the power supply and disconnect the power cables:
a. Turn off the power switch on the power supply.
b. Open the power cable retainer, and then unplug the power cable from the power supply.
c. Unplug the power cable from the power source.
4. Squeeze the latch on the power supply cam handle, and then open the cam handle to fully release the
power supply from the midplane.
5. Use the cam handle to slide the power supply out of the system.
When removing a power supply, always use two hands to support its weight.
6. Make sure that the on/off switch of the new power supply is in the Off position.
7. Using both hands, support and align the edges of the power supply with the opening in the system chassis,
and then gently push the power supply into the chassis using the cam handle.
The power supplies are keyed and can only be installed one way.
Do not use excessive force when sliding the power supply into the system. You can damage
the connector.
8. Close the cam handle so that the latch clicks into the locked position and the power supply is fully seated.
9. Reconnect the power supply cabling:
a. Reconnect the power cable to the power supply and the power source.
b. Secure the power cable to the power supply using the power cable retainer.
Once power is restored to the power supply, the status LED should be green.
10. Turn on the power to the new power supply, and then verify the operation of the power supply activity
LEDs.
The power supply LEDs are lit when the power supply comes online.
11. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part
Return & Replacements page for further information.
If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy
controller shows false for eligibility and health, you must correct the issue before shutting down the impaired
controller; see Synchronize a node with the cluster.
Steps
1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message:
system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh
The following AutoSupport message suppresses automatic case creation for two hours:
cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h
2. If the impaired controller is part of an HA pair, disable automatic giveback from the console of the healthy
controller: storage failover modify -node local -auto-giveback false
3. Take the impaired controller to the LOADER prompt:
System prompt or password prompt (enter system password) | Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name
4. If the system has only one controller module in the chassis, turn off the power supplies, and then unplug
the impaired controller’s power cords from the power source.
Leave the cables in the cable management device so that when you reinstall the cable management
device, the cables are organized.
3. Remove and set aside the cable management devices from the left and right sides of the controller module.
4. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller
module from the midplane, and then, using two hands, pull the controller module out of the chassis.
5. Turn the controller module over and place it on a flat, stable surface.
6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open.
Step 3: Replace the RTC battery
To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps.
2. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the
holder.
Note the polarity of the battery as you remove it from the holder. The battery is marked with
a plus sign and must be positioned in the holder correctly. A plus sign near the holder tells
you how the battery should be positioned.
Step 4: Reinstall the controller module and set time/date after RTC battery
replacement
After you replace a component within the controller module, you must reinstall the controller module in the
system chassis, reset the time and date on the controller, and then boot it.
1. If you have not already done so, close the air duct or controller module cover.
2. Align the end of the controller module with the opening in the chassis, and then gently push the controller
module halfway into the system.
Do not completely insert the controller module in the chassis until instructed to do so.
If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber
optic cables.
4. If the power supplies were unplugged, plug them back in and reinstall the power cable retainers.
5. Complete the reinstallation of the controller module:
a. With the cam handle in the open position, firmly push the controller module in until it meets the
midplane and is fully seated, and then close the cam handle to the locked position.
Do not use excessive force when sliding the controller module into the chassis to avoid
damaging the connectors.
b. If you have not already done so, reinstall the cable management device.
c. Bind the cables to the cable management device with the hook and loop strap.
d. Reconnect the power cables to the power supplies and to the power sources, and then turn on the
power to start the boot process.
e. Halt the controller at the LOADER prompt.
6. Reset the time and date on the controller:
a. Check the date and time on the healthy controller with the show date command.
b. At the LOADER prompt on the target controller, check the time and date.
c. If necessary, modify the date with the set date mm/dd/yyyy command.
d. If necessary, set the time, in GMT, using the set time hh:mm:ss command.
e. Confirm the date and time on the target controller.
7. At the LOADER prompt, enter bye to reinitialize the PCIe cards and other components and let the
controller reboot.
8. Return the controller to normal operation by giving back its storage: storage failover giveback
-ofnode impaired_node_name
9. If automatic giveback was disabled, reenable it: storage failover modify -node local
-auto-giveback true
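Because step 6 expects the date as mm/dd/yyyy and the time in GMT as hh:mm:ss, it can help to format both values from a single timestamp before typing them at the LOADER prompt. The following is a minimal sketch; the helper name is hypothetical, and it assumes the caller supplies a timezone-aware timestamp, which is converted to UTC (treated here as equivalent to GMT).

```python
from datetime import datetime, timedelta, timezone


def loader_clock_commands(moment):
    """Format the LOADER `set date` and `set time` commands for a timestamp.

    moment must be timezone-aware so that it can be converted to UTC.
    """
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    return [
        utc.strftime("set date %m/%d/%Y"),
        utc.strftime("set time %H:%M:%S"),
    ]


if __name__ == "__main__":
    # Example: a local time five hours behind UTC rolls over to the next UTC day.
    local = datetime(2024, 3, 5, 20, 30, 0, tzinfo=timezone(timedelta(hours=-5)))
    for command in loader_clock_commands(local):
        print(command)
```

Converting first and formatting second avoids setting a local date alongside a GMT time when the two fall on different calendar days.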
Copyright information
Copyright © 2024 NetApp, Inc. All Rights Reserved. Printed in the U.S. No part of this document covered by
copyright may be reproduced in any form or by any means—graphic, electronic, or mechanical, including
photocopying, recording, taping, or storage in an electronic retrieval system—without prior written permission
of the copyright owner.
Software derived from copyrighted NetApp material is subject to the following license and disclaimer:
THIS SOFTWARE IS PROVIDED BY NETAPP “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
AND FITNESS FOR A PARTICULAR PURPOSE, WHICH ARE HEREBY DISCLAIMED. IN NO EVENT SHALL
NETAPP BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
NetApp reserves the right to change any products described herein at any time, and without notice. NetApp
assumes no responsibility or liability arising from the use of products described herein, except as expressly
agreed to in writing by NetApp. The use or purchase of this product does not convey a license under any
patent rights, trademark rights, or any other intellectual property rights of NetApp.
The product described in this manual may be protected by one or more U.S. patents, foreign patents, or
pending applications.
LIMITED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set
forth in subparagraph (b)(3) of the Rights in Technical Data -Noncommercial Items at DFARS 252.227-7013
(FEB 2014) and FAR 52.227-19 (DEC 2007).
Data contained herein pertains to a commercial product and/or commercial service (as defined in FAR 2.101)
and is proprietary to NetApp, Inc. All NetApp technical data and computer software provided under this
Agreement is commercial in nature and developed solely at private expense. The U.S. Government has a non-
exclusive, non-transferrable, nonsublicensable, worldwide, limited irrevocable license to use the Data only in
connection with and in support of the U.S. Government contract under which the Data was delivered. Except
as provided herein, the Data may not be used, disclosed, reproduced, modified, performed, or displayed
without the prior written approval of NetApp, Inc. United States Government license rights for the Department
of Defense are limited to those rights identified in DFARS clause 252.227-7015(b) (FEB 2014).
Trademark information
NETAPP, the NETAPP logo, and the marks listed at http://www.netapp.com/TM are trademarks of NetApp, Inc.
Other company and product names may be trademarks of their respective owners.