Acropolis Advanced
Administration Guide
February 4, 2021
Contents
5. Logs............................................................................................................................34
Sending Logs to a Remote Syslog Server....................................................................................................34
Configuring the Remote Syslog Server Settings........................................................................... 36
Common Log Files................................................................................................................................................. 38
Nutanix Logs Root..................................................................................................................................... 38
Self-Monitoring (sysstats) Logs.............................................................................................................38
/home/nutanix/data/logs/cassandra..................................................................................................38
Controller VM Log Files........................................................................................................................... 39
Correlating the FATAL log to the INFO file.................................................................................................. 41
Stargate Logs............................................................................................................................................................42
Cassandra Logs........................................................................................................................................................44
Prism Gateway Log................................................................................................................................................ 45
Zookeeper Logs.......................................................................................................................................................45
Genesis.out.................................................................................................................................................................45
Diagnosing a Genesis Failure.................................................................................................................47
ESXi Log Files.......................................................................................................................................................... 48
Nutanix Calm Log Files........................................................................................................................................ 48
6. Troubleshooting Tools........................................................................................ 50
Nutanix Cluster Check (NCC)............................................................................................................................ 50
Diagnostics VMs........................................................................................................................................................51
Running a Test Using the Diagnostics VMs......................................................................................52
Diagnostics Output.....................................................................................................................................52
Syscheck Utility........................................................................................................................................................ 53
Using Syscheck Utility...............................................................................................................................53
Copyright.................................................................................................................. 60
License.........................................................................................................................................................................60
Conventions...............................................................................................................................................................60
Default Cluster Credentials................................................................................................................................. 60
Version.......................................................................................................................................................................... 61
1. CLUSTER MANAGEMENT
Although each host in a Nutanix cluster runs a hypervisor independent of other hosts in the
cluster, some operations affect the entire cluster.
Controller VM Access
Most administrative functions of a Nutanix cluster can be performed through the web console
or nCLI. Nutanix recommends using these interfaces whenever possible and disabling Controller
VM SSH access with password or key authentication. Some functions, however, require logging
on to a Controller VM with SSH. Exercise caution whenever connecting directly to a Controller
VM as the risk of causing cluster issues is increased.
Warning: When you connect to a Controller VM with SSH, ensure that the SSH client does
not import or change any locale settings. The Nutanix software is not localized, and executing
commands with any locale other than en_US.UTF-8 can cause severe cluster issues.
To check the locale used in an SSH session, run /usr/bin/locale. If any environment
variables are set to anything other than en_US.UTF-8, reconnect with an SSH
configuration that does not import or change any locale settings.
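A quick way to confirm this from an SSH session is shown below (output abbreviated; every value should report en_US.UTF-8 or be unset):
nutanix@cvm$ /usr/bin/locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_ALL=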
Port Requirements
Nutanix uses a number of ports for internal communication. The following unique ports are
required for external access to Controller VMs in a Nutanix cluster.
Table 1: Ports Required for External Access to Controller VMs
Note:
• The admin user cannot access nCLI by using the default credentials, and you cannot
change the default password of the admin user through nCLI. If you are logging in as
the admin user for the first time, or if you want to change the default admin password,
you must SSH to the Controller VM or log on through the Prism web console.
• When you log in to the Prism web console for the first time after upgrading to AOS 5.1
from an earlier AOS version, you can use your existing admin user password and are
then prompted to change it to meet the password complexity requirements. However,
if you are logging in to the Controller VM with SSH for the first time after the upgrade
as the admin user, you must use the default admin user password (Nutanix/4u) and are
then prompted to change it to meet the password complexity requirements.
• You cannot delete the admin user account.
By default, the admin user password does not have an expiry date, but you can change the
password at any time.
When you change the admin user password, you must update any applications and scripts
using the admin user credentials for authentication. Nutanix recommends that you create a user
assigned with the admin role instead of using the admin user for authentication. The Prism Web
Console Guide describes authentication and roles.
Following are the default credentials to access a Controller VM.
Procedure
1. Log on to the Controller VM with SSH by using the management IP address of the Controller
VM and the following credentials.
2. Respond to the prompts, providing the current and new admin user password.
Changing password for admin.
Old Password:
New password:
Retype new password:
Password changed.
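For reference, the SSH login in step 1 is an ordinary SSH session to the Controller VM management address; the address placeholder below is illustrative, so substitute the management IP address of your Controller VM:
$ ssh admin@mgmt_ip_addr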
Procedure
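The start-and-verify step can be sketched as follows, assuming the standard cluster command run from any Controller VM in the cluster:
nutanix@cvm$ cluster start
nutanix@cvm$ cluster status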
If the cluster starts properly, output similar to the following is displayed for each node in the
cluster:
CVM: 10.1.64.60 Up
Zeus UP [5362, 5391, 5392, 10848, 10977, 10992]
Scavenger UP [6174, 6215, 6216, 6217]
SSLTerminator UP [7705, 7742, 7743, 7744]
SecureFileSync UP [7710, 7761, 7762, 7763]
Medusa UP [8029, 8073, 8074, 8176, 8221]
DynamicRingChanger UP [8324, 8366, 8367, 8426]
Pithos UP [8328, 8399, 8400, 8418]
Hera UP [8347, 8408, 8409, 8410]
Stargate UP [8742, 8771, 8772, 9037, 9045]
InsightsDB UP [8774, 8805, 8806, 8939]
InsightsDataTransfer UP [8785, 8840, 8841, 8886, 8888, 8889, 8890]
Ergon UP [8814, 8862, 8863, 8864]
Cerebro UP [8850, 8914, 8915, 9288]
Chronos UP [8870, 8975, 8976, 9031]
Curator UP [8885, 8931, 8932, 9243]
Prism UP [3545, 3572, 3573, 3627, 4004, 4076]
CIM UP [8990, 9042, 9043, 9084]
AlertManager UP [9017, 9081, 9082, 9324]
Arithmos UP [9055, 9217, 9218, 9353]
Catalog UP [9110, 9178, 9179, 9180]
Acropolis UP [9201, 9321, 9322, 9323]
Atlas UP [9221, 9316, 9317, 9318]
Uhura UP [9390, 9447, 9448, 9449]
Snmp UP [9418, 9513, 9514, 9516]
SysStatCollector UP [9451, 9510, 9511, 9518]
Tunnel UP [9480, 9543, 9544]
ClusterHealth UP [9521, 9619, 9620, 9947, 9976, 9977, 10301]
Janus UP [9532, 9624, 9625]
NutanixGuestTools UP [9572, 9650, 9651, 9674]
MinervaCVM UP [10174, 10200, 10201, 10202, 10371]
ClusterConfig UP [10205, 10233, 10234, 10236]
APLOSEngine UP [10231, 10261, 10262, 10263]
APLOS UP [10343, 10368, 10369, 10370, 10502, 10503]
Lazan UP [10377, 10402, 10403, 10404]
Orion UP [10409, 10449, 10450, 10474]
Delphi UP [10418, 10466, 10467, 10468]
What to do next
After you have verified that the cluster is running, you can start guest VMs.
(Hyper-V only) If the Hyper-V failover cluster was stopped, start it by logging on to a Hyper-V
host and running the Start-Cluster PowerShell command.
Warning: By default, Nutanix clusters have redundancy factor 2, which means they can tolerate
the failure of a single node or drive. Nutanix clusters with a configured option of redundancy
factor 3 allow the Nutanix cluster to withstand the failure of two nodes or drives in different
blocks.
Note:
• If you are running Acropolis File Services (AFS), stop AFS before stopping your AOS
cluster.
• If you plan to stop a cluster that has metro availability configured, perform the required
remedial actions before stopping the cluster. For more information, see the Data
Protection Guidelines (Metro Availability) topic in the Prism Web Console Guide.
(Hyper-V only) Stop the Hyper-V failover cluster by logging on to a Hyper-V host and running
the Stop-Cluster PowerShell command.
Note: This procedure stops all services provided by guest virtual machines, the Nutanix cluster,
and the hypervisor host.
Procedure
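The stop step itself can be sketched as follows, assuming the standard cluster command run from any Controller VM in the cluster:
nutanix@cvm$ cluster stop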
Wait to proceed until output similar to the following is displayed for every Controller VM in
the cluster.
CVM: 172.16.8.191 Up, ZeusLeader
Zeus UP [3167, 3180, 3181, 3182, 3191, 3201]
Scavenger UP [3334, 3351, 3352, 3353]
ConnectionSplicer DOWN []
Hyperint DOWN []
Medusa DOWN []
DynamicRingChanger DOWN []
Pithos DOWN []
Stargate DOWN []
Cerebro DOWN []
Chronos DOWN []
Curator DOWN []
Prism DOWN []
AlertManager DOWN []
StatsAggregator DOWN []
SysStatCollector DOWN []
Destroying a Cluster
Before you begin
Reclaim licenses from the cluster to be destroyed by following Reclaiming Licenses When
Destroying a Cluster in the Web Console Guide.
Note: If the cluster is registered with Prism Central (the multiple cluster manager VM), unregister
the cluster before destroying it. See Registering with Prism Central in the Web Console Guide for
more information.
Procedure
Wait to proceed until output similar to the following is displayed for every Controller VM in
the cluster.
CVM: 172.16.8.191 Up, ZeusLeader
Zeus UP [3167, 3180, 3181, 3182, 3191, 3201]
Scavenger UP [3334, 3351, 3352, 3353]
ConnectionSplicer DOWN []
Hyperint DOWN []
Medusa DOWN []
DynamicRingChanger DOWN []
Pithos DOWN []
Stargate DOWN []
Cerebro DOWN []
Chronos DOWN []
Curator DOWN []
Prism DOWN []
AlertManager DOWN []
StatsAggregator DOWN []
SysStatCollector DOWN []
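Once every Controller VM reports the stopped state shown above, the destroy step can be sketched as follows, assuming the standard cluster command (this permanently removes the cluster configuration):
nutanix@cvm$ cluster destroy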
CAUTION: Performing this operation deletes all cluster and guest VM data in the cluster.
» If you want to preserve data on the existing cluster, remove nodes from the cluster using
the Hardware > Table > Host screen of the web console.
» If you want multiple new clusters, destroy the existing cluster by following Destroying a
Cluster on page 8.
2. Create one or more new clusters by following Configuring the Cluster on page 10.
AOS includes a web-based configuration tool that automates assigning IP addresses to cluster
components and creates the cluster.
Requirements
The web-based configuration tool requires that IPv6 link-local be enabled on the subnet. If IPv6
link-local is not available, you must configure the Controller VM IP addresses and the cluster
manually. The web-based configuration tool also requires that the Controller VMs be able to
communicate with each other.
All Controller VMs and hypervisor hosts must be on the same subnet. The hypervisor can be
multihomed provided that one interface is on the same subnet as the Controller VM.
Guest VMs can be on a different subnet.
Note: This procedure has been deprecated (superseded) in AOS 4.5 and later releases. Instead,
use the Foundation tool to configure a cluster. See the "Creating a Cluster" topics in the Field
Installation Guide for more information.
Procedure
Note: Internet Explorer requires protected mode to be disabled. Go to Tools > Internet
Options > Security, clear the Enable Protected Mode check box, and restart the browser.
If the cluster_init.html page is blank, then the Controller VM is already part of a cluster.
Connect to a Controller VM that is not part of a cluster.
You can obtain the IPv6 address of the Controller VM by using the ifconfig command.
Example
nutanix@cvm$ ifconfig
eth0 Link encap:Ethernet HWaddr 52:54:00:A8:8A:AE
inet addr:10.1.65.240 Bcast:10.1.67.255 Mask:255.255.252.0
inet6 addr: fe80::5054:ff:fea8:8aae/64 Scope:Link
The value of the inet6 addr field up to the / character is the IPv6 address of the Controller
VM.
• The maximum length is 75 characters (for vSphere and AHV) and 15 characters
(for Hyper-V).
• Allowed characters are uppercase and lowercase standard Latin letters (A-Z and
a-z), decimal digits (0-9), dots (.), hyphens (-), and underscores (_).
4. Type a virtual IP address for the cluster in the Cluster External IP field.
This parameter is required for Hyper-V clusters and is optional for vSphere and AHV
clusters.
You can connect to the external cluster IP address with both the web console and nCLI.
In the event that a Controller VM is restarted or fails, the external cluster IP address is
relocated to another Controller VM in the cluster.
5. (Optional) If you want to enable redundancy factor 3, set Cluster Max Redundancy Factor
to 3.
Redundancy factor 3 has the following requirements:
• A cluster must have at least five nodes, blocks, or racks for redundancy factor 3 to be
enabled.
• For guest VMs to tolerate the simultaneous failure of two nodes or drives in different
blocks, the data must be stored on storage containers with replication factor 3.
• Controller VMs must be configured with enough memory to support redundancy factor 3.
See the Acropolis Advanced Administration Guide topic CVM Memory Configurations for
Features.
6. Type the appropriate DNS and NTP addresses in the respective fields.
Note: You must enter NTP servers that the Controller VMs can reach in the CVM NTP
Servers field. If reachable NTP servers are not entered or if the time on the Controller VMs is
ahead of the current time, cluster services may fail to start.
For Hyper-V clusters, the CVM NTP Servers parameter must be set to the IP addresses of
one or more Active Directory domain controllers.
The Hypervisor NTP Servers parameter is not used in Hyper-V clusters.
8. Type the appropriate default gateway IP addresses in the Default Gateway row.
9. Select the check box next to each node that you want to add to the cluster.
All unconfigured nodes on the current network are presented on this web page. If you are
going to configure multiple clusters, be sure that you only select the nodes that should be
part of the current cluster.
Note: The unconfigured nodes are not listed according to their position in the block. Ensure
that you assign the intended IP address to each node.
If the cluster is running properly, output similar to the following is displayed for each node
in the cluster:
CVM: 10.1.64.60 Up
Zeus UP [5362, 5391, 5392, 10848, 10977, 10992]
Scavenger UP [6174, 6215, 6216, 6217]
SSLTerminator UP [7705, 7742, 7743, 7744]
SecureFileSync UP [7710, 7761, 7762, 7763]
Medusa UP [8029, 8073, 8074, 8176, 8221]
DynamicRingChanger UP [8324, 8366, 8367, 8426]
Pithos UP [8328, 8399, 8400, 8418]
Hera UP [8347, 8408, 8409, 8410]
Stargate UP [8742, 8771, 8772, 9037, 9045]
InsightsDB UP [8774, 8805, 8806, 8939]
InsightsDataTransfer UP [8785, 8840, 8841, 8886, 8888, 8889, 8890]
Ergon UP [8814, 8862, 8863, 8864]
Cerebro UP [8850, 8914, 8915, 9288]
Chronos UP [8870, 8975, 8976, 9031]
Curator UP [8885, 8931, 8932, 9243]
Prism UP [3545, 3572, 3573, 3627, 4004, 4076]
Procedure
» Linux
$ ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:0c:29:dd:e3:0b
inet addr:10.2.100.180 Bcast:10.2.103.255 Mask:255.255.252.0
inet6 addr: fe80::20c:29ff:fedd:e30b/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2895385616 errors:0 dropped:0 overruns:0 frame:0
TX packets:3063794864 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2569454555254 (2.5 TB) TX bytes:2795005996728 (2.7 TB)
» Mac OS
$ ifconfig en0
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
ether 70:56:81:ae:a7:47
inet6 fe80::7256:81ff:feae:a747 en0 prefixlen 64 scopeid 0x4
inet 172.16.21.208 netmask 0xfff00000 broadcast 172.31.255.255
media: autoselect
Note the IPv6 link-local addresses, which always begin with fe80. Omit the / character and
anything following.
» Windows
> ping -6 ipv6_linklocal_addr%interface
» Linux/Mac OS
$ ping6 ipv6_linklocal_addr%interface
• Replace ipv6_linklocal_addr with the IPv6 link-local address of the other laptop.
• Replace interface with the interface identifier on the other laptop (for example, 12 for
Windows, eth0 for Linux, or en0 for Mac OS).
If the ping packets are answered by the remote host, IPv6 link-local is enabled on the subnet.
If the ping packets are not answered, ensure that firewalls are disabled on both laptops and
try again before concluding that IPv6 link-local is not enabled.
5. Reenable the firewalls on the laptops and disconnect them from the network.
Results
• If IPv6 link-local is enabled on the subnet, you can use the automated IP address and cluster
configuration utility.
• If IPv6 link-local is not enabled on the subnet, you have to manually set IP addresses and
create the cluster.
Note: IPv6 connectivity issues can occur if there is a VLAN tag mismatch. ESXi hosts shipped
from the factory have no VLAN tagging configured (VLAN tag 0), while the workstation (laptop)
you connect may be attached to an access port and therefore use a different VLAN tag. To avoid
the mismatch, ensure that the switch port for the ESXi host is configured in trunking mode.
Procedure
Connect to the backup site and activate it.
ncli> pd activate name="pd_name"
CAUTION: The VM registration might fail if the storage container is not mounted on the
selected host.
Planned failover
Procedure
Connect to the primary site and specify the failover site to migrate to.
ncli> pd migrate name="pd_name" remote-site="remote_site_name2"
Procedure
Run the vDisk manipulator utility from any Controller VM in the cluster.
• Replace ctr_name with the name of the storage container where the vDisk to fingerprint
resides.
• Replace vdisk_path with the path of the vDisk to fingerprint relative to the storage
container path (for example, Win7-desktop11/Win7-desktop11-flat.vmdk). You cannot
specify multiple vDisks in this parameter.
» To fingerprint all vDisks in the cluster:
nutanix@cvm$ ncli vdisk list | grep "Name.*NFS" | awk -F: \
'{print $4 ":" $5 ":" $6 ":" $7}' >> fingerprint.txt
nutanix@cvm$ for i in `cat fingerprint.txt`; do vdisk_manipulator --vdisk_name=$i \
Note: You can run vdisk_manipulator in a loop to fingerprint multiple vDisks, but run only one
instance of vdisk_manipulator on each Controller VM at a time. Executing multiple instances
on a Controller VM concurrently would generate significant load on the cluster.
2. CHANGING PASSWORDS
Changing User Passwords
You can change user passwords, including for the default admin user, in the web console or
nCLI. Changing the password through either interface changes it for both.
Procedure
• (Web console) Log on to the web console as the user whose password is to be changed
and select Change Password from the user icon pull-down list.
• Replace username with the name of the user whose password is to be changed.
• Replace curr_pw with the current password.
• Replace new_pw with the new password.
Note: If you change the password of the admin user from the default, you must specify
the password every time you start an nCLI session from a remote system. A password is
not required if you are starting an nCLI session from a Controller VM where you are already
logged on.
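For the nCLI path, a hedged sketch follows; the subcommand and parameter names are assumptions based on common nCLI usage, so confirm the exact syntax with ncli user help for your release:
ncli> user change-password current-password="curr_pw" new-password="new_pw"
When starting nCLI from a remote system, connect as the user whose password is being changed (for example, ncli -s cvm_ip_addr -u username -p curr_pw).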
Procedure
1. Log on to the system where the SCVMM console is installed and start the console.
4. Update the username and password to include the new credentials and ensure that Validate
domain credentials is not checked.
Procedure
3. Respond to the prompts, providing the current and new nutanix user password.
Changing password for nutanix.
Old Password:
New password:
Retype new password:
Procedure
3. Respond to the prompts, providing the current and new admin user password.
Changing password for admin.
Old Password:
New password:
Retype new password:
Note: Do not add any other device, including guest VMs, to the VLAN to which the Controller VM
and hypervisor host are assigned. Isolate guest VMs on one or more separate VLANs.
Note: For guest VMs and other devices on the network, do not use a subnet that overlaps with
the 192.168.5.0/24 subnet on the default VLAN. If you want to use an overlapping subnet for such
devices, make sure that you use a different VLAN.
The following tables list the interfaces and IP addresses on the internal virtual switch on
different hypervisors:
Table 3: Interfaces and IP Addresses on the Internal Virtual Switch virbr0 on an AHV Host
eth1:1 192.168.5.254
Table 4: Interfaces and IP Addresses on the Internal Virtual Switch vSwitchNutanix on an ESXi Host
eth1:1 192.168.5.254
Table 5: Interfaces and IP Addresses on the Internal Virtual Switch InternalSwitch on a Hyper-V Host
eth1:1 192.168.5.254
Note: Make sure that the hypervisor and Controller VM interfaces on the external virtual switch
are not assigned IP addresses from the 192.168.5.0/24 subnet.
The following tables list the interfaces and IP addresses on the external virtual switch on
different hypervisors:
Table 6: Interfaces and IP Addresses on the External Virtual Switch br0 on an AHV Host
Table 7: Interfaces and IP Addresses on the External Virtual Switch vSwitch0 on an ESXi Host
Table 8: Interfaces and IP Addresses on the External Virtual Switch ExternalSwitch on a Hyper-V Host
• Before you decide to change the CVM, hypervisor host, and IPMI IP addresses, consider the
possibility of incorporating the existing IP address schema into the new infrastructure by
reconfiguring your routers and switches instead of Nutanix nodes and CVMs. If that is not
possible and you must change the IP addresses of CVMs and hypervisor hosts, proceed with
the procedure described in this document.
• Guest VM downtime is necessary for this change, because the Nutanix cluster must be in a
stopped state. Therefore, plan the guest VM downtime accordingly.
• Verify if your cluster is using the network segmentation feature.
nutanix@cvm$ network_segment_status
Note the following if you are using the network segmentation feature.
• The network segmentation feature enables the backplane network for CVMs in your
cluster (eth2 interface). The backplane network is always a non-routable subnet and/or
VLAN that is distinct from the one which is used by the external interfaces (eth0) of your
CVMs and the management network on your hypervisor. Typically, you do not need to
change the IP addresses of the backplane interface (eth2) if you are updating the CVM or
host IP addresses.
• If you have enabled network segmentation on your cluster, check that the VLAN and
subnet in use by the backplane network will still be valid after you move to the new IP
scheme. If not, change the subnet or VLAN: see the Prism Web Console Guide for your
version of AOS for instructions on disabling the network segmentation feature (see the
Disabling Network Segmentation topic) before you change the CVM and host IP addresses.
After you have updated the CVM and host IP addresses by following the steps outlined
later in this document, you can re-enable network segmentation. Follow the instructions
in the Prism Web Console Guide, which describes how to designate the new VLAN or
subnet for the backplane network.
CAUTION: All the features that use the cluster virtual IP address will be impacted if you
change that address. See the "Virtual IP Address Impact" section in the Prism Web Console
Guide for more information.
Replace insert_new_external_ip_address with the new virtual IP address for the cluster.
Replace prism_admin_user_password with the password of the Prism admin account.
• Ensure that the cluster NTP and DNS servers are reachable from the new Controller VM IP
addresses. If you are using different NTP and DNS servers, remove the existing NTP and DNS
servers from the cluster configuration and add the new ones. If you do not know the new
addresses, remove the existing NTP and DNS servers before cluster reconfiguration and add
the new ones afterwards.
Web Console In the gear icon pull-down list, click Name Servers.
In the gear icon pull-down list, click NTP Servers.
• Log on to a Controller VM in the cluster and check that all hosts are part of the metadata
store.
nutanix@cvm$ ncli host ls | grep "Metadata store status"
For every host in the cluster, Metadata store enabled on the node is displayed.
Warning: If Node marked to be removed from metadata store is displayed, do not proceed
with the IP address reconfiguration, and contact Nutanix Support to resolve the issue.
CAUTION:
Do not use the external IP address reconfiguration script (external_ip_reconfig) if you
are using the network segmentation feature on your cluster and you want to change
the IP addresses of the backplane (eth2) interface. See the Reconfiguring the Backplane
Network topic in the Prism Web Console Guide for instructions about how to change
the IP addresses of the backplane (eth2) interface.
Following is the summary of steps that you must perform to change the IP addresses on a
Nutanix cluster.
1. Check the health of the cluster infrastructure and resiliency (For more information, see the
Before you begin section of this document.)
2. Stop the cluster.
3. Change the VLAN and NIC Teaming configurations as necessary.
Note: Before you perform step 4, check the connectivity between CVMs and hosts; all the
hosts must be reachable from all the CVMs and vice versa. If any CVM or host is not
reachable, contact Nutanix Support for assistance.
Warning: If you are changing the Controller VM IP addresses to another subnet, network, IP
address range, or VLAN, you must also change the hypervisor management IP addresses to the
same subnet, network, IP address range, or VLAN.
See the Changing the IP Address of an Acropolis Host topic in the AHV Administration Guide for
instructions about how to change the IP address of an AHV host.
Procedure
1. Log on to the hypervisor with SSH (vSphere or AHV), remote desktop connection (Hyper-V),
or the IPMI remote console.
If you are unable to reach the IPMI IP addresses, reconfigure by using the BIOS or hypervisor
command line.
For using BIOS, see the Configuring the Remote Console IP Address (BIOS) topic in the
Acropolis Advanced Setup Guide.
For using the hypervisor command line, see the Configuring the Remote Console IP Address
(Command Line) topic in the Acropolis Advanced Setup Guide.
Warning: This step affects the operation of a Nutanix cluster. Schedule a down time before
performing this step.
If you are using VLAN tags on your CVMs and on the management network for your
hypervisors and you want to change the VLAN tags, make these changes after the cluster is
stopped.
For information about assigning VLANs to hosts and the Controller VM, see the indicated
documentation:
• AHV: Assigning an Acropolis Host to a VLAN and Assigning the Controller VM to a VLAN
topics in the AHV Administration Guide.
• ESXi: For instructions about tagging a VLAN on an ESXi host by using DCUI, see the
Configuring Host Networking (ESXi) topic in the vSphere Administration Guide for
Acropolis (using vSphere HTML 5 Client).
Note: If you are relocating the cluster to a new site, the external_ip_reconfig script works only
if all the CVMs are up and accessible with their old IP addresses. Otherwise, contact Nutanix
Support to manually change the IP addresses.
After you have stopped the cluster, shut down the CVMs and hosts and move the cluster.
Proceed with step 4 only after you start the cluster at the desired site and you have
confirmed that all CVMs and hosts can SSH to one another. As a best practice, ensure that
4. Run the external IP address reconfiguration script (external_ip_reconfig) from any one
Controller VM in the cluster.
nutanix@cvm$ external_ip_reconfig
5. Follow the prompts to type the new netmask, gateway, and external IP addresses.
A message similar to the following is displayed after the reconfiguration is successfully
completed:
External IP reconfig finished successfully. Restart all the CVMs and start the cluster.
6. Note:
If you have changed the CVMs to a new subnet, you must now update the
IP addresses of hypervisor hosts to the new subnet. Change the hypervisor
management IP address or IPMI IP address before you restart the Controller VMs.
7. After you turn on every CVM, log on to each CVM and verify that the IP address has been
successfully changed. Note that it might take up to 10 minutes for the CVMs to show the new
IP addresses after they are turned on.
Note: If you see any of the old IP addresses in the output of the following commands, or the
commands fail to run, stop and contact Nutanix Support for assistance.
c. From any one CVM in the cluster, verify that the following outputs show the new IP
address scheme and that the Zookeeper IDs are mapped correctly.
Note: Never edit the following files manually. Contact Nutanix Support for assistance.
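A hedged sketch of such a verification, using standard Controller VM helpers (svmips and hostips print the configured CVM and host addresses, and the Zookeeper host mappings live in /etc/hosts):
nutanix@cvm$ svmips
nutanix@cvm$ hostips
nutanix@cvm$ grep zk /etc/hosts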
If the cluster starts properly, output similar to the following is displayed for each node in the
cluster:
CVM: 10.1.64.60 Up
Zeus UP [3704, 3727, 3728, 3729, 3807, 3821]
Scavenger UP [4937, 4960, 4961, 4990]
SSLTerminator UP [5034, 5056, 5057, 5139]
Hyperint UP [5059, 5082, 5083, 5086, 5099, 5108]
Medusa UP [5534, 5559, 5560, 5563, 5752]
DynamicRingChanger UP [5852, 5874, 5875, 5954]
Pithos UP [5877, 5899, 5900, 5962]
Stargate UP [5902, 5927, 5928, 6103, 6108]
Cerebro UP [5930, 5952, 5953, 6106]
Chronos UP [5960, 6004, 6006, 6075]
Curator UP [5987, 6017, 6018, 6261]
Prism UP [6020, 6042, 6043, 6111, 6818]
AlertManager UP [6070, 6099, 6100, 6296]
Arithmos UP [6107, 6175, 6176, 6344]
SysStatCollector UP [6196, 6259, 6260, 6497]
Tunnel UP [6263, 6312, 6313]
ClusterHealth UP [6317, 6342, 6343, 6446, 6468, 6469, 6604, 6605,
6606, 6607]
Janus UP [6365, 6444, 6445, 6584]
NutanixGuestTools UP [6377, 6403, 6404]
What to do next
• Run the following NCC checks to verify the health of the Zeus configuration. If any of these
checks report a failure or you encounter issues, contact Nutanix Support.
• If you have configured remote sites for data protection, you must update the new IP
addresses on both the sites by using the Prism Element web console.
• Configure the network settings on the cluster such as DNS, DHCP, NTP, SMTP, and so on.
• Power on the guest VMs and configure the network settings in the new network domain.
• After you verify that the cluster services are up and that there are no alerts informing that
the services are restarting, you can change the IPMI IP addresses at this stage, if necessary.
For instructions about how to change the IPMI addresses, see the Configuring the Remote
Console IP Address (Command Line) topic in the Acropolis Advanced Setup Guide.
• This feature also improves the initial placement of the VMs depending on the VM
configuration.
• The Acropolis block services feature uses the ADS feature for balancing sessions of the
externally visible iSCSI targets.
Note: If you have configured any host or VM-host affinity or VM-VM anti-affinity policies, these
policies are honored.
By default, the ADS feature is enabled, and Nutanix recommends that you keep it enabled.
However, you can disable the feature by using aCLI; see Disabling Acropolis Dynamic Scheduling
on page 33. Even if you disable the feature, the checks for contentions or hotspots continue to
run in the background, and if any anomalies are detected, an alert is raised in the Alerts
dashboard after the third notification. However, the ADS feature takes no action to resolve these
contentions; you must take remedial action manually or re-enable the feature. For more
information about enabling the ADS feature, see Enabling Acropolis Dynamic Scheduling on
page 33.
• Ensure that all the hosts are running AOS 5.0 or later releases.
• The iSCSI targets are displayed as an empty entity. However, if any action is taken on an
iSCSI target, the relevant message is displayed in the Tasks dashboard.
• If a problem is detected and the ADS cannot solve the issue (for example, because of limited
CPU or storage resources), the migration plan might fail. In these cases, an alert is generated.
You need to monitor these alerts from the Alerts dashboard of the Prism Web console and
take necessary remedial actions.
• If a host, firmware, or AOS upgrade is in progress and resource contention occurs, no
resource contention rebalancing is performed for the duration of the upgrade.
Procedure
1. Log on to a Controller VM in your cluster through an SSH session and access the Acropolis
command line (aCLI).
Even after you disable the feature, the checks for contentions or hotspots continue to run in
the background, and an alert is raised in the Alerts dashboard if any anomalies are detected.
However, ADS takes no action to resolve the contentions; you must take remedial action
manually or re-enable the feature.
Procedure
1. Log on to a Controller VM in your cluster through an SSH session and access the Acropolis
command line (aCLI).
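In sketch form, and assuming the ads.update call is available in this AOS family (verify the exact call with acli help), the disable and enable steps are:
nutanix@cvm$ acli ads.update enable=false
nutanix@cvm$ acli ads.update enable=true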
• As the logs are forwarded from a Controller VM, the logs display the IP address of the
Controller VM.
• You can only configure one rsyslog server; you cannot specify multiple servers.
• After a remote syslog server is configured, it is enabled by default. (The Controller VM
begins sending log messages once the syslog server is configured.)
• Supported transport protocols are TCP and UDP.
• You can also forward logs to a remote syslog server by using Reliable Event Logging
Protocol (RELP). To use RELP logging, ensure that you have installed rsyslog-relp on the
remote syslog server.
Note: You can use RELP logging only if the transport protocol is TCP.
• rsyslog-config supports and can report messages from the following Nutanix modules.
For each module, one set of logs is forwarded when monitor logs are disabled, and
additional logs are also forwarded when monitor logs are enabled; for example, all the
AHV host logs that are stored in /var/log/messages are forwarded to the remote syslog
server, and ERROR-level messages are forwarded in both cases.
Note: As the logs are forwarded from a Controller VM, the logs display the IP address of the
Controller VM.
Procedure
1. As the remote syslog server is enabled by default, disable it while you configure settings.
ncli> rsyslog-config set-status enable=false
2. Create a syslog server (which adds it to the cluster) and confirm it has been created.
ncli> rsyslog-config add-server name=remote_server_name relp-enabled={true | false} ip-address=remote_ip_address port=port_num network-protocol={tcp | udp}
ncli> rsyslog-config ls-servers
Name : remote_server_name
IP Address : remote_ip_address
Port : port_num
Protocol : TCP or UDP
Relp Enabled : true or false
Option: Description
remote_server_name: A descriptive name for the remote server receiving the specified messages
remote_ip_address: The remote server's IP address
port_num: The destination port number on the remote server
tcp | udp: The transport protocol, tcp or udp
true | false: true to enable RELP, false to disable RELP
• ACROPOLIS
• AUDIT
• CASSANDRA
• CEREBRO
• CURATOR
• GENESIS
• PRISM
• STARGATE
• SYSLOG_MODULE
• ZOOKEEPER
• Replace loglevel with one of the following:
• DEBUG
• INFO
• NOTICE
• WARNING
• ERROR
• CRITICAL
• ALERT
• EMERGENCY
Enable module logs at the ERROR level unless you require more information.
• (Optional) Set include-monitor-logs to specify whether the monitor logs are sent. It is
enabled (true) by default. If disabled (false), only certain logs are sent.
Note: If enabled, the include-monitor-logs option sends all monitor logs, regardless of the
level set by the level= parameter.
Note: The rsyslog configuration is sent to Prism Central, Prism Element, and AHV only if the
module selected for export is applicable to them.
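Putting these options together, a hedged example of attaching one module to the server created earlier and then re-enabling forwarding follows; the add-module parameter names mirror the patterns shown above and should be confirmed against ncli rsyslog-config help for your release:
ncli> rsyslog-config add-module server-name=remote_server_name module-name=STARGATE level=ERROR include-monitor-logs=false
ncli> rsyslog-config set-status enable=true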
.FATAL Logs
If a component fails, it creates a log file named according to the following convention:
component-name.cvm-name.log.FATAL.date-timestamp
• component-name identifies the component that created the file, such as Curator or Stargate.
• cvm-name identifies the Controller VM that created the file.
• date-timestamp identifies the date and time when the first failure within that file occurred.
Each failure creates a new .FATAL log file.
Log entries use the following format:
[IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
The first character indicates whether the log entry is an Info, Warning, Error, or Fatal message.
The next four characters indicate the month and day on which the entry was made. For example,
if an entry starts with F0820, the component had a failure at some time on August 20th.
Tip: The cluster also creates .INFO and .WARNING log files for each component. Sometimes, the
information you need is stored in one of these files.
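A quick way to see which components have failed recently, and when, is to list the FATAL files in the logs directory; for example:
nutanix@cvm$ ls -ltr /home/nutanix/data/logs/*.FATAL
nutanix@cvm$ cat /home/nutanix/data/logs/stargate.FATAL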
/home/nutanix/data/logs/cassandra
This is the directory where the Cassandra metadata database stores its logs. The Nutanix
process that starts the Cassandra database (cassandra_monitor) logs to the /home/nutanix/
data/logs directory. However, the most useful information relating to Cassandra is found in
the system.log* files located in the /home/nutanix/data/logs/cassandra directory.
system.log: Cassandra system activity
iostat.INFO: I/O activity for each physical disk, sampled every 5 seconds (sudo iostat)
num.processed: Alerts that have been processed
Correlating the FATAL log to the INFO file
1. Search for the timestamp of the FATAL event in the corresponding INFO files.
d. Analyze the log entries immediately before the FATAL event, especially any errors or
warnings.
In the following example, the latest stargate.FATAL determines the exact timestamp:
nutanix@cvm$ cat stargate.FATAL
Log file created at: 2013/09/07 01:22:23
Running on machine: NTNX-12AM3K490006-2-CVM
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
F0907 01:22:23.124495 10559 zeus.cc:1779] Timed out waiting for Zookeeper session
establishment
In the above example, the timestamp is F0907 01:22:23, or September 7 at 1:22:23 AM.
Next, grep for this timestamp in the stargate*INFO* files:
nutanix@cvm$ grep "^F0907 01:22:23" stargate*INFO* | cut -f1 -d:
stargate.NTNX-12AM3K490006-2-CVM.nutanix.log.INFO.20130904-220129.7363
2. If a process is repeatedly failing, it might be faster to do a long listing of the INFO files and
select the one immediately preceding the current one. The current one would be the one
referenced by the symbolic link.
For example, in the output below, the last failure would be recorded in the file
stargate.NTNX-12AM3K490006-2-CVM.nutanix.log.INFO.20130904-220129.7363.
ls -ltr stargate*INFO*
-rw-------. 1 nutanix nutanix 104857622 Sep 3 11:22 stargate.NTNX-12AM3K490006-2-
CVM.nutanix.log.INFO.20130902-004519.7363
-rw-------. 1 nutanix nutanix 104857624 Sep 4 22:01 stargate.NTNX-12AM3K490006-2-
CVM.nutanix.log.INFO.20130903-112250.7363
-rw-------. 1 nutanix nutanix 56791366 Sep 5 15:12 stargate.NTNX-12AM3K490006-2-
CVM.nutanix.log.INFO.20130904-220129.7363
lrwxrwxrwx. 1 nutanix nutanix 71 Sep 7 01:22 stargate.INFO ->
stargate.NTNX-12AM3K490006-2-CVM.nutanix.log.INFO.20130907-012223.11357
-rw-------. 1 nutanix nutanix 68761 Sep 7 01:33 stargate.NTNX-12AM3K490006-2-
CVM.nutanix.log.INFO.20130907-012223.11357
Tip: You can use the procedure above for the other types of files as well (WARNING and
ERROR) in order to narrow the window of information. The INFO file provides all messages,
WARNING provides only warning, error, and fatal-level messages, ERROR provides only error
and fatal-level messages, and so on.
Stargate Logs
This section discusses common entries found in Stargate logs and what they mean.
The Stargate logs are located at /home/nutanix/data/logs/stargate.[INFO|WARNING|ERROR|
FATAL].
This message is generic and can happen for a variety of reasons. While Stargate is initializing,
a watch dog process monitors it to ensure a successful startup process. If it has trouble
connecting to other components (such as Zeus or Pithos) the watch dog process stops
Stargate.
If Stargate is running, this indicates that the alarm handler thread is stuck for longer than 30
seconds. The stoppage could be due to a variety of reasons, such as problems connecting to
Zeus or accessing the Cassandra database.
To analyze why the watch dog fired, first locate the relevant INFO file, and review the entries
leading up to the failure.
This message indicates that Stargate is unable to communicate with Medusa. This may be due
to a network issue.
Analyze the ping logs and the Cassandra logs.
Log Entry: CAS failure seen while updating metadata for egroup egroupid or Backend
returns error 'CAS Error' for extent group id: egroupid
W1001 16:22:34.496806 6938 vdisk_micro_egroup_fixer_op.cc:352] CAS failure seen while updating
metadata for egroup 1917333
This is a benign message and usually does not indicate a problem. This warning message means
that another Cassandra node has already updated the database for the same key.
Log Entry: Fail-fast after detecting hung stargate ops: Operation with id <opid> hung
for 60secs
F0712 14:19:13.088392 30295 stargate.cc:912] Fail-fast after detecting hung stargate ops:
Operation with id 3859757 hung for 60secs
This message indicates that Stargate restarted because an I/O operation took more than 60
seconds to complete.
To analyze why the I/O operation took more than 60 seconds, locate the relevant INFO file and
review the entries leading up to the failure.
This message indicates that Stargate had 5 failed attempts to connect to Medusa/Cassandra.
Review the Cassandra log (cassandra/system.log) to see why Cassandra was unavailable.
Log Entry: Forwarding of request to NFS master ip:2009 failed with error kTimeout.
W1002 18:50:59.248074 26086 base_op.cc:752] Forwarding of request to NFS master
172.17.141.32:2009 failed with error kTimeout
This message indicates that Stargate cannot connect to the NFS master on the node specified.
Review the Stargate logs on the node specified in the error.
Cassandra Logs
After analyzing Stargate logs, if you suspect an issue with Cassandra/Medusa, analyze the
Cassandra logs. This topic discusses common entries found in system.log and what they mean.
The Cassandra logs are located at /home/nutanix/data/logs/cassandra. The most recent file is
named system.log. When the file reaches a certain size, it rolls over to a sequentially numbered
file (example, system.log.1, system.log.2, and so on).
Log Entry: batch_mutate 0 writes succeeded and 1 column writes failed for
keyspace:medusa_extentgroupidmap
INFO [RequestResponseStage:3] 2013-09-10 11:51:15,780 CassandraServer.java
(line 1290) batch_mutate 0 writes succeeded and 1 column writes failed for
keyspace:medusa_extentgroupidmap cf:extentgroupidmap row:lr280000:1917645 Failure Details:
Failure reason:AcceptSucceededForAReplicaReturnedValue : 1
This is a common log entry and can be ignored. It is equivalent to the CAS errors in the
stargate.ERROR log. It simply means that another Cassandra node updated the keyspace first.
This message indicates that the node could not communicate with the Cassandra instance at
the specified IP address.
Either the Cassandra process is down (or failing) on that node or there are network
connectivity issues. Check the node for connectivity issues and Cassandra process restarts.
Log Entry: Caught Timeout exception while waiting for paxos read response from leader:
x.x.x.x
ERROR [EXPIRING-MAP-TIMER-1] 2013-08-08 07:33:25,407 PaxosReadDoneHandler.java (line 64) Caught
Timeout exception while waiting for paxos read reponse from leader: 172.16.73.85. Request Id:
116. Proto Rpc Id : 2119656292896210944. Row no:1. Request start time: Thu Aug 08 07:33:18 PDT
2013. Message sent to leader at: Thu Aug 08 07:33:18 PDT 2013 # commands:1 requestsSent: 1
This message indicates that the node encountered a timeout while waiting for the Paxos leader.
Either the Cassandra process is down (or failing) on that node or there are network
connectivity issues. Check the node for connectivity issues or for the Cassandra process
restarts.
Prism Gateway Log
If there are issues with connecting to the Nutanix UI, escalate the case and provide the
output of the ss -s command as well as the contents of prism_gateway.log.
Zookeeper Logs
The Zookeeper logs are located at /home/nutanix/data/logs/zookeeper.out.
This log contains the status of the Zookeeper service. More often than not, there is no need
to look at this log. However, if one of the other logs specifies that it is unable to contact
Zookeeper and it is affecting cluster operations, you may want to look at this log to find the
error Zookeeper is reporting.
Genesis.out
When checking the status of the cluster services, if any of the services are down, or the
Controller VM is reporting Down with no process listing, review the log at /home/nutanix/data/logs/genesis.out:
2017-03-23 19:24:00 INFO node_manager.py:4732 Checking if we need to sync the local SVM and
Hypervisor DNS configuration with Zookeeper
2017-03-23 19:26:00 INFO node_manager.py:1960 Certificate signing request data is not available
in Zeus configuration
2017-03-23 19:26:00 INFO node_manager.py:1880 No Svm certificate maps found in the Zeus
configuration
2017-03-23 19:26:00 INFO node_manager.py:4732 Checking if we need to sync the local SVM and
Hypervisor DNS configuration with Zookeeper
2017-03-23 19:28:00 INFO node_manager.py:1960 Certificate signing request data is not available
in Zeus configuration
2017-03-23 19:28:00 INFO node_manager.py:1880 No Svm certificate maps found in the Zeus
configuration
Under normal conditions, the genesis.out file logs the following messages periodically:
Unpublishing service Nutanix Controller
Publishing service Nutanix Controller
Zookeeper is running as [leader|follower]
Prior to these occasional messages, you should see Starting [n]th service. This is an indicator
that all services were successfully started. As of 5.0, there are 34 services.
Possible Errors
2017-03-23 19:28:00 WARNING command.py:264 Timeout executing scp -q -o CheckHostIp=no -o
ConnectTimeout=15 -o StrictHostKeyChecking=no -o TCPKeepAlive=yes -o UserKnownHostsFile=/dev/
null -o PreferredAuthentications=keyboard-interactive,password -o BindAddress=192.168.5.254
'root@[192.168.5.1]:/etc/resolv.conf' /tmp/resolv.conf.esx: 30 secs elapsed
2017-03-23 19:28:00 ERROR node_dns_ntp_config.py:287 Unable to download ESX DNS configuration
file, ret -1, stdout , stderr
2017-03-23 19:28:00 WARNING node_manager.py:2038 Could not load the local ESX configuration
2017-03-23 19:28:00 ERROR node_dns_ntp_config.py:492 Unable to download the ESX NTP
configuration file, ret -1, stdout , stderr
Any of the above messages means that Genesis was unable to log on to the ESXi host using the
configured password.
Procedure
1. Examine the contents of the genesis.out file and locate the stack trace (indicated by the
CRITICAL message type).
In the example above, the certificates in AuthorizedCerts.txt were not updated, which means
that Genesis could not connect to the NutanixHostAgent service on the host.
Nutanix Calm Log Files
/home/docker/nucalm/logs: Logs of microservices from the Nutanix Calm container.
/home/docker/epsilon/logs: Logs of microservices from the Epsilon container.
/home/nutanix/data/logs/genesis.out: Logs containing information about enabling the container
service and starting the Nutanix Calm and Epsilon containers.
/home/nutanix/data/logs/epsilon.out: Logs containing information about starting the Epsilon
service.
Note: Some plugins run nCLI commands and might require the user to input the nCLI password.
The password is logged as plain text. If you change the password of the admin user from
the default, you must specify the password every time you start an nCLI session from a remote
system. A password is not required if you are starting an nCLI session from a Controller VM
where you are already logged on.
NCC Output
Each NCC plugin is a test that completes independently of other plugins. Each test completes
with one of these status types. The status might also display a link to a Nutanix Support Portal
Knowledge Base article with more details about the check, or information to help you resolve
issues NCC finds.
PASS
The tested aspect of the cluster is healthy and no further action is required. A check can
also return a PASS status if it is not applicable.
FAIL
The tested aspect of the cluster is not healthy and must be addressed. This message
requires immediate action. If you do not take immediate action, the cluster might
become unavailable or require intervention by Nutanix Support.
• From the Prism web console Health page, select Actions > Run Checks. Select All
checks and click Run.
• If you disable a check in the Prism web console, you cannot run it from the NCC
command line unless you enable it again from the web console.
• You can run NCC checks from the Prism web console for clusters where AOS 5.0 or
later and NCC 3.0 or later are installed. You cannot run NCC checks from the Prism
web console for clusters where AOS 4.7.x or previous and NCC 3.0 are installed.
• For AOS clusters where it is installed, running NCC 3.0 or later from the command
line updates the Cluster Health score, including the color of the score. For some NCC
checks, you can clear the score by disabling and then re-enabling the check.
Run two or more individual checks at a time
• You can specify two or more individual checks from the command line, with each
check separated by a comma. Ensure you do not use any spaces between checks, only
a comma character. For example:
ncc health_checks system_checks \
--plugin_list="cluster_version_check,cvm_reboot_check"
• You can re-run any NCC checks or plug-ins that reported a FAIL status.
ncc --rerun_failing_plugins=True
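The complete check suite can also be run in one pass from any Controller VM by using the run_all plugin set:
nutanix@cvm$ ncc health_checks run_all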
Diagnostics VMs
Nutanix provides a diagnostics capability that allows partners and customers to run performance
tests on the cluster. This is a useful tool for pre-sales demonstrations of the cluster and for
identifying the source of performance issues in a production cluster. Diagnostics should also
be run as part of setup to ensure that the cluster is running properly before the customer takes
ownership of the cluster.
The diagnostic utility deploys a VM on each node in the cluster. The Controller VMs control the
diagnostic VM on their hosts and report back the results to a single system.
• Ensure that 10 GbE ports are active on the ESXi hosts by using esxtop or vCenter. The tests
run very slowly if the nodes are not using the 10 GbE ports. For more information about this
known issue with ESXi 5.0 update 1, see VMware KB article 2030006.
Procedure
(vSphere only) In vCenter, right-click any diagnostic VMs labeled as "orphaned", select
Remove from Inventory, and click Yes to confirm removal.
If the command fails with ERROR:root:Zookeeper host port list is not set, refresh the
environment by running source /etc/profile or bash -l and run the command again.
The diagnostic may take up to 15 minutes to complete for a four-node cluster. Larger
clusters take longer.
The script performs the following tasks:
1. Installs a diagnostic VM on each node.
2. Creates cluster entities to support the test, if necessary.
3. Runs four performance tests, using the Linux fio utility.
4. Reports the results.
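In sketch form, the run itself (and the cleanup afterward) looks like the following, assuming the usual script location in the nutanix home directory; confirm the path for your AOS release:
nutanix@cvm$ ~/diagnostics/diagnostics.py run
nutanix@cvm$ ~/diagnostics/diagnostics.py cleanup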
(vSphere only) In vCenter, right-click any diagnostic VMs labeled as "orphaned", select
Remove from Inventory, and click Yes to confirm removal.
Diagnostics Output
System output similar to the following indicates a successful test.
Checking if an existing storage pool can be used ...
Using storage pool sp1 for the tests.
Note:
• Expected results vary based on the specific AOS version and hardware model used.
• The IOPS values reported by the diagnostics script are higher than the values reported
by the Nutanix management interfaces. This difference is because the diagnostics
script reports physical disk I/O, and the management interfaces show IOPS reported
by the hypervisor.
• If the reported values are lower than expected, the 10 GbE ports may not be active.
For more information about this known issue with ESXi 5.0 update 1, see VMware KB
article 2030006.
Syscheck Utility
Syscheck is a tool that runs load on a cluster and evaluates its performance characteristics.
The tool provides pass or fail feedback for each check; the current checks are network
throughput and direct disk random write performance. Syscheck tracks the tests on a per-node
basis and prints the result at the conclusion of the test.
Note:
• Run this test on a newly created cluster or a cluster that is idle or has minimal load.
• Do not run this test if systems are sharing the network as it may interfere with their
operation.
Procedure
After you execute the command, a message listing all the considerations for running this
test is displayed. When prompted, type yes to run the check.
The test returns either a pass or a fail result. The latest result is placed in the /home/nutanix/
data/syscheck directory, and an output tar file is placed in the /home/nutanix/data/ directory
each time you run this utility.
7. CONTROLLER VM MEMORY CONFIGURATIONS
Controller VM memory allocation requirements differ depending on the models and the features
that are being used.
Note: G6/Skylake platforms do not have workload memory requirements for Controller VM and
vCPU configurations, unlike the G4/G5 platforms. G6/Skylake platforms do have Controller VM
memory configuration requirements and recommendations for features. See CVM Memory
Configurations for Features on page 58.
The Foundation imaging process sets the default memory allocated to each Controller VM for
all platforms.
Note: If the AOS upgrade process detects that a node's hypervisor host has total physical
memory of 64 GB or greater, it automatically increases the memory of any Controller VM on
that node that has less than 32 GB by 4 GB, up to a maximum of 32 GB.
If the AOS upgrade process detects any node with less than 64 GB of memory, no
memory changes occur.
For nodes with ESXi hypervisor hosts with total physical memory of 64 GB, the
Controller VM is increased to a maximum of 28 GB. With total physical memory greater
than 64 GB, the existing Controller VM memory is increased by 4 GB.
Nutanix does not support decreasing Controller VM memory below recommended
minimum amounts needed for cluster and add-in features. Nutanix Cluster Checks
(NCC), preupgrade cluster checks, and the AOS upgrade process detect and monitor
Controller VM memory.
Note: G6/Skylake platforms do not have workload memory requirements for Controller VM and
vCPU configurations, unlike the G4/G5 platforms. G6/Skylake platforms do have Controller VM
memory configuration requirements and recommendations for features. See CVM Memory
Configurations for Features on page 58.
The Foundation imaging process sets the number of vCPUs allocated to each Controller VM
according to your platform model. This table shows the default memory allocated to each
Controller VM for all platforms.
Workload Exceptions
Note: Upgrading to 5.1 requires a 4 GB memory increase unless the CVM already has 32 GB of
memory.
If all the data disks in a platform are SSDs, the node is assigned the High Performance workload,
with the following exceptions.
• Klas Voyager 2 uses SSDs, but for workload balance the default workload for this platform is
VDI.
• Cisco B-series is expected to have large remote storage and two SSDs as a local cache for
the hot tier, so this platform workload is VDI.
Platform models listed here include NX-8035-G5, NX-6035-G5, NX-6155-G5, and NX-8150-G5; Lenovo HX1310, HX2710-E, HX3510-FG, HX3710-F, and HX5510-C; and Cisco C220-M4L and Hyperflex HX220C-M4S.
The following tables show the minimum amount of memory and vCPU requirements and
recommendations for the Controller VM on each node for platforms that do not follow the
default.
XC730xd-24, XC6320-6AF, XC630-10AF: 32 20 8
HX-3500, HX-5500, HX-7500: 28 8
Features and memory (GB):
• Capacity tier deduplication (includes performance tier deduplication): 12
• Redundancy factor 3: 8
• Performance tier deduplication: 8
• Cold-tier nodes + capacity tier deduplication: 4
• Capacity tier deduplication + redundancy factor 3: 12
Table 23: Controller VM Memory Requirements for Remote Direct Memory Access (RDMA)
Clusters
License
The provision of this software to you does not grant any licenses or other rights under any
Microsoft patents with respect to anything other than the file server implementation portion of
the binaries for this software, including no licenses or any other rights in any hardware or any
devices or software that are used to communicate with or in connection with this software.
Conventions
Convention: Description
root@host# command: The commands are executed as the root user in the vSphere or Acropolis host shell.
> command: The commands are executed in the Hyper-V host shell.
Version
Last modified: February 4, 2021 (2021-02-04T17:34:14+05:30)