MX-ONE Virtual r7.4 (Redundancy)
Using server redundancy, a standby server can take over the tasks of a regular server suffering from,
for example, hardware failure. This way, a faulty server can be replaced with a minimum of
disturbance.
When using server redundancy, regular servers and an additional standby server are grouped as a
cluster. The standby server is prepared with data from the regular servers in the cluster and ready to
start an instance of any of these servers in case of a server fault.
To build a truly fault-tolerant cluster, network redundancy can be combined with server redundancy.
In particular, when the network connection is lost, both the regular server and the standby server
will run the MX-ONE Service Node service. When the network connection is restored and the regular
server and the standby server detect each other running the MX-ONE Service Node service, the
regular server stops running the service. A preloaded cluster behaves like a non-preloaded cluster
in this regard. Network redundancy reduces the likelihood of both servers running the service
simultaneously.
If the Media Server (MS) is used, there are two different configurations for the control of the
Media Gateway(s) of the failing LIM. If the MS is co-located with the (failing) Service Node, a new
MS is started by the standby server. If the Media Server runs on a stand-alone server, the standby
server will continue to control the separate Media Server of the failing LIM. When the Media Server
is co-located, both the control and media interfaces must be set to the same address as the Service
Node for the standby server to start the Media Server properly.
NOTE: If voice announcement functions are used, the announcement prompts must be made available
(preloaded) in the standby Media Gateway.
If more than one regular server in a cluster is faulty, the standby server will replace only one of
them. The other faulty servers will not operate. Which of the faulty regular servers the standby
server replaces depends on each regular server's priority to run on the standby server. If the
priorities are equal, the first regular server started on the standby server will continue running.
If a server with higher priority fails while the standby server is already running a regular server,
the running server is replaced by the server with higher priority.
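The selection rule described above can be sketched in Python. This is only an illustrative model, not MX-ONE code: the `RegularServer` class, the `choose_server_to_run` function, and the convention that a larger number means a higher priority are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RegularServer:
    name: str
    priority: int  # assumption: a larger value means a higher priority


def choose_server_to_run(
    faulty: List[RegularServer],
    running: Optional[RegularServer] = None,
) -> Optional[RegularServer]:
    """Return the faulty regular server that the standby server should host.

    `faulty` lists the currently faulty servers in the order they failed.
    `running` is the server already hosted on the standby server, if any.
    """
    if not faulty:
        return None
    # max() returns the first maximal element, so among equal priorities
    # the server that failed first is selected.
    best = max(faulty, key=lambda server: server.priority)
    # An already running server keeps its place unless a strictly higher
    # priority server has failed.
    if running is not None and running in faulty and running.priority >= best.priority:
        return running
    return best
```

For example, with two faulty servers of equal priority the one already running on the standby server is kept, while a later failure of a higher-priority server displaces it.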
Each regular server in a cluster is configured with two addresses, a base IP address and an alias IP
address. The standby server is configured with only a base IP address. In case of a faulty regular
server, the standby server will take over the alias IP address of the faulty server.
When a regular server recovers from a failure, the MX-ONE Service Node programs running on the
standby server can fall back to the regular server.
During fallback, the MX-ONE Service Node software is stopped on the regular and standby servers,
and then started on the regular server.
If a cluster server finds another cluster server with the same alias address active, one of the
servers will remove the alias address and stop running the telephony programs. Which server stops
depends on the configuration. Normally the regular server is stopped and then the fallback is
executed, either automatically or manually. This measure avoids the consequences of split brain.
If the alias IP address is found to be in use by some other network equipment (a non-cluster
server), the LIM may fail to start. This is an example of a faulty network configuration.
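The alias conflict handling described above can be modelled as a small decision function. This is a hedged sketch only: the `Role` enum, the `local_action` function, its return strings, and the `stop_role` parameter (standing in for the configuration that decides which side stops) are hypothetical names invented for illustration and do not correspond to MX-ONE commands or settings.

```python
from enum import Enum


class Role(Enum):
    REGULAR = "regular"
    STANDBY = "standby"
    NON_CLUSTER = "non-cluster"


def local_action(local: Role, peer: Role, stop_role: Role = Role.REGULAR) -> str:
    """Decide what the local server does when `peer` holds the same alias.

    `stop_role` models the configuration choice of which side stops;
    the default mirrors the normal case where the regular server stops.
    """
    if peer is Role.NON_CLUSTER:
        # The alias is used by equipment outside the cluster: this is a
        # network configuration fault, and the LIM may not start.
        return "report-conflict"
    if local is stop_role:
        # This side removes the alias and stops the telephony programs;
        # fallback (automatic or manual) follows.
        return "release-alias-and-stop"
    return "keep-running"
```

Because exactly one side matches `stop_role`, the two servers never both keep the alias, which is the property that prevents split brain in this model.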
HANDLING OF SERVER DATA
The standby server is prepared with data from the regular servers in the cluster and is ready to start
an instance of any faulty regular server within the cluster. Reload and system database data is copied
from the regular servers in the cluster to the standby server after every data backup and once every
24-hour period.
For correct data synchronization, the servers' clocks must be in sync and NTP must be configured
properly. This is normally configured during system installation. If the clocks are adjusted
manually, make sure that the servers' clocks remain in sync. The modification times of the reload
data files are used to select the data sync direction between the regular and standby servers.
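To illustrate why clock synchronization matters, the following Python sketch chooses a sync direction by comparing file modification times. It is an assumption-laden illustration, not the MX-ONE implementation: the function name `sync_direction`, the returned labels, and the file names are hypothetical, and the rule "the newer file wins" is an assumed reading of the paragraph above.

```python
import os
import tempfile


def sync_direction(regular_file: str, standby_file: str) -> str:
    """Pick the copy direction from the files' modification times."""
    regular_mtime = os.path.getmtime(regular_file)
    standby_mtime = os.path.getmtime(standby_file)
    if regular_mtime > standby_mtime:
        return "regular->standby"
    if standby_mtime > regular_mtime:
        return "standby->regular"
    return "in-sync"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        regular = os.path.join(tmp, "regular_reload.dat")  # hypothetical file name
        standby = os.path.join(tmp, "standby_reload.dat")  # hypothetical file name
        for path in (regular, standby):
            open(path, "w").close()
        # Set explicit (access time, modification time) pairs in seconds.
        os.utime(regular, (2000, 2000))
        os.utime(standby, (1000, 1000))
        print(sync_direction(regular, standby))
```

In a scheme like this, a server whose clock runs ahead would stamp its files with later times and could win the comparison even when its data is older, which is why the clocks have to agree.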
PRELOADED CLUSTER
A standby server is preloaded with programs and data to make failover faster. This is only possible
for a cluster consisting of one regular server and one standby server.
In a preloaded cluster the alias address is started on both servers at the same time, but it is
blocked in the Linux kernel on the passive side. If the regular server fails, the blocking of the
alias address is removed and the standby server becomes functional.
The passive side is updated with reload and system database data from the active side when the
data_backup command is used on the active side. A data reload is then executed automatically on the
passive side to prepare it for failover.
The time to detect a server as failed is lower in a preloaded cluster than in a regular cluster, 30
seconds.
A shorter failure detection time increases the risk of falsely detecting a server failure. This
places higher requirements on the cluster servers and networks. Ensure that the cluster is
configured with high-performance servers, and use network and storage with enough bandwidth. Use of
network redundancy is recommended.
Failover or fallback can occur to a server (LIM) that is currently loading, but has not yet reached the
preloaded state. If this happens the time to recover will be longer than with a server that has reached
the preloaded state.
With automatic fallback, for instance, if the regular server is reloaded, fallback will occur as
soon as the two servers have found each other. This happens while the regular server (LIM) is still
loading. The traffic disturbance at recovery is in this case longer than if manual fallback is
used, because a manual fallback can be ordered once the reloaded server has reached the preloaded
state.
NOTE: MX-ONE requires all MX-ONE Service Nodes to be up and running in order to make changes in the
system, because of the reload data used in some functionalities. Customers who require moves, adds
and changes while an MX-ONE Service Node is down must therefore use server redundancy. The Service
Node can be a redundant server (1+1 or N+1).
Server Redundancy N + 1
In N+1 server redundancy, a cluster of up to 10 Service Nodes can have 1 standby server.
At failover, the standby server will take over the identity of the failing server and the control of
the media gateways/servers in the failing server.
Server Redundancy 1 + 1
In 1+1 server redundancy, a cluster of 1 Service Node is created with 1 standby server.
At failover, the standby server will take over the identity of the failing server and the control of
the media gateways/servers in the failing server. Failover is faster with a 1+1 preloaded standby
than with 1+1 server redundancy that is not preloaded.
Prerequisites
The following requirements and limitations apply to installations using server redundancy:
• A cluster can have up to ten LIMs.
• A cluster can have only one standby server.
• It is possible to have as many clusters as there are LIMs in the system (with a maximum of
one standby server per LIM server).
• A standby server can belong to only one cluster.
• A LIM server can belong to only one cluster.
• All servers in a cluster must reside on the same subnet. Gratuitous ARP is used in the network
to announce that a standby server has taken over the alias IP address of a faulty regular
server. (ARP is a link layer protocol, operating on the local subnet.)
• The alias IP address must be on the same subnet as the base IP address.
• A standby server must have sufficient performance to replace any regular server in
the cluster.
• A standby server must have enough free hard disk space to store two data backups (system
database data included) of each regular server in the cluster.
• There must be enough bandwidth within a cluster for efficient transfer of data backups to the
standby server.
• The preloaded failover behavior can be used only for clusters consisting of one regular and
one standby server.
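Several of the prerequisites above can be checked mechanically before a cluster is created. The Python sketch below uses only the standard `ipaddress` module; the `check_cluster` function, its parameters, and the example addresses are hypothetical and serve purely as an illustration. It covers only the cluster size, standby count, subnet, and preloaded rules, not performance, disk space, or bandwidth.

```python
import ipaddress
from typing import List

MAX_LIMS_PER_CLUSTER = 10  # "A cluster can have up to ten LIMs."


def check_cluster(
    subnet: str,
    lim_base_ips: List[str],
    lim_alias_ips: List[str],
    standby_ips: List[str],
    preloaded: bool = False,
) -> List[str]:
    """Return a list of violated prerequisites (empty if none are found)."""
    network = ipaddress.ip_network(subnet)
    problems = []
    if len(lim_base_ips) > MAX_LIMS_PER_CLUSTER:
        problems.append("more than ten LIMs in the cluster")
    if len(standby_ips) != 1:
        problems.append("a cluster must have exactly one standby server")
    # All base and alias addresses must be on the cluster's subnet.
    for ip in lim_base_ips + lim_alias_ips + standby_ips:
        if ipaddress.ip_address(ip) not in network:
            problems.append(f"{ip} is outside subnet {subnet}")
    if preloaded and len(lim_base_ips) != 1:
        problems.append("preloaded requires one regular and one standby server")
    return problems
```

For instance, `check_cluster("192.0.2.0/24", ["192.0.2.10"], ["192.0.2.110"], ["192.0.2.20"], preloaded=True)` returns an empty list, while adding a second LIM or an alias on another subnet produces one message per violation.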
Known Limitations of the Server Redundancy Functionality
AUTOMATIC FALLBACK
When automatic fallback is used, fallback to the regular server takes place as soon as that server
is functioning again after a failure. This can create problems if the regular server starts and
stops repeatedly during a short period of time.
Deployment
MX-ONE V7.X VIRTUALIZATION (REDUNDANT SERVER)
• net_setup
• YES
• Select the keyboard layout, then press OK
• Set your Time Zone (press the spacebar to acknowledge the country selected)
• Other Server
• No
• Set Passwords
o root: mx1_root
o mxone_admin: mx1_admin
o mxone_user: mx1_user
o No
• mxone_maintenance
• press “c”
o this will add the new Server to the Telephony Server
It is highly recommended to install the database nodes before promoting a server to standby. For example,
install a new server (it will be a free server), add a stand-alone database node on this server and, when the
installation finishes, add the standby role. As a last step, create the cluster with the desired type of server
redundancy.
Each server in a cluster supervises the state of the other servers. In case of a server failure, the software and
configuration running on the faulty server will be activated and started on the standby server. The standby
server will also manage the media gateway or gateways of the faulty server. In an N+1 server redundancy
scenario, in the very unlikely event that more than one server in a cluster is faulty at the same time,
the standby server will replace only one of the faulty servers. The other faulty servers will not operate until
the fault is fixed and they are restarted.
When a regular server recovers from a failure, an automatic or a manual fallback will take place. During
fallback, the Service Node software is stopped on the standby server and is then reloaded and restarted on
the regular server.
At failover, the standby server will take over the identity of the failing server and the control of the media
gateways in the failing server. For ongoing TDM traffic related to the media gateways controlled by that
server, all connections will be lost when the server fails. All new traffic will be redirected to the standby server.
In a distributed system connected over limited bandwidth (WAN), each remote domain must have its own
standby server.
In a multi-server system, it is possible to have as many clusters as there are servers in the system (with a
maximum of one standby server per regular server). A regular server or a standby server can belong to only
one cluster.
Note! All servers in one standby cluster must be on the same subnet.
• set a new base IPv4 address for the selected Service Nodes
• confirm
• Cluster is added
• list Clusters
Other Options:
FALLBACK
• set to Manual by default
• changing to Automatic
PRIORITY
MAINTENANCE COMMANDS
• stop the MX-ONE Service
o systemctl stop mxone_sby.service
This stops the MX-ONE services and triggers (activates) the standby server.