ECS Networking and Best Practices
November 2023
H15718.12
White Paper
Abstract
This white paper describes networking and related best practices for
ECS, the Dell software-defined cloud-scale object storage platform.
Copyright
The information in this publication is provided as is. Dell Inc. makes no representations or warranties of any kind with respect
to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular
purpose.
Use, copying, and distribution of any software described in this publication requires an applicable software license.
Copyright © 2016–2023 Dell Inc. or its subsidiaries. Published in the USA November 2023 H15718.12.
Dell Inc. believes the information in this document is accurate as of its publication date. The information is subject to change
without notice.
Contents
Executive summary
Tools
Conclusion
References
Executive summary
Overview
Dell ECS is a cloud-scale, object-storage platform for traditional, archival, and next-generation workloads. It provides geo-distributed and multi-protocol (object, HDFS, and NFS) access to data. An ECS deployment is offered either as a turnkey appliance or as a software-only installation on qualified industry-standard hardware that forms the hardware infrastructure. In either type of deployment, a network infrastructure is required for the interconnection between the nodes and customer environments for object storage access.
This paper describes ECS networking and provides configuration best practices. It
provides details about ECS network hardware, network configurations, and network
separation. This paper should be used as an adjunct to the following Dell ECS
documentation on the Dell Technologies ECS Info Hub:
• ECS Hardware Guide (for Gen1 and Gen2 hardware)
• ECS EX-Series Hardware Guide
• Networks Guide for D- and U-Series (Gen1 and Gen2 hardware)
• Networks Guide for EX300 and EX3000 (EX-Series hardware)
Updates to this document are completed periodically and often coincide with new features
and functionality changes.
Audience
This document is intended for Dell field personnel and customers interested in understanding ECS networking infrastructure and the role that networking plays within ECS. It also describes how ECS connects to customer environments.
Revisions
• March 2023, H15718.11: Removed the virtual IP section and updated the ‘Network separation planning and requirements’ section.
• November 2023, H15718.12: Removed intermix between HDD and AFA cluster.
We value your feedback
Dell Technologies and the authors of this document welcome your feedback on this document. Contact the Dell Technologies team by email.
Note: For links to other documentation for this topic, see the ObjectScale and ECS Info Hub.
ECS overview
ECS features a software-defined architecture that promotes scalability, reliability, and
availability. ECS is built as a completely distributed storage system to provide data
access, protection, and geo-replication. The main use cases for ECS include storage for
modern applications and as secondary storage to free up primary storage of infrequently
used data while also keeping it reasonably accessible.
ECS software and hardware components work in concert for unparalleled object and file
access. The software layers are shown in Figure 1 along with the underlying infrastructure
and hardware layers. The set of layered components consists of the following:
• ECS portal and provisioning services: Provides an API, CLI, and web-based
portal that allows self-service, automation, reporting, and management of ECS
nodes. It also handles licensing, authentication, multitenancy, and provisioning
services.
• Data services: Provides services, tools, and APIs to support object, HDFS, and NFSv3 access.
• Storage engine: Responsible for storing and retrieving data, managing
transactions, and protecting and replicating data.
• Fabric: Provides clustering, health, software, and configuration management as
well as upgrade capabilities and alerting.
• Infrastructure: Uses SUSE Linux Enterprise Server 12 as the base operating
system for the turnkey appliance or qualified Linux operating systems for industry
standard hardware configuration.
• Hardware: Industry-standard hardware composed of x86 nodes with internal disks
or attached to disk-array enclosures with disks, and top-of-rack (ToR) switches.
For an in-depth architecture review of ECS, see the document ECS: Overview and
Architecture.
Introduction
ECS network infrastructure consists of a set of ToR switches which allow for the following two types of network connections:
• Public Network: Connection between the customer production network and ECS.
• Private Network: For management of nodes and switches within and across racks.
The ToR switches are dedicated to either the public (production) network or to the private, internal-to-ECS-only network. For the public network, a pair of 10/25 GbE network switches is used, which services data and internal communication between the nodes. For the private network, depending on the hardware generation, either a single 1 GbE switch (Gen1/2) or a pair of 25 GbE switches (Gen3) is used. The
private network is used for remote management, console access, and PXE booting which
enables rack management and cluster-side management and provisioning. From this set
of switches, uplink connections are presented to the customer production environment for
storage access and management of ECS. The networking configurations for ECS are
recommended to be redundant and highly available.
Note: Gen1/2 and Gen3 EX300, 500, 3000, 5000 Series systems use a public network for internal
communication between nodes. Gen3 EXF900 Series systems use a private network for internal
communication between nodes.
Traffic types
Understanding the traffic types within ECS and the customer environment is useful for architecting the physical and logical network layout and configuration for ECS.
Note: The internode traffic of the EXF900 runs on the private network due to the NVMe-oF architecture of the EXF900.
The private network, which is under Dell control, is entirely for node and switch
maintenance and thus traffic types include:
• Segment Maintenance Management: Traffic associated with administration, installation, or setup of nodes and switches within the rack.
Introduction
Each ECS appliance rack contains three, four, or six switches. Gen1/2 appliances have three switches: two for the public network and one for the private network. Gen3 EX300, EX500, EX3000, and EX5000 systems have two public switches and two private switches. Gen3 EXF900 systems add two dedicated aggregation switches for the private network, ensuring that all EXF900 nodes have line-rate performance to any node in any rack.
Switch details, including model numbers, along with designated switch port usage and
network cabling information, can be found in the ECS Hardware Guide for Gen1/2
appliances and the ECS EX-Series Hardware Guide for Gen3 appliances.
Public switches
Public (production or front-end) switches are used for data transfer to and from customer applications as well as internal node-to-node communication for Gen1/2 and Gen3 EX300, 500, 3000, and 5000 series systems. The internode traffic of the Gen3 EXF900 goes through the private switches. These switches connect to the ECS nodes in the same rack. For
Gen1/2 appliances, two 10 GbE, 24-port or 52-port Arista switches are used. For Gen3
appliances, two 10/25 GbE (EX300) or two 25 GbE (EX500, EX3000, EX5000 and
EXF900) 48-port Dell switches are used. To create a High Availability (HA) network for the
nodes in the rack, the public switches work in tandem using LACP/MLAG, with the Arista
switches in Gen1/2 appliances, and Virtual Link Trunking (VLT), with the Dell switches in
Gen3 appliances. This pairing is for redundancy and resiliency in case of a switch failure.
Across all generations of hardware, Gen1-3, each ECS node has two Ethernet ports that
directly connect to one of the ToR public switches. Due to NIC bonding, the individual
connections of a node appear to the outside world as one. The nodes are assigned IP
addresses from the customer’s network either statically or through a DHCP server. At a
minimum, one uplink between each ToR public switch in the ECS appliance to the
customer network is required. The public switch management ports connect to the ToR
private switch or switches.
Best practices:
• For redundancy and to maintain a certain level of performance, have two uplinks per switch to the customer switch (a minimum of four uplinks per rack).
• Use at least 25 GbE switches for optimal performance when using customer-
provided public switches.
• Have dedicated switches for ECS and do not use shared ports on the customer
core network.
Private switches
Private switches are used by ECS for node management. For Gen1/2 appliances, the private switches also allow for out-of-band (OOB) management communication between customer networks and Remote Management Module (RMM) ports in individual ECS nodes. Gen1/2 appliances have a 52-port 1 GbE Arista switch, or a Cisco switch for organizations with strict Cisco-only requirements. Gen3 appliances contain two 25 GbE 48-port Dell private switches identical in model to the public switches.
Note: Gen3 does not allow for OOB management communication from customer networks.
The management ports in each node connect to one or more private switches. They use
private addresses such as 192.168.219.x. Each Gen1/2 node also has a connection
between its RMM port and the private switch. This connection can also provide access from the customer network for OOB management of the node. Gen3 nodes also have a connection between their integrated Dell Remote Access Controller (iDRAC) port and one of the private switches. However, there is no customer-facing OOB management for Gen3 ECS nodes.
Note: Dell switches are required for the private network. Private switches cannot be customer-
provided.
Best practices:
• Use the Dell-supplied switches for the private network; customer-provided private switches are not supported.
Aggregation switches for EXF900
The aggregation switches can be installed in the Dell rack or a customer-provided rack. The aggregation switch allows you to connect up to seven racks of EXF900 nodes in the same cluster.
Note: For more information about the network cabling, see the ECS EX Series Hardware Guide
and the ECS Networking Guide.
Customer-provided switches
The flexibility of ECS allows for variations of network hardware and configurations, which should meet the Dell standards described in ECS Appliance Features RPQ customer-provided switches configuration support. For customer-provided switches, configuration and support are the responsibility of the customer. These switches should be dedicated to ECS and not shared with other applications. Dell assistance is advisory for customer-provided switches.
The private network switches for an ECS appliance cannot be replaced since these are
solely for administration, installation, diagnosing, and management of ECS nodes and
switches. The private-network switches need to remain under the control of Dell personnel.
Introduction
The previous section briefly described the switches and related networks used in ECS appliances. This section further explores the public production network and the private ECS internal management network, referred to as the Nile Area Network (NAN).
Design considerations and best practices in both production and internal networks are
discussed to offer guidance for network architects.
Production network
The production network involves the connections between the customer's network and the two ToR ECS front-end, public data switches, and the connections within the ECS rack.
These connections act as the critical paths for in and out client requests and data (“north
to south”), and internode traffic (“east to west”) for replication and processing requests as
shown in the following figure for a single rack.
For multirack, Gen1/2 and Gen3 EX300, EX500, EX3000 and EX5000 systems, internode
traffic flows north to south and over to the customer network and to the other ECS racks
as shown in the following figure.
Note: The Gen3 EXF900 is designed with an NVMe-oF architecture; its internode traffic goes through the private switches and the dedicated aggregation switches to improve performance.
Network connections in the production network as a best practice should be designed for
high availability, resiliency, and optimal network performance.
Each node in the rack is connected to both switches through two NICs which are
aggregated together using a Linux bonding driver. The node is configured to bond the two
NICs into a single LACP bonding interface also known as a "mode 4" bond. This bonding
interface connects one port to each switch as demonstrated in the following images.
The following two figures are examples of nodes bonded with a public switch pair.
The terminal output on the next page displays snippets of the basic configuration files for the Gen1/2 and Gen3 nodes. LACP is a protocol that builds LAGs dynamically by exchanging information in Link Aggregation Control Protocol Data Units (LACPDUs) between the members of the LAG. LACP sends messages across each network link in the group to check whether the link is still active, resulting in faster error and failure detection. The port channels on each switch are MLAGed together and are visible from the nodes. They are configured to allow for fast connectivity using the spanning tree portfast command. This command places the ports in the forwarding state immediately, as opposed to transitioning through the listening and learning states before forwarding, which can cause 15 seconds of delay. Port channels are also set to lacp fallback, which allows all ports within the port channel to fall back to individual switch ports. When a node's ports are not yet configured as a LAG, this setting allows PXE booting from the public switch ports and forwarding of traffic.
The data switches are preconfigured on ECS-supported switches. The configuration files for the data switches are stored in a directory on each node. For example, the Gen3 Dell switch configuration files are located in /usr/share/emc-dell-firmware/config/ecs/.
Each Gen1/2 Arista public data switch has eight ports available to connect to the
customer network, providing sixteen uplink ports per rack. Gen3 Dell switches each have
eight 10/25 GbE and four 40/100 GbE ports, providing either sixteen or eight uplink ports
per rack. See the appropriate (Gen1/2 or Gen3) Networks Guide for ECS Hardware for
complete details including switch configuration examples.
For Gen1/2 appliances, similar to the ports used for the node, the eight uplink ports on
each of the data switches are configured as a single LACP/MLAG interface. This
configuration is shown below in the code output. The port channels are also configured to
be in lacp fallback mode for customers who are unable to present LACP to the ECS rack.
This mode will only be activated if no LACP is detected by the protocol. If there is no
LACP discovered between the customer link and the ECS switches, the lowest active port
will be activated and all other linked ports in the LAG will be disabled until a LAG is
detected. At this point, there is no redundancy in the paths.
In addition, the data switches are not configured to participate in the customer's spanning tree topology. They are presented as edge or host devices since a single LAG for the eight ports in each switch is created. This also simplifies the setup of the ECS switches in the customer network. Here is output from the basic configuration files for the public switches.
!Gen1/2 node
interface Ethernet1
description MLAG group 100
channel-group 100 mode active
lacp port-priority 1
!
interface Port-Channel100
description Customer Uplink (MLAG group 100)
port-channel lacp fallback
port-channel lacp fallback timeout 1
spanning-tree bpdufilter enable
mlag 100
mtu 9216
!
!Gen3 node for 100G uplink (40G uplink ports share with 100G
uplink ports)
interface ethernet1/1/53
description "Customer Uplink 100G"
no shutdown
channel-group 100 mode active
no switchport
mtu 9216
flowcontrol receive off
!
interface port-channel100
description "Customer Uplink 100G"
no shutdown
switchport mode trunk
switchport access vlan 1
mtu 9216
vlt-port-channel 100
spanning-tree bpdufilter enable
!
!Gen3 node for 40G uplink (40G uplink ports share with 100G uplink
ports)
interface ethernet1/1/53:1
description "Customer Uplink 40G"
no shutdown
channel-group 110 mode active
no switchport
mtu 9216
flowcontrol receive off
!
interface port-channel110
description "Customer Uplink 40G"
no shutdown
switchport mode trunk
switchport access vlan 1
mtu 9216
vlt-port-channel 110
spanning-tree bpdufilter enable
!
Note: In Gen3 systems, port channels 80, 90, 100, and 110 are used to indicate different uplink port speeds. See the ECS Hardware Guide for more information.
Connections from the customer network to the data switches can be made in several different ways: for instance, as a single link, as multiple links to a single switch, or as multiple links to multiple switches. The code below shows an example of a two-port LAG for an ECS public switch connected to a single Cisco switch with multiple links.
!Gen3 Dell 25G uplink configuration. (For more details about speed
configuration, refer to the 25G uplink example earlier in this
section.)
interface ethernet1/1/41-1/1/42
channel-group 100 mode active
no switchport
mtu 9216
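The matching customer-side configuration on a single Cisco IOS switch would take roughly the following shape. This is a hypothetical sketch: the interface numbers are placeholders, and jumbo-frame and trunk settings vary by Cisco platform.
! Customer side (Cisco IOS), illustrative only
interface Port-channel100
 description Downlink to ECS public switch
 switchport mode trunk
!
interface range TenGigabitEthernet1/0/1 - 2
 description Members of the two-port LAG to ECS
 switchport mode trunk
 channel-group 100 mode active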
Figure 12 shows an uplink of multiple ports to multiple customer switches in a LAG configuration, with a single link per switch. A better approach is to configure more than two links per ECS switch, as presented in Figure 13. The links should be spread in a bowtie fashion (links on each customer switch should be distributed evenly between the data switches) for redundancy and optimal performance during failures or scheduled downtime.
Figure 12. Multiple customer switches with single link per switch
Figure 13. Multiple customer switches with multiple links per switch
In either of these configurations, both port channels need to be connected using a multiswitch LAG protocol such as Arista MLAG or Cisco virtual Port Channel (vPC) to connect to the ECS MLAG switch pair port channel. Also, customers need to create port channels using LACP in active or passive mode on all switches participating in the multiswitch LAG. Below are sample configurations for Arista and Cisco with multiswitch LAG protocol definitions. Note that the vPC or MLAG numbers on each switch need to match to create a single port channel group.
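As an illustrative sketch only, such multiswitch LAG definitions take roughly the following shape; the peer-link port channels, domain IDs, addresses, and member interfaces shown here are placeholder values to be replaced with values appropriate to the customer environment.
! Arista (MLAG) customer switch pair, illustrative
vlan 4094
interface Port-Channel10
   description MLAG peer-link
   switchport mode trunk
interface Vlan4094
   ip address 10.255.255.1/30
mlag configuration
   domain-id ecs-uplink
   local-interface Vlan4094
   peer-address 10.255.255.2
   peer-link Port-Channel10
interface Port-Channel100
   description Downlink to ECS rack
   switchport mode trunk
   mlag 100
!
! Cisco Nexus (vPC) customer switch pair, illustrative
feature vpc
feature lacp
vpc domain 1
  peer-keepalive destination 192.0.2.2 source 192.0.2.1
interface port-channel10
  description vPC peer-link
  switchport mode trunk
  vpc peer-link
interface port-channel100
  description Downlink to ECS rack
  switchport mode trunk
  vpc 100
On both platforms, the physical member interfaces join the downlink with channel-group 100 mode active so that LACP is negotiated end to end, and the MLAG or vPC number (100 in this sketch) must match on both peer switches.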
Best practices:
• For multiple links, set up LACP on the customer switch. If LACP is not configured from the customer switches to the ECS switches, one of the data switches will have the active connection or connections. The port or ports connected to the other data switch will be disabled until a LAG is configured. The connection or connections on the other switch only become active if one switch goes down.
• Balance the number of uplinks from each switch for proper network load balancing
to the ECS nodes.
• When using two customer switches, it is required to use multiswitch LAG protocols.
• For multirack environments of Gen1/2 and Gen3 EX300, EX500, EX3000, and EX5000 systems, consider using an aggregation switch to keep internode traffic separated from the customer core network.
Note: Gen1/2 switches are configured not to participate in the customer’s spanning tree topology.
Gen3 switches are configured to participate in the customer’s spanning tree topology with Rapid
Spanning Tree Protocol (rstp). For Gen3 switch configuration details see the ECS Networks Guide
for EX300 and EX3000 (EX-Series) Hardware documentation.
Another example is when the customer needs to configure the uplinks for a specific VLAN. The VLAN membership should only be changed if the customer requirement is to set the uplink ports to VLAN trunk mode. Only the port channels for the uplink and the nodes need to be changed to set up the VLAN. The output below shows a sample of how to change the VLAN membership. Both data switches need to have the same VLAN configuration.
# Create new vlan
vlan 10
exit
# change to vlan trunk mode for uplink
interface port-channel100
switchport mode trunk
switchport trunk allowed vlan 10
# change vlan membership for access port to the nodes
interface port-channel1-12
switchport access vlan 10
copy running-config startup-config
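After changes of this kind, the VLAN and LAG state can be checked with show commands of the following general form (a sketch only; exact syntax differs slightly between the Arista and Dell switch operating systems):
show vlan
show port-channel summary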
Best practices:
• See the document ECS Appliance Features RPQ customer-provided switches
configuration support for configuration changes.
Internal private network
The internal private network, also known as the Nile Area Network (NAN), is mainly used for maintenance and management of the ECS nodes and switches within a rack and across racks. Ports on the management switch can be connected to another management switch on another rack, creating a NAN topology. From these connections, nodes from any rack or segment can communicate with any other node within the NAN. The management switch is split into different LANs to separate the traffic to specific ports on the switch for segment-only traffic, cluster traffic, and customer traffic to RMM:
• Segment LAN: Includes nodes and switches within a rack
• Private.4 LAN: Includes all nodes across all racks
• RMM/iDRAC LAN: Includes uplink ports to customer LAN for RMM/iDRAC access
from customer’s network
NAN topologies
The NAN is where all maintenance and management communications traverse within a rack and across racks. A NAN database contains information such as IP addresses, MAC addresses, node names, and IDs for all nodes within the cluster. This database is stored locally on every node and is synchronously updated by the primary node using the setrackinfo command. Information about all nodes and racks within the cluster can be retrieved by querying the NAN database. One command that queries the NAN database is getrackinfo.
The racks are connected to the management switches on designated ports. These connections allow nodes within the segments to communicate with each other. There are different ways to connect the racks or rack segments together. Each rack segment is assigned a unique color during installation, which identifies the racks within the cluster. The figures below depict some of the topologies and give some advantages and disadvantages of each NAN topology.
The following figure shows a simple topology linearly connecting the segments through
ports of the management switches in a daisy-chain fashion. The disadvantage of this
topology is that when one of the physical links breaks, there is no way to communicate with the segment or segments that have been disconnected from the rest of the segments. This event causes a split-brain issue in the NAN and forms a less reliable network.
Another way to connect the segments is in a ring topology as illustrated in the following
figure. The advantage of the ring topology over the linear is that two physical links would
need to be broken to encounter the split-brain issue, proving to be more reliable.
For large installations, the split-brain issue in the ring or linear topologies could be problematic for the overall management of the nodes. A star topology is recommended for an ECS cluster with ten or more racks, or for customers wanting to reduce the issues that ring or linear topologies pose. In the star topology, an aggregation switch, as shown in the following figure, would need to be added at extra cost; however, it is the most reliable among the NAN topologies.
Best practices:
• Do not use linear topology.
• For large installations of ten or more ECS racks, a star topology is recommended
for better failover.
• EXF900 systems use dedicated aggregation switches for the ECS racks; each node uses ports 53-56 for these connections.
Segment LAN
The Segment LAN logically connects nodes and switches within a rack to a LAN identified
as VLAN 2. This LAN consists of designated ports on the management switch or switches
and is referred to as the blue network. All traffic is limited to members of this segment for ease of management and isolation from the customer network and other segments
within the cluster. The Ethernet ports on the nodes are configured with a private IP
address derived from the segment subnet and node ID number. Thus, the IP address is of
the form 192.168.219.{NodeID}. The IPs are not routable and packets are untagged.
These addresses are reused by all segments in the cluster. To avoid confusion, it is not
recommended to use these IP addresses in the topology file required when installing the
ECS software on the nodes. There are several IP addresses that are reserved for specific
uses:
• 192.168.219.254: Reserved for the primary node within the segment. Recall from
the previous section that there is a primary node designated to synchronize the
updates to the NAN database.
• 192.168.219.250: Reserved for the private switch (bottom)
• 192.168.219.251: Reserved for the private switch (top)
• 192.168.219.252: Reserved for the public switch (bottom)
• 192.168.219.253: Reserved for the public switch (top)
• 169.254.255.252: Reserved for the aggregation switch for EXF900 (bottom)
• 169.254.255.253: Reserved for the aggregation switch for EXF900 (top)
Note: Gen1/2 appliances have only one private switch, named turtle, with the address 192.168.219.251.
Best practices:
• For troubleshooting a Gen1/2 node, administer the node through the segment LAN (connect a laptop to port 24) so as not to interfere with the configurations of other segments within the cluster.
• For troubleshooting a Gen3 node, the administrator can access the cluster through the tray, which is connected to the private switch named fox on port 34, or by connecting a laptop to port 36 on the fox switch.
Private.4 LAN
Multiple segment LANs are logically connected to create a single Cluster LAN for
administration and access to the entire cluster. Designated interconnect ports on
management switches provide interconnectivity between management switches. All
members will tag their IP traffic with VLAN ID 4 and communicate through the IPv4 link
local subnet. During software installation, all nodes in the rack are assigned a unique
color number. The color number acts as the segment ID and is used together with the
node ID to form the cluster IP address for every node in the cluster. The IP addresses of the nodes in the cluster LAN are of the form 169.254.{SegmentID}.{NodeID}. This unique IP address is the recommended IP address to specify in the topology file for the nodes within the cluster, as shown in the example below.
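For example (the segment color numbers and node IDs below are arbitrary illustrations), the addressing works out as follows; note that the segment LAN address repeats in every rack, while the cluster LAN address is unique across the cluster:
Segment color number   Node ID   Segment LAN IP    Cluster LAN (Private.4) IP
5                      1         192.168.219.1     169.254.5.1
5                      2         192.168.219.2     169.254.5.2
9                      1         192.168.219.1     169.254.9.1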
Best practices:
• ECS does not support IPv6, so do not enable IPv6 on these switches or send IPv6
packets.
• If troubleshooting a segment within the cluster, administer the segment through the
Segment LAN to not affect the configuration of the entire cluster.
• Use the IP address in the topology file to provide a unique IP for all nodes within
the cluster.
RMM/iDRAC access from customer network (optional)
RMM/iDRAC access from the customer network is optional, and it is recommended to determine the specific customer requirement. A relevant use of the RMM/iDRAC
connection would be for ECS software-only deployments where the hardware is managed
and maintained by customers. Another use is when customers have a management
station in which they would require RMM/iDRAC access to all hardware from a remote
location for security reasons.
Note: For Gen1/2 ECS, the RMM connects to the private switch named turtle; for Gen3 ECS, the iDRAC connects to the private switch named fox.
For a Gen1/2 node, to allow for RMM connections from the customer switch, ports 51 and 52 on the management switch are configured in a hybrid mode. This configuration allows the
ports to handle both tagged and untagged traffic. In this setup, the ports can be used for
multiple purposes. The uplinks to the customer switch are on VLAN 6 and packets are
untagged.
Best practices:
• Providing RMM access to the customer is optional and only available in Gen1/2 appliances.
• Determine whether customer access to RMM is absolutely required before configuring it.
• Ensure that NAN traffic on VLAN 4 does not leak to customer network when adding
RMM access to customer network.
Overview
Network separation allows for the separation of different types of network traffic for security, granular metering, and performance isolation. The types of traffic that can be separated include:
• Management: Traffic related to provisioning and administering through the ECS
Portal and traffic from the operating system such as DNS, NTP, and SRS.
• Replication: Traffic between nodes in a replication group.
• Data: Traffic related to client data access through the supported protocols (object, HDFS, and NFS).
• SRS (Dell Secure Remote Services): Traffic that uses whichever network the SRS Gateway is attached to (public.data or public.mgmt).
Note: Starting with ECS 3.7, S3 data access is allowed on both the public.data and public.data2 networks.
Network separation planning and requirements
Review the following list of best practices, requirements, and planning information prior to separating networks.
• ECS networks are separated using VLANs.
• Network separation can be configured as part of the ECS install procedure, or in an
existing ECS environment using the IP Change procedure.
• Network separation is not supported for custom installations of ECS software.
• Separating networks using Virtual IP Addresses is not supported.
The following information applies to separating networks either during ECS installation, or
when separating networks in existing ECS environments:
• Most ECS installations are configured with the data, management, and replication
traffic on a single network. Network separation should only be configured when
there is an explicit requirement to separate one or more of the ECS networks, such
as for QoS for replication traffic. Each separated network must have a unique VLAN
ID and must sit on a different network segment from any other separated network.
• The IP addresses of the nodes, hostnames, and DNS and NTP settings must be
static. See the ECS Installation Guide for details.
• By default, a static IP address must first be assigned to the public network.
• If separating networks, it is recommended to create replication and data networks
and to dedicate the public network for the management traffic. If the management
network must be on a VLAN, this can be achieved by updating the native VLAN on
the front-end (public) switches.
• By design, DNS/NTP/SMTP/LDAP are forced to use the management network. If
the management network is separated, DNS/NTP/SMTP/LDAP will use the defined
management network.
• When separating networks, it is required to assign each traffic type (data,
management, or replication) a unique VLAN ID.
• To use network separation with VLANs, the front-end switches (rabbit and hare)
must be configured to pass packets with the chosen VLAN tags. Ensure that this
configuration has been performed.
• The default gateway will always be set on the public interface. When configuring
separated interfaces on networks not routable from the default gateway, static
routes may be required to support the address resolution of the interfaces.
• Clients accessing ECS use the data or management network IP addresses. In the
case of the data network, traffic should be balanced across the addresses that are
assigned to the data network.
Note: CAS applications use the load balancer that is built into the CAS SDK. However, if using another protocol, an external load balancer is required.
Network separation configurations
In addition to the default network configuration, the network can be partially or fully separated, using the following configurations:
• Standard (default): All management, data, and replication traffic in one VLAN referred to as public.
• Partial (Dual): Two VLANs, where one VLAN is the default public network carrying two traffic types and another VLAN carries the traffic type not defined in the public VLAN.
• Partial (Triple): Three VLANs, where one traffic type is placed in the public VLAN and two different VLANs are defined for the other two traffic types.
Network separation configures VLANs for specific networks and uses VLAN tagging at the
operating system level. For the public network, traffic is tagged at the switch level. At a
minimum, the default gateway is in the public network and all the other traffic can be in
separate VLANs. If needed, the default public VLAN can also be part of the customer’s
upstream VLAN, and in this case, the VLAN ID for public must match the customer’s
VLAN ID.
admin@memphis-pansy:/etc/sysconfig/network> ls ifcfg-public*
ifcfg-public ifcfg-public.data ifcfg-public.mgmt ifcfg-public.repl
The operating system presents the interfaces with a managed name in the form of public.{trafficType}, such as public.mgmt, public.repl, or public.data, as can be observed in the ip addr command output in the following code.
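An abbreviated, illustrative sketch of that output is shown below; the interface indexes and flags are trimmed, and the addresses reuse the example values from the network.json output later in this section.
admin@memphis-pansy:~> ip addr
...
12: public: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ...
    inet 10.245.132.55/24 scope global public
13: public.data@public: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ...
    inet 10.10.10.55/24 scope global public.data
14: public.mgmt@public: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ...
    inet 10.10.20.55/24 scope global public.mgmt
15: public.repl@public: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ...
    inet 10.10.30.55/24 scope global public.repl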
The HAL searches for these managed names based on the active_template.xml in
/opt/emc/hal/etc. It finds those interfaces and presents those to the Fabric layer. Output
of cs_hal list nics is shown in the following figure.
The HAL gives the above information to the Fabric layer, which creates a JavaScript Object Notation (JSON) file with IP addresses and interface names and supplies this information to the object container. The code below is output from the Fabric Command Line (fcli) showing the format of the JSON structure.
"replication_ip": "10.10.30.55"
},
"status": "OK"
}
The mapped content of this JSON structure is placed in object container in the file
/host/data/network.json as shown in the terminal output below in which the object layer
can use to separate ECS network traffic.
{
"data_interface_name": "public.data",
"data_ip": "10.10.10.55",
"hostname": "memphis-pansy.ecs.lab.emc.com",
"mgmt_interface_name": "public.mgmt",
"mgmt_ip": "10.10.20.55",
"private_interface_name": "private.4",
"private_ip": "169.254.78.17",
"public_interface_name": "public",
"public_ip": "10.245.132.55",
"replication_interface_name": "public.repl"
"replication_ip": "10.10.30.55"
}
Network separation in ECS uses source-based routing to specify the route that packets take through the network. In general, the path that packets come in on is the same path going out. Based on the ip rules, the node that originates a packet looks at the source IP, checks whether the destination is local, and if it is not, consults the next matching rule and routing table. Using source-based routing reduces the number of static routes that need to be added.
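Conceptually, the policy routing has roughly the following shape. This is an illustrative sketch only: the rule priorities, routing table names, and gateway address are example values, not the rules that ECS actually generates.
admin@memphis-pansy:~> ip rule show
0:      from all lookup local
1000:   from 10.10.10.55 lookup public.data
1001:   from 10.10.20.55 lookup public.mgmt
1002:   from 10.10.30.55 lookup public.repl
32766:  from all lookup main
admin@memphis-pansy:~> ip route show table public.data
default via 10.10.10.1 dev public.data
10.10.10.0/24 dev public.data scope link src 10.10.10.55
A reply to a request that arrived on public.data is therefore sent back out through public.data, regardless of the default gateway on the public interface.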
ECS switch configuration for network separation
Depending on customer requirements, network separation may involve modification of the basic configuration files for the data switches. This section explores examples of different network separation implementations at the switch level, such as the default, single domain, single domain with public set as a VLAN, and physical separation.
Standard (default)
The default settings use configuration files that are bundled with ECS. In this scenario,
there is no VLAN and there is only the public network. Also, there is no tagged traffic in
the uplink connection. All ports are running in access mode. The following table and figure
provide an example of a default ECS network setup with customer switches.
Single domain
In a single domain, a LACP switch or an LACP/MLAG switch pair is configured on the
customer side to connect to the ECS MLAG switch pair. Network separation is achieved
by specifying VLANs for the supported traffic types. In the example in the following table
and figure, data and replication traffic are separated into two VLANs and the management
stays in the public network. The traffic on the VLANs will be tagged at the operating
system level with their VLAN ID which in this case is 10 for data and 20 for replication
traffic. The management traffic on the public network is not tagged.
Traffic type   VLAN ID    Tagged
Mgmt           (public)   No
Data           10         Yes
Repl           20         Yes
Both data switch configuration files would need to be modified to handle the VLANs in the above example. The terminal output below shows how this can be specified for Arista switches. Things to note from the configuration file include:
• The switchports have been changed from access to trunk mode.
• VLANs 10 and 20, created to separate data and replication traffic, are allowed on the trunk. They need to be created first.
vlan 10, 20
interface po1-12
switchport trunk native vlan 1
switchport mode trunk
switchport trunk allowed vlan 1,10,20
interface po100
switchport mode trunk
switchport trunk allowed vlan 1, 10,20
The settings within the configuration files of the data switches would need to be changed
to include all the VLANs specified for network separation. As can be seen from the
terminal output below, an update to the native VLAN is done to match the customer VLAN
for public. In this example, the public VLAN is identified as VLAN 100.
Here is example code that shows a single domain with two VLANs and public VLAN
settings for public switches.
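A sketch of that configuration for the Arista public switches follows; it mirrors the single-domain example above, with VLAN 100 assumed here as the customer's public VLAN.
vlan 10, 20, 100
interface po1-12
   switchport mode trunk
   switchport trunk native vlan 100
   switchport trunk allowed vlan 10,20,100
interface po100
   switchport mode trunk
   switchport trunk native vlan 100
   switchport trunk allowed vlan 10,20,100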
Physical separation
For physical separation, an example setup may include multiple domains on the customer
network defined for each type of traffic. An example of the setup and details are shown in
the following table and figure. As shown in the table, the public network is not tagged and
will be on port channel 100, data traffic will be on VLAN 10, tagged and on port channel
101 and replication traffic will be on VLAN 20, tagged and on port channel 102. The three
domains are not MLAGed together.
The terminal output below shows what the settings would be on the data switches for this
configuration on Arista switches. Port-channel 100 is set up to remove uplink ports 2
through 8, leaving only the first uplink for the public network. Port-channel 101 defines the
settings for the data traffic and port channel 102 is for the replication traffic where the
corresponding VLANs are allowed and switchport is set to trunk. Connections to the data
nodes are defined by interface po1-12.
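A sketch of that layout follows. The uplink port assignments are placeholders (the actual Ethernet interface numbers are listed in the Networks Guide), and only the parts relevant to the three port channels are shown.
! Illustrative only -- Arista public switch, physical separation
vlan 10, 20
interface Ethernet<uplink-1>
   channel-group 100 mode active
interface Ethernet<uplink-2>
   channel-group 101 mode active
interface Ethernet<uplink-3>
   channel-group 102 mode active
interface Port-Channel100
   description Public uplink (untagged)
interface Port-Channel101
   description Data uplink
   switchport mode trunk
   switchport trunk allowed vlan 10
interface Port-Channel102
   description Replication uplink
   switchport mode trunk
   switchport trunk allowed vlan 20
interface po1-12
   switchport mode trunk
   switchport trunk allowed vlan 1,10,20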
For situations where customers would want the public network on a VLAN, the following
table and the subsequent terminal output provide example details of the configuration. In
this case, all traffic is tagged and public is tagged with ID 100, data traffic tagged with 10
and replication tagged with 20. Uplink connections and port channel 100 are set up as the
trunk, and VLAN 10, 20, and 100 are allowed. The connections to the nodes defined in
interface po1-12 are also set accordingly.
Traffic type   VLAN ID    Tagged
Public         100        Yes
Data           10         Yes
Repl           20         Yes
Best practices:
• Network separation is optional. If used, it is important to determine fit and best
configuration.
• Keep the management traffic within the public network to reduce the number of static routes.
• Although the public network can carry only the default gateway, it is recommended to keep at least one of the traffic types in the public network.
For the production network, one uplink per switch to the customer switch is required at a minimum. However, one uplink per switch may not be enough to handle the performance necessary for all traffic, especially in a multirack, single-site deployment or when one switch fails. Internode traffic in a single-site multirack deployment traverses through one rack, up to the customer network, and down to the next rack of switches, in addition to handling traffic associated with data, replication, and management. At a minimum, four uplinks per rack (two links per switch) are recommended for performance and high availability. Since both data switches are peers, if a link to either switch is broken, the other switch is available to handle the traffic.
Note: Network performance is only one aspect of overall ECS performance. The software and
hardware stack both contribute as well.
Best practices:
• A minimum of four uplinks per rack (two links per switch) is recommended to
maintain optimal performance in case one of the switches fails.
• Use enough uplinks to meet any performance requirements.
• Get a good understanding of workloads, requirements, deployment, current network
infrastructure, and expected performance.
• When replicating two EXF900 systems across sites, consider the potential
performance impacts over the WAN. A large ingest may put a high load on the link,
causing saturation or delayed RPO. Also, a user or application may experience
higher latency times on remote reads and writes as compared to local requests.
Tools
Introduction
This section describes tools available for planning, troubleshooting, and monitoring of ECS networking.
ECS portal
The WebUI provides a view of network metrics within ECS. For instance, the average bandwidth of the network interfaces on the nodes can be viewed in the Node and Process Health page. The Traffic Metrics page provides read and write metrics at the site and individual node level. It shows the read and write latency in milliseconds, the read and write bandwidth in bytes/second, and read and write transactions per second.
ECS Designer and planning guide
The ECS Designer is an internal tool to help with streamlining the planning and deployment of ECS. It integrates the ECS Configuration Guide with the internal validation process. The tool is in spreadsheet format, and inputs are color-coded to indicate which fields require customer information. The sheets are ordered in a workflow to guide the architects in the planning.
Note: Contact your account team to obtain the latest copy of the ECS Designer.
Also available is an ECS Planning Guide that provides information about planning an ECS
installation and site preparation. It also provides an ECS installation readiness checklist
and echoes the considerations discussed in this white paper.
Secure Remote Services
Secure Remote Services (SRS) provides secure two-way communication between customer equipment and Dell support. It leads to faster problem resolution with proactive
remote monitoring and repair. SRS traffic goes through the ECS public network and not
the RMM access ports on the ECS private network. SRS can enhance customer
experience by streamlining the identification, troubleshooting, and resolution of issues.
Linux or HAL tools
ECS software runs on a Linux operating system. Common Linux tools can be used to validate or get information about ECS network configurations. Some tools useful for this include ifconfig, netstat, and route. HAL tools such as getrackinfo are also useful. Example output is shown below.
The following code shows an example of truncated output of netstat to validate network
separation.
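The sketch below shows the general shape of that output on a node with separated networks; the gateway address and subnet masks are illustrative, and the addresses reuse the example values from the network separation section.
admin@memphis-pansy:~> netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         10.245.132.1    0.0.0.0         UG        0 0          0 public
10.10.10.0      0.0.0.0         255.255.255.0   U         0 0          0 public.data
10.10.20.0      0.0.0.0         255.255.255.0   U         0 0          0 public.mgmt
10.10.30.0      0.0.0.0         255.255.255.0   U         0 0          0 public.repl
10.245.132.0    0.0.0.0         255.255.255.0   U         0 0          0 public
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 private.4
Each separated traffic type shows its own interface (public.data, public.mgmt, public.repl) with its own connected subnet, which confirms that the interfaces were created and addressed as expected.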
Another tool that can validate the setup of network separation is the domulti wicked
ifstatus public.<traffic type> command which shows the state of the network interfaces.
The state of each interface should be up. Here is the command being used to check the
public.data interface.
192.168.219.9
========================================
public.data up
link: #14, state up, mtu 1500
type: vlan public[1000], hwaddr 00:1e:67:e3:1c:46
config: compat:suse:/etc/sysconfig/network/ifcfg-public.data
leases: ipv4 static granted
addr: ipv4 10.10.10.35/24 [static]
192.168.219.10
========================================
public.data up
link: #13, state up, mtu 1500
type: vlan public[1000], hwaddr 00:1e:67:e3:28:72
config: compat:suse:/etc/sysconfig/network/ifcfg-public.data
leases: ipv4 static granted
addr: ipv4 10.10.10.36/24 [static]
192.168.219.11
========================================
public.data up
link: #13, state up, mtu 1500
type: vlan public[1000], hwaddr 00:1e:67:e3:29:7e
config: compat:suse:/etc/sysconfig/network/ifcfg-public.data
leases: ipv4 static granted
addr: ipv4 10.10.10.37/24 [static]
192.168.219.12
========================================
public.data up
link: #11, state up, mtu 1500
type: vlan public[1000], hwaddr 00:1e:67:e3:12:be
config: compat:suse:/etc/sysconfig/network/ifcfg-public.data
leases: ipv4 static granted
addr: ipv4 10.10.10.38/24 [static]
Some of the HAL tools were covered in the ECS network separation section; here is output of getrackinfo -a, which lists the IP addresses, RMM MAC, and public MAC addresses across nodes within an ECS rack.
Best practices:
• Use ECS Designer to help with the planning of ECS network with customer
network.
• Use the ECS portal to monitor traffic and alerts.
• Set up and enable SRS to streamline issue resolution with the Dell support team.
Network services
Certain external network services need to be reachable by the ECS system, including:
• Authentication Providers (optional): System and namespace administrative
users can be authenticated using Active Directory and LDAP. Swift object users can
be authenticated using Keystone. Authentication providers are not required for
ECS. ECS has local user management built in. Local users on ECS are not
replicated between VDCs.
• DNS Server (required): Domain Name server or forwarder.
• NTP Server (required): Network Time Protocol server. See NTP best practices for
guidance on optimum configuration.
• SMTP Server (optional): Simple Mail Transfer Protocol Server is used for sending
alerts and reporting from the ECS rack.
• DHCP server (optional): Only necessary if assigning IP addresses through DHCP.
• Load Balancer (optional but highly recommended): Evenly distributes client and
application load across all available ECS nodes.
Also, these network services need to reside in the same network as the data switch uplinks or otherwise be accessible by the ECS system.
The ECS General Best Practices white paper provides additional information about these network services. Also available are white papers that show how to deploy ECS with vendor-specific load balancers.
Conclusion
ECS supports specific network hardware and configurations plus customer variations and
requirements. The switches used as part of the ECS hardware infrastructure provide the
backbone for the ECS communication paths to the customer network, node-to-node
communication, and node- and cluster-wide management. It is a best practice to architect
ECS networking to be reliable, highly available, and performant. There are tools to help
with planning, monitoring, and diagnosing of the ECS network. Customers are
encouraged to work closely with Dell Technologies personnel to assist in providing the
optimal ECS network configuration to meet their requirements.
References
Dell Technologies documentation
The following Dell Technologies documentation provides other information related to this document. Access to these documents depends on your login credentials. If you do not have access to a document, contact your Dell Technologies representative.
• ObjectScale and ECS Info Hub
• ECS General Best Practices
• ECS Hardware Guide
• ECS: Overview and Architecture
• NTP best practices