White Paper - Link Aggregation
About SysKonnect: SysKonnect GmbH, the Server Connectivity Company, focuses on the development, manufacture and worldwide marketing of high-end network products. These include high-performance network cards for Gigabit Ethernet, FDDI/CDDI and FDDI concentrators for strategic networks. The comprehensive line of SK-NET NICs is ideally suited to application areas such as electronic commerce, finance, health care and imaging, and to business applications such as SAP and Baan. The headquarters of SysKonnect are in Ettlingen, Germany. The company also has offices in Great Britain and the USA. Products are sold through OEMs and worldwide distribution channels.
Table of Contents
1 Why Link Aggregation?
    Higher Link Availability
    Increased Link Capacity
    Aggregating replaces Upgrading
2 Types of Link Aggregation
    Switch-to-Switch Connections
    Switch-to-Station (Server or Router) Connections
    Station-to-Station Connections
3 The IEEE Standard 802.3ad
4 Configuration
    Physical issues in Link Aggregation
    Addressing
    Distribution of Frames
    Technology Constraints
    SysKonnect solution for Link Aggregation with Gigabit Ethernet
    Link Aggregation Control Protocol (LACP)
    Creation of aggregators and teams
    Redundant Switch Failover
    SysKonnect Network Control for Windows 2000
5 Conclusion
Link aggregation may be less expensive than a native speed upgrade and yet achieve a similar performance level. Both the hardware costs for a higher speed link and the equivalent number of lower speed connections have to be balanced to decide which approach is the most advantageous. Sometimes link aggregation may even be the only means to improve performance when the highest data rate available on the market is not sufficient. Many network administrators have experienced that upgrading the network hardware (e.g., switching from 10 Mb/s network adapters to 100 Mb/s network adapters), in the end, led to a performance improvement much less than the 10:1 ratio implied by the hardware change (or perhaps no improvement at all!). Whether an aggregated link actually yields a performance improvement commensurate with the number of links provided depends to a great extent on network traffic patterns and the algorithm used by the devices to distribute frames among aggregated links. To the extent that traffic can be distributed uniformly across the links, the effective capacity will increase as desired. If the traffic and distribution algorithm is such that a few links carry the bulk of the traffic while others go nearly idle, the improvement will be less than anticipated.
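The effect of an uneven distribution can be estimated with a rough calculation. The sketch below assumes a deliberately simple model that is not taken from the standard: every link has the same data rate, each link carries a fixed share of the total traffic, and the aggregate saturates as soon as the busiest link is full. The function name effective_capacity and the example shares are hypothetical.

```python
def effective_capacity(link_rate_mbps, shares):
    """Estimate the usable capacity of an aggregated link in Mb/s.

    link_rate_mbps: data rate of each individual link (all links equal).
    shares: relative amount of traffic the distribution algorithm places
            on each link (any non-negative weights, at least one > 0).

    Simplified model (an assumption): the aggregate is saturated as soon
    as the most heavily loaded link reaches its data rate.
    """
    total = sum(shares)
    if total <= 0:
        raise ValueError("at least one link must carry traffic")
    busiest_fraction = max(shares) / total
    return link_rate_mbps / busiest_fraction


# Four 1000 Mb/s links with perfectly uniform traffic.
uniform = effective_capacity(1000, [1, 1, 1, 1])

# Same links, but one link carries 70 % of the traffic.
skewed = effective_capacity(1000, [7, 1, 1, 1])

print(f"uniform: {uniform:.0f} Mb/s, skewed: {skewed:.0f} Mb/s")
```

Under this model the uniform case reaches the full four-link capacity, while the skewed case yields well under half of it, which matches the qualitative statement above.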
Switch-to-Switch Connections
In this scenario, multiple workgroups are joined by one aggregated link. By aggregating multiple links, a higher-speed connection can be achieved without a hardware upgrade.

Example: In Figure 2, two switches are connected using four 1000 Mb/s links. If one link between these two switches fails, the other links in the link aggregation group take over the traffic and the connection is maintained.
This configuration reduces the number of ports available for connection to external devices. Aggregation thus implies a trade-off between port usage and additional capacity for a given device pair.
Figure 3. Switch-to-station connection

Link Aggregation trades off port usage for effective link capacity. While it is common for high port-density switches to have some number of excess ports, it is rare for a server to have unused network interface cards. In addition, traditional single-port network adapters use a server backplane slot for each interface; often a server configuration will have only a limited number of slots available for network peripherals. In response to this problem, a number of manufacturers offer multiport network adapters specifically for use in servers, e.g. SysKonnect's Dual Link Gigabit Ethernet Adapter.

Figure 1 depicts multiple 1000 Mb/s links being aggregated between the backbone switch and a high-performance enterprise backbone router. From the perspective of the switch, a network layer router is simply an end station not much different from a server. As such, we can aggregate links between a switch and a router for the same reasons as in the switch-to-server case. One important difference arises regarding the choice of algorithm used to distribute frames among the links.
Station-to-Station Connections
In the case of aggregation directly between a pair of end stations, no switches are involved at all. As in the station-to-switch case, the higher performance channel is created without having to upgrade to higher-speed LAN hardware. In some cases, higher-speed NICs may not even be available for a particular server platform, making link aggregation the only practical choice for improved performance.

Example: Figure 4 shows two servers interconnected by an aggregation of four 1000 Mb/s links.

Figure 4. Station-to-station connection

This high-speed connection may be useful for multi-processing or server redundancy applications where high performance is needed to maintain real-time server coherence (this configuration is sometimes called a back-end network).
Link Aggregation, according to IEEE 802.3, does not support the following:
Dissimilar MACs: Link Aggregation is supported only on links using the IEEE 802.3 MAC (Gigabit Ethernet and FDDI are not supported in parallel, but dissimilar PHYs such as copper and fiber are supported).

Half duplex operation: Link Aggregation is supported only on point-to-point links with MACs operating in full duplex mode.

Operation across multiple data rates: All links in a Link Aggregation Group operate at the same data rate (e.g. 10 Mb/s, 100 Mb/s, or 1000 Mb/s).
4 Configuration
Physical issues in Link Aggregation
Addressing
Each network interface controller is assigned a unique MAC address. Usually this address is programmed into the ROM during manufacturing. During initialization, the device driver reads the contents of the ROM and transfers the address to a register within the MAC controller. In most cases, this address is used as source and destination address during the transmission of packets. Aggregated links are to appear as a single link with a single logical network interface and therefore only have one virtual MAC address. The MAC address of one of the interfaces belonging to the aggregated link provides the virtual address of the logical link.
Frame Distribution
When applying WAN technologies, frames are sometimes broken into smaller units to accelerate transmission (such as in the bonding of B-channel ISDN lines). LAN communications channels, however, do not support sub-frame transfers. The complete frame has to be sent through the same physical link. With aggregated links, the task is therefore to select the link on which to transmit a given frame.

Sending one long frame may take longer than sending several short ones, so short frames sent on another link may be received before a long frame that was sent earlier. The order would then have to be restored at the receiver side. To avoid this, the following convention applies: all frames belonging to one conversation must be transmitted through the same physical link, which guarantees correct ordering at the receiving end station. For this reason no sequencing information needs to be added to the frames. Traffic belonging to separate conversations can be sent through different links in any order. The algorithm for assigning frames to a conversation depends on the application environment and the kind of devices used at each end of the link.

When a conversation is to be transferred to another link, either because the originally mapped link is out of service (failed or configured out of the aggregation) or because a new link has become available to relieve the existing ones, precautions have to be taken to avoid mis-ordering of frames at the receiver. This can be achieved either by a delay time, which the distributor has to determine, or by an explicit marker protocol that uses a marker to identify the last frame of a conversation. The distributor inserts a marker message behind the last frame of the conversation. After the collector receives this marker message, it sends a response to the distributor, which then knows that all frames of the conversation have been delivered. Now the distributor can send frames of this conversation via a new link without delay.

If the conversation is to be transferred to a new link because the originally mapped link failed, this method will not work. There is no path on which the marker message can be transferred, i.e. the distributor has to employ the timeout method.
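The conversation rule described above can be sketched in a few lines. This is a minimal illustration, not the algorithm of any particular driver or switch: it assumes that a conversation is identified by the pair of source and destination MAC addresses and that a CRC-32 hash of this pair selects the link. The names select_link and flush_method, the MAC addresses and the link names are hypothetical.

```python
import zlib


def select_link(src_mac, dst_mac, active_links):
    """Pick the physical link for a frame.

    All frames with the same (source, destination) pair form one
    conversation here (an assumption) and always map to the same link
    as long as the set of active links does not change.
    """
    if not active_links:
        raise ValueError("no active link in the aggregation")
    key = f"{src_mac.lower()}-{dst_mac.lower()}".encode("ascii")
    return active_links[zlib.crc32(key) % len(active_links)]


def flush_method(old_link_up):
    """Choose how to avoid mis-ordering when a conversation is moved.

    A marker message needs a working path on the old link; if that
    link has failed, only the timeout method remains.
    """
    return "marker" if old_link_up else "timeout"


links = ["link0", "link1", "link2", "link3"]
src, dst = "00:00:5a:11:22:33", "00:00:5a:44:55:66"

first = select_link(src, dst, links)
second = select_link(src, dst, links)
print(first == second)  # same conversation, same link

# The mapped link fails: remap among the remaining links after a timeout.
remaining = [link for link in links if link != first]
print(flush_method(old_link_up=False), select_link(src, dst, remaining))
```

Hashing on MAC addresses alone places all traffic between two stations on one link; a distribution that also considers IP, TCP or UDP information spreads such traffic over more conversations.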
Technology Constraints
In principle, the devices applied in the aggregation restrict the throughput. Using an aggregation of four 100 Mb/s links instead of one 100 Mb/s link will increase the capacity but the throughput on each link remains the same.
establish an aggregator. Aggregators are built automatically from those ports for which partner information is available. Using the SysKonnect Network Control running on Windows 2000 (see below), the user can combine two or more ports to create a team. For drivers of other operating systems, similar tools for the configuration of Link Aggregation will be available.

Teams and aggregators have the following main features:
- There is a virtual MAC address for the whole team (the MAC address of one adapter of the team).
- The aggregator with the most active links is the active aggregator. Every other aggregator is in hot standby.
- The aggregators are configured automatically (see the link aggregation standard).
Below the team level, LACP automatically creates aggregators as described above. If all ports have the same partner information, only one aggregator is established. If there are n different types of partner information, n aggregators will be created. With more than one aggregator within a team, the Redundant Switch Failover (RSF) mechanism provided by SysKonnect can be employed. This unique feature is described in the next section, Redundant Switch Failover.
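The grouping rule (one aggregator per distinct partner information) can be illustrated as follows. This is a simplified sketch and not the LACP state machine: the partner information is reduced to a single identifier per port, and the function name build_aggregators as well as the port and switch names are hypothetical.

```python
from collections import defaultdict


def build_aggregators(ports):
    """Group the ports of a team into aggregators.

    ports: dict mapping a port name to its partner identifier, or to
           None if no partner information is available for that port.
    Returns a dict mapping each partner identifier to its list of ports.
    Ports without partner information do not join any aggregator.
    """
    aggregators = defaultdict(list)
    for port, partner in ports.items():
        if partner is not None:
            aggregators[partner].append(port)
    return dict(aggregators)


# A team of four ports: two see switch sw1, one sees sw2, one has no partner.
team = {"port1": "sw1", "port2": "sw1", "port3": "sw2", "port4": None}
print(build_aggregators(team))
# {'sw1': ['port1', 'port2'], 'sw2': ['port3']}
```

With two distinct partner identifiers, two aggregators result, which is the precondition for using RSF within the team.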
Figure 5. Link Aggregation and RSF

Example: In the above system, data is normally transferred from Server A to Server B via Aggregator A1, Switch 1, and Aggregator B1 and vice versa, i.e. Aggregators A1 and B1 are active, and Aggregators A2 and B2 are in hot standby.

Scenario 1: Port failure
Assume two links of Aggregator A1 fail: RSF switches the data flow of Server A to Aggregator A2, because data is always transferred on the aggregator with the larger bandwidth. Data from Server A to Server B is then transmitted via Aggregator A2, Switch 2, Switch 1, and Aggregator B1 (which is still the aggregator with the greatest bandwidth in the team of Server B) and vice versa. If the two links of Aggregator A1 become active again, RSF switches data transfer back to Aggregator A1 because of its larger bandwidth.

Scenario 2: Switch failure
Assume Switch 1 fails: RSF switches the data flow from Server A to Aggregator A2 and the data flow from Server B to Aggregator B2, because neither Aggregator A1 nor Aggregator B1 has an active link anymore. Data from Server A to Server B is then transmitted via Aggregator A2, Switch 2, and Aggregator B2 and vice versa. When Switch 1 is functional again, RSF switches data transmission back to Aggregators A1 and B1 because of their larger bandwidth.
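The selection rule behind both scenarios (use the aggregator with the largest available bandwidth) can be traced with a small sketch. The link counts are assumptions made for illustration: Aggregator A1 is taken to have three links and Aggregator A2 two links of 1000 Mb/s each. The function name active_aggregator is hypothetical, and the sketch models only the selection, not the actual RSF implementation.

```python
def active_aggregator(links_up, link_rate_mbps=1000):
    """Return the aggregator with the largest available bandwidth.

    links_up: dict mapping an aggregator name to the number of its
              links that are currently up.
    Aggregators without any active link are never selected. Returns
    None if no aggregator has an active link.
    """
    usable = {name: count for name, count in links_up.items() if count > 0}
    if not usable:
        return None
    return max(usable, key=lambda name: usable[name] * link_rate_mbps)


# Assumed team of Server A: A1 with three links, A2 with two links.
print(active_aggregator({"A1": 3, "A2": 2}))  # normal operation: A1
print(active_aggregator({"A1": 1, "A2": 2}))  # scenario 1, two A1 links fail: A2
print(active_aggregator({"A1": 0, "A2": 2}))  # scenario 2, Switch 1 fails: A2
print(active_aggregator({"A1": 3, "A2": 2}))  # links restored: back to A1
```

Because all links in a team run at the same data rate, comparing bandwidth is equivalent to comparing the number of active links.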
Figure 6. Adapter Overview in SysKonnect Network Control

The corresponding tab in the SysKonnect Network Control where the user is able to configure the link aggregation features is called Team. The Team tab shows all links or ports of SysKonnect Gigabit Ethernet adapters which are available for teaming or have already been grouped to form a team.
In this tab the user is able to create teams, add ports to teams, remove teams, and rename teams. After a new team has been created the user can add ports to this team. If a port from this team has found a partner on the other end of the connection, which is suitable for link aggregation, a message is displayed next to the corresponding port (see Figure 8).
Figure 8. 802.3ad partner found

If a team is selected, the corresponding parameters are displayed below the tree view. The following parameters can be viewed:
Parameter: IP Address
Values: xxx.xxx.xxx.xxx (decimal)
Description: Unique 32-bit (4-byte) address of an end station within a TCP/IP network. The IP address cannot be defined in the SysKonnect Network Control.

Parameter: Maximum Frame Size
Values: 12..9014
Description: This parameter specifies the maximum frame size in bytes the driver will support. The performance of the network usually increases if a large packet size is used. Do not use values larger than 1514 if you are not sure whether or not your network supports jumbo frames. If VLAN is configured, the actual frame size on the port is always 4 bytes larger than the configured frame size of the VLAN because the VLAN tag is inserted into the frame. The Maximum Frame Size can be defined in the SysKonnect Network Control.

Parameter: Maximum Multicast
Values: 0..10000
Description: This option specifies the maximum number of multicast addresses the driver accepts. The Maximum Multicast can be defined in the SysKonnect Network Control.
If a port, which is part of a team, is selected, the following additional link aggregation parameters are displayed among others (see Figure 9):
Parameter: Link Status
Values: Link up, Link down
Description: This parameter indicates whether the port has an active link (up) or is inactive (down). The status of this parameter is delivered by the network driver.

A further parameter indicates whether the port has found an 802.3ad partner with which it can form a team. The status of this parameter is delivered by the network driver.

Parameter: Switch membership
Values: e.g. sw1, sw2, sw3, ...
Description: This parameter shows to which switch the port is connected physically. The information for this parameter is delivered by the network driver. It can only be changed if you physically connect the port to a different switch.

Parameter: Failover Status
Values: active, standby
Description: This parameter indicates whether the port is active or in standby mode to take over in case the active link fails.
5 Conclusion
SysKonnect's solution for Link Aggregation offers two main features which are essential for every network administrator: it provides increased capacity and a fail-safe system. By employing Link Aggregation, the costs for upgrading the performance and the resiliency of a system can be kept reasonable because both benefits can be attained using existing hardware. By using the automatic configuration protocol LACP, we can provide redundancy with automatic switching to the standby link in case the active link fails. The SysKonnect driver enables load balancing not only on the basis of MAC address information but also on the basis of IP, TCP, and UDP information. Higher throughput by aggregating multiple links is possible with existing hardware. No additional network adapters have to be purchased.

The benefits of Link Aggregation can be obtained with the SysKonnect Network Driver Installation Package for Windows 2000. The package contains the Miniport driver, the Virtual LAN (VLAN) intermediate driver, the Link Aggregation (LAGG) intermediate driver and the configuration utility SysKonnect Network Control. It is available for free download for SysKonnect customers on our web site: www.syskonnect.com.

Demanding applications running in high-performance environments, such as enterprise servers, web servers, and intranet servers, gain particularly from the high-bandwidth and duplex capabilities of Link Aggregation. The SysKonnect implementation of Link Aggregation also provides a perfect solution on the road to the migration to 10 Gigabit Ethernet, which will be integrated in the future. The user can fill the capacity gap by employing, for example, four 1000 Mb/s adapters in a team and also gets the benefit of a fail-safe system by making use of the Redundant Switch Failover mechanism the SysKonnect driver provides.
Headquarters
SysKonnect GmbH
Siemensstrasse 23
D-76275 Ettlingen
Germany
Phone: +49 7243 502 100
Support: +49 7243 502 330
Fax: +49 7243 502 989
E-mail: [email protected]

Americas, Canada, and Pacific Rim
SysKonnect, Inc.
1922 Zanker Road
San Jose, CA 95112
USA
Phone: +1 408 437 3800
Sales: +1 800 752 3334
Support: +1 866 782 2507, +1 408 437 3857
Fax: +1 408 437 3866
E-mail: [email protected]

Europe, Middle East, and Africa
SysKonnect Ltd.
55 Henley Drive
Frimley Green, Camberley
Surrey, GU16 6NF
United Kingdom
Phone: +44 1 252 836 467
Fax: +44 1 252 836 537
E-mail: [email protected]