This chapter introduces general campus switching design considerations and describes modularity in switching designs. It includes the following sections:
Campus Design Methodology
Case Study and Simulation Exercise
The availability of multigigabit campus switches gives customers the opportunity to build extremely high-performance, high-reliability networks, if they follow correct network design approaches. Unfortunately, some alternative network design approaches can result in a network that has lower performance, reliability, and manageability.
This chapter describes a hierarchical modular design approach called multilayer design. First, it addresses general campus switching design considerations. The differences between Layer 2 (L2) and Layer 3 (L3) switching, and where to use each, are also discussed.
When you finish this chapter, you will be able to understand campus network switch design fundamentals and describe the positioning of switches in campus network modules.
Campus Design Methodology
The multilayer approach to campus network design combines Layer 2 switching with Layer 3 switching to achieve robust, highly-available campus networks. This section discusses the factors you should consider for a Campus local-area network (LAN) design.
Designing an Enterprise Campus
Designing an Enterprise Campus network requires a broad view of the network's overall picture. The network designer must be familiar with both Enterprise Campus design methodologies and Enterprise Campus modules.
Campus design requires an understanding of the organizational network borders (geography) and the existing and planned application traffic flows. Physical characteristics of the network depend on the following criteria:
Selected transmission media
The type of technology (switched or shared)
The type of traffic forwarding (switching) in network devices (Layer 2 or Layer 3)
You should consider the following five factors when deploying the campus network:
Network geography: The distribution of network nodes (for example, host or network devices) and the distances between them significantly affect the campus solution, especially the physical transmission media.
Network applications: In terms of bandwidth and delay, the application requirements place stringent requirements on a campus network solution.
Data link layer technology (shared or switched): The dedicated bandwidth solution of LAN switching is replacing the traditional approach, in which all devices share the available bandwidth using hubs. The network designer must consider these options, especially when migrating or upgrading existing networks.
Layer 2 versus Layer 3 switching: The network devices and their features determine the network's flexibility, but also contribute to the network's overall delay. Layer 2 switching is based on media access control (MAC) addresses, and Layer 3 switching is based on network layer addresses, usually Internet Protocol (IP) addresses.
Transmission media (physical cabling): Cabling is one of the biggest long-term investments in network deployment. Therefore, transmission media selection depends not only on the required bandwidth and distances, but also on the emerging technologies that might be deployed over the same infrastructure in the future. The network designer must thoroughly evaluate the cost of the medium (including installation costs) and the available budget in addition to the technical characteristics, such as signal attenuation and electromagnetic interference. Two major cabling options exist: copper-based media (for example, unshielded twisted pair [UTP]) and optical fiber.
The following sections examine these factors.
The location of Enterprise Campus nodes and the distances between them determine the network's geography. When designing the Enterprise Campus network, the network designer's first step is to identify the network's geography. The network designer must determine the following:
Location of nodes: Nodes (end users, workstations, or servers) within an organization can be located in the same room, building, or geographical area.
Distances between the nodes: Based on the location of nodes and the distance between them, the network designer decides which technology should be used, the maximum speeds, and so on. (Media specifications typically include a maximum distance, how often regenerators can be used, and so on.)
The following geographical structures can be identified with respect to the network geography:
Intra-building
Inter-building
Distant remote building
Distant remote building over 100 km
These geographical structures serve as guides to help determine Enterprise Campus transmission media and the logical modularization of the Enterprise Campus network. The following sections describe these geographical structures.
An intra-building campus network structure provides connectivity for the end nodes, which are all located in the same building, and gives them access to the network resources. (The access and distribution layers are typically located in the same building.)
User workstations are usually attached to the floor-wiring closet with UTP cables. To allow the most flexibility in the use of technologies, the UTP cables are typically Category 5 (CAT 5) or better. Wiring closets usually connect to the building central switch (distribution switch) over optical fiber, which offers better transmission performance and is less sensitive to environmental disturbances.
As shown in Figure 4-1, an inter-building network structure provides the connectivity between the individual campus buildings' central switches (in the distribution and/or core layers). Typically placed only a few hundred meters to a few kilometers apart, these buildings are usually in close proximity.
Figure 4-1 Inter-Building Network Structure
Because the nodes in all campus buildings usually share common devices such as servers, the demand for high-speed connectivity between the buildings is high. To provide high throughput without excessive interference from environmental conditions, optical fiber is the medium of choice between the buildings.
Distant Remote Building Structure
When connecting distances that exceed a few kilometers (usually within a metropolitan area), the network designer's most important factor to consider is the physical media. The speed and cost of the network infrastructure depend heavily on the media selection.
Usually, the bandwidth requirements are higher than the physical connectivity options can support. In such cases, the network designer must identify the organization's critical applications and then select the equipment that supports intelligent network services, such as quality of service (QoS) and filtering capabilities that allow optimal use of the bandwidth.
Some companies might own their media, such as fiber or copper lines. However, if the organization does not own physical transmission media to certain remote locations, the Enterprise Campus network must connect through the Enterprise Edge wide-area network (WAN) module using connectivity options from public service providers (such as a metropolitan-area network [MAN]).
Network Geography Considerations
Table 4-1 compares the types of connectivity, availability importance, required throughput, and expected cost for each geographical structure.
Table 4-1 Network Geography Considerations
Distant Remote Building
Distant Over 100 km
MM = Multimode; SM = single-mode
Depending on the distances and environmental conditions that result from the respective geographical scopes, various connectivity options exist, ranging from traditional copper media to fiber-based transmission media.
Typically, availability within a building is very important, and it decreases with distance between buildings. (This is because the physical buildings in the campus often form the core of the campus network; communication to buildings located farther from the core is not as important.)
The throughput requirements increase close to the network's core and close to the sites where the servers reside.
A quick review of Table 4-1 reveals a combination of a high level of availability, medium bandwidth, and a low price for the Enterprise Campus network when all nodes are located in the same building. The cost of transmission media increases with the distance between nodes. A balance between the desired bandwidth and the available budget is usually required to keep the cost reasonable; bandwidth is often sacrificed.
Network Application Characterization
Application characterization is the process of determining the characteristics of the network's applications. Network designers should determine which applications are critical to the organization and the network demands of these applications to determine enterprise traffic patterns inside the Enterprise Campus network. This process should result in information about network bandwidth usage and response times for certain applications. These parameters influence the selection of the transmission medium and the desired bandwidth.
Different types of application communication result in varying network demands. The following sections review four types of application communication: client-client, client-distributed server, client-Server Farm, and client-Enterprise Edge.
From the network designer's perspective, client-client applications include those applications in which the majority of network traffic passes from one network edge device to another through the organization's network, as shown in Figure 4-2. Typical client-client applications include the following:
IP telephony: Two peers establish communication with the help of a telephone manager workstation; however, the conversation occurs directly between the two peers when the connection is established.
File sharing: Some operating systems (or even applications) require direct access to data on other workstations.
Videoconference systems: This application is similar to IP telephony. However, the network requirements for this type of application are usually higher, particularly bandwidth consumption and QoS requirements.
Figure 4-2 Client-Client Application
Client-Distributed Server Applications
Historically, clients and servers were attached to a network device on the same LAN segment.
With increased traffic on the corporate network, an organization can decide to split the network into several isolated segments. As shown in Figure 4-3, each of these segments has its own servers, known as distributed servers, for its application. In this scenario, servers and users are located in the same virtual LAN (VLAN). Department administrators manage and control the servers. The majority of department traffic occurs in the same segment, but some data exchange (to a different VLAN) can happen over the campus backbone. For traffic passing to another segment, the overall bandwidth requirement might not be crucial. For example, Internet access must go through a common segment that requires less performance than the traffic to the local segment servers.
Figure 4-3 Client-Distributed Server Application
Client-Server Farm Applications
In a large organization, the organizational application traffic passes across more than one wiring closet, or VLAN. Such applications include
Organizational mail servers (such as Lotus Notes and Microsoft Exchange)
Common file servers (such as Novell, Microsoft, and Sun)
Common database servers for organizational applications (such as Sybase, Oracle, and IBM)
A large organization requires its users to have fast, reliable, and controlled access to the critical applications. To fulfill these demands and keep administrative costs down, the solution is to place the servers in a common Server Farm, as shown in Figure 4-4. The placement of servers in a Server Farm requires the network designer to select a network infrastructure that is highly resilient (providing security), redundant (providing high availability), and that provides adequate throughput. High-end LAN switches with the fastest LAN technologies, such as Gigabit Ethernet, are typically deployed in such an environment.
Figure 4-4 Client-Server Farm Application
Client-Enterprise Edge Applications
As shown in Figure 4-5, Client-Enterprise Edge applications use servers on the Enterprise Edge. These applications exchange data between the organization and its public servers.
The most important communication issue between the Enterprise Campus Network and the Enterprise Edge is not performance, but security. High availability is another important characteristic; data exchange with external entities must be in constant operation. Applications installed on the Enterprise Edge can be crucial to organizational process flow; therefore, any outages can increase costs.
Typical Enterprise Edge applications are based on web technologies. Examples of these application types, such as external mail servers and public web servers, can be found in any organization.
Figure 4-5 Client-Enterprise Edge Application
Organizations that support their partnerships through e-commerce applications also place their e-commerce servers into the Enterprise Edge. Communication with these servers is vital because of the two-way replication of data. As a result, high redundancy and resiliency of the network, along with security, are the most important requirements for these applications.
Table 4-2 compares the types of applications and their requirements for the most important network parameters. The following sections discuss these parameters.
Table 4-2 Network Application Requirements
Client-Enterprise Edge Servers
Total required throughput
Total network cost
The wide use of LAN switching at Layer 2 has revolutionized local-area networking and has resulted in increased performance and more bandwidth for satisfying the requirements of new organizational applications. LAN switches provide this performance benefit by increasing bandwidth and throughput for workgroups and local servers.
The shared media for client-client (also termed peer-to-peer) communication is suitable only in a limited scope, typically when the number of client workstations is very low (for example, in small home offices).
The required throughput varies from application to application. An application that exchanges data between users in the workgroup usually does not require a high-throughput network infrastructure. However, organizational-level applications usually require a high-capacity link to the servers, which are usually located in the Server Farm.
Client-client communication, especially in the case of frequent file transfers, could be intensive, and the total throughput requirements can be high.
Applications located on servers in the Enterprise Edge are normally not as bandwidth-consuming (compared to the applications in the Server Farm) but may require high-availability and security features.
High availability is a function of the application and the entire network between a client workstation and a server that is located in the network. Although network availability is primarily determined by the network design, the individual components' mean time between failures (MTBF) is a factor. It is recommended that you add redundancy to the distribution layer and the campus.
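The way component MTBF and redundancy combine into an availability figure can be illustrated with the standard steady-state formulas. The following is only a rough sketch: mean time to repair (MTTR) is not discussed in this chapter and is introduced here as an assumption, as are the function names and the sample figures. It also assumes that redundant components fail independently of each other.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of one component: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def series(*parts):
    """Every component must work, as on a single path through several devices."""
    result = 1.0
    for a in parts:
        result *= a
    return result


def parallel(*parts):
    """At least one component must work (redundant devices, independent failures assumed)."""
    all_fail = 1.0
    for a in parts:
        all_fail *= (1.0 - a)
    return 1.0 - all_fail


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
switch = availability(mtbf_hours=9990, mttr_hours=10)
single_path = series(switch, switch)      # two switches in a row, no redundancy
redundant_pair = parallel(switch, switch) # two switches side by side

print(f"one switch:     {switch:.6f}")
print(f"two in series:  {single_path:.6f}")
print(f"redundant pair: {redundant_pair:.6f}")
```

Under these assumptions, chaining devices lowers availability, while duplicating a device raises it, which is the reasoning behind adding redundancy.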
Depending on the application and the resulting network infrastructure, the cost varies from low in a client-client environment to high in a highly redundant Server Farm. In addition to the cost of duplicate components for redundancy, costs include the cables, routers, switches, software, and so forth.
Data Link Layer Technologies
Traditionally, network designers had a limited number of hardware options when purchasing a technology for their campus networks. Hubs were used for wiring closets, and routers were used to break the network into logical segments. The increasing power of desktop processors and the requirements of client/server and multimedia applications drove the need for greater bandwidth in traditional shared-media environments. These requirements are prompting network designers to replace hubs with LAN switches.
Key Point: Bandwidth Domains and Broadcast Domains
A bandwidth domain, which is known as a collision domain for Ethernet LANs, includes all devices that share the same bandwidth. For example, when using switches or bridges, everything associated with one port is a bandwidth domain.
A broadcast domain includes all devices that see each other's broadcasts (and multicasts). For example, all devices associated with one router port reside in the same broadcast domain.
Devices in the same bandwidth domain also reside in the same broadcast domain; however, devices in the same broadcast domain can reside in different bandwidth domains.
All workstations residing in one bandwidth domain compete for the same LAN bandwidth resource. All traffic from any host in the bandwidth domain is visible to all the other hosts. In the case of an Ethernet collision domain, two stations can cause a collision by transmitting at the same time. The stations must then stop transmitting and try again at a later time, thereby delaying traffic transmittal.
All broadcasts from any host residing in the same broadcast domain are visible to all other hosts in the same broadcast domain. Desktop protocols such as AppleTalk, Internetwork Packet Exchange (IPX), and IP require broadcasts or multicasts for resource discovery and advertisement. Hubs, switches, and bridges forward broadcasts and multicasts to all ports. Routers do not forward these broadcasts or multicasts to any ports. In other words, routers block broadcasts (which are destined for all networks) and multicasts; routers forward only unicast packets (which are destined for a specific device) and directed broadcasts (which are destined for all devices on a specific network).
Shared technology using hubs or repeaters is based on all devices sharing a segment's bandwidth. Initially, the entire Ethernet segment was a single common bus: the cable itself. With the introduction of hubs and new structured wiring, the physical network bus topology changed to a star topology. This topology resulted in fewer errors in the network because the repeaters receive an electrical signal and boost it before forwarding it to all other segment participants (on all other repeater ports). All devices on all ports of a hub or repeater are in the same bandwidth (collision) domain.
Switched LAN Technology
Switched LAN technology uses the same physical star topology as hubs but eliminates the sharing of bandwidth. Devices on each port of a switch are in different bandwidth (collision) domains; however, all devices are still in the same broadcast domain. The LAN switches provide an efficient way of transferring network frames over the organizational network. In case of a frame error, the switch does not forward the frame as a hub or repeater would.
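The domain rules described so far can be summarized in a few lines of Python. This is a minimal sketch, assuming a single device with one end station attached to each port; the function name `count_domains` and the device-type strings are hypothetical.

```python
def count_domains(kind, ports):
    """Return (bandwidth_domains, broadcast_domains) for one device.

    Assumes one end station is attached directly to every port.
    """
    if kind == "hub":
        # All ports share one bandwidth (collision) domain and one broadcast domain.
        return 1, 1
    if kind == "switch":
        # Each port is its own bandwidth domain; broadcasts still reach every port.
        return ports, 1
    if kind == "router":
        # Each port is its own bandwidth domain and its own broadcast domain.
        return ports, ports
    raise ValueError(f"unknown device kind: {kind}")


for kind, ports in [("hub", 24), ("switch", 24), ("router", 4)]:
    bandwidth, broadcast = count_domains(kind, ports)
    print(f"{kind:6s} {ports:2d} ports: {bandwidth} bandwidth domain(s), {broadcast} broadcast domain(s)")
```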
Comparing Switched and Shared Technologies
Table 4-3 presents some of the most obvious differences and benefits of switched technology compared to shared technology. It uses Fast Ethernet as an example.
Table 4-3 Switched Versus Shared Fast Ethernet Technologies
>10 Megabits per second (Mbps)
From 1 meter
The major drawback of shared technology is that all network devices must compete for the same bandwidth; only one frame flow is supported at a time. Bandwidth in shared technology is limited to the speed on a network segment (in this case, 100 Mbps for Fast Ethernet). Because of collisions, aggregate network bandwidth is less than this.
LAN switching technology supports speeds from Ethernet (10 Mbps) onward and enables multiple ports to simultaneously forward frames over the switch. Thus, the utilized aggregate network bandwidth could be much greater than with shared technology.
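The difference in aggregate bandwidth can be made concrete with a simple upper-bound calculation. The sketch below rests on assumptions that are not stated in the text: the switch is non-blocking, ports operate in half-duplex mode, and traffic flows between disjoint pairs of ports. The function names are hypothetical, and collisions on the shared segment would reduce the shared figure further.

```python
def shared_aggregate_mbps(segment_speed_mbps):
    """Upper bound for a hub-based segment: only one frame flow at a time."""
    return segment_speed_mbps


def switched_aggregate_mbps(port_speed_mbps, ports):
    """Upper bound for a non-blocking switch whose ports talk in disjoint pairs."""
    return port_speed_mbps * (ports // 2)


print(shared_aggregate_mbps(100))        # 24-port Fast Ethernet hub
print(switched_aggregate_mbps(100, 24))  # 24-port Fast Ethernet switch
```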
A Layer 3 device separates network segments from each other into different broadcast domains. A traditional network's Layer 3 device was a router; in a modern network, the preference is for a Layer 3 switch.
In a shared network, the network's diameter (the largest distance between two network devices) is constrained by the transmission media's physical characteristics because of the collision detection algorithm; the maximum distance between devices is limited to ensure that collisions are detected. In a shared environment, all devices reside in the same collision domain. The hub improves the frame's physical characteristics but does not check for frame errors. Every station on the segment must compete for resources and be able to detect whether two or more network stations are transmitting at the same time. The Ethernet standard for shared technology defines how long the sending device must possess the bus before it actually sends the data, so collisions can be detected. Because of this time limitation, the length or range of the segment is defined and never reaches more than 500 meters in the best-case scenario.
In a switched environment, devices on each port are in different collision domains. Collision detection is only a concern on each physical segment, and the segments themselves are limited in length. Because the switch stores the entire frame or part of it before forwarding it, the segments do not generate any collisions. The media that is used does not constrain the overall network's diameter.
The traditional shared technology is not capable of supporting new network features; this became important with the increasing number of organizational client/server and multimedia applications. LAN switches perform several functions at Layer 3, and even at higher Open System Interconnection (OSI) layers. Modern networks are required to support intelligent network services (such as QoS), security, and management; LAN switches have the ability to support these.
Many organizational processes that run on the network infrastructure are critical for the organization's success. Consequently, high availability has become increasingly important. While shared networks do not offer the required capability, the LAN switches do.
Switches can be interconnected with multiple links without creating loops in the network (using the Spanning Tree protocol). Hubs cannot be interconnected with redundant links.
Considering all the benefits LAN switches offer, you might expect the cost per port to be much higher on switches than on hubs. However, with wide deployment and availability, the price per port for LAN switches is almost the same as it is for hubs or repeaters.
All of the previously listed factors have mostly eliminated shared technologies; the majority of new networks use only switched technologies. Shared technologies are present in only some parts of existing networks and in smaller home offices.
Layer 2 and Layer 3 Switching Design Considerations
LAN switches have traditionally been only Layer 2 devices. Modern switches provide higher OSI level functionalities and can effectively replace routers in the LAN switched environment. Deploying pure Layer 2 or selecting Layer 3 switches in the enterprise network is not a trivial decision. It requires a full understanding of the network topology and customer demands.
Key Point: Layer 2 Versus Layer 3 Switching
The difference between Layer 2 and Layer 3 switching is the type of information that is used inside the frame to determine the correct output interface. Layer 2 switching forwards frames based on data link layer information (MAC address), while Layer 3 switching forwards frames based on network layer information (such as IP address).
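The two lookup styles can be contrasted with a short Python sketch. It is an illustration only: the table contents and interface names are hypothetical, flooding of unknown destinations and longest-prefix matching are standard behaviors added here for completeness, and real switches perform these lookups in hardware.

```python
import ipaddress

# Hypothetical Layer 2 table: exact MAC address -> output port.
mac_table = {
    "00:11:22:33:44:55": "Gi0/1",
    "66:77:88:99:aa:bb": "Gi0/2",
}

# Hypothetical Layer 3 table: IP prefix -> output interface.
routes = [
    (ipaddress.ip_network("10.1.0.0/16"), "Vlan10"),
    (ipaddress.ip_network("10.1.20.0/24"), "Vlan20"),
    (ipaddress.ip_network("0.0.0.0/0"), "Uplink"),
]


def l2_forward(dst_mac):
    """Layer 2: exact match on the destination MAC; unknown addresses are flooded."""
    return mac_table.get(dst_mac.lower(), "flood")


def l3_forward(dst_ip):
    """Layer 3: pick the most specific (longest) prefix that contains the address."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [(net, port) for net, port in routes if addr in net]
    if not matches:
        return None
    _, port = max(matches, key=lambda m: m[0].prefixlen)
    return port


print(l2_forward("00:11:22:33:44:55"))
print(l3_forward("10.1.20.7"))
```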
When deciding on the type of LAN switch to use and the features to be deployed into a network, consider the following factors:
Network service capabilities: The network services the organization requires (QoS, and so on).
Size of the network segments: How the network is segmented, based on traffic characteristics.
Convergence times: The maximum amount of time the network can be unavailable in the event of network outages.
Spanning-Tree Domain Considerations
Layer 2 switches use the Spanning Tree Protocol (STP) to ensure that only one active path exists between two switches. If a physical loop exists (for redundancy), STP puts ports on the switch in the blocking state (thereby effectively disabling the ports, from a data perspective) to ensure a loop-free network. In the event of a failure, the blocked port is re-enabled (put into the forwarding state). An STP domain is a set of switches that communicates via STP. STP is illustrated in Figure 4-6.
Figure 4-6 STP
STP selects a root switch (or root bridge, according to IEEE 802.1d standard terminology) and determines whether any redundant paths exist. After the switch comes online, it takes up to 50 seconds before the root switch and redundant links are detected. At this time, the switch ports go through the listening and learning states; from there they progress to either the forwarding or blocking state. No ordinary traffic can travel through the network at this time.
The default STP Forward Delay timer is 15 seconds; it determines how long the port stays in each of the listening and learning states (for a total of 30 seconds). The Maximum Age timer defaults to 20 seconds; this is the time during which a switch stores a bridge protocol data unit (BPDU) before discarding it, and it therefore determines when the switch recognizes that a topology change has occurred. The sum of these 30 seconds and 20 seconds accounts for the 50 seconds referred to previously.
When the primary link goes down and the redundant link must be activated, a similar event occurs. The time it takes for a redundant path to be activated depends on whether the failure is direct (a port on the same switch) or indirect (a port on another switch). Direct failures take 30 seconds because the switch bypasses the 20-second Maximum Age timer (and associated Blocking State for the port); from there it moves straight to the listening state (for 15 seconds), and then to the learning state (for 15 seconds). For indirect failures, the switch port must first wait 20 seconds (Maximum Age Timer) before it can transition to the listening state and then the learning state, for a total of 50 seconds. Thus, when a link fails, up to 50 seconds might pass before another link becomes available.
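The timer arithmetic for direct and indirect failures can be checked with a short calculation. This is a minimal sketch using the default timer values quoted in the text; the function name and the failure labels are hypothetical.

```python
FORWARD_DELAY = 15  # seconds, default; applies to listening and to learning
MAX_AGE = 20        # seconds, default


def stp_convergence_seconds(failure, forward_delay=FORWARD_DELAY, max_age=MAX_AGE):
    """Worst-case time before a redundant port forwards traffic."""
    listening_plus_learning = 2 * forward_delay
    if failure == "direct":
        # The switch sees its own link fail and skips the Maximum Age wait.
        return listening_plus_learning
    if failure == "indirect":
        # The stored BPDU must age out before the port starts listening.
        return max_age + listening_plus_learning
    raise ValueError(f"unknown failure type: {failure}")


print(stp_convergence_seconds("direct"))
print(stp_convergence_seconds("indirect"))
```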
Cisco has implemented several features that have improved STP convergence. Recent standardization efforts have also proposed some new enhancements to the STP. Following is a brief description of the STP enhancements that result in faster convergence; this convergence is comparable to Layer 3 convergence and, in some instances, even exceeds it.
PortFast: Used for ports in which end-user stations and/or servers are directly connected. When PortFast is enabled, there is no delay in passing traffic because the switch immediately puts the port in the forwarding state (skipping the listening and learning states). Two additional measures that prevent potential STP loops are associated with the PortFast feature:
Bridge Protocol Data Unit (BPDU) Guard: PortFast transitions the port into STP forwarding mode immediately upon linkup. Since the port still participates in STP, the potential of STP loop exists (if some device attached to that port also runs STP). The BPDU guard feature enforces the STP domain borders and keeps the active topology predictable. If the port receives a BPDU, the port is transitioned into errdisable state (meaning that it was disabled due to an error) and an error message is reported.
BPDU Filtering: This feature allows the user to block PortFast-enabled nontrunk ports from transmitting BPDUs. Spanning tree does not run on these ports.
UplinkFast: If the link to the root switch goes down and the link is directly connected to the switch, UplinkFast enables the switch to put a redundant path (port) into active state within a second.
BackboneFast: If a link on the way to the root switch fails but is not directly connected to the switch, BackboneFast reduces the convergence time from 50 seconds to between 20 and 30 seconds. When this feature is used, it must be enabled on all switches in the STP domain.
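The interaction between PortFast and BPDU Guard described in the list above can be modeled as a very small state machine. The following Python sketch is a simplification: the class and method names are hypothetical, and the listening and learning timers of a port without PortFast are not modeled.

```python
class AccessPort:
    """Toy model of an access port with optional PortFast and BPDU Guard."""

    def __init__(self, portfast=False, bpdu_guard=False):
        self.portfast = portfast
        self.bpdu_guard = bpdu_guard
        self.state = "down"

    def link_up(self):
        # PortFast skips listening and learning and forwards immediately.
        self.state = "forwarding" if self.portfast else "listening"

    def receive_bpdu(self):
        # BPDU Guard disables a PortFast port as soon as a BPDU arrives.
        if self.portfast and self.bpdu_guard:
            self.state = "errdisable"


port = AccessPort(portfast=True, bpdu_guard=True)
port.link_up()
print(port.state)   # forwarding
port.receive_bpdu()
print(port.state)   # errdisable
```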
In addition to features that enable faster convergence of the STP, features exist that prevent errors from resulting in unpredictable STP topology changes that could lead to STP loops. These features include the following:
STP Loop Guard: When one of the blocking ports in a physically redundant topology stops receiving BPDUs, usually STP creates a potential loop by moving the port to forwarding state. With the STP Loop Guard feature enabled and if a blocking port no longer receives BPDUs, that port is moved into the STP loop-inconsistent blocking state instead of the listening/learning/forwarding state. This feature avoids loops in the network that result from unidirectional or other software failures.
BPDU Skew Detection: This feature allows the switch to keep track of late-arriving BPDUs (by default, BPDUs are sent every 2 seconds) and notify the administrator via syslog messages. Skew detection generates a report for every port on which a BPDU has ever arrived late (this is known as skewed). Report messages are rate-limited (one message every 60 seconds) to protect the CPU.
Unidirectional Link Detection (UDLD): If the STP process that runs on the switch with a blocking port stops receiving BPDUs from its upstream (designated) switch on that port, STP creates a forwarding loop or STP loop by eventually aging out the STP information for this port and moving it to the forwarding state. UDLD is a Layer 2 protocol that works with the Layer 1 mechanisms to determine a link's physical status. If the port does not see its own device/port ID in the incoming UDLD packets for a specific duration of time, the link is considered unidirectional from the Layer 2 perspective. Once UDLD detects the unidirectional link, the respective port is disabled and an error message is generated.
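The echo check at the heart of UDLD, as described in the last item above, can be sketched as follows. This is an illustration only and not the protocol's packet format: the function name and data structure are hypothetical, and the detection window is represented simply as the list of packets received during it. An empty list is treated the same as a window in which the port never saw its own ID.

```python
def link_is_unidirectional(own_id, echoes_received):
    """Return True if no incoming packet echoed this port's own device/port ID.

    echoes_received: one set per incoming packet in the detection window,
    each holding the (device, port) IDs the neighbor reports having heard.
    """
    return not any(own_id in echoed for echoed in echoes_received)


own = ("switch-1", "Gi0/1")

healthy = [{("switch-1", "Gi0/1")}, {("switch-1", "Gi0/1")}]
broken = [set(), {("switch-9", "Gi0/3")}]  # the neighbor never hears switch-1

print(link_is_unidirectional(own, healthy))
print(link_is_unidirectional(own, broken))
```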
Although spanning tree was previously considered to have very slow convergence (up to 50 seconds), the latest standard enhancements render its convergence comparable to (or even exceeding) that of routing protocols. The following enhancements are useful in environments that contain several VLANs:
Rapid STP (RSTP, defined in IEEE 802.1w): RSTP provides rapid convergence of the spanning tree by assigning port roles and determining the active topology. RSTP builds upon the IEEE 802.1d STP to select the switch with the highest switch priority (that is, the numerically lowest priority value) as the root switch and then assigns the port roles (root, designated, alternate, backup, and disabled) to individual ports. These roles assist in rapid STP convergence, which can be extremely fast (within a second) because of the topology knowledge.
Multiple STP (MSTP, sometimes referred to as MISTP [Multiple Instances of STP], defined in IEEE 802.1s): MSTP uses RSTP for rapid convergence by enabling several (topologically identical) VLANs to be grouped into a single spanning tree instance, with each instance including a spanning tree topology that is independent of other spanning tree instances. This architecture provides multiple forwarding paths for data traffic, enables load balancing, and reduces the number of spanning tree instances that are required to support a large number of VLANs.
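Root switch selection, which both STP and RSTP rely on, comes down to comparing bridge IDs. The sketch below assumes the usual rule that the lowest bridge ID wins, where the priority value is compared first and the MAC address breaks ties. The switch names, priorities, and MAC addresses are hypothetical.

```python
def bridge_id(priority, mac):
    """Bridge ID as a comparable tuple: priority value first, then MAC as an integer."""
    return (priority, int(mac.replace(":", ""), 16))


def elect_root(switches):
    """switches maps a name to (priority, mac); return the name with the lowest bridge ID."""
    return min(switches, key=lambda name: bridge_id(*switches[name]))


switches = {
    "access-1": (32768, "00:1a:2b:00:00:30"),
    "dist-1": (4096, "00:1a:2b:00:00:20"),
    "dist-2": (4096, "00:1a:2b:00:00:10"),
}

# dist-1 and dist-2 tie on priority, so the lower MAC address decides.
print(elect_root(switches))
```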
Load Sharing Guidelines
Layer 2 and Layer 3 switches handle load sharing differently, as described in the following sections.
Layer 2 Load Sharing
Because Layer 2 switches are aware of only MAC addresses, they cannot perform any intelligent load sharing. In an environment characterized by multiple VLANs per access switch and more than one connection to the uplink switch, the solution is to put all uplink connections into trunks (Inter-Switch Link [ISL] or IEEE 802.1Q). Each trunk carries all VLANs; however, without additional configuration, STP disables all nonprimary uplink ports. This configuration can result in a bandwidth shortage because the traffic for all the VLANs passes through the same link. To overcome this problem, the STP parameters must be configured to carry some VLANs across one uplink and the rest of the VLANs across the other uplink. For example, one uplink could be configured to carry the VLANs with odd numbers, while the other uplink is configured to carry the VLANs with even numbers. The top of Figure 4-7 illustrates this situation.
Figure 4-7 Layer 2 Versus Layer 3 Load Sharing
Layer 3 Load Sharing
Layer 3-capable switches can perform load sharing based on IP addresses. As illustrated in the lower portion of Figure 4-7, most modern Layer 3 devices with load sharing capability can balance the load per packet or per destination-source IP pair.
The advantage of Layer 3 IP load sharing is that links are used more proportionately than with Layer 2 load sharing, which is based on VLANs only. For example, the traffic in one VLAN can be very heavy while the traffic in another VLAN is very low; in this case, per-VLAN load sharing by using even and odd VLANs is not appropriate. Due to the dynamic nature of organizational applications, Layer 3 load sharing is more appropriate. Layer 3 allows for dynamic adaptation to link utilization and depends on the routing protocol design. Layer 3 switches also support Layer 2 load sharing, so they can still apply per-VLAN load sharing while connected to other Layer 2 switches.
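The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the uplink names and traffic figures are hypothetical, and a CRC32 hash stands in for whatever per-pair algorithm a real Layer 3 switch uses.

```python
import zlib
from collections import Counter

UPLINKS = ("uplink-A", "uplink-B")  # hypothetical uplink names


def uplink_per_vlan(vlan_id: int) -> str:
    """Layer 2 style: odd VLANs on one uplink, even VLANs on the other."""
    return UPLINKS[0] if vlan_id % 2 else UPLINKS[1]


def uplink_per_flow(src_ip: str, dst_ip: str) -> str:
    """Layer 3 style: hash the source-destination pair to pick an uplink."""
    key = f"{src_ip}->{dst_ip}".encode()
    return UPLINKS[zlib.crc32(key) % len(UPLINKS)]


# Hypothetical traffic: VLAN 11 is busy (40 flows), VLAN 12 is nearly idle.
# Each tuple is (VLAN, source IP, destination IP, megabytes transferred).
flows = [(11, f"10.0.11.{h}", f"10.9.0.{h % 5 + 1}", 100) for h in range(1, 41)]
flows += [(12, "10.0.12.1", "10.9.0.1", 100)]

per_vlan, per_flow = Counter(), Counter()
for vlan, src, dst, mbytes in flows:
    per_vlan[uplink_per_vlan(vlan)] += mbytes
    per_flow[uplink_per_flow(src, dst)] += mbytes

print("per-VLAN:", dict(per_vlan))  # all VLAN 11 traffic lands on one uplink
print("per-flow:", dict(per_flow))  # traffic is spread by address pair
```

With per-VLAN assignment, the busy VLAN loads a single uplink regardless of how many hosts it contains; the per-pair hash spreads the same traffic without regard to VLAN membership.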
Layer 2 Versus Layer 3 Switching
Table 4-4 compares Layer 2 and Layer 3 switching with respect to various campus network features. Considerations for deployment include
Pure Layer 2 switching throughout the network
Various combinations of Layer 2 and Layer 3 switching, including
Layer 3 switching in the distribution layer only
Layer 3 switching in the distribution and core layers
Layer 3 switching throughout the network
Table 4-4 Layer 2 Versus Layer 3 Switching
Layer 2 Everywhere: Access control list (ACL) and QoS
Layer 3 in Distribution Only: Layer 2 and Layer 3 ACL and QoS. Distribution: routing protocol hold-timer (quick)
Layer 3 in Core and Distribution: Layer 2 and Layer 3 ACL and QoS. Core and distribution: routing protocol hold-timer (quick)
Layer 3 Everywhere: Layer 2 and Layer 3 ACL and QoS. Routing protocol hold-timer (quick)
The following sections elaborate on the features in Table 4-4.
The policy domain is the scope of the network that is affected by a certain policy. A network policy is a formal set of statements that define how network resources are allocated among devices. In addition to selected hosts or applications, the policies can be applied to individual users, groups, or entire departments. For example, policies can be based on the time of day or client authorization priorities. Network managers implement policies and policy statements and store them in a policy repository or on the device itself. The devices then apply the configured policies to network resources.
The size of the policy domain depends on the switching layer and on the mechanisms for policy implementation. In pure Layer 2 switching, the policy domain overlaps with the switching domain's boundaries; Layer 3 switching offers much more flexibility. In Layer 2 switching, the access control lists (ACLs) and various QoS mechanisms can be applied only to switched ports and MAC addresses; in Layer 3 switching, the ACL and QoS mechanisms are extended to IP addresses, or even applications (for example, using Transmission Control Protocol [TCP] and User Datagram Protocol [UDP] ports).
When multiple links exist, they can be used for redundancy and/or traffic load sharing. As discussed in the "Load Sharing Guidelines" section of this chapter, Layer 2 switches only offer load sharing by distributing VLANs across different uplink ports. Layer 3 switches, however, can perform load sharing between ports based on IP destinations.
A failure domain defines the scope of the network that is affected by network failures. In a Layer 2-switched domain, a misconfigured or malfunctioning workstation can introduce errors that impact or disable the entire domain. Problems of this nature are often difficult to localize.
A failure domain is
Bounded by Layer 3 switching
Bounded by the VLAN when Layer 2 switching is deployed in an entire campus
As discussed in the "Spanning-Tree Domain Considerations" section of this chapter, loop prevention mechanisms in a Layer 2 topology cause the STP to take between 30 and 50 seconds to converge. To eliminate STP convergence issues in the campus backbone, all the links connecting backbone switches must be routed links, not VLAN trunks. This also limits the broadcast and failure domains.
When Layer 3 switching is deployed everywhere, convergence is within seconds (depending on the routing protocol implemented) because all the devices detect their connected link failure immediately and act upon it promptly by sending the respective routing updates.
In a mixed Layer 2 and Layer 3 environment, the convergence time not only depends on the Layer 3 factors (including routing protocol timers such as hold-time and neighbor loss detection), but also on the STP convergence.
Using Layer 3 switching in a structured design reduces the scope of spanning tree domains. It is common to use a routing protocol, such as Enhanced Interior Gateway Routing Protocol (EIGRP) or Open Shortest Path First (OSPF), to handle load balancing, redundancy, and recovery in the backbone.
The cost of deploying Layer 3 switching in comparison to Layer 2 switching increases with the scope of Layer 3 switching deployment. Layer 3 switches are more expensive than their Layer 2 counterparts; for example, Layer 3 functionality can be obtained by adding cards and software to a modular Layer 2 switch.
An Enterprise Campus can use various physical media to interconnect devices.
Selecting the type of cable is an important consideration when deploying a new network or upgrading an existing one. Cabling infrastructure represents a long-term investment; it is usually installed to last for ten years or more. In addition, even the best network equipment does not operate as expected with poorly chosen cabling.
A network designer must be aware of physical media characteristics because they influence the maximum distance between devices and the network's maximum transmission speed.
Twisted-pair cables (copper) and optical cables (fiber) are the most common physical transmission media used in modern networks.
Unshielded Twisted-Pair (UTP) Cables
UTP consists of four pairs of isolated wires that are wrapped together in plastic cable. No additional foil or wire is wrapped around the core wires (thus, they are unshielded). This makes these wires less expensive, but also less immune to external electromagnetic influences than shielded cables. UTP is widely used to interconnect workstations, servers, or other devices from their network interface card (NIC) to the network connector at a wall outlet.
The characteristics of twisted-pair cables depend on the quality of their material. As a result, twisted-pair cables are sorted into categories. Category 5 or greater is recommended for speeds of 100 megabits per second (Mbps) or higher. Because of the possibility of signal attenuation in the wires and carrier detection, the maximum cable length is usually limited to 100 meters. For example, if one PC starts to transmit and another PC is more than 100 meters away, the second PC might not detect the signal on the wire and therefore start to transmit, causing a collision on the wire.
One of the frequent considerations in the cabling design is electromagnetic interference. Due to high susceptibility to interference, UTP is not suitable for use in environments with electromagnetic influences. Similarly, UTP is not appropriate for environments that can be affected by the UTP's own interference.
Some security issues are also associated with electromagnetic interference: it is easy to eavesdrop on the traffic carried across UTP because these cables emit electromagnetic interference.
Typical requirements that lead to the selection of optical cable as a transmission medium include distances longer than 100 meters and immunity to electromagnetic interference. There are different types of optical cable; the two main types are multimode (MM) and single-mode (SM).
Both MM and SM optical cable have lower signal losses than a twisted pair cable; therefore, optical cables automatically enable longer distances between devices. However, fiber cable has precise production and installation requirements, resulting in a higher cost than twisted pair cable.
Multimode fiber is optical fiber that carries multiple light waves or modes concurrently, each at a slightly different reflection angle within the optical fiber core. Because modes tend to disperse over longer lengths (modal dispersion), MM fiber transmission is used for relatively short distances. Typically, light emitting diodes (LEDs) are used with MM fiber. The typical diameter of an MM fiber is 50 or 62.5 micrometers.
Single-mode (also known as monomode) fiber is optical fiber that carries a single wave (or laser) of light. Lasers are typically used with SM fiber. The typical diameter of an SM fiber core is between 2 and 10 micrometers.
Copper Versus Fiber
Table 4-5 presents some of the critical parameters that influence the network transmission medium selection.
Table 4-5 Copper Versus Fiber Media
Ethernet: <1 gigabits per second (Gbps)
LRE: <15 Mbps
Ethernet: <100 m
MM: 550 m*
SM: <100 km*
Inter-node and inter-building
* When using Gigabit Ethernet
Table 4-5 lists Ethernet as a technology; this includes Ethernet, Fast Ethernet, and Gigabit Ethernet. Long Reach Ethernet (LRE) is also listed. This latter technology is Cisco proprietary and runs on voice-grade copper wires; it allows longer distances than traditional Ethernet and is used as an access technology in WANs. Chapter 5, "Designing WANs," further describes LRE.
The following sections elaborate on the parameters in Table 4-5.
The bandwidth parameter indicates the required bandwidth in a particular segment of the network, or the connection speed between the nodes inside or outside the building.
The range parameter is the maximum distance between network devices (such as workstations, servers, printers, and IP phones) and network nodes, and between network nodes.
Table 4-6 summarizes the bandwidth and range characteristics of the transmission media types.
Table 4-6 Transmission Media Types Bandwidth and Range Characteristics
Up to 100 meters
Up to 2 kilometers (km) (Fast Ethernet); up to 550 m (Gigabit Ethernet)
Up to 40 km; up to 100 km (Gigabit Ethernet)
Up to 1 Gbps
Up to 1 Gbps
Cheap to install
Copper cables are typically used for connectivity of network devices to the wiring closet where
Distances are less than 100 meters
Speeds of 100 Mbps are satisfactory
Cost must be kept within reasonable limits
Fast EtherChannel (FEC) and Gigabit EtherChannel solutions group several parallel links between LAN switches into a channel that is seen as a single link from the Layer 2 perspective. Two protocols have been introduced for automatic EtherChannel formation: the Port Aggregation Protocol (PAgP), which is Cisco proprietary, and the Link Aggregation Control Protocol (LACP), which is standardized and defined in IEEE 802.3ad.
Deployment area indicates whether wiring is required for wiring closet only (where users access the network), for inter-node, or even for inter-building connections.
Connection from the wiring closet to the building central node can use UTP. As for most inter-node and especially inter-building connections, MM, or even SM, fiber is probably needed if there are high-speed requirements.
When deploying UTP in an area with high electrical or magnetic interference (for example, in an industrial environment), you must pay special attention to media selection. In such environments, the disturbances might interfere with data transfer and therefore result in an increased number of frame errors. Electrical grounding can isolate some external disturbance, but the wiring increases the costs. Fiber optic installation is the only reasonable solution for such networks.
Optical fiber requires a precise technique for cable coupling. Even a small deviation from the ideal position of optical connectors can result in either a loss of signal or a large number of frame losses. Careful attention during optical fiber installation is imperative because of the traffic's high sensitivity to coupling misalignment. In environments where the cable does not consist of a single fiber from point to point, coupling is required and loss of signal can easily occur.
Along with the cost of the medium, you must also seriously consider installation cost. Fiber installation costs are significantly higher than UTP installation costs because of the strict requirements for optical cable coupling.
Figure 4-8 illustrates a typical campus network structure. End devices such as workstations, IP phones, and printers are no more than 100 m away from the LAN switch. UTP wiring can easily handle the required distance and speed; it is also easy to set up, and the price/performance ratio is reasonable.
Figure 4-8 A Campus Network Uses Many Different Types of Cables
Optical fiber cables handle the higher speeds and longer distances that can be required among switch devices. MM optical cable is usually satisfactory inside the building. Depending on distance, organizations use MM or SM optical cable for inter-building communication. If the distances are short (up to 500 m), MM fiber is a more reasonable solution for speeds up to 1 Gbps.
However, an organization can install SM fiber if its requirements are for longer distances, or if it is planning for future higher speeds (for example, 10 Gbps). The current specification provides Gigabit Ethernet connectivity on SM fiber up to 5 km; however, Cisco has already provided modules that support connectivity up to 10 km, and even up to 100 km.
Selecting the less expensive type of fiber might satisfy a customer's current need, but this fiber might not meet the needs of future upgrades or equipment replacement. Replacing cable can be very expensive. Planning with future requirements in mind might result in higher initial costs, but ultimately lower costs.
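The distance guidance in this section can be condensed into a small decision helper. The following is a minimal sketch, assuming a Gigabit Ethernet link and the range limits quoted in Table 4-6 (100 m for UTP, 550 m for MM fiber, 100 km for SM fiber); the function name and its parameters are hypothetical, and cost and future speed requirements are deliberately left out.

```python
def suggest_medium(distance_m: float, high_emi: bool = False) -> str:
    """Suggest a cable type for a Gigabit Ethernet link of the given length.

    Thresholds follow the ranges quoted in this chapter; high_emi marks an
    environment with strong electromagnetic interference, where UTP is avoided.
    """
    if distance_m <= 100 and not high_emi:
        return "UTP"
    if distance_m <= 550:
        return "multimode fiber"
    if distance_m <= 100_000:
        return "single-mode fiber"
    return "no single-span option in this sketch"


print(suggest_medium(80))                  # UTP
print(suggest_medium(80, high_emi=True))   # multimode fiber
print(suggest_medium(400))                 # multimode fiber
print(suggest_medium(5_000))               # single-mode fiber
```

A real selection would also weigh installation cost and planned upgrades, which might favor SM fiber even where MM fiber meets the current distance requirement.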
This chapter covers the following key topics:
Operation of NAT: This section discusses the basics of network address translation, including fundamental concepts and terminology, and typical NAT applications.
NAT Issues: This section examines some potential problems that you might encounter with NAT. Solutions to many of the problems, either through Cisco IOS Software functionality or through design techniques, are identified.
Configuring NAT: This section presents case studies demonstrating how Cisco IOS Software is configured to perform typical NAT functions.
Troubleshooting NAT: This section examines various methods and tools for troubleshooting Cisco NAT.
The acronym NAT is used interchangeably to mean network address translation and network address translator (software that runs the NAT function).
Operation of NAT
NAT is described in RFC 1631. The original intention of NAT was, like classless interdomain routing (CIDR), to slow the depletion of available IP address space by allowing many private IP addresses to be represented by some smaller number of public IP addresses. Since that time, users have found NAT to be a useful tool for network migrations and mergers, server load sharing, and creating "virtual servers." This section examines all these applications, but first describes the basics of NAT functionality and terminology.
Basic NAT Concepts
Figure 4-1 depicts a simple NAT function. Device A has an IP address that belongs to the private range specified by RFC 1918, whereas device B has a public IP address. When device A sends a packet to device B, the packet passes through a router that is running NAT. The NAT replaces device A's private address (192.168.2.23) in the source address field with a public address (188.8.131.52) that can be routed across the Internet, and forwards the packet. When device B sends a reply to device A, the destination address of the packet is 184.108.40.206. This packet again passes through the NAT router, and the destination address is replaced with device A's private address.
Figure 4-1 The NAT Router Replaces the Private Address of Device A (192.168.2.23) with a Publicly Routable Address (220.127.116.11)
NAT is transparent to the end systems involved in the translation. In Figure 4-1, device A knows only that its IP address is 192.168.2.23; it is unaware of the 18.104.22.168 address. Device B, on the other hand, thinks the address of device A is 22.214.171.124; it knows nothing about the 192.168.2.23 address. That address is "hidden" from device B.
NAT can hide addresses in both directions. In Figure 4-2, NAT is performed on the addresses of both device A and device B. Device A thinks device B's address is 172.16.80.91, when in fact device B's real address is 126.96.36.199. You can see that the NAT router is translating both the source and destination addresses in both directions to support this address scheme.
Cisco NAT devices divide their world into the inside and the outside. Typically the inside is a private enterprise or ISP, and the outside is the public Internet or an Internet-facing service provider. Additionally, a Cisco NAT device classifies addresses as either local or global. A local address is an address that is seen by devices on the inside, and a global address is an address that is seen by devices on the outside. Given these four terms, an address may be one of four types:
Inside local (IL): Addresses assigned to inside devices. These addresses are not advertised to the outside.
Inside global (IG): Addresses by which inside devices are known to the outside.
Outside global (OG): Addresses assigned to outside devices. These addresses are not advertised to the inside.
Outside local (OL): Addresses by which outside devices are known to the inside.
Figure 4-2 The NAT Router Is Translating Both the Source and Destination Addresses in Both Directions
In Figure 4-2, device A is on the inside and device B is on the outside. 192.168.2.23 is an inside local address, and 188.8.131.52 is an inside global address. 172.16.80.91 is an outside local address, and 184.108.40.206 is an outside global address.
IG addresses are mapped to IL addresses, and OL addresses are mapped to OG addresses. The NAT device tracks these mappings in an address translation table. Example 4-1 shows the address translation table for the NAT router in Figure 4-2. This table contains three entries. Reading the entries from the bottom up, the first entry maps OL address 172.16.80.91 to the OG address 220.127.116.11. The next entry maps the IG address 18.104.22.168 to the IL address 192.168.2.23. These two entries are static, created when the router was configured to translate the specified addresses. The last (top) entry maps the inside addresses to the outside addresses. This entry is dynamic and was created when device A first sent a packet to device B.
Example 4-1 The Address Translation Table of the NAT Router in Figure 4-2
NATrouter#show ip nat translations
Pro Inside global      Inside local       Outside local      Outside global
--- 22.214.171.124     192.168.2.23       172.16.80.91       126.96.36.199
--- 188.8.131.52       192.168.2.23       ---                ---
--- ---                ---                172.16.80.91       184.108.40.206
NATrouter#
As the preceding paragraph demonstrates, a NAT entry may be static or dynamic. Static entries are one-to-one mappings of local addresses and global addresses. That is, a unique local address is mapped to a unique global address. Dynamic entries may be many-to-one or one-to-many. A many-to-one mapping means that many addresses can be mapped to a single address. In a one-to-many mapping, a single address can be mapped to one of several available addresses.
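The two-way translation of Figure 4-2 can be sketched with a pair of static lookup tables. This is an illustration only: the inside local and outside local addresses are taken from the text, whereas the inside global (203.0.113.5) and outside global (198.51.100.7) addresses are hypothetical stand-ins, as are the function names.

```python
# Static mappings, as they might be configured on a hypothetical NAT router.
IL_TO_IG = {"192.168.2.23": "203.0.113.5"}    # inside local -> inside global
OL_TO_OG = {"172.16.80.91": "198.51.100.7"}   # outside local -> outside global
IG_TO_IL = {ig: il for il, ig in IL_TO_IG.items()}
OG_TO_OL = {og: ol for ol, og in OL_TO_OG.items()}


def inside_to_outside(src: str, dst: str) -> tuple:
    """Translate source and destination of a packet leaving the inside."""
    return IL_TO_IG.get(src, src), OL_TO_OG.get(dst, dst)


def outside_to_inside(src: str, dst: str) -> tuple:
    """Translate source and destination of a packet arriving from the outside."""
    return OG_TO_OL.get(src, src), IG_TO_IL.get(dst, dst)


# Device A sends to the address it believes device B has (the OL address).
out_pkt = inside_to_outside("192.168.2.23", "172.16.80.91")
print(out_pkt)  # ('203.0.113.5', '198.51.100.7')

# Device B replies to the address it believes device A has (the IG address).
reply = outside_to_inside("198.51.100.7", "203.0.113.5")
print(reply)    # ('172.16.80.91', '192.168.2.23')
```

Neither end system ever sees the other's real address; each sees only the local or global representation that applies on its own side of the NAT.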
The following sections describe several common applications of NAT and demonstrate more clearly how static NAT and the various implementations of dynamic NAT operate.
NAT and IP Address Conservation
The original mission of NAT was to slow the depletion of IP addresses, and this is the focus of RFC 1631. The core assumption of the concept is that only some of an enterprise's hosts will be connected to the Internet at any one time. Some devices (print servers and DHCP servers, for example) never require connectivity outside of the enterprise at all. As a result, the enterprise can be addressed out of the private RFC 1918 address space, and a significantly smaller number of uniquely assigned public addresses are placed in a pool on a NAT at the edge of the enterprise, as demonstrated in Figure 4-3. The non-unique private addresses are IL addresses, and the public addresses are IG addresses.
Figure 4-3 In This NAT Design, a Pool of Public IP Addresses Serves a Private Address Space 8 Times as Large
When an inside device sends a packet to the Internet, the NAT dynamically selects a public address from the inside global address pool and maps it to the device's inside local address. This mapping is entered into the NAT table. For instance, Example 4-2 shows that three inside devices from the enterprise in Figure 4-3 (10.1.1.20, 10.1.197.64, and 10.1.63.148) have sent packets through the NAT. Three addresses from the IG pool (220.127.116.11, 18.104.22.168, and 22.214.171.124, respectively) have been mapped to the IL addresses.
Example 4-2 Three Addresses from the Inside Local Address Space in Figure 4-3 Have Been Dynamically Mapped to Three Addresses from the Inside Global Address Pool
NATrouter#show ip nat translations
Pro Inside global      Inside local       Outside local      Outside global
--- 126.96.36.199      10.1.1.20          ---                ---
--- 188.8.131.52       10.1.197.64        ---                ---
--- 184.108.40.206     10.1.63.148        ---                ---
NATrouter#
The destination address of any packet from an outside device responding to the inside device is the IG address. Therefore, the original mapping must be held in the NAT table for some length of time to ensure that all packets of a particular connection are translated consistently. Holding an entry in the NAT table for some period also reduces subsequent lookups when the same device regularly sends packets to the same or multiple outside destinations.
When an entry is first placed into the NAT table, a timer is started; the period of the timer is the translation timeout. Each time the entry is used to translate the source or destination address of a subsequent packet, the timer is reset. If the timer expires, the entry is removed from the NAT table and the dynamically assigned address is returned to the pool. Cisco's default translation timeout is 86,400 seconds (24 hours); you can change this with the command ip nat translation timeout.
The default translation timeout varies according to protocol. Table 4-3, later in this chapter, displays these values.
This particular NAT application is a many-to-one application, because many IL addresses can be mapped, over time, to each IG address in the pool. In the case of Figure 4-3, an 8-to-1 relationship exists. This is a familiar concept: telcos use it when they design switches and trunks that can handle only a portion of their total subscribers, and airlines use it when they overbook flights. Think of it as statistically multiplexing IL addresses to IG addresses. The risk, as with telcos and airlines, is in underestimating peak usage periods and running out of capacity.
No restrictions apply to the ratio of the size of the local address space and the size of the address pool. In Figure 4-3, the IL range and/or the IG range can be made larger or smaller to fit specific requirements. For example, the IL range 10.0.0.0/8, comprising more than 16 million addresses, can be mapped to a four-address pool of 220.127.116.11 through 18.104.22.168 or smaller. The real limitation is not the number of possible addresses in the specified IL range, but the number of actual devices using addresses in the range. If only four devices are using addresses out of the 10.0.0.0/8 range, no more than four addresses are needed in the pool. If there are 500,000 devices on the inside, you need a bigger pool.
When an address from the dynamic pool is in the NAT table, it is not available to be mapped to any other address. If all the pool addresses are used up, subsequent inside packets attempting to pass through the NAT router cannot be translated and are dropped. Therefore, it is important to ensure that the NAT pool is large enough, and that the translation timeout is small enough, so that the dynamic address pool never runs dry.
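The interaction between the pool size and the translation timeout can be sketched as follows. This is a toy model under stated assumptions, not Cisco's implementation: the class and method names are hypothetical, the pool addresses are placeholders, and time is passed in explicitly so that the behavior is easy to follow.

```python
class DynamicNat:
    """Toy dynamic NAT: one pool address per inside local (IL) address."""

    def __init__(self, pool, timeout=86_400):
        self.free = list(pool)   # unassigned inside global (IG) addresses
        self.table = {}          # IL address -> (IG address, last-used time)
        self.timeout = timeout   # translation timeout in seconds

    def expire(self, now):
        """Remove idle entries and return their addresses to the pool."""
        for il, (ig, last_used) in list(self.table.items()):
            if now - last_used >= self.timeout:
                del self.table[il]
                self.free.append(ig)

    def translate(self, il, now):
        """Return the IG address for il, or None if the pool is exhausted."""
        self.expire(now)
        if il in self.table:
            ig, _ = self.table[il]
        elif self.free:
            ig = self.free.pop(0)
        else:
            return None              # no address available: packet is dropped
        self.table[il] = (ig, now)   # start or reset the entry's timer
        return ig


nat = DynamicNat(["203.0.113.1", "203.0.113.2"], timeout=60)
print(nat.translate("10.1.1.20", now=0))     # 203.0.113.1
print(nat.translate("10.1.197.64", now=10))  # 203.0.113.2
print(nat.translate("10.1.63.148", now=20))  # None (pool exhausted)
print(nat.translate("10.1.63.148", now=65))  # 203.0.113.1 (first entry expired)
```

The third host is refused while both pool addresses are held and succeeds only after an idle entry times out, which is why a pool that is too small or a timeout that is too long leads to dropped packets.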
Almost all enterprises have some systems, such as mail, Web, and FTP servers, that must be accessible from the outside. The addresses of these systems must remain the same; otherwise outside hosts will not know from one time to the next how to reach them. Therefore, you cannot use dynamic NAT with these systems; their IL addresses must be statically mapped to IG addresses. The IG addresses used for static mapping must not be included in the dynamic address pool; although the IG address is permanently entered into the NAT table, the same address can still be chosen from the dynamic pool, creating an address ambiguity.
The NAT technique described in this section can be very useful for scaling a growing enterprise. Rather than repeatedly requesting more address space from the addressing authorities or the ISP, you can move the existing public addresses into the NAT pool and renumber the inside devices from a private address space. Depending on the size of the organization and the structure of its existing address allocations, you can perform the renumbering as a single project or as an incremental migration.
NAT and ISP Migration
One of the drawbacks of CIDR, as discussed in Chapter 2, "Introduction to Border Gateway Protocol 4," is that it can increase the difficulty of changing Internet service providers. If you have been assigned an address block that belongs to ISP1, and you want to change to ISP2, you almost always have to return ISP1's addresses and acquire a new address range from ISP2. This return can mean a painful and costly re-addressing project within your enterprise.
It cannot be overemphasized that the pain and expense of an address migration is sharply reduced when the addressing scheme is well designed in the first place.
Suppose you are a subscriber of ISP1, which has a CIDR block of 22.214.171.124/20, and the ISP has assigned you an address space of 126.96.36.199/23. You then decide to switch your Internet service to ISP2, which has a CIDR block of 188.8.131.52/19. ISP2 assigns you a new address space of 184.108.40.206/23. Instead of renumbering your inside systems, you can use NAT (see Figure 4-4). The 220.127.116.11/23 address space has been returned to ISP1, but you continue to use this space for the IL addresses. Although the addresses are from the public address space, you can no longer use them to represent your internetwork to the public Internet. You use the 18.104.22.168/23 space from ISP2 as the IG addresses and map (statically or dynamically) the IL addresses to these IG addresses.
Figure 4-4 This Enterprise Has an Inside Local Address Space That Belongs to ISP1 But Is a Subscriber of ISP2. It Uses NAT to Translate the IL Addresses to IG Addresses Assigned Out of ISP2's CIDR Block
The danger in using a scheme such as this is in the possibility that any of the inside local addresses might be leaked to the public Internet. If this were to happen, the leaked address would conflict with ISP1, which has legal possession of the addresses. If ISP2 is using appropriately paranoid route filtering, such a mistake should not cause leakage to the Internet. As Chapter 2 emphasized, however, you should never make the assumption that an AS-external peer is filtering properly. Therefore, you must take extreme care to ensure that all the IL addresses are translated before packets are allowed into ISP2.
Another problem arising from this scheme is that ISP1 will probably reassign the 22.214.171.124/23 range to another customer. That customer is then unreachable to you. Suppose, for example, that a host on your network wants to send a packet to newbie@ISP1.com. DNS translates the address of that destination as 126.96.36.199, so the host uses that address. Unfortunately, that address is interpreted as belonging to your local internet and is either misrouted or is dropped as unreachable.
The moral of the story is that the migration scheme described in this section is very useful on a temporary basis, to reduce the complexity of the immediate move. Ultimately, however, you should still re-address your internet with private addresses.
NAT and Multihomed Autonomous Systems
Another shortcoming of CIDR is that multihoming to different service providers becomes more difficult. Figure 4-5 recaps the problem as discussed in Chapter 2. A subscriber is multihomed to ISP1 and ISP2 and has a CIDR block that is a subset of ISP1's block. To establish correct communication with the Internet, both ISP1 and ISP2 must advertise the subscriber's specific address space of 188.8.131.52/23. If ISP2 does not advertise this address, all the subscriber's incoming traffic passes through ISP1. And if ISP2 advertises 184.108.40.206/23, whereas ISP1 advertises only its own CIDR block, all the subscriber's incoming traffic matches the more-specific route and passes through ISP2. This poses several problems:
ISP1 must "punch a hole" in its CIDR block, which probably means modifying the filters and policies on many routers.
ISP2 must advertise part of a competitor's address space, an action that both ISPs are likely to find objectionable.
Advertising the subscriber's more-specific address space represents a small reduction in the effectiveness of CIDR in controlling the size of Internet routing tables.
Some national service providers do not accept prefixes longer than /19, meaning the subscriber's route through ISP2 will be unknown to some portion of the Internet.
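The longest-match behavior behind the multihoming problem can be demonstrated with Python's standard ipaddress module. The prefixes below are hypothetical placeholders (a /20 aggregate for ISP1 and a /23 subset for the subscriber), and best_route is an illustrative helper rather than a model of any router's lookup code.

```python
import ipaddress


def best_route(routes, destination):
    """Return the (prefix, next_hop) entry with the longest matching prefix."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if dest in net]
    return max(matches, key=lambda m: m[0].prefixlen, default=None)


# Hypothetical case: ISP1 advertises only its /20 aggregate, while ISP2
# advertises the subscriber's more-specific /23.
routes = [
    (ipaddress.ip_network("10.16.0.0/20"), "via ISP1"),
    (ipaddress.ip_network("10.16.4.0/23"), "via ISP2"),
]

print(best_route(routes, "10.16.5.9")[1])  # via ISP2 (inside the /23)
print(best_route(routes, "10.16.9.1")[1])  # via ISP1 (only the /20 matches)
```

Because the /23 is more specific than the /20, every packet destined for the subscriber follows ISP2 unless ISP1 also advertises the /23.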
Figure 4-6 shows ways that NAT can help solve the problem of CIDR in a multihomed environment. Translation is configured on the router connecting to ISP2, and the IG address pool is a CIDR block assigned by ISP2. ISP2 no longer advertises an ISP1 address space, so it is no longer necessary for ISP1 to advertise the subscriber's more-specific aggregate. Hosts within the subscriber's enterprise can access the Internet either by selecting the closest edge router or by some established policy. The IL address of the hosts' packets will be the same, no matter which router they pass through; if packets are sent to ISP2, however, the address is translated. So from the perspective of the Internet, the source addresses of packets from the subscriber vary according to which ISP has forwarded the packets.
Figure 4-5 Because the Multihomed Subscriber's CIDR Block Is a Subset of ISP1's CIDR Block, Both ISP1 and ISP2 Must Advertise the More-Specific Aggregate
Figure 4-6 NAT Is Used to Resolve the CIDR Problem Depicted in Figure 4-5
Figure 4-7 shows a more efficient design. NAT is implemented on both edge routers and the CIDR blocks from each ISP become the IG address pools of the respective NATs. The IL addresses are from the private 10.0.0.0 address space. This enterprise can change ISPs with relative ease, needing only to reconfigure the IG address pools when the ISP changes.
Figure 4-7 The IL Addresses of This Enterprise Have No Relationship to Any ISP; All ISP CIDR Blocks Are Assigned to NAT Inside Global Address Pools
Port Address Translation
The many-to-one applications of NAT discussed so far have involved a statistical multiplexing of a large range of addresses into a smaller pool of addresses. However, there is a one-to-one mapping of individual addresses. When an address from an inside global pool is mapped to an inside local address, for instance, that IG address cannot be mapped to any other address until the first mapping is cleared. However, there is a specialized function of NAT that allows many addresses to be mapped to a single address at the same time. Cisco calls this function port address translation (PAT). The same function is known in other circles as network address and port translation (NAPT) or IP masquerading. It is also sometimes referred to as address overloading.
A TCP/IP session is not identified as a packet exchange between two IP addresses, but as an exchange between two IP sockets. A socket is an (address, port) tuple. For example, a Telnet session might consist of a packet exchange between 192.168.5.2, 23 and 172.16.100.6, 1026. PAT translates both the IP address and the port. Packets from different addresses can be translated to a common address, but to different ports of that address, and therefore can share the same address. Figure 4-8 shows how PAT works.
Figure 4-8 By Translating Both the IP Address and the Associated Port, PAT Allows Many Hosts to Simultaneously Use a Single Global Address
Four packets with inside local addresses arrive at the NAT. Notice that packets 1 and 4 are from the same address but different source ports. Packets 2 and 3 are from different addresses but have the same source port. The source addresses of all four packets are translated to the same inside global address, but the packets remain unique because they each have a different source port. By translating ports, approximately 32,000 different inside local sockets can be translated to a single inside global address. As a result, PAT is a very useful application for small office/home office (SOHO) installations, where several devices might share a single assigned address on a single connection to an ISP.
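The four-packet scenario can be approximated with a short sketch. This is a simplified model under stated assumptions, not a router's actual algorithm: the class and method names are hypothetical, the addresses are placeholders rather than the ones in Figure 4-8, and translated ports are simply handed out sequentially from 1024, whereas a real implementation may choose ports differently.

```python
class PatTable:
    """Toy PAT table: many inside local sockets share one inside global address."""

    def __init__(self, inside_global, first_port=1024, last_port=65535):
        self.inside_global = inside_global
        # Assumption: translated ports are allocated sequentially.
        self._ports = iter(range(first_port, last_port + 1))
        self.outbound = {}   # (IL address, IL port) -> (IG address, IG port)
        self.inbound = {}    # (IG address, IG port) -> (IL address, IL port)

    def translate_outbound(self, il_addr, il_port):
        """Translate the source socket of a packet leaving the inside network."""
        local = (il_addr, il_port)
        if local not in self.outbound:
            # next() raises StopIteration once the port range is exhausted.
            global_socket = (self.inside_global, next(self._ports))
            self.outbound[local] = global_socket
            self.inbound[global_socket] = local
        return self.outbound[local]

    def translate_inbound(self, ig_addr, ig_port):
        """Translate the destination socket of a returning packet, or None."""
        return self.inbound.get((ig_addr, ig_port))


# Packets 1 and 4 share an address; packets 2 and 3 share a source port.
pat = PatTable("192.0.2.1")
packets = [("10.1.1.1", 2000), ("10.1.1.2", 3000),
           ("10.1.1.3", 3000), ("10.1.1.1", 2001)]
for il_socket in packets:
    print(il_socket, "->", pat.translate_outbound(*il_socket))
```

Because the table is keyed on the full socket rather than on the address alone, every translated packet carries the same inside global address but a distinct port, and the reverse lookup can return each reply to the correct inside host.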
NAT and TCP Load Distribution
You can use NAT to represent multiple, identical servers as having a single address. In Figure 4-9, devices on the outside reach a server at address 220.127.116.11. In actuality, there are four mirrored servers on the inside, and the NAT distributes sessions among them in a round-robin fashion. Notice that the destination addresses of packets 1 through 4, each from a different source, are translated to servers 1 through 4. Packet 5, representing a session from yet another source, is translated to server 1.
Obviously, the accessible contents of the four servers in Figure 4-9 must be identical. A host accessing the server farm might hit server 2 at one time and server 4 another time. It must appear to the host that it has hit the same server on both occasions.
Figure 4-9 TCP Packets Sent to a Server Farm, Represented by the Single Address 18.104.22.168, Are Translated Round-Robin to the Actual Addresses of the Four Identical Servers
This scheme is similar to DNS-based load sharing, in which a single name is resolved round-robin to several IP addresses. The disadvantage of DNS-based load sharing is that when a host receives the name/address resolution, the host caches it. Future sessions are sent to the same address, reducing the effectiveness of the load sharing. NAT-based load sharing performs a translation only when a new TCP connection is opened from the outside, so the sessions are more likely to be distributed evenly. In NAT TCP load balancing, non-TCP packets pass through the NAT untranslated.
Note that NAT-based load balancing, like DNS-based load balancing, is not robust. NAT has no way of knowing when one of the servers goes down, so it continues to translate packets to that server's address. As a result, a failed or offline server can cause some traffic to the server farm to be black-holed.
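The round-robin behavior described in this section can be sketched as follows. The sketch is illustrative only: the class name, the per-connection key, and all addresses (a virtual address from the 192.0.2.0/24 documentation range and four servers in 10.0.0.0) are assumptions, not values from Figure 4-9.

```python
from itertools import cycle


class RoundRobinNat:
    """Toy model of NAT TCP load distribution (illustrative only)."""

    def __init__(self, virtual_addr, real_servers):
        self.virtual_addr = virtual_addr
        self._next_server = cycle(real_servers)   # endless round-robin iterator
        self.connections = {}                     # (src addr, src port) -> real server

    def translate(self, src_addr, src_port, dst_addr, protocol="tcp"):
        """Return the destination address after translation."""
        # Non-TCP packets, and packets not sent to the virtual address,
        # pass through untranslated.
        if protocol != "tcp" or dst_addr != self.virtual_addr:
            return dst_addr
        conn = (src_addr, src_port)
        if conn not in self.connections:
            # A server is chosen only when a new connection is first seen.
            # Nothing here checks whether the chosen server is alive.
            self.connections[conn] = next(self._next_server)
        return self.connections[conn]


nat = RoundRobinNat("192.0.2.10",
                    ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"])
for i in range(1, 6):
    src = f"198.51.100.{i}"
    print(src, "->", nat.translate(src, 40000, "192.0.2.10"))
```

The sketch has no health check, which mirrors the weakness just described: if one of the four servers fails, every fourth new connection is still translated to it.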
NAT and Virtual Servers
NAT also can allow the distribution of services to different addresses, while giving the appearance that the services are all reachable at one address (see Figure 4-10).
Figure 4-10 You Can Configure NAT to Translate Incoming Packets to Different Addresses Based on the Destination Port
In Figure 4-10, the enterprise has a mail server at the local address 192.168.50.1 and an HTTP server at the local address 192.168.50.2. Both servers have a global address of 22.214.171.124. When a host from the outside sends a packet to the inside, the NAT examines the destination port in addition to the destination address. In Figure 4-10, a host has sent a packet to that global address with a destination port of 25, indicating mail. The NAT translates this packet's destination address to the mail server's, 192.168.50.1. A second packet from the same host has a destination port of 80, indicating HTTP. The NAT translates this packet's destination address to the Web server's, 192.168.50.2.
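A minimal sketch of this port-based lookup follows. The local server addresses and ports come from the description above; the global address is a placeholder from the 203.0.113.0/24 documentation range, and the function name and the choice to return None for an unmapped port are assumptions made for illustration.

```python
# Placeholder for the single global address that outside hosts see.
GLOBAL_ADDR = "203.0.113.5"

# Destination port -> local server address.
PORT_MAP = {
    25: "192.168.50.1",   # mail (SMTP) server
    80: "192.168.50.2",   # HTTP server
}


def translate_destination(dst_addr, dst_port):
    """Return the translated destination address for an incoming packet.

    Packets not addressed to GLOBAL_ADDR are returned unchanged. For packets
    to GLOBAL_ADDR on a port with no mapping, this sketch returns None.
    """
    if dst_addr != GLOBAL_ADDR:
        return dst_addr
    return PORT_MAP.get(dst_port)


for port in (25, 80):
    print(port, "->", translate_destination(GLOBAL_ADDR, port))
```

The lookup key is the destination port rather than the source socket, which is what lets two separate servers appear to the outside as one host.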