First Hop Redundancy Protocols
When networks use a design that includes redundant routers, switches, LAN links, and WAN links, in some cases, other protocols are required to take advantage of that redundancy and prevent problems caused by it.
For instance, imagine a WAN with many remote branch offices. If each remote branch has two WAN links connecting it to the rest of the network, those routers can use an IP routing protocol to pick the best routes. The routing protocol learns routes over both WAN links, adding the best route into the routing table. When the better WAN link fails, the routing protocol adds the alternate route to the IP routing table, taking advantage of the redundant link.
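As a rough illustration of that failover behavior, the sketch below models a branch router choosing between two learned routes. The link names (`WAN1`, `WAN2`), the metric values, and the `best_route` helper are all hypothetical; real routing protocols use their own metrics and convergence logic.

```python
def best_route(candidates):
    """Pick the lowest-metric route among links that are currently up."""
    usable = [route for route in candidates if route["up"]]
    return min(usable, key=lambda route: route["metric"]) if usable else None

# Two routes to the same destination, learned over two WAN links.
# Metric values are invented for illustration only.
routes = [
    {"via": "WAN1", "metric": 10, "up": True},
    {"via": "WAN2", "metric": 20, "up": True},
]

primary = best_route(routes)["via"]   # the better link while both are up

routes[0]["up"] = False               # simulate the better WAN link failing
backup = best_route(routes)["via"]    # the alternate route takes over

print(primary, backup)
```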
As another example, consider a LAN with redundant links and switches. Those LANs have problems unless the switches use Spanning Tree Protocol (STP) or Rapid STP (RSTP). STP/RSTP prevents the problems created by frames that loop through those extra redundant paths in the LAN.
This section examines yet another protocol that helps when a network uses some redundancy, this time with redundant default routers. When two or more routers connect to the same LAN subnet, the hosts in that subnet could use any of the routers as their default router. However, another protocol is needed to make the best use of the redundant default routers. The term First Hop Redundancy Protocol (FHRP) refers to the category of protocols that enable hosts to take advantage of redundant routers in a subnet.
This first major section of the chapter discusses the major concepts behind how different FHRPs work. This section begins by discussing a network’s need for redundancy in general and the need for redundant default routers.
The Need for Redundancy in Networks
Networks need redundant links to improve the availability of those networks. Eventually, something in a network will fail. A router power supply might fail, or a link might break, or a switch might lose power. And those WAN links, shown as simple lines in most drawings in this book, represent the most complicated physical parts of the network, with many individual components that can fail as well.
Depending on the design of the network, the failure of a single component might mean an outage that affects at least some part of the user population. Network engineers refer to any one component that, if it fails, brings down that part of the network as a single point of failure. For instance, in Figure 16-1, the LANs appear to have some redundancy, whereas the WAN does not. If most of the traffic flows between sites, many single points of failure exist, as shown in the figure.
Figure 16-1 R1 and the One WAN Link as Single Points of Failure
The figure notes several components as a single point of failure. If any of the network’s noted parts fail, packets cannot flow from the left side of the network to the right.
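One way to make the idea concrete is to test each component in turn: remove it, then check whether traffic can still cross the network. The sketch below does that with a breadth-first search. The topology is an assumption, loosely modeled on the description in the text (redundant LAN switches, one router per site, one WAN link); it is not a transcription of the figure, and the device names other than R1 are invented.

```python
from collections import deque

# Hypothetical topology: redundant LAN switches on each side, but only one
# router per site and a single WAN link between the routers.
LINKS = [
    ("HostA", "SW1"), ("HostA", "SW2"),
    ("SW1", "R1"), ("SW2", "R1"),
    ("R1", "R3"),                      # the only WAN link
    ("R3", "SW3"), ("R3", "SW4"),
    ("SW3", "Server"), ("SW4", "Server"),
]

def reachable(links, src, dst, down=frozenset()):
    """Breadth-first search from src to dst that skips failed devices."""
    if src in down or dst in down:
        return False
    neighbors = {}
    for a, b in links:
        if a in down or b in down:
            continue
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def single_points_of_failure(links, src, dst):
    """Return the devices and links whose individual failure cuts src off from dst."""
    devices = sorted({node for link in links for node in link} - {src, dst})
    bad_devices = [d for d in devices if not reachable(links, src, dst, down={d})]
    bad_links = [
        link for link in links
        if not reachable([other for other in links if other != link], src, dst)
    ]
    return bad_devices, bad_links

spof_devices, spof_links = single_points_of_failure(LINKS, "HostA", "Server")
print(spof_devices)
print(spof_links)
```

In this assumed topology, losing any one switch leaves a working path, while losing either router or the WAN link does not, which matches the kind of weakness the figure calls out.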
To improve availability, the network engineer first looks at a design and finds the single points of failure. Then the engineer chooses where to add components to the network so that one or more single points of failure gain a redundant alternative, increasing availability. In particular, the engineer
Adds redundant devices and links
Implements any necessary functions that take advantage of the redundant device or link
For instance, of all the single points of failure in Figure 16-1, the most expensive over the long term would likely be the WAN link because of the ongoing monthly charge. However, statistically, WAN links are also the components most likely to fail. So, a good upgrade from the network in Figure 16-1 would be to add a WAN link and possibly even connect to another router on the right side of the network, as shown in Figure 16-2.
Many real enterprise networks follow designs like Figure 16-2, with one router at each remote site, two WAN links connecting back to the main site, and redundant routers at the main site (on the right side of the figure). Compared to Figure 16-1, the design in Figure 16-2 has fewer single points of failure. The remaining single points of failure still pose a risk, but it is a calculated risk. For many outages, a reload of the router solves the problem, and the outage is short. But the risk still exists that the switch or router hardware will fail, and the site will stay down until a replacement device can be delivered on-site.
Figure 16-2 Higher Availability but with R1 Still as a Single Point of Failure
For enterprises that can justify more expense, the next step in higher availability for that remote site is to protect against those catastrophic router and switch failures. In this particular design, adding one router on the left side of the network in Figure 16-2 removes all the single points of failure noted earlier. Figure 16-3 shows the design with a second router, which connects to a different LAN switch so that SW1 is no longer a single point of failure.
Figure 16-3 Removing All Single Points of Failure from the Network Design
The Need for a First Hop Redundancy Protocol
Of the designs shown so far in this chapter, only Figure 16-3’s design has two routers to support the LAN on the left side of the figure, specifically the same VLAN and subnet. While having the redundant routers on the same subnet helps, the network needs an FHRP to take advantage of those redundant routers.
To see the need and benefit of using an FHRP, first think about how these redundant routers could be used as default routers by the hosts in VLAN 10/subnet 10.1.1.0/24, as shown in Figure 16-4. The host logic will remain unchanged, so each host has a single default router setting. So, some design options for default router settings include the following:
All hosts in the subnet use R1 (10.1.1.9) as their default router, and they statically reconfigure their default router setting to R2’s 10.1.1.8 if R1 fails.
All hosts in the subnet use R2 (10.1.1.8) as their default router, and they statically reconfigure their default router setting to R1’s 10.1.1.9 if R2 fails.
Half the hosts use R1 and half use R2 as their default router, and if either router fails, half of the users statically reconfigure their default router setting.
Figure 16-4 Balancing Traffic by Assigning Different Default Routers to Different Clients
To ensure the concept is clear, Figure 16-4 shows this third option, with half the hosts using R1 and the other half using R2. The figure removes all the LAN switches just to unclutter the figure. Hosts A and B use R1 as their default router, and hosts C and D use R2 as their default router.
All these options have a problem: the users must act. They have to know an outage occurred. They have to know how to reconfigure their default router setting. And they have to know when to change it back to the original setting.
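The problem with static settings can be seen in a few lines. The sketch below uses the host-to-router assignments described for Figure 16-4 (hosts A and B point at R1's 10.1.1.9, hosts C and D at R2's 10.1.1.8). The helper name `hosts_without_gateway` is hypothetical and exists only for this illustration.

```python
# Static default router settings, as described for Figure 16-4.
DEFAULT_ROUTER = {
    "A": "10.1.1.9",   # R1
    "B": "10.1.1.9",   # R1
    "C": "10.1.1.8",   # R2
    "D": "10.1.1.8",   # R2
}

def hosts_without_gateway(default_router, failed_ips):
    """List the hosts whose configured default router is currently down."""
    return sorted(host for host, gw in default_router.items() if gw in failed_ips)

# R1 fails: every host still pointing at 10.1.1.9 loses off-subnet connectivity
# until someone changes its setting by hand.
stranded = hosts_without_gateway(DEFAULT_ROUTER, {"10.1.1.9"})
print(stranded)
```

Nothing in the host configuration reacts to the failure, so the stranded hosts stay stranded until a person intervenes.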
FHRPs use the redundant default routers without the end users being aware of any changes. The two routers appear to be a single default router. The users never have to do anything: their default router setting remains the same, and their ARP tables remain the same.
To allow the hosts to remain unchanged, the routers must do more work, as defined by one of the FHRP protocols. Generically, each FHRP makes the following happen:
All hosts act like they always have, with one default router setting that never has to change.
The default routers share a virtual IP address in the subnet, defined by the FHRP.
Hosts use the FHRP virtual IP address as their default router address.
The routers exchange FHRP protocol messages so that both agree as to which router does what work at any point in time.
When a router fails or has some other problem, the routers use the FHRP to choose which router takes over responsibilities from the failed router.
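The generic behavior in the list above can be sketched as a toy model. This is not how any particular FHRP is implemented: the virtual IP address 10.1.1.1, the `priority` values, and the "highest priority among healthy routers wins" rule are assumptions chosen for illustration, and the real protocols rely on timed message exchanges that the sketch replaces with a simple health flag.

```python
from dataclasses import dataclass

@dataclass
class Router:
    name: str
    ip: str
    priority: int          # hypothetical tie-breaker; higher wins in this sketch
    healthy: bool = True

class FhrpGroup:
    """Toy model of routers that share one virtual default-router address."""

    def __init__(self, virtual_ip, routers):
        self.virtual_ip = virtual_ip
        self.routers = routers

    def active_router(self):
        # Stand-in for the FHRP message exchange: the healthy routers agree
        # that the one with the highest priority does the forwarding.
        candidates = [r for r in self.routers if r.healthy]
        if not candidates:
            return None
        return max(candidates, key=lambda r: r.priority)

    def forward(self, host_gateway):
        """Return the name of the router handling a host's off-subnet packet."""
        if host_gateway != self.virtual_ip:
            raise ValueError("host is not using the virtual IP address")
        router = self.active_router()
        return router.name if router else None

r1 = Router("R1", "10.1.1.9", priority=110)
r2 = Router("R2", "10.1.1.8", priority=100)
group = FhrpGroup("10.1.1.1", [r1, r2])    # 10.1.1.1 is an assumed virtual IP

host_gateway = "10.1.1.1"                  # the host's setting never changes
before = group.forward(host_gateway)

r1.healthy = False                         # simulate R1 failing
after = group.forward(host_gateway)

print(before, after)
```

The point of the model is that `host_gateway` stays the same across the failure; only the routers' agreement about who forwards changes.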
The Three Solutions for First-Hop Redundancy
The term First Hop Redundancy Protocol does not name any one protocol. Instead, it names a family of protocols that fill the same role. For a given network, like the left side of Figure 16-4, the engineer would pick one of the protocols from the FHRP family.
Table 16-2 lists the three FHRPs in the order in which they first reached the market. Cisco first introduced the proprietary Hot Standby Router Protocol (HSRP), which worked well for many customers. Later, the IETF developed an RFC for a similar protocol, Virtual Router Redundancy Protocol (VRRP). Finally, Cisco developed a more robust option, Gateway Load Balancing Protocol (GLBP).
Table 16-2 Three FHRP Options
Acronym | Full Name | Origin | Redundancy Approach | Load Balancing Per… |
---|---|---|---|---|
HSRP | Hot Standby Router Protocol | Cisco | active/standby | subnet |
VRRP | Virtual Router Redundancy Protocol | RFC 5798 | active/standby | subnet |
GLBP | Gateway Load Balancing Protocol | Cisco | active/active | host |
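To illustrate the difference between the last two columns, the sketch below compares how the hosts in one subnet end up distributed across two routers under each approach. The host and router names come from the earlier example; the round-robin assignment in `per_host` is an assumption for illustration and is not meant to describe the exact mechanism any protocol uses.

```python
from collections import Counter
from itertools import cycle

ROUTERS = ["R1", "R2"]
HOSTS = ["A", "B", "C", "D"]

def per_subnet(hosts, routers):
    """Active/standby: one router forwards for every host in the subnet."""
    active = routers[0]
    return {host: active for host in hosts}

def per_host(hosts, routers):
    """Active/active: hosts are spread across the routers (round-robin here)."""
    return dict(zip(hosts, cycle(routers)))

subnet_load = Counter(per_subnet(HOSTS, ROUTERS).values())
host_load = Counter(per_host(HOSTS, ROUTERS).values())

print(dict(subnet_load))
print(dict(host_load))
```

With an active/standby approach, spreading load requires more than one subnet, with a different router forwarding for each; with an active/active approach, hosts in a single subnet can be split across the routers.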
The CCNA 200-301 version 1.1 blueprint requires you to know the purpose, functions, and concepts of an FHRP. To do that, the next section takes a deep look at HSRP concepts, while the final section of the chapter compares VRRP and GLBP to HSRP. (This chapter does not discuss FHRP configuration, but if you want to learn beyond the plain wording of the exam topics, note that Appendix D, “Topics from Previous Editions,” contains a short section about HSRP and GLBP configuration, copied from an earlier edition of the book.)