Advertising BGP Routes to Neighbors

The previous section focused on the tools that BGP can use to inject routes into a local router's BGP table. BGP routers take routes from the local BGP table and advertise a subset of those routes to their BGP neighbors. This section continues focusing on the BGP table because the BGP route advertisement process takes routes from the BGP table and sends them to neighboring routers, where the routes are added to the neighbors' BGP tables. Later, the final major section in the chapter, "Building the IP Routing Table," focuses on the rules regarding how BGP places routes into the IP routing table.

The BGP Update Message

Once a BGP table has a list of routes, paths, and prefixes, the router needs to advertise the information to neighboring routers. To do so, a router sends BGP Update messages to its neighbors. Figure 12-5 shows the general format of the BGP Update message.

Figure 12-5 BGP Update Message Format

[Figure: fields of the Update message, in order: Length (Bytes) of Withdrawn Routes Section; Withdrawn Routes (Variable); Length (Bytes) of Path Attributes Section; Path Attributes (Variable); then one Prefix Length and Prefix (Variable) pair per advertised prefix]

Each Update message has three main parts:

■ The Withdrawn Routes field enables BGP to inform its neighbors about failed routes.

■ The Path Attributes field lists the PAs for each route. NEXT_HOP and AS_PATH are sample values for this field.

■ The Prefix and Prefix Length fields define each individual NLRI.
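As a rough illustration of how the three parts fit together, the following Python sketch packs the body of an Update in the order shown in Figure 12-5. It assumes the standard BGP-4 encoding, in which each of the two length fields is a 2-byte unsigned integer and each prefix is sent as a 1-byte length (in bits) followed by only as many bytes as that length requires. The function names are hypothetical, the path attributes are passed in as opaque, already-encoded bytes, and the fixed BGP message header is omitted.

```python
import ipaddress
import struct


def encode_prefix(prefix: str) -> bytes:
    """Encode one prefix as <length in bits (1 byte)><prefix, padded to whole bytes>."""
    net = ipaddress.ip_network(prefix)
    nbytes = (net.prefixlen + 7) // 8          # only the significant bytes are sent
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def encode_update_body(withdrawn, path_attributes: bytes, nlri) -> bytes:
    """Build the Update body: withdrawn routes, then path attributes, then NLRI."""
    withdrawn_bytes = b"".join(encode_prefix(p) for p in withdrawn)
    nlri_bytes = b"".join(encode_prefix(p) for p in nlri)
    return (
        struct.pack("!H", len(withdrawn_bytes)) + withdrawn_bytes
        + struct.pack("!H", len(path_attributes)) + path_attributes
        + nlri_bytes
    )


# The path attributes are left empty only to keep the sketch short; a real
# Update that carries NLRI must also carry its mandatory PAs.
body = encode_update_body([], b"", ["31.0.0.0/8", "32.1.1.0/24"])
print(body.hex())   # 00000000081f18200101
```

Note that the NLRI section has no length field of its own; a receiver infers where it ends from the overall message length.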

The central concept in an individual Update message is the set of PAs. Then, all the prefixes (NLRIs) that share the exact same set of PAs and PA values are included at the end of the Update message. If a router needs to advertise a set of NLRIs, and each NLRI has a different setting for at least one PA, then separate Update messages will be required for each NLRI. However, when many routes share the same PAs—typical of prefixes owned by a particular ISP, for instance— multiple NLRIs are included in a single Update. This reduces router CPU load and uses less link bandwidth.
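The grouping described above can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, path attributes are modeled as a plain dictionary, and the sample values are loosely based on the routes in the 30s used later in this section.

```python
from collections import defaultdict


def group_into_updates(routes):
    """Group prefixes so each Update carries one PA set and all NLRIs sharing it.

    routes: iterable of (prefix, path_attributes) pairs, where path_attributes
    is a dict such as {"AS_PATH": (678,), "NEXT_HOP": "172.16.16.6"}.
    """
    updates = defaultdict(list)
    for prefix, attrs in routes:
        key = tuple(sorted(attrs.items()))     # hashable form of the PA set
        updates[key].append(prefix)
    return [(dict(key), prefixes) for key, prefixes in updates.items()]


routes = [
    ("31.0.0.0/8",  {"AS_PATH": (678,), "NEXT_HOP": "172.16.16.6", "ORIGIN": "?"}),
    ("32.0.0.0/8",  {"AS_PATH": (678,), "NEXT_HOP": "172.16.16.6", "ORIGIN": "i"}),
    ("32.1.1.0/24", {"AS_PATH": (678,), "NEXT_HOP": "172.16.16.6", "ORIGIN": "?"}),
]
for attrs, prefixes in group_into_updates(routes):
    print(prefixes, attrs)
```

With these sample values, two Updates result: one carrying 31.0.0.0/8 and 32.1.1.0/24, and a separate one for 32.0.0.0/8, because its ORIGIN differs.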

Determining the Contents of Updates

A router builds the contents of its Update messages based on the contents of its BGP table. However, the router must choose which subset of its BGP table entries to advertise to each neighbor, with the set likely varying from neighbor to neighbor. Table 12-8 summarizes the rules about which routes BGP does not include in routing updates to each neighbor; each rule is described more fully following the table.

Table 12-8 Summary of Rules Regarding Which Routes BGP Does Not Include in an Update

iBGP and/or eBGP    Routes Not Taken from the BGP Table
Both                Routes that are not considered "best"
Both                Routes matched by a deny clause in an outbound BGP filter
iBGP                iBGP-learned routes*
eBGP                Routes whose AS_PATH includes the ASN of the eBGP peer to which a BGP Update will be sent

* This rule is relaxed or changed as a result of using route reflectors or confederations.

BGP advertises a route to reach a particular subnet (NLRI) only if that route is considered to be the best route. If a BGP router learns of only one route to reach a particular prefix, the decision process is very simple. However, when choosing between multiple paths to reach the same prefix, BGP determines the best route based on a lengthy BGP decision process, as described in detail in Chapter 13. Assuming that none of the routers has configured any routing policies that impact the decision process, the decision tree reduces to a four-step process composed mainly of tiebreakers, as follows:

1. Choose the route with the shortest AS_PATH.

2. If AS_PATH length is a tie, prefer a single eBGP-learned route over one or more iBGP routes.

3. If the best route has not yet been chosen, choose the route with the lowest IGP metric to the NEXT_HOP of the routes.

4. If the IGP metric ties, choose the iBGP-learned route with the lowest BGP RID of the advertising router.
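The four steps can be sketched as a single sort key in Python. This is a simplified model under the same assumption as the text (no routing policies configured), not the full decision process of Chapter 13; the Path class, its field names, and the IGP metric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    as_path: tuple          # e.g. (45, 678)
    learned_via: str        # "eBGP" or "iBGP"
    igp_metric: int         # IGP metric to reach the NEXT_HOP
    neighbor_rid: str       # BGP RID of the advertising router, dotted decimal


def rid_value(rid: str) -> tuple:
    """Turn a dotted-decimal RID into a tuple so that it compares numerically."""
    return tuple(int(octet) for octet in rid.split("."))


def best_path(paths):
    """Pick a best path by applying the four tiebreakers in order."""
    return min(
        paths,
        key=lambda p: (
            len(p.as_path),                       # 1. shortest AS_PATH
            0 if p.learned_via == "eBGP" else 1,  # 2. eBGP over iBGP
            p.igp_metric,                         # 3. lowest IGP metric to NEXT_HOP
            rid_value(p.neighbor_rid),            # 4. lowest RID of the advertiser
        ),
    )


# R1's two paths to 31.0.0.0/8: directly from R6, and through R3 and AS 45.
via_r6 = Path((678,), "eBGP", 0, "6.6.6.6")
via_r3 = Path((45, 678), "iBGP", 0, "3.3.3.3")
print(best_path([via_r3, via_r6]) is via_r6)   # True
```

Because the key is a tuple, a later step is consulted only when every earlier step ties, which mirrors the way the list above is meant to be read.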

Additionally, BGP rules out some routes from being considered best based on the value of the NEXT_HOP PA. For a route to be a candidate to be considered best, the NEXT_HOP must be either:

■ 0.0.0.0, as the result of the route being injected on the local router.

■ Reachable according to that router's current IP routing table. In other words, the NEXT_HOP IP address must match a route in the routing table.
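A minimal Python sketch of this candidate check follows. The function name is hypothetical, and the routing table is reduced to a list of prefixes; the sample table for R3 is an assumption chosen to match the behavior shown later in Example 12-10.

```python
import ipaddress


def next_hop_usable(next_hop: str, routing_table) -> bool:
    """Return True if a route with this NEXT_HOP may be considered for best.

    routing_table: iterable of prefixes (strings) currently in the IP routing table.
    """
    if next_hop == "0.0.0.0":                      # injected on the local router
        return True
    address = ipaddress.ip_address(next_hop)
    return any(address in ipaddress.ip_network(p) for p in routing_table)


# Assumed routing table for R3: it knows R1's loopback but not the R1-R6 link.
r3_routes = ["1.1.1.1/32", "3.3.3.3/32", "4.4.4.4/32"]
print(next_hop_usable("172.16.16.6", r3_routes))   # False
print(next_hop_usable("1.1.1.1", r3_routes))       # True
```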

Because the NEXT_HOP PA is so important with regard to BGP's choice of its best path to reach each NLRI, this section summarizes the logic and provides several examples. The logic is separated into two parts based on whether the route is being advertised to an iBGP or eBGP peer. By default, when sending to an eBGP peer, the NEXT_HOP is changed to an IP address on the advertising router—specifically, to the same IP address the router used as the source IP address of the BGP Update message, for each respective neighbor. When sending to an iBGP peer, the default action is to leave the NEXT_HOP PA unchanged. Both of these default behaviors can be changed via the commands listed in Table 12-9.

Table 12-9 Conditions for Changing the NEXT_HOP PA

Type of Neighbor    Default Action for Advertised Routes                   Command to Switch to Other Behavior
iBGP                Do not change the NEXT_HOP                             neighbor... next-hop-self
eBGP                Change the NEXT_HOP to the update source IP address    neighbor... next-hop-unchanged

Note that the NEXT_HOP PA cannot be set via a route map; the only way to change the NEXT_HOP PA is through the methods listed in Table 12-9.
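The defaults and overrides in Table 12-9 amount to a small decision function. The Python sketch below is illustrative only; the function and parameter names are hypothetical, and the two boolean flags stand in for the next-hop-self and next-hop-unchanged options of the neighbor command.

```python
def advertised_next_hop(current_next_hop, peer_type, update_source,
                        next_hop_self=False, next_hop_unchanged=False):
    """Return the NEXT_HOP placed in an Update sent to one neighbor.

    peer_type: "iBGP" or "eBGP". update_source: the local IP address used as
    the source of BGP messages to that neighbor.
    """
    if peer_type == "eBGP":
        # Default for eBGP: rewrite to the update source, unless overridden.
        return current_next_hop if next_hop_unchanged else update_source
    # Default for iBGP: leave unchanged, unless next-hop-self is configured.
    return update_source if next_hop_self else current_next_hop


# R6 to eBGP peer R1: the NEXT_HOP becomes R6's update source.
print(advertised_next_hop("10.1.69.9", "eBGP", "172.16.16.6"))      # 172.16.16.6
# R1 to iBGP peer R3, default behavior: the NEXT_HOP is left as R6's address.
print(advertised_next_hop("172.16.16.6", "iBGP", "1.1.1.1"))        # 172.16.16.6
# R1 to R3 with next-hop-self: the NEXT_HOP becomes R1's update source.
print(advertised_next_hop("172.16.16.6", "iBGP", "1.1.1.1",
                          next_hop_self=True))                      # 1.1.1.1
```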

Example: Impact of the Decision Process and NEXT_HOP on BGP Updates

The next several examples together show a sequence of events regarding the propagation of network 31.0.0.0/8 by BGP throughout the network of Figure 12-4. R6 originated the routes in the 30s (as in Example 12-4) by redistributing EIGRP routes learned from R9. The purpose of this series of examples is to explain how BGP chooses which routes to include in Updates under various conditions.

The first example, Example 12-9, focuses on the commands used to examine what R6 sends to R1, what R1 receives, and the resulting entries in R1's BGP table. The second example, Example 12-10, then examines those same routes propagated from R1 to R3, including problems related to R1's default behavior of not changing the NEXT_HOP PA of those routes. Finally, Example 12-11 shows the solution of R1's use of the neighbor 3.3.3.3 next-hop-self command, and the impact that has on the contents of the BGP Updates in AS 123.

Example 12-9 R6 Sending the 30s Networks to R1 Using BGP

! R6 has injected the three routes listed below; they were not learned from
! another BGP neighbor. Note all three show up as >, meaning they are the best
! (and only in this case) routes to the destination NLRIs.
R6# show ip bgp
BGP table version is 5, local router ID is 6.6.6.6
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path

! R6 now lists the routes it advertises to R1, sort of. This command lists R6's
! BGP table entries that are intended to be sent, but R6 can (and will in this
! case) change the information before advertising to R1. Pay particular attention
! to the Next Hop column, versus upcoming commands on R1. In effect, this command
! shows R6's current BGP table entries that will be sent to R1, but it shows them
! before R6 makes any changes, including NEXT_HOP.
R6# show ip bgp neighbor 172.16.16.1 advertised-routes
BGP table version is 5, local router ID is 6.6.6.6
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path

Total number of prefixes 3

! The next command (R1) lists the info in the received BGP update from R6. Note
! that the NEXT_HOP is different; R6 changed the NEXT_HOP before sending the
! update, because it has an eBGP peer connection to R1, and eBGP defaults to set
! NEXT_HOP to itself. As R6 was using 172.16.16.6 as the IP address from which to
! send BGP messages to R1, R6 set NEXT_HOP to that number. Also note that R1 lists
! the neighboring AS (678) in the Path column at the end, signifying the AS_PATH
! for the route.
R1# show ip bgp neighbor 172.16.16.6 received-routes
BGP table version is 7, local router ID is 111.111.111.111
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 31.0.0.0         172.16.16.6         156160             0 678 ?
*> 32.0.0.0         172.16.16.6              0             0 678 i
*> 32.1.1.0/24      172.16.16.6         156160             0 678 ?

Total number of prefixes 3

! The show ip bgp summary command lists the state of the neighbor until the
! neighbor becomes established; at that point, the State/PfxRcd column lists the
! number of NLRIs (prefixes) received (and still valid) from that neighbor.
R1# show ip bgp summary | begin Neighbor
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
2.2.2.2         4   123      55      57        7    0    0 00:52:30        0
3.3.3.3         4   123      57      57        7    0    0 00:52:28        3
172.16.16.6     4   678      53      51        7    0    0 00:48:50        3

! R1 has also learned of these prefixes from R3, as seen below. The routes through
! R6 have one AS in the AS_PATH, and the routes through R3 have two autonomous
! systems, so the routes through R6 are best. Also, the iBGP routes have an "i"
! for "internal" just before the prefix.
R1# show ip bgp
BGP table version is 7, local router ID is 111.111.111.111
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network

Next Hop

172.16.16.6

172.16.16.6

172.16.16.6

Metric LocPrf Weight Path

156160 0 0 0

156160

0 45 678 ? 0 678 ? 0 45 678 i 0 678 i 0 45 678 ? 0 678 ?

Example 12-9 showed examples of how you can view the contents of the actual Updates sent to neighbors (using the show ip bgp neighbor advertised-routes command) and the contents of Updates received from a neighbor (using the show ip bgp neighbor received-routes command). RFC 1771 suggests that the BGP RIB can be separated into components for received Updates from each neighbor and sent Updates for each neighbor. Most implementations (including Cisco IOS) keep a single RIB, with notations as to which entries were sent and received to and from each neighbor.
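To make the single-RIB idea concrete, here is a small Python sketch of one table whose entries carry notations about the neighbors they were received from and sent to. It is a conceptual model only, with hypothetical class and method names, and it says nothing about how any particular implementation actually lays out its BGP table.

```python
from collections import defaultdict


class SingleRib:
    """One table of prefixes, annotated with who advertised and who was sent each one."""

    def __init__(self):
        self.received_from = defaultdict(set)   # prefix -> neighbors that advertised it
        self.sent_to = defaultdict(set)         # prefix -> neighbors it was advertised to

    def record_received(self, neighbor, prefix):
        self.received_from[prefix].add(neighbor)

    def record_sent(self, neighbor, prefix):
        self.sent_to[prefix].add(neighbor)

    def received_routes(self, neighbor):
        """Per-neighbor view of received prefixes, derived from the notations."""
        return sorted(p for p, nbrs in self.received_from.items() if neighbor in nbrs)

    def advertised_routes(self, neighbor):
        """Per-neighbor view of advertised prefixes, derived from the notations."""
        return sorted(p for p, nbrs in self.sent_to.items() if neighbor in nbrs)


rib = SingleRib()
for prefix in ("31.0.0.0/8", "32.0.0.0/8", "32.1.1.0/24"):
    rib.record_received("172.16.16.6", prefix)   # learned from R6
    rib.record_sent("3.3.3.3", prefix)           # passed on to R3
print(rib.received_routes("172.16.16.6"))
print(rib.advertised_routes("3.3.3.3"))
```

The per-neighbor views are computed on demand from the notations, rather than stored as separate tables for each neighbor.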

NOTE For the received-routes option to work, the router on which the command is used must have the neighbor neighbor-id soft-reconfiguration inbound BGP subcommand configured for the other neighbor.

These show ip bgp neighbor commands with the advertised-routes option list the BGP table entries that will be advertised to that neighbor. However, note that any changes to the PAs inside each entry are not shown in the command output. For example, the show ip bgp neighbor 172.16.16.1 advertised-routes command on R6 listed the NEXT_HOP for 31/8 as 10.1.69.9, which is true of that entry in R6's BGP table. R6 then changes the NEXT_HOP PA before sending the actual Update, with a NEXT_HOP of 172.16.16.6.

By the end of Example 12-9, R1 knows of both paths to each of the three prefixes in the 30s (AS_PATH 678 and 45-678), but has chosen the shortest AS_PATH (through R6) as the best path in each case. Note that the > in the show ip bgp output designates the routes as R1's best routes. Next, Example 12-10 shows some possibly surprising results on R3 related to its choices of best routes.

Example 12-10 Examining the BGP Table on R3

! R1 now updates R3 with R1's "best" routes
R1# show ip bgp neighbor 3.3.3.3 advertised-routes | begin Network
   Network          Next Hop            Metric LocPrf Weight Path

Total number of prefixes 3

! R3 received the routes, but R3's best routes to each prefix point back to
! R4 in AS 45, with AS_PATH 45-678, which is a longer path. The route through R1
! cannot be "best" because the NEXT_HOP was sent unchanged by iBGP neighbor R1.
R3# show ip bgp
BGP table version is 7, local router ID is 3.3.3.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 31.0.0.0         4.4.4.4                                0 45 678 ?
* i                 172.16.16.6         156160    100      0 678 ?
*> 32.0.0.0         4.4.4.4                                0 45 678 i
* i                 172.16.16.6              0    100      0 678 i
*> 32.1.1.0/24      4.4.4.4                                0 45 678 ?
* i                 172.16.16.6         156160    100      0 678 ?

! Proof that R3 cannot reach the next-hop IP address is shown next.
R3# ping 172.16.16.6
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.16.16.6, timeout is 2 seconds:
Success rate is 0 percent (0/5)

Example 12-10 points out a quirk with some terminology in the show ip bgp command output, as well as an important design choice with BGP. First, the command output lists * as meaning valid; however, that designation simply means that the route is a candidate for use. Before the route can be actually used and added to the IP routing table, the NEXT_HOP must also be reachable. In some cases, routes that the show ip bgp command considers "valid" might not be usable routes, with Example 12-10 showing just such an example.

Each BGP route's NEXT_HOP must be reachable for a route to be truly valid. With all default settings, an iBGP-learned route has a NEXT_HOP IP address of the last eBGP router to advertise the route. For example, R3's route to 31.0.0.0/8 through R1 lists R6's IP address (172.16.16.6) in the NEXT_HOP field. Unfortunately, R3 does not have a route for 172.16.16.6, so that route cannot be considered "best" by BGP.

There are two easy choices to solve the problem:

■ Make the eBGP neighbor's IP address reachable by advertising that subnet into the IGP.

■ Use the next-hop-self option on the neighbor command that points to iBGP peers.

The first option typically can be easily implemented. Because many eBGP neighbors use interface IP addresses on their neighbor commands, the NEXT_HOP exists in a subnet directly connected to the AS. For example, R1 is directly connected to 172.16.16.0/24, so R1 could simply advertise that connected subnet into the IGP inside the AS.

However, this option might be problematic when loopback addresses are used for BGP neighbors. For example, if R1 had been configured to refer to R6's 6.6.6.6 loopback IP address, and the peering was working, R1 would need to have a route to reach 6.6.6.6. However, it is less likely that R1 would already be advertising a route to reach 6.6.6.6 into AS 123.

The second option causes the router to change the NEXT_HOP PA to one of its own IP addresses— an address that is more likely to already be in the neighbor's IP routing table, which works well even if using loopbacks with an eBGP peer. Example 12-11 points out such a case, with R1 using the neighbor next-hop-self command, advertising itself (1.1.1.1) as the NEXT_HOP. As a result, R3 changes its choice of best routes, because R3 has a route to reach 1.1.1.1, overcoming the "NEXT_HOP unreachable" problem.

Example 12-11 points out how an iBGP peer can set NEXT_HOP to itself. However, it's also a good example of how BGP decides when to advertise routes to iBGP peers. The example follows this sequence, with the command output showing evidence of these events:

1. The example begins like the end of Example 12-10, with R1 advertising routes with R6 as the next hop, and with R3 not being able to use those routes as best routes.

2. Because R3's best routes are eBGP routes (through R4), R3 is allowed to advertise those routes to R2.

3. R1 then changes its configuration to use next-hop-self toward R3.

4. R3 is now able to treat the routes learned from R1 as R3's best routes.

5. R3 can no longer advertise its best routes to these networks to R2, because the new best routes are iBGP routes.
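The sequence above can be modeled in a few lines of Python before looking at the router output. This sketch is illustrative only: the function names are hypothetical, the set of next hops that R3 can reach is an assumption, and best-path selection is reduced to the NEXT_HOP check followed by the shortest AS_PATH.

```python
def r3_best_path(r1_uses_next_hop_self: bool) -> dict:
    """Return R3's best path to 31.0.0.0/8 under the simplified rules."""
    reachable = {"1.1.1.1", "4.4.4.4"}   # next hops R3 can reach (assumed)
    paths = [
        {"via": "R4", "type": "eBGP", "as_path": (45, 678), "next_hop": "4.4.4.4"},
        {"via": "R1", "type": "iBGP", "as_path": (678,),
         "next_hop": "1.1.1.1" if r1_uses_next_hop_self else "172.16.16.6"},
    ]
    # Paths with an unreachable NEXT_HOP cannot be best.
    usable = [p for p in paths if p["next_hop"] in reachable]
    return min(usable, key=lambda p: len(p["as_path"]))


def r3_advertises_to_r2(best: dict) -> bool:
    """R2 is an iBGP peer, so an iBGP-learned best path is not sent to it."""
    return best["type"] != "iBGP"


for next_hop_self in (False, True):
    best = r3_best_path(next_hop_self)
    print(next_hop_self, best["via"], r3_advertises_to_r2(best))
# False R4 True
# True R1 False
```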

Example 12-11 R3 Advertises the 30s Networks to R2, and Then R3 Withdraws the Routes

! (Step 1): At this point, R3 still believes its best route to all three prefixes

! in the 30s is through R4; as those are eBGP routes, R3 advertises all three

! routes to iBGP peer R2, as seen next.

R3# show ip bgp neighbor 2.2.2.2 advertised-routes

BGP table version is 7, local router ID is 3.3.3.3

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 31.0.0.0
*> 32.0.0.0
*> 32.1.1.0/24

Total number of prefixes 3

! (Step 2) R2 lists the number of prefixes learned from R3 next (3).
R2# show ip bgp summary | begin Neighbor

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd

! (Step 3) R1 now changes to use next-hop-self to peer R3.
R1# conf t
Enter configuration commands, one per line. End with CNTL/Z.
R1(config)# router bgp 123

R1(config-router)# neigh 3.3.3.3 next-hop-self

! (Step 4) R3 now lists the routes through R1 as best, because the new
! NEXT_HOP is R1's update source IP address, 1.1.1.1, which is reachable by R3.
R3# show ip bgp
BGP table version is 10, local router ID is 3.3.3.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Metric LocPrf Weight Path

0 45 678 100 0 678 i

156160

Network Next Hop

! (Step 5) First, note above that all three "best" routes are iBGP routes, as
! noted by the "i" immediately before the prefix. R3 only advertises "best"
! routes, with the added requirement that it must not advertise iBGP routes to
! other iBGP peers. As a result, R3 has withdrawn the routes that had formerly
! been sent to R2.

R3# show ip bgp neighbor 2.2.2.2 advertised-routes

Total number of prefixes 0

! The next command confirms on R2 that it no longer has any prefixes learned
! from R3.

R2# show ip bgp summary | begin Neighbor

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd

Summary of Rules for Routes Advertised in BGP Updates

The following list summarizes the rules dictating which routes a BGP router sends in its Update messages:

■ Send only the best route listed in the BGP table.

■ To iBGP neighbors, do not advertise paths learned from other iBGP neighbors.

■ To eBGP neighbors, do not advertise paths for which the neighbor's AS is already in the AS_PATH PA.

■ Do not advertise suppressed or dampened routes.

■ Do not advertise routes filtered via configuration.

The first two rules have been covered in some depth in this section. Chapter 13 covers the other three rules in more depth.
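The five rules can be combined into a single predicate, sketched below in Python. The Route class and its field names are hypothetical simplifications; in particular, a single boolean stands in for whatever outbound filters are configured, and the relaxations introduced by route reflectors and confederations are ignored.

```python
from dataclasses import dataclass


@dataclass
class Route:
    prefix: str
    as_path: tuple
    best: bool = True
    learned_via: str = "eBGP"        # "eBGP", "iBGP", or "local"
    suppressed: bool = False         # suppressed or dampened
    denied_by_filter: bool = False   # filtered via configuration


def should_advertise(route: Route, peer_type: str, peer_asn: int) -> bool:
    """Apply the summary rules for one route and one neighbor."""
    if not route.best:
        return False                 # send only the best route
    if peer_type == "iBGP" and route.learned_via == "iBGP":
        return False                 # no iBGP-learned paths to iBGP peers
    if peer_type == "eBGP" and peer_asn in route.as_path:
        return False                 # peer's AS already in the AS_PATH
    if route.suppressed or route.denied_by_filter:
        return False                 # suppressed, dampened, or filtered
    return True


# R3's best route to 31.0.0.0/8 before and after R1 uses next-hop-self.
via_r4 = Route("31.0.0.0/8", (45, 678), learned_via="eBGP")
via_r1 = Route("31.0.0.0/8", (678,), learned_via="iBGP")
print(should_advertise(via_r4, "iBGP", 123))   # True: eBGP-learned, sent to R2
print(should_advertise(via_r1, "iBGP", 123))   # False: iBGP-learned, not sent to R2
print(should_advertise(via_r4, "eBGP", 45))    # False: AS 45 is already in the AS_PATH
```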
