Next-hop fixup in partially-meshed NBMA networks

By Ivan Pepelnjak

If the network design requires a partially-meshed NBMA network in a single IP subnet, you have to take extra configuration steps, which depend on the routing protocol used in the network, to ensure that the edge routers with partial connectivity can forward traffic according to the entries in the IP routing table.

Switched WAN technologies (Frame Relay, ATM or X.25) are the most common examples of NBMA networks. You might also encounter the same limitations in private VLAN environments.

Next-hop issues

The next-hop issues will be illustrated with the sample network displayed in the following figure:

Image:NBMA_NH_Diagram.png

In some cases, routing protocols change the IP next hop of advertised prefixes to a third-party IP address, most often when the recipient of the routing update resides in the same IP subnet as the source of the information. For example, when R1 sends a RIP update to R2 advertising IP prefix 10.0.1.3/32, it sets the next-hop IP address to 10.0.0.3 because R2 (10.0.0.2) and R3 (10.0.0.3) are in the same IP subnet. The behavior of various routing protocols is summarized in the following table:

| Routing protocol | Default behavior | Changed with |
|---|---|---|
| RIP | Next-hop IP address is changed if the source and the recipient of the routing update reside in the same IP subnet. For example, R1 advertises 10.0.1.3/32 to R2 with next-hop set to 10.0.0.3. | Not configurable |
| EIGRP | Routes received through an interface are not advertised back through the same interface. | no ip split-horizon eigrp |
| | EIGRP next-hop processing is disabled by default. EIGRP does not have next-hop issues in NBMA networks, but you have to enable next-hop calculations if you want to establish spoke-to-spoke shortcuts in DMVPN networks. | no ip next-hop-self eigrp |
| OSPF | All routers belonging to the same IP subnet are assumed to be directly reachable. Point-to-multipoint OSPF mode should be used in partially-meshed NBMA networks. Special precautions are needed if OSPF routers reside in the same subnet as external sources of routes redistributed into OSPF. | ip ospf network point-to-multipoint |
| BGP | Next-hop IP address is unchanged if the next-hop in the local BGP table and the destination router reside in the same subnet. For example, R1 advertises 10.0.1.3/32 to R2 with next-hop set to 10.0.0.3. Default BGP next-hop processing should be disabled on partially-meshed NBMA networks. | neighbor ip-address next-hop-self |

The OSPF point-to-multipoint network type should be used without exception on partially-meshed NBMA networks. Other OSPF-related solutions are thus not discussed in this article.
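
As a minimal sketch of the OSPF recommendation above, the hub router could be configured along the following lines. The OSPF process ID (1) and the area number (0) are assumptions that do not appear in the sample network; the same interface-level command would also be needed on R2 and R3.

```
! Hypothetical OSPF setup on R1; process ID 1 and area 0 are assumed
interface Serial1/0
 ip address 10.0.0.1 255.255.255.0
 encapsulation frame-relay
 ip ospf network point-to-multipoint
!
router ospf 1
 network 10.0.0.0 0.0.0.255 area 0
 network 10.0.1.1 0.0.0.0 area 0
```

In point-to-multipoint mode each router advertises a host route for its own interface address, so R2 should learn the path toward 10.0.0.3 through R1 instead of assuming that R3 is directly reachable.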

RIP version 2 configuration

The router configurations in the sample network are extremely simple. The configuration of R1 (hub router) is displayed in the following listing:

hostname R1
!
interface Loopback0
 ip address 10.0.1.1 255.255.255.255
!
interface Serial1/0
 description Link to FR
 ip address 10.0.0.1 255.255.255.0
 encapsulation frame-relay
!
router rip
 version 2
 network 10.0.0.0

Frame Relay inverse ARP is used to establish dynamic mappings between neighbor IP addresses and DLCI numbers:

R1#show frame map
Serial1/0 (up): ip 10.0.0.2 dlci 200(0xC8,0x3080), dynamic,
              broadcast,, status defined, active
Serial1/0 (up): ip 10.0.0.3 dlci 300(0x12C,0x48C0), dynamic,
              broadcast,, status defined, active

After RIPv2 is configured throughout the network, the next hop of R2's route toward 10.0.1.3/32 is 10.0.0.3 because of the RIP next-hop processing rules:

R2#show ip route 10.0.1.3
Routing entry for 10.0.1.3/32
  Known via "rip", distance 120, metric 2
  Redistributing via rip
  Last update from 10.0.0.3 on Serial1/0, 00:00:01 ago
  Routing Descriptor Blocks:
  * 10.0.0.3, from 10.0.0.1, 00:00:01 ago, via Serial1/0
      Route metric is 2, traffic share count is 1
R2#show ip cef 10.0.1.3 internal
10.0.1.3/32, version 15, epoch 0, cached adjacency 10.0.0.3
0 packets, 0 bytes
  via 10.0.0.3, Serial1/0, 0 dependencies
    next hop 10.0.0.3, Serial1/0
    invalid cached adjacency
  refcount 5

As there is no direct connectivity between R2 and R3, R2 has no layer-2 mapping for the next hop 10.0.0.3. The packets fail encapsulation, and traffic cannot flow between 10.0.1.2 and 10.0.1.3:

R2#show access-list
Extended IP access list 199
    10 permit ip host 10.0.1.2 host 10.0.1.3
R2#debug ip packet 199
IP packet debugging is on for access list 199
R2#ping ip 10.0.1.3 source loop 0 repeat 2

Type escape sequence to abort.
Sending 3, 100-byte ICMP Echos to 10.0.1.3, timeout is 2 seconds:
Packet sent with a source address of 10.0.1.2

*Jun 22 16:51:26.083: IP: tableid=0, s=10.0.1.2 (local), d=10.0.1.3 (Serial1/0), routed via RIB
*Jun 22 16:51:26.083: IP: s=10.0.1.2 (local), d=10.0.1.3 (Serial1/0), len 100, sending
*Jun 22 16:51:26.083: IP: s=10.0.1.2 (local), d=10.0.1.3 (Serial1/0), len 100, encapsulation failed.

It’s mandatory to test connectivity between the loopback interfaces to ensure the next-hop issues are resolved on both edge routers involved in the test.

Single-subnet partially-meshed NBMA designs should be avoided at all costs. Unless scalability issues prevent it, the best way to model a partially-meshed NBMA network is a point-to-point subinterface configured over each virtual circuit.
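
A sketch of the subinterface model on the hub router might look like this. The DLCI numbers are taken from the show frame map printout on R1, whereas the /30 subnets (10.0.2.0/30 and 10.0.3.0/30) and the subinterface numbers are hypothetical and would replace the shared 10.0.0.0/24 subnet.

```
! Hypothetical addressing; one IP subnet per virtual circuit
interface Serial1/0
 no ip address
 encapsulation frame-relay
!
interface Serial1/0.200 point-to-point
 description VC to R2
 ip address 10.0.2.1 255.255.255.252
 frame-relay interface-dlci 200
!
interface Serial1/0.300 point-to-point
 description VC to R3
 ip address 10.0.3.1 255.255.255.252
 frame-relay interface-dlci 300
```

With this design R2 and R3 are no longer in the same IP subnet, so the routing protocol has no reason to advertise a third-party next hop.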

Static switched WAN maps

Packet propagation across NBMA networks relies on mappings between layer-3 (IP) addresses and layer-2 addresses (Frame Relay VC numbers, X.25 addresses or ATM VPI/VCI pairs). These mappings, which are equivalent to the ARP tables used on LAN media, can be discovered dynamically (Frame Relay only) or defined statically with the frame-relay map, x25 map or ATM pvc/map-list configuration commands. The static WAN maps can be used to bypass the next-hop issues; you could associate the IP address of an unreachable next-hop with one of the existing virtual circuits.

This method does not provide any redundancy, as you can define only a single L3-L2 mapping for each next-hop IP address.

In the sample network you can use the frame-relay map commands on R2 and R3 to establish virtual direct connectivity between 10.0.0.2 and 10.0.0.3:

R2#show frame map
Serial1/0 (up): ip 10.0.0.1 dlci 200(0xC8,0x3080), dynamic,
              broadcast,
              CISCO, status defined, active
R2#configure terminal
Enter configuration commands, one per line.  End with CNTL/Z.
R2(config)#interface serial 1/0
R2(config-if)#frame map ip 10.0.0.3 200

R3#show frame map
Serial1/0 (up): ip 10.0.0.1 dlci 300(0x12C,0x48C0), dynamic,
              broadcast,, status defined, active
R3#configure terminal
Enter configuration commands, one per line.  End with CNTL/Z.
R3(config)#interface serial 1/0
R3(config-if)#frame map ip 10.0.0.2 300

After the fake Frame Relay maps have been configured on R2 and R3, ping between the loopback interfaces works:

R2#ping 10.0.1.3 source 10.0.1.2 repeat 3
Type escape sequence to abort.
Sending 3, 100-byte ICMP Echos to 10.0.1.3, timeout is 2 seconds:
Packet sent with a source address of 10.0.1.2
!!!
Success rate is 100 percent (3/3), round-trip min/avg/max = 8/9/12 ms
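
For reference, the resulting interface configuration on R2 could look roughly like the sketch below. The static entry for the hub (10.0.0.1) is an addition that the original test does not use: a static map is generally understood to disable Inverse ARP for that protocol on the same DLCI, so this sketch assumes you would also want to pin the hub mapping (with the broadcast keyword, so that routing updates still reach R1).

```
! Sketch of R2; the 10.0.0.1 entry is an assumed precaution
interface Serial1/0
 ip address 10.0.0.2 255.255.255.0
 encapsulation frame-relay
 frame-relay map ip 10.0.0.1 200 broadcast
 frame-relay map ip 10.0.0.3 200
```

The map for 10.0.0.3 deliberately omits the broadcast keyword; otherwise every broadcast or multicast packet would be sent twice over DLCI 200.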

RIP version 2 with host routes toward next-hop router

IOS releases 12.4, 12.4T and 12.2S do not perform a recursive lookup in the IP routing table for the next-hop address received in a RIP update. The route received by RIP is installed directly in the IP routing table and CEF table.

In the sample network, you might try to configure static host routes on R2 and R3 to redirect the traffic sent toward the RIP-advertised IP next hop through R1:

R2(config)#ip route 10.0.0.3 255.255.255.255 10.0.0.1

R3(config)#ip route 10.0.0.2 255.255.255.255 10.0.0.1

Although the static routes appear in the IP routing table, they do not influence the routing table or CEF entries of the RIP routes:

R2#show ip route 10.0.0.3
Routing entry for 10.0.0.3/32
  Known via "static", distance 1, metric 0
  Routing Descriptor Blocks:
  * 10.0.0.1
      Route metric is 0, traffic share count is 1

R2#show ip route 10.0.1.3
Routing entry for 10.0.1.3/32
  Known via "rip", distance 120, metric 2
  Redistributing via rip
  Last update from 10.0.0.3 on Serial1/0, 00:00:12 ago
  Routing Descriptor Blocks:
  * 10.0.0.3, from 10.0.0.1, 00:00:12 ago, via Serial1/0
      Route metric is 2, traffic share count is 1

R2#show ip cef 10.0.1.3
10.0.1.3/32, version 15, epoch 0, cached adjacency 10.0.0.3
0 packets, 0 bytes
  via 10.0.0.3, Serial1/0, 0 dependencies
    next hop 10.0.0.3, Serial1/0
    invalid cached adjacency

BGP configuration

BGP was configured in the sample network with each router running in a separate autonomous system, as shown in the next diagram:

Image:NBMA_NH_BGP.png

BGP configuration from R1 is displayed in the next printout:

router bgp 65001
 no synchronization
 bgp log-neighbor-changes
 network 10.0.1.1 mask 255.255.255.255
 neighbor 10.0.0.2 remote-as 65002
 neighbor 10.0.0.3 remote-as 65003
 no auto-summary
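
The spoke configurations are not shown in this article. A matching configuration for R2 would presumably mirror the one on R1; the sketch below assumes AS 65002 (as seen in the show ip route printouts on R2) and the loopback address 10.0.1.2, with a single EBGP session toward the hub because there is no virtual circuit to R3.

```
! Assumed R2 counterpart of the R1 configuration
router bgp 65002
 no synchronization
 bgp log-neighbor-changes
 network 10.0.1.2 mask 255.255.255.255
 neighbor 10.0.0.1 remote-as 65001
 no auto-summary
```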

After the BGP convergence process completes, the route to 10.0.1.3 on R2 has R3 as the next-hop due to BGP next-hop processing rules:

R2#show ip bgp 10.0.1.3
BGP routing table entry for 10.0.1.3/32, version 4
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Not advertised to any peer
  65001 65003
    10.0.0.3 from 10.0.0.1 (10.0.1.1)
      Origin IGP, localpref 100, valid, external, best

Next-hop static host routes with BGP

BGP’s architecture (primarily in IBGP designs) relies on recursive lookup of the next-hop IP address. The traffic toward the BGP next-hop can thus be influenced with static host routes for BGP next-hops.

The correct solution to the BGP next-hop issue is to disable the default BGP next-hop processing with the neighbor next-hop-self command. It should always be preferred over static host routes or fake frame-relay map entries.
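
In the sample network, the preferred fix would be applied on R1, which is the router that passes the third-party next hop between the spokes. A minimal sketch, reusing the neighbor addresses from the R1 configuration:

```
! R1 advertises itself as the next hop to both spokes
router bgp 65001
 neighbor 10.0.0.2 next-hop-self
 neighbor 10.0.0.3 next-hop-self
```

The change affects outbound updates, so a refresh such as clear ip bgp * soft out may be needed before R2 and R3 see 10.0.0.1 as the next hop.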

In the sample network, a host route toward the Frame Relay interface address of R3 (10.0.0.3) is configured on R2, pointing to R1:

R2(config)#ip route 10.0.0.3 255.255.255.255 10.0.0.1

After the host route has been configured, the CEF entry for the IP prefix 10.0.1.3/32 points to the IP address 10.0.0.1 even though the BGP table entry has not changed:

R2#show ip bgp 10.0.1.3
BGP routing table entry for 10.0.1.3/32, version 4
Paths: (1 available, best #1, table Default-IP-Routing-Table)
Flag: 0x820
  Not advertised to any peer
  65001 65003
    10.0.0.3 from 10.0.0.1 (10.0.1.1)
      Origin IGP, localpref 100, valid, external, best
R2#show ip route 10.0.1.3
Routing entry for 10.0.1.3/32
  Known via "bgp 65002", distance 20, metric 0
  Tag 65001, type external
  Last update from 10.0.0.3 00:01:46 ago
  Routing Descriptor Blocks:
  * 10.0.0.3, from 10.0.0.1, 00:01:46 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 65001
R2#show ip cef 10.0.1.3 detail
10.0.1.3/32, version 20, epoch 0, cached adjacency 10.0.0.1
0 packets, 0 bytes
  via 10.0.0.3, 0 dependencies, recursive
    next hop 10.0.0.1, Serial1/0 via 10.0.0.3/32
    valid cached adjacency

Additional Resources  

Configuring BGP on Cisco Routers (BGP) course