Aggressive BGP fall-over behavior
BGP Support for Fast Peering Session Deactivation, described in the IP Corner article »Designing Fast Converging BGP Networks«, is a feature that tracks the reachability of the BGP neighbor in the IP routing table and disconnects the BGP session as soon as the neighbor is no longer reachable.
The IP Corner article mentions that this feature might be too aggressive in environments with mixed routing protocols because of the short interval between the loss of the primary IP route and the installation of the backup IP route. The same problem can also appear in pure link-state IGP environments: a BGP session might be lost when a directly connected interface of a BGP router fails.
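As a minimal sketch, the feature is enabled per neighbor with the fall-over keyword. The fragment below is lifted from the A1 configuration listed at the end of this article; the comment line is added here for illustration.

```
router bgp 65000
 neighbor 10.0.1.4 remote-as 65000
 neighbor 10.0.1.4 update-source Loopback0
 ! drop the session as soon as the route to 10.0.1.4 leaves the IP routing table
 neighbor 10.0.1.4 fall-over
```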
The following topology is used to illustrate this undesired behavior:
Two BGP routers (A1 and A2) are connected through the core network (C1 and C2) and through a lower-speed (256 kbps) backup link. Default OSPF costs are used on all links, so the primary path from A1 to A2 goes through C1 and C2.
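The path selection can be checked with a short calculation. This is a sketch under assumptions the text does not state explicitly: the IOS default OSPF reference bandwidth of 100 Mbps, the default serial interface bandwidth of 1544 kbps on the core-facing links (the configurations below set a bandwidth only on the 256 kbps link), and a cost of 1 for the loopback interface of A2.

```latex
\begin{align*}
\text{cost}(\text{link}) &= \left\lfloor \frac{10^{8}}{\text{bandwidth in bps}} \right\rfloor \\
\text{cost}(1544\ \text{kbps}) &= \lfloor 10^{8} / 1544000 \rfloor = 64 \\
\text{cost}(256\ \text{kbps}) &= \lfloor 10^{8} / 256000 \rfloor = 390 \\
\text{A1} \to \text{C1} \to \text{C2} \to \text{A2 loopback} &: 64 + 64 + 64 + 1 = 193 \\
\text{A1} \to \text{A2 loopback (backup link)} &: 390 + 1 = 391
\end{align*}
```

The backup-path result agrees with the metric [110/391] shown in the debugging printout in the Directly connected failure section.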
Core link failure
When a core link fails (for example, the core serial interface on C1), the following sequence of events unfolds on C1 and A1:
- C1 detects the link loss and adjusts its IP routing table.
- After the OSPF process detects the interface loss, C1 generates a modified router LSA and floods it.
- A1 receives the modified router LSA from C1 and schedules SPF (delayed for 5 seconds unless tuned with the timers throttle spf command).
- SPF computes the new network topology and installs the backup route toward A2.
An IP route toward A2 is always present in the IP routing table; the BGP session is therefore not disconnected.
Selected (and heavily edited) debugging printouts taken on A1 illustrate the process:
08:09:08.263: OSPF: received update from 10.0.1.2, Serial1/0
08:09:08.263: OSPF: Rcv Update Type 1, LSID 10.0.1.2, Adv rtr 10.0.1.2, age 1, seq 0x80000008
...
08:09:13.267: OSPF: Begin SPF for topology Base with MTID 0 at 1083.844ms, process time 988ms
Directly connected failure
The behavior of A1 is completely different if the directly connected link (A1 to C1) fails:
- A1 detects the interface loss and removes all routes using that interface from the IP routing table.
- After the OSPF process detects the interface loss, A1 generates a modified router LSA and floods it.
- A1 schedules SPF (delayed for 5 seconds unless tuned with the timers throttle spf command).
- SPF computes the new network topology and installs the backup route toward A2.
The route toward A2 is missing from the IP routing table between the moment the interface loss is detected and the moment SPF computes the new network topology. The BGP session with A2 is thus disconnected even though a backup route exists and is eventually used. In the printout below, the route to 10.0.1.4 is flushed at 08:07:06.531 and reinstalled at 08:07:22.023, a gap of roughly 15.5 seconds.
Debugging printouts (heavily edited) generated by A1 illustrate this behavior:
08:07:06.519: RT: interface Serial1/0 removed from routing table
08:07:06.527: RT: delete route to 10.0.1.4 via 10.0.7.10, Serial1/0
08:07:06.531: RT: no routes to 10.0.1.4, flushing
08:07:06.567: %BGP-5-ADJCHANGE: neighbor 10.0.1.4 Down Route to peer lost
08:07:07.583: %LINEPROTO-5-UPDOWN: Line protocol on Interface Serial1/0, changed state to down
…
08:07:16.519: %OSPF-5-ADJCHG: Process 1, Nbr 10.0.1.2 on Serial1/0 from FULL to DOWN, Neighbor Down: Interface down or detached
…
08:07:17.027: OSPF: Build router LSA for area 0, router ID 10.0.1.1, seq 0x80000006, process 1
…
08:07:22.007: OSPF: Begin SPF for topology Base with MTID 0 at 972.580ms, process time 780ms
08:07:22.007: spf_time 00:16:07.580, wait_interval 5000ms
08:07:22.019: RT: updating ospf 10.0.1.4/32 (0x0) via 10.0.7.6 Se1/1
08:07:22.023: RT: add 10.0.1.4/32 via 10.0.7.6, ospf metric [110/391]
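To reproduce a similar trace, a plausible set of exec-mode commands is sketched below. The article does not say which debugging commands were used, so this selection is an assumption. The two show commands are added here as a simple check of the peer route and the session state after the failure.

```
! Assumed debugging set (not stated in the article)
debug ip routing
debug ip ospf adj
debug ip ospf spf
debug ip ospf flood
debug ip ospf lsa-generation
!
! After the failure: is there a route to the peer, and is the session up?
show ip route 10.0.1.4
show ip bgp neighbors 10.0.1.4
```

Debugging output is written to the buffer configured with logging buffered and carries millisecond timestamps because of the service timestamps debug datetime msec command in the router configurations.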
Workarounds
Optimizing OSPF convergence might result in the backup path being discovered fast enough to avoid the loss of the BGP session. For this to work, OSPF has to detect the failure and install the backup path before layer 1 or layer 2 reports the interface loss. To improve OSPF convergence, use the following techniques:
- Fast OSPF neighbor loss detection configured with the ip ospf dead-interval minimal interface configuration command, or BFD-assisted neighbor loss detection.
- Fast SPF response configured with the timers throttle spf router configuration command.
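The two techniques above can be sketched as follows. The fast-hello and SPF throttle values are the ones used in the A1 configuration at the end of this article. The BFD variant is an assumption: its timer values are illustrative, and whether BFD is supported on a given platform and interface type has to be checked separately.

```
interface Serial1/0
 ! 1-second dead interval with 5 hellos per second
 ip ospf dead-interval minimal hello-multiplier 5
!
router ospf 1
 ! initial SPF delay 2 ms, hold time 1000 ms, maximum wait 2000 ms
 timers throttle spf 2 1000 2000
!
! Alternative (assumed, not part of the tested setup): BFD-assisted detection
interface Serial1/0
 bfd interval 50 min_rx 50 multiplier 3
 ip ospf bfd
```

Both ends of a link should use matching OSPF hello settings, as C1 does on its Serial1/0 interface in the configurations below.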
A network design in which every BGP router has two equal-cost paths into the network core is the only reliable means of preventing BGP session loss when using Fast Session Deactivation. When a BGP router has two equal-cost paths, both of them are installed in the IP routing table and thus an alternate path is used immediately following a link failure.
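A rough sketch of such a design on A1 follows. It is hypothetical: Serial1/2, its addressing, and a second uplink to C2 are not part of the tested topology, and the paths toward A2 are equal-cost only if the rest of the network is symmetric as well (for example, if A2 is also connected to both core routers with identical costs).

```
interface Serial1/0
 description Link to C1
 ip ospf 1 area 0
 ip ospf cost 64
!
! Hypothetical second uplink, not present in the test topology
interface Serial1/2
 description Hypothetical link to C2
 ip address 10.0.7.21 255.255.255.252
 encapsulation ppp
 ip ospf 1 area 0
 ip ospf cost 64
!
router ospf 1
 ! allow two equal-cost OSPF routes in the IP routing table
 maximum-paths 2
```

With both routes installed, show ip route 10.0.1.4 should list two next hops, and the loss of one uplink leaves the other route in place.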
Router configurations
The following configurations were used in the tests. All routers were running IOS release 12.2(33)SRC3.
Configuration of A1
version 12.2
service timestamps debug datetime msec
service timestamps log datetime msec
no service password-encryption
!
hostname A1
!
logging buffered 4096
!
no aaa new-model
ip subnet-zero
!
ip cef
no ip domain lookup
ip host A1 10.0.1.1
ip host C1 10.0.1.2
ip host C2 10.0.1.3
ip host A2 10.0.1.4
!
interface Loopback0
 ip address 10.0.1.1 255.255.255.255
 ip ospf 1 area 0
!
interface Serial1/0
 description Link to C1(ROUTER) s1/0
 ip address 10.0.7.9 255.255.255.252
 encapsulation ppp
 ip ospf dead-interval minimal hello-multiplier 5
 ip ospf 1 area 0
 keepalive 1
 serial restart-delay 0
!
interface Serial1/1
 description Link to A2(ROUTER) s1/1
 bandwidth 256
 ip address 10.0.7.5 255.255.255.252
 encapsulation ppp
 ip ospf 1 area 0
 keepalive 1
 serial restart-delay 0
!
router ospf 1
 log-adjacency-changes
 timers throttle spf 2 1000 2000
!
router bgp 65000
 no synchronization
 bgp log-neighbor-changes
 neighbor 10.0.1.4 remote-as 65000
 neighbor 10.0.1.4 update-source Loopback0
 neighbor 10.0.1.4 fall-over
 no auto-summary
!
ip classless
!
line con 0
 exec-timeout 0 0
 privilege level 15
 logging synchronous
 transport preferred none
 stopbits 1
line aux 0
 stopbits 1
!
ntp logging
end
Configuration of A2
version 12.2
service timestamps debug datetime msec
service timestamps log datetime msec
no service password-encryption
!
hostname A2
!
logging buffered 4096
!
no aaa new-model
ip subnet-zero
!
ip cef
no ip domain lookup
ip host A1 10.0.1.1
ip host C1 10.0.1.2
ip host C2 10.0.1.3
ip host A2 10.0.1.4
!
interface Loopback0
 ip address 10.0.1.4 255.255.255.255
 ip ospf 1 area 0
!
interface Serial1/0
 description Link to C2(ROUTER) s1/0
 ip address 10.0.7.14 255.255.255.252
 encapsulation ppp
 ip ospf 1 area 0
 keepalive 1
 serial restart-delay 0
!
interface Serial1/1
 description Link to A1(ROUTER) s1/1
 bandwidth 256
 ip address 10.0.7.6 255.255.255.252
 encapsulation ppp
 ip ospf 1 area 0
 keepalive 1
 serial restart-delay 0
!
router ospf 1
 log-adjacency-changes
!
router bgp 65000
 no synchronization
 bgp log-neighbor-changes
 neighbor 10.0.1.1 remote-as 65000
 neighbor 10.0.1.1 update-source Loopback0
 neighbor 10.0.1.1 fall-over
 no auto-summary
!
ip classless
!
line con 0
 exec-timeout 0 0
 privilege level 15
 logging synchronous
 transport preferred none
 stopbits 1
!
ntp logging
end
Configuration of C1
version 12.2
service timestamps debug datetime msec
service timestamps log datetime msec
no service password-encryption
!
hostname C1
!
logging buffered 4096
!
no aaa new-model
ip subnet-zero
!
ip cef
no ip domain lookup
ip host A1 10.0.1.1
ip host C1 10.0.1.2
ip host C2 10.0.1.3
ip host A2 10.0.1.4
!
interface Loopback0
 ip address 10.0.1.2 255.255.255.255
!
interface Serial1/0
 description Link to A1(ROUTER) s1/0
 ip address 10.0.7.10 255.255.255.252
 encapsulation ppp
 ip ospf dead-interval minimal hello-multiplier 5
 keepalive 1
 serial restart-delay 0
!
interface Serial1/1
 description Link to C2(ROUTER) s1/1
 ip address 10.0.7.17 255.255.255.252
 encapsulation ppp
 ip ospf dead-interval minimal hello-multiplier 5
 keepalive 1
 serial restart-delay 0
!
router ospf 1
 log-adjacency-changes
 network 0.0.0.0 255.255.255.255 area 0
!
ip classless
!
line con 0
 exec-timeout 0 0
 privilege level 15
 logging synchronous
 transport preferred none
 stopbits 1
!
ntp logging
end
Configuration of C2
version 12.2
service timestamps debug datetime msec
service timestamps log datetime msec
no service password-encryption
!
hostname C2
!
logging buffered 4096
!
no aaa new-model
ip subnet-zero
!
ip cef
no ip domain lookup
ip host A1 10.0.1.1
ip host C1 10.0.1.2
ip host C2 10.0.1.3
ip host A2 10.0.1.4
!
interface Loopback0
 ip address 10.0.1.3 255.255.255.255
!
interface Serial1/0
 description Link to A2(ROUTER) s1/0
 ip address 10.0.7.13 255.255.255.252
 encapsulation ppp
 keepalive 1
 serial restart-delay 0
!
interface Serial1/1
 description Link to C1(ROUTER) s1/1
 ip address 10.0.7.18 255.255.255.252
 encapsulation ppp
 ip ospf dead-interval minimal hello-multiplier 5
 keepalive 1
 serial restart-delay 0
!
router ospf 1
 log-adjacency-changes
 network 0.0.0.0 255.255.255.255 area 0
!
ip classless
!
line con 0
 exec-timeout 0 0
 privilege level 15
 logging synchronous
 transport preferred none
 stopbits 1
!
ntp logging
end


