
Changes to RIPE Routing Working Group Recommendations On Route-flap Damping

Legend (+) Added (-) Deleted
Changed Tag Added Tag Deleted
ripe-378: This document discusses Route-flap Damping and recommends acceptable practices for ISPs who are considering deploying Route-flap Damping.
ripe-580: RIPE Routing Working Group Recommendations on Route Flap Damping
Introduction
History and Background
Analysis
Recommendations
References and Further Reading

Abstract

(-) This document discusses Route-flap Damping and recommends acceptable practices for ISPs who are considering deploying Route-flap Damping.

(-) 1.0 Introduction
(-) 1.1 Background
(-) 1.2 Coordination of Route Flap Damping Parameters
(-) 2.0 Current Status of Route-flap Damping
(-) 2.1 Impeded Convergence
(-) 2.2 Updates transiting the network
(-) 3.0 Solutions
(-) 4.0 Recommendation
(-) 5.0 Conclusion
(-) 6.0 Acknowledgements
(-) 7.0 References

Introduction

Route-flap Damping (RFD) [1] is a mechanism for BGP speaking routers that penalises prefixes that exhibit a large number of updates (‘flapping’), and suppresses a route when the accumulated penalty exceeds a given threshold. The penalty decays over time until it reaches a lower threshold, at which point the route is unsuppressed. RFD is intended to improve the overall stability of the Internet routing table and reduce the load on the CPUs of the core BGP speaking routers. In ripe-378 [2] it was stated that, due to the dynamics of BGP, the default configurations of flap damping can do more harm than good, as they may suppress a prefix after it has flapped only a few times. Consequently, RFD was deprecated due to the problem of over-damping (see [2] for more details).

(-) Unfortunately, due to the dynamics of the protocol, especially a phenomenon called ‘path hunting,’ common simple configurations of flap damping can do more harm than good, see [3,4].
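The penalty mechanism described above can be sketched in a few lines of Python. This is an illustrative model only, not any vendor's implementation. The class name DampedRoute and its methods are hypothetical. The per-flap penalty of 1000 and the suppress threshold of 2000 are the default figures quoted in this document, while the reuse threshold of 750 and the 15-minute half-life are assumed values chosen for illustration.

```python
class DampedRoute:
    """Toy model of route-flap damping state for one prefix (hypothetical)."""

    def __init__(self, suppress=2000.0, reuse=750.0,
                 half_life_s=900.0, flap_penalty=1000.0):
        self.suppress = suppress          # penalty above which the route is suppressed
        self.reuse = reuse                # assumed lower threshold for unsuppressing
        self.half_life_s = half_life_s    # assumed time for the penalty to halve
        self.flap_penalty = flap_penalty  # penalty added per flap
        self.penalty = 0.0
        self.suppressed = False
        self.last_update_s = 0.0

    def advance(self, now_s):
        """Decay the penalty exponentially up to time now_s (seconds)."""
        elapsed = now_s - self.last_update_s
        self.penalty *= 0.5 ** (elapsed / self.half_life_s)
        self.last_update_s = now_s
        # Once the penalty has decayed below the reuse threshold,
        # the route becomes usable again.
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False

    def flap(self, now_s):
        """Record one flap at time now_s and update the suppression state."""
        self.advance(now_s)
        self.penalty += self.flap_penalty
        if self.penalty > self.suppress:
            self.suppressed = True


route = DampedRoute()
for t in (0, 60, 120):   # three flaps, one minute apart
    route.flap(t)
    print(t, round(route.penalty), route.suppressed)
```

With these assumed values the route is suppressed on its third flap, which illustrates the concern that a low threshold reacts after only a few flaps.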

A small number of prefixes on the Internet continue to flap rapidly and cause a disproportionate number of BGP updates and load on BGP speaking routers. This document uses experimental data gathered from an operational environment to suggest changes to the RFD parameters to suppress the prefixes that flap the most, while minimising the suppression of other prefixes.

This document suggests parameters which would make RFD usable and is based on the work of Cristel Pelsser, Olaf Maennel, Pradosh Mohapatra, Randy Bush, and Keyur Patel presented at PAM 2011 [3].

History and Background

In the early 1990s the accelerating growth in the number of prefixes being announced to the Internet (often due to inadequate prefix aggregation), the denser meshing through multiple inter-provider paths, and increased instabilities started to cause significant impact on the performance and efficiency of some Internet backbone routers. Every time a routing prefix altered state because of a single line-flap, the withdrawal was advertised to the whole Internet BGP-Speaking Zone (BSZ) and handled by every single router that carried the full Internet routing table.

(-) Every time a routing prefix became unreachable because of a single line-flap, the withdrawal was advertised to the whole core Internet and handled by every single router that carried the full Internet routing table.

The load this placed on the control planes of routers caused further instability, as the routers were not able to process other BGP updates or they dropped traffic transiting the device. This could produce cyclic crashing behaviour.

(-) It was soon realized that the increasing routing churn created significant processing load on routing engines, sometimes sufficiently high load to cause router crashes.

To overcome this situation RFD was developed in 1993 and has since been integrated into most router BGP software implementations. RFD is described in detail in RFC 2439 [1].

(-) RFD is now used in many service provider networks in the Internet.

(-) 1.2 Coordination of flap damping parameters

When RFD was first implemented in commercial routers, vendor implementations had different default values and different characteristics. As this inconsistency would result in different rates of flap damping, and therefore introduce inconsistent path selection and thus behavior that was very hard to diagnose, the ISP operator community introduced a consistent set of recommendations for flap damping parameters, so that ISPs deploying RFD would treat flapping prefixes in the same way.

This call for consistency resulted in the RIPE Routing Working Group producing first ripe-178, then ripe-210, and finally the ripe-229 documents [2a]. The parameters documented in ripe-229 were considered, at time of publication in 2001, the best current practice. In 2006, this was reviewed again and resulted in ripe-378 [2], which recommended disabling RFD because it created more harm than good.

(-) This call for consistency resulted in the RIPE Routing Working Group producing first ripe-178, then ripe-210, and finally the ripe-229 documents [2], following consensus of the Routing Working Group.

Analysis

In the work by Pelsser et al. [3], it is shown that 3% of all prefixes cause 36% of BGP updates, and just 0.01% of the prefixes cause 10% of the BGP updates. The aim is to penalise only those prefixes with excessive numbers of updates.
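As a rough illustration of how such a concentration figure can be computed from per-prefix update counts, the sketch below reports the share of all updates caused by the busiest fraction of prefixes. The function name share_of_updates and the counts are made up for illustration; they are not data from [3].

```python
import math


def share_of_updates(update_counts, top_fraction):
    """Fraction of all updates caused by the busiest `top_fraction` of prefixes."""
    counts = sorted(update_counts.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Always include at least one prefix, rounding the group size up.
    top_n = max(1, math.ceil(len(counts) * top_fraction))
    return sum(counts[:top_n]) / total


# Made-up counts using documentation prefixes; not measured data.
example_counts = {
    "192.0.2.0/24": 900,
    "198.51.100.0/24": 50,
    "203.0.113.0/24": 30,
    "2001:db8::/32": 20,
}
print(share_of_updates(example_counts, 0.25))
```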

The default values used in current implementations of RFD apply a penalty of 1000 each time a route flaps, and suppress the prefix when the penalty exceeds a figure in the region of 2000 (Cisco IOS) or 3000 (Juniper JunOS).

The table shows the percentage of prefixes above the suppress threshold and the percentage reduction in BGP churn for various values of suppress threshold. The current default suppress value of 2000 reduces BGP churn by 47%, but it suppressed 14% of the prefixes at some point over the lifetime of the experiment. Significantly larger values of suppress threshold such as 12000, 15000 or 18000 still reduced BGP churn, but suppressed far fewer prefixes, which is believed to reduce the risk of penalising otherwise well-behaved prefixes.

Suppress Threshold    % prefixes suppressed    % reduction in BGP churn compared with no damping
2000                  14                       47
4000                  4.2                      26
6000                  2.1                      19
12000                 0.63                     11.26
15000                 0.44                     9.51
18000                 0.32                     8.12
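To get a feel for what the suppress thresholds in the table mean in practice, the sketch below counts how many evenly spaced flaps a prefix needs before its penalty exceeds a given threshold. It is a simplified model: the function name flaps_to_suppress is hypothetical, the per-flap penalty of 1000 is the figure quoted in this document, and the 15-minute half-life is an assumed value.

```python
def flaps_to_suppress(threshold, interval_s, half_life_s=900.0,
                      flap_penalty=1000.0, max_flaps=10_000):
    """Number of flaps, spaced interval_s seconds apart, until the penalty
    exceeds threshold. Returns None if it is not exceeded within max_flaps."""
    # Fraction of the penalty that survives between two consecutive flaps.
    decay_factor = 0.5 ** (interval_s / half_life_s)
    penalty = 0.0
    for n in range(1, max_flaps + 1):
        penalty = penalty * decay_factor + flap_penalty
        if penalty > threshold:
            return n
    return None


for threshold in (2000, 4000, 6000, 12000, 15000, 18000):
    print(threshold,
          flaps_to_suppress(threshold, interval_s=60),
          flaps_to_suppress(threshold, interval_s=120))
```

Under these assumptions a prefix flapping once a minute crosses a threshold of 2000 on its third flap, while a prefix flapping every two minutes never reaches 12000, because the penalty decays between flaps almost as fast as it accumulates.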
Recommendations

In order to punish the biggest offenders (those prefixes that flap the most) without punishing others, the RIPE Routing-WG recommends that vendors raise the maximum suppress threshold in router implementations to 50,000 and that operators configure a suppress threshold value of at least 6,000. Vendors might also change the default suppress threshold to 6,000, but this might surprise operators who use the default.

This has a number of advantages:

  • it is easy to implement
  • it will reduce the churn compared to the situation we have now where no RFD is applied
  • it spares the smaller offenders.

Changing the default suppress threshold could result in an increase in forwarding table size or announcement rate for operators who use RFD with the default settings. This warrants further discussion.

(-) 2.0 Current Status of Route-flap Damping

(-) Research in the years following the introduction of RFD into BGP implementations, and the publication of the RIPE Routing Working Group recommendations, has demonstrated that there are real and significant problems with RFD as deployed on the Internet today.

(-) 2.1 Impeded Convergence

(-) Perhaps the best known work highlighting major problems with RFD is that by Zhuoqing Mao and colleagues, presented at Sigcomm in 2002. Following presentations by Randy Bush and colleagues explain the research work more accessibly.

(-) The major issue is that if one path is withdrawn, all BGP speakers will use best path selection to pick the next best path, and advertise this best path to all their neighbours. These neighbours will see a change in path; a change in path is a change in attribute, so the prefix as seen on a neighbouring router will attract a flap penalty - even though that path is perfectly valid and there has been no disappearance of the prefix from the routing table [5].

(-) And this path "hunting" goes on throughout the Internet - a simple prefix withdrawal can result in the appearance of a major flap event a few AS hops away in the Internet, with the result that vendor default and even the RIPE-229 recommended flap damping parameters will mark the prefix to be suppressed. While the operator can see this is an error, the routers are simply reacting to the circumstances presented to them.

(-) 2.2 Updates transiting the network

(-) Problems are not just caused by path "hunting". Each implementation of BGP either has differing values of the Minimum Route Advertisement Interval (MRAI) Timer (the amount of time a router waits before passing on a route update) or does not implement MRAI at all in favour of the vendor's own throttling algorithm.

(-) Some implementations pass on the update without waiting at all, others wait for 30 seconds, etc. These differences mean that update messages transiting different ASNs using different vendor equipment will arrive at the target router at different times. This router will see these different messages, and will consider each one for best path options. This will more than likely result in a different best path offered to its neighbours for each message update arriving.

(-) The result of this is that a simple update message from one ASN would be seen as a multiple route flap event a few ASN hops away - when in fact there was no instability whatsoever. There have been actual measurements where this resulted in a single prefix withdrawal producing 41 BGP events a few hops away!

(-) Not only is the MRAI timer a potential source of problems, but also differences in CPU loadings and CPU speed will result in different update times for prefix announcements passing from router to router. These differences will also contribute to the effects described above.

(-) 3.0 Solutions

(-) Possible solutions to the problems summarised above have been proposed and analysed in the work by Zhuoqing Mao and colleagues.

(-) However, despite publication in 2002, there has since then been no desire expressed from the ISP industry for these modifications to be made to the BGP implementations. Nor has there been any activity by the BGP implementors to enhance their flap damping implementations to follow those recommendations.

(-) As the power of routers has increased, the original need for BGP Flap Damping is no longer a major concern for operators or router equipment vendors as it was in the mid-1990s when route flapping consumed a significant percentage of the CPU of early routers. In fact, the negative effects of RFD, as described above, have become the major concern, the cure has become worse than the disease!

(-) 4.0 Recommendation

(-) This Routing Working Group document proposes that with the current implementations of BGP flap damping, the application of flap damping in ISP networks is NOT recommended. The recommendations given in ripe-229 and previous documents [2] are considered obsolete henceforth.

(-) If flap damping is implemented, the ISP operating that network will cause side-effects to their customers and the Internet users of their customers' content and services as described in the previous sections. These side-effects would quite likely be worse than the impact caused by simply not running flap damping at all.

(-) 5.0 Conclusion

(-) With current vendor implementations, BGP flap damping is harmful to the reachability of prefixes across the Internet. We would like to encourage more work to correct some of the issues highlighted by the work of Mao et al [3], to allow the viewing of prefix flap statistics without applying flap damping, and permit more flexible per eBGP neighbour damping configuration features for network operators.

(-) 6.0 Acknowledgements

(-) We would like to acknowledge valuable contributions and feedback from Randy Bush.
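An operator weighing a suppress threshold against the recommendation of at least 6,000 may also want to confirm that the penalty can actually reach that threshold. RFC 2439 [1] describes a ceiling on the penalty derived from the maximum hold-down time. The sketch below uses a simplified reading of that ceiling (reuse threshold multiplied by 2 raised to the ratio of maximum suppress time to half-life), and real implementations may compute or cap the penalty differently. The function names and the default values of 750, 60 minutes and 15 minutes are assumptions for illustration.

```python
RECOMMENDED_MIN_SUPPRESS = 6000  # minimum suppress threshold recommended in this document


def penalty_ceiling(reuse, max_suppress_s, half_life_s):
    """Largest penalty that still decays to `reuse` within max_suppress_s
    (simplified reading of the ceiling described in RFC 2439)."""
    return reuse * 2 ** (max_suppress_s / half_life_s)


def check_suppress_threshold(suppress, reuse=750.0,
                             max_suppress_s=3600.0, half_life_s=900.0):
    """Return a list of problems found with a candidate suppress threshold."""
    problems = []
    if suppress < RECOMMENDED_MIN_SUPPRESS:
        problems.append("below the recommended minimum of 6000")
    if suppress >= penalty_ceiling(reuse, max_suppress_s, half_life_s):
        problems.append("not below the penalty ceiling, so the route "
                        "could never be suppressed")
    return problems


for candidate in (2000, 6000, 15000):
    print(candidate, check_suppress_threshold(candidate))
```

With the assumed values the ceiling works out to 12000, so a threshold of 6000 passes both checks, whereas a threshold of 15000 would only take effect if the maximum suppress time were lengthened as well.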

References and Further Reading

[1] Curtis Villamizar, Ravi Chandra, Ramesh Govindan
RFC2439: BGP Route-flap Damping (Proposed Standard)
ftp://ftp.ietf.org/rfc/rfc2439.txt

[2] Most recent RIPE Document
ftp://ftp.ripe.net/ripe/docs/ripe-378.txt

[2a] Older RIPE Documents
ftp://ftp.ripe.net/ripe/docs/ripe-178.txt
ftp://ftp.ripe.net/ripe/docs/ripe-210.txt
ftp://ftp.ripe.net/ripe/docs/ripe-229.txt

[3] Cristel Pelsser, Olaf Maennel, Pradosh Mohapatra, Randy Bush and Keyur Patel
"Route Flap Damping Made Usable"
PAM 2011, March 2011
http://www.iij-ii.co.jp/en/lab/researchers/cristel/publications/Pelsser-RFD-PAM2011.pdf

[4] Zhuoqing Mao, Ramesh Govindan, George Varghese, Randy Katz
Route-flap Damping Exacerbates Internet Routing Convergence
SIGCOMM 2002
http://www.eecs.umich.edu/~zmao/Papers/sig02.pdf

[5] Randy Bush, Tim Griffin, Zhuoqing Mao
Route-flap Damping: Harmful?
NANOG 26
http://www.nanog.org/mtg-0210/ppt/flap.pdf

[6] Craig Labovitz, Abha Ahuja, Abhijit Bose, Farnam Jahanian
Delayed Internet Routing Convergence
SIGCOMM 2000
http://conferences.sigcomm.org/sigcomm/2000/conf/paper/sigcomm2000-5-2.pdf
(-) http://www.acm.org/sigs/sigcomm/sigcomm2000/conf/paper/sigcomm2000-5-2.pdf
