
Re: Draft of Route-Flap Dampening Paper

  • To: "Christian Panigl, ACOnet/UniVie" < >
  • From: Curtis Villamizar < >
  • Date: Tue, 23 Sep 1997 12:43:12 -0400
  • Reply-to:

In message <009BAB0A.A37D7B9E.15@localhost>, "Christian Panigl, ACOnet/UniVie"
writes:
>     
> 1.4 Motivation for coordinated parameters
> 
>     There is a strong need for the coordinated use of dampening parameters,
>     for several reasons:
>     
>     Coordination of "progressiveness":
>     
>     If the boundaries for different treatment of longer prefixes and the
>     penalties are not coordinated throughout the Internet, route-flap
>     dampening could even lead to additional flapping or temporary
>     routing-loops because longer prefixes might already be re-announced
>     through some parts of the Internet where shorter prefixes are still held
>     down through other paths.

This is not true.  If route flap damping is only applied to EBGP
routes there are no problems except long secondary paths getting used.
Some more specifics will be blocked in a few places and not in others,
and traffic will follow whatever route remains.  If all of the more
specifics are lost, an aggregate will be followed and traffic will
either be blackholed at the aggregator or get to the destination.

I don't see any opportunity for routing loops.  I don't see any issue
at all with less specifics being withdrawn and more specifics
remaining as described above (I assume you meant the opposite).
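
To illustrate the point about more specifics being blocked in some
places and not in others, here is a minimal sketch of longest-prefix
matching.  The prefixes, the router tables and the names best_route,
router_a and router_b are hypothetical and chosen only for illustration.
Each router forwards on the most specific route it still holds, so a
router that has suppressed the /24 simply falls back to the covering
aggregate.

```python
import ipaddress

def best_route(dest, routes):
    """Return the most specific route in `routes` covering `dest`, or None."""
    addr = ipaddress.ip_address(dest)
    covering = [r for r in routes if addr in r["prefix"]]
    # Longest prefix wins; None if nothing covers the destination.
    return max(covering, key=lambda r: r["prefix"].prefixlen, default=None)

# Hypothetical routes: an aggregate and one more specific inside it.
aggregate = {"prefix": ipaddress.ip_network("10.1.0.0/16"), "via": "aggregator"}
specific = {"prefix": ipaddress.ip_network("10.1.2.0/24"), "via": "customer-path"}

router_a = [aggregate, specific]  # does not damp the /24
router_b = [aggregate]            # has the /24 suppressed by damping

print(best_route("10.1.2.5", router_a)["via"])  # follows the more specific
print(best_route("10.1.2.5", router_b)["via"])  # follows the aggregate
```

In this sketch neither router ends up without a route, they only pick
different ones.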

>     Coordination of "aggressiveness":
> 
>     If an upstream or peering provider dampens more aggressively (e.g.
>     triggered by fewer flaps or applying longer hold-down timers) than an
>     access provider does towards its customers, the result is a very
>     inconsistent situation in which a flapping network might still be able
>     to reach "near-line" parts of the Internet.  Debugging such
>     instabilities is then much harder, because the effect leads the
>     customer to assume that there is a problem "somewhere" in the
>     "upstream" Internet, instead of making him just call his ISP's hotline
>     and complain that he can't get out any longer.
>     
>     Further, after successful repair of the problem the access-provider can
>     easily clear the flap-dampening for his customer on his local router
>     instead of needing to contact upstream NOCs all over the Internet to get
>     the dampening cleared.

This would be an argument in favor of very aggressive damping of one's
own customer routes, which is unlikely to be a good idea.

> 2. Recommended dampening parameters
> 
> 2.1 Motivation for recommendation
> 
>     At RIPE26 and 27 Christian Panigl presented the following network
>     backbone maintenance example from his own experience, which triggered
>     flap dampening in some upstream and peering ISPs' routers for all his
>     and his customers' /24 prefixes for more than 3 hours because of too
>     "aggressive" parameters:
>     
>     scheduled SW upgrade of backbone router failed:
>     
>       - reload after SW upgrade       1 flap
>       - new SW crashed                1 flap
>       - reload with old SW            1 flap
>                                       ------
>                                       3 flaps within 10 minutes
> 					
>     which resulted in the following dampening scenario at some boundaries
>     with progressive route-flap dampening enabled:
>     
>     Prefix length:	/24	/19	/16
>     suppress time:	~3h	45-60'	<30'
>     
>     Therefore, in the Routing-WG session at RIPE27, it was agreed that
>     suppression should not start until the 4th flap in a row and that the
>     maximum suppression should in no case last longer than 1 hour from the
>     last flap.
> 	 
>     It was agreed that a recommendation from RIPE would be desirable.  Given
>     that the current allocation policies are expected to hold for the
>     foreseeable future, it was suggested that all /19's or shorter prefixes
>     are not penalised harder than current Cisco default dampening does.
>     
>     With those suggestions in mind, Tony Barber designed the following set
>     of route-flap dampening parameters, which have proved to work smoothly
>     in his environment for a couple of months.
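
For reference, the suppress times quoted above come from an
exponentially decaying penalty.  The following is a rough sketch of
that arithmetic, not the draft's recommended parameter set.  It assumes
the commonly cited Cisco defaults (penalty 1000 per flap, half-life 15
minutes, suppress limit 2000, reuse limit 750), treats every flap as one
full penalty, places the three flaps at 0, 5 and 10 minutes (the draft
only says "within 10 minutes"), and ignores the maximum suppress time.

```python
import math

# Assumed values (commonly cited Cisco defaults), not taken from the draft.
HALF_LIFE_MIN = 15.0
PENALTY_PER_FLAP = 1000.0
SUPPRESS_LIMIT = 2000.0
REUSE_LIMIT = 750.0

def penalty_after(flap_times_min, now_min):
    """Sum of per-flap penalties, each halved every HALF_LIFE_MIN minutes."""
    return sum(
        PENALTY_PER_FLAP * 0.5 ** ((now_min - t) / HALF_LIFE_MIN)
        for t in flap_times_min
        if t <= now_min
    )

def minutes_until_reuse(penalty):
    """Minutes of quiet needed for `penalty` to decay to the reuse limit."""
    if penalty <= REUSE_LIMIT:
        return 0.0
    return HALF_LIFE_MIN * math.log2(penalty / REUSE_LIMIT)

flaps = [0.0, 5.0, 10.0]  # hypothetical timing of the three flaps
p = penalty_after(flaps, 10.0)
suppressed = p > SUPPRESS_LIMIT
wait = minutes_until_reuse(p)
print(f"penalty={p:.0f} suppressed={suppressed} reuse in {wait:.1f} min")
```

Under these assumptions the third flap pushes the penalty over the
suppress limit and the route comes back after roughly 25 minutes, which
is in line with the "<30'" column.  Harsher settings for longer prefixes
stretch that time.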

Why is a /24 being announced globally?  Our private peerings use a
prefix taken from one of the provider's aggregates.

The answer to this problem is to arrange things so the rest of the
world doesn't need to know about a /24 that can be taken up and down
by the software upgrade of a single router.  That's what route flap
damping can encourage and it seems to have worked in this case except
the message didn't register.

> 3. Open problems
> 
> 3.1 Multiplication of flaps through multiply interconnected ASes
> 
>     Christian Panigl recently had the following experience with a line
>     upgrade of an Ebone customer:
>     
>     - It is absolutely certain that the upgrade process generated just ONE
>       flap (disconnect router port from modem A, reconnect to modem B);
>       nevertheless the customer's prefix was dampened in all ICM routers
>       (ICM/AS1800 is US upstream for Ebone).
> 
>     - The flap statistics in the ICM routers stated *4* flaps !!!  
>     
>     - The only explanation would be that the multiple interconnections
>       between Ebone/AS1755 and ICM/AS1800 did multiply the flaps
>       (advertisements/withdrawals arrived time-shifted at ICM routers
>       through the multiple paths).

The flap damping parameters should be applied to Adj-In routes, which
are per peer.  A problem can then occur only if the AS path changes
multiple times.  The only solution to that is to keep separate data
structs for Adj-In and each observed AS path.
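
A small sketch of the difference, assuming one customer flap that is
seen as a withdrawal on four separate sessions.  The event format, the
peer names and count_flaps are hypothetical.  Counting per prefix
multiplies the flap, counting per (peer, prefix) as in Adj-In does not.

```python
from collections import defaultdict

def count_flaps(events, key):
    """Count withdrawals, grouped by whatever `key(event)` returns."""
    counts = defaultdict(int)
    for ev in events:
        if ev["type"] == "withdraw":
            counts[key(ev)] += 1
    return dict(counts)

# Hypothetical: the same single flap arrives over four interconnections.
prefix = "10.1.2.0/24"
events = [
    {"peer": f"peer-{n}", "prefix": prefix, "type": "withdraw"}
    for n in range(1, 5)
]

per_prefix = count_flaps(events, key=lambda ev: ev["prefix"])
per_peer = count_flaps(events, key=lambda ev: (ev["peer"], ev["prefix"]))

print(per_prefix)  # one counter that has reached 4
print(per_peer)    # four counters, each at 1
```

Extending the key with the observed AS path would give the separate
per-path state mentioned above.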

> 3.2 Is dampening of customer route-flaps a good idea ?
> 
>     As already explained in section 1.3 flap-dampening is at its best value
>     and most consistent and helpful if applied as near to the source of
>     the problem as possible.  Therefore flap-dampening should not only be
>     applied at peering boundaries but even more at customer boundaries !

This is highly unreasonable.  Do you really expect to shut off peer
route damping everywhere and ask [insert irresponsible and clueless
ISP name here] to damp at the customer attachment?

Don't damp the customer attachment.  Aggregate!

If the customer's connectivity gets hosed a few times, be very
persistent in reminding them that renumbering into an aggregate is an
option that will solve that problem.  Then the rest of the Internet
has fewer flapping routes to damp.

Curtis



