Case Study 4 - OmanTel: Explosion in AS Path Count, Hours of BGP Churn
The AS path changes graph for Oman caught our attention because it departs from the general pattern: instead of going down because of loss of connectivity, the number of distinct AS paths, as observed from RRC03 peers in the eight-hourly RIB dumps, went up. The average AS path length also increased for the duration of the cable outages.
Number of AS Paths And Average Path Length For Oman
The RIR stats files show only one AS assigned directly to a provider in Oman: AS28885 - OmanTel NAP. In January 2008, RIS saw 26 prefixes originated by this AS. Looking at the raw data, we noticed that these prefixes are usually announced to the RIS peers in batches: one BGP update message carries the bulk of the OmanTel prefixes, and one or two other updates carry the rest. From RIS data alone it is not possible to deduce with 100% certainty the reasons behind the observed behaviour; however, we suspect that specific routing policies related to the networks served by the prefixes play a role.
In relation to the cable outages we see the following:
* Before the cable fault each collector peer received the same AS path in all update messages for AS28885. This indicates the same routing policy for all prefixes.
* During the cable outage the origin AS apparently used a different policy for different sets of prefixes. As a result, the collector peers received different AS paths in the update messages for the full set of Oman prefixes.
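The two observations above amount to a simple check: group the routes a collector peer reports for AS28885 by AS path and count the groups. The following is a minimal sketch of that check, not the tooling used for this analysis. The function name, the tuple layout, the documentation-range prefixes, and the "before" path are all assumptions made for illustration; only the three "during" AS paths are taken from the updates quoted in this case study.

```python
from collections import defaultdict

def distinct_paths_by_peer(routes):
    """Map each peer to the set of AS paths it reports.

    routes: iterable of (peer, prefix, as_path) tuples (hypothetical layout).
    """
    seen = defaultdict(set)
    for peer, _prefix, as_path in routes:
        seen[peer].add(tuple(as_path))
    return dict(seen)

# Illustrative "before" state: one path for every prefix (path chosen arbitrarily).
before = [
    ("AS1103", "192.0.2.0/24", [1103, 3549, 3491, 8529, 28885]),
    ("AS1103", "198.51.100.0/24", [1103, 3549, 3491, 8529, 28885]),
    ("AS1103", "203.0.113.0/24", [1103, 3549, 3491, 8529, 28885]),
]

# "During" state: the three AS paths quoted in this case study, placeholder prefixes.
during = [
    ("AS1103", "192.0.2.0/24", [1103, 1273, 6762, 8529, 8529, 8529, 8529, 28885]),
    ("AS1103", "198.51.100.0/24", [1103, 1273, 6762, 8529, 28885]),
    ("AS1103", "203.0.113.0/24", [1103, 3549, 3491, 8529, 28885]),
]

for label, routes in (("before", before), ("during", during)):
    for peer, paths in distinct_paths_by_peer(routes).items():
        print(label, peer, "distinct AS paths:", len(paths))
```

A peer with a count of one sees a uniform policy for all prefixes; a count above one indicates that different sets of prefixes arrive over different paths.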
For example, immediately before the RIB dump at 15:00 (UTC), 30 January, peer 18.104.22.168 AS1103 (Surfnet, Netherlands) sent RIS the following updates:
TIME: 01/30/08 14:48:59
TYPE: BGP4MP/MESSAGE/Update
FROM: 22.214.171.124 AS1103
TO: 126.96.36.199 AS12654
ORIGIN: IGP
ASPATH: 1103 1273 6762 8529 8529 8529 8529 28885
NEXT_HOP: 188.8.131.52
ANNOUNCE 184.108.40.206/16

TIME: 01/30/08 14:48:59
TYPE: BGP4MP/MESSAGE/Update
FROM: 220.127.116.11 AS1103
TO: 18.104.22.168 AS12654
ORIGIN: IGP
ASPATH: 1103 1273 6762 8529 28885
NEXT_HOP: 22.214.171.124
ANNOUNCE 126.96.36.199/19 188.8.131.52/20 184.108.40.206/20 220.127.116.11/18 18.104.22.168/18 22.214.171.124/19 126.96.36.199/19 188.8.131.52/19 184.108.40.206/21 220.127.116.11/21 18.104.22.168/20 22.214.171.124/22 126.96.36.199/23 188.8.131.52/23 184.108.40.206/18 220.127.116.11/21 18.104.22.168/23 22.214.171.124/22 126.96.36.199/22

TIME: 01/30/08 14:48:59
TYPE: BGP4MP/MESSAGE/Update
FROM: 188.8.131.52 AS1103
TO: 184.108.40.206 AS12654
ORIGIN: IGP
ASPATH: 1103 3549 3491 8529 28885
NEXT_HOP: 220.127.116.11
ANNOUNCE 18.104.22.168/24 22.214.171.124/24 126.96.36.199/24 188.8.131.52/24

TIME: 01/30/08 14:56:37
TYPE: BGP4MP/MESSAGE/Update
FROM: 184.108.40.206 AS1103
TO: 220.127.116.11 AS12654
ORIGIN: IGP
ASPATH: 1103 3549 3491 8529 28885
NEXT_HOP: 18.104.22.168
ANNOUNCE 22.214.171.124/24 126.96.36.199/24 188.8.131.52/24 184.108.40.206/24

TIME: 01/30/08 14:57:07
TYPE: BGP4MP/MESSAGE/Update
FROM: 220.127.116.11 AS1103
TO: 18.104.22.168 AS12654
ORIGIN: IGP
ASPATH: 1103 3549 3491 8529 28885
NEXT_HOP: 22.214.171.124
ANNOUNCE 126.96.36.199/24 188.8.131.52/24 184.108.40.206/24 220.127.116.11/24
So instead of the same path repeated three times, this peer gave RIS three different AS paths in the 15:00 (UTC) RIB dump. As other RIS peers experienced similar behaviour, the total number of distinct AS paths increased by a factor of 2.5 for this period.
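The count of three can be reproduced from the text form of the updates. The sketch below assumes a dump in which each record carries an `ASPATH:` field followed by a non-numeric field name, as in the updates quoted above. The regular expression, the function name, and the placeholder next-hop and prefixes in the sample are illustrative assumptions. It counts the paths seen in updates, which matches the RIB dump here only because the later updates repeat a path already seen.

```python
import re

# An ASPATH field is a run of AS numbers separated by whitespace.
ASPATH_RE = re.compile(r"ASPATH:\s*((?:\d+\s+)*\d+)")

def distinct_as_paths(dump_text):
    """Return the set of distinct AS paths (as tuples of ints) found in dump_text."""
    return {
        tuple(int(asn) for asn in match.group(1).split())
        for match in ASPATH_RE.finditer(dump_text)
    }

# Abbreviated records: AS paths from the quoted updates, placeholder addresses.
sample = """
TIME: 01/30/08 14:48:59 ASPATH: 1103 1273 6762 8529 8529 8529 8529 28885 NEXT_HOP: 192.0.2.1 ANNOUNCE 198.51.100.0/24
TIME: 01/30/08 14:48:59 ASPATH: 1103 1273 6762 8529 28885 NEXT_HOP: 192.0.2.1 ANNOUNCE 203.0.113.0/24
TIME: 01/30/08 14:48:59 ASPATH: 1103 3549 3491 8529 28885 NEXT_HOP: 192.0.2.1 ANNOUNCE 192.0.2.0/24
TIME: 01/30/08 14:56:37 ASPATH: 1103 3549 3491 8529 28885 NEXT_HOP: 192.0.2.1 ANNOUNCE 192.0.2.0/24
TIME: 01/30/08 14:57:07 ASPATH: 1103 3549 3491 8529 28885 NEXT_HOP: 192.0.2.1 ANNOUNCE 192.0.2.0/24
"""

paths = distinct_as_paths(sample)
print(len(paths), "distinct AS paths")
```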
Routing States of a Prefix Originated by AS28885 - BGPlay screenshots
We looked at the routing dynamics of the prefix 18.104.22.168/18 (originated by AS28885), using BGPlay.
04:03 (UTC), 30 January 2008: Before the cable outages; primary transits for Oman are AS3491 (PCCW) and AS6762 (Telecom Italia Sparkle). Note how the purple histogram on the left indicates a period of prolonged, continuous BGP updates is about to begin. Over 10,000 messages were recorded in 90 hours, which means that on average the collective of all RIS peers saw an announcement or withdrawal every 30 seconds.
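The update rate quoted above follows from simple arithmetic. A quick check, treating the message count as exactly 10,000 (the text says "over 10,000", so the true interval is slightly shorter):

```python
# Figures taken from the text; the exact message count is an assumption.
messages = 10_000
hours = 90

seconds_per_message = hours * 3600 / messages
print(f"one update every {seconds_per_message:.1f} seconds on average")
```

This gives roughly 32 seconds between messages, consistent with the rounded figure of one every 30 seconds.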
08:23 (UTC), 30 January 2008: Shortly after the second cable went down; first signs of rerouting are visible.
15:59 (UTC), 30 January 2008: Immediately before the second RIB dump of the day. Many peers have switched from using AS3491 to AS6762 as transit to Oman. From the graph we can see most of these peers need more AS hops to reach AS28885, thus the average AS path length increases.
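To make the effect on path length concrete, the sketch below computes the average length of the three AS paths that AS1103 reported, both as BGP counts it (prepended repetitions included) and as a count of distinct consecutive ASes. Which definition the graphs in this study use is not stated, so both are shown; the function names are illustrative.

```python
from itertools import groupby
from statistics import fmean

def path_length(as_path):
    """Length as BGP counts it: every entry, including prepends."""
    return len(as_path)

def hop_count(as_path):
    """Number of ASes traversed: consecutive repeats collapsed into one hop."""
    return sum(1 for _asn, _run in groupby(as_path))

# The three AS paths AS1103 reported for AS28885, as quoted in this case study.
as1103_paths = [
    [1103, 1273, 6762, 8529, 8529, 8529, 8529, 28885],
    [1103, 1273, 6762, 8529, 28885],
    [1103, 3549, 3491, 8529, 28885],
]

avg_length = fmean(path_length(p) for p in as1103_paths)
avg_hops = fmean(hop_count(p) for p in as1103_paths)
print("average length with prepends:", avg_length)
print("average distinct hops:", avg_hops)
```

For these three paths, prepending alone raises the average length even though every path traverses the same number of ASes.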
23:39 (UTC), 30 January 2008: Before the last RIB dump of the day. AS24493 (STIXLITE Transit Service Provider Singapore) has taken over the transit for most peers who first used AS6762 (Telecom Italia Sparkle).
15:44 (UTC), 31 January 2008: Yet another routing state. AS6762 is used by more peers than ever, AS24493 (Singapore) is still strong and AS3491 (PCCW) is the least preferred transit provider.
The case of OmanTel shows how a combination of (likely) routing policy and an explosion in BGP activity increased the routing topology entropy for AS28885. The number of observed distinct AS paths for the prefixes announced from Oman doubled and the average AS path length increased by 20%. Because there was a constant high rate of changes, we cannot be sure whether BGP ever converged in that period, or whether the routes that were seen could really be used.