Re: [ipv6-wg] Re: [address-policy-wg] Re: 200 customer requirements for IPv6
To: "Oliver Bartels" <>
From: Iljitsch van Beijnum <>
Date: Tue, 6 Dec 2005 21:42:57 +0100
Cc: "" <>, "" <>
On 6-dec-2005, at 20:10, Oliver Bartels wrote:
I can't imagine what such a layer would look like...
Clustering all PI-prefixes originating at the same AS to form
a single "super-prefix" makes policy processing a lot easier,
because it needs to be done just once for the whole block.
I'm not sure I understand the "superprefix" but obviously a lot of
work that now happens per-prefix in BGP should happen per-AS. But
that's mostly moot in IPv6 as we should never reach the numbers of
prefixes per AS that we see in IPv4.
With as few as 256MByte of DDRAM plus a 64K TCAM chip it is
possible to handle max. 8 Million Forwarding entries at full 128 bit
resolution
I guess that means you throw everything in the TCAM first to get from
8M to about 125 entries and then look those up in a tree or hash table?
Obviously it's possible to build architectures that allow fast
forwarding with big tables. However, this doesn't come free: it takes
more iron to do this, and also more power. TCAMs suck down the juice
like a depressed alcoholic. This is bad for your design (both the box
itself and the datacenter), your wallet and the environment.
I personally just received a patent on such router hardware concept.
So big routing tables are good business for you, then?
Sure. But trying to aggregate on network topology is never going to
work for two very simple reasons:
1. It changes all the time
The same is true for geographical aggregation.
I guess if you live in California or another place that is plagued by
frequent earthquakes...
Geographical aggregation would require free transit, otherwise
it is not compatible with the ISP's business models.
The point is to keep the aggregation inside the ISP network. Tier-1
ISPs would still have a full routing table, but rather than have a
copy in each router, it's distributed over the network. So there is
no free transit requirement.
country boundaries.
Such boundaries are artificial, the EU tries to avoid them.
The idea behind aggregation is that you can move up or down. If
country borders get in the way, drill down a bit and start looking at
provinces or cities. In our design there are potentially 64k distinct
areas (mostly cities) so if you want, you can have a route for each
of those in your routing table and never run into country borders.
2. You can't express a topology with loops in it in an addressing
hierarchy
Avoiding loops is the job of the routing protocol, not the
topology.
??? Are we talking about spanning tree now? Loops in the topology are
good. You can't remove them. Routing is also dynamic, BTW.
Distance is actually very important. It's very hard to do decent high
speed file transfer on out of the box OSes and applications with high
delay. Also, it often makes sense to backhaul traffic over SOME
distance, but that doesn't mean it also makes sense to backhaul it
over even larger distances. I.e., even if a link to New York is
cheap, you don't want to go over Palo Alto.
If I would be located in Seattle, Palo Alto might be an alternative
way point as well as Chicago or even Dallas.
Of course. But we're in Europe. If you're in Seattle you'll see a lot
of your traffic to other people in Seattle flow through Palo Alto.
That's normal, because it's not economical to peer with everyone
everywhere. It's not so cool when intra-Seattle traffic starts to
flow through Miami.
What you are trying is:
Map a two-and-a-half-dimensional world on a one-dimensional
address range. This won't work by Math.
Dimensions can only be replaced by dimensions.
Ah, but we're not mathematicians but engineers. In software, you have
one dimensional memory. Still, you can have multidimensional arrays.
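
To make the array remark concrete, here is a minimal sketch of how a two-dimensional array is commonly laid out in one-dimensional memory (row-major order). The function names and the 3x4 size are made up for illustration; nothing here comes from either addressing proposal.

```python
# Row-major layout: one common convention for storing a 2-D array
# in flat memory. Helper names are hypothetical.

def flat_index(row, col, ncols):
    """Map a 2-D (row, col) coordinate to an offset in 1-D memory."""
    return row * ncols + col

def unflatten(index, ncols):
    """Inverse mapping: recover (row, col) from the flat offset."""
    return divmod(index, ncols)

NROWS, NCOLS = 3, 4
memory = [0] * (NROWS * NCOLS)          # the "one dimensional memory"
memory[flat_index(2, 1, NCOLS)] = 7     # behaves like array[2][1] = 7
```

The mapping is exact and reversible, but cells that are neighbours in the column direction sit `ncols` slots apart in memory, so nearness in two dimensions is not always nearness in the flat layout.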
Ask a database programmer how difficult it is to implement
a geographical search within 10km of some place on a database
and ask them about the algorithms in use.
Easy: select everything that's in a 20x20 km grid around the center
point and then do the real distance calculation on everything in that
grid. Obviously you'll select stuff that's at x+8, y+8 = ~11km from
the center but that's only true for a relatively small part of the
intermediate results.
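
The two-step search just described can be sketched as follows. This assumes points already carry x/y offsets in km on a flat local grid, which is a simplification; in a real database the first step would be a range query on indexed coordinate columns. The names and sample points are invented for the example.

```python
import math

def within_radius(points, cx, cy, radius_km):
    """points: iterable of (name, x_km, y_km) on a flat local grid (assumption)."""
    # Step 1: cheap square prefilter, side = 2 * radius, centred on (cx, cy).
    candidates = [p for p in points
                  if abs(p[1] - cx) <= radius_km and abs(p[2] - cy) <= radius_km]
    # Step 2: real distance calculation, on the survivors only.
    return [p[0] for p in candidates
            if math.hypot(p[1] - cx, p[2] - cy) <= radius_km]

places = [("a", 3, 4), ("b", 8, 8), ("c", 15, 0), ("d", -10, 0)]
result = within_radius(places, 0, 0, 10)
```

Here "b" at (8, 8) passes the square prefilter but is about 11.3 km from the center, so the second step drops it; "c" never makes it past the prefilter.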
What they try is interleaving the West-East (X) and North-South (Y)
coordinates bitwise in the search key and handle overruns by
exceptions,
That sounds like Tony Hain's geographical addressing.
The variant Michel Py and myself came up with is based on
administrative borders such as countries so you already have one
dimension: the alphabet. (Ok, not entirely how it works, but still.)
However this requires a _significant_ exception handling effort,
nothing someone would like to implement in a fast forwarding
engine for packet routing.
Geography is long gone by the time we're forwarding packets.
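
For reference, bitwise interleaving of X and Y into a single search key is usually called a Z-order or Morton key. A minimal sketch follows; the 16-bit width and the function name are assumptions, and whether this matches what the database people or any geographical addressing proposal actually do is not something this thread establishes.

```python
def interleave_bits(x, y, bits=16):
    """Build a Z-order (Morton) key from two non-negative integers."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return key

key_a = interleave_bits(7, 0)   # cell just west of a power-of-two boundary
key_b = interleave_bits(8, 0)   # its direct neighbour to the east
```

The cells (7, 0) and (8, 0) are adjacent, yet their keys are 21 and 64. That jump at power-of-two boundaries is one plausible reading of the "overruns" that need exception handling.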
Today, IPv4 routing works but it has come close to the edge of the
cliff twice (early 1990s, just before CIDR, when routing tables were too
large, and late 1990s, when flapping cost too much CPU) and it's still
pushing towards that edge, which we can't see clearly but know is out
there somewhere.
It works. Period.
Hm, if you only discern "works" / "doesn't work" it's hard to say
anything about the routing system... Some quantitative and
qualitative analysis can be helpful.
And it will continue to work, because of the
economic pressure. Engineers have found a solution, thus:
Don't worry.
Guess what. I'm an engineer. I'm working on this stuff. And I'm
saying: when de facto unlimited PI is allowed, it may not mean the
end of the internet, but it's certainly reason to worry. Of course
things will continue to work. However, they'll be less reliable and
more expensive.
So because you can't prove that you're right I should just believe
you without proof?
Yes, because the theory of computer science gives you the
proof that there are theorems in this world which can't be proven.
There are also many theorems that turn out to be false. Proof is a
pretty good method to avoid those. If we can't have proof we'll need
to have less reliable methods to avoid them. Just accepting anything is
not the solution.
The scenario that de facto unlimited PI in IPv6 will make routing
tables so large that it becomes problematic in some way or another is
entirely reasonable, on the other hand.
The current experience lets us make a reasonable and responsible
assumption that an IPv6 routing table would take not much more
growth than the current IPv4 table, whereas current technology
permits tables of 10 to 100 times that size.
Today, people sometimes deaggregate a /16. That's bad: 255
unnecessary routes. What if they do the same thing with a /32 in
IPv6? 65535 unnecessary routes. That will probably kill most existing
IPv6 routers today.
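
The route counts above can be checked with a few lines of Python. This assumes the /16 is split into /24s and the /32 into /48s, which are the target lengths that yield 255 and 65535; the mail itself does not name them. The prefixes used are placeholder blocks, and the function name is made up.

```python
import ipaddress

def extra_routes(block, new_prefix):
    """Routes added when one aggregate is announced as more-specifics instead."""
    net = ipaddress.ip_network(block)
    # Splitting doubles the route count for every extra prefix bit;
    # subtract the one aggregate route that was there anyway.
    return 2 ** (new_prefix - net.prefixlen) - 1

v4_extra = extra_routes("10.0.0.0/16", 24)       # assumed /16 -> /24 split
v6_extra = extra_routes("2001:db8::/32", 48)     # assumed /32 -> /48 split

# Cross-check the IPv4 case by actually enumerating the subnets.
v4_subnets = len(list(ipaddress.ip_network("10.0.0.0/16").subnets(new_prefix=24)))
```

Both cases add 16 prefix bits; the IPv4 figure of 255 only comes out smaller if it is counted as an 8-bit split, which is the assumption made here.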
10 times is 1.75M routes, 100 times is 17.5M. The former is probably doable
for IPv4 on some extremely high end boxes but I'm not sure how those
would handle real issues such as flapping, lots of full feeds etc. I
don't believe the latter exists or will exist in the foreseeable
future, at least not in a way that anyone can afford to actually use.
Even those 1.75M boxes will be very expensive and only affordable by
the largest networks. Don't forget you and I all pay for their
hardware, directly or indirectly.
Iljitsch
--
I've written another book! http://www.runningipv6.net/