
Re: [ipv6-wg] Re: [address-policy-wg] Re: 200 customer requirements for IPv6

  • To: "Iljitsch van Beijnum"
  • From: "Oliver Bartels"
  • Date: Tue, 06 Dec 2005 20:10:09 +0100
  • Priority: Normal
  • Reply-to: "Oliver Bartels"

On Tue, 6 Dec 2005 18:08:40 +0100, Iljitsch van Beijnum wrote:
>On 6-dec-2005, at 17:36, Oliver Bartels wrote:
>> You may get aggregatable PI space if you can convince the router
>> manufacturers to create and implement a new RFC which adds an
>> additional layer for prefix aggregation within the BGP protocol.
>
>I can't imagine what such a layer would look like...

Clustering all PI prefixes originating at the same AS to form
a single "super-prefix" makes policy processing a lot easier,
because it needs to be done just once for the whole block.
This is obvious for single-homed PI and even saves
processing time on multihomed PI.
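
A minimal sketch of the clustering idea, not an existing BGP feature.
The prefixes, AS numbers and the policy function are made up for
illustration; the point is only that the policy runs once per origin
AS and the verdict is reused for every prefix in that cluster.

```python
from collections import defaultdict

# Hypothetical routes: (prefix, origin AS). Documentation prefixes and
# private-use AS numbers, chosen only for illustration.
ROUTES = [
    ("2001:db8:1::/48", 64500),
    ("2001:db8:7::/48", 64500),
    ("2001:db8:a0::/48", 64501),
    ("2001:db8:ff::/48", 64500),
]

def cluster_by_origin(routes):
    """Group prefixes into one "super-prefix" cluster per origin AS."""
    clusters = defaultdict(list)
    for prefix, origin_as in routes:
        clusters[origin_as].append(prefix)
    return dict(clusters)

def apply_policy(routes, policy_for_as):
    """Run the (assumed expensive) policy once per AS, not once per prefix."""
    decisions = {}
    for origin_as, prefixes in cluster_by_origin(routes).items():
        verdict = policy_for_as(origin_as)   # evaluated once per cluster
        for prefix in prefixes:
            decisions[prefix] = verdict      # reused for the whole block
    return decisions

calls = []

def example_policy(origin_as):
    """Hypothetical policy: accept everything except AS 64501."""
    calls.append(origin_as)
    return origin_as != 64501

decisions = apply_policy(ROUTES, example_policy)
print(len(calls), "policy runs for", len(decisions), "prefixes")
```

Note that the prefixes of one AS need not be adjacent in address
space for this to work; the cluster key is the origin AS alone.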

Whether the address space occupied by this prefix is
homogenous or not is a "don't care" nowadays, as a large
forwarding table isn't a problem at all.

With as little as 256 MByte of DDRAM plus a 64K TCAM chip it is
possible to handle up to 8 million forwarding entries at full 128-bit
resolution, at approx. 10 GBit/s typical and 1.5 GBit/s "bad weather"
(small packets only) line rate.
I personally just received a patent on such a router hardware concept.
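
A rough check of the memory figure, assuming "MByte" means 2^20 bytes
and "8 Million" means 8 * 2^20 entries (both are one possible reading,
not a statement about the actual hardware layout; the TCAM is not
modelled here):

```python
RAM_BYTES = 256 * 2**20      # 256 MByte, read as binary megabytes (assumption)
ENTRIES = 8 * 2**20          # "8 Million" read as 8 * 2^20 (assumption)
ADDRESS_BYTES = 128 // 8     # one full 128-bit IPv6 address

bytes_per_entry = RAM_BYTES // ENTRIES
spare_bytes = bytes_per_entry - ADDRESS_BYTES

print(bytes_per_entry, spare_bytes)
```

Under these assumptions each entry has 32 bytes available, so even a
table that stored every 128-bit address in full would leave 16 bytes
per entry for next-hop information.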

>In IP? Don't think so. If you have pointers, please enlighten me.

In POTS and X.25 networks. The latter died ...

>> Nowadays carriers use DWDM technology, and yes, a link
>> between Frankfurt and London or even New York is cheaper and
>> easier to obtain than a link between Frankfurt and Wiesbaden.
>
>Sure. But trying to aggregate on network topology is never going to  
>work for two very simple reasons:
>
>1. It changes all the time

The same is true for geographical aggregation.
Geographical aggregation would require free transit, otherwise
it is not compatible with the ISP's business models.

There is no free transit and thus it doesn't work.
Business relations change all the time, and in a world of global
markets these business relations don't stop at country boundaries.

Such boundaries are artificial; the EU tries to avoid them.

Thus the only aggregation you would gain is:
Americas / Europe-Asia / Asia-Pacific.

Not even Africa and with huge problems in Asia.

Everything below these continental boundaries can be treated as
local and, in your own words: It changes all the time

( Why: See below, dimensions. )

Thus with a geographical approach your aggregation gain
is near zero (might be a factor of 2..3 in the table, which
is near zero in this context).

Sorry, but geographical aggregation doesn't work any more
these days.

>2. You can't express a topology with loops in it in an addressing  
>hierarchy

Avoiding loops is the job of the routing protocol, not the
topology.

>Distance is actually very important. It's very hard to do decent high  
>speed file transfer on out of the box OSes and applications with high  
>delay. Also, it often makes sense to backhaul traffic over SOME  
>distance, but that doesn't mean it also makes sense to backhaul it  
>over even larger distances. I.e., even if a link to New York is  
>cheap, you don't want to go over Palo Alto.

If I were located in Seattle, Palo Alto might be an alternative
waypoint, as might Chicago or even Dallas.

What you are trying is:

Map a two-and-a-half-dimensional world onto a one-dimensional
address range. Mathematically, this won't work.
Dimensions can only be replaced by dimensions.

Ask a database programmer how difficult it is to implement
a geographical "within 10 km of some place" search on a database,
and ask them about the algorithms in use.

What they try is interleaving the West-East (X) and North-South (Y)
coordinates bitwise in the search key and handle overruns by exceptions,
like:

X_MSB Y_MSB X_2SB Y_2SB X_3SB Y_3SB ...

However this requires a _significant_ exception handling effort,
nothing someone would like to implement in a fast forwarding
engine for packet routing.
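
The interleaving described above can be sketched as follows. This is
an illustration only, assuming unsigned integer grid coordinates; the
function names and the default bit width are arbitrary choices.

```python
def interleave(x, y, bits=16):
    """Build the key X_MSB Y_MSB X_2SB Y_2SB ... from two coordinates."""
    key = 0
    for i in range(bits - 1, -1, -1):        # most significant bit first
        key = (key << 1) | ((x >> i) & 1)    # next X bit
        key = (key << 1) | ((y >> i) & 1)    # next Y bit
    return key

def deinterleave(key, bits=16):
    """Recover (x, y) from an interleaved key."""
    x = y = 0
    for i in range(bits - 1, -1, -1):
        x = (x << 1) | ((key >> (2 * i + 1)) & 1)
        y = (y << 1) | ((key >> (2 * i)) & 1)
    return x, y

# Two cells that are direct neighbours on the map ...
west = interleave(7, 0, bits=4)
east = interleave(8, 0, bits=4)
# ... end up far apart in the one-dimensional key space.
print(west, east, east - west)
```

Wherever a coordinate crosses a power-of-two boundary, neighbouring
places land in distant key ranges, so a radius search has to stitch
several key ranges together. That stitching is the exception handling
meant above.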

Believe me: The cost of 256 MByte of DDRAM is in the low
two-digit Euro range, not worth discussing for high-end
routers. A large forwarding table is much cheaper than a
geographical packet resolution.

>However, the choice isn't between allowing unlimited growth and  
>hoping we can fix it later and not using IPv6, the choice is between:
>
>- Not allowing PI
>- Coming up with aggregatable PI of some kind
>- Give up and make the exact same mess of IPv6 as we did with IPv4
>
>Today, IPv4 routing works but it has come close to the edge of the  
>cliff twice (early 1990s just before CIDR routing tables were too  
>large and late 1990s flapping cost too much CPU) and it's still  
>pushing towards that edge, which we can't see clearly but know is out  
>there somewhere.

It works. Period. And it will continue to work, because of the
economic pressure. Engineers have found a solution, thus:
Don't worry.

>So because you can't prove that you're right I should just believe  
>you without proof?

Yes, because the theory of computer science gives you the
proof that there are theorems in this world which can't be proven.

Computer software belongs to this sort of item: someone can
just test it and make assumptions, but can't prove it will be correct
for any but the simplest algorithms.

And this is proven. Try to build an algorithm which checks that
any other algorithm will correctly terminate. Try to apply this
algorithm to itself, think about the consequences and then
you might understand:

Real technology is always belief at some point, not proof.
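
The termination argument can be sketched like this. The checker
passed in as `halts` is hypothetical; no real one can exist, and the
sketch only shows why: whatever a candidate checker predicts about
the program built from it, the program does the opposite.

```python
def make_paradox(halts):
    """Build a program that contradicts what `halts` predicts for it."""
    def paradox():
        if halts(paradox):
            while True:      # predicted to terminate -> loop forever
                pass
        return "halted"      # predicted to loop -> terminate at once
    return paradox

def always_says_loops(program):
    """A naive, hypothetical checker: claims nothing ever terminates."""
    return False

paradox = make_paradox(always_says_loops)
result = paradox()           # safe to run: this checker predicts "loops"
print(always_says_loops(paradox), result)
```

A checker that answered "terminates" instead would send `paradox`
into the endless loop, so it would be wrong as well; that case is
better reasoned about than run.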

There is no way to get a _proof_ that your post (or mine) doesn't
trigger a nasty CPU or router operating system bug which causes
an Internet meltdown.

There is just belief and trust that reasonable router manufacturers
will deliver reasonable products.

>No, but there's no reasonable scenario that makes this happen either.  

Of course there is.
Think of the nasty floating-point division bug (FDIV) in the Intel
Pentium CPU. It happened with just a few numbers.
Intel replaced a huge number of these chips free of charge.

Think of a race condition in a chip which processes your post and
creates a checksum. Under very rare circumstances the race condition
hits for your checksum (compare the division bug), cascades into a bus
collision and melts down the router.
However, the forwarding hardware has already received the OK to output
the dangerous packet to the next router in the chain, without further
intervention from the hardware which just died.

>The scenario that de facto unlimited PI in IPv6 will make routing  
>tables so large that it becomes problematic in some way or another is  
>entirely reasonable, on the other hand.

Current experience lets us make the reasonable and responsible
assumption that an IPv6 routing table would not grow much beyond
the current IPv4 table, whereas current technology
permits tables of 10 to 100 times that size.

>Yes, I've heard it all before. Why don't we work on something that we  
>can all get behind? The beauty of my geographical aggregation thing  
>is that you still get PI even if it turns out that it doesn't work.  
>So you get what you want regardless of who's right. Pretty good deal,  
>don't you think?

Again: 2 dimensions -> 1 dimension doesn't work.

>Please replace "RIPE NCC members" with "people who have to pay for  
>bigger routers world wide" (not just the RIPE region) and I'm all for  
>it.

People who pay for big routers pay for big pipes, too.
They pay significantly more for big pipes than for big routers.

If you give them the choice of saving 10% on the pipe by using
the cheaper offer instead of the offer that fits the one-dimensional
geographical ordering, while increasing the router price by 30%
for increased routing flexibility and larger tables, they will
still vote for the cheaper pipe.
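
As a back-of-the-envelope check of this trade-off (the cost figures
in the example calls are invented, only the 10% and 30% come from the
paragraph above):

```python
def cheaper_pipe_wins(pipe_cost, router_cost,
                      pipe_saving=0.10, router_markup=0.30):
    """True if the saving on the pipe outweighs the extra router cost."""
    return pipe_saving * pipe_cost > router_markup * router_cost

# Hypothetical budgets in the same currency unit.
print(cheaper_pipe_wins(pipe_cost=400, router_cost=100))
print(cheaper_pipe_wins(pipe_cost=200, router_cost=100))
```

With these percentages the cheaper pipe wins as soon as the pipe
costs more than three times as much as the router.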

Best Regards
Oliver Bartels



Oliver Bartels F+E + Bartels System GmbH + 85435 Erding, Germany
oliver@localhost + http://www.bartels.de + Tel. +49-8122-9729-0






 
