Re: [enum-wg] 9.3.e164.arpa down
To: Michael Haberler mah@localhost
From: Jim Reid jim@localhost
Date: Thu, 16 Nov 2006 13:17:38 +0000
Cc: John C Klensin john+ietf@localhost, Klaus Darilion klaus.mailinglists@localhost, enum-wg@localhost
On Nov 16, 2006, at 11:54, Michael Haberler wrote:
> I'm sorry to say, but "asking an administration politely" might be
> politically correct for RIPE NCC and other parties as things stand
> now - as an answer to an operational problem it is insufficient and
> bound to kill the trust of potential operators in the technology in
> the first place.
While that may be true, this is the world we live in. As far as the
ITU is concerned, ENUM delegations have been made under interim
procedures on the understanding that this was for trials, not
production services.
I strongly urge *everyone* to not rock the boat here.
If operators are experiencing difficulties because of lame
delegations or whatever, tough. That's what happens on the internet.
Things break. Get over it.
> This is part of a real time service and operators rely on it today.
> Hoping some administration can get their act together days later to
> defibrillate a broken box while dozens of operators have stuck boxes
> denying service to *all* users (not just the folks calling to the
> country with broken nameservers) is unacceptable.
What you say is true, Michael, but it's based on a false premise.
Where are the service level guarantees and commitments that everyone
using ENUM has agreed to? This is all being done on an informal best
efforts basis underpinned by ad-hoc arrangements between the IAB, ITU
and RIPE NCC. Operators should plan their service offerings accordingly.
If someone's SIP servers can't cope with lame delegations, then
that's a problem for the SIP server. It's better they get these fixed
now instead of later when there could be millions of lame ENUM
delegations. My guess is ~60% of .com has lame delegations so if/when
ENUM gets mass public acceptance, comparable levels of DNS brokenness
should be expected. SIP servers and other ENUM-aware software need to
be able to deal with that.
> And note, this particular ENUM delegee isn't even doing any service
> with his delegation in the first place. Yes, it has to do with
> current SIP server technology (thread starvation) - asynchronous DNS
> resolution and deadline scheduling may alleviate the problem, but no,
> that is not the only thing we can do besides hope and pray.
If some part of the ENUM tree is unreliable, work around it until the
underlying problem gets fixed. For example by configuring the SIP
servers not to do DNS lookups for that broken part of the tree. This
is no different from how problems elsewhere with lame delegations
sometimes get handled: e.g. temporarily tweak the mail server to shunt
mail for lamedelegation.com off to a queue instead of doing DNS
lookups in the hope of achieving SMTP delivery. And of course tell
the DNS administrator for that zone that their name servers are broken.
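As a rough illustration of that kind of workaround, the sketch below maps an E.164 number to its ENUM domain name and checks it against an operator-maintained skip list before any DNS lookup is attempted. The names enum_domain, should_query and SKIP_SUBTREES are made up for this example and do not correspond to any particular SIP server's configuration options.

```python
# Hypothetical, operator-maintained list of ENUM subtrees that are
# currently known to be broken and should not be queried.
SKIP_SUBTREES = frozenset({"9.3.e164.arpa"})


def enum_domain(e164_number: str, apex: str = "e164.arpa") -> str:
    """Map an E.164 number such as '+39 06 1234' to its ENUM domain name."""
    digits = [c for c in e164_number if c.isdigit()]
    if not digits:
        raise ValueError("no digits in number")
    # ENUM reverses the digits and makes each one a DNS label.
    return ".".join(reversed(digits)) + "." + apex


def should_query(domain: str, skip=SKIP_SUBTREES) -> bool:
    """Return False if the domain sits at or below a skipped subtree."""
    labels = domain.lower().rstrip(".").split(".")
    # Compare whole-label suffixes so only true ancestors match.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in skip:
            return False
    return True
```

A server would call should_query(enum_domain(number)) first and, on False, fall through to whatever it does when no ENUM record exists. The skip list is meant to be temporary: entries come out again once the zone's name servers are fixed.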
> The other part of the story is that ENUM delegations are currently
> made not just for reasons of providing production service, but - let's
> be frank - occupying a "national resource" by some party which is not
> necessarily part of the Internet management culture, and intending,
> willing and able to provide realtime service on a 7x24 basis.
There are very few parts of the internet that have 24x7 monitoring.
Or cast-iron service level agreements for things like DNS operations.
Why project these for ENUM when they're not in place for most TLDs
and huge chunks of in-addr.arpa? Which is not to say that these sorts
of arrangements shouldn't be in place. That would obviously be a good
thing. I'm just curious why operators appear to have expectations for
stuff living under e164.arpa when they don't have those elsewhere in
the DNS. Or perhaps they're just using software that doesn't cope
with lame delegations as well as it could.
If some operator decides that a country has just done a defensive
ENUM delegation, they should take appropriate action. That might mean
not doing ENUM lookups for that part of the tree for a variety of
reasons: dodgy DNS service, weird NAPTR records, sparse population,
whatever. That's a policy decision each operator has to make for
themselves IMO.
> What we as a community have failed to do is to attach appropriate
> rules and strings to that step. I believe we need to do the following:
> 1. trials need to be tagged so as not to interfere with production.
True. But there is no "production" ENUM service. At least not
formally. There can't be while the current arrangements for e164.arpa
exist. And yes, I do know providers are selling ENUM registrations
and offering services for money on top of ENUM.
> 2. we need to establish rules on what goes into e164.arpa which go
> beyond the regulator/ITU TSB loop. IMO there is no point in activating
> NS delegations when the requesting organisation isn't committed to
> *run service*. Having some ministerial nameservers working on and off
> just so the administration can display territorial behaviour is a
> vanity affair where somebody else pays the bill.
Why not write up a draft/paper for this WG? If there was some formal
document that explained to government officials how they should
operate their Tier-1 name servers, that would be a big step forward:
follow the advice in RFCs 2870 and 2182, don't allow servers to go
lame, have some sort of external monitoring in place, etc, etc. There
are no functional specifications for Tier-1 name servers, so writing
these up would be a big help. Then at least there would be a document
that the bureaucrats could be pointed at.
In other words let's try to light a candle instead of cursing the dark.
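To make the "external monitoring" point a little more concrete, here is a minimal sketch of the decision such a monitor might make after sending an SOA query for the zone directly to each listed name server. The SoaProbe record and is_lame function are hypothetical names for this example, the actual DNS query is left out, and counting "no response at all" as lame is a looser definition than the strict one (a server that answers but is not authoritative).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SoaProbe:
    """Outcome of one SOA query sent directly to a listed name server."""
    rcode: str           # response code, e.g. "NOERROR", "SERVFAIL", "REFUSED"
    authoritative: bool  # whether the AA bit was set in the response
    answers: int         # number of records in the answer section


def is_lame(probe: Optional[SoaProbe]) -> bool:
    """Classify a probe result; None means no response before the timeout."""
    if probe is None:
        # Assumption for this sketch: an unreachable server counts as lame.
        return True
    healthy = (
        probe.rcode == "NOERROR"
        and probe.authoritative
        and probe.answers > 0
    )
    return not healthy
```

A monitor built on this would probe every server named in the delegation on a schedule and alert the zone's administrator when any of them comes back lame.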
> 3. "Pulling a delegation" and removing a "harm to the network"
> obstacle are different things.
Klaus seemed to be suggesting the delegation for 9.3.e164.arpa should
be pulled to prevent "harm to the network". For some definition of harm.
Yes, lame delegations are bad. They're annoying and they create
operational problems. But they're a fact of life and they'll never go
away. Software just has to deal with this. For ENUM, the problem of
lame delegations will get worse because there will be much more bit
rot the further the DNS moves away from the clueful core to the edge
of the network. So I think in addition to writing a doc that could
persuade Tier 1 operators to do the Right Thing, there's a need to
get software to handle lame delegations more gracefully.
This could be through the sorts of things you mentioned, Michael:
configuration options, smarter timeout strategies and so on.
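One way to picture the asynchronous resolution and deadline idea is the sketch below, which wraps an arbitrary async resolver in a hard deadline so that a stalled lookup gives up instead of tying up a worker. The function lookup_with_deadline, its resolve parameter and the 0.5 second default are assumptions made for this example; the resolver itself is whatever async DNS client the server already uses.

```python
import asyncio


async def lookup_with_deadline(resolve, domain, deadline_s=0.5):
    """Await resolve(domain), but give up after deadline_s seconds.

    `resolve` is any coroutine function taking a domain name. Returns the
    resolver's result, or None if the deadline passed first.
    """
    try:
        return await asyncio.wait_for(resolve(domain), timeout=deadline_s)
    except asyncio.TimeoutError:
        # The pending lookup is cancelled by wait_for; the caller can
        # carry on as if no ENUM record had been found.
        return None
```

Because the lookup is awaited rather than run in a blocking thread, one unresponsive part of the tree delays only the calls that depend on it, and only for the length of the deadline.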