This archive is retained to ensure existing URLs remain functional. It will not contain any emails sent to this mailing list after July 1, 2024. For all messages, including those sent before and after this date, please visit the new location of the archive at https://mailman.ripe.net/archives/list/dns-wg@ripe.net/
[dns-wg] Action Item 48.1: Lame Delegations -- first draft
Edward Lewis
Ed.Lewis at neustar.biz
Thu May 5 11:54:39 CEST 2005
At 23:55 +0200 5/4/05, Peter Koch wrote:
>Dear all,
>
>here is a first draft addressing action item 48.1 on lame delegation problems
>on large scale name servers. It's a -00 kind of document mainly issuing the
>problem statement. Although there are some hours left, I don't expect
>anyone to have read it by the WG meeting. An HTML version may be made
>available later and depending on how the PDP evolves, we might want or need
>to inject it into the policy developing engine. Comments are welcome!
Okay, "comments are welcome." ;) Here they come...
# RIPE DNS WG 48.1 P. Koch
# DENIC eG
# May 4, 2005
#
#
# DNS lame delegations caused by AXFR source unavailability
...
# RIPE DNS WG DRAFT Large Scale DNS Lame Dels May 2005
#
#
# 1. Introduction
#
# This document analyses causes for DNS lame delegations seen on large
# (thousands of zones) name servers and investigates and assesses
# countermeasures.
#
# First we will define the term "lame delegation" and similar
# operational problems. In the third section we will address various
# reasons that lead to lame delegations. The fourth paragraph will
# summarize mechanisms server administrators currently do or could
# apply to lower the impact of lame delegations.
#
# 2. Signs of Lame Delegations
Please, no more jokes about my picture appearing here. ;)
# A lame delegation is a DNS delegation where the target of the NS RR
# does not respond authoritatively to queries for the domain so
# delegated.
#
# Lame delegations show different symptoms, which are sometimes given
# separate names:
#
# 1. The server's responses do not have the AA bit set
#
# 2. The server responses with an (upward) referral
#
# 3. The server responses SERVFAIL
#
# 4. The server responses REFUSED
#
# 5. The server refuses the query packet (giving either ICMP port
# unreachable or TCP RST)
#
# 6. The server does not respond at all
#
# 7. The server's name does not exist (NXDOMAIN)
#
# 8. The server's name does not own any A (or AAAA) RRs
Some others...
First, let's assume that the query is (akin to):
dig <zone.name> soa +norec
There can be "no error/no data" - a problem unique to the reverse map
is that registrants who have a /17 worth of IPv4 address space might
have mistakenly configured the entire /16 and not the 128 /24's they
really have. (This is one of a few places where I can isolate a
diagnosis from the symptoms that are observable.)
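To make the /17 arithmetic concrete, here is a minimal sketch that lists
the /24 reverse zones that go with an IPv4 block. The function name and
the example prefix are made up for illustration; it uses only Python's
standard ipaddress module.

```python
import ipaddress
from typing import List


def reverse_zones_24(prefix: str) -> List[str]:
    """Return the in-addr.arpa zone names of every /24 inside an IPv4 prefix.

    Hypothetical helper for illustration; prefixes longer than /24 are
    rejected because they do not map onto whole /24 reverse zones.
    """
    net = ipaddress.ip_network(prefix)
    if net.version != 4 or net.prefixlen > 24:
        raise ValueError("expected an IPv4 prefix of length /24 or shorter")
    zones = []
    for sub in net.subnets(new_prefix=24):
        a, b, c, _ = str(sub.network_address).split(".")
        # Reverse zones list the octets in reverse order.
        zones.append(f"{c}.{b}.{a}.in-addr.arpa.")
    return zones


if __name__ == "__main__":
    zones = reverse_zones_24("10.1.128.0/17")  # example block, an assumption
    print(len(zones), zones[0], zones[-1])
```

A /17 yields 128 separate zones here, none of which is the single
enclosing /16 zone (1.10.in-addr.arpa. in this example).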
Another failure scenario to consider is that, just like the query for
the SOA, you can have no response for the address lookup for the
domain name of the name server. I.e., akin to #7 and #8, there's a
"name query is not answered."
#6 is a problem. There is a name server implementation that will
answer only for what it is authoritative for - and not answer at all
for other queries. So, imagine a registrant has a /22 and reserves
the first 256 addresses (a /24) for later use. The registrant may
not configure the zone because it's not being used - meaning the
first zone goes response-less on the server, but the other /24's are
good. (This assumes you group the zones by some registration record.)
I believe it would be good to document the query you will use and
what the set of acceptable answers will be. I considered the return
code, authority bit, answer and authority counts. (An answer count
of 1 and authority records could indicate a CNAME in the answer and a
referral to another server. Yes, it happens.)
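As a rough sketch of what "document the acceptable answers" could look
like, the function below maps the fields I mentioned (return code,
authority bit, answer and authority counts) to a plain observation
label. The class, field names and labels are assumptions made up for
this example, not taken from any existing tool. Counts alone cannot
identify the CNAME-plus-referral case; that would also need the record
types, which this sketch leaves out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """Fields recorded from one SOA query sent without recursion (hypothetical)."""
    rcode: Optional[str]        # e.g. "NOERROR", "SERVFAIL"; None means no response
    aa: bool = False            # authoritative answer bit
    answer_count: int = 0
    authority_count: int = 0


def describe(obs: Observation) -> str:
    """Return a label for what was observed, without guessing at a cause."""
    if obs.rcode is None:
        return "no response"
    if obs.rcode != "NOERROR":
        return obs.rcode.lower()
    if obs.answer_count == 0:
        if obs.aa:
            return "no error/no data"
        if obs.authority_count > 0:
            return "referral"
        return "empty non-authoritative response"
    if not obs.aa:
        return "non-authoritative answer"
    return "authoritative answer"


if __name__ == "__main__":
    print(describe(Observation("NOERROR", aa=False, authority_count=2)))
```

Only "authoritative answer" would count as acceptable in this sketch;
everything else is simply logged as the symptom seen.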
# Cases (1) and (2) above are classical signs of a zone that has been
# forgotten by its server, either by expiry or due to syntax errors.
#
# Cases (1) through (4) are common lame delegations, cases (5) and (6)
# often just appear as temporary operational problems and cases (7) and
# (8) are sometimes called stale delegations. The latter may result in
# a significant increase of the query volume at the servers serving the
# domain the non existing name server is expected to reside in.
Okay, now I'm going to start down a path that may lead to a rathole.
(Just a warning to regular participants of mailing lists.)
First, trying to guess the reason for symptoms is a slippery slope.
Over the years I have found such a divergence in operational
practices that divining what's in the configuration file via the
network protocol is nearly impossible. There are some common
meltdowns - like configuring the /16 instead of 128 /24's for a /17 -
but there aren't enough "common" meltdowns to say that we can
efficiently send a diagnosis to all the problem cases. I no longer
have an idea of numbers, but I think that something like 10-30% of
problems (depending on how you count problems) are easy to diagnose;
the remaining majority quickly falls into "other problems."
Second, to begin the pathway to the home of the rat, you have to ask
"what do you want to do here?" What is a lame delegation? A lame
server is defined a few times in RFCs. Is the purpose to prune off
lame servers or clean up DNS operations?
E.g., what if you see one server answering correctly for a zone while
the other server is running recursively and answers non-authoritatively?
This could be because the second server has learned the answer from
the first via something like forwarding. In this case, both servers
answer correctly and won't cause an operational problem (which is
where the 'problem' of lame delegations originated). However, such a
configuration could be considered a problem registration (not truly
meeting the need for 2+ servers), and may even be an indication of
subscriber (registrant) fraud. (I.e., hijacking by changing the name
server registrations, etc.)
The rathole is "what's the target of stamping out lame delegations?"
My work in the field (no longer being pursued, at least by me)
resulted in just stating "observations." I.e., no diagnosis, just a
report. Servers that did not answer gave "never heard from" or "last
heard from" dates.
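A minimal sketch of that kind of bookkeeping, assuming one record per
probe; the class and method names are invented for this example and do
not reproduce my original test code:

```python
from datetime import datetime
from typing import Dict


class HeardFromLog:
    """Track when each server last answered a probe (hypothetical sketch)."""

    def __init__(self) -> None:
        self._last_heard: Dict[str, datetime] = {}

    def record(self, server: str, answered: bool, when: datetime) -> None:
        # Only answers update the timestamp; silence leaves the old date alone.
        if answered:
            self._last_heard[server] = when

    def report(self, server: str) -> str:
        """Describe a non-answering server by observation only, no diagnosis."""
        last = self._last_heard.get(server)
        if last is None:
            return f"{server}: never heard from"
        return f"{server}: last heard from {last.date().isoformat()}"


if __name__ == "__main__":
    log = HeardFromLog()
    log.record("ns1.example.", True, datetime(2005, 5, 1))
    log.record("ns1.example.", False, datetime(2005, 5, 5))
    log.record("ns2.example.", False, datetime(2005, 5, 5))
    print(log.report("ns1.example."))
    print(log.report("ns2.example."))
```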
The cautionary tale here is that - because you need to repeat tests
(I'm just stipulating that here) to properly test non-answering
servers, sometimes a server that passed a test will slip into a
failure mode. My tests ignored "once good" results, meaning fat
fingering post haste could introduce lameness. (No test can really
stop that.) But the test code has to be able to "deal with it."
# While operational guidelines suggest that the NS RRSet of a zone and
# the corresponding delegation in the parent zone should match, there
# are sometimes inconsistencies. Without acknowledging or endorsing
# this practice the union of both NS RRSets shall be eligible for a
# lame delegation assessment. In other words, even an NS RR that is
# only present in the delegated (child) zone may constitute a lame
# delegation as well.
I firmly believe that you cannot require that the parent copy of the
NS set be the same as the child NS set. The parent copy should be a
subset (improper or proper) at all times. As a lame tester, you
should be testing against the parent set only - as that's the data
registered.
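The subset relationship can be checked mechanically once both NS sets
are in hand. The sketch below assumes the two sets have already been
fetched as lists of server names; the function names and the shape of
the result are illustrative only.

```python
from typing import Dict, Iterable, List, Union


def normalize(name: str) -> str:
    """Lower-case a domain name and give it exactly one trailing dot."""
    return name.rstrip(".").lower() + "."


def compare_ns_sets(parent: Iterable[str],
                    child: Iterable[str]) -> Dict[str, Union[bool, List[str]]]:
    """Compare the parent (registered) NS set with the child's own NS set."""
    p = {normalize(n) for n in parent}
    c = {normalize(n) for n in child}
    return {
        "parent_is_subset": p <= c,     # proper or improper subset
        "parent_only": sorted(p - c),   # registered but absent from the child
        "child_only": sorted(c - p),    # in the child zone but not registered
        "to_test": sorted(p),           # lameness is tested on the parent set
    }


if __name__ == "__main__":
    print(compare_ns_sets(["NS1.example.net"],
                          ["ns1.example.net.", "ns2.example.net."]))
```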
I'll stop with just comments on lame delegations for now. One thing
to keep in mind is how to discuss this. It would be good to document
the testing done to explore the situation. It would be good to
document the symptoms observed. It is also good to document
diagnosis, and also remediation.
I "repeat" that because "good to document" depends on the purpose of
the document. A tool to perform the testing ought not get caught up
in diagnosis; I found having it just "log" symptoms much more useful
than having it try to "think" about the cause.
Remediation is tool dependent, so a "parent" trying to help the
"children" needs to judge if it is willing to handhold completely,
for preferred DNS implementations, or be fair by not offering the
teaching service at all. (Not helpful, but unbiased on tool choice.)
PS - There was some other rathole I wanted to step in, but I've
forgotten it. ;)
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Edward Lewis +1-571-434-5468
NeuStar
If you knew what I was thinking, you'd understand what I was saying.