RIPE 71 DNS Working Group Minutes

Thursday, 14 May, 9:00-10:30
WG Co-Chairs: Jim Reid, Jaap Akkerhuis, Peter Koch
Scribe: Emile and Florian

A. Usual Administrivia

Co-chair Jim Reid opened the session. Jim announced a change to the published agenda. Item G (Discussion of Latest SSAC Recommendations) had been removed, allowing slightly more time for discussion on other presentations. He apologised that the minutes from the DNS Working Group session at RIPE 70 were not yet available. This would be resolved soon. There were no open action items.

B. RIPE NCC Report - Anand Buddhdev, RIPE NCC

Geoff Huston, APNIC, asked where the requirement came from that states everything must be signed with both keys. Anand replied that RFC 6840 specifies this and the zone signer should arrange this. Anand explained that he discovered that Unbound and Verisign's public DNS were following this part of the protocol very strictly. They wanted signatures for both keys to be present, even while the DS record was still pointing to the old KSK. Geoff then asked whether Anand thought this was due to an old DS record issue or something within the zone itself. Anand said he believed it was an old DS record issue.

Dave Knight, Dyn, asked Anand to expand on his description of running different software implementations on K-root, specifically with regard to verifying the answers received from each of them. Anand answered that before the RIPE NCC deployed Knot and NSD, they had scripts that ran queries against a zone and compared the responses to several queries to see if they were identical. In some cases they found differences. Anand mentioned that he had sent a stream of reports to the Knot users mailing list describing these. Dave then asked Anand about the distribution, and whether it is done on a per-server basis or something else. Anand replied that, at the multi-server sites, each server runs different software. Where the RIPE NCC arranges single-server sites, the software varies, balancing out the numbers of NSD, Knot and BIND equally. Dave then recommended that in the future Anand expose the implementation in the NSID of the host name in order to provide the version in reported problems.
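
The kind of comparison Anand described can be illustrated with a minimal sketch. It is not the RIPE NCC's script: the function names and the sample answers are hypothetical, and it assumes the answers from each implementation have already been captured as plain strings.

```python
from typing import Dict, FrozenSet, List, Tuple

Query = Tuple[str, str]            # (owner name, record type)
Answers = Dict[Query, List[str]]   # rdata strings returned for each query


def normalise(rdata: List[str]) -> FrozenSet[str]:
    """Ignore record order and stray whitespace so only content differences remain."""
    return frozenset(r.strip() for r in rdata)


def find_differences(responses: Dict[str, Answers]) -> Dict[Query, Dict[str, FrozenSet[str]]]:
    """Return the queries for which the implementations did not all agree."""
    all_queries = set()
    for answers in responses.values():
        all_queries.update(answers)

    differing = {}
    for query in sorted(all_queries):
        # A missing answer is treated as an empty answer set.
        per_impl = {impl: normalise(answers.get(query, []))
                    for impl, answers in responses.items()}
        if len(set(per_impl.values())) > 1:
            differing[query] = per_impl
    return differing


# Hypothetical captured answers, not real measurement data.
sample = {
    "bind": {("example.", "NS"): ["ns1.example.", "ns2.example."]},
    "nsd": {("example.", "NS"): ["ns2.example.", "ns1.example."]},
    "knot": {("example.", "NS"): ["ns1.example."]},
}

for query, per_impl in find_differences(sample).items():
    print(query, {impl: sorted(rrs) for impl, rrs in per_impl.items()})
```

Comparing sets rather than lists means a different record order alone is not reported as a difference.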

Xavier Gorjón, NLnetLabs, asked Anand why the NCC used multiple DNS implementations instead of just using the one they knew best. Anand explained that this was done to guard against the possibility of a vulnerability in one implementation taking out the entire DNS server cluster. For example, if a bad packet sent to the cluster took down every server, that would cause a big service outage, so the decision was made to guard against this type of situation.

Dmitry Kohmanyuk, Hostmaster Ltd., asked why the RIPE NCC used two BGP daemons at the same time. Anand said it is simply a matter of different architectures: the RIPE NCC uses BIRD in K-root nodes based at Internet exchange points because these clusters have to carry a full routing table, while ExaBGP is fine for single-instance K-root locations, which only need to announce K-root's anycast prefix.

C. Measuring the Impact of IPv6 Resolver Preference - Chris Baker, Dyn

Geoff Huston made a few comments on the subject of this presentation, saying the value of the research comes in what is queried for, rather than how the query is done, and that the question of “A” or “Quad A” makes no difference. Apple's choice of 25ms vs Chrome's choice of 300ms is the interesting issue. Geoff went on to explain he had expected Chris to look at that variance, given that, on the whole, the DNS does not use IPv6, and that Google's public DNS doesn't use IPv6 if it can get away with it. Almost no one does any DNS queries over IPv6 transport. Geoff pointed out his own research and the stats he has produced on the topic, saying the DNS doesn't do Happy Eyeballs, and doesn't prefer IPv6.
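
To show why the size of that timer matters, here is a minimal sketch of a Happy Eyeballs style race. It assumes the 25ms and 300ms figures are the head start a client gives IPv6 before it also tries IPv4; the minutes do not spell this out, and the function name and connect times are hypothetical.

```python
from typing import Optional


def happy_eyeballs_winner(v6_connect_ms: Optional[float],
                          v4_connect_ms: Optional[float],
                          head_start_ms: float) -> Optional[str]:
    """Return which protocol the client ends up using, or None if both fail.

    A connect time of None means that protocol never connects.
    """
    # IPv6 starts at t=0. If it connects within the head start, IPv4 is never tried.
    if v6_connect_ms is not None and v6_connect_ms <= head_start_ms:
        return "IPv6"

    # Otherwise IPv4 starts when the head start expires; the first to finish wins.
    v4_done = None if v4_connect_ms is None else head_start_ms + v4_connect_ms
    if v6_connect_ms is None and v4_done is None:
        return None
    if v4_done is None:
        return "IPv6"
    if v6_connect_ms is None:
        return "IPv4"
    return "IPv6" if v6_connect_ms <= v4_done else "IPv4"


# Same hypothetical path (IPv6 connects in 100 ms, IPv4 in 40 ms), two timers.
for head_start in (25, 300):
    print(head_start, happy_eyeballs_winner(100, 40, head_start))
```

Under these assumed connect times the shorter head start hands the connection to IPv4, while the longer one keeps it on IPv6.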

Dmitry Kohmanyuk contested Geoff's point, at least on the authoritative servers for .ua. He said he has seen a steady increase of the rate of IPv6 versus IPv4 transport: there are a lot of queries for authoritative servers on IPv6 transport. The issue of Google's use of IPv6 transport is another question.

D. Impact of DNS over TCP - a Resolver Point of View - Joao Damas, Bondis

Jim Reid asked Joao if he had any plans for doing TCP queries to authoritative servers on the Internet to measure brokenness caused by firewalls and routers that are blocking TCP port 53 traffic. Joao replied that he'd like to look into that.

Shane Kerr, BII, said that, given the packet sizes, the push for Elliptic Curve Cryptography may be an anti-goal, as it moves the encryption burden from the authoritative server to the resolver. Joao replied that it is a trade-off between bandwidth and server CPU, and CPU seems to be in abundance these days.

Shane went on to say that one thing missing in the list of motivations for TCP is the effect of middle boxes on traffic in general, since some factors will cause problems with fragmented packets. This is a further motivation for using TCP.

Shane then mentioned Joao is probably following the work of the IETF, and his presentation looks like it argues against a lot of complicated signalling and management options between servers and clients. It seems like the authoritative servers would be unable to trust any information they get from resolvers. Joao agreed, saying clients have all the motivation in the world to cheat.

Geoff Huston answered Jim's question, stating that his team did this experiment two years ago and found that about 17% of recursive resolvers won't use TCP. It only affects a much smaller number of stub resolvers – most of these go and use another resolver. A little over 2% of users get stuck and they don't get an answer. Geoff concluded, however, that this is not a serious problem.

E. Integration Testing of DNS Recursive Servers - Ondřej Surý, CZ.NIC

There were no questions. Benno Overeinder from NLnetLabs congratulated Ondřej for a great piece of work.

F. .nl Open DNS Datasets and Statistics - Marco Davids, SIDN

Jim Reid asked if there were any plans to expire the data in the Hadoop repository. Marco replied in the affirmative, saying a policy is in place: the IP address info is removed after 18 months, so that data get anonymised, but for now the rest remain stored indefinitely.


H. Discovery Method for a Validating Stub Resolver - Xavier Gorjón, NLnetLabs

Ralf Weber from Nominum asked how Xavier got the ISP results because he had the feeling they may be wrong. Ralf explained that in the test, an authoritative server was used, on which you did a 'create' on the forwarding chain. Then, the authoritative server got an IP address, and that is what he used as the IP address for the ISP. Xavier confirmed this. Ralf said that the problem with this scenario is that the receiving IP address is the one the ISP uses for the outgoing address. In addition, a lot of ISPs use anycast addresses behind a load balancer, which means it's the unicast address of the resolver that's seen and not the real address that the ISPs give out. That may explain why there are some errors.

I. DNSSEC for Legacy Applications - Willem Toorop, NLnetLabs

Jelte Jansen from SIDN said that in the office they are running a pilot with a hack on Unbound that does exactly what Willem just presented on: it returns a fake answer and then points you to a page where you can set an NTA. He suggested that he and Willem catch up for a chat. In addition, Jelte wanted to know if Willem has seen the discussion that was happening on the glibc mailing list on DNSSEC resolution. Willem said that he had seen this.

Peter Koch asked Jelte to summarise the discussion. Jelte said that it's difficult. <laughter from the room> Jelte only found it because he saw it on a news site about Linux and Open Source that contained a summary, but that was already two pages. Essentially, most of the discussion was about how to get the configuration right because you cannot trust resolv.conf. One of the points was that you might use Name Service Switch (NSS) or maybe Name Service Caching Daemon (NSCD).

Eduardo Duarte from DNS.PT wanted to know what happens when it fails for other things than HTTP. Willem explained that the nsswitch modules work as a shared library that is loaded by the process that does the request, so you know which process is asking for it. So, the software checks for the process name — it's a dirty hack — and if it says Firefox then it rewrites the address and otherwise it does not. Willem added that it does nothing with HTTPS.
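
The process-name check Willem described can be sketched as follows. This is an illustration only, not Willem's code: the set of browser names, the notice-page address and the function name are all hypothetical, and returning no answer for other processes is an assumption.

```python
from typing import Optional

# Assumption: only processes with these names get the rewritten address.
BROWSER_PROCESSES = {"firefox"}

# Hypothetical address of a local page that explains the validation failure.
NOTICE_ADDRESS = "127.0.0.1"


def answer_for(process_name: str, validation_ok: bool,
               real_address: str) -> Optional[str]:
    """Decide what address the lookup hands back to the calling process."""
    if validation_ok:
        # Nothing went wrong, so every process gets the real answer.
        return real_address
    if process_name.lower() in BROWSER_PROCESSES:
        # The module is loaded inside the caller, so its name is known:
        # a browser is pointed at the notice page instead.
        return NOTICE_ADDRESS
    # Assumption: any other process simply sees the lookup fail.
    return None


print(answer_for("firefox", False, "192.0.2.10"))
print(answer_for("ssh", False, "192.0.2.10"))
```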

Tim Armstrong from Treestle said that he is happy to see that non-technical users are made aware of DNSSEC.

Warren Kumari from Google commented that he and Evan Hunt have an IETF draft that allows recursive servers to signal additional error information back to the stub, which will allow the recursive server to at least tell the stub what went wrong and why. But really, the question that Warren wanted to ask was how dirty Willem felt after implementing some of this stuff. Willem replied that he loves it. <more laughter from the room> Willem said he was even thinking about a Jabber proxy that would try to interfere.

Peter Koch continued on the topic of dirty feelings by asking the room how many people had heard of the guidance by US-CERT. A couple of hands were raised. Peter remarked that the text is not that bad but mainly focusses on enterprise setups, and he would like to prevent the guidance from spreading too far without proper qualification. Jaap Akkerhuis from NLnetLabs had read all the text. He said it contains a lot of references to other documents and tries to discourage people from running their own DNS. And then at the very end of the document it says it would be a good idea to implement DNSSEC. In short, it has a tendency of scaring readers. Peter added that its primary value seems to be that it's an additional item on the audit checklist.

J. Implementation Challenges of Geographic Split-horizon DNS - Jan Včelák, CZ.NIC

Via the chatroom, Peter van Dijk from PowerDNS confirmed that the PowerDNS geoipbackend most definitely supports edns-client-subnet, though the documentation could do with some improvement.

Victoria Risk from Internet Systems Consortium said BIND 9.10.0 has geoip support which uses ACLs. This was first available as a patch but is now integrated. BIND has had support for edns-client-subnet on the authoritative side for about a year and that also originated as a patch. The code is available on ISC's open git repository.

Peter Koch noted that the tools and code Jan discussed were IPv4 only. He asked about the rest, for example the databases that were mentioned. Jan replied that Maxmind, which leads the development, is also doing this geolite library. The content in the old geoip database is in two separate files: one for IPv4 and one for IPv6. The new one, libmaxminddb, contains all of this in one space. Jan said he thought they are using IPv4 to IPv6 mappings in one tree. Maxmind will also make the information available in CSV.
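
The idea of holding IPv4 and IPv6 data in one address space can be sketched as below. This does not describe libmaxminddb's actual layout, which the minutes leave open: the use of the IPv4-mapped range ::ffff:0:0/96 is an assumption, a flat table with a linear longest-prefix scan stands in for the tree, and the prefixes and labels are made-up documentation values.

```python
import ipaddress
from typing import Dict, Optional

# Assumption: IPv4 is folded into IPv6 through the IPv4-mapped range ::ffff:0:0/96.
V4_MAPPED_BASE = int(ipaddress.IPv6Address("::ffff:0:0"))


def to_v6_network(prefix: str) -> ipaddress.IPv6Network:
    """Express an IPv4 or IPv6 prefix as an IPv6 network."""
    net = ipaddress.ip_network(prefix)
    if net.version == 6:
        return net
    # An IPv4 /n becomes a /(96 + n) inside the mapped range.
    return ipaddress.IPv6Network(
        (V4_MAPPED_BASE + int(net.network_address), 96 + net.prefixlen))


def to_v6_address(address: str) -> ipaddress.IPv6Address:
    """Express an IPv4 or IPv6 address as an IPv6 address."""
    addr = ipaddress.ip_address(address)
    if addr.version == 6:
        return addr
    return ipaddress.IPv6Address(V4_MAPPED_BASE + int(addr))


class GeoTable:
    """Single lookup table serving both address families."""

    def __init__(self) -> None:
        self._entries: Dict[ipaddress.IPv6Network, str] = {}

    def add(self, prefix: str, label: str) -> None:
        self._entries[to_v6_network(prefix)] = label

    def lookup(self, address: str) -> Optional[str]:
        addr = to_v6_address(address)
        matches = [net for net in self._entries if addr in net]
        if not matches:
            return None
        # Longest-prefix match: the most specific covering network wins.
        return self._entries[max(matches, key=lambda net: net.prefixlen)]


table = GeoTable()
table.add("192.0.2.0/24", "AA")      # hypothetical labels on documentation prefixes
table.add("192.0.2.128/25", "CC")
table.add("2001:db8::/32", "BB")
print(table.lookup("192.0.2.200"), table.lookup("2001:db8::1"))
```

A real database would use a radix tree rather than a scan, but the single-family key space is the point of the sketch.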

K. Root Zone KSK Rollover - Jaap Akkerhuis

Warren Kumari from Google remarked that quite a number of people are grumpy with the whole process, as there was already a public consultation in 2013 and a number of documents were already published on the topic. However, nothing really happened after that; it doesn't seem any of the recommendations have been followed. Warren said that he has seen a number of presentations at various operator meetings that are very similar to this one. Yet, none of the other issues that were raised were addressed, such as the communications plan. There are a lot of people who have deployed DNSSEC now, but the document doesn't go beyond "we'll talk to technical people". The documentation also doesn't go into how breakage will be detected and what the metrics will be. Hopefully the next version of the document will have it. There are a bunch of other concerns that people have, such as an emergency key roll, which is a topic that has been ignored so far. Warren realised that these topics are not the concern of the design team, but he would like to bring them up whenever he can. Jaap explained that he cannot channel the whole chain, that is more up to ICANN. He went on to say that based on the comments ICANN has received, outreach remains a large part of the efforts.

Jaap admitted that not all of the public comments have been publicly addressed yet, but they have been discussed in the design team. Not everything is published, to prevent other questions while the design team is still at work. Warren said that he has the impression that ICANN is just claiming that they have a solution for RFC 5011 but is silent on the rest.

Matthijs Mekking, NLnetLabs, pointed out that if something goes wrong and the outgoing root key has to be restored, this has to be done before implementations have removed the missing key. Unbound's default for this is 366 days, which seems safe. He said it would be worth checking what other implementations do. Jaap said that Matthijs was referring to the fact that if the key is no longer in the zone itself, the implementation might still have the outgoing key on disk and might still be able to use it. Jaap admitted that he was unclear what behaviour is specified in RFC 5011.
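
The timing constraint Matthijs raised comes down to simple date arithmetic, sketched below. The 366-day figure is the Unbound default quoted above; the function names, the example dates and the exact handling of the boundary day are assumptions, and other validators may behave differently.

```python
from datetime import date, timedelta

KEEP_MISSING_DAYS = 366  # Unbound default, as quoted in the discussion


def removal_date(missing_since: date,
                 keep_missing_days: int = KEEP_MISSING_DAYS) -> date:
    """Day on which a validator would discard a key that vanished from the zone."""
    return missing_since + timedelta(days=keep_missing_days)


def can_still_restore(missing_since: date, today: date,
                      keep_missing_days: int = KEEP_MISSING_DAYS) -> bool:
    """True while the validator is assumed to still hold the outgoing key."""
    return today < removal_date(missing_since, keep_missing_days)


# Hypothetical dates, for illustration only.
gone = date(2016, 1, 1)
print(removal_date(gone))
print(can_still_restore(gone, date(2016, 12, 31)))
```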

Shane Kerr from BII remarked that the future is hard to predict. Shane asked how long the process is going to take. Jaap explained there are a couple of factors that make it difficult. For example, the key rollover would need to be aligned with the current key signing ceremonies for the root. That will make the process rather long; something like six to nine months including the removal of the old KSK. However, the timeline is something that needs to be looked at after the design team's report has been published.

Shane was concerned about all of the software packages that have the key included in them. He thought this would need to be treated as a security patch for a lot of these, because ultimately this would be the most reliable way to get this into enterprise-level Linux distributions. Jaap explained that this is only the case if the validators cannot be configured using RFC 5011.

Geoff Huston from APNIC stated that as a member of the design team he is appalled with the DNSSEC standards that came out of the IETF. The description of the way in which you maintain keys, the fact that you have a relationship with related parties and the way in which the key material is changed at the root are nonsensical, if the description is there at all. The standards simply said that it's just the DS record at the parent. When the IANA was pushed into putting up a key at the root, there was nothing else, such as an emergency roll. If the key is rolled in an emergency, every relying party has got the wrong key and there is no way of fixing that. Even this planned roll is a roll into disaster. There is no emergency procedure that will work. Right now nobody has an understanding of how many validating resolvers are using the old automated old-signs-new approach or how many are using RFC 5011. Or what important stuff depends on these validating resolvers. There is no emergency key in preparation: all of this is just sitting there in a vacuum.

Geoff said the only way we do security with DNS right now is with this Certificate Authority (CA) stuff. Every time a CA gets busted and someone prints a new fake certificate for Google, everyone should look at their online banking and quake in fear. Because frankly, the current framework for security on the Internet is bullshit. The only known way to fix this is to actually tie in a better trust system. And this really impinges upon the use of DANE, which in turn relies on DNSSEC. So this really needs to work, like yesterday.

Geoff claimed the problem is that most of the standards are missing or defective. But when the roll over happens next year, there will be trouble and a lot of people will be shouting at each other. If the engineers don't take an interest in this now and figure out the signalling is wrong, we'll find out at the worst possible moment that the standards are not right. There is a large bunch of stuff missing right now and we're exposed to a process that is aimed at minimising the damage. That's Mission Impossible in his opinion.

Peter Koch mentioned that interesting things might happen in September 2016, regarding the oversight on layer 9 and 10. Peter asked if that deadline has influenced the actions and the timeline of the design team in any way. Jaap said that he was not aware of anything about that. The only thing he knows is that it should not hamper the current operations in any way. If the transition happens in the middle of this, it should be solved in some way.

L. Co-chair appointment process

Peter Koch confirmed that he had stood down as a WG co-chair. Jim thanked Peter for his years of service to the WG. He announced that the consensus decision of the WG was Dave Knight should become the new co-chair. Jim welcomed him and thanked Dave's rival, Ondřej Caletka from CESNET, for volunteering.


Eduardo Duarte from DNS.PT asked about the status of the revised arrangements for the NCC's DNSMON service. Jim explained that the NCC had published its plans and asked the WG for comment. If there were no substantive changes, the document would be accepted as-is and put into effect. Since this was not a policy in the sense that it applies to some other working groups, it would not be necessary to put the document through the Policy Development Process. It could just be published as a RIPE Document.

The WG was asked to submit any comments on the proposed document in the next two weeks. Jim said that if all went well, the NCC should be able to implement the new arrangements by the end of the year or early next year.

Jim closed the meeting by thanking everyone for attending and thanking the NCC staff and the stenographers for their assistance.
