[enum-wg] Re: DNSSEC outage in e164.arpa
Wolfgang Nagele wnagele at ripe.net
Fri Mar 4 13:40:11 CET 2011
Dear colleagues,

Together with our vendor, we have analysed the DNSSEC outage that occurred in e164.arpa on 15 February 2011 and have concluded that it was caused by a previously unknown bug in the signer system.

The publication of the new DS record for e164.arpa (key tag 33067) occurred on the same day that we began rolling out our new DNS provisioning system. During that migration we had to change our zone serial format from YYYYMMDDNN to a Unix timestamp. To do this, we had to increment the serials in all of our zones twice to roll to the new values (e.g. 2011021500 -> 1297728000). This excessive re-transfer behaviour seems to have caused the system that verifies the publication of new DS records to skip this signature.

As this re-transfer behaviour does not occur during normal operations, we do not foresee this becoming a recurring problem. However, we have decided that there is a need to increase the sanity checks performed before a zone gets published.

We are now in contact with NLnet Labs and are investigating the possibility of a zone transfer proxy that can be configured to validate a zone that comes in on one end and to send it out on the other end only if it validates against a given set of trust anchors. For more information:

https://lists.dns-oarc.net/pipermail/dns-operations/2011-March/006926.html

We believe this type of sanity check would benefit the community by reducing the occurrence of DNSSEC-related incidents. We would appreciate your input on what requirements should be included in a sanity checker. You can share your input with me by email, or in person at the DNS-OARC meeting in San Francisco (13-14 March).

Regards,

Wolfgang Nagele
RIPE NCC DNS Group Manager

On 2/15/11 17:30, Wolfgang Nagele wrote:
> Dear colleagues,
>
> Our DNSSEC signer system produced an e164.arpa zone that was missing a
> signature for the current KSK at approximately 13:00 UTC today.
>
> We resolved the error at approximately 16:30 UTC by producing a new zone
> (serial 1297787668).
>
> We apologise for the outage and will provide more details after we've
> analysed the incident.
>
> Regards,
>
> Wolfgang Nagele
> RIPE NCC DNS Group Manager
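
For readers wondering why the serial roll in the message above (2011021500 -> 1297728000) needed two increments: the new value is numerically smaller than the old one, and under RFC 1982 serial number arithmetic a secondary only accepts a serial that is "greater" in the 32-bit wrap-around sense, which limits a single step to at most 2^31 - 1. The following is a minimal sketch of that reasoning, not a description of the tooling actually used. The function names `serial_gt` and `roll_steps` are hypothetical, and the intermediate serial it computes is only one valid choice; the message does not say which intermediate value was published.

```python
SERIAL_MOD = 2**32      # zone serials are unsigned 32-bit values
MAX_STEP = 2**31 - 1    # largest increment RFC 1982 allows in one step


def serial_gt(a, b):
    """Return True if serial a is 'greater than' serial b under RFC 1982."""
    return a != b and (a - b) % SERIAL_MOD < 2**31


def roll_steps(current, target):
    """Return the serials to publish, in order, to move from current to target.

    Hypothetical helper: whenever the target is not yet 'greater' than the
    current serial, it first jumps forward by the maximum allowed step.
    """
    steps = []
    while current != target:
        if serial_gt(target, current):
            current = target
        else:
            current = (current + MAX_STEP) % SERIAL_MOD
        steps.append(current)
    return steps


if __name__ == "__main__":
    # Old YYYYMMDDNN serial and new Unix-timestamp serial from the message.
    print(roll_steps(2011021500, 1297728000))
```

With the example values from the message, the sketch yields two published serials: one intermediate value and then the target. Each publication triggers a fresh transfer to the secondaries, which matches the re-transfer behaviour described.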