RIPE 85 Routing WG Minutes

Wednesday, 26 October 2022, 09:00-10:00 UTC+1
Chairs: Ignas Bagdonas, Paul Hoogsteder, Job Snijders
Scribe: Ties de Kock
Status: Final

1. Administrivia

The presentation is available at:
https://ripe85.ripe.net/wp-content/uploads/presentations/68-rtgwg-ripe85-221025-final.pdf

The video is available at:
https://ripe85.ripe.net/archives/video/899/

Job Snijders opened the session and welcomed attendees.

2. We Love Route Leaks

Alexander Azimov, Eugene Bogomazov

The presentation is available at:
https://ripe85.ripe.net/wp-content/uploads/presentations/69-ripe85.roles_.routingwg.pdf

The video is available at:
https://ripe85.ripe.net/archives/video/900/

Eugene Bogomazov explained route leaks and the issues they cause. Route leaks result in lost revenue and degradation of service, and they are often caused by misconfigurations. They happen often and happen to both small and big ISPs. Manually mitigating route leaks requires effort and communication. Instead of configuring route leak prevention by communities that are set on ingress and checked on egress, they proposed to add BGP roles and Only-To-Customer attributes (https://datatracker.ietf.org/doc/rfc9234/).
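As an illustration of the mechanism referenced above, the following is a minimal Python sketch of the Only-To-Customer (OTC) ingress and egress rules as described in RFC 9234. It is not taken from the presentation or from any BGP implementation. The `Route` type, the function names, and the AS numbers (from the documentation range) are assumptions made for the example, and roles are expressed as the role of the neighbour as seen from the local AS.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    """Role of the neighbour, as seen from the local AS."""
    PROVIDER = "provider"
    CUSTOMER = "customer"
    PEER = "peer"
    RS = "rs"
    RS_CLIENT = "rs-client"


@dataclass
class Route:
    # Hypothetical, stripped-down route: only the fields the OTC rules need.
    prefix: str
    otc: Optional[int] = None  # value of the OTC attribute, if present


def ingress(route: Route, neighbour_role: Role, neighbour_asn: int) -> Optional[Route]:
    """Apply the OTC ingress rules; return None when the route is a leak."""
    if route.otc is not None:
        # OTC set on a route coming from below: it has already gone "down" once.
        if neighbour_role in (Role.CUSTOMER, Role.RS_CLIENT):
            return None
        # A peer may only send routes on which it set OTC itself.
        if neighbour_role is Role.PEER and route.otc != neighbour_asn:
            return None
        return route
    # No OTC yet: mark routes learned from a provider, peer or route server.
    if neighbour_role in (Role.PROVIDER, Role.PEER, Role.RS):
        return Route(route.prefix, otc=neighbour_asn)
    return route


def egress(route: Route, neighbour_role: Role, local_asn: int) -> Optional[Route]:
    """Apply the OTC egress rules; return None when the route must not be sent."""
    if route.otc is not None:
        # Marked routes may only continue towards customers or RS clients.
        if neighbour_role in (Role.PROVIDER, Role.PEER, Role.RS):
            return None
        return route
    # Unmarked routes sent sideways or downwards get the local AS as OTC.
    if neighbour_role in (Role.CUSTOMER, Role.PEER, Role.RS_CLIENT):
        return Route(route.prefix, otc=local_asn)
    return route


# Usage: a route learned from a provider is marked on ingress, so an
# accidental export to a peer is suppressed while export to a customer is not.
learned = ingress(Route("203.0.113.0/24"), Role.PROVIDER, neighbour_asn=64500)
to_peer = egress(learned, Role.PEER, local_asn=64496)
to_customer = egress(learned, Role.CUSTOMER, local_asn=64496)
```

Unlike community-based marking, the attribute is set and checked by the BGP implementation itself based on the configured role, so a mistake in an export policy does not by itself lead to a leak.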

Geoff Huston, APNIC, said one of the more annoying route leaks was when he got an aggregate from his upstream provider – they de-aggregated internally and the de-aggregations leaked. He asked if it was correct that they were not marked with anything.

Alexander said that, as far as he understood, the question was about the famous work of BGP optimisers. They were wonderful tools that gave you a way to de-aggregate your prefix when you were receiving an aggregate. He wasn’t sure how they worked or whether they copied all the attributes. If they didn’t, unfortunately they needed to wait for another document to arrive from the IETF: ASPA, and it should fill this gap too. Still, from his experience, the majority of route leaks did not originate from that source.

Job Snijders, Routing WG co-chair, said he agreed. What BGP optimisers did was more of a hijack than a route leak. Routing security was a multi-year journey, and they were not finished yet.

Rüdiger Volk, retired, said the most pressing question that came to his mind was why it took so many years from introducing the draft to actually getting it done. He didn’t think it was the author’s fault.

Alexander agreed and said it was their fault as a community. They could rely on some people to push the technology, but they would not get there without a collective effort. Routing security was a joint effort – and its success or failure was also shared. If people were not happy about some technology moving slowly, they should help with it.

3. Do We Still Need the IRR?

Massimiliano Stucchi

The presentation is available at:
https://ripe85.ripe.net/wp-content/uploads/presentations/71-10-RIPE85-IRRAnalysis.pdf

The video is available at:
https://ripe85.ripe.net/archives/video/904/

As part of MANRS, a project to improve routing security, Max took a deeper look into IRR data to ask if they needed this data and whether they could still trust it. There were about 30 IRRs which were separate databases with different levels of trust. Max and his colleagues had compared RADB and ALTDB against data from the RIRs and were sharing preliminary data. There was a lot of non-matching data in the IRR and they recommended that it was better to rely on RIR data. They encouraged the use of RPKI because it did not have external databases to look at. And they also thought legacy space holders should be allowed to use RIR services and RPKI.

Robert Lister, LONAP, said that AS sets were still important, even now that they had RPKI, because they needed a way to describe a set of ASes. So, they were still important for route server filtering.

Referring to the previous presentation, Robert also said that if you really wanted people to adopt something you had to force them. When someone joined his IXP, he expected them to have objects in RADB. Otherwise, they would say that it didn’t work – but why didn’t it work? Because they didn’t have the right stuff in RADB. Fortunately (or unfortunately), this was often driven by the larger players in the room – they were not going to put up with it anymore. Either people put this up or they were not going to peer – which caused people to decide they should do something. How many years did it take them to get working communities? And that was only brought about because they were saying to vendors ‘this does not do what we want, we are not going to buy your kit unless it supports RFC-’ – and they did it, in 2017. It was a hard job, but better than nonsense in his opinion. It was really hard to change things for the better unless it had consequences for people. A lot of people would say ‘we are not going to peer with you unless you do RPKI’ and that was likely the only way to nudge people in the right direction.

Max said there was a similar discussion yesterday during the MANRS community meeting. The goal was to set guidelines for where they wanted to be in five years. It would take time, but if they never started, they would never get there.

Robert said the other thing he disliked about the RADB was that they had a handful of networks that just seemed to put the entire world into the IRDB. They were actually only advertising 2,000 prefixes, but there were 250,000 prefixes listed in the IRDB – which prefixes were you going to accept? So, his route server had to process a huge filter list, while in practice they would only announce a tiny amount. There was not the granularity there that they would like. He thought that this analysis of what is being announced and not being announced was something to look at. Did people want to fix it? No.

Max said some people might put data in there for de-aggregation.

Job said he would close the microphone queue due to time, but he wanted to make one comment as a participant. Slide six showed how RPKI-based filtering applied to the IRR was helpful. The lower-right object was not visible through rr.ntt.net because the NTT IRR server, which mirrored all other servers, did RPKI-based filtering. The object on the lower right violated the MaxLength that Max had set. It was possible to mitigate the propagation of strange route objects in this ecosystem if everyone would upgrade to IRRD v4.
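The filtering Job described can be sketched as follows. This is a minimal Python illustration of classifying IRR route objects against validated ROA payloads (VRPs), using the valid / invalid / not-found logic of RFC 6811. It is not IRRD's actual code. The `VRP` type, the function names, and the prefixes and AS numbers (all from documentation ranges) are assumptions made for the example.

```python
import ipaddress
from typing import Iterable, List, NamedTuple, Tuple


class VRP(NamedTuple):
    """Validated ROA payload: prefix, maximum length, authorised origin AS."""
    prefix: str
    max_length: int
    asn: int


def rpki_state(prefix: str, origin: int, vrps: Iterable[VRP]) -> str:
    """Classify a (prefix, origin) pair as 'valid', 'invalid' or 'not-found'."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for vrp in vrps:
        roa_net = ipaddress.ip_network(vrp.prefix)
        # Only VRPs whose prefix covers the route's prefix are relevant.
        if net.version != roa_net.version or not net.subnet_of(roa_net):
            continue
        covered = True
        # A match needs the same origin AS and a length within MaxLength.
        if vrp.asn == origin and net.prefixlen <= vrp.max_length:
            return "valid"
    return "invalid" if covered else "not-found"


def filter_route_objects(
    objects: Iterable[Tuple[str, int]], vrps: List[VRP]
) -> List[Tuple[str, int]]:
    """Keep route objects that are not RPKI-invalid (hypothetical helper)."""
    return [obj for obj in objects if rpki_state(obj[0], obj[1], vrps) != "invalid"]


# Usage: the /25 exceeds the MaxLength of 24 and is dropped from the view;
# the object for an uncovered prefix is kept because its state is not-found.
vrps = [VRP("192.0.2.0/24", 24, 64500)]
route_objects = [
    ("192.0.2.0/24", 64500),
    ("192.0.2.0/25", 64500),
    ("198.51.100.0/24", 64501),
]
visible = filter_route_objects(route_objects, vrps)
```

In this sketch, objects with no covering ROA stay visible, so the filter only removes IRR data that conflicts with what the resource holder has published in RPKI.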

Stavros Konstantaras, AMS-IX, complimented Max on his work and the result and made two comments/requests. First of all, Max had mentioned that he had some preliminary data and they were going further. He proposed they add a check about how much split data they had, where, for example, the ORG object was with ARIN but the policy was in RADB or RIPDB. Why? Where should they go? What should they trust?

Stavros also asked Max if he saw a future where they no longer had these secondary databases like ALTDB. A situation where they only had some trustworthy databases where they could go and fetch data. Because for him at AMS-IX, as a big Internet exchange point, they would like to know that there were five reliable sources where they could go and fetch the data of their customers and then build reliable filters, in contrast to digging around.

Max said he had basically just described RPKI.

Stavros said it wasn’t there yet.

Max noted that the person behind him [Geoff Huston] was shaking his head. He thought that was a hint of the answer there. He didn’t know. What they already had were tools they could move to in order to make it more reliable and trustworthy. So, they should put more emphasis there rather than trying to fix what had been legacy for 20-30 years in the IRR.

Geoff Huston, APNIC, said he was glad they were on slide six, because that slide illustrated his point. He was not speaking for AS58280. You could put any value you wanted there and it was still valid. You could put any number you wanted there and it was still valid. But you did not have the agreement from AS58280 to actually do it.

So, he could give permissions to any AS for his prefixes and it meant nothing until the originating AS said this was the span of objects it was prepared to announce, which was the second object. Because at the moment, in this partial deployment world, if he saw a number of prefixes coming from a network, the ones that were not covered by ROAs, what was he going to say? Was that real or was someone faking it? What Vultr was doing was the other part of the offer. You offered. They accepted. There was no RPKI object to describe that acceptance. And until you got that, if you wanted to do the auditing of the complete set of announcements from an AS, you had to rely on the existing routing databases. Without that, it was just an offer, without any visible acceptance, until you completed the underlying offer of what the crypto was meant to say. RPKI was not complete in that space.

4. An Opinionated Review of RPKI Validators

Marco d'Itri

The presentation is available at:
https://ripe85.ripe.net/wp-content/uploads/presentations/25-rpki-validators-ripe85.pdf

The video is available at:
https://ripe85.ripe.net/archives/video/906/

Marco gave an overview of RPKI validator software and how it was packaged in the Debian ecosystem. He provided an opinionated overview of the software packages for the validators and RTR servers (StayRTR, rtrtr).

Rüdiger Volk, retired, thanked Marco for the review. Seeing that more than two implementations were actually available and useful was obviously good news. Having someone provide reports about how to classify the available solutions was helpful.

From the days when he was operating stuff, he remembered that the question of whether the current state of his RPKI validator actually had some anomaly in the data or in the operations was something that he cared for quite a lot. Back in those days, anomalies did occur quite frequently. He guessed that the frequency was lower today. But the RPKI system, with its complexities both in structure and in operations, would always run the risk that anomalies of various types occurred. The reporting and alarming from the implementations and potentially the heuristics used to deal with them were interesting. He thought that talking about these was difficult but recommended that this was included in the overview.

Marco said that was a good idea.

Benno Overeinder, NLnet Labs, said his opinion was that he would like to give some credit to the RIPE NCC’s RPKI Validator because they were the first, and they paved the way for the current RPKI software. He also said he understood that OctoRPKI did now have a dedicated software engineer; they were refactoring. Maybe in the future, there would be more and more frequent updates.

As for Marco’s suggestion on the packaging of Routinator using the system trust anchors, Benno said this could be resolved. He agreed that software development and packaging were two different activities. NLnet Labs traditionally relied on Debian and other distributors for packaging the software. He asked how Marco thought they should go forward with the Rust ecosystem, because this issue would also appear with kernel modules, and for example, Firefox had 10% Rust code.

Marco said the problem was not Rust itself but the quantity of libraries. Obviously, using libraries and not reinventing the wheel every time was good. But at this point, as the Debian project, they did not have a solution for this situation. They might find an acceptable way to handle vendoring but they didn’t know. They had talked about this earlier in 2022 but with no solution yet. He thought that at some point, the problem might become so big that Debian would have to come up with a solution.

5. Lightning Talk: An Update on BGP and RPKI Monitoring

Massimo Candela

The presentation is available at:
https://ripe85.ripe.net/wp-content/uploads/presentations/75-ripe85_monitoring.pdf

There was no time for Massimo Candela’s presentation.

Job recommended checking out Massimo’s slides on the RIPE 85 website.

Paul Hoogsteder, Routing WG Co-chair, said he hoped to see everyone again in Rotterdam in the spring of 2023.

End of session.