
RPKI deployment best practices

On 05-01-2023
 

Resource Public Key Infrastructure (RPKI) is a framework that helps secure the internet's routing infrastructure, in particular the border gateway protocol (BGP). It also helps legitimate resource holders better control how their IP resources are routed, to prevent hijacking and other attacks. RPKI is important, but you need to know what you are doing: following a two-year deployment, Orange has some best practices to help you.

Consider the evolution of the internet: today, it has grown into a worldwide network on which consumers and companies rely for all kinds of tasks and communication. As more online services shift to IP transport, IP transit reliability and quality have become increasingly important for internet users, internet service providers (ISPs), and content and cloud service providers.

And while IP provides internet transport, the border gateway protocol is essential in determining the path that IP packets take. BGP is responsible for exchanging routing and reachability information between autonomous systems (AS), and in doing so it builds and maintains the internet routing table. However, BGP isn’t capable of validating routing information by itself, which is where RPKI has an important role to play. RPKI is both an IETF standard framework and one of the actions identified by the mutually agreed norms for routing security (MANRS) global initiative, in order to:

  • Improve the security and resilience of the internet’s global routing system
  • Reduce the risk of accidental BGP routing incidents, also known as route leaks
  • Prevent malicious IP resource hijacks 

Challenges of RPKI deployments 

Orange has spent the last two years deploying RPKI across our wholesale internet network. It’s been a big undertaking, but we are delighted with the result and we’re confident it will be a big contribution to the overall security of the global internet.

But challenges remain. When a telco implements RPKI, it can have a cumulative effect: it doesn’t just impact the operator’s direct customers, but also its customers’ customers. For example, one of your customers’ customers might have invalid IP addresses. They might try to remedy this by changing their IP addresses, but neglect to update the corresponding route origin authorization (ROA). If this happens, RPKI validation will detect the discrepancy between the announced BGP prefixes and the ROA, consider the route invalid and block it.
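To make the ROA mismatch concrete, here is a minimal Python sketch of the origin validation logic standardized in RFC 6811. It is an illustration only, not Orange’s implementation: the `Roa` class and `validate_route` function are names made up for this example, and the prefixes and AS numbers come from the ranges reserved for documentation.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class Roa:
    prefix: str      # prefix authorized by the resource holder
    max_length: int  # longest prefix length allowed in an announcement
    origin_as: int   # AS authorized to originate the prefix


def validate_route(prefix: str, origin_as: int, roas: list[Roa]) -> str:
    """Return 'valid', 'invalid' or 'not-found' for one BGP announcement."""
    route = ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        # A ROA covers the route when the route sits inside the ROA prefix.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if route.prefixlen <= roa.max_length and origin_as == roa.origin_as:
            return "valid"
    # Covered by at least one ROA but matched by none: the route is invalid.
    return "invalid" if covered else "not-found"


# Documentation prefixes and AS numbers, for illustration only.
roas = [Roa("192.0.2.0/24", 24, 64500)]
print(validate_route("192.0.2.0/24", 64500, roas))     # valid
print(validate_route("192.0.2.0/24", 64501, roas))     # invalid
print(validate_route("198.51.100.0/24", 64500, roas))  # not-found
```

The second call mirrors the scenario above: the announcement changed, the ROA did not, and the route is marked invalid. Only invalid routes are candidates for filtering; routes with no covering ROA are reported as not found.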

Orange’s structured deployment approach 

Our experience was that implementing RPKI across our network benefited from a structured approach that we broke down into six phases across the two years. We started from a position of minimizing any potential impact on our customers, which became our guiding principle throughout.

The six phases were as follows:

1. Ensure all technical prerequisites are deployed on the network. For example, you must make sure all your routers have the correct software version installed, otherwise RPKI may not perform correctly.
2. RTR session activation. This is where you implement the RPKI-to-router protocol, or RTR for short. It’s best done on a small scale, on individual routers to start with, to test that it is working correctly. Then you can accelerate implementation and scale up.
3. RPKI observation activation. Next, we created and published route origin authorizations (ROAs) and observed how RPKI reacted, without filtering. This activation varies depending on the router supplier: for some, it’s done per BGP session. Orange has more than 2,600 BGP sessions, which is why this phase took some time.
4. Notify customers of upcoming work and possible disruptions. Thanks to the observation we carried out in phase three, we were able to detect all the problems our customers might experience. RPKI observation allowed us to identify which IP address ranges would be detected as invalid once filtering was active, so we could warn customers in advance and mitigate the impact.
5. Customer RPKI ROA clean up. We allocated time to spend with customers with invalid routes to identify root causes of issues and update their ROA information.
6. Activate route origin validation (ROV) filtering with a defined plan. Thanks to our time spent on the observation phase, we were able to identify where the fewest customers would be impacted, so we could start activation in lower-risk regions. And using an automation program developed by our engineers, we could activate filtering in batches of around 20 customers.
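The hand-off from observation to batched activation can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the automation program mentioned above: `plan_batches` and the customer identifiers are hypothetical, and risk is approximated here by the number of invalid routes seen per customer during observation, whereas the real plan worked region by region.

```python
from collections import Counter


def plan_batches(customers, invalid_routes, batch_size=20):
    """Order customers by observed invalid routes, then split into batches.

    invalid_routes holds one customer identifier per invalid route seen
    during the observation phase (hypothetical input format).
    """
    invalid_per_customer = Counter(invalid_routes)
    # Customers with no invalid routes go first; ties are sorted by name.
    ordered = sorted(customers, key=lambda c: (invalid_per_customer[c], c))
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


# Made-up data: 45 customers, two of them with invalid routes.
customers = [f"cust-{n:03d}" for n in range(45)]
invalid_routes = ["cust-007", "cust-007", "cust-030"]

batches = plan_batches(customers, invalid_routes)
for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {len(batch)} customers, last is {batch[-1]}")
```

Sorting before batching means the customers most likely to be affected are activated last, which leaves them the most time to clean up their ROAs.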

What did we learn? 

Our structured, phased approach to RPKI implementation allowed us to assess the successes and potential pitfalls of the process along the way. The following are best practices to keep in mind when implementing RPKI.

Use a phased approach

This was a key learning: don’t rush. An RPKI implementation comprises several technical deployments and network checks, such as aligning different router software versions, before you can deploy specific RPKI software. Then there are checks as part of the RPKI implementation itself, such as ROA publication and ROV. All these can affect your end customers, so breaking your project into phases is important. This gives you greater control and helps you manage any incidents or disruptions more comprehensively.

Carry out bug-scrubbing before you start 

RPKI is still a relatively new functionality, and so not all router software supports it. Therefore, it’s a good idea for you and/or your router provider to run a simulated test to see whether your routers and their software release support RPKI. After you’ve done this, ensure all routers from a given supplier are updated with the latest software version, and confirm interoperability between different providers.
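A software audit of this kind is easy to script. The Python sketch below flags routers whose software is older than the minimum release needed for RPKI. Everything in it is an assumption made for illustration: the supplier names, router names and version numbers are invented, and versions are assumed to be purely numeric and dot-separated, which is not true of every supplier’s release naming.

```python
def parse_version(text):
    """Turn '7.10.2' into (7, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def routers_needing_upgrade(inventory, minimum_versions):
    """Return the names of routers that are not yet ready for RPKI."""
    not_ready = []
    for router in inventory:
        minimum = minimum_versions.get(router["vendor"])
        # Unknown suppliers are flagged too, so they get checked by hand.
        if minimum is None or parse_version(router["version"]) < parse_version(minimum):
            not_ready.append(router["name"])
    return not_ready


# Hypothetical minimum releases and inventory.
minimum_versions = {"vendor-a": "7.9.1", "vendor-b": "21.4"}
inventory = [
    {"name": "edge-1", "vendor": "vendor-a", "version": "7.10.2"},
    {"name": "edge-2", "vendor": "vendor-a", "version": "7.3.0"},
    {"name": "edge-3", "vendor": "vendor-b", "version": "21.4"},
    {"name": "edge-4", "vendor": "vendor-c", "version": "1.0"},
]
print(routers_needing_upgrade(inventory, minimum_versions))  # ['edge-2', 'edge-4']
```

Comparing tuples of integers rather than raw strings matters here: as text, "7.10.2" sorts before "7.9.1", which would wrongly flag an up-to-date router.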

Keep your customers involved and informed 

In line with our guiding principle of minimizing impact on customers, we kept them informed at every step of the project. Good communications were paramount and we told them several weeks in advance that changes were coming. At the start of the project we told them the global RPKI deployment was starting and advised them of things they should do to limit potential impact on their operations. Next, and before we deployed filtering, we engaged a couple of customers who used different types of routers to perform tests on those. And finally, during the observation phase, we identified invalid routes for customers, informed them, and gave them time to make corrections to them prior to launching filtering.

Track your progress 

Monitoring and tracking how your implementation is working is crucial. If, like Orange, you need to implement RPKI on thousands of customer connections or network interconnections, it’s essential to produce automated dashboards that can track how many are complete against how many are still pending.
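The figure behind such a dashboard can be very simple. The Python sketch below counts how many BGP sessions have reached the final state and how many are still pending. The state names (`pending`, `observing`, `filtering`) and the `progress_summary` function are hypothetical, chosen to mirror the phases described earlier.

```python
from collections import Counter


def progress_summary(session_states):
    """Summarize deployment progress from a {session: state} mapping."""
    counts = Counter(session_states.values())
    total = len(session_states)
    done = counts["filtering"]  # a session is complete once filtering is active
    return {
        "total": total,
        "done": done,
        "pending": total - done,
        "percent_done": round(100 * done / total, 1) if total else 0.0,
    }


# Made-up sample of four sessions.
states = {
    "session-1": "filtering",
    "session-2": "observing",
    "session-3": "pending",
    "session-4": "filtering",
}
print(progress_summary(states))
```

Fed from the router inventory on a schedule, a summary like this gives a single completion figure per region or per supplier.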

A progressive step forward 

It took two years, but Orange has now deployed RPKI across our entire wholesale internet network. It was a major undertaking, but we’re confident it was time and resources well spent and it will contribute to the overall security of the global internet.

RPKI is now activated by default for any new Orange customer. And to avoid any unwanted filtering, we also provide every customer with a specific list of invalid addresses via our online eCare portal. This helps us keep minimizing impact for our customers, and improving the overall effectiveness and accuracy of the RPKI filtering.

