Operations

Introducing ERA-IX Rapid Shutdown

Elias
Published on 21 April 2026

We're introducing ERA-IX Rapid Shutdown to reduce the impact when a member unexpectedly loses physical connectivity towards ERA-IX. We've seen unexpected disconnects happen due to peering router crashes, fiber cuts, and various other reasons.

What's the problem?

On traditional internet exchanges, when your link towards the exchange goes down, the route-servers are unaware of it and continue to redistribute your routes until the BGP hold timers expire. During this period, the exchange drops all traffic towards your network because it cannot be delivered, leading to a complete loss of connectivity for as long as the hold timers last.
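
To put a rough number on that window, here is a minimal sketch in Python. It assumes keepalives arrive on schedule and that nothing else resets the hold timer. The 90 second hold time and 30 second keepalive interval are the values suggested in the BGP specification, used only for illustration; this post does not state the timers in use on ERA-IX, and the function name is hypothetical.

```python
def blackhole_window(hold_time_s: float, keepalive_s: float) -> tuple[float, float]:
    """Return the (shortest, longest) time in seconds that routes stay
    advertised after a link failure when only the hold timer detects it.

    The hold timer restarts on every keepalive received. If the link fails
    right after a keepalive, the full hold time remains. If it fails just
    before the next keepalive was due, roughly one keepalive interval less
    remains.
    """
    if not 0 < keepalive_s < hold_time_s:
        raise ValueError("keepalive interval must be positive and below the hold time")
    return (hold_time_s - keepalive_s, hold_time_s)


print(blackhole_window(90, 30))  # (60, 90)
```

With these illustrative timers, traffic towards the disconnected member would be dropped for somewhere between one and one and a half minutes.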

What's the solution?

With Rapid Shutdown, our route-servers are aware when a port goes down unexpectedly and immediately purge the routes belonging to that physical connection. Other networks are notified to update their routes, so the problem is worked around faster and the impact of losing a link towards ERA-IX is reduced. This is just one of the things we've implemented to facilitate a first-class peering experience on ERA-IX 😎

For Rapid Shutdown, we detect which services belong to a physical link when it changes state. The system then immediately purges the BGP routes with next-hops belonging to this port. When this happens, the route-servers also send a message to the (presumably unreachable) member for diagnostic purposes, which would read in the logs along the lines of "6/4 (Cease/administrative reset) reason: rapidshutdown".
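
The purge step described above can be sketched roughly as follows. This is an illustrative Python model, not the ERA-IX implementation: the `RouteServer` class, the `port_next_hops` inventory, the port names, and the documentation-range addresses are all assumptions made for the example. Only the notification text is taken from this post.

```python
from dataclasses import dataclass, field

# Notification text as quoted in this post; how it is actually sent is not described.
CEASE_REASON = "6/4 (Cease/administrative reset) reason: rapidshutdown"


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str  # peering LAN address of the announcing member


@dataclass
class RouteServer:
    # Hypothetical inventory: physical port name -> next-hop addresses of the services on it.
    port_next_hops: dict[str, set[str]]
    rib: list[Route] = field(default_factory=list)
    sent: list[tuple[str, str]] = field(default_factory=list)

    def on_port_down(self, port: str) -> list[Route]:
        """Purge every route whose next-hop belongs to the failed port."""
        next_hops = self.port_next_hops.get(port, set())
        purged = [r for r in self.rib if r.next_hop in next_hops]
        self.rib = [r for r in self.rib if r.next_hop not in next_hops]
        # Record the diagnostic message for each (presumably unreachable) member address.
        for address in sorted(next_hops):
            self.sent.append((address, CEASE_REASON))
        return purged


rs = RouteServer(
    port_next_hops={"port-a": {"192.0.2.10"}, "port-b": {"192.0.2.20"}},
    rib=[
        Route("198.51.100.0/24", "192.0.2.10"),
        Route("203.0.113.0/24", "192.0.2.20"),
    ],
)
purged = rs.on_port_down("port-a")
print([r.prefix for r in purged])  # ['198.51.100.0/24']
print([r.prefix for r in rs.rib])  # ['203.0.113.0/24']
```

Matching on next-hop rather than on the BGP session mirrors the description above: everything that would forward traffic into the failed port is removed, while routes from members on other ports stay untouched.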

Did BFD not solve this ages ago?

While we do support BFD (Bidirectional Forwarding Detection) on our route-servers, and encourage its use in case of transported services, we're not observing widespread implementation of it. There are also some disadvantages to BFD: problems with the implementation (bugs) or operating conditions (high CPU load) can cause unexpected session flaps.

In contrast to BFD, Rapid Shutdown is an out-of-band mechanism. This means that if the Rapid Shutdown service itself fails or encounters a problem, it fails safe and causes no impact to any operational services.

