High error rate and degraded performance for RIPEstat

Incident Report for RIPE NCC

Monitoring

The likely root cause is a cascade of HBase region server restarts, triggered when a ZooKeeper session expired. Why the session expired is not yet known; the most likely explanation is a session timeout after a worker did not communicate for 40 seconds.
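
As an illustration only, the suspected failure mode can be modelled as a gap between successive client heartbeats that exceeds the session timeout. The sketch below is a simplified model and not ZooKeeper's actual expiry logic. The function name `find_gaps` and the idea of a per-worker list of heartbeat timestamps are assumptions made for this example; the 40-second default is the figure quoted above.

```python
from typing import List, Tuple


def find_gaps(heartbeats: List[float], timeout_s: float = 40.0) -> List[Tuple[float, float]]:
    """Return (previous, next) timestamp pairs whose spacing exceeds timeout_s.

    heartbeats: times in seconds at which a worker was last heard from
    (hypothetical input, e.g. extracted from logs).
    """
    ordered = sorted(heartbeats)
    # Compare each heartbeat with its successor and keep only the long silences.
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > timeout_s]
```

Any pair returned marks a window in which a session with this timeout could have expired.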

We will add high-resolution network monitoring between each ZooKeeper node and all ZooKeeper clients to track potential network issues.
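
A minimal sketch of what such a probe could look like, assuming a plain TCP connect to the ZooKeeper client port is an acceptable proxy for network reachability. This is not the monitoring that will actually be deployed. The names `probe_tcp` and `summarise` are hypothetical.

```python
import socket
import time
from typing import List, Optional


def probe_tcp(host: str, port: int, timeout_s: float = 1.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None if the connect failed."""
    start = time.perf_counter()
    try:
        # The connection is closed again as soon as it has been established.
        with socket.create_connection((host, port), timeout=timeout_s):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return None


def summarise(samples: List[Optional[float]]) -> dict:
    """Summarise a series of probe results (None marks a failed probe)."""
    ok = [s for s in samples if s is not None]
    return {
        "sent": len(samples),
        "failed": len(samples) - len(ok),
        "max_ms": max(ok) if ok else None,
    }
```

Run from each client at a short interval (for example once per second) against every ZooKeeper node, a series of `probe_tcp` results would show whether connectivity gaps line up with session expirations.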

Posted Jun 18, 2026 - 15:51 CEST

Identified

We have identified a faulty node in the HBase cluster and applied mitigations.

The error rate remains elevated at around 5%, but RIPEstat functionality has improved. We are continuing to monitor the situation.

Posted Jun 18, 2026 - 14:20 CEST

Investigating

Since around 10:15 UTC, RIPEstat has been experiencing elevated error rates and increased latency.

The issue is caused by a failing node in HBase, the distributed database used by many RIPEstat datasets. We are currently investigating the issue.

Posted Jun 18, 2026 - 13:45 CEST

This incident affects: Non-Critical Services (RIPEstat).