Users encounter issues when accessing West Europe (Production 1)

Incident Report for Templafy

Postmortem

Investigation

On February 9, 2026, at 8:19 AM CET, a performance issue began affecting the database serving West Europe (Production 1). The incident was detected at 8:22 AM CET, when automated monitoring raised an alert indicating that database CPU utilization had reached 100%.

As CPU utilization increased, the database reached its maximum connection limit, which caused dependent services to become slow or unresponsive for users in the affected environment. The engineering team initiated an investigation immediately after detection and analyzed database performance metrics and active workloads.
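The failure mode described above, a fixed connection limit being exhausted so that new requests block or fail, can be sketched with a bounded pool. This is an illustrative toy only; the pool size and timeout are invented and do not reflect Templafy's actual configuration:

```python
import threading

class BoundedConnectionPool:
    """Toy pool: at most max_connections concurrent checkouts."""

    def __init__(self, max_connections):
        self._slots = threading.BoundedSemaphore(max_connections)

    def acquire(self, timeout=0.1):
        # When slow queries hold their connections, new requests time out
        # here, which callers observe as slowness or unavailability.
        return self._slots.acquire(timeout=timeout)

    def release(self):
        self._slots.release()

pool = BoundedConnectionPool(max_connections=2)

assert pool.acquire()       # connection 1 held by a slow query
assert pool.acquire()       # connection 2 held by a slow query
assert not pool.acquire()   # limit reached: the new request cannot proceed
pool.release()              # a slow query finishes
assert pool.acquire()       # capacity is available again
```

This is why scaling up and resetting connections restores responsiveness: it frees the held slots that were starving new requests.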

By 9:05 AM CET, the team identified that the database had stopped accepting new connections from services. The investigation revealed that a recently introduced query change was causing excessive CPU consumption under certain data conditions.

Mitigation and Resolution

To stabilize the platform and restore service responsiveness, the engineering team implemented an immediate mitigation at 9:15 AM CET by scaling up the database server. This action increased available compute capacity and reset active database connections, allowing blocked operations to complete and preventing further connection exhaustion.

The mitigation reduced CPU pressure and enabled services to gradually recover while further validation was performed.

Shortly after, the database fully stabilized: connection limits were no longer being reached, and all dependent services were operating normally. Continuous monitoring confirmed that CPU utilization had returned to expected levels and that user-facing functionality was fully restored.

Impact and Scope

The incident affected most users served by West Europe (Production 1). During the incident window, users may have experienced slow responses or temporary unavailability in services relying on the database.

No other production clusters were impacted.

Post-Incident Actions

Following the incident, the engineering team defined several follow-up actions to reduce the risk of similar issues occurring in the future:
• Expanding test datasets to better reflect production-scale data and diverse data distributions when validating performance-sensitive changes
• Increasing internal awareness and knowledge sharing around database query optimization behavior and execution plan variability
• Investigating broader technical solutions and tooling to help detect and mitigate database execution plan selection issues before deployment
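The "execution plan variability" called out above refers to a planner choosing a query strategy from estimated statistics that may not match the live data distribution. A minimal sketch of the idea, with an invented cost model and threshold purely for illustration:

```python
def choose_plan(estimated_selectivity, threshold=0.1):
    """Toy planner: index seek for selective predicates, full scan otherwise."""
    return "index_seek" if estimated_selectivity < threshold else "full_scan"

def cost(plan, actual_matching_rows, total_rows):
    # Invented cost model: an index seek pays per matching row,
    # a full scan pays once per row in the table.
    return actual_matching_rows * 3 if plan == "index_seek" else total_rows

# Statistics from a small, uniform test dataset suggest 1% selectivity,
# so the planner picks an index seek.
plan = choose_plan(estimated_selectivity=0.01)

# On skewed production data the predicate actually matches 40% of rows,
# making the chosen plan far more expensive than a plain full scan.
total_rows = 1_000_000
actual = int(total_rows * 0.40)
assert cost(plan, actual, total_rows) > cost("full_scan", actual, total_rows)
```

This is the gap the first follow-up action targets: validating performance-sensitive changes against production-scale, realistically distributed data so that estimated and actual selectivity agree.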

These actions are aimed at strengthening our validation processes and improving overall platform resilience.

Posted Feb 10, 2026 - 16:14 CET

Resolved

The incident has been resolved, and further information will be provided in a postmortem shortly.

We apologize for the impact on affected customers.
Posted Feb 09, 2026 - 16:28 CET

Monitoring

The incident has been mitigated, and our team is actively monitoring the affected systems to confirm ongoing stability and performance.
Posted Feb 09, 2026 - 09:15 CET

Identified

We have identified an issue that affects a subset of customers and are working towards a resolution.
Further updates will be posted here soon.
Posted Feb 09, 2026 - 08:30 CET
This incident affected: Templafy Hive (Authentication, Library & Dynamics, User Management).