protel Cloud Service Interruption
Incident Report for protel Cloud Center
Postmortem

Problem Description

On Wednesday, 3 August 2022, starting at approx. 11:55 AM UTC+2:00, protel AIR customers were unable to access the NG environment and its connected applications.

The applications were reachable again at approx. 12:20 PM UTC+2:00.

To prevent such an incident from recurring, we have performed a thorough analysis of what happened and have taken corrective measures. Our assessment of the fault, including future preventive actions, can be found below.

Affected systems

IAM and connected applications of the NG environment (for example pAir, dSignature, SMP)

Impact

Login to the applications was intermittently not possible.

Root Cause

One instance of the Identity Server (WSO2) ran out of disk space due to continuous log file writing.

As a result, that server stopped accepting new login attempts.

Because one of the Identity Servers could not be reached, logging in to IAM was no longer possible for roughly half the user base, which affected all connected applications.

To fix the problem, the following step was taken:

  • Deletion of the current log file, performed in collaboration with the DevOps department.

By approx. 12:20 PM UTC+2:00, the failed Identity Server had been restarted; as a result, all systems were restored and the applications were accessible again.

Upon completion, the functionality of the Identity Server was closely monitored.

Mitigation / Preventive Actions

The following actions will be taken in order to reduce the chance of recurrence:

  • Limiting log retention to the last few days only
  • Increasing the hard disk size
  • Improved monitoring of the Identity Servers
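
To illustrate the first and third actions, here is a minimal sketch in Python of what age-based log retention and a disk-usage check could look like. It is not the tooling used by protel: the function names, the retention period, the threshold and any directory passed in are hypothetical, and in practice retention is often configured in the server's own logging setup rather than in a separate script.

```python
import shutil
import time
from pathlib import Path


def prune_old_logs(log_dir, max_age_days, now=None):
    """Delete regular files in log_dir last modified more than
    max_age_days ago and return the list of removed paths.

    log_dir and max_age_days are illustrative parameters only.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for path in sorted(Path(log_dir).iterdir()):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed


def disk_usage_percent(path):
    """Return the used share (0-100) of the filesystem containing path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def disk_alert(path, threshold_percent=80.0):
    """Return True once disk usage reaches the (assumed) alert threshold."""
    return disk_usage_percent(path) >= threshold_percent
```

Run periodically, a check like `disk_alert` would raise a warning well before the disk is full, while `prune_old_logs` keeps log growth bounded between checks.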

Disclaimer: This document has been compiled for information purposes only by protel Hotelsoftware GmbH (protel) to the best of its knowledge and belief based on information currently available and at hand. However, protel does not guarantee that the information is correct, complete, up-to-date and/or in the correct order. Protel reserves the right to make changes and/or additions without prior notice. Protel makes no express or implied warranty (including but not limited to any warranty of merchantability or fitness for a particular purpose or use, etc.) with respect to this information. Information from protel is provided to users "as is". Protel shall not be liable to users or any other person for any interruptions, inaccuracies, errors or omissions etc. in protel's information, regardless of the cause, or for any resulting damages (including, without limitation, direct or indirect, consequential damages, etc.). In all other respects, protel's General Terms and Conditions, which can be downloaded from the protel website at http://www.protel.net/de/agb/, shall apply.

Posted Aug 08, 2022 - 14:26 CEST

Resolved
This incident has been resolved.
Posted Aug 03, 2022 - 13:27 CEST
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Aug 03, 2022 - 12:39 CEST
Investigating
Login to individual protel services is currently interrupted for some customers. We are working to resolve this issue as soon as possible. View the current status and impacted services via https://cloudstatus.protel.net.
Posted Aug 03, 2022 - 12:04 CEST
This incident affected: protel Cloud Solutions | Europe, North America (protel Air) and protel Cloud Solutions | Australia, Asia (protel Air).