EU: Notification of new incident. Slow performance.
Incident Report for uniFLOW Online
Postmortem

User Impact 

uniFLOW Online experienced delays in accepting connections from devices and SmartClients. This impacted login, printing and general tenant performance. Feedback from support and internal testing confirmed that the SmartClient icon was grayed out and that hovering over it showed the client as offline. 

 

Scope of Impact 

EU and UK deployments 

 

Incident Start Date and Time 

September 26 07:00 UTC 

Incident End Date and Time 

September 26 07:50 UTC (Primary Incident). 

 

Root Cause 

Following a detailed investigation, the root cause has been identified and we have put mitigation controls in place. The overall cause was resource exhaustion on one or more components that either accept requests from endpoints or process the requested actions. Addressing these resource issues is a standard operational process and, on its own, would resolve each individual incident. 

 

Underlying condition: 

Under certain conditions this load was amplified by older (pre-2023.1) SmartClients in the field, which initiate multiple requests to re-establish a lost connection and thereby add artificial load to the system. This behavior was changed in recent SmartClient versions, which also include general performance improvements. The difference in operation can be seen on other deployments that have a higher percentage of newer clients. 
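The report does not describe how newer SmartClients reconnect, so the following is only a generic illustration of why repeated reconnect requests amplify load and how spacing them out reduces it. It is a minimal sketch using capped exponential backoff with jitter, a common technique and an assumption here rather than the actual SmartClient behavior. All names and values (`backoff_delays`, `requests_in_first_window`, the client count, the timings) are hypothetical.

```python
import random


def backoff_delays(attempts, base=1.0, cap=60.0, rng=None):
    """Return the wait in seconds before each reconnect attempt.

    Capped exponential backoff with full jitter: the n-th wait is drawn
    uniformly from [0, min(cap, base * 2**n)]. Hypothetical values.
    """
    if rng is None:
        rng = random.Random()
    return [rng.uniform(0.0, min(cap, base * 2 ** n)) for n in range(attempts)]


def requests_in_first_window(clients, attempts, window,
                             base=1.0, cap=60.0, seed=0):
    """Count reconnect requests arriving within `window` seconds of an outage."""
    rng = random.Random(seed)
    count = 0
    for _ in range(clients):
        elapsed = 0.0
        for delay in backoff_delays(attempts, base, cap, rng):
            elapsed += delay
            if elapsed <= window:
                count += 1
    return count


if __name__ == "__main__":
    clients, attempts = 1000, 5
    # Hypothetical client that fires every attempt immediately:
    # all requests land at t = 0.
    print("immediate retries:", clients * attempts)
    # Same number of attempts, spread out by backoff and jitter.
    print("with backoff:", requests_in_first_window(clients, attempts, window=1.0))
```

In this toy model the total number of attempts is the same in both cases; only the share that hits the service in the first second changes, which is the part that matters for a component that is already short of resources.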

During the investigation we also confirmed that this root cause was a contributing factor in the previous incidents listed below.  

 

Minor: 

  • Sept 4th – EU - SmartClient registration delays.  
  • Sept 13th – UK - SmartClient connection issues.  
  • Sept 16th – EU - Email Print Degradation.  
  • Sept 20th – UK - Slow performance.  

Major: 

  • Sept 11th – UK - Performance and connection issues.  
  • Sept 26th – EU - Slow performance. 

Each of the listed incidents was unique in its impact on performance and usability. The visible impact on the end user was always the same or similar: performance and connectivity issues.  

 

Action Taken: 

  • In each of these cases an investigation was undertaken, and together these investigations led to the overall conclusion. 
  • Scaling adjustments and service improvements have been implemented to address the current load conditions. 
  • Development implemented cloud-side code improvements over the weekend of 30th September to 1st October to handle the SmartClient requests. This does not remove the need to update the SmartClients in the field, but it suppresses the additional communication and connections where possible. 

  • QA testing and evaluation over the weekend showed an immediate improvement in service condition and health. 

 

Customer Considerations: 

As with any software, SmartClient should be kept at the latest version to benefit from performance and security improvements. Customers still running a SmartClient version prior to 2023.1 are advised to perform this update. 

 

Next Steps 

We apologize for the impact on affected customers. We are continuously taking steps to improve the uniFLOW Online Platform and our processes to help ensure such incidents do not occur in the future. In this case, this includes (but is not limited to): 

  • Metrics and performance monitoring have been extended to cover the changes implemented. 
  • Implementing an EOL plan for older SmartClients to improve overall performance. 

  • With 2023.4, installed new SmartClients will become visible. 

  • Planning is working on specifications for a SmartClient auto-update mechanism that the tenant admin can configure.

Posted Oct 02, 2023 - 13:07 UTC

Resolved
Hello Everyone,

Update: Incident Resolved.

Date/Time:
The performance issue started on September 26th at 7:00am UTC and was resolved at 7:50am UTC.
From the resolution of the performance issue, we saw a gradual recovery of uniFLOW SmartClients as they successfully registered throughout the day.

We will review the findings and collected information from this incident to further improve our online services. A postmortem is published for Major incidents (within a maximum of 20 business days) once a thorough investigation has been completed.

We are sorry for the inconvenience this has caused.

Kind Regards
Online Operations Team
Posted Sep 26, 2023 - 14:27 UTC
Update
Update:
We are continuing to monitor the deployment.

Next Update:
The next update will be in 1 hour.
Posted Sep 26, 2023 - 13:25 UTC
Update
Update:
We are continuing to monitor the deployment.

Next Update:
The next update will be in 1 hour.
Posted Sep 26, 2023 - 12:18 UTC
Update
Update:
We are continuing to monitor the deployment.

Next Update:
The next update will be in 1 hour.
Posted Sep 26, 2023 - 11:20 UTC
Update
Update:
Services are still running in a stable condition. uniFLOW SmartClients may show a greyed out icon while reestablishing a connection to uniFLOW Online. We are continuing to monitor the deployment.

Next Update:
The next update will be in 1 hour.
Posted Sep 26, 2023 - 10:20 UTC
Update
Update:
Services are still running in a stable condition. We are continuing to monitor the deployment.

Next Update:
The next update will be in 1 hour.
Posted Sep 26, 2023 - 09:13 UTC
Monitoring
Incident details:
We are currently monitoring an issue relating to slow performance in the EU deployment which has since been resolved.

Start Time:
26th September 2023
07:00am UTC

Incident Scope: 
EU Deployment

Description:
Customers may have experienced slowness or timeouts performing certain tasks across tenants in the EU deployment between 07:00am UTC and 07:45am UTC.

Next Update:
The next update will be in 1 hour.
Posted Sep 26, 2023 - 08:02 UTC
This incident affected: EU Deployment (Identification, Printing, Email print, Scanning, Reporting, Other services).