Storage resource required for providing device-specific print data showed heavily increased latency
This incident impacted the EU deployment. The impact was mainly felt by users scanning and utilising the cloud print architecture.
· March 7th, 2022 – 11:00 UTC
· March 8th, 2022 – 17:00 UTC
The issue was found to be increased access latency on the Azure storage that holds print jobs entering uniFLOW via Email, File Upload, Mobile App, Microsoft Universal Print, Chrome Extension, or uniFLOW SmartClient when uniFLOW Online is configured as the Spool Storage destination. The latency was brought about by excessive requests that exhausted the access limits defined for the storage account.
Monday morning, March 7th
The latency on the storage jumped to a value that caused extensive delays to print data delivery. The affected print job types mentioned above could no longer be processed in a reasonable timeframe. In addition, scan jobs performed during this time saw increased delays but should still have been delivered.
Measures were taken to limit print processing and avoid saturating resources. As a result, printing was possible again; however, the time between requesting a print at the device and the device starting to print was longer than usual and took some time to stabilize.
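The postmortem does not describe how print processing was limited. Purely as an illustration of the general idea, the following is a minimal sketch of a token-bucket limiter that caps how many storage requests a worker may issue per second. The class name `TokenBucket`, its parameters, and the example rate are hypothetical and are not taken from uniFLOW Online.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical limiter: allow `rate` operations per second on average,
    with bursts of at most `capacity` operations."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if rate <= 0 or capacity <= 0:
            raise ValueError("rate and capacity must be positive")
        self.rate = rate
        self.capacity = capacity
        self._clock = clock          # injectable so the limiter can be tested
        self._tokens = capacity      # start with a full bucket
        self._last = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Return True if the operation may proceed now, False if it must wait."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


if __name__ == "__main__":
    # Assumed example values: 2 storage requests per second, burst of 2.
    bucket = TokenBucket(rate=2.0, capacity=2.0)
    for job_id in range(5):
        if bucket.try_acquire():
            print(f"job {job_id}: fetch print data now")
        else:
            print(f"job {job_id}: deferred, retry later")
```

A limiter of this kind trades throughput for stability: jobs that cannot obtain a token are deferred rather than sent to storage, which matches the observation that printing worked again but with a longer wait at the device.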
By mid-afternoon the latency had returned to normal levels, and by early evening the system was operational with only minor delays.
Tuesday morning, March 8th
Field reports and metrics showed that, despite the measures taken on Monday, the problem had started to reappear. A visible slowdown in printing resulted in another period where printing was delayed or unsuccessful.
Measures were taken to separate the Azure storage used for providing the device-specific print data needed for formatting and output job settings. These changes were reviewed before being moved into production shortly after midday.
With this action the root cause of the problem was resolved, and delivery of device-specific print data returned to its pre-incident speed. However, the emergency measures were left in place to avoid any saturation of resources.
By Tuesday evening uniFLOW Online was largely back to normal operational parameters. With the configuration changes to our storage and the mitigations in place, we continued to monitor the situation closely.
We apologize for the impact to affected customers. We are continuously taking steps to improve the uniFLOW Online Platform and our processes to help ensure such incidents do not occur in the future. In this case, this includes (but is not limited to):
Note: this postmortem covers both the March 7th and March 8th incidents, as they are a continuation of the same issue.