Mobile Print Service Degradation Japan
Incident Report for uniFLOW Online
Postmortem

User Impact

A small number of users who submitted mobile print jobs during this incident experienced delays in retrieving their files. No submitted mobile print jobs are believed to have been lost, with the exception of one (1) corrupted file that caused the conversion queue to fail.

Scope of Impact

This only affected the Japan deployment and was limited to a small number of users and approximately 20 print jobs.

Incident Start Date and Time

Jun 16, 2020 - 05:39 UTC

Incident End Date and Time

Jun 16, 2020 - 06:14 UTC

Root Cause

A single Excel file caused the mobile print conversion engine to become unresponsive. The same file was submitted multiple times, which blocked all conversion engines, so jobs submitted from that point forward were not converted. The resolution was to clear the offending job from all queues and restart the services. We then saw job processing return to normal.

Lessons Learnt

We cannot ensure the integrity of a submitted file, nor should we. Instead, the development team has implemented safeguards to prevent this type of incident from recurring. The processing queues are now monitored for stale jobs; if a timing threshold is reached, the file is removed first and the service is then restarted. This fix will be incorporated in the next release and will be monitored closely to ensure its effectiveness.
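
The stale-job safeguard described above could be sketched roughly as follows. This is an illustration only, not the uniFLOW Online implementation: the names `Job`, `ConversionQueue`, `sweep_stale_jobs` and `restart_service`, and the use of a single time threshold in seconds, are all assumptions made for the example.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Job:
    job_id: str
    started_at: float  # epoch seconds when conversion of this job began


@dataclass
class ConversionQueue:
    # Hypothetical in-memory stand-in for a conversion queue.
    jobs: List[Job] = field(default_factory=list)


def sweep_stale_jobs(
    queue: ConversionQueue,
    threshold_seconds: float,
    restart_service: Callable[[], None],
    now: Optional[float] = None,
) -> List[str]:
    """Remove jobs that have been processing for too long, then restart.

    Returns the IDs of the jobs that were removed (empty if none were stale).
    """
    current = time.time() if now is None else now
    stale_ids = {
        job.job_id
        for job in queue.jobs
        if current - job.started_at >= threshold_seconds
    }
    if not stale_ids:
        return []

    # Step 1: remove the offending file(s) first, so they cannot block
    # the engine again once the service comes back up.
    removed = [job.job_id for job in queue.jobs if job.job_id in stale_ids]
    queue.jobs = [job for job in queue.jobs if job.job_id not in stale_ids]

    # Step 2: restart the conversion service (assumed to be a callable
    # supplied by the caller).
    restart_service()
    return removed
```

The ordering matters in this sketch: restarting before removing the file would let the same job be picked up again and block the engine a second time.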

Posted Jun 25, 2020 - 07:39 UTC

Resolved
The issue is closed as resolved. The system has returned to a normal state and there are no further delays in the processing queues.

We will review the data and logs collected and look to improve our monitoring and services.

Kind regards,
uniFLOW Online Team
Posted Jun 16, 2020 - 06:14 UTC
Monitoring
The issue has been identified and mitigating steps have been put in place. We saw a slight rise in the number of jobs held in the Mobile Print processing queue, but this has already returned to normal.

This incident will be placed into monitoring, with the service status set to operational.
Posted Jun 16, 2020 - 05:53 UTC
Investigating
Advisory Information

Start time: 10-06-2020 5:30 UTC
Affected deployment(s): JP

User Impact

Some users may be experiencing delays in releasing mobile print jobs on the Japan deployment. This is under investigation and there will be another update in 30 minutes.

Kind regards,
The uniFLOW Online team
Posted Jun 16, 2020 - 05:39 UTC
This incident affected: JP Deployment (Email print).