Increased latency on Apps Manager
Incident Report for Pivotal Web Services
Postmortem

Summary

Beginning around 16:24 UTC on July 16, 2018, latency on requests to the API nodes that back the Apps Manager developer console increased significantly. Investigation narrowed the problem to a DNS resolution failure on one API instance. The issue was resolved by restarting bosh-dns on the affected instance.

Root Cause

The issue was a bug in bosh-dns that manifested on a single API instance. From that instance, uaa.service.cf.internal could not be resolved:

curl --cacert uaa_ca.crt https://uaa.service.cf.internal:8443/token_keys
curl: (6) Could not resolve host: uaa.service.cf.internal

Before restarting bosh-dns on the instance, the BOSH team extracted logs from it for further analysis.
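
The resolution failure shown above can also be checked without curl or the UAA certificate. The following is a minimal sketch, assuming a Linux instance where getent is available; check_resolves is a hypothetical helper name used only for illustration and is not part of BOSH or bosh-dns.

```shell
#!/bin/sh
# Sketch: report whether a hostname resolves on this instance.
# check_resolves is a hypothetical helper, not a BOSH or bosh-dns command.
check_resolves() {
    host="$1"
    # getent looks the name up through the system's NSS configuration
    # (typically /etc/hosts plus the configured DNS resolver).
    if getent hosts "$host" >/dev/null 2>&1; then
        echo "OK: $host resolves"
        return 0
    else
        echo "FAIL: $host does not resolve"
        return 1
    fi
}
```

Running check_resolves uaa.service.cf.internal on each API instance would show which instances, if any, cannot resolve the UAA hostname. The non-zero return status on failure makes the helper usable in a loop or a monitoring script.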

Impact

Customers may have noticed occasional long page load times in Apps Manager.

Resolution

After the BOSH team had analyzed the affected instance, the bosh-dns job was restarted, which mitigated the issue. The permanent fix was deployed on 2018-07-26.

Posted Jul 31, 2018 - 16:04 PDT

Resolved
We have identified a potential bug in a DNS component that was causing significantly increased latency on some API calls. We have mitigated the issue and will follow up with a permanent fix.
Posted Jul 17, 2018 - 12:33 PDT
Investigating
We are currently investigating an increase in response times affecting some requests to Apps Manager.
Posted Jul 17, 2018 - 11:53 PDT
This incident affected: Apps Manager (formerly Developer Console).