Application Staging (cf push, etc.) Failures
Incident Report for Pivotal Web Services
Resolved
An official stemcell containing a fix was deployed to the platform last week. We have not observed any issues since then and are closing this long-running incident.
Posted 5 months ago. Oct 30, 2018 - 14:37 PDT
Monitoring
The update to mitigate the Linux kernel regression was fully deployed to the entire platform a few days ago. Platform stability has improved. We'll continue to monitor until a permanent resolution is deployed.
Posted 5 months ago. Oct 17, 2018 - 11:45 PDT
Identified
We’ve identified a potential Linux kernel regression affecting networking performance. We initially suspected that the impact was limited to Docker applications. However, while our initial mitigation of separating Docker and buildpack applications lowered the failure rate, we continue to see failures downloading resources in the buildpack-specific infrastructure.

We are rolling out, to a portion of the platform, an update that we believe mitigates the regression. We intend to monitor the update for a short time before applying it to the remainder of the platform.
Posted 5 months ago. Oct 12, 2018 - 15:45 PDT
Monitoring
Our mitigation has brought the error rate under control, and only Docker apps should continue to experience intermittent issues. We are monitoring the platform for further symptoms. A permanent resolution to the Docker-based app issues will be forthcoming.
Posted 5 months ago. Oct 08, 2018 - 17:53 PDT
Update
We have finished applying a temporary mitigation, and the platform appears to have stabilized for buildpack-based applications as of 2018-10-07 02:00 UTC. The root cause is still being investigated but is related to an OS update adversely affecting Docker-based applications. Docker-based applications have been moved to separate infrastructure so as not to negatively impact buildpack-based applications. We expect Docker-based applications to continue to have intermittent issues until a fix can be put in place.
Posted 5 months ago. Oct 08, 2018 - 08:39 PDT
Identified
The issue has been identified and we are installing a temporary mitigation while we address the underlying cause.
Posted 5 months ago. Oct 05, 2018 - 10:14 PDT
Investigating
We've rolled out some fixes, but are continuing to see staging failures and are investigating.
Posted 6 months ago. Oct 04, 2018 - 06:59 PDT
Monitoring
The deploy to mitigate this issue has completed and we are seeing successful application pushes. We'll continue to monitor.
Posted 6 months ago. Oct 03, 2018 - 17:32 PDT
Identified
We believe we have identified the issue and are deploying a fix now.
Posted 6 months ago. Oct 03, 2018 - 16:19 PDT
Investigating
We are currently seeing failures when staging and/or starting applications (e.g., `cf push`) and are investigating.
Posted 6 months ago. Oct 03, 2018 - 15:31 PDT
This incident affected: Application Execution Pool.