Project History
Archived, Public

Referenced Files
F192144: profile
Jul 14 2015, 4:03 PM

Details

Looks Like
Incident-20150312-whitespace
Hashtags
#incident-20150312-whitespace
Description

Project to track follow-up tasks from the March 12th site outage.

Summary:

Previously, T91773 (mc1014 server has been flaking out and dropping connectivity) had meant mc1014 was disabled to diagnose some network issues. It was flagged for service again, so I went to put it back into service with https://gerrit.wikimedia.org/r/#/c/196279/ and https://gerrit.wikimedia.org/r/#/c/196281/. Unfortunately, the changeset https://gerrit.wikimedia.org/r/#/c/196279/ had leading whitespace, which does not seem to be highlighted in red by default with the "Ignore whitespace:" setting. Jenkins gave me the +2 and the go-ahead, so I went to Tin, pulled, and issued sync-file wmf-config/session.php "enable mc1014". Shortly thereafter, users reported 503s on production sites. The change was reverted and things started returning to normal. A few bad cases were cached (seen in associated tickets), but overall the outage lasted under 5 minutes. The change should have been flagged as invalid before merge, or at least before sync.
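
As an illustration of the kind of pre-sync flagging mentioned above, here is a minimal sketch of a check that rejects a PHP file if anything precedes its opening tag. It assumes the stray whitespace sat before the opening `<?php` tag, which the summary does not state explicitly. The names `leading_junk`, `check_file`, and `main` are hypothetical and not part of any existing deployment tooling.

```python
from pathlib import Path

PHP_OPEN_TAG = b"<?php"


def leading_junk(data: bytes) -> bytes:
    """Return whatever bytes precede the first `<?php` tag.

    Returns b"" when the file starts with the tag. A file with no tag
    at all is not judged by this sketch and also yields b"".
    """
    idx = data.find(PHP_OPEN_TAG)
    if idx == -1:
        return b""
    return data[:idx]


def check_file(path: str) -> bool:
    """Print a warning and return False if the file has leading bytes."""
    junk = leading_junk(Path(path).read_bytes())
    if junk:
        print(f"{path}: {len(junk)} byte(s) before <?php: {junk!r}")
        return False
    return True


def main(paths: list[str]) -> int:
    """Return 1 if any file fails the check, else 0 (usable as an exit code)."""
    bad = [p for p in paths if not check_file(p)]
    return 1 if bad else 0
```

A wrapper could call `sys.exit(main(sys.argv[1:]))` on the files about to be synced, for example `wmf-config/session.php`, and abort on a non-zero exit code. This only catches bytes before the tag; it is not a substitute for a real syntax check.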

Eventually, I did put mc1014 back into service successfully with https://gerrit.wikimedia.org/r/#/c/196302/.

Event Timeline

greg edited a custom field.
greg renamed this project from to incident-20150312-whitespace.
greg added a member: greg.
greg renamed this project from incident-20150312-whitespace to Incident-20150312-whitespace.