Blameless Post Mortems

Stefan Rotsch
2018-05-04 14:34:30 +02:00
parent 698944ee57
commit c5348dd4f4


@@ -1,7 +1,17 @@
---
title: "Blameless post mortems"
title: "Blameless Post Mortems"
ring: trial
quadrant: methods-and-patterns
---
TBD
> Failure and invention are inseparable twins.
>
> — <cite>Jeff Bezos</cite>
Blameless Post Mortems are a way of dealing with the failures that inevitably occur when developing and operating complex software solutions. After any major incident or outage, the team gets together to analyze in depth what happened and what can be done to reduce the risk of similar issues in the future.
Based on trust and on the assumption that everyone involved acted with good intentions and did their best given the information at hand, Blameless Post Mortems provide an opportunity to continuously improve the quality of software and infrastructure as well as the processes for dealing with critical situations.
The post mortem documentation usually consists of a timeline of the events leading to the incident, the steps taken to remediate it, and the follow-up actions and learnings for increasing the resilience and stability of our services.
At AOE, we strive to conduct a Blameless Post Mortem meeting after every user-visible incident.