docs: adopt blameless post mortems

and emphasize the matter of trust in the description
Daniel Wittich
2023-10-27 14:45:21 +02:00
committed by Stefan Rotsch
parent d2ee09f387
commit 19a91e66ec
2 changed files with 21 additions and 2 deletions

@@ -1,6 +1,6 @@
---
title: "Blameless Post Mortems"
ring: trial
ring: assess
quadrant: methods-and-patterns
featured: false
---
---

@@ -0,0 +1,19 @@
---
title: "Blameless Post Mortems"
ring: adopt
quadrant: methods-and-patterns
tags: [devops, documentation]
featured: false
---
> Failure and invention are inseparable twins.
>
> — <cite>Jeff Bezos</cite>

Blameless Post Mortems are a practice for dealing with the failures that inevitably occur when developing and operating complex software solutions. After any major incident or outage, the team gathers to analyze in depth what happened and what can be done to mitigate the risk of similar issues in the future.

Blameless Post Mortems are based on trust and on the assumption that everyone involved had good intentions and did the best possible job given the information at hand. They offer an opportunity to continuously improve the quality of software and infrastructure, as well as the processes for dealing with critical situations. We consider this a fundamental principle: it enables our staff to address deficiencies without fear of repercussions and reduces the probability of incidents being concealed.

The Post Mortem documentation usually includes a timeline of the events leading to an incident and the steps taken for its remediation, as well as future actions and lessons learned to enhance the resilience and stability of our services.

At AOE, we make it a priority to conduct a Blameless Post Mortem meeting after every user-visible incident.