From 19a91e66ec02a339f1ffa0f072fe9992d9fe5964 Mon Sep 17 00:00:00 2001
From: Daniel Wittich <45591101+danielvonwi@users.noreply.github.com>
Date: Fri, 27 Oct 2023 14:45:21 +0200
Subject: [PATCH] docs: adopt blameless post mortems and emphasize the matter of trust in the description

---
 radar/2019-11-01/blameless-post-mortems.md |  4 ++--
 radar/2023-11-01/blameless-post-mortems.md | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+), 2 deletions(-)
 create mode 100644 radar/2023-11-01/blameless-post-mortems.md

diff --git a/radar/2019-11-01/blameless-post-mortems.md b/radar/2019-11-01/blameless-post-mortems.md
index af63185..e863105 100644
--- a/radar/2019-11-01/blameless-post-mortems.md
+++ b/radar/2019-11-01/blameless-post-mortems.md
@@ -1,6 +1,6 @@
 ---
 title: "Blameless Post Mortems"
-ring: trial
+ring: assess
 quadrant: methods-and-patterns
 featured: false
----
\ No newline at end of file
+---
diff --git a/radar/2023-11-01/blameless-post-mortems.md b/radar/2023-11-01/blameless-post-mortems.md
new file mode 100644
index 0000000..db9fba4
--- /dev/null
+++ b/radar/2023-11-01/blameless-post-mortems.md
@@ -0,0 +1,19 @@
+---
+title: "Blameless Post Mortems"
+ring: adopt
+quadrant: methods-and-patterns
+tags: [devops, documentation]
+featured: false
+---
+
+> Failure and invention are inseparable twins.
+>
+> — Jeff Bezos
+
+Blameless Post Mortems provide a concept for dealing with failures that inevitably occur when developing and operating complex software solutions. After any major incident or outage, the team gathers to perform an in-depth analysis of what happened and what can be done to mitigate the risk of similar issues in the future.
+
+Based on trust and the assumption that everyone involved had good intentions to do the best possible job given the information at hand, Blameless Post Mortems offer an opportunity to continuously improve the quality of software and infrastructure and the processes for dealing with critical situations. We consider this a fundamental principle that enables our staff to address deficiencies without fear of repercussions and reduces the probability of incidents being concealed.
+
+The post-mortem documentation usually includes a timeline of the events leading to an incident and the steps taken for its remediation, as well as future actions and lessons learned to enhance the resilience and stability of our services.
+
+At AOE, we make it a priority to conduct a Blameless Post Mortem meeting after every user-visible incident.