
Keep Calm and Fix It

How to run an incident management process and my lessons learnt from a decade of failures.

A decade ago, I was part of the TYPO3 Server Admin Team and helped to keep *.typo3.org alive. When things fell apart, we could only hope that someone from the team had read the monitoring emails and was available to check.

In my current company, such a best-effort approach doesn’t fit. But this only became apparent after a historic incident years ago that lasted more than 24 hours. As we are critical to the business of thousands of companies and their millions of IoT devices, we run an incident management process with 24/7 on-call across our team of almost 100 engineers.

In this talk, I give an introduction to the art of incident management, show how you can set it up in your company, and, most importantly, share the lessons I have learned in open source and while being responsible for the process since 2019.

07.08.2026 15:45 - 16:30 Room 20/21 (CKEditor)
Talk Advanced / Basic knowledge