1. Retrospectives

Purpose

Structured evaluation of the AI system and the team to identify improvement areas and embed them in the next cycle.

1. Objective

We evaluate the functioning of the AI system and the team in a structured and periodic manner to identify improvement points, make adjustments and embed them in the next cycle.


2. Entry Criteria

  • The system is in production (Gate 4 approved).
  • Monitoring is active and delivering measurable data.
  • The management team is assembled and has agreed a fixed cadence.

3. Core Activities

Sprint Retrospective (Bi-weekly)

The sprint retrospective evaluates the functioning of the team and the system over the past sprint. Use the Start / Stop / Continue format as a basis, supplemented with AI-specific questions:

  • What data quality problems have emerged?
  • What outputs surprised us (positively or negatively)?
  • Have any Hard Boundaries been approached or crossed?
  • How did the collaboration with the Guardian go?

Root Cause Analysis

For each significant problem the team conducts a thorough root cause analysis. Use one of these methods:

  • 5× Why: Ask "why?" five times to move from symptom to root cause.
  • Fishbone diagram (Ishikawa): Categorise causes along dimensions: Data, Model, Process, People, Tooling.
  • Timeline analysis: Reconstruct the timeline of events that led to the problem.

Change Experiments

Each retrospective results in at least one concrete change experiment — a bounded adjustment in working method, process or tooling that the team tests in the next sprint:

| Element     | Description                                                                    |
|-------------|--------------------------------------------------------------------------------|
| Hypothesis  | "If we change X, we expect Y improvement."                                     |
| Measurement | How do we measure whether the experiment succeeds? (KPI, observation, feedback) |
| Duration    | One sprint, then evaluate and decide: keep, adjust or stop.                    |
| Owner       | One team member who drives the experiment.                                     |

Sprint retrospective duration: 60 minutes. Owner: AI Product Manager. Output: Action list + change experiment in the backlog.
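The four elements of a change experiment can be captured as a small record so that every experiment in the backlog has the same shape. The following is a minimal sketch, not a prescribed format: the class name `ChangeExperiment`, the 14-day sprint length (derived from the bi-weekly cadence) and the decision rule in `decide` are illustrative assumptions. The source only states that the outcome is keep, adjust or stop.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ChangeExperiment:
    """Hypothetical backlog record for one change experiment."""
    hypothesis: str      # "If we change X, we expect Y improvement."
    measurement: str     # KPI, observation or feedback used to judge the result
    owner: str           # the one team member who drives the experiment
    start: date
    sprint_days: int = 14  # assumption: bi-weekly sprint

    @property
    def review_date(self) -> date:
        # The experiment is evaluated after one sprint.
        return self.start + timedelta(days=self.sprint_days)


def decide(baseline: float, observed: float, target: float) -> str:
    """Assumed rule for a higher-is-better KPI: keep if the target is met,
    adjust if there is some improvement over the baseline, stop otherwise."""
    if observed >= target:
        return "keep"
    if observed > baseline:
        return "adjust"
    return "stop"


experiment = ChangeExperiment(
    hypothesis="If we review flagged outputs daily, we expect fewer escalations.",
    measurement="Share of outputs accepted without rework",
    owner="Data Scientist",
    start=date(2024, 1, 1),
)
```

In practice the team will often decide on qualitative evidence as well; the numeric rule only covers the case where the measurement is a single KPI.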

Quarterly Model Retrospective

Every quarter we evaluate the model itself — not just the team:

  • Evolution of accuracy compared to the baseline.
  • Signals of Performance Degradation: has the distribution of input data changed?
  • Comparison with the original Business Case: are we still delivering the promised value?
  • Assessment of the Golden Set: are the test cases still representative?

Duration: 3 hours. Owner: Data Scientist + AI PM. Output: Quarterly Model Health Report.
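To make the question "has the distribution of input data changed?" measurable, the Data Scientist can compare a baseline sample of a numeric input feature with a recent production sample. The sketch below uses the Population Stability Index (PSI), one common choice among several. The source does not prescribe a metric, and the function name `psi`, the equal-width binning and the sample data are assumptions for illustration.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between a baseline and a recent sample.

    Bins are equal-width over the baseline range; recent values outside
    that range are counted in the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("baseline sample has no spread")
    width = (hi - lo) / bins

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp into the edge bins
            counts[idx] += 1
        eps = 1e-6  # avoid log(0) and division by zero for empty bins
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Illustrative data: a uniform baseline and a sample shifted to higher values.
baseline = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]
drift_score = psi(baseline, shifted)
```

A frequently quoted rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a significant shift, but the thresholds that trigger a retraining decision should be agreed by the team and recorded in the Quarterly Model Health Report.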

AI-Specific Retrospective Questions

In addition to the usual team insights, we also ask the following questions in every AI project:

| Dimension     | Question                                                                  |
|---------------|---------------------------------------------------------------------------|
| Data quality  | Are our training data and production data still aligned?                  |
| Governance    | Have we complied with all Hard Boundaries this sprint?                    |
| Transparency  | Can we explain to the Guardian why the system made specific decisions?    |
| Team capacity | Does the team have sufficient AI knowledge to manage the system?          |
| User feedback | What are end users saying about the quality of the output?                |

4. Team & Roles

| Role               | Responsibility                                                | R/A/C/I |
|--------------------|---------------------------------------------------------------|---------|
| AI Product Manager | Facilitates the retrospective and maintains the action list   | A       |
| Data Scientist     | Reports on model performance and Performance Degradation      | R       |
| MLOps Engineer     | Reports on infrastructure and monitoring                      | R       |
| Guardian           | Evaluates compliance with Hard Boundaries and ethics          | C       |
| End users          | Provide feedback on the quality of outputs                    | C       |

5. Exit Criteria

  • Action list is documented in the backlog with owner and deadline.
  • Model Health Report (quarterly) has been shared with the CAIO.
  • Significant findings have been passed on to the project Lessons Learned.
  • Decision on retraining or adjustment is documented.
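The first exit criterion can be checked mechanically before the retrospective is closed. The sketch below assumes action items are available as simple records. The names `ActionItem` and `incomplete_items` are hypothetical and do not refer to any particular backlog tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ActionItem:
    """Hypothetical action item as it would appear in the backlog."""
    title: str
    owner: Optional[str] = None
    deadline: Optional[date] = None


def incomplete_items(items: List[ActionItem]) -> List[str]:
    """Return the titles of action items that lack an owner or a deadline."""
    return [i.title for i in items if not i.owner or i.deadline is None]


actions = [
    ActionItem("Add drift alert to dashboard", owner="MLOps Engineer",
               deadline=date(2024, 2, 1)),
    ActionItem("Refresh Golden Set", owner="Data Scientist"),  # no deadline yet
]
```

An empty result from `incomplete_items(actions)` means the action list satisfies the owner-and-deadline criterion; the other exit criteria still need a manual check.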

6. Deliverables

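| Deliverable                   | Description                                                     | Owner          |
|-------------------------------|-----------------------------------------------------------------|----------------|
| Sprint action list            | Concrete improvement points with deadline                       | AI PM          |
| Quarterly Model Health Report | Performance, Performance Degradation, Business Case comparison  | Data Scientist |
| Retrospective Minutes         | Decisions and discussion points                                 | AI PM          |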

Next step: Record improvements in the Kaizen Log
See also: Metrics & Dashboards