
1. Core Activities & Roles (Monitoring & Optimisation)

Purpose

Overview of core activities and role assignments during the Monitoring & Optimisation phase, from operational monitoring to drift detection and cost control.

1. Core Activities

Operational Monitoring & MLOps

We monitor the 'heartbeat' of the system.

  • Real-time Performance Tracking: Dashboards of critical metrics: latency, error rates, uptime, throughput.
  • Performance Degradation Monitoring: Statistical checks on whether production input data deviates from the training data (Data Drift) or whether the relationship between inputs and outcomes changes (Concept Drift).
  • Data Loop Integration: Feeding production data and outcomes back into the development environment for analysis (Feedback Loop).
  • Automated Triggers: Alerts that fire when metrics drop below thresholds (e.g. accuracy < 85%).
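The drift monitoring and automated triggers above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: it uses the Population Stability Index (a common drift statistic) over binned values, and the 0.2 PSI alarm level and 85% accuracy floor are example thresholds, not prescriptions from this document.

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a training ('expected') and a
    production ('actual') sample. PSI > 0.2 is a common drift alarm level."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a zero-width range
    def bin_fractions(sample):
        counts = [0] * bins
        for x in sample:
            i = min(max(int((x - lo) / width), 0), bins - 1)
            counts[i] += 1
        # floor at a small value so log(0) never occurs for empty bins
        return [max(c / len(sample), 1e-4) for c in counts]
    e, a = bin_fractions(expected), bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

def drift_alert(psi_value, accuracy, psi_threshold=0.2, acc_threshold=0.85):
    """Automated trigger: fire on statistical drift OR on an accuracy drop
    below the agreed threshold (e.g. accuracy < 85%)."""
    return psi_value > psi_threshold or accuracy < acc_threshold
```

In practice the production sample would come from the feedback loop described above, and the alert would page the MLOps Engineer rather than return a boolean.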

Continuous Improvement & Retraining

Standing still means falling behind.

  • Retraining Strategy: When do we retrain? (Periodically? On drift alert? On new data?).
  • Experiment Loops: Use production insights to test new hypotheses in short sprints (A/B testing, Canary releases).
  • Backlog Management: Maintain a living list of bugs, improvements and feature requests from users.
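The retraining questions above (periodically? on drift alert? on new data?) can be combined into one explicit policy. A minimal sketch, assuming illustrative thresholds (90-day model age, 5,000 new labels) that each team would set for itself:

```python
from datetime import date

def should_retrain(last_trained, today, drift_alerted, new_labels,
                   max_age_days=90, min_new_labels=5000):
    """Return the list of retraining triggers that currently apply:
    'scheduled' (model age), 'drift' (alert fired), 'new_data' (volume)."""
    reasons = []
    if (today - last_trained).days >= max_age_days:
        reasons.append("scheduled")
    if drift_alerted:
        reasons.append("drift")
    if new_labels >= min_new_labels:
        reasons.append("new_data")
    return reasons
```

Returning the reasons rather than a bare boolean keeps the decision auditable, which also feeds the logging requirements in the compliance section below.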

Cost Control & Energy Efficiency

Sustainability in euros and CO2.

  • Cloud & API Optimisation (Cost Overview): Monthly review of compute (GPU/CPU) and token costs. Optimise through model compression (quantisation) or caching.
  • Sustainability Measurement (ESG): Monitoring energy consumption (inference footprint) and reporting for ESG goals.
  • Resource Allocation: Set up autoscaling to adjust infrastructure to actual demand.
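Caching, mentioned above as a cost lever, is straightforward to sketch: identical prompts are served from memory instead of re-spending compute and tokens. The class below is an illustrative in-process LRU cache, not a recommendation of a specific library; `model_fn` stands in for whatever model or API call the team uses.

```python
import hashlib
from collections import OrderedDict

class ResponseCache:
    """A tiny LRU cache for model responses. Repeated prompts hit memory
    instead of the model, cutting token and compute costs."""
    def __init__(self, max_size=1024):
        self.store = OrderedDict()
        self.max_size = max_size
        self.hits = 0     # tracked so the monthly cost review can
        self.misses = 0   # report the cache's actual savings rate

    def _key(self, prompt):
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def get_or_call(self, prompt, model_fn):
        k = self._key(prompt)
        if k in self.store:
            self.hits += 1
            self.store.move_to_end(k)  # mark as recently used
            return self.store[k]
        self.misses += 1
        result = model_fn(prompt)
        self.store[k] = result
        if len(self.store) > self.max_size:
            self.store.popitem(last=False)  # evict least recently used
        return result
```

The hit/miss counters double as input for the monthly cost overview: hit rate times average call cost is the money the cache saved.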

Ethical Oversight & Compliance Monitoring

Ongoing legal conformity.

  • Post-Market Surveillance: (EU AI Act requirement) Continuously scanning for unforeseen bias, discrimination or safety risks.
  • Audit-ready Logging: Retaining logs of decisions and human interventions for auditors.
  • Transparency Reports: Periodic reporting to stakeholders and CAIO on safety and performance.
  • Fairness Audit (Bias Audit): Regular sampling by the Ethicist of the 'tone' and quality of outputs.
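Audit-ready logging benefits from tamper evidence: chaining each entry to the hash of the previous one lets auditors detect after-the-fact edits. The field names below are illustrative, not a mandated schema:

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(decision, model_version, inputs_hash,
                 human_override=None, prev_hash=""):
    """One audit-log entry. Including prev_hash chains entries together,
    so removing or altering any earlier record breaks the chain."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "inputs_hash": inputs_hash,        # hash of inputs, not raw PII
        "decision": decision,
        "human_override": human_override,  # records human intervention
        "prev_hash": prev_hash,
    }
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode("utf-8")).hexdigest()
    return entry
```

Entries would be appended to write-once storage and retained per the retention policy; only hashes of inputs are logged, keeping the audit trail itself GDPR-friendly.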

Decommissioning

An AI system has a finite lifespan. Define in advance when shutdown is justified.

Decommissioning triggers:

  • Technical — Trigger: Drift exceeds the threshold and retraining does not improve performance. → Action: System offline, root cause analysis.
  • Economic — Trigger: Cost per Productive Outcome rises > 50% above baseline after 2 quarters. → Action: CAIO review: stop or re-architect.
  • Ethical/Legal — Trigger: Critical fairness audit finding, or new legislation renders the system non-compliant. → Action: Immediate stop, Guardian review mandatory.
  • Strategic — Trigger: Use case disappears due to organisational change, or a better alternative becomes available. → Action: Controlled wind-down per handover plan.
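The trigger table above maps directly onto a periodic check. This sketch mirrors the table's categories and thresholds (cost > 50% over baseline for 2 quarters); the parameter names are illustrative:

```python
def decommission_check(drift_persists_after_retrain, cost_increase_pct,
                       quarters_over_baseline, critical_fairness_finding,
                       use_case_active):
    """Evaluate the four decommissioning trigger categories and return
    (category, action) pairs for every trigger that currently fires."""
    triggers = []
    if drift_persists_after_retrain:
        triggers.append(("Technical", "system offline, root cause analysis"))
    if cost_increase_pct > 50 and quarters_over_baseline >= 2:
        triggers.append(("Economic", "CAIO review: stop or re-architect"))
    if critical_fairness_finding:
        triggers.append(("Ethical/Legal", "immediate stop, Guardian review"))
    if not use_case_active:
        triggers.append(("Strategic", "controlled wind-down per handover plan"))
    return triggers
```

Running this at each quarterly review makes the shutdown criteria explicit rather than leaving them to ad-hoc judgment.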

Decommissioning process:

  1. Announcement: Inform users and stakeholders in advance (minimum 4 weeks).
  2. Archiving: Retain the technical dossier, validation reports and Kaizen Log per retention policy.
  3. Knowledge transfer: Document lessons learned in the Lessons Learned register.
  4. Data deletion: Delete or anonymise production data in accordance with GDPR [so-49].
  5. Infrastructure: Shut down compute, API keys and monitoring pipelines.
  6. Guardian sign-off: Guardian confirms all Hard Boundaries obligations have been fulfilled.
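Step 4 (data deletion or anonymisation) is the most error-prone step to hand-roll. A minimal sketch of the anonymisation route, with illustrative field names; note that salted hashing is strictly pseudonymisation, so full GDPR erasure may still require deleting records and the salt:

```python
import hashlib

def anonymise_record(record, pii_fields=("user_id", "email", "name")):
    """Replace PII fields with a salted one-way hash so aggregate analysis
    stays possible while individuals are no longer directly identifiable.
    Field names here are examples, not a schema from this document."""
    SALT = b"store-separately-and-destroy-on-erasure"  # placeholder salt
    out = dict(record)
    for field in pii_fields:
        if field in out and out[field] is not None:
            digest = hashlib.sha256(SALT + str(out[field]).encode("utf-8"))
            out[field] = digest.hexdigest()[:16]
    return out
```

The same input always maps to the same token, so joins across archived datasets keep working after the system itself is gone.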

2. Team & Roles

Role and responsibility in Monitoring & Optimisation:

  • MLOps Engineer — Responsible: Owner of monitoring pipelines, infrastructure and stability.
  • AI Product Manager — Accountable: Guards business KPIs, manages the backlog and user feedback.
  • Chief AI Officer (CAIO) — Consulted: Evaluates long-term ROI and strategic impact.
  • Data Scientist — Responsible: Analyses performance degradation, performs retraining and improves models.
  • Guardian (Ethicist) — Consulted: Performs ethical reviews and post-market surveillance.

Further reading: Phase 5 Overview · Deliverables


Next step: Set drift thresholds and schedule the first quarterly review (Gate 4). → Use the Gate 4 Checklist as your starting point. → See also: Continuous Improvement | Evidence Standards