Why Azure Cost Optimisation Programmes Fail
The failure pattern is consistent across industries and organisation sizes. A cost review reveals substantial Azure waste. An optimisation sprint is initiated — resources are shut down, VMs are rightsized, a handful of Reserved Instances are purchased. Costs drop meaningfully for two to three months. Then the environment continues to grow, teams continue provisioning without governance, and twelve months later the waste figure has returned to or exceeded its pre-optimisation baseline.
The root cause of this decay is structural. A one-time cost review addresses the current inventory of waste. It does not change the provisioning culture, governance mechanisms, or accountability structures that produced the waste in the first place. Without those structural changes, the waste regenerates as fast as the environment grows.
A successful optimisation programme has two components: a remediation phase that addresses current waste, and an operating model that prevents its recurrence. This playbook covers both, sequenced in the order that delivers maximum impact with minimum organisational friction.
Phase One: Building the Foundation — Visibility and Governance
No optimisation action is reliable without accurate cost visibility and enforced governance. The first phase establishes both before any remediation begins.
Establishing Cost Visibility
Azure Cost Management is available at no additional charge and is the authoritative source for Azure spend data. Before taking any optimisation actions, configure Cost Management to provide the views that will drive decision-making. Enable resource-level cost analysis with subscription, resource group, and resource type breakdowns. Set up budgets at subscription and resource group scope, calibrated to current spend levels, with alerts that fire when spend runs ten and twenty percent over that baseline. Activate anomaly detection notifications to surface unusual cost spikes before they compound.
The amortised cost view in Cost Management is more useful for ongoing management than the actual cost view. Amortised cost spreads RI and Savings Plan charges across the coverage period, giving a stable per-day spend figure rather than front-loaded or inconsistent actuals. Establishing amortised cost as the primary reporting metric prevents the confusion that arises when teams compare billing cycles against RI purchase months.
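The difference between the two views can be illustrated with a small sketch. It assumes a hypothetical one-year reservation paid entirely upfront; the function name and the figures are illustrative and are not taken from Cost Management itself.

```python
def actual_vs_amortised(upfront_cost, term_days, day):
    """Cost attributed to `day` (0-based) under each view for an all-upfront purchase."""
    # Actual cost records the whole charge on the purchase day.
    actual = upfront_cost if day == 0 else 0.0
    # Amortised cost spreads the same charge evenly across the term.
    amortised = upfront_cost / term_days
    return actual, amortised

# Hypothetical reservation: 36,500 paid upfront for a 365-day term.
for day in (0, 1, 364):
    print(day, actual_vs_amortised(36_500, 365, day))
```

Under the actual view the purchase month looks like a spike followed by artificially cheap months; under the amortised view every day carries the same share of the commitment.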
Designing and Enforcing a Tag Taxonomy
Cost allocation without tagging is approximation. Tagging allows you to allocate Azure spend to specific teams, projects, cost centres, and environments — the prerequisite for team-level cost accountability and showback or chargeback reporting. Azure Policy is the enforcement mechanism that converts tagging from a best-effort convention into a mandatory governance control.
Design a tag taxonomy of five to eight mandatory tags before enforcement begins. The standard mandatory set covers: cost centre or business unit, environment (Production, Staging, Development, Test, Sandbox), application or project identifier, resource owner (individual or team), and automated shutdown eligibility (for non-production resources). Keep the initial mandatory set small — teams will game or skip large mandatory tag lists, defeating the governance purpose.
Azure Policy's built-in "Require a tag on resources" definition enforces mandatory tags at resource creation time, denying provisioning requests that omit required tags. Tag inheritance in Cost Management applies subscription and resource group tags to the cost records of child resources, without changing the tags on the resources themselves, which reduces the per-resource tagging burden for teams that provision inside properly tagged containers. Where the tags need to exist on the resources, Azure Policy's "Inherit a tag from the resource group" definition automates the propagation at scale.
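Before switching enforcement on, it can help to audit existing resources against the taxonomy. The following is a minimal sketch of such a check, not an Azure Policy definition; the tag names and the `tag_violations` helper are hypothetical choices made for illustration.

```python
# Hypothetical tag names for the five mandatory categories described above.
MANDATORY_TAGS = ("CostCentre", "Environment", "Application", "Owner", "AutoShutdown")
ALLOWED_ENVIRONMENTS = {"Production", "Staging", "Development", "Test", "Sandbox"}

def tag_violations(tags):
    """Return a list of human-readable problems with one resource's tags."""
    # A tag counts as missing when it is absent or has an empty value.
    problems = [f"missing tag: {name}" for name in MANDATORY_TAGS if not tags.get(name)]
    env = tags.get("Environment")
    if env and env not in ALLOWED_ENVIRONMENTS:
        problems.append(f"unknown environment: {env}")
    return problems

print(tag_violations({"Environment": "Prod", "Owner": "payments-team"}))
```

Running a check like this over an inventory export shows how many existing resources would fail the policy, which is useful input when deciding how quickly to move from audit to deny.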
Establishing Accountability Structures
Technical governance without accountability structures does not last. Teams that know their Azure spend is visible to their management, and that they are accountable for it, behave differently from teams whose consumption is invisible. The accountability structure requires three elements: monthly cross-functional spend reviews where engineering leads, IT finance, and cloud operations review the Cost Management data together; team-level budget ownership, where each application team has a defined Azure budget and receives alerts when it approaches or breaches that budget; and escalation paths for significant budget overruns that do not wait for the monthly review cycle.
In one engagement, a multinational technology services company was carrying £28 million in annual Azure spend across fragmented subscriptions with no cost visibility or governance framework. After implementing the three-element accountability structure outlined above — combined with the visibility and tagging controls from Phase One — Azure waste declined from 34% to 8% within six months. Redress designed and implemented the entire FinOps operating model, and the engagement fee was less than 2% of the first-year cost recovery achieved.
Phase Two: Waste Remediation — The Immediate Return
With visibility and governance in place, the remediation phase addresses the current inventory of waste. The categories are predictable and the tooling is largely native.
Idle and Underutilised Virtual Machines
Azure Advisor identifies VMs where CPU utilisation has been consistently at or below five percent over a seven-day rolling window and network traffic below seven megabytes per day. These are shutdown or rightsize candidates. Before actioning either, validate with the owning team that the low-utilisation pattern reflects actual workload behaviour rather than an anomalous measurement window or a legitimate standby requirement.
Shutdown candidates, meaning VMs that have no traffic and no operational justification for continued running, are the highest-ROI category. Stopping and deallocating a VM eliminates compute billing entirely, although storage charges for attached disks continue. A VM that is only shut down from inside the guest operating system stays allocated and keeps accruing compute charges. For environments that need periodic access rather than continuous availability, scheduled start-stop automation through Azure Automation runbooks or Azure Functions eliminates compute costs during idle periods without requiring permanent decommissioning.
Rightsize candidates — VMs with low average utilisation but genuine operational requirements — should be resized to a smaller VM within the same family before shutdown decisions are made. A Standard_D8s_v3 running at fifteen percent average CPU may be appropriately sized if the application has burst peaks that the average conceals, or significantly overprovisioned if utilisation is consistently flat. Operational context from the application team is the tiebreaker that Azure Advisor's utilisation data cannot provide.
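The triage logic described above can be sketched as a simple classifier. The idle thresholds follow the figures quoted earlier; the `low_avg` and `busy_peak` cut-offs, the field names, and the category labels are assumptions for illustration, and the output is a prompt for a conversation with the owning team rather than a decision.

```python
from dataclasses import dataclass

@dataclass
class VmStats:
    name: str
    avg_cpu_pct: float         # average CPU over the review window
    peak_cpu_pct: float        # highest observed CPU over the same window
    network_mb_per_day: float  # average daily network traffic

def triage(vm, idle_cpu=5.0, idle_network_mb=7.0, low_avg=20.0, busy_peak=70.0):
    """Sort a VM into a review bucket; thresholds are illustrative defaults."""
    if vm.avg_cpu_pct <= idle_cpu and vm.network_mb_per_day < idle_network_mb:
        return "shutdown-candidate"
    if vm.avg_cpu_pct < low_avg:
        # A low average can hide burst peaks, so high peaks go to manual review.
        return "review-burst-peaks" if vm.peak_cpu_pct >= busy_peak else "rightsize-candidate"
    return "keep"

print(triage(VmStats("app-vm-01", avg_cpu_pct=15, peak_cpu_pct=95, network_mb_per_day=500)))
```

Separating flat low utilisation from low-average, high-peak utilisation mirrors the Standard_D8s_v3 example: the same fifteen percent average leads to different outcomes depending on the peaks.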
Non-Production Environment Cost Recovery
Development, test, staging, and sandbox environments typically consume twenty to forty percent of enterprise Azure compute budgets while serving teams for nine to twelve hours per working day, five days per week. That is 45 to 60 hours of use in a 168-hour week, so full-time billing represents roughly sixty-four to seventy-three percent waste by running hours alone.
Automated shutdown through Azure Automation or Azure Policy provides the enforcement mechanism. A policy that enforces auto-shutdown at 19:00 local time for all VMs tagged as Development or Test environments, paired with an automated start at 07:00 on the next working day (for example, through an Azure Automation runbook), recovers the overnight and weekend compute cost without requiring manual action. Developer resistance to shutdown policies, often framed as a workflow objection, is manageable through opt-out processes for workloads that have documented overnight operation requirements, combined with visibility that shows team leads the cost their opt-outs generate.
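The running-hours saving from a fixed schedule is simple arithmetic, sketched below. It assumes the VM runs only between the scheduled start and stop on working days and is deallocated the rest of the week; the function name is illustrative.

```python
HOURS_PER_WEEK = 24 * 7

def idle_fraction(hours_per_day, days_per_week=5):
    """Share of the week a VM is switched off under a fixed start-stop schedule."""
    running = hours_per_day * days_per_week
    return 1 - running / HOURS_PER_WEEK

# 07:00 to 19:00 on five working days is 12 running hours per day.
print(f"{idle_fraction(12):.1%}")  # 64.3%
```

The figure covers compute hours only; disk storage for deallocated VMs continues to be billed.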
Orphaned Resource Cleanup
Orphaned resources — unattached managed disks, unused public IP addresses, stale snapshots, empty load balancers, idle VPN gateways — accumulate after projects complete and environments are decommissioned. They continue incurring charges indefinitely because no individual team member feels ownership responsibility for them after the original project concludes.
A quarterly orphaned resource sweep using Cost Management resource group analysis and Azure Resource Graph queries identifies resources that have had no associated traffic or activity for thirty or more days. Pairing this sweep with a decommissioning process that explicitly lists resource cleanup as a project closure task prevents the backlog from rebuilding after each cleanup sprint.
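As one possible starting point for the sweep, the sketch below pairs a Resource Graph query for unattached managed disks with a small age filter. The query is held as a string so the example runs without Azure credentials. The `last_activity` field is hypothetical: Resource Graph does not return it directly, and it would need to be derived from activity logs or metrics.

```python
from datetime import datetime, timedelta, timezone

# KQL intended for Azure Resource Graph; verify it against your own estate first.
UNATTACHED_DISKS_QUERY = """
Resources
| where type =~ 'microsoft.compute/disks'
| extend diskState = tostring(properties.diskState)
| where diskState == 'Unattached'
| project id, name, resourceGroup, subscriptionId, tags
"""

def stale(rows, now, min_age_days=30):
    """Names of rows whose last recorded activity is at least `min_age_days` old."""
    cutoff = now - timedelta(days=min_age_days)
    return [row["name"] for row in rows if row["last_activity"] <= cutoff]

now = datetime(2024, 6, 30, tzinfo=timezone.utc)
rows = [
    {"name": "disk-old", "last_activity": datetime(2024, 4, 1, tzinfo=timezone.utc)},
    {"name": "disk-new", "last_activity": datetime(2024, 6, 20, tzinfo=timezone.utc)},
]
print(stale(rows, now))  # ['disk-old']
```

Similar queries can be written for the other orphan categories, such as public IP addresses with no associated configuration, and fed through the same age filter.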
Phase Three: Commitment Optimisation — Structural Cost Reduction
With waste eliminated and governance in place, the commitment optimisation phase converts remaining pay-as-you-go spend to discounted pricing through the combination of Azure Hybrid Benefit, Reserved Instances, and Azure Savings Plans.
Step 1: Map Azure Hybrid Benefit Eligibility
Before purchasing any commitment instruments, identify the subset of your Azure compute estate that can be covered by Azure Hybrid Benefit (AHB) from existing on-premises Software Assurance (SA) licences. This action carries no additional cost and has immediate billing impact: applying AHB to an eligible Windows Server VM removes the Windows licensing component of that VM's cost, leaving only the base compute rate. Microsoft's headline figures, savings of up to eighty percent for Windows Server VMs and up to eighty-five percent for SQL Server workloads with SA coverage, generally assume AHB combined with reservation pricing. AHB on its own delivers a smaller share of that, which varies by VM size and SQL Server edition.
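Conduct a cross-functional audit: procurement or IT Asset Management identifies the SA licence inventory and coverage levels; cloud operations maps the Azure VM and SQL estate against eligibility criteria. Assign a joint owner for applying AHB, and prefer scope-level assignment over per-resource manual management wherever it is available. At the time of writing, centrally managed AHB in Cost Management covers SQL Server licences at subscription or billing account scope, while Windows Server AHB is enabled on each VM and is better handled through deployment templates or Azure Policy.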
Step 2: Identify RI Candidates
After applying AHB, assess the remaining compute estate for Reserved Instance eligibility. The RI candidate criteria are: production-grade workload, consistent utilisation above fifty percent over a rolling thirty-day window, stable VM configuration with no planned architecture change within twelve to thirty-six months, and no containerisation or serverless migration planned within the commitment term.
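The four criteria translate directly into a screening filter. The sketch below is one way to express them; the `Workload` fields are hypothetical names for data that would come from utilisation reports and from the application teams.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    production: bool
    utilisation_30d_pct: float  # consistent utilisation over the rolling 30 days
    stable_months: int          # months with no planned architecture change
    migrating_to_containers_or_serverless: bool

def is_ri_candidate(w, term_months=12):
    """Apply the candidate criteria for a reservation of `term_months`."""
    return (
        w.production
        and w.utilisation_30d_pct > 50
        and w.stable_months >= term_months
        and not w.migrating_to_containers_or_serverless
    )

erp = Workload("erp-db", True, 72.0, 36, False)
print(is_ri_candidate(erp, term_months=36))  # True
```

Passing `term_months=36` tightens the stability requirement for three-year purchases, so a workload can qualify for a one-year term without qualifying for the longer one.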
Azure Advisor's RI recommendations provide a useful starting list but should be reviewed against operational knowledge before purchase. Advisor's recommendations are based on historical utilisation and may not reflect planned workload changes. Target an RI coverage rate of fifty to sixty-five percent of the stable production compute estate. Purchase one-year RIs initially if you have limited experience with RI management — the lower discount is a worthwhile trade-off for the flexibility to reassess at twelve months as your estate understanding matures.
Step 3: Apply Savings Plans for the Evolving Estate
The remaining compute estate — workloads in active transformation, containerised environments, serverless and PaaS services, and workloads with variable scaling patterns — is the Savings Plans domain. Commit an hourly spend amount that covers the expected consistent utilisation baseline for this workload category, leaving burst and peak consumption on pay-as-you-go. A conservative hourly commitment that covers seventy to eighty percent of typical compute consumption in this category is preferable to an aggressive commitment that may create over-commitment exposure if workloads scale down during the term.
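A conservative commitment can be derived from observed hourly spend. The sketch below uses the median hour as the "typical" level, which is an assumption, and applies a coverage factor in the seventy to eighty percent range; the sample figures are invented.

```python
from statistics import median

def hourly_commitment(hourly_spend, coverage=0.75):
    """Commit to a share of the median hourly spend; the rest stays pay-as-you-go."""
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    return round(median(hourly_spend) * coverage, 2)

# Hypothetical hourly eligible compute spend, including one burst hour.
sample = [8.0, 10.0, 10.0, 12.0, 30.0]
print(hourly_commitment(sample))  # 7.5
```

Using the median rather than the mean keeps burst hours from inflating the commitment, which matches the intent of leaving peak consumption on pay-as-you-go.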
Phase Four: Commercial Optimisation — EA and MACC Negotiation
The previous phases optimise consumption. This phase optimises the commercial terms under which that consumption is priced. With a clean, well-governed estate and accurate consumption forecasts, your position in EA and MACC negotiations is materially stronger than it would be with an unmanaged, over-consumed environment.
Preparing for EA Negotiation
Effective EA negotiation requires three inputs. The first is an accurate consumption forecast built from the cleaned, governed estate, not inflated historical actuals that include waste you have now eliminated. The second is competitive analysis documenting AWS and Google Cloud as alternatives for the workload categories in your estate; this analysis does not need to be deep, but it does need to be credible and documentable. The third is benchmarked pricing data: what comparable organisations pay for equivalent Azure services under EA, sourced from peer networks, analyst reports, or independent advisors.
Microsoft's enterprise sales teams respond to preparation. Organisations that arrive at EA negotiations with documented alternatives, a clear forecast, and competitive pricing data consistently receive better outcomes than those relying on the relationship or the renewal deadline as their primary leverage. The Microsoft team's job is to maximise the contract value. The CIO's job is to structure a commercially rational agreement that supports the organisation's growth trajectory without creating commitment risk or paying above-market rates.
Structuring the MACC Commitment
When structuring a MACC commitment, the cardinal rule is conservative sizing. Forecast your baseline Azure consumption for the commitment period — the consumption level you are highly confident of reaching — and commit to that figure. If you believe consumption will grow by thirty percent, do not commit to the thirty percent growth level. Commit to the baseline and renegotiate if growth materialises.
The asymmetric risk structure of MACC commitments makes conservative sizing rational even at the cost of a lower initial discount level. A shortfall payment on an overcommitted MACC is charged without Azure Consumption Discounts, making the effective unit cost of shortfall consumption higher than the effective cost of discounted consumption. The financial penalty for over-committing exceeds the discount premium forgone by under-committing in most scenarios where the consumption trajectory is genuinely uncertain.
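The asymmetry can be made concrete with a simplified model. It assumes the amount paid over the term is the discounted consumption, topped up to the commitment when consumption falls short. The discount levels and volumes are hypothetical, and real MACC terms vary by contract.

```python
def macc_total_cost(list_consumption, commitment, discount):
    """Amount paid over the term under a simplified commitment model."""
    discounted = list_consumption * (1 - discount)
    # Any gap between the commitment and eligible spend is paid as a shortfall.
    shortfall = max(0.0, commitment - discounted)
    return discounted + shortfall

# Hypothetical offers, in millions: conservative (commit 9, 8% discount)
# versus aggressive (commit 12, 10% discount).
for label, consumption in (("flat", 10.0), ("growth", 13.0)):
    conservative = macc_total_cost(consumption, 9.0, 0.08)
    aggressive = macc_total_cost(consumption, 12.0, 0.10)
    print(label, round(conservative, 2), round(aggressive, 2))
```

With these illustrative numbers, the aggressive commitment costs far more when growth does not arrive than it saves when growth does, which is the pattern the conservative sizing rule guards against.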
Negotiate non-pricing MACC benefits alongside the discount level. Technical account management, architecture review services, migration funding, training and certification credits, and early access to preview features are all negotiable within a MACC structure and frequently more valuable over the commitment term than an additional half-percent improvement in the headline discount.
The Ongoing Optimisation Operating Model
Phase Four completes the initial programme. The ongoing operating model maintains the gains and identifies new optimisation opportunities as the environment evolves. The minimum viable FinOps operating model for a mid-to-large enterprise Azure estate involves four recurring activities.
The first is monthly spend reviews using Cost Management data, attended by cloud operations, IT finance, and application team leads. The second is quarterly RI and Savings Plan utilisation reviews, using Cost Management's reservation utilisation reports, to identify underutilised commitments that require exchange or rebalancing decisions. The third is semi-annual tagging and governance audits using Azure Policy compliance reports, identifying resource groups or subscriptions with coverage gaps. The fourth is annual EA and MACC health checks: a structured review of current commercial terms against market benchmarks, consumption trajectories, and upcoming renewal dates.
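The quarterly utilisation review reduces to a simple screen over reservation usage data. The sketch below flags commitments whose utilisation falls under a review threshold; the ninety percent default and the tuple layout are assumptions, not values taken from Cost Management.

```python
def underutilised(reservations, threshold_pct=90.0):
    """Return (name, utilisation %) for commitments below the review threshold."""
    flagged = []
    for name, used_hours, reserved_hours in reservations:
        pct = 100 * used_hours / reserved_hours
        if pct < threshold_pct:
            flagged.append((name, round(pct, 1)))
    return flagged

# Hypothetical quarter of 2,160 reserved hours per reservation.
quarter = [("ri-sql-prod", 2000, 2160), ("ri-web-prod", 1512, 2160)]
print(underutilised(quarter))  # [('ri-web-prod', 70.0)]
```

Flagged reservations are the ones to take into the exchange or rebalancing discussion; the rest need no action that quarter.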
The resource investment is modest. A part-time FinOps function — three to five hours per week of structured attention from an appropriately skilled individual — is sufficient to run this operating model for estates up to £20 million per year. For larger estates, dedicated FinOps capability proportional to the spend level is justified by the recovery opportunity it continuously surfaces.
Playbook Checklist: Twenty-Six Actions Across Four Phases
The following checklist provides an implementation reference for enterprise IT teams executing the optimisation programme described in this playbook.
Foundation phase: Configure Cost Management with resource-level analysis and budget alerts. Set anomaly detection thresholds. Enable amortised cost as the primary reporting view. Design tag taxonomy with five to eight mandatory tags. Deploy Azure Policy for mandatory tag enforcement. Configure tag inheritance at resource group scope. Establish monthly cost review cadence with cross-functional attendance. Assign team-level budget ownership for each application team.
Remediation phase: Run Azure Advisor cost recommendations and validate against operational context. Identify shutdown and rightsize candidates among idle VMs. Deploy automated shutdown policies for all non-production environments via Azure Policy and Automation. Conduct orphaned resource sweep using Resource Graph and Cost Management. Implement quarterly orphaned resource cleanup process. Add resource decommissioning checklist to project closure templates.
Commitment optimisation: Conduct cross-functional SA licence audit to map AHB eligibility. Apply AHB to eligible Windows Server and SQL Server workloads, using scope-level assignment where it is available. Identify RI candidates using Advisor recommendations validated against operational data. Purchase RIs for stable production compute at fifty to sixty-five percent coverage rate. Determine Savings Plan hourly commitment for evolving workload estate. Apply Savings Plan covering seventy to eighty percent of typical consumption in target workload category.
Commercial optimisation: Build accurate consumption forecast from governed estate. Prepare competitive AWS and GCP analysis for EA negotiation. Source benchmarked EA pricing data. Determine MACC commitment size using conservative baseline methodology. Prepare non-pricing benefit requirements for MACC negotiation. Align EA negotiation timing with Microsoft fiscal quarter-end windows.