This assessment covers four domains: AI Spend Visibility and Baseline, Unit Economics and Benchmarking, Licence and Contract Optimisation, and ROI Measurement and Governance. Use it to establish where AI investment is generating demonstrable value and where spend discipline requires strengthening.
Enterprise AI spend is notoriously fragmented. Microsoft 365 Copilot appears in the EA. OpenAI API credits appear in developer cost centres. Azure AI services appear in cloud infrastructure bills. AWS Bedrock consumption appears in AWS accounts managed by individual teams. Vendor-embedded AI — Salesforce Einstein, ServiceNow Now Assist, SAP Joule — is often invisible, buried in broader software subscription fees. Before AI spend can be benchmarked or optimised, every cost line must be consolidated into a single inventory. Organisations consistently underestimate their total AI spend by 30 to 50 percent before this exercise.
Not all AI spend delivers equivalent value. Microsoft 365 Copilot licences that are actively used for document summarisation and meeting transcription deliver measurable productivity value. The same licences sitting unused in inactive user accounts deliver zero value while consuming the same per-seat cost. AI API credits consumed by production workloads processing customer transactions are categorically different from development team exploration credits. Without a value delivery categorisation, optimisation efforts are directionless. Tier AI spend into: proven ROI (justify and protect), probable ROI (measure and scale or cut), and experimental (apply strict time and cost limits).
Industry benchmarks consistently show that infrastructure and integration tooling adds 20 to 40 percent to direct AI API or model licensing spend in mature deployments, and engineering time to maintain integrations can push the total considerably higher. A $1M annual OpenAI API spend may sit on $300,000 of supporting vector database infrastructure, $200,000 of API gateway and orchestration tooling, and $400,000 of engineering time maintaining integrations — a per-capability total of $1.9M, nearly double the direct spend. That per-capability total cost — including all these layers — is the correct unit for benchmarking and ROI assessment. Organisations that optimise direct API costs without addressing infrastructure and integration overhead miss a significant proportion of total AI cost.
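The per-capability total described above is simply the direct model spend plus every supporting layer. A minimal Python sketch of the aggregation, using the illustrative figures from the example (the cost category names are our own labels):

```python
def per_capability_total_cost(direct_api_spend: float,
                              overhead_costs: dict[str, float]) -> float:
    """Direct model spend plus every supporting cost layer for one AI capability."""
    return direct_api_spend + sum(overhead_costs.values())

# Illustrative figures from the text's example.
total = per_capability_total_cost(
    1_000_000,
    {"vector_db": 300_000,
     "gateway_orchestration": 200_000,
     "integration_engineering": 400_000},
)
```

Benchmarking against `total` rather than the $1M direct spend changes the ROI conversation materially.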
Microsoft 365 Copilot is priced at $30 per user per month. ChatGPT Enterprise is priced by negotiation. Gemini for Workspace Advanced is approximately $20 to $30 per user per month. These are the headline rates — but effective cost per active user varies enormously based on adoption rates. An organisation with 10,000 Copilot licences and 30% active adoption has an effective cost of $100 per active user per month, not $30. Benchmarking effective cost per active user — rather than licensed cost per assigned seat — reveals whether the investment is generating productivity at market rates or subsidising non-adoption.
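The effective-cost arithmetic above generalises to any per-seat AI product. A minimal sketch, using the Copilot figures from the text:

```python
def effective_cost_per_active_user(list_price_per_seat: float,
                                   seats_assigned: int,
                                   active_users: int) -> float:
    """Monthly licence spend divided by the users who actually use the product."""
    if active_users == 0:
        return float("inf")  # every dollar is subsidising non-adoption
    return list_price_per_seat * seats_assigned / active_users

# The text's example: 10,000 seats at $30/month with 30% active adoption.
cost = effective_cost_per_active_user(30.0, 10_000, 3_000)
```

At 100% adoption the effective cost converges on the list price; the gap between the two is the size of the non-adoption subsidy.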
LLM API pricing fell approximately 80% between early 2025 and early 2026 at the major providers. Organisations locked into API pricing structures from 2023 or 2024 may be significantly overpaying relative to current market rates — particularly if they are on committed spend tiers that do not automatically benefit from model price reductions. Tracking token cost per business output creates visibility into cost trends and triggers model or provider switching decisions when cost efficiency degrades. The benchmark figure varies widely by use case: summarisation is inherently cheaper per output than reasoning-intensive agentic tasks.
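Tracking token cost per business output can be as simple as dividing blended token spend by outputs produced. A sketch with hypothetical monthly volumes and illustrative per-million-token prices (none of these figures come from the text):

```python
def cost_per_output(tokens_in: int, tokens_out: int,
                    price_in_per_m: float, price_out_per_m: float,
                    outputs_produced: int) -> float:
    """Blended token cost divided by business outputs (e.g. documents summarised)."""
    token_cost = (tokens_in / 1e6 * price_in_per_m
                  + tokens_out / 1e6 * price_out_per_m)
    return token_cost / outputs_produced

# Hypothetical month: 200M input tokens and 40M output tokens at
# $2.50 / $10.00 per million, producing 50,000 summaries.
unit_cost = cost_per_output(200_000_000, 40_000_000, 2.50, 10.00, 50_000)
```

Plotting `unit_cost` monthly per use case is what surfaces the degradation that should trigger a model or provider switching review.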
Enterprise AI API spending rose from $0.5B globally in 2023 to $8.4B by mid-2025, and continues to grow rapidly. Individual enterprise AI spend is growing at similar rates. The critical question is not whether AI spend is growing — it inevitably will — but whether value delivery is growing at least as fast. Organisations where AI spend is doubling annually but measurable productivity outcomes are growing at 20% have a spend efficiency problem, not an AI adoption problem. Establish a quarterly AI spend-to-value ratio metric and make it visible to CIO and CFO audiences.
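One way to express the spend-to-value comparison above is a ratio of growth rates. A minimal sketch using the text's example of spend doubling while measured value grows 20% (the metric name and formula are our illustration, not an industry standard):

```python
def spend_to_value_trend(spend_growth: float, value_growth: float) -> float:
    """Ratio of spend growth to value growth; > 1 means spend is outpacing value."""
    return (1 + spend_growth) / (1 + value_growth)

# The text's example: spend growing 100% while measurable value grows 20%.
ratio = spend_to_value_trend(1.00, 0.20)
```

A quarterly trend of this ratio, rather than a single reading, is what belongs in the CIO/CFO pack.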
Microsoft 365 Copilot has a well-documented adoption challenge. Industry data consistently shows that 40 to 60 percent of assigned Copilot seats are inactive at any point in time — a shelfware rate that is structurally higher than most enterprise software because AI adoption requires behaviour change, not just access. At $30 per user per month, 4,000 inactive Copilot licences represent $1.44M in annual waste. Establish a minimum adoption threshold — typically 60% active weekly usage — as a gate before licence scale-up. Negotiate licence count flexibility into Microsoft EA Copilot addendum terms to allow right-sizing based on actual adoption.
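The waste figure and adoption gate above reduce to a few lines of arithmetic. A sketch using the text's numbers, with the 60% active-weekly-usage gate suggested above as the default threshold:

```python
def copilot_waste_and_gate(seats: int, active_weekly: int,
                           price_per_seat: float = 30.0,
                           adoption_gate: float = 0.60) -> tuple[float, bool]:
    """Annual spend on inactive seats, plus whether adoption clears the scale-up gate."""
    inactive = seats - active_weekly
    annual_waste = inactive * price_per_seat * 12
    passes_gate = active_weekly / seats >= adoption_gate
    return annual_waste, passes_gate

# The text's example: 4,000 of 10,000 assigned seats inactive.
waste, ok_to_scale = copilot_waste_and_gate(10_000, 6_000)
```

Running this per business unit, rather than enterprise-wide, shows where targeted adoption work beats licence reclamation.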
The commercial terms of AI vendor contracts carry material legal and financial risk that is often underweighted relative to pricing in procurement negotiations. Key terms to scrutinise: whether your data is used to train the vendor's models (with opt-out rights); data residency and sovereignty requirements for regulated industries; who owns the intellectual property in AI-generated outputs; and whether the vendor provides indemnification if AI-generated content infringes third-party copyright. Enterprise versions of AI platforms (ChatGPT Enterprise, Copilot for Microsoft 365, Claude for Enterprise) offer materially stronger terms in these areas than consumer or developer tier agreements. Validate that the contract tier you are on matches your risk exposure.
AI platform vendors have become skilled at securing large committed spend tiers based on adoption projections that often do not materialise on the expected timeline. A $2M committed Azure OpenAI spend tier negotiated in Q4 2024 based on projected 2025 usage may be significantly over-committed if AI use case rollout faces the adoption delays that are standard in enterprise AI programmes. Build committed AI spend in tranches aligned with demonstrated adoption milestones. Reserve the right to reduce or reallocate commitments between AI platforms within a vendor family (e.g., between different Azure AI services) as use case mix evolves.
Shadow AI is the fastest-growing category of enterprise software spend. Individual teams subscribe to Claude, Perplexity, Midjourney, and dozens of task-specific AI tools on corporate credit cards, bypassing procurement and creating data security, contractual, and cost visibility gaps. A shadow AI spend audit — querying expense management systems for AI vendor names, surveying department heads, and reviewing IT service desk tickets — typically reveals 15 to 35 percent of enterprise AI spend that is untracked by central procurement. Shadow AI is not inherently bad; some shadow AI tools deliver genuine value. The issue is the absence of visibility, governance, and volume negotiation leverage.
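The expense-system query described above can be approximated with a keyword scan over card transactions. A sketch assuming a hypothetical expense-record shape with `merchant` and `amount` fields; adapt the vendor watchlist and record format to your own expense management export:

```python
# Hypothetical vendor watchlist; extend with the AI tools relevant to your estate.
AI_VENDOR_KEYWORDS = ("openai", "anthropic", "claude", "perplexity", "midjourney")

def shadow_ai_spend(expense_records: list[dict]) -> float:
    """Sum card spend whose merchant name matches a known AI vendor keyword."""
    return sum(
        r["amount"] for r in expense_records
        if any(k in r["merchant"].lower() for k in AI_VENDOR_KEYWORDS)
    )

# Illustrative records in the assumed format.
records = [
    {"merchant": "OpenAI, LLC", "amount": 2_400.0},
    {"merchant": "Midjourney Inc", "amount": 960.0},
    {"merchant": "Office Catering Co", "amount": 310.0},
]
untracked = shadow_ai_spend(records)
```

A keyword scan is a first pass only; the survey and service-desk review described above catch the vendors the watchlist misses.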
The most common AI governance failure is deploying AI without first defining what success looks like. Without pre-defined metrics, AI spend is impossible to justify to finance, impossible to optimise, and impossible to cut when it underperforms. Best practice is to treat AI deployments as you would any capital project: define the expected return, measure it, and make continuation and expansion decisions based on actual performance against the forecast. Organisations that apply this discipline consistently find that 20 to 30 percent of AI use cases are significantly underperforming their business case and should be redesigned or terminated.
AI spend governance requires both financial discipline and technical context. Finance teams that review AI costs without technology input cannot distinguish between high-value production AI spend and exploratory waste. Technology teams that manage AI spend without finance accountability can defer difficult optimisation decisions indefinitely. A quarterly joint review — with a standard reporting pack covering spend by platform, adoption rates, cost-per-output metrics, and pending renewals — creates the shared accountability required for effective AI spend optimisation.
AI platform vendors — Microsoft, OpenAI, Google, Anthropic, AWS — have annual or multi-year contract cycles with renewal terms that are negotiable but become more difficult to change as the renewal date approaches. Microsoft 365 Copilot renewals are part of the broader EA negotiation. OpenAI Enterprise and Anthropic Claude Enterprise are standalone contracts. AWS Bedrock committed spend is part of AWS EDP. Managing these renewals proactively — with adoption data, benchmark pricing comparisons, and alternative platform options on the table — produces better commercial outcomes than auto-renewing under default terms.
Frontier models (GPT-4o, Claude Opus, Gemini Ultra) are priced at 5 to 20 times the cost of efficient mid-tier models (GPT-4o mini, Claude Haiku, Gemini Flash) for equivalent token volumes. Many enterprise AI use cases — document classification, simple summarisation, FAQ response, structured data extraction — do not require frontier model capability and achieve equivalent business output at 80 to 90 percent lower token cost using a smaller model. A use-case-to-model-tier mapping exercise, applying the principle of minimum sufficient capability for each task, consistently identifies 30 to 50 percent API cost reduction opportunities without any reduction in output quality for the applicable use cases.
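The use-case-to-model-tier mapping can be modelled as a routing table priced per tier. A sketch with illustrative per-million-token prices and a hypothetical workload mix (the prices, use cases, and volumes are our assumptions, not vendor rates):

```python
# Illustrative prices per million tokens; real rates vary by provider and model.
TIER_PRICE_PER_M = {"frontier": 10.00, "mid": 0.60}

# Minimum sufficient capability per use case (hypothetical mapping).
USE_CASE_TIER = {
    "document_classification": "mid",
    "simple_summarisation": "mid",
    "faq_response": "mid",
    "agentic_reasoning": "frontier",  # genuinely needs frontier capability
}

def monthly_cost(volumes_m: dict[str, float], routing: dict[str, str]) -> float:
    """Token volume (millions) per use case, priced at its routed model tier."""
    return sum(v * TIER_PRICE_PER_M[routing[uc]] for uc, v in volumes_m.items())

# Hypothetical monthly volumes in millions of tokens.
volumes = {"document_classification": 50, "simple_summarisation": 80,
           "faq_response": 40, "agentic_reasoning": 30}

all_frontier = monthly_cost(volumes, {uc: "frontier" for uc in volumes})
right_sized = monthly_cost(volumes, USE_CASE_TIER)
saving_pct = 1 - right_sized / all_frontier
```

The saving depends entirely on the workload mix; the point of the exercise is to make that mix explicit before renewing committed spend.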
The AI vendor market is moving exceptionally fast. Pricing for the same model capability has fallen 80% in 18 months. New model tiers, new competitive alternatives, and new negotiating precedents emerge quarterly. An AI spend benchmark that was valid in mid-2024 is significantly outdated by mid-2026 — both in absolute pricing levels and in the competitive alternatives available. An annual independent benchmark, drawing on current market data rather than historical contract terms, identifies where AI vendor pricing has decoupled from the market and creates the evidence base for renegotiation or platform switching decisions.
Interpreting Your Assessment Results
The AI Spend Discipline Imperative
Global enterprise AI spending is projected at $644 billion in 2025 — growing at over 76% annually. Individual enterprise AI budgets are scaling at comparable rates. The organisations that build spend governance infrastructure now — visibility, value measurement, contract discipline — will compound the productivity gains of AI investment. The organisations that do not will face CFO scrutiny when AI spend is large enough to require explanation and the governance infrastructure to explain it does not exist.
The AI market is also unusual among enterprise software categories in that prices are falling, not rising. LLM API prices fell 80% between early 2025 and early 2026. Organisations locked into pricing structures from 2023 or 2024 may be significantly overpaying for capabilities that are now available at a fraction of the original cost. An annual AI spend benchmark and contract review is not administrative overhead — it is a direct path to cost recovery.
Stay Current on Enterprise AI Licensing
Subscribe to our GenAI knowledge hub for quarterly pricing updates, contract term analysis, and enterprise AI governance guidance.