The Agentic Enterprise
AK · Fri, Jul 3, 2026 · 7 min
Happy Fourth of July 🇺🇸
Friday, July 3, 2026
FinOps is now where AI budgets live or die.
For two years, burning tokens felt like progress. This week the invoice arrived.
Budgets blown in a quarter, spend caps everywhere, and Gartner projecting AI coding will out-cost the developer by 2028. The constraint on enterprise AI is no longer capability or even deployment. It is the bill, and the new discipline is spending on purpose, not spending less.
The Lead
For two years the flex was consumption. That era ended this week.
Burn more tokens, run more agents, top the internal leaderboard, call it progress. Enterprises that treated token volume as a proxy for value are now staring at budgets exhausted in a quarter and asking the question they skipped: did any of it move a number?
The FinOps Foundation says the share of finance teams managing AI spend went from 31 percent to 98 percent in two years. Gartner projects that by 2028 the cost of AI coding will surpass the salary of the developer using it. The constraint on enterprise AI is no longer capability, and no longer even deployment. It is the bill, and it just came due.
Tokenmaxxing is over, and the bill just came due.
The defining enterprise-AI behavior of the last two years quietly collapsed this week. Call it tokenmaxxing: the belief that consuming more tokens meant getting more value, rewarded with internal leaderboards and expanding budgets. CNBC reported that OpenAI's and Anthropic's largest customers are now pivoting hard from burning tokens to tightening budgets and demanding measurable returns. The FinOps Foundation says teams managing AI spend jumped from 31 percent to 98 percent in two years, and 73 percent of enterprises now report AI costs that blew past their original projections.
Token volume measures inputs, not results. The same billion tokens can be a hard problem solved or an agent running in circles.
The receipts are piling up with names attached. Uber exhausted its entire 2026 AI budget in four months and capped spend at $1,500 per tool per month. The FinOps Foundation's J.R. Storment says companies began calling in April, already at three times their full-year token budget. Meta took down the internal tokenmaxxing leaderboard its staff had built. Microsoft cancelled Claude Code subscriptions across several product divisions. The vendors are responding in kind: Google Cloud shipped Spend Caps that auto-pause API traffic at a budget ceiling, and Anthropic added spend alerts to Claude Enterprise.
The reframe for buyers is sharp. The problem was never that AI is expensive; it is that consumption-based pricing makes cost invisible until the invoice lands, and no output benchmark has matched the 5-to-20x jump in per-developer consumption that agentic mode produces. Gartner's projection that AI coding will out-cost the developer by 2028 is only shocking if you were not watching the meter.
The Spearhead Take
The lesson is governance, not austerity. Do not cut AI to the bone; cut the blindness. Meter consumption per team and per use case, set caps before you scale, not after, and tie every deployment to a baseline number it is supposed to move. The winners this year will not be the teams that spent the least or the most. They will be the teams that could tell you, in dollars, what their spend bought.
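For platform teams wondering what "meter per team and per use case, cap before you scale" looks like in practice, here is a minimal sketch. Everything in it is an assumption for illustration: the `SpendMeter` class, the flat per-million-token price, the team names, and the cap values are hypothetical, not any vendor's pricing or API.

```python
from collections import defaultdict

# Hypothetical flat price and monthly caps; not real vendor figures.
PRICE_PER_MILLION_TOKENS = 10.00
MONTHLY_CAPS = {"platform": 1500.00, "support": 500.00}


class SpendMeter:
    """Accumulates token spend per (team, use_case) and checks team caps."""

    def __init__(self, price_per_million, caps):
        self.price_per_million = price_per_million
        self.caps = caps
        self.spend = defaultdict(float)  # (team, use_case) -> dollars

    def record(self, team, use_case, tokens):
        """Convert a token count to dollars and add it to the ledger."""
        cost = tokens / 1_000_000 * self.price_per_million
        self.spend[(team, use_case)] += cost
        return cost

    def team_total(self, team):
        """Sum spend across every use case recorded for one team."""
        return sum(v for (t, _), v in self.spend.items() if t == team)

    def over_cap(self, team):
        # A team with no configured cap counts as over cap:
        # the cap has to exist before the team scales, not after.
        cap = self.caps.get(team)
        return cap is None or self.team_total(team) > cap


meter = SpendMeter(PRICE_PER_MILLION_TOKENS, MONTHLY_CAPS)
meter.record("platform", "code-review", 40_000_000)
meter.record("platform", "test-generation", 120_000_000)
meter.record("support", "ticket-triage", 30_000_000)

for team in ("platform", "support", "research"):
    print(team, round(meter.team_total(team), 2), meter.over_cap(team))
```

The design choice worth copying is the default: a team that has no cap on file is flagged, so metering cannot quietly start without a budget attached.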
The Obvious & The Overlooked
The overrun is the headline. The wrong lesson is the risk.
The Obvious
AI costs are running past budget.
Nearly three-quarters of enterprises say AI spend exceeded original projections. Usage.ai
Consumption pricing broke seat-based budgeting.
Token metering bills nothing like a per-seat license, and finance models did not see it coming. Gartner
Everyone is slapping on caps.
Uber, Walmart, Amazon, and Cisco have all imposed per-tool spend limits. Inc.
The Overlooked
FinOps went from niche to universal in 24 months.
Teams managing AI spend jumped from 31% to 98%, the fastest discipline shift the function has seen. State of FinOps
The metric everyone optimized was the wrong one.
Tokenmaxxing rewarded inputs; the same tokens buy a solved problem or an agent looping uselessly. Fortune
Cheap open weights are the release valve.
Agent startup Lindy moved 100% of its traffic from Claude to DeepSeek to cut costs, a preview of where squeezed buyers go. Fortune
Vendors now sell the cure for their own pricing.
Google Cloud Spend Caps and Claude Enterprise spend alerts turn cost control into a product line. Google Cloud
Moving Pieces
Five developments worth a CIO's attention.
Product
Microsoft puts $2.5B behind turning pilots into paid outcomes
Microsoft launched Frontier Company, a $2.5 billion unit with 6,000 engineers who embed with customers to move AI from pilot to production, measured on business outcomes. Read against the cost reckoning, the timing is pointed: the same week enterprises admit they cannot prove ROI, Microsoft sells the missing implementation muscle directly, priced on outcomes rather than licenses. Early partners include London Stock Exchange Group, Unilever, and Land O'Lakes, with the Big Four lined up to scale it. Useful help, but define the outcome in your own metrics before you rent someone else's.
Deals
Together AI raises $800M to sell the cheaper way to run models
Together AI closed an $800 million Series C near an $8.3 billion valuation for a platform that lets enterprises train and run open-source models as a substitute for frontier APIs. The timing rides the cost story directly. As token bills land on CFO desks, "run a good-enough open model on dedicated capacity" stops being an ideology and becomes a line-item argument. Together is positioning as the FinOps-friendly counterweight to per-token premium pricing, and $800 million says investors expect that pitch to find a very large, very cost-conscious audience this year.
Product
Google Cloud ships Spend Caps that pause the meter
Google Cloud introduced Spend Caps that enforce automated budget ceilings and ultimately pause API traffic once a limit is hit, across its AI and agent platforms, Cloud Run, and Maps. This is the token-cost crisis becoming native infrastructure: not a dashboard that reports the overrun after the fact, but a hard stop that prevents it. For platform teams, a circuit breaker on runaway agent spend is exactly the control that was missing when Uber's budget evaporated. Expect every major cloud to ship an equivalent, because buyers are now asking for it by name.
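The difference between a dashboard and a hard stop is easy to show in a few lines. The sketch below is a generic budget circuit breaker, not Google Cloud's Spend Caps implementation or API; `SpendCircuitBreaker`, `BudgetExceeded`, `fake_agent_step`, and the dollar figures are all hypothetical names and values chosen for illustration.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a call would push spend past the ceiling."""


class SpendCircuitBreaker:
    """Refuses further calls once the budget ceiling would be crossed."""

    def __init__(self, ceiling_usd):
        self.ceiling_usd = ceiling_usd
        self.spent_usd = 0.0
        self.paused = False

    def call(self, fn, estimated_cost_usd):
        # Check before the call runs, so the overrun is prevented
        # rather than reported after the fact.
        if self.paused or self.spent_usd + estimated_cost_usd > self.ceiling_usd:
            self.paused = True
            raise BudgetExceeded(
                f"paused at ${self.spent_usd:.2f} of ${self.ceiling_usd:.2f}"
            )
        result = fn()
        self.spent_usd += estimated_cost_usd
        return result


def fake_agent_step():
    # Stand-in for a real model or agent call.
    return "ok"


breaker = SpendCircuitBreaker(ceiling_usd=1.00)
completed = 0
try:
    for _ in range(10):  # an agent loop that would otherwise keep running
        breaker.call(fake_agent_step, estimated_cost_usd=0.25)
        completed += 1
except BudgetExceeded as exc:
    print(f"stopped after {completed} steps: {exc}")
```

Once tripped, the breaker stays paused until someone resets it deliberately, which is the point: resuming spend becomes a decision rather than a default.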
Workforce
The talent is voting, and it is walking to Anthropic
Google delayed Gemini 3.5 Pro to July for quality work, and reporting ties the slip to a talent exodus, including senior researchers leaving for Anthropic. For enterprise buyers this is a supply signal, not gossip. Model roadmaps are only as reliable as the teams behind them, and when the people who build a flagship leave mid-cycle, committed availability dates get softer. If your 2026 plan assumes a specific Gemini capability on a specific date, treat that date as a forecast, not a contract, and keep a fallback provider wired into the architecture.
Governance
Salesforce and Databricks build a shared floor for agents
Salesforce and Databricks expanded their partnership to connect enterprise data with the permissions, approvals, and workflows agents need to act safely. The problem they name is real: the data that powers an agent is usually severed from the business context and security controls required to let it do anything consequential. This is the unglamorous plumbing that decides whether an agent updates a production record or just drafts a suggestion. Whoever owns that trusted connective layer owns real leverage over how enterprise agents get deployed, and increasingly, how their cost gets metered.
On the Radar
Nine signals, sharpened.
Compute
Meta staff burned an estimated 70 trillion-plus tokens in about 30 days. The scale of internal consumption, reported before Meta capped spend, shows how fast agentic usage compounds without governance. MLQ
Governance
Firms were calling FinOps advisers in April already 3x over their 2026 token budget. Executive director J.R. Storment's account puts a timeline on how fast the overruns hit. Usage.ai
Product
Google adds two Gemini image models. Gemini 3.1 Flash Image and a quality-first Gemini 3 Pro Image target design, marketing, and enterprise creative work, priced from $0.50 to $12 per million tokens. Build Fast with AI
Deployment
Salesforce Agentforce crosses 29,000 deals for roughly $800M ARR. A rare hard revenue number in a category thick with pilot counts and unproven ROI claims. Salesforce
Deployment
Microsoft Copilot Studio reports 160,000 organizations running 400,000-plus custom agents. The install base that Frontier Company is built to convert into measurable outcomes. State of AI Agents
Infrastructure
Cohere will anchor a multibillion-dollar Canadian data center built by CoreWeave. A sovereignty-minded North American stack for buyers with data-residency requirements. Data Center Dynamics
Deals
SoFi acquires Composer to add AI-built investing strategies. Another vertical folding agentic features into a regulated, revenue-bearing workflow rather than building from scratch. BetaKit
Policy
EU AI Act's core obligations become applicable August 2, 2026. General-purpose model and transparency duties arrive on schedule, giving compliance teams about a month. artificialintelligenceact.eu
Deals
MGX closes a $49B AI fund. The two-year-old Abu Dhabi firm now ranks among the sector's most consequential investors, concentrating sovereign capital into AI infrastructure. Bloomberg Law
Quick Hits
The wider field, one line each.
Uber capped AI-tool spend at $1,500 per tool per month after burning its full 2026 budget in four months. Fortune
Nearly a quarter of tech leaders now spend $200-$500 per developer per month on AI coding tokens; about 6% top $2,000. Gartner
Agent startup Lindy moved 100% of its traffic from Claude to DeepSeek to cut model costs. Fortune
56% of CEOs report no clear revenue or cost benefit from AI yet, even at record adoption. Tech Funding News
Anthropic re-enabled Claude Mythos 5 for select US organizations alongside Fable 5's global return. Anthropic
Fable 5 stays free to Pro and Max tiers only through July 7, then shifts to usage credits. Build Fast with AI
GPT-5.6 remains gated to government-vetted partners with no committed broad launch date. Fortune
Together AI's $800M Series C was led by Saudi Aramco's venture arm at a roughly $8.3B valuation. Tech Startups
Legora raised $550M at a $5.55B valuation, pushing legal AI into real estate and healthcare. Tech Startups
Joulent landed $1.75B in strategic financing for AI-driven energy infrastructure. Crunchbase
Queue raised $12.6M for autonomous pharmacy robots; Probook raised $40M for home-services AI. Tech Startups
Q1 2026 venture funding hit roughly $300B, AI-led, with no sign of cresting. Crunchbase
The Number
2028
The year Gartner projects the cost of AI coding will surpass the salary of the average developer using it, as token consumption surges.
That is not a distant warning. Nearly a quarter of tech leaders already spend $200 to $500 per developer each month on coding tokens, extreme cases hit $20,000, and developers optimize for speed, not cost. Without a governed operating model, the bill outruns the productivity it was supposed to buy.
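A back-of-envelope way to size the gap between today's spend band and a developer's pay is sketched below. The $200 and $500 monthly figures come from the band cited above; the $150,000 annual salary and the two-year horizon are assumptions picked purely for illustration, not Gartner's inputs, and `required_annual_growth` is a hypothetical helper.

```python
def required_annual_growth(monthly_spend, annual_salary, years):
    """Annual growth multiple at which monthly token spend reaches
    monthly salary after `years`, assuming steady compounding."""
    monthly_salary = annual_salary / 12
    return (monthly_salary / monthly_spend) ** (1 / years)


ASSUMED_SALARY = 150_000  # hypothetical annual salary, in dollars
HORIZON_YEARS = 2         # hypothetical: mid-2026 to 2028

for spend in (200, 500):
    multiple = required_annual_growth(spend, ASSUMED_SALARY, HORIZON_YEARS)
    print(f"${spend}/month needs about {multiple:.1f}x growth per year")
```

Under those assumptions, a developer at $500 a month would need spend to grow roughly fivefold in each of two consecutive years to reach salary parity. Swap in your own salary and spend figures before drawing any conclusion from the arithmetic.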
Counter-Signal
Risk
The wrong lesson is to spend less.
The reflex this week is to clamp down, and for ungoverned budgets that is correct. But the failure was never spending; it was spending blind. The danger now is over-correction: enterprises so burned by a runaway invoice that they freeze, wait "12 to 18 months to prove ROI," and quietly cede ground to competitors who learned to spend well instead of little.
The uncomfortable truth underneath the cost panic is that value and consumption are not the same thing, in either direction. Just as heavy token use never guaranteed results, a slashed budget does not guarantee efficiency; it can just as easily starve the one deployment that was actually working. The teams that win the next year will not be the cheapest or the biggest spenders. They will be the ones who can name, in dollars and in outcomes, what each dollar of AI spend returned, and then move money toward what works. Discipline is a budget you can defend line by line, not a budget you are afraid to use.
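Naming what each dollar returned can start as a very small table. The sketch below ranks deployments by measured return per dollar of AI spend; the `Deployment` class, the deployment names, and every dollar figure are invented for illustration, and the hard part in practice is the measured return itself, which has to come from a baseline agreed before the spend.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    name: str
    spend_usd: float            # AI spend over the period
    measured_return_usd: float  # dollar value of movement against baseline

    @property
    def return_per_dollar(self):
        # Assumes spend_usd is greater than zero.
        return self.measured_return_usd / self.spend_usd


# Hypothetical figures, for illustration only.
deployments = [
    Deployment("invoice-matching", 20_000, 90_000),
    Deployment("code-assistant", 120_000, 150_000),
    Deployment("research-agent", 60_000, 15_000),
]

# Best return per dollar first.
ranked = sorted(deployments, key=lambda d: d.return_per_dollar, reverse=True)
below_breakeven = [d.name for d in ranked if d.return_per_dollar < 1.0]

for d in ranked:
    print(f"{d.name}: ${d.return_per_dollar:.2f} returned per $1 spent")
print("review before renewing:", below_breakeven)
```

Note that the biggest spender here is neither the best nor the worst performer, which is the point of the paragraph above: the ranking, not the size of the budget, tells you where to move money.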
From the Field
The story this week was not a model or a megadeal. It was the invoice, and the quiet realization that two years of AI enthusiasm had been measured in the wrong unit.
We have watched this movie before, in every technology cycle. A new capability arrives, usage explodes because usage feels like progress, and then someone in finance asks the unglamorous question: what did we get for it? The answer is rarely nothing. But it is almost never what the consumption numbers implied, because inputs were never the point. Outcomes were. Tokenmaxxing was just the latest way to confuse motion with progress.
The good news is that this is a solvable problem, and a familiar one. You meter it, you cap it before you scale it, you tie it to a baseline you can defend, and you move money toward the deployments that actually move a number. None of that requires spending less. It requires spending on purpose.
On a holiday weekend built around independence, the freedom that matters in enterprise AI is not freedom from cost. It is freedom from spending you cannot explain. Enjoy the Fourth 🇺🇸, and when you are back, pull up the meter.
Let's get to production, AK
Anthropic is a Spearhead technology partner, and its Claude model produced this edition under human editorial direction. Anthropic appears throughout, and the framing is analytical and frequently unflattering to the AI-vendor category, including Anthropic: the Big Story uses Anthropic's own Claude Code as a lead example of runaway enterprise cost, cites Microsoft cancelling Claude Code subscriptions, and notes a customer migrating from Claude to DeepSeek to save money. Gartner's 2028 developer-salary crossover is a projection; the FinOps 31%-to-98% and 73% overrun figures are from the FinOps Foundation / State of FinOps; Uber, Meta, and Microsoft behavior is from Fortune and Inc. Meta's ~70 trillion-token estimate (via SemiAnalysis/MLQ), Lindy's DeepSeek switch, and the 56%-of-CEOs figure are single- or lower-tier sourced and flagged as such. Vendor-reported adoption figures (Agentforce, Copilot Studio) are labeled. A one-time US-flag emoji appears in the July 4 greeting at the operator's explicit request, an exception to the no-emoji rule. No India-domiciled outlets were used. The reverse test was applied; all editorial decisions are human-directed.
The Agentic Enterprise
Know more about AI than 95% of your peers. By 7 AM.
A daily AI intelligence briefing for enterprise leaders, published by Spearhead. We build AI systems that work. Strategy. Engineering. Production. Outcomes.
© 2026 Spearhead. All rights reserved.