The Phantom Menace Budget-Buster: Agentic AI

Since the arrival of Claude Code and Cowork, agentic AI has seemingly gone mainstream. The seismic shock has been devastating to publicly traded software stocks and beyond. Ironically, it is also suddenly much clearer how OpenAI, Anthropic, and other AI powerhouses will pay for their massive capex spend: on the back of enterprise agentic AI. But while the providers' costs will largely be capitalized on their balance sheets, the enterprises that consume these tokens will see soaring metered costs running through their income statements. Without proper monitoring, that spend could blow a hole through many corporate budgets, particularly those focused on growth, shifting value away from traditional “old-school” technology companies and towards AI-native ones.

The shift from AI chatbots to autonomous AI agents marks a fundamental inflection point for financial stewards and leaders. Agentic AI systems (those that plan, execute multi-step tasks, call external APIs, and spawn sub-processes without human approval at each step) consume tokens at a volume that is difficult to forecast. Boards and business press forums (see Tomasz Tunguz’s recent blog post) will push companies in this direction. Meanwhile, workers are under enormous pressure to use these new tools to boost their productivity. A single workflow can consume millions of tokens in minutes. Multiply that across several teams and dozens or hundreds of concurrent agents, and the CFO faces a new category of volatile spend.

We are about to witness an explosion in metered costs. Higher productivity is likely, but finance leaders will be forced to question the ROI with rigor.

1. Understanding Agentic AI and Token Economics

Until recently, AI interaction was dominated by chatbot prompting: a user asks a question, the model responds, the exchange ends. Agentic AI is fundamentally different. Given a high-level goal like, "build me a 3-statement financial model in Excel using the attached invoices, opex and payroll, and then compile an investor deck in PowerPoint," an agent will autonomously decompose that goal into subtasks, determine the tools it needs, call web search APIs, read documents, write intermediate results, reflect on its progress, and iterate until it delivers a final output. Each one of those steps consumes tokens.

The Token Cost Structure

AI providers charge for tokens in two directions: input tokens (everything the model reads, including its instructions, prior context, and retrieved documents) and output tokens (the text it generates). In agentic systems, input costs tend to dominate, because context windows carry the full history of the agent's reasoning across potentially dozens of steps.
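To make the input-versus-output point concrete, here is a minimal sketch of how the bill for a single agent run might add up. It assumes the agent re-reads its full history at every step with no prompt caching, and the prices are illustrative placeholders rather than any provider's actual rates. The names `Pricing` and `agent_run_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    """Illustrative prices in USD per million tokens (assumed, not real rates)."""
    input_per_million: float
    output_per_million: float


def agent_run_cost(steps: int, base_context: int, output_per_step: int,
                   pricing: Pricing) -> dict:
    """Estimate the cost of an agent that re-reads its whole context each step.

    Assumption: no prompt caching, so each step's input is the base context
    plus everything the agent has generated in earlier steps.
    """
    input_tokens = 0
    output_tokens = 0
    context = base_context
    for _ in range(steps):
        input_tokens += context        # the model reads the full context again
        output_tokens += output_per_step
        context += output_per_step     # this step's output joins the history
    input_cost = input_tokens * pricing.input_per_million / 1_000_000
    output_cost = output_tokens * pricing.output_per_million / 1_000_000
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total_cost": input_cost + output_cost,
    }


# Hypothetical run: 40 steps, 20,000 tokens of starting context,
# 1,000 tokens generated per step, output priced at 5x input.
example = agent_run_cost(40, 20_000, 1_000, Pricing(3.00, 15.00))
```

Under these assumed numbers the run reads about 1.58 million input tokens against only 40,000 output tokens, so input accounts for roughly $4.74 of a $5.34 total even though output is priced five times higher per token.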

Token consumption is not predictable from a traditional budgeting standpoint for several interconnected reasons:

Agent architectures are recursive. A poorly worded prompt can send an agent into loops that consume significantly more tokens than expected.
Context accumulates. The longer an agent runs, the more expensive each subsequent step becomes, because the growing context must be re-processed.
Model selection is opaque. Employees may default to the most capable (and most expensive) model for every task when a smaller model would suffice.
Parallelism multiplies costs. Multi-agent frameworks designed for speed can spin up dozens of concurrent agents, each running its own full context.

The combination of these factors means unchecked agents can generate costs that dwarf budgets, and finance teams may not discover the overrun until it is too late.

2. Implications for Financial Leadership

CFOs are accustomed to managing SaaS subscriptions, cloud infrastructure, and headcount. AI token spend traverses all three buckets without cleanly fitting into any one of them. It is consumption-based like cloud computing, but its consumption is driven by autonomous decision-making, not human clicks. It is operationally integrated like SaaS, but its cost scales in ways unlike SaaS licenses. It replaces some human labor costs, but not in a 1:1 ratio that maps easily to headcount reduction.

Finance teams might consider:

Creating dedicated AI cost categories
Establishing a cost center structure whereby token usage can be attributed to business units
Building the reporting infrastructure to track this spend in real time, rather than waiting for monthly cloud invoices
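As a rough illustration of attributing token usage to business units, the sketch below rolls up per-workflow usage records into spend by cost center. The record fields, the business-unit labels, and the prices are all assumptions made for the example; a real implementation would pull usage from whatever logs or billing exports the provider makes available.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed illustrative prices in USD per million tokens.
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00


@dataclass(frozen=True)
class UsageRecord:
    business_unit: str   # hypothetical cost-center label
    workflow: str
    input_tokens: int
    output_tokens: int


def record_cost(record: UsageRecord) -> float:
    """Cost of one usage record in USD at the assumed prices."""
    return (record.input_tokens * INPUT_PRICE
            + record.output_tokens * OUTPUT_PRICE) / 1_000_000


def spend_by_business_unit(records: list[UsageRecord]) -> dict[str, float]:
    """Sum token spend per business unit."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        totals[record.business_unit] += record_cost(record)
    return dict(totals)


# Made-up usage records for illustration.
records = [
    UsageRecord("finance", "month-end close", 2_000_000, 100_000),
    UsageRecord("engineering", "unit tests", 10_000_000, 1_000_000),
    UsageRecord("finance", "board deck", 1_000_000, 200_000),
]
totals = spend_by_business_unit(records)
```

With these made-up records, finance comes to $13.50 and engineering to $45.00. The same rollup could be keyed by workflow or by model to show where spend concentrates.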

3. A Framework for Measuring Agentic AI ROI

Agentic AI produces benefits that are diffuse and unobservable in isolation. An agent that compresses a competitive analysis from 3 days to 3 hours does not eliminate a headcount line. A coding agent that reduces the average time to write a unit test from 20 minutes to 2 minutes produces value, but that value is embedded in hundreds of engineers' days. Traditional ROI frameworks were built for capital projects with discrete inputs and outputs, while AI operates in the fabric of work itself.

Consider a layered measurement framework that captures value at multiple levels:

Task Efficiency: Task-level efficiency metrics track the time, effort, and error rate associated with specific, repeatable tasks before and after AI deployment. This is the most granular and most credible measurement layer. Examples include time-to-first-draft for proposals, time-to-close for support tickets, and cycle time for code review.
Process Throughput: Process-level throughput metrics capture how AI changes the capacity of teams (e.g., how many reports a team can produce, how many customer inquiries can be resolved, how many contracts can be reviewed).
Business Outcomes: Organizational-level outcome metrics connect AI deployment to business results: revenue per FTE, customer satisfaction scores, time-to-market, and error rates in high-stakes processes. These are the most strategically meaningful but the hardest to attribute specifically to AI.
Cost Avoidance: Cost avoidance accounting quantifies the spend that did not occur because AI handled it — vendor fees displaced, outsourcing contracts reduced, headcount growth avoided. This is a legitimate component of ROI but must be rigorously defined to avoid overcounting.

How much of this framework a company can apply depends on its stage of AI maturity:

Experimentation: Organizations are experimenting with AI tools, costs are unpredictable, ROI measurement is anecdotal, and governance is informal. The CFO's priority is to establish measurement infrastructure before costs scale.
Adoption: AI is embedded in specific workflows, spend is growing but partially attributed, and initial ROI data is available. The CFO's priority is formalizing cost centers, implementing tiered authorization, and establishing TER baselines.
Integration: Agentic AI runs core business processes, spend is substantial and well-attributed, ROI is measured systematically, and governance is embedded in procurement and IT processes. The CFO is optimizing the portfolio of AI investments.
Transformation: AI is a primary driver of competitive differentiation, AI cost efficiency is a strategic KPI, the organization has developed proprietary agent architectures tuned for its specific workflows, and the finance function itself is AI-augmented. The CFO is managing AI as a strategic asset.

If the holy grail for companies is to become truly ‘AI-native,’ finance teams must establish a baseline before deployment, set a measurement window (see 90-day schedule and KPIs below), and review attribution assumptions with the relevant business units. AI ROI should be revisited quarterly as models improve, costs shift, and business context evolves.
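A back-of-the-envelope version of the task-efficiency layer might look like the sketch below. It reuses the unit-test example from earlier in this section (20 minutes down to 2), while the task volume, loaded hourly rate, and quarterly AI spend are invented for illustration. It also assumes every minute saved is redeployed to equally valuable work, which likely overstates the benefit and is exactly the kind of attribution assumption worth reviewing with the business unit.

```python
def task_efficiency_roi(baseline_minutes: float, assisted_minutes: float,
                        tasks: int, loaded_hourly_rate: float,
                        ai_spend: float) -> dict:
    """Estimate ROI for one repeatable task over a measurement window.

    Assumption: time saved is valued at the loaded hourly rate, i.e. it is
    fully redeployed to productive work.
    """
    if ai_spend <= 0:
        raise ValueError("ai_spend must be positive to compute a ratio")
    hours_saved = (baseline_minutes - assisted_minutes) * tasks / 60
    gross_value = hours_saved * loaded_hourly_rate
    net_value = gross_value - ai_spend
    return {
        "hours_saved": hours_saved,
        "gross_value": gross_value,
        "net_value": net_value,
        "roi": net_value / ai_spend,
    }


# Hypothetical quarter: 5,000 unit tests, $100 loaded hourly rate,
# $30,000 of token spend attributed to this workflow.
quarter = task_efficiency_roi(20, 2, 5_000, 100.0, 30_000.0)
```

Under these assumptions the quarter shows 1,500 hours saved, $150,000 of gross value, and a net return of 4x on the AI spend. Re-running the same calculation each quarter with updated baselines and prices keeps the estimate honest as conditions change.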

4. Balancing Speed, Efficiency, and Control

The tension for CFOs is that the speed and autonomy that make agentic AI valuable are also the properties that make it financially risky. The goal is not to choose between speed and control, but to build a governance structure that embeds financial controls into the agent's operating environment without limiting the potential upside. CEOs and CFOs should calibrate their expectations in stages. The following actions can be taken to establish a foundation for AI financial governance:

Within 30 Days: Map workflows and establish visibility
Within 60 Days: Implement workflow budgets and model-tier policy
Within 90 Days: Tie AI investment to revenue and productivity KPIs
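One way to picture the 60-day step is a guard that sits between the agent and the model: it picks a model tier by task type and refuses any call that would push a workflow past its approved budget. The sketch below is a simplified assumption of how that could work. The tier names, task types, prices, and the `WorkflowBudget` class are all hypothetical and not tied to any particular vendor or framework.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a charge would push a workflow past its approved budget."""


# Hypothetical policy: which model tier each task type is allowed to use.
MODEL_TIER_POLICY = {
    "summarize": "small",
    "classify": "small",
    "financial_model": "large",
}

# Assumed blended prices in USD per million tokens, for illustration only.
TIER_PRICE_PER_MILLION = {"small": 1.00, "large": 10.00}


@dataclass
class WorkflowBudget:
    name: str
    limit_usd: float
    spent_usd: float = 0.0

    def charge(self, task_type: str, tokens: int) -> float:
        """Record a call's cost, or raise if it would exceed the budget."""
        # Unlisted task types fall back to the cheapest tier.
        tier = MODEL_TIER_POLICY.get(task_type, "small")
        cost = tokens * TIER_PRICE_PER_MILLION[tier] / 1_000_000
        if self.spent_usd + cost > self.limit_usd:
            raise BudgetExceeded(
                f"{self.name}: ${cost:.2f} would exceed the "
                f"${self.limit_usd:.2f} budget"
            )
        self.spent_usd += cost
        return cost


# Hypothetical workflow with a $5.00 cap.
budget = WorkflowBudget("investor-deck", limit_usd=5.00)
budget.charge("summarize", 2_000_000)        # small tier
budget.charge("financial_model", 200_000)    # large tier
```

In this example the two calls cost $2.00 each, so a third `financial_model` call of the same size would raise `BudgetExceeded` rather than run. Whether an overrun should block the agent, downgrade it to a cheaper tier, or simply alert finance is a policy choice each organization would need to make.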

Click here for an example AI deployment scorecard integrating key revenue and productivity KPIs.

Conclusion

Agentic AI represents the most significant shift in enterprise cost structure since the move to cloud computing. Like any structural operating lever, if managed properly, it will enhance EBITDA trajectory and competitive advantage; but if managed poorly, it has the potential to become a margin-eroding monster. Organizations must have cost and utilization visibility, understand the ROI, and reallocate capital toward what works. The mandate is clear: get in front of this before it gets in front of you.

Frequently Asked Questions

1. What is agentic AI, and why does it increase enterprise token costs?

Agentic AI refers to autonomous AI systems that can plan, execute multi-step workflows, call external tools, and iterate toward a goal without continuous human input. Unlike chatbot interactions, these agents repeatedly process context, spawn subtasks, and generate intermediate outputs. Each step consumes input and output tokens, often compounding costs through recursion, context accumulation, and parallel agent execution. As a result, token spend can scale rapidly and unpredictably across teams and workflows.

2. How can CFOs measure ROI from agentic AI deployments?

CFOs can evaluate agentic AI ROI using a layered framework that captures value across operational and strategic levels. This includes task efficiency metrics (time savings and error reduction), process throughput improvements (capacity gains across workflows), business outcomes (revenue per FTE, customer satisfaction, time-to-market), and cost avoidance (reduced outsourcing or delayed hiring). Establishing baselines before deployment and reviewing performance quarterly helps organizations attribute impact accurately as AI capabilities evolve.

3. What governance strategies help control agentic AI spending and risk?

Effective agentic AI governance combines financial visibility with operational controls. Organizations should implement real-time token monitoring, create AI cost centers by business unit, define model-tier usage policies, and set workflow-level budgets. As adoption matures, governance should evolve to include authorization tiers, procurement integration, and performance scorecards that link AI spend to productivity and revenue KPIs. This approach enables innovation while preventing uncontrolled token consumption.

Viraj Parikh
