Google’s Price Hike Signals the End of Subsidized Intelligence
A Founder’s Lens | 1,227 words | 5 min read
Last week, Google quietly doubled the price of Gemini Flash for API users. This was not a routine adjustment. It was a strategic signal: the era of cheap, subsidized AI is over. From now on, true costs are set by memory, bandwidth, and compute. Founders must treat AI as core infrastructure and a recurring expense, not an optional add-on or a freemium layer. This shift reshapes the economics of every product built on these systems.
For every founder building in 2025, this is the dividing line. Every product roadmap, pricing model, and technical decision must now reflect the reality that AI costs are rising, not falling.
Google’s move is not a blip. It is a warning. The “try everything” phase is finished. Physics and resource demand have overtaken the old optimism. The true cost of intelligence now appears line by line on every founder’s balance sheet.
The End of Subsidy: Why This Is Happening Now
For three years, the Big Four spent billions subsidizing the adoption of their products. OpenAI, Google, Anthropic, and Microsoft buried losses beneath search profits and cloud bundles, each racing to become the default platform. Every startup rode this wave, believing the future would only grow faster and cheaper.
That belief has shattered against reality. Physical constraints dictate the true cost of intelligence: memory, bandwidth, hardware, and energy. Each leap forward demands more resources, not fewer. The vendors can no longer absorb these costs. The era of subsidized compute has ended.
What was once given away like venture capital is now priced like infrastructure. Compute has become freight, metered by load, measured by latency, mapped by geography. This is the new landscape. The line has moved. Startups must build for a world where every inference, every deployment, and every integration carries a real and rising cost.
Google’s price increase is just the first honest signal. Others will follow. This is not a temporary squeeze but the new normal. We have left behind the era of AI as a gimmick or marketing veneer. From this point forward, every decision about AI is a decision about system design and cost control. Every choice is an engineering choice, and every engineering choice now carries a measurable price.
The New Math: What AI Actually Costs
Using published token rates from vendor documentation and developer portals, we estimate the cost to summarize a 4,000-token document with a 500-token answer:
| Model | Input / Output Rates per 1M Tokens | ~Task Cost (4,000 in + 500 out) | Monetization Notes |
| --- | --- | --- | --- |
| Claude 3 Opus | $15 in / $75 out | ~$0.098 | Most expensive; usage-based API with premium positioning for accuracy and reasoning |
| GPT-4o (API) | $2.50 in / $10 out | ~$0.015 | Versatile API with lower per-token costs; also available via subscriptions (Plus) |
| Gemini 1.5 | $7 in / $21 out | ~$0.039 | Recently adjusted pricing via Google Cloud signals rising inference costs |
| Copilot | Bundled license | ~$0.10–$0.20* | Token costs abstracted behind per-seat licensing; value varies with usage intensity |

Estimates based on publicly available pricing as of July 2025. Actual costs may vary by workload and vendor configuration.

*Copilot bills per user, not per token. Heavy users extract more value, but every company pays whether they use it or not.
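The per-task figures follow directly from the listed rates. A quick sanity check, as a sketch that uses only the rates as printed above, not live vendor pricing:

```python
def task_cost(in_rate: float, out_rate: float,
              tokens_in: int = 4_000, tokens_out: int = 500) -> float:
    """Cost in dollars of one task, given $-per-1M-token input/output rates."""
    return (tokens_in * in_rate + tokens_out * out_rate) / 1_000_000

print(f"Claude 3 Opus: ${task_cost(15, 75):.4f}")    # ~$0.098
print(f"GPT-4o (API):  ${task_cost(2.50, 10):.4f}")  # ~$0.015
print(f"Gemini 1.5:    ${task_cost(7, 21):.4f}")     # ~$0.039
```

Pennies per task, which is exactly why the danger hides in volume rather than in any single call.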
At scale, these pennies become existential. The real danger isn’t in the individual API call. It’s in how modern AI systems compound costs through cascading operations.
Consider agentic AI stacks: a single user request triggers an initial prompt, which spawns multiple tool calls, retrieves memories, and loops through refinements. Each step generates new API calls. What starts as a simple 4,500-token task balloons to 45,000 tokens or more. A product processing modest traffic suddenly faces hundreds of thousands of dollars in annual costs.
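The cascade arithmetic is easy to sketch. Assuming the GPT-4o rates from the table and a hypothetical ten-step agentic loop in which each step re-sends roughly the original context:

```python
IN_RATE, OUT_RATE = 2.50 / 1e6, 10.00 / 1e6  # $ per token (GPT-4o API rates from the table)

def call_cost(tokens_in: int, tokens_out: int) -> float:
    """Dollar cost of a single API call."""
    return tokens_in * IN_RATE + tokens_out * OUT_RATE

# Naive view: one prompt, one answer.
single = call_cost(4_000, 500)

# Agentic reality: tool calls, memory retrieval, refinement loops. Ten
# steps, each re-sending ~4,000 tokens of context, turn a 4,500-token
# task into 45,000 tokens (the step count is illustrative).
cascade = sum(call_cost(4_000, 500) for _ in range(10))

requests_per_month = 100_000  # hypothetical, modest production traffic
print(f"single call: ${single:.3f}")                              # $0.015
print(f"cascade:     ${cascade:.3f} ({cascade / single:.0f}x)")   # $0.150 (10x)
print(f"annual:      ${cascade * requests_per_month * 12:,.0f}")  # $180,000
```

Even at these illustrative volumes, the cascade alone lands the workload in six-figure annual territory.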
But the true signal isn’t the absolute price. It’s the reversal. For three years, companies architected their systems assuming AI costs would follow the familiar tech trajectory: cheaper every quarter, approaching zero. They built sprawling, agentic workflows, liberal retry policies, and expansive context windows because the unit economics continued to improve.
Google’s price adjustment shatters that assumption. This isn’t a temporary spike but a structural shift. The era of subsidized intelligence is over. Companies that built for yesterday’s economics now face today’s physics: real compute, real energy, real costs.
If your architecture assumes AI is cheap, you’re building on sand.
When Cost Ripples Through the Stack
The cascade effect transforms pricing from linear to exponential. A modest 10% increase at the base layer compounds: systems retry failed calls, invoke fallback models, and trigger additional validation loops. That 10% hike becomes 30% by the second layer, 50% by the third.
In production, a single workflow might touch a dozen services. What begins as a quiet upstream change can cascade into outages, budget shocks, or system failure.
Most founders will not see it coming. Not until the stack breaks their margin.
Survival belongs to those who build for constraint, price for reality, and recover without flinching.
The Hidden Line Item
Your bill of materials (BOM) is lying to you. It’s missing your biggest cost: intelligence.
Every agent call burns compute. Every inference consumes resources. Every retry adds cost and latency. Model selection, token usage, and inference patterns are quantifiable inputs with hard economics. If your BOM ignores these components, you’re not modeling your actual system.
The startups that survive will track AI like any other material input. Those treating it as a nebulous “platform cost” will discover too late that they’ve been burning cash they never counted.
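One concrete way to put intelligence on the bill of materials is to cost it per unit shipped, the same way you would a physical component. A minimal sketch: the rates follow the table above, but the pipeline stages, token counts, and retry rate are hypothetical.

```python
# A per-request bill of materials that counts intelligence as a component.
# Rates follow the table above; per-stage token figures are hypothetical.

RATES = {  # $ per 1M tokens: (input, output)
    "claude-3-opus": (15.00, 75.00),
    "gpt-4o": (2.50, 10.00),
}

def ai_line_item(model: str, tokens_in: int, tokens_out: int, calls: float = 1) -> float:
    """Dollar cost of one BOM line: a model invoked `calls` times per unit."""
    in_rate, out_rate = RATES[model]
    return calls * (tokens_in * in_rate + tokens_out * out_rate) / 1_000_000

# Intelligence cost per user request, itemized like any material input.
bom = {
    "draft generation (gpt-4o)": ai_line_item("gpt-4o", 4_000, 500),
    "final review (claude-3-opus)": ai_line_item("claude-3-opus", 4_500, 300),
    "retry budget (gpt-4o, 20% of requests)": ai_line_item("gpt-4o", 4_000, 500, calls=0.2),
}

for item, cost in bom.items():
    print(f"{item}: ${cost:.4f}")
print(f"intelligence cost per request: ${sum(bom.values()):.4f}")
```

Once the number exists as a line item, it can be multiplied by volume, stress-tested against price hikes, and priced into the product like any other input.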
Startup Playbook: How to Compete in the New AI Market
The winners will batch by default, reserving real-time inference for tasks that justify ten times the cost. They will build hybrid stacks, running routine work through open-source models like LLaMA or Mistral and saving premium APIs for critical moments. Every vendor becomes a fallback option, never a dependency.
They will track unit economics from the start. At current rates, AI costs compound fast: what starts as a few thousand per month can explode to six figures as usage scales. These numbers will shape pricing decisions, not growth fantasies.
Survivors will stress-test constantly. They know what happens when costs triple, when APIs fail, or when models degrade. They can swap models and batch tasks when prices spike, not after. They monitor spend at every layer, not just the final invoice.
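The batch-and-fallback posture above can be sketched as a routing policy: send routine work to a cheap self-hosted model, escalate only when the task justifies premium rates, and keep every vendor swappable behind one interface. The model names, blended rates, and thresholds here are illustrative, not a real client.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ModelRoute:
    name: str
    cost_per_1k_tokens: float   # blended $ estimate, illustrative
    call: Callable[[str], str]  # real vendor/self-hosted client goes here

# Illustrative tiers: open-source for routine work, premium API for critical tasks.
CHEAP = ModelRoute("mistral-self-hosted", 0.0002, lambda p: f"[cheap] {p[:20]}")
PREMIUM = ModelRoute("premium-api", 0.01, lambda p: f"[premium] {p[:20]}")

def route(prompt: str, critical: bool, budget_per_call: float) -> str:
    """Pick the cheapest route that fits the task; degrade instead of blowing the budget."""
    choice = PREMIUM if critical else CHEAP
    est_tokens = len(prompt) / 4  # rough ~4-chars-per-token heuristic
    if est_tokens / 1_000 * choice.cost_per_1k_tokens > budget_per_call:
        choice = CHEAP  # fall back rather than exceed the per-call budget
    return choice.call(prompt)
```

Because every vendor sits behind the same `ModelRoute` interface, swapping providers after a price hike is a configuration change, not a rewrite.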
This is engineering for constraint. The winners won’t simply use AI. They’ll architect it as infrastructure. Every API call becomes a budget item. Every inference belongs in the BOM. In this new era, you don’t just track physical inputs. You account for intelligence.
The New Survival Test
The Big Four will continue to raise prices. Your job is to build a company that thrives because it is ready, not one that dies waiting for AI to get cheap again.
When your primary provider changes course, your product should keep running. No pricing shift or service disruption should threaten your roadmap. Build systems that absorb shocks, not amplify them.
The winners have already started preparing. The losers do not even know they are in a war. Most will realize too late that the rules have changed.