There is a fact about Indian AI infrastructure that, if you sit with it for a minute, is genuinely funny.
The Indian government, through its IndiaAI Mission, will currently sell you an hour of GPU time for as little as ₹65, which is roughly 78 cents; an H100 specifically came in around ₹92 in the latest tender, call it a dollar. If you happen to be a startup building an “indigenous foundational model,” the government will sell you that H100 for zero rupees, because the compute subsidy in that case is 100%. One hundred percent. You bring the engineers, the state brings the silicon, the silicon costs you nothing.
Meanwhile, if you walked into a commercial Indian neocloud and asked for the same H100 SXM5 on-demand, the sticker price would be around ₹249/hr, or about $2.99. If you walked into the AWS Mumbai region and asked the same question, you’d be quoted something north of ₹330/hr.
So the price of one (1) H100-hour in India, depending on who you ask, is somewhere between zero and four dollars. Between the cheapest paid price and the dearest, that is roughly a 4× spread (infinite, if you count the free tier) on what is supposed to be the most fungible commodity in the entire AI stack. The same chip. The same hour. Different invoice.
The natural question is: which one of these prices is the real price? And the answer, which I want to spend the next 3,000 words on, is that none of them are. The real price is something else entirely, and almost nobody in India is currently set up to measure it, because we have collectively decided that GPUs are like roads — public goods you pour money into so that the next layer of the stack can do something interesting — rather than like, you know, business assets that are supposed to make money.
This is fine! It might even be smart industrial policy. But it does mean that when you read about the Indian AI ecosystem and someone tells you their AI startup has a 70% gross margin, you should know that you are reading a sentence with approximately the same epistemic content as the sentence “my house is worth a lot because I really like it.”
Let me explain.
1. The basic problem
The basic problem is that an AI company’s gross margin is almost entirely a function of two things you cannot see on the income statement.
The first is the utilization rate of the underlying GPU — what fraction of the time the chip is actually doing math that someone is paying for, as opposed to (a) sitting idle, (b) reading and writing memory while the tensor cores are idle, (c) waiting for the next batch, (d) checkpointing because GPU #14,332 in the cluster just died, or (e) crunching numbers for an internal experiment that will be deleted in six weeks.
The second is the depreciation schedule — how many years you, the operator, have decided this $40,000 chip will keep earning revenue. If you say “two years,” you have to recognize $20,000 of expense per year and your gross margin looks bad. If you say “six years,” you only recognize $6,667 per year and your gross margin looks like SaaS. Same chip. Same revenue. Different number on the page. Investors trade these companies at different multiples based on the number on the page.
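If you want to see how much work that one assumption does, here is a minimal sketch, with an invented utilization figure and depreciation as the only cost, so real margins are worse than these:

```python
# Same chip, same revenue: gross margin as a function of assumed useful life.
# All inputs are illustrative assumptions, not any company's actual books.
CHIP_COST = 40_000    # USD, roughly what an H100 costs landed
HOURLY_RATE = 2.99    # USD/hr, the commercial on-demand rate quoted above
UTILIZATION = 0.60    # fraction of all hours actually billed (assumed)

annual_revenue = HOURLY_RATE * 8_760 * UTILIZATION  # ~$15,700/yr

for life_years in (2, 4, 6):
    annual_depreciation = CHIP_COST / life_years
    margin = (annual_revenue - annual_depreciation) / annual_revenue
    print(f"{life_years}-year life: ${annual_depreciation:>6,.0f}/yr expense, "
          f"gross margin {margin:>4.0%}")
```

With these assumptions the two-year book loses money and the six-year book clears 58%, off the same chip and the same revenue.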
This is true everywhere. It’s true in the United States, where a leading hyperscaler quietly extended GPU useful life from 4 to 6 years between 2022 and 2024 and added something on the order of $3 billion to annual operating income — purely from the accounting change, not from any chip getting better at its job. It’s true in Europe.
But it is especially true in India, because in India there is a third variable that the US and Europe don’t have, which is the government writing checks for the GPU bill. And this turns out to do strange things to the math.
2. The IndiaAI subsidy and the breakeven that doesn’t exist
Here are the numbers, very quickly.
IndiaAI Mission’s total budget is ₹10,371.92 crore — about $1.14 billion — over five years. Of that, ₹4,563 crore (~$500M) is earmarked for compute. As of February 2026, the Mission has onboarded 38,000+ GPUs across 14 empanelled providers, with another 20,000 in the pipeline. The lowest accepted bid in the most recent tender was ₹65/GPU-hour as a baseline rate; H100s specifically came in around ₹92/hr. The government then layers on a 40% subsidy for general approved users and a 100% subsidy for a select group of startups developing foundational models. The Minister has called it “the cheapest compute facility in the world,” which is the kind of thing politicians say but which, in this case, is approximately true.
It is also, to a first approximation, free money for compute. And free money for compute does what free money always does: it shifts the breakeven analysis from “at what utilization does this GPU pay for itself” to “how much of this should I grab before they notice.”
Let me be more precise. A leveraged commercial Indian neocloud — no subsidy, debt-financed, importing the chip with 18% IGST stacked on top (creditable for GST registrants but a working capital drag) — needs roughly 75–90% utilization at $2.20/hr realized rates to break even, depending on whether you depreciate the H100 over 4 or 6 years. Skinny margin, fragile to power outages, fragile to a single customer leaving, fragile to NVIDIA shipping Blackwell at scale.
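Here is that breakeven as a sketch. The opex, power, and PUE inputs are my assumptions, not any operator's actual numbers, and the answer is hostage to them, which is rather the point:

```python
# Breakeven utilization for an unsubsidized, debt-financed Indian neocloud.
# Opex, power, and PUE figures below are assumptions for illustration.
CHIP_COST = 40_000 * 1.18   # USD landed, 18% IGST paid upfront (credit ignored)
REALIZED_RATE = 2.20        # USD/hr actually collected, per the text
POWER_KW = 0.700 * 1.60     # 700 W chip times a Mumbai-ish PUE
POWER_USD_KWH = 0.10        # ~Maharashtra HT industrial tariff
OPEX_USD_HR = 0.50          # colo, network, people, debt service (assumed)

def breakeven_utilization(life_years: float) -> float:
    dep_per_hour = CHIP_COST / (life_years * 8_760)
    power_per_hour = POWER_KW * POWER_USD_KWH
    return (dep_per_hour + power_per_hour + OPEX_USD_HR) / REALIZED_RATE

for life in (4, 6):
    print(f"{life}-year depreciation: breakeven at "
          f"{breakeven_utilization(life):.0%} utilization")
```

With these inputs the breakeven lands around 69% at a six-year life and 89% at four, the same neighborhood as the 75–90% range above; the exact number moves with the opex line.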
A subsidized IndiaAI provider, by contrast, has a “breakeven” that is essentially whatever rate the government has agreed to pay, plus whatever they can sell on the side. The economics are not “is this asset productive” but “did I win the tender.” Which is fine for a public-goods program, except for one small detail, which is that nobody — not the government, not the providers, not the startups burning the free hours — has any structural incentive to measure goodput. The chips are paid for. The hours are paid for. Why would you care if your MFU is 22%?
The cleanest illustration of this incentive structure is the fact that, when the IndiaAI tender for 2,400 H100s went out, the AWS Managed Service Providers declined to match the lowest bid. This is in some ways the most interesting data point in the entire program. AWS has these chips. AWS would presumably like the revenue. But AWS knew that participating at the floor price would set a reference rate that would haunt the rest of its commercial book in India forever, and they walked away. The dominant US hyperscaler has decided that the marginal IndiaAI tender is worth less to it than not signaling that an H100-hour can ever cost less than $2. Which is worth thinking about, because it tells you what they think the real price is.
3. The thing nobody is measuring
Okay. So now, having established that nobody in the Indian AI ecosystem has a strong economic reason to measure GPU usefulness, let’s talk about what that usefulness actually looks like when someone bothers to measure it.
The single most important number in this entire essay is 38–43%.
That is the Model FLOPs Utilization achieved by one of the world’s most sophisticated AI infrastructure teams on a 16,384-H100 training cluster, running a flagship 405-billion-parameter model, over a 54-day continuous training window. It is one of the highest publicly disclosed MFU figures for a frontier-scale run. It is the good number.
To translate: the chips were doing the floating-point operations the model needed for about 38–43% of the wall-clock time the customer was paying for. The other 57–62% was spent reading memory, waiting on communication, recovering from failures, and doing accounting work that doesn’t show up in the loss curve. This is not a bug. This is the state of the art. Most production inference deployments run at 20–40% utilization. Most enterprise fine-tuning workloads, when measured honestly, run lower than that.
There is a separate, related number that I find delightful, which is the gap between what nvidia-smi reports and what is actually happening. You can read 100% GPU utilization from nvidia-smi while doing approximately zero floating-point operations, because the metric only measures whether some kernel was running on the GPU during the sampling window: a kernel that does nothing but shuffle memory counts the same as one saturating the tensor cores. A consulting engagement at one foundation model company found a workload showing 100% nvidia-smi utilization and 20% MFU. After optimization: 38% MFU, a 4× wall-clock speedup, same dashboard reading. The dashboard was lying. The dashboard is always lying.
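For concreteness, here is roughly how MFU is computed for a dense transformer, using the standard six-FLOPs-per-parameter-per-token approximation. The throughput figure is an assumption I chose to land near the frontier number:

```python
# MFU: useful model FLOPs achieved, divided by the hardware's peak FLOPs.
# nvidia-smi "utilization" answers a different question: was any kernel
# resident during the sampling window? Memory shuffling scores 100% there.
H100_PEAK_FLOPS = 989e12      # BF16 dense tensor-core peak, per GPU
N_PARAMS = 405e9              # parameters in the flagship model above
TOKENS_PER_SEC_PER_GPU = 160  # training throughput (assumed for illustration)

# Dense transformer rule of thumb: ~6 FLOPs per parameter per token
# for one forward-plus-backward pass.
achieved_flops_per_sec = 6 * N_PARAMS * TOKENS_PER_SEC_PER_GPU
mfu = achieved_flops_per_sec / H100_PEAK_FLOPS
print(f"MFU ≈ {mfu:.0%}")     # ≈ 39% with these inputs
```

Nothing in that calculation appears on the nvidia-smi dashboard, which is why the dashboard can read 100% while MFU reads 20%.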
And then there’s the failure rate. The same 16,384-GPU training run mentioned above logged 419 unexpected interruptions in 54 days — one every three hours — and 58.7% of those interruptions were GPU-related. The CPUs failed twice in 54 days. The $30,000 accelerators failed 246 times. The cheap part is reliable. The expensive part is the unreliable part. This is the thing you bought.
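Two lines of arithmetic are worth doing on those figures. The annualized rate is a crude extrapolation (it assumes the failure rate stays flat, which it probably doesn't), but it puts a usable number on "the expensive part is the unreliable part":

```python
# Back-of-envelope reliability math for the 54-day, 16,384-GPU run above.
interruptions = 419
days = 54
print(f"Mean time between interruptions: {days * 24 / interruptions:.1f} hours")

gpu_related = round(interruptions * 0.587)  # 58.7% were GPU-related -> 246
gpus = 16_384
annualized = gpu_related / gpus * (365 / days)
print(f"Implied GPU failure rate: {annualized:.0%} per GPU per year")
```

Roughly one in ten accelerators interrupts your run in a given year, at the state of the art, before you add tropical heat and grid flicker.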
For Indian operators, this matters more than it does in the US, for a reason that is genuinely structural and that I want to dwell on.
4. The tropical PUE tax
Power Usage Effectiveness — PUE — is the ratio of total energy a data center consumes to the energy that actually reaches the compute. A PUE of 1.0 means every joule off the grid goes into the IT equipment. A PUE of 2.0 means half your power bill is air conditioning.
Here are some PUE numbers, roughly:
- Nordic hyperscaler data center (Finland): ~1.10
- US hyperscaler data center (Texas/Iowa): ~1.15–1.20
- Frankfurt commercial colocation: ~1.30–1.35
- Mumbai industrial data center: 1.55–1.70
- Bengaluru/Chennai (renewable-optimized): ~1.45
What this means in practice is that for every 700-watt H100 you run in Mumbai, you pay for roughly 1,085 watts of grid power (700 × 1.55), where the same chip in Stockholm runs on 770 watts. The Indian “cheap power” story — Maharashtra HT industrial tariff at around ₹8.36/kWh, about $0.10/kWh, genuinely cheaper than Frankfurt — gets mostly consumed by the cooling penalty. Run the math: cheaper electrons, more electrons needed, net result somewhere between a wash and a modest deficit against Texas, and meaningfully worse than the Nordics.
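Here is that math as a sketch. The Stockholm and Texas tariffs are my assumptions, and whether Mumbai is a wash or a deficit against Texas depends almost entirely on which US tariff you plug in:

```python
# Grid power per H100-hour: cheap electrons versus extra electrons.
CHIP_KW = 0.700  # H100 SXM board power

sites = {                     # PUEs from the list above; tariffs assumed
    "Stockholm": {"pue": 1.10, "usd_per_kwh": 0.08},
    "Texas":     {"pue": 1.18, "usd_per_kwh": 0.07},
    "Mumbai":    {"pue": 1.55, "usd_per_kwh": 0.10},  # ~₹8.36/kWh HT tariff
}

for name, s in sites.items():
    wall_kw = CHIP_KW * s["pue"]
    usd_hr = wall_kw * s["usd_per_kwh"]
    print(f"{name:9s}: {wall_kw * 1000:4.0f} W from the grid, "
          f"${usd_hr:.3f}/chip-hour")
```

The per-hour dollars look small, but across a 10,000-GPU fleet running year-round, the Mumbai-versus-Stockholm gap with these inputs is about $4 million a year in power alone.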
This is the part that I think gets underappreciated in Indian AI infrastructure conversations. The bull case for India is: cheap power, cheap labor, sovereign demand. The bear case is: the cheap-power claim is half-true at best because the cooling overhead eats it, the cheap-labor claim works for FinOps and engineering but not for the chip itself (you are paying world prices, in dollars, after an 18% IGST detour), and the sovereign-demand claim works only as long as the government keeps writing the checks. Which it might. But it might not.
The single biggest cost lever an Indian operator actually has — the one that is genuinely structural and not vulnerable to subsidy reform — is labor. A senior ML platform engineer in Bengaluru costs, on the high end, about $52,000 a year. The equivalent role at a US frontier lab is somewhere between $300,000 and $800,000 in total comp. A five-person Indian FinOps team costs ~$200,000 fully loaded. The American equivalent is ten times that.
Which leads to a question: if FinOps headcount is essentially free in India relative to compute, why does almost no Indian AI company run a serious goodput-measurement program? And the answer, again, is that nobody is asking them to. The hours are free. The chips are subsidized. The customers don’t know the difference between MFU 20% and MFU 40%, and nobody on the cap table is making them care. Yet.
5. The IT services time bomb
While we’re here, I want to talk about the single biggest distortion in the Indian AI economic story, which is the IT services sector.
Top-five Indian IT services firms collectively lost more than $150 billion in market capitalization in the first nine months of 2025. One major firm announced its first-ever mass layoff in July 2025: 12,000 jobs. ICICI Direct estimates AI may cause 2–3% annual deflation in traditional IT services revenue going forward. The traditional Indian IT model is the world’s largest, longest-running labor arbitrage trade — Indian salaries, US bill rates, take the spread for forty years — and Copilot is currently eating the spread.
And here is the cruel irony: the Indian IT services sector is built to do FinOps. It has the people, the processes, the SLA-discipline DNA, the fluency in client billing structures. Indian GCCs of foreign multinationals genuinely lead the world in some of this. The skill is there. But the IT services companies themselves have the wrong economic structure to capture AI infrastructure margin — they sell hours, not outcomes, and the hours are getting cheaper because their tooling is getting better at writing code, which is the thing the hours used to do.
The companies that can capture the margin — the AI-native Indian startups — are the ones currently being subsidized to not worry about margin. Which is a very strange equilibrium.
If I had to pick a single sentence to summarize the strategic situation of Indian AI infrastructure in 2026, it would be: the entity best equipped to discover the real cost of AI is being killed by AI, and the entity in the best position to ignore the real cost of AI is being kept alive by the government on the explicit condition that it ignore the real cost of AI.
6. The depreciation schedule, briefly
I should at least mention the depreciation thing, because it’s the part of this story that is going to matter when, eventually, the Mission scales back, or the subsidy expires, or the government runs another tender at a tighter price, and somebody actually has to write down the asset.
H100 cards in India today are being booked across operators at 5–6 year useful lives, mostly mirroring US neocloud and hyperscaler practice. A handful of Indian commercial neoclouds are honest enough to use 4 years, which is roughly the longest defensible life given that NVIDIA’s Hopper-to-Blackwell-to-Rubin cadence is shipping a new generation roughly every 12–18 months and that H100 spot rental rates have already fallen ~70% from peak.
The reality, as best I can tell from talking to people who do this professionally and from looking at how lenders price the same hardware (lenders demand 3.5–5 year amortization on GPU-collateralized loans, which tells you what they actually believe), is that the real economic life of an H100 in India is probably 2.5 to 3.5 years. If you re-do every Indian neocloud’s gross margin with that assumption, instead of the 5–6 year assumption that’s currently in their books, the numbers compress by 15 to 25 percentage points. A “70% gross margin” Indian AI infrastructure business at a 2.5-year economic life is probably a 45–50% gross margin business. Which is fine — that’s a perfectly respectable infrastructure business. It is not, however, a software business. And right now most of these companies are being valued like software.
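Here is that compression as a sketch, with the revenue and opex lines reverse-engineered as assumptions so that the six-year book shows the advertised 70%:

```python
# Re-running a "70% gross margin" per-chip P&L at shorter useful lives.
# Inputs are stylized assumptions calibrated to book ~70% at six years.
CHIP_COST = 40_000       # USD
ANNUAL_REVENUE = 41_500  # USD/chip/yr (assumed; a managed-platform rate)
OTHER_COGS = 5_800       # USD/chip/yr: power, colo, support (assumed)

for life_years in (6.0, 5.0, 3.5, 3.0, 2.5):
    depreciation = CHIP_COST / life_years
    margin = (ANNUAL_REVENUE - depreciation - OTHER_COGS) / ANNUAL_REVENUE
    print(f"{life_years:>3}-year life: gross margin {margin:.0%}")
```

The move from six years to 2.5 costs roughly 23 points here, consistent with the 15-to-25-point range above.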
The IndiaAI providers don’t have this problem in the same way, because their P&L is dominated by subsidy receipts rather than chip economics. But when the program tapers — and these programs always taper — the operators who have spent five years optimizing for “win the tender” rather than for “run the chip well” are going to discover, all at once, that the chip they win the next tender against is a Blackwell that does the same workload using a quarter of the power on the same Maharashtra grid. 2027. A year from now. Roughly.
7. What does it actually mean
So what does it mean for you, if you are running an Indian AI startup, or an enterprise AI program at a BFSI firm in Mumbai, or a GCC of a US tech company, or you’re just trying to make sense of why every Indian AI deck you see has a 70% gross margin in cell B14?
A few things.
One: if you are taking IndiaAI Mission compute, take it. It’s free money. But run it on a separate set of books from your commercial workload, and measure your actual cost-per-useful-token as if you were paying ₹249/hr for it, because in 2027 you might be.
Two: measure goodput, not uptime. The single highest-leverage FinOps move you can make as an Indian AI operator is to instrument your training and inference workloads against MFU and against $/successful-inference, not against nvidia-smi. The cost of doing this in India is comically low — a five-person team is $200K — and the savings, if industry-typical waste figures (~30–50% of AI spend) translate, run into the crores.
Three: if you are an investor, the gross margin on the deck is not the gross margin on the business. Re-do it at 3-year depreciation. Re-do it at the commercial GPU rate, not the subsidized one. Re-do it at 30% utilization, which is the realistic frontier number, not 80%, which is the underwriting fiction. (There’s a sketch of this exercise just after this list.) The companies that still look good after those three adjustments are the real ones.
Four: the most undervalued asset in Indian AI right now is a senior infrastructure engineer who knows what MFU is and can get a workload from 22% to 40%. That person, in the US, costs $500K. In Bengaluru, that person costs $50K. The arbitrage is enormous, and almost nobody is running it, because the people writing checks have not yet figured out that this is the lever.
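And since point Three is the one that involves actual arithmetic, here is a minimal one-pass version of it. Depreciation is the only cost modeled, so the re-done number is, if anything, generous:

```python
# Point Three, applied: re-underwrite the deck's gross margin.
# Rates, utilizations, and lives below are illustrative assumptions.
HOURS_PER_YEAR = 8_760
CHIP_COST = 40_000  # USD

def gross_margin(rate_usd_hr: float, utilization: float,
                 life_years: float) -> float:
    revenue = rate_usd_hr * HOURS_PER_YEAR * utilization
    depreciation = CHIP_COST / life_years  # the only COGS modeled here
    return (revenue - depreciation) / revenue

# As pitched: commercial-ish rate, heroic utilization, six-year life.
print(f"deck:    {gross_margin(2.99, 0.80, 6):.0%}")   # ~68%
# As re-done: same rate, 30% utilization, three-year life.
print(f"re-done: {gross_margin(2.99, 0.30, 3):.0%}")   # deeply negative
```

With these inputs the re-done margin is not a worse margin, it is a negative one: at 30% utilization and a three-year life, the chip does not cover itself at commercial rates. The companies whose numbers survive this arithmetic are the ones worth owning.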
The Mission is, on balance, a very good thing. It is buying India a seat at a table the country would otherwise not have a seat at. But it is also, as a side effect, teaching an entire generation of Indian AI builders that compute is free, and compute is not free, and the bill is going to come due, and when it does the people who will survive are the ones who have spent the subsidy era pretending they were paying full price.
The cheapest GPUs in the world are not free. They’re just being paid for by somebody else, for now.
Welcome to the queue.
Sources
Bedrock AI · Behind the Balance Sheet · Bizety · BW Businessworld · Cast AI · Cerno Capital · CIO Dive · CloudZero · Cybernews · Data Center Dynamics · Deep Quarry · EE Times · Eurostat · Flexera (State of the Cloud 2026) · FinOps Foundation (State of FinOps 2024–2026) · GetDeploying · GMI Cloud · ICICI Direct · IEA Electricity 2026 · Inc42 · Interface · IntuitionLabs · Introl · Jarvis Labs · Latent Space · Levelheaded Investing · Llama 3 technical report · McKinsey · Mercom India · MIT NANDA (GenAI Divide) · neXt Curve · Outlook Business · Oplexa · PaLM technical report (arXiv 2204.02311) · Press Information Bureau · Princeton CITP · RawCompute · SDxCentral · SemiAnalysis (H100 Index, GB200 Benchmarks, Goodput Theory) · Silicon Data · SiliconANGLE · SME Futures · Spheron · Stanford 2025 AI Index · Tech Insider · The Information · theCUBE Research · Thundercompute · Tom’s Hardware · Trainy · Varindia · vLLM/PagedAttention paper (arXiv 2309.06180)