I. The Word
Here is a fun thing about “sovereign AI” in India: nobody knows what it means.
At a major AI summit in Delhi in February 2026, the word was attached to: a GPU rental service, a cloud product from a large American software company, a 105-billion-parameter language model trained on chips imported from Taiwan via California, a state-government industrial park, and the entire stack of Indian digital public infrastructure that has been around since well before anyone was calling things sovereign. A startup booth advertised “instant, sovereign GPU access.” The GPUs were made in Taiwan. A large American software company offered a “Sovereign Cloud,” which is the same software they sell everywhere else, with a sticker on it. The 105-billion-parameter model is, I should say, a very good model. It is also sovereign.
The thing you would normally do here is look up the legal definition. There isn’t one. The Information Technology Act does not define sovereign AI. The 2023 data protection act does not define sovereign AI. The Cabinet note that approved a roughly $1.25 billion mission for AI in March 2024 mentions “tech sovereignty” the way a wedding invitation mentions the weather. The 2018 national AI strategy paper does not use the term at all; it calls India the “AI Garage for 40% of the world.” A garage, you’ll notice, is where someone else parks their car.
So the situation is: a word everyone is using, attached to a great deal of capital, with no operational definition. To the IT ministry, sovereign AI means national capability. To a cloud provider, it means a data center inside Indian borders. To a chip startup, it means RISC-V. To a foundation-model company, it means a tokenizer that handles Tamil better than GPT-4's does. To a procurement officer, it means a category they can buy from. To a venture capitalist, it means a category they can sell into. Everyone agrees sovereign AI is a good thing, while disagreeing about what it is. The agreement holds because of the disagreement.
The closest thing to an authoritative definition came, fittingly, from a chip company. The CEO of a large American GPU manufacturer gave a speech in Dubai in early 2024 saying every country needs to “own the production of its own intelligence” and helpfully suggested codifying one’s national language and culture into a large language model. This is, if you squint, a perfectly reasonable thing for a chip CEO to say, in the same way it would be perfectly reasonable for the CEO of a cement company to argue that every country needs to own the production of its own roads. It is, conveniently, a definition under which sovereign AI is sold by the speaker’s company. It has been quietly adopted, in various forms, by basically every AI policy document worldwide that does not have its own definition. India’s documents are among them.
The fuzziness is not a bug. It is the strategy. As long as everyone can read their preferred meaning into the word, everyone is on the team. What I want to do in this piece is walk down the AI stack, layer by layer, and ask where India is actually building something, where it isn’t, and where the money is going. The answer is uneven, but in a more interesting way than “uneven” suggests.
II. The Stack
It is convenient to think of AI as a stack. The lowest layer is electrons. The highest layer is whatever the AI is doing for you. We will go bottom to top.
Energy
Electricity is not usually part of the AI sovereignty conversation, which is strange because every other layer assumes it. India’s installed capacity crossed 500 GW at the end of 2025, with renewables around 190 GW. Building a data center in India costs about $7 per watt — against $10 in the United States, $11 in the United Kingdom, and roughly $6 in China. India is the second-cheapest large economy in the world to build AI compute capacity in. Nobody at the summit was selling sovereign electrons, but every gigawatt-scale AI campus that gets announced is, on inspection, a bet on the cheap-power thesis. The grid is uneven across states, AI workloads want 99.999% uptime, and storage is thin, but as foundations go, “we have lots of cheap electricity” is a pretty good one.
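The per-watt figures are worth running through once, because the absolute numbers are what the campus announcements are implicitly betting on. A minimal sketch, using the article's rough $/W estimates for a hypothetical 1 GW build; this is construction capex only, with power, cooling, and the GPUs themselves excluded:

```python
# Construction cost of a hypothetical 1 GW AI campus at the per-watt
# figures quoted above. Capex only; the per-watt numbers are the
# article's rough estimates, not a market survey.

COST_PER_WATT_USD = {"India": 7, "US": 10, "UK": 11, "China": 6}
CAMPUS_WATTS = 1_000_000_000  # 1 GW

for country, per_watt in COST_PER_WATT_USD.items():
    total_billion = per_watt * CAMPUS_WATTS / 1e9
    print(f"{country:6s} ~${total_billion:.0f}B to build 1 GW")
```

At these rates a gigawatt campus in India costs roughly $3 billion less to construct than the same campus in the United States, which is the kind of spread that moves hyperscaler capital.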
Silicon
Here is where the rhetoric and the reality have the most fun together.
India has approved more than ten semiconductor projects since 2021, with cumulative committed investment north of $18 billion. The flagship is a roughly $11 billion fab in Gujarat, built by a domestic conglomerate with a Taiwanese partner, planned at 50,000 wafer starts per month. There is a $3.2 billion assembly-and-test facility in Assam, smaller units from an American memory company, two Indian conglomerates, a Taiwanese contract manufacturer, and an Israeli foundry. First chips from the flagship are targeted for late 2025; stable production by 2026.
The chips will not be AI chips. Every approved Indian fab is at mature nodes — 28 nm and above. Frontier AI silicon is fabricated at 5, 4, and now 3 nanometers, at facilities in Taiwan and South Korea, on machines from a Dutch monopolist whose equipment the United States has restricted from sale to China, and which nobody has offered to sell to India, because India has not asked. The gap is roughly five process generations. Nobody seriously expects India to close it this decade.
This sounds like a problem and is mostly not. The fabs being built are the right fabs — they will supply the chips that go into cars, appliances, telecom equipment, defence, and the enormous ecosystem of devices that does not need bleeding-edge silicon. They are not, in any meaningful sense, AI fabs, and the official position is admirably honest about this. Any Indian-designed AI accelerator, including the indigenous GPU programme announced under the AI mission with a 2029 production target, will be fabricated abroad. The mission is buying option value at this layer, not catching up.
Chips and Accelerators
One layer up is chip design, where India has been doing real work for a while.
The open-source RISC-V instruction set, for reasons that are partly technical and partly geopolitical, has become the default architecture for Indian processor design. Two academic-led cores anchor the ecosystem; both have had multiple successful tape-outs at older nodes. A clutch of fabless startups now constitutes the credible Indian footprint in AI hardware: a microcontroller company that shipped the first commercially designed Indian chip in May 2024 ($8 million Series A), a RISC-V core company spun out of the same institute, a gallium nitride defence-and-telecom company in Bangalore, a neuromorphic accelerator company building reconfigurable AI hardware, and a consumer-internet founder who has committed roughly $230 million of family-office capital to an AI venture planning to tape out an indigenous AI chip by 2026.
None of these companies is competing with the frontier. The frontier in 2026 is roughly 80-billion-transistor accelerators with multiple stacks of high-bandwidth memory bonded alongside the die, on a process that one company in Taiwan can run. Indian startups are building the tier below — application-specific accelerators, edge AI inference, low-power IoT controllers, vision processors. Taiwan, you will recall, started in the lower tiers and climbed up. India has the option. The question is whether it takes the climb seriously.
In the meantime, every GPU in every Indian AI training cluster — including every GPU in the much-celebrated subsidised national compute portal — is American. There are roughly 38,000 of them, supplied primarily by one company in Santa Clara and secondarily by a competitor in Sunnyvale. They are foreign silicon, fabricated in Taiwan, running CUDA software that is also American. Calling this stack sovereign requires a generous definition of sovereign — basically the definition under which “buying things” is a form of “owning things” — but the deployment is sovereign, in that India decides who gets to use the GPUs and at what price. This is a meaningful kind of sovereignty. It is not the kind the word evokes.
Compute Infrastructure
This is the layer where the headline numbers look best.
The national AI compute portal, approved in March 2024, originally targeted 10,000 GPUs. By late 2025 it had onboarded roughly 38,000 across fourteen empanelled providers. Pricing is the striking part. About ₹65 per GPU-hour on average, after a 40% government subsidy. A startup or researcher can rent an H100 in India for under $1.50 an hour, several multiples below global commercial benchmarks. The Indian government has effectively created the cheapest production AI compute environment in the world for domestic users by combining cheap electricity, public-private financing, and a procurement subsidy. A country with a small AI public budget cannot finance frontier model training. It can make frontier compute affordable to people who would otherwise be locked out of it. India did the second thing.
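The subsidy arithmetic is simple enough to check on the back of an envelope. A sketch, taking the article's ₹65 average post-subsidy rate and 40% subsidy as given; the ₹83-to-the-dollar exchange rate is an illustrative assumption, not a figure from the article:

```python
# Compute-portal pricing arithmetic: ~Rs 65 per GPU-hour average after
# a 40% subsidy. The Rs/USD rate below is an assumed, illustrative value.

SUBSIDY = 0.40
POST_SUBSIDY_INR = 65.0
INR_PER_USD = 83.0  # assumed exchange rate

pre_subsidy_inr = POST_SUBSIDY_INR / (1 - SUBSIDY)  # provider's list rate
post_subsidy_usd = POST_SUBSIDY_INR / INR_PER_USD   # what the user pays

print(f"Pre-subsidy rate: Rs {pre_subsidy_inr:.0f}/GPU-hour")
print(f"User pays:        ${post_subsidy_usd:.2f}/GPU-hour")
```

Roughly ₹108 of list price per GPU-hour, of which the user pays about $0.78 — comfortably under the $1.50 figure, and several multiples below global commercial benchmarks.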
The data centre buildout is, separately, dramatic. Commercial capacity stood at about 950 MW at the end of 2024 and is projected to roughly double by the end of 2026 and reach 9 GW by 2030. A domestic energy and telecom conglomerate has announced a 1 GW campus in Gujarat, scalable to 3 GW. An American search company in partnership with an Indian infrastructure conglomerate has announced a $15 billion gigawatt-scale build on the eastern coast. The three large American hyperscalers have together committed more than $50 billion through 2030. The Union Budget for 2025–26 extended the data-centre tax holiday to 2047. Tax-advantaged power-hungry capital infrastructure with 22-year fiscal certainty is, it turns out, a terrific business.
There is, however, a question the marketing materials skirt: when an American hyperscaler operating in India receives a legal demand from its home government for data stored on Indian soil, what happens? The hyperscaler will say it complies with both jurisdictions. The jurisdictions may give conflicting instructions. The “sovereign” suffix on the cloud product is a self-certification with no Indian regulatory standard to certify against. This is unresolved, and it is also unresolved in the EU under the CLOUD Act, where they have been working on it for years. You cannot regulate a layer that does not exist yet. First the data centres, then the doctrine.
Networking
India is structurally well-placed here. 5G commercial rollout took roughly two years and produced the world’s second-largest subscriber base. The rural fibre programme has connected hundreds of thousands of villages. Submarine cable landing stations are being added. The 2023 data protection act establishes a soft data-localisation regime with substantial extraterritorial reach. None of this is glamorous. All of it works.
Data
Here is where India is genuinely distinctive.
Three asset classes matter. The official datasets platform launched under the AI mission now hosts more than 5,500 datasets across 20+ sectors and has a five-year allocation of about ₹200 crore. The national language platform hosts more than 350 AI models, has had over a million downloads, and has signed institutional MoUs with the Indian railway system for voice-based translation. The most important asset, for my money, is the academic-led open language ecosystem from a southern technical institute, which has produced a 251-billion-token pretraining corpus across 22 languages, a 74.7-million-pair instruction-tuning dataset, and multilingual TTS datasets covering all 22 constitutionally recognised Indian languages. This is real data, generously licensed, in languages that are otherwise badly underserved by global AI.
Here is the wonky part, which I find genuinely cool. Global frontier models are bad at Indian languages — not because they cannot learn them, but because their tokenizers were trained on English-heavy corpora and consequently chop Indic-language inputs into 4 to 8 tokens per word, against 1.4 for English. Inference in Hindi or Tamil or Bengali is therefore roughly five times more expensive than inference in English on the same model. People have started calling this “the token tax.” It is the single most concrete economic argument for an Indic-first foundation model: the moat is not parameter count, it is tokenizer design plus high-quality language corpora plus inference pricing. India has all three.
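The token-tax argument reduces to one ratio. A sketch, using the article's fertility figures (about 1.4 tokens per word for English, 4 to 8 for Indic scripts on English-trained tokenizers); the specific Hindi and Tamil values below are illustrative points within that range, and the model that inference cost scales linearly with token count is an assumption:

```python
# Back-of-the-envelope "token tax": how tokenizer fertility (tokens per
# word) translates into relative inference cost, assuming per-token
# pricing. Fertility values for specific languages are illustrative.

def cost_multiplier(fertility: float, baseline: float = 1.4) -> float:
    """Relative per-word inference cost versus the English baseline,
    assuming cost scales linearly with token count."""
    return fertility / baseline

print(f"Hindi  (fertility 7.0): {cost_multiplier(7.0):.1f}x English cost")
print(f"Tamil  (fertility 6.0): {cost_multiplier(6.0):.1f}x English cost")
print(f"Indic-first tokenizer (fertility 1.6): {cost_multiplier(1.6):.1f}x")
```

An Indic-first tokenizer that pushes fertility down near English levels claws back most of the roughly five-fold cost gap before a single parameter is trained, which is the whole point.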
The 2023 data protection act is awkwardly aligned with all this — it is consent-centric, written for individual transactional data, not for trillion-token web scrapes. The AI industry’s lobbying body has formally asked for a research and training exemption for “publicly available data.” The request is pending. Whether the act gets amended, a separate framework gets written, or the data fiduciaries figure out a workable interpretation — these are the eighteen-month questions to watch.
Foundation Models
This is where the political bet is concentrated.
The AI mission’s foundation-models pillar received more than 500 proposals in its first year. Twelve teams have been selected across two phases, with allocations ranging from a few hundred thousand dollars to about $125 million for the largest awardee, an academic-led consortium based at a Mumbai technical institute. The largest commercial winner received about ₹247 crore for a 120-billion-parameter open-source model branded as India’s “sovereign LLM ecosystem.” Other awardees cover speech recognition, healthcare reasoning, generative video, and Indic translation.
Models actually shipped at the February 2026 summit included a 30-billion and 105-billion-parameter mixture-of-experts pair from the largest commercial awardee (32K and 128K context windows respectively), a 17-billion-parameter multimodal model from the academic consortium supporting all 22 official languages, a low-latency speech-to-text and text-to-speech pair with claimed character error rates below 0.6%, and a verticalised health-reasoning model. Outside the AI mission, a consumer-internet founder’s AI venture released two models in late 2024 and early 2025 — a 7-billion and a 12-billion-parameter model — and has committed roughly ₹10,000 crore over the year to its AI program.
None of these models is competitive with the global frontier on parameter count, training compute, or general reasoning benchmarks. The gap is widening, as frontier labs train on tens of thousands of GPUs for months at a time with budgets larger than the entire AI mission. If the goal is to win every leaderboard against GPT-5 and Claude 4 and Gemini Ultra, India will lose, by an enormous margin, and so will every other country, including most American AI companies, which are, for these purposes, also America.
But that is not the goal. The official thesis is that foundation models are becoming commodities, that 50-billion-parameter models will handle 95% of Indian use cases, and that the strategic value is in Indic-language reasoning, voice-first interfaces, and domain-specific deployments rather than chasing the frontier. This is plausible, and it is also, conveniently, the bet India can afford. If it works, it is better than chasing the frontier, because chasing the frontier is a game where the second-place finisher gets nothing. Specialisation is a more durable competitive position than scale at any given price point. The entire history of computing tells you this. The Indian bet is on specialisation.
A separate thing worth flagging. The largest commercial awardee disclosed in April 2025 that a central government body would take an equity stake in the company in exchange for compute resources. By global AI funding standards this is unusual, and other AI founders have asked, in essence, why this firm and not theirs. The answer is presumably that the firm in question had the right team and architecture at the right time, but it would help — and the policy community has been gently suggesting this — for future rounds to publish selection criteria up front. This is a normal piece of process improvement.
Applications
If you only read one section of this article, read this one.
India has built, over the last decade, a population-scale digital public infrastructure. Identity (a biometric system covering 1.3 billion people). Payments (a real-time rail running tens of billions of transactions a month). Documents (a government cloud document wallet). Commerce (an open digital commerce protocol). Languages (the multilingual platform mentioned above). Account aggregation (a consent-based financial data sharing framework). None of this was built for AI. All of it is now AI-relevant, in the way that the Roman road network turned out to be Christianity-relevant several centuries after it was built. A health-AI startup deploying in India does not have to build identity, payments, or document verification — it plugs into existing rails. The friction cost of deploying AI to a billion people is, in India, dramatically lower than in any other country in the world. This is enormous, and it is permanently true.
The AI mission’s application pillar (₹689 crore) has selected 30 applications across agriculture, health, climate, and disaster response. Sectoral hackathons have been run with the cyber crime coordination centre, the geological survey, the alternative medicine ministry, the small-enterprise ministry, the financial reporting authority, and the national cancer grid. There is an AI Centre of Excellence for Education with a ₹500 crore allocation in the 2025–26 budget. The defence ministry is integrating Indian-designed RISC-V chips into satellite and avionics systems for fault tolerance. None of this is AI sovereignty in the chip sense. All of it is AI sovereignty in the more useful sense — a country building things, on its own rails, in its own languages, for its own users, faster than any other country at this scale could. This is the third path between American private platforms and Chinese state platforms, and the rest of the Global South is watching.
Governance
India has explicitly chosen not to enact a horizontal AI Act of the European kind, and this is a defensible policy choice rather than a missing piece of homework.
The architecture is four-pillared. Existing horizontal law applied to AI (the IT Act, the criminal code, the consumer protection act, the copyright act, the data protection act). Ministry-level advisories — the most famous of which was issued on March 1, 2024 and rescinded on March 15, 2024 after the AI startup community made its views known. The November 2025 Governance Guidelines, organised around seven principles, principle-based and explicitly non-statutory, with an AI Governance Group, a Technology and Policy Expert Committee, and an AI Safety Institute. And sectoral regulators retaining domain oversight — the central bank for finance, the markets regulator for securities, the medical research council for health.
The case for this architecture is straightforward: AI is moving too fast for primary legislation, sectoral regulators understand their sectors, and a principles-based framework leaves room to adapt. The case against is that voluntary guidelines have no teeth and that something will eventually go wrong. The European Union has the world’s most comprehensive AI Act and also, by most measures, the world’s least competitive AI industry. Causation is hard to establish, but the correlation should give pause to anyone who thinks “more rules” is the obvious answer.
India hosted the 2026 summit, renamed from “AI Safety Summit” to “AI Impact Summit,” which is a substantive choice — the renaming reflects the global shift away from existential-risk framing toward implementation and deployment. The Delhi Declaration is organised around People, Planet, and Progress. It is the first major AI governance document from the Global South, and it signals a real reorientation of the international conversation.
III. The Money
Let’s talk about the budget for a minute, because it tells you what the strategy actually is.
The five-year mission allocation, approved in March 2024, is about ₹10,372 crore — call it $1.25 billion. It is divided across seven pillars plus overheads, in round numbers: 44% for compute capacity, 19% for foundation models, 19% for startup financing, 9% for skills, 7% for application development, 2% for the datasets platform, 0.2% for safe and trusted AI, and about 1% for overheads and contingency. The safe-and-trusted-AI line item is one-fifth the size of the overheads-and-contingency line item. The signal is: India is funding capability, not governance, and is doing so on purpose. You may agree or disagree. It is at least clear.
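For the record, here is the pillar split translated back into crore, reconstructed from the article's percentages against the ₹10,372 crore corpus; the rupee figures are therefore approximations derived from rounded shares, not budget-document numbers:

```python
# Mission pillar allocations reconstructed from the article's rounded
# percentage shares of the ~Rs 10,372 crore corpus. Approximate only.

TOTAL_CRORE = 10_372
pillars = {
    "Compute capacity":        0.44,
    "Foundation models":       0.19,
    "Startup financing":       0.19,
    "Skills":                  0.09,
    "Application development": 0.07,
    "Datasets platform":       0.02,
    "Overheads":               0.01,
    "Safe and trusted AI":     0.002,
}

for name, share in pillars.items():
    print(f"{name:24s} ~Rs {TOTAL_CRORE * share:6.0f} crore")
```

The safe-and-trusted-AI pillar comes out to roughly ₹21 crore against roughly ₹104 crore of overheads, which is the one-fifth ratio in question.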
As of an analysis published in April 2026, roughly ₹400 crore had actually been released — about 4% of the five-year outlay, in two years. This can be read two ways: either the mission is underspending, which is a problem, or the mission is ramping into capability rather than dumping money into a market that has not yet absorbed it, which is good fiscal hygiene. I lean toward the second. The next phases require talent, institutions, and deployment, none of which are accelerated by spending faster than the ecosystem can use.
Meanwhile, the three large American hyperscalers have together announced more than $50 billion of Indian cloud and AI investment over the next several years. That is roughly forty times the central mission’s full five-year corpus. Public investment in the sovereign-AI agenda is dwarfed by foreign private investment in Indian compute infrastructure. This is the comparison that gets made to suggest the mission is undersized, but the mission was never going to fund the build-out of the compute layer. That was always going to be private capital. The mission was always going to be a coordinator and a subsidiser.
What the mission is funding directly is the layers where private capital won’t go: public datasets, Indic-language model training, application-layer hackathons in the alternative medicine ministry, skilling programs at tier-2 polytechnics, the 27 IndiaAI Data and AI Labs in tier-2/3 cities. None of these is a margin business. None of these gets built by AWS or Microsoft or Google. The mission is doing exactly what an industrial policy mission is supposed to do — funding the public goods that the private market won’t, and letting the private market handle the layers where it has the comparative advantage. Dressed up in maximalist sovereignty rhetoric, this is, at its core, remarkably orthodox industrial policy.
IV. So What Does It Mean
Sovereign AI in India is not a definition. It is a direction of travel. The travel is uneven because different layers of the stack have different physics. India will not be making 3-nanometer logic this decade, and pretending otherwise would be silly, and nobody important is pretending otherwise. India is building the data, the languages, the digital public infrastructure, the deployment rails, and an increasingly capable indigenous foundation-model and chip-design ecosystem on top.
The honest framing is that full-stack AI sovereignty is structurally infeasible for any country other than perhaps the United States and China, and the realistic objective for India is layered sovereignty: own the layers you can credibly own, partner on the layers where partnership is strategic, accept dependence where dependence is honest, and preserve switchability across providers and jurisdictions. This is not autarky. It is not the Chinese model, which is not replicable anyway. It is something genuinely new, and it has the unusual property of being better than the alternatives, because India has DPI and 22 official languages and 1.4 billion people and cheap power and a deep talent pool. None of those things is a product of the AI strategy. They are its inputs.
The slogan is sovereign. The stack is interdependent. The capital is mostly private, mostly foreign, mostly in compute. The genuinely sovereign assets — the languages, the data, the digital public infrastructure, the application-layer policy choices — are the ones built over decades, often by people who were not using the word sovereign when they built them. The AI mission is now layering on top.
If you ask what sovereign AI in India means in 2026, you get a different answer from every person you ask. If you ask what it will look like in 2030 — what will exist that does not exist today — most of the answers converge. There will be Indic-language models running on Indian compute, embedded in Indian DPI, used by Indian citizens, with broadly Indian governance, at prices no other country can match. You can call it sovereign if you like. You can call it whatever you like. The country is going to build it either way.
Sources
- NVIDIA blog and corporate materials, including the January 2024 World Governments Summit address in Dubai.
- Press Information Bureau release on the Cabinet approval of the IndiaAI Mission, March 7, 2024 (PRID 2012355).
- Press Information Bureau release on IndiaAI Compute Capacity crossing 34,000 GPUs (PRID 2132817).
- Press Information Bureau release on the Tata Electronics–ISM Fiscal Support Agreement, March 5, 2025 (PRID 2108602).
- NITI Aayog, National Strategy for Artificial Intelligence (2018).
- NITI Aayog, Responsible AI for All discussion papers (2021–22).
- MeitY, India AI Governance Guidelines (November 2025).
- Office of the Principal Scientific Adviser, Democratising Access to AI Infrastructure (white paper, 2025).
- RBI FREE-AI Committee Report (August 2025).
- MeitY AI Advisories of March 1, 2024 and March 15, 2024.
- Draft IT Intermediary Rules amendments on synthetic content and deepfakes (October 22, 2025).
- Digital Personal Data Protection Act, 2023, and DPDP Rules notified in 2025.
- Union Budget documents, 2024–25, 2025–26, and 2026–27.
- Observer Research Foundation, Operationalising India’s Sovereign AI Stack: From Intent to Capability (2025).
- Carnegie India / Carnegie Endowment for International Peace, work on India’s semiconductor mission and AI compute strategy (2024–2025).
- Takshashila Institution, Building India’s Data Centres (October 2025).
- Tony Blair Institute for Global Change, Sovereignty in the Age of AI: Strategic Choices, Structural Dependencies and the Long Game Ahead.
- Brookings Institution, Sovereignty, Safety, and Scale: Takeaways from the India AI Impact Summit (2026).
- Chatham House, How Middle Powers Can Weather US and Chinese AI Dominance (February 2026).
- European Commission documents on the AI Continent Action Plan and Strategic Compass.
- EY India, The AIdea of India 2026: Sovereign AI in India.
- The Ken, India Called Its AI Sovereign. The US Government Can Still Access It.
- MediaNama, IndiaAI Mission: Only Rs 400 Crore Released in Two Years (April 2026).
- Rest of World, The Myth of Sovereign AI: Countries Rely on US and Chinese Tech.
- Avasant, The Illusion of AI Sovereignty: Washington and Beijing Still Pull the Strings.
- Lawfare, Sovereign AI in a Hybrid World: National Strategies and Policy Responses.
- Inc42, From LLMs to Verticalisation: India’s Sovereign AI Models Take Shape.
- AI4Bharat technical reports and dataset releases (IIT Madras), including IndicBERT, IndicBART, Airavata, Bhasha-Abhijnaanam, Rasa, and Setu.
- Bhashini platform documentation and partnership MoUs.
- AIKosh / IndiaAI Datasets Platform documentation.
- Public materials from foundation-model awardees and AI ventures.
- C-DAC documentation on the National Supercomputing Mission, AIRAWAT-PSAI, and PARAM Rudra systems; Top500 list, November 2025 edition.
- Press releases and funding announcements from Mindgrove Technologies, InCore Semiconductors, AGNIT Semiconductors, and Morphing Machines.
- Future of Privacy Forum, Five Ways in Which the DPDPA Could Shape the Development of AI in India.
- Internet Freedom Foundation, Analysis of the 2026–2027 Budget.
- India AI Impact Summit 2026 reportage and the text of the Delhi Declaration.
- News reports from Business Standard, Business Today, The Tribune, TechCrunch, Storyboard18, Dholera Times, Indian Masterminds, and Trade Brains.