What Is an AI Factory, and Why Is NVIDIA Franchising Them Out
An AI factory is a data center facility with GPUs for AI workloads. NVIDIA coined the term at GTC 2024 to rebrand GPU-dense data centers.[1] The rebranding has a business purpose: NVIDIA now sells complete factory configurations, from chips to full data center blueprints, bundled with $4,500/GPU/year software licensing. The model looks less like chip sales and more like McDonald's franchising.
Cutting through the jargon
"Compute" and "AI factory" describe things that existed long before the branding. Compute is processing power, which today usually means GPU servers running AI workloads: training (creating an AI model) or inference (running a trained model). An AI factory is a data center facility with GPUs for AI workloads.
The product ladder
NVIDIA started as a GPU chip designer. Over time, it expanded into networking (InfiniBand adapters, NVLink switches), DPUs (data processing units for offloading network and storage tasks), and CPUs (the Grace ARM processor). It now sells at seven levels of the stack, each one capturing more of the total bill of materials.
- Chip. The B200 GPU runs $45-$50k per module. The upcoming Rubin R200 ships in the second half of 2026.
- Baseboard. The HGX baseboard holds 8 SXM GPUs wired together via NVLink and NVSwitch, NVIDIA's proprietary interconnect to speed up communication between GPUs.
- Networking. InfiniBand adapters (ConnectX), switches (Quantum), NVLink switches, and BlueField DPUs. NVIDIA acquired Mellanox in 2020 for $6.9 billion to own this layer.
- Server. OEMs (Dell, Supermicro, Lenovo, HPE) build servers around NVIDIA baseboards. NVIDIA also sells full servers of its own, such as the DGX B200.[2]
- Rack. The GB200 NVL72 is a full liquid-cooled rack: 72 Blackwell GPUs and 36 Grace CPUs.[3] Every component in the rack is NVIDIA-designed silicon.
- SuperPOD. A DGX SuperPOD with GB200 systems connects 8 or more racks (576+ Blackwell GPUs).[4] The next-generation SuperPOD with Vera Rubin NVL72, announced at GTC 2026, packs 14 racks with 1,008 Rubin GPUs.[5]
- AI factories. NVIDIA partnered with Bechtel, one of the largest construction firms in the world, to modularize the AI factory blueprint for repeatable, faster builds.[6] NVIDIA provides reference architectures through Omniverse (its simulation and digital twin platform) and has designed an 800VDC power standard for AI factories.
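The SuperPOD GPU counts in the list above follow directly from the 72-GPU NVL72 rack. A quick sanity check, using only figures from the text:

```python
# GPU counts per SuperPOD scale out from the 72-GPU NVL72 rack.
GPUS_PER_NVL72_RACK = 72

def superpod_gpus(racks: int) -> int:
    """Total GPUs for a SuperPOD built from NVL72 racks."""
    return racks * GPUS_PER_NVL72_RACK

print(superpod_gpus(8))   # GB200 SuperPOD minimum: 8 racks -> 576 GPUs
print(superpod_gpus(14))  # Vera Rubin SuperPOD: 14 racks -> 1,008 GPUs
```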
NVIDIA's product ladder
Each step up the ladder shifts more of the customer's spend to NVIDIA. At the chip level, GPUs account for 60-70% of server hardware cost. At the rack level, NVIDIA components (GPUs, NVLink switches, InfiniBand adapters) account for 76% or more of total spend.
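A rough sketch of that shift, keeping only the percentage ranges from the text; the dollar figures are hypothetical round numbers, not NVIDIA pricing:

```python
# Illustrative only: how much of a customer's spend flows to NVIDIA
# at each rung of the ladder. Percentage shares come from the article;
# the server and rack costs are made-up round numbers.

def nvidia_capture(total_spend: float, share: float) -> float:
    """Dollars flowing to NVIDIA given its share of the bill of materials."""
    return total_spend * share

server_cost = 400_000    # hypothetical 8-GPU HGX server cost
rack_cost = 3_500_000    # hypothetical GB200 NVL72 rack cost

# Chip level: GPUs are 60-70% of server hardware cost.
print(nvidia_capture(server_cost, 0.60), nvidia_capture(server_cost, 0.70))
# Rack level: NVIDIA components are 76% or more of total spend.
print(nvidia_capture(rack_cost, 0.76))
```

The point of the sketch: moving the same customer from servers to full racks roughly multiplies both the deal size and NVIDIA's share of it.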
In fiscal year 2025 (ended January 2025), NVIDIA posted $130.5 billion in revenue, with data center sales accounting for $115.2 billion.[7] In fiscal year 2026, total revenue grew to $215.9 billion. Data center revenue hit $193.7 billion, 90% of the total, up 68% year over year.[8]
By GTC 2026, Jensen Huang stated it plainly: "We are a vertically integrated computing company. There is no other way."[9] He projected at least $1 trillion in cumulative revenue from 2025 through 2027.
How the franchise works
NVIDIA provides more than just hardware:
- The software stack: CUDA, TensorRT, NIM (NVIDIA Inference Microservices), and more.
- The blueprint: reference architectures, validated configurations, and partnerships for digital twins and modular construction.
- The brand: "Dell AI Factory with NVIDIA," NVIDIA branding on every partner deployment.
Partners bring the capital, sourced from private equity, sovereign wealth funds, family offices, or corporate balance sheets. They supply the real estate, either owned data center space or capacity leased from colocation providers. They handle power and cooling: electrical infrastructure and utility contracts. And they bring the customer relationships, signing agreements with AI companies, enterprises, and cloud buyers.
| NVIDIA AI Factory | |
|---|---|
| Franchisor provides | Silicon, networking, software stack, reference architectures, validated designs |
| Franchisee provides | Capital, data center space, power, cooling, customer relationships |
| Royalty mechanism | $4,500/GPU/year software license + hardware margins |
| Lock-in | CUDA ecosystem, AI Enterprise, 2-year architecture cycles |
| Scale | 67 AI-linked deals in 2025, 80-90% accelerator market share |
| Revenue | $193.7B data center revenue (FY2026) |
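The royalty line in the table compounds with scale. A minimal sketch of the recurring software component alone, using the $4,500/GPU/year figure from the table (the cluster sizes are illustrative):

```python
# Annual recurring software revenue at $4,500 per GPU per year.
LICENSE_PER_GPU_YEAR = 4_500

def annual_license(gpu_count: int) -> int:
    """Yearly software licensing bill for a cluster of gpu_count GPUs."""
    return gpu_count * LICENSE_PER_GPU_YEAR

for gpus in (576, 10_000, 100_000):
    print(f"{gpus:>7} GPUs -> ${annual_license(gpus):,}/year")
```

At SuperPOD scale (576 GPUs) the license is about $2.6M a year; at hyperscale deployments it reaches into the hundreds of millions, recurring regardless of hardware refresh cycles.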
NVIDIA is also increasingly taking equity stakes in its largest "franchisees": $2 billion in CoreWeave (January 2026),[10] $2 billion in Nebius for an 8.3% stake (March 2026),[11] and participation in xAI's $6 billion round (December 2024).[12]
The franchising is formalized through NVIDIA's AI Factory program.[13] NVIDIA certifies partners at every layer of the stack:
- OEMs: Dell, HPE, and Lenovo build servers around NVIDIA HGX boards.[14] ODMs like Supermicro and QCT (Quanta) serve hyperscalers and cloud providers.
- Colocation: NVIDIA maintains a DGX-Ready Colocation directory of certified data centers with the power density, cooling, and structural capacity for GPU racks.[15] In the Americas: Equinix, Digital Realty, QTS, and others.
- Managed services: DGX-Ready Managed Services partners handle deployment and operations for operators who want the hardware without building a full technical team.[16]
- Cloud: NVIDIA Cloud Partners (NCPs) run NVIDIA hardware as a service. Reference platform NCPs include Lambda, Nebius, and Crusoe Cloud.
- Construction and power: Bechtel handles physical builds. Schneider Electric and Vertiv provide power and cooling infrastructure. NVIDIA published a Vera Rubin DSX reference design at GTC 2026 with an Omniverse digital twin blueprint for planning gigawatt-scale AI factories.[17]
The NVIDIA Marketplace lists the full partner directory, filterable by category.[18] The ecosystem resembles a manual for potential franchisees: certified suppliers for every component, validated configurations at every scale, and NVIDIA branding on the finished product.
Sovereign AI factories
NVIDIA is franchising AI factories to sovereign governments the same way: it provides the playbook, while local partners provide the capital, land, power, and regulatory access.
Saudi Arabia. HUMAIN, backed by the Saudi Public Investment Fund, is deploying 18,000 GB300 Grace Blackwell GPUs with InfiniBand networking.[19] SDAIA (the Saudi Data and AI Authority) is building a sovereign AI factory with up to 5,000 Blackwell GPUs.[20]
Japan. SoftBank is building Japan's most powerful AI supercomputer on the Blackwell platform.[21]
India. Reliance Industries is building a 3 GW data center in Jamnagar, Gujarat, as part of a $110 billion AI infrastructure investment.[22] Yotta Data Services is building a $2 billion AI hub with NVIDIA GPUs.[23]
Europe. Germany is deploying 10,000 Blackwell GPUs as the world's first industrial AI cloud, operated by Deutsche Telekom.[24]
Beyond picks and shovels
For operators building sub-$100 million clusters, buying into the NVIDIA franchise means predictable deployment at the cost of NVIDIA capturing most of the hardware margin.
Operator economics: 576-GPU B200 cluster
Validated configurations reduce deployment risk. A GB200 NVL72 cluster built to NVIDIA's reference architecture is a known quantity: 120-132 kW per rack, liquid cooling required, CUDA + NVIDIA's software stack, published performance benchmarks.[3] Lenders can underwrite it against published specs rather than custom engineering.
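Those published specs make the power envelope easy to model. A back-of-the-envelope sketch for the 576-GPU cluster, taking the 120-132 kW rack range from the reference architecture; the PUE and electricity price are assumptions for illustration, not NVIDIA figures:

```python
# Back-of-the-envelope power and energy cost for an 8-rack NVL72 cluster.
RACKS = 576 // 72        # 8 NVL72 racks for 576 GPUs
HOURS_PER_YEAR = 8_760

def annual_energy_cost(kw_per_rack: float, pue: float, usd_per_kwh: float) -> float:
    """Yearly electricity cost: IT load * PUE * hours * price per kWh."""
    it_load_kw = RACKS * kw_per_rack
    return it_load_kw * pue * HOURS_PER_YEAR * usd_per_kwh

# Assumed PUE of 1.2 and $0.08/kWh -- both hypothetical inputs.
low = annual_energy_cost(120, pue=1.2, usd_per_kwh=0.08)
high = annual_energy_cost(132, pue=1.2, usd_per_kwh=0.08)
print(f"${low:,.0f} - ${high:,.0f} per year")
```

Roughly 1 MW of IT load and a high-six-figure annual power bill under these assumptions: the kind of line item a lender can model directly from the published rack spec.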
NIM (NVIDIA Inference Microservices) further simplifies model deployment for operators without deep AI teams. NVIDIA's DGX-Ready Colocation and Managed Services directories point operators to pre-certified facilities and deployment partners.[15][16]
For investors and lenders, the franchise model standardizes the asset class. A GPU cluster built to NVIDIA's reference specs is underwritable the same way a McDonald's franchise is: the brand, the playbook, and the demand profile are known quantities. A lender can model expected utilization and revenue against published benchmarks. A one-off cluster with custom engineering is a harder bet.
CoreWeave validates the approach: $5.1 billion in 2025 revenue, a $66.8 billion contracted backlog, and 60% adjusted EBITDA (earnings before interest, taxes, depreciation, and amortization) margins, all built on NVIDIA hardware with NVIDIA software.[25]
NVIDIA's two-year architecture cycle (Hopper to Blackwell to Rubin) means the franchise playbook updates regularly. Each generation comes with validated designs from day one. As of early 2026, NVIDIA holds roughly 85% market share in AI accelerator chips,[26] with over 90% in training workloads.[27]
Operating under the NVIDIA brand lets a franchisee benefit from NVIDIA's demand generation. NVIDIA's CUDA framework launched in 2006; nearly two decades and 4 million developers later, every major AI framework relies on it. NVIDIA is also investing in open-source models. Every open-source model that gains adoption creates inference demand, and every inference deployment needs GPUs. NVIDIA released the Llama Nemotron model family in March 2025[28] and partners with Mistral AI on frontier open-source models.[29]
References
1. TechCrunch, "NVIDIA CEO wants enterprise to think 'AI factory,' not data center" (March 2024)
2. NVIDIA, DGX B200 product page and user guide (2025)
3. NVIDIA, GB200 NVL72 product page (accessed March 2026)
4. NVIDIA, DGX SuperPOD product page (accessed March 2026)
5. NVIDIA Newsroom, Vera Rubin platform announcement (March 2026)
6. IndustryTap, "Bechtel to Modularize NVIDIA's AI-Factory Blueprint" (2026)
7. NVIDIA Q4 FY2025 and full-year earnings (January 2025)
8. NVIDIA Q4 FY2026 and full-year earnings (January 2026)
9. SiliconANGLE, "NVIDIA CEO Jensen Huang bids to own the entire AI factory stack" (March 2026)
10. CIO Dive, "Nvidia backs CoreWeave with $2B" (January 2026)
11. Reuters, "Nvidia to invest $2 billion in AI cloud firm Nebius" (March 2026)
12. TechCrunch, "NVIDIA's AI empire: a look at its top startup investments" (January 2026)
13. NVIDIA, "AI Factories" solutions page (accessed March 2026)
14. NVIDIA Newsroom, "Computer Industry Joins NVIDIA to Build AI Factories" (2024)
15. NVIDIA, DGX-Ready Colocation Data Centers (accessed March 2026)
16. NVIDIA, DGX-Ready Managed Services (accessed March 2026)
17. NVIDIA Newsroom, Vera Rubin DSX AI Factory Reference Design (March 2026)
18. NVIDIA Marketplace, AI Factory partners (accessed March 2026)
19. NVIDIA Newsroom, "HUMAIN and NVIDIA announce strategic partnership to build AI factories in Saudi Arabia" (2025)
20. NVIDIA Newsroom, "Saudi Arabia and NVIDIA to build AI factories" (2025)
21. NVIDIA Newsroom, "NVIDIA and SoftBank accelerate Japan's journey to global AI powerhouse" (November 2024)
22. TechCrunch, "Reliance plans world's biggest AI data centre in India, report says" (January 2025)
23. CNBC, "India's Yotta plans $2 billion AI hub with Nvidia GPUs" (February 2026)
24. NVIDIA Blog, "NVIDIA and Deutsche Telekom Launch World's First Industrial AI Cloud to Advance European Manufacturing" (June 2025)
25. CoreWeave Q4/FY2025 earnings report and SEC S-1 filing
26. The Motley Fool, "Nvidia's 85% GPU Market Share Faces Growing Competition" (January 2026)
27. Mizuho Securities, NVIDIA AI chip market share analysis (October 2025)
28. NVIDIA Newsroom, Llama Nemotron family launch (March 2025)
29. Mistral AI, "Mistral AI and NVIDIA partner to accelerate open frontier models" (2026)
Frequently Asked Questions
What is an AI factory?
An AI factory is a data center facility with GPUs for AI workloads. NVIDIA coined the term at GTC 2024 to rebrand GPU-dense data centers. The rebranding has a business purpose: NVIDIA now sells complete factory configurations, from chips to full data center blueprints, bundled with $4,500/GPU/year software licensing. The model looks less like component sales and more like McDonald's franchising.
What is NVIDIA's product ladder?
NVIDIA historically focused on a single product: the GPU chip. It now sells at seven levels of the stack, each one capturing more of the total bill of materials: chip (B200 GPU at $45,000-$50,000), baseboard (HGX with 8 SXM GPUs via NVLink), networking (InfiniBand adapters and switches, NVLink switches, BlueField DPUs), server (DGX B200), rack (GB200 NVL72), SuperPOD (576+ Blackwell GPUs), and full AI factory blueprints. At the chip level, GPUs account for 60-70% of server hardware cost. At the rack level, NVIDIA components account for 76% or more of total spend.
How does NVIDIA's AI factory franchise model work?
NVIDIA provides the hardware, the software stack (CUDA, TensorRT, NIM), the blueprint (reference architectures, validated configurations, partnerships for modular construction), and the brand. Partners bring the capital, the real estate, power and cooling, and customer relationships. NVIDIA certifies partners at every layer: OEMs build servers around NVIDIA HGX boards, colocation providers meet DGX-Ready certification, and cloud partners run NVIDIA hardware as a service.
What is the NVIDIA GB200 NVL72?
The GB200 NVL72 is a full liquid-cooled rack: 72 Blackwell GPUs and 36 Grace CPUs. Every component in the rack is NVIDIA-designed silicon. A GB200 NVL72 cluster built to NVIDIA's reference architecture is a known quantity: 120-132 kW per rack, liquid cooling required, CUDA + NVIDIA's software stack, published performance benchmarks.