What Is an AI Factory, and Why Is NVIDIA Franchising Them Out
An AI factory is a data center facility with GPUs for AI workloads. NVIDIA coined the term at GTC 2024 to rebrand GPU-dense data centers.[1] The rebranding has a business purpose: NVIDIA now sells complete factory configurations, from chips to full data center blueprints, bundled with $4,500/GPU/year software licensing. The model looks less like chip sales and more like McDonald's franchising.
Cutting through the jargon
"Compute" and "AI factory" describe things that existed long before the branding. Compute is processing power, which today usually means GPU servers running AI workloads: training (creating an AI model) or inference (running a trained model). An AI factory is a data center facility with GPUs for AI workloads.
The product ladder
NVIDIA started as a GPU chip designer. Over time, it expanded into networking (InfiniBand adapters, NVLink switches), DPUs (data processing units for offloading network and storage tasks), and CPUs (the Grace ARM processor). It now sells at seven levels of the stack, each one capturing more of the total bill of materials.
- Chip. The B200 GPU runs $45-$50k per module. The upcoming Rubin R200 ships in the second half of 2026.
- Baseboard. The HGX baseboard holds 8 SXM GPUs wired together via NVLink and NVSwitch, NVIDIA's proprietary interconnect to speed up communication between GPUs.
- Networking. InfiniBand adapters (ConnectX), switches (Quantum), NVLink switches, and BlueField DPUs. NVIDIA acquired Mellanox in 2020 for $6.9 billion to own this layer.
- Server. OEMs (Dell, Supermicro, Lenovo, HPE) build servers around NVIDIA baseboards. NVIDIA also sells full servers of its own, such as the DGX B200.[2]
- Rack. The GB200 NVL72 is a full liquid-cooled rack: 72 Blackwell GPUs and 36 Grace CPUs.[3] Every component in the rack is NVIDIA-designed silicon.
- SuperPOD. A DGX SuperPOD with GB200 systems connects 8 or more racks (576+ Blackwell GPUs).[4] The next-generation SuperPOD with Vera Rubin NVL72, announced at GTC 2026, packs 14 racks with 1,008 Rubin GPUs.[5]
- AI factories. NVIDIA partnered with Bechtel, one of the largest construction firms in the world, to modularize the AI factory blueprint for repeatable, faster builds.[6] NVIDIA provides reference architectures through Omniverse (its simulation and digital twin platform) and has designed an 800VDC power standard for AI factories.
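The SuperPOD GPU counts in the list above follow directly from the 72-GPU NVL72 rack. A quick sanity check, using only figures from the text:

```python
# GPU counts per SuperPOD scale out from the 72-GPU NVL72 rack.
GPUS_PER_NVL72_RACK = 72

def superpod_gpus(racks: int) -> int:
    """Total GPUs for a SuperPOD built from NVL72 racks."""
    return racks * GPUS_PER_NVL72_RACK

print(superpod_gpus(8))   # GB200 SuperPOD minimum: 8 racks -> 576 GPUs
print(superpod_gpus(14))  # Vera Rubin SuperPOD: 14 racks -> 1,008 GPUs
```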
NVIDIA's product ladder
Each step up the ladder shifts more of the customer's spend to NVIDIA. At the chip level, GPUs account for 60-70% of server hardware cost. At the rack level, NVIDIA components (GPUs, NVLink switches, InfiniBand adapters) account for 76% or more of total spend.
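A rough sketch of that shift, keeping only the percentage ranges from the text; the dollar figures are hypothetical round numbers, not NVIDIA pricing:

```python
# Illustrative only: how much of a customer's spend flows to NVIDIA
# at each rung of the ladder. Percentage shares come from the article;
# the server and rack costs are made-up round numbers.

def nvidia_capture(total_spend: float, share: float) -> float:
    """Dollars flowing to NVIDIA given its share of the bill of materials."""
    return total_spend * share

server_cost = 400_000    # hypothetical 8-GPU HGX server cost
rack_cost = 3_500_000    # hypothetical GB200 NVL72 rack cost

# Chip level: GPUs are 60-70% of server hardware cost.
print(nvidia_capture(server_cost, 0.60), nvidia_capture(server_cost, 0.70))
# Rack level: NVIDIA components are 76% or more of total spend.
print(nvidia_capture(rack_cost, 0.76))
```

The point of the sketch: moving the same customer from servers to full racks roughly multiplies both the deal size and NVIDIA's share of it.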
In fiscal year 2025 (ended January 2025), NVIDIA posted $130.5 billion in revenue, with data center sales accounting for $115.2 billion.[7] In fiscal year 2026, total revenue grew to $215.9 billion. Data center revenue hit $193.7 billion, 90% of the total, up 68% year over year.[8]
By GTC 2026, Jensen Huang stated it plainly: "We are a vertically integrated computing company. There is no other way."[9] He projected at least $1 trillion in cumulative revenue from 2025 through 2027.
How the franchise works
NVIDIA provides more than just hardware:
- The software stack: CUDA, TensorRT, NIM (NVIDIA Inference Microservices), and more.
- The blueprint: reference architectures, validated configurations, and partnerships for digital twins and modular construction.
- The brand: "Dell AI Factory with NVIDIA," NVIDIA branding on every partner deployment.
Partners bring the capital, sourced from private equity, sovereign wealth funds, family offices, or corporate balance sheets. They supply the real estate, either owned data center space or capacity leased from colocation providers. They handle power and cooling: electrical infrastructure and utility contracts. And they bring the customer relationships, signing agreements with AI companies, enterprises, and cloud buyers.
| NVIDIA AI Factory | |
|---|---|
| Franchisor provides | Silicon, networking, software stack, reference architectures, validated designs |
| Franchisee provides | Capital, data center space, power, cooling, customer relationships |
| Royalty mechanism | $4,500/GPU/year software license + hardware margins |
| Lock-in | CUDA ecosystem, AI Enterprise, 2-year architecture cycles |
| Scale | 67 AI-linked deals in 2025, 80-90% accelerator market share |
| Revenue | $193.7B data center revenue (FY2026) |
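The royalty line in the table compounds with scale. A minimal sketch of the recurring software component alone, using the $4,500/GPU/year figure from the table (the cluster sizes are illustrative):

```python
# Annual recurring software revenue at $4,500 per GPU per year.
LICENSE_PER_GPU_YEAR = 4_500

def annual_license(gpu_count: int) -> int:
    """Yearly software licensing bill for a cluster of gpu_count GPUs."""
    return gpu_count * LICENSE_PER_GPU_YEAR

for gpus in (576, 10_000, 100_000):
    print(f"{gpus:>7} GPUs -> ${annual_license(gpus):,}/year")
```

At SuperPOD scale (576 GPUs) the license is about $2.6M a year; at hyperscale deployments it reaches into the hundreds of millions, recurring regardless of hardware refresh cycles.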
NVIDIA is also increasingly taking equity stakes in its largest "franchisees": $2 billion in CoreWeave (January 2026),[10] $2 billion in Nebius for an 8.3% stake (March 2026),[11] and participation in xAI's $6 billion round (December 2024).[12]
The franchising is formalized through NVIDIA's AI Factory program.[13] NVIDIA certifies partners at every layer of the stack:
- OEMs: Dell, HPE, and Lenovo build servers around NVIDIA HGX boards.[14] ODMs like Supermicro and QCT (Quanta) serve hyperscalers and cloud providers.
- Colocation: NVIDIA maintains a DGX-Ready Colocation directory of certified data centers with the power density, cooling, and structural capacity for GPU racks.[15] In the Americas: Equinix, Digital Realty, QTS, and others.
- Managed services: DGX-Ready Managed Services partners handle deployment and operations for operators who want the hardware without building a full technical team.[16]
- Cloud: NVIDIA Cloud Partners (NCPs) run NVIDIA hardware as a service. Reference platform NCPs include Lambda, Nebius, and Crusoe Cloud.
- Construction and power: Bechtel handles physical builds. Schneider Electric and Vertiv provide power and cooling infrastructure. NVIDIA published a Vera Rubin DSX reference design at GTC 2026 with an Omniverse digital twin blueprint for planning gigawatt-scale AI factories.[17]
The NVIDIA Marketplace lists the full partner directory, filterable by category.[18] The ecosystem resembles a manual for potential franchisees: certified suppliers for every component, validated configurations at every scale, and NVIDIA branding on the finished product.
Sovereign AI factories
NVIDIA is franchising AI factories to sovereign governments the same way: it provides the playbook, while local partners provide the capital, land, power, and regulatory access.
Saudi Arabia. HUMAIN, backed by the Saudi Public Investment Fund, is deploying 18,000 GB300 Grace Blackwell GPUs with InfiniBand networking.[19] SDAIA (the Saudi Data and AI Authority) is building a sovereign AI factory with up to 5,000 Blackwell GPUs.[20]
Japan. SoftBank is building Japan's most powerful AI supercomputer on the Blackwell platform.[21]
India. Reliance Industries is building a 3 GW data center in Jamnagar, Gujarat, as part of a $110 billion AI infrastructure investment.[22] Yotta Data Services is building a $2 billion AI hub with NVIDIA GPUs.[23]
Europe. Germany is deploying 10,000 Blackwell GPUs as the world's first industrial AI cloud, operated by Deutsche Telekom.[24]
Beyond picks and shovels
For operators building sub-$100 million clusters, buying into the NVIDIA franchise means predictable deployment at the cost of NVIDIA capturing most of the hardware margin.
Operator economics: 576-GPU B200 cluster
Validated configurations reduce deployment risk. A GB200 NVL72 cluster built to NVIDIA's reference architecture is a known quantity: 120-132 kW per rack, liquid cooling required, CUDA + NVIDIA's software stack, published performance benchmarks.[3] Lenders can underwrite it against published specs rather than custom engineering.
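Those published specs make the power envelope easy to model. A back-of-the-envelope sketch for the 576-GPU cluster, taking the 120-132 kW rack range from the reference architecture; the PUE and electricity price are assumptions for illustration, not NVIDIA figures:

```python
# Back-of-the-envelope power and energy cost for an 8-rack NVL72 cluster.
RACKS = 576 // 72        # 8 NVL72 racks for 576 GPUs
HOURS_PER_YEAR = 8_760

def annual_energy_cost(kw_per_rack: float, pue: float, usd_per_kwh: float) -> float:
    """Yearly electricity cost: IT load * PUE * hours * price per kWh."""
    it_load_kw = RACKS * kw_per_rack
    return it_load_kw * pue * HOURS_PER_YEAR * usd_per_kwh

# Assumed PUE of 1.2 and $0.08/kWh -- both hypothetical inputs.
low = annual_energy_cost(120, pue=1.2, usd_per_kwh=0.08)
high = annual_energy_cost(132, pue=1.2, usd_per_kwh=0.08)
print(f"${low:,.0f} - ${high:,.0f} per year")
```

Roughly 1 MW of IT load and a high-six-figure annual power bill under these assumptions: the kind of line item a lender can model directly from the published rack spec.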
NIM (NVIDIA Inference Microservices) further simplifies model deployment for operators without deep AI teams. NVIDIA's DGX-Ready Colocation and Managed Services directories point operators to pre-certified facilities and deployment partners.[15][16]
For investors and lenders, the franchise model standardizes the asset class. A GPU cluster built to NVIDIA's reference specs is underwritable the same way a McDonald's franchise is: the brand, the playbook, and the demand profile are known quantities. A lender can model expected utilization and revenue against published benchmarks. A one-off cluster with custom engineering is a harder bet.
CoreWeave validates the approach: $5.1 billion in 2025 revenue, a $66.8 billion contracted backlog, and 60% adjusted EBITDA (earnings before interest, taxes, depreciation, and amortization) margins, all built on NVIDIA hardware with NVIDIA software.[25]
NVIDIA's two-year architecture cycle (Hopper to Blackwell to Rubin) means the franchise playbook updates regularly. Each generation comes with validated designs from day one. As of early 2026, NVIDIA holds roughly 85% market share in AI accelerator chips,[26] with over 90% in training workloads.[27]
Operating under the NVIDIA brand lets a franchisee benefit from NVIDIA's demand generation. NVIDIA's CUDA framework launched in 2006; nearly two decades and 4 million developers later, every major AI framework relies on it. NVIDIA is also investing in open-source models. Every open-source model that gains adoption creates inference demand, and every inference deployment needs GPUs. NVIDIA released the Llama Nemotron model family in March 2025[28] and partners with Mistral AI on frontier open-source models.[29]
References
1. TechCrunch, "NVIDIA CEO wants enterprise to think 'AI factory,' not data center" (March 2024)
2. NVIDIA, DGX B200 product page and user guide (2025)
3. NVIDIA, GB200 NVL72 product page (accessed March 2026)
4. NVIDIA, DGX SuperPOD product page (accessed March 2026)
5. NVIDIA Newsroom, Vera Rubin platform announcement (March 2026)
6. IndustryTap, "Bechtel to Modularize NVIDIA's AI-Factory Blueprint" (2026)
7. NVIDIA Q4 FY2025 and full-year earnings (January 2025)
8. NVIDIA Q4 FY2026 and full-year earnings (January 2026)
9. SiliconANGLE, "NVIDIA CEO Jensen Huang bids to own the entire AI factory stack" (March 2026)
10. CIO Dive, "Nvidia backs CoreWeave with $2B" (January 2026)
11. Reuters, "Nvidia to invest $2 billion in AI cloud firm Nebius" (March 2026)
12. TechCrunch, "NVIDIA's AI empire: a look at its top startup investments" (January 2026)
13. NVIDIA, "AI Factories" solutions page (accessed March 2026)
14. NVIDIA Newsroom, "Computer Industry Joins NVIDIA to Build AI Factories" (2024)
15. NVIDIA, DGX-Ready Colocation Data Centers (accessed March 2026)
16. NVIDIA, DGX-Ready Managed Services (accessed March 2026)
17. NVIDIA Newsroom, Vera Rubin DSX AI Factory Reference Design (March 2026)
18. NVIDIA Marketplace, AI Factory partners (accessed March 2026)
19. NVIDIA Newsroom, "HUMAIN and NVIDIA announce strategic partnership to build AI factories in Saudi Arabia" (2025)
20. NVIDIA Newsroom, "Saudi Arabia and NVIDIA to build AI factories" (2025)
21. NVIDIA Newsroom, "NVIDIA and SoftBank accelerate Japan's journey to global AI powerhouse" (November 2024)
22. TechCrunch, "Reliance plans world's biggest AI data centre in India, report says" (January 2025)
23. CNBC, "India's Yotta plans $2 billion AI hub with Nvidia GPUs" (February 2026)
24. NVIDIA Blog, "NVIDIA and Deutsche Telekom Launch World's First Industrial AI Cloud to Advance European Manufacturing" (June 2025)
25. CoreWeave Q4/FY2025 earnings report and SEC S-1 filing
26. The Motley Fool, "Nvidia's 85% GPU Market Share Faces Growing Competition" (January 2026)
27. Mizuho Securities, NVIDIA AI chip market share analysis (October 2025)
28. NVIDIA Newsroom, Llama Nemotron family launch (March 2025)
29. Mistral AI, "Mistral AI and NVIDIA partner to accelerate open frontier models" (2026)
Frequently Asked Questions
What is an AI factory?
An AI factory is a data center facility with GPUs for AI workloads. NVIDIA coined the term at GTC 2024 to rebrand GPU-dense data centers. The rebranding has a business purpose: NVIDIA now sells complete factory configurations, from chips to full data center blueprints, bundled with $4,500/GPU/year software licensing. The model looks less like component sales and more like McDonald's franchising.
What is NVIDIA's product ladder?
NVIDIA historically focused on a single product: the GPU chip. It now sells at seven levels of the stack, each one capturing more of the total bill of materials: chip (B200 GPU at $45,000-$50,000), baseboard (HGX with 8 SXM GPUs via NVLink), networking (InfiniBand adapters and switches, NVLink switches, BlueField DPUs), server (DGX B200), rack (GB200 NVL72), SuperPOD (576+ Blackwell GPUs), and full AI factory blueprints. At the chip level, GPUs account for 60-70% of server hardware cost. At the rack level, NVIDIA components account for 76% or more of total spend.
How does NVIDIA's AI factory franchise model work?
NVIDIA provides the hardware, the software stack (CUDA, TensorRT, NIM), the blueprint (reference architectures, validated configurations, partnerships for modular construction), and the brand. Partners bring the capital, the real estate, power and cooling, and customer relationships. NVIDIA certifies partners at every layer: OEMs build servers around NVIDIA HGX boards, colocation providers meet DGX-Ready certification, and cloud partners run NVIDIA hardware as a service.
What is the NVIDIA GB200 NVL72?
The GB200 NVL72 is a full liquid-cooled rack: 72 Blackwell GPUs and 36 Grace CPUs. Every component in the rack is NVIDIA-designed silicon. A GB200 NVL72 cluster built to NVIDIA's reference architecture is a known quantity: 120-132 kW per rack, liquid cooling required, CUDA + NVIDIA's software stack, published performance benchmarks.