Who Has NVIDIA's Blackwell GPUs: Market Size & Fragmentation (Q2 2026)
NVIDIA shipped approximately 3.2 million Blackwell GPU packages by the end of 2025. Ten operators control 67 percent of clustered Blackwell compute: five hyperscalers (Microsoft, Meta, Google, Amazon, Oracle) hold 49 percent, and five anchor neoclouds (CoreWeave, Nscale, Lambda, Nebius, Crusoe) hold 18 percent. The rest is spread across enterprises, sovereign AI programs, and smaller neoclouds.
NVIDIA has orders for another 4.3 million packages, which we estimate at 2.7 million after discounting for timing and cancellations.
3.2 million Blackwell packages
Blackwell is NVIDIA's current GPU architecture generation, succeeding the Hopper generation (H100, H200) that dominated 2023 and 2024. Where a single Hopper chip contains one silicon die, each Blackwell package fuses two dies into a single module, roughly doubling the transistor count and training throughput. When Jensen Huang said "six million Blackwell" shipped at his October 2025 GTC keynote, he was counting dies. The actual package count is approximately three million. [1]
Blackwell ships in two main SKUs. The B200 is the earlier variant, shipping from mid-2025 in both the HGX server form factor (eight GPUs per board) and the NVL72 system (72 GPUs wired together in a rack). When a B200 is paired with NVIDIA's Grace CPU in the NVL72 rack, the combined module is sold as GB200; the standalone HGX version is just B200. The same convention applies to the B300/GB300. The B200 can be air-cooled but throttles easily when cooling is insufficient.
The B300 is the higher-clocked successor and cannot be air-cooled. It began volume shipments around September 2025 and crossed over B200 in quarterly production by November 2025. [2] The cumulative installed base through the end of 2025 skews approximately 75 percent B200 and 25 percent B300. [3]
NVL72 racks account for approximately 2.0 million of the 3.2 million packages shipped. The remaining 1.2 million ship as HGX boards and DGX systems (NVIDIA's pre-integrated servers), which deploy in a wider range of configurations, from large training clusters down to single-server setups. [4]
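The rack-to-package conversion is simple arithmetic; a minimal sketch using only the figures cited in the text:

```python
# NVL72 rack-to-package conversion, per the Morgan Stanley rack count:
# 28,000 racks shipped in CY2025 at 72 Blackwell packages per rack.
racks = 28_000
packages_per_rack = 72

nvl72_packages = racks * packages_per_rack
print(nvl72_packages)  # 2016000, i.e. ~2.0M of the 3.2M shipped
```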
From shipped to clustered
Roughly 10 percent of shipped packages sit in staging at any given time. The path from factory to operational cluster runs through installation, power commissioning, and network fabric integration, and typically takes 4 to 12 weeks.
Of the packages that are deployed, roughly a third run as standalone nodes or micro-clusters for inference, development, or edge workloads. The rest sit in clusters of 256 or more interconnected GPUs.
3.2M Blackwell GPU packages shipped in 2025, disaggregated
Who owns what
Clustered Blackwell compute is distributed across eight segments, attributed to the owner-operator rather than the end customer to prevent double-counting. When CoreWeave leases GPUs to Microsoft, that capacity counts in CoreWeave's segment.
Across all eight segments, roughly 125 operators hold clustered Blackwell compute: 5 hyperscalers, 5 anchor neoclouds, 22 to 40 enterprise deployments, 1 frontier AI lab (xAI), 20 to 40 Tier 2 cloud and marketplace providers, 8 to 12 small neoclouds, 6 to 8 sovereign programs, and 10 to 15 research, government, and academic sites. This operator count is only for operational clusters of 256 or more GPUs. When including smaller clusters, the number of total operators might be three times larger.
Cluster sizes
Twelve operational large clusters exist globally at December 31, 2025. A "large cluster" here is 10,000 or more Blackwell packages under a single operator, which may span multiple physical sites for fleet-aggregate operators like CoreWeave and Meta.
The long tail sits at the other end: 170 to 215 operational clusters below the $100 million hardware threshold (roughly 2,000 packages at $50,000 blended system-level ASP). Between the long tail and the large clusters sit roughly 140 medium clusters of 1,000 to 9,999 GPUs. The long tail is the hardest to size because enterprise, sovereign, and small-provider deployments are less observable than hyperscaler buildouts.
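The dollar threshold above maps to a package count via the blended ASP assumption stated in the text; a one-line sketch:

```python
# Convert the $100M hardware threshold into a package count using the
# article's assumed $50,000 blended system-level ASP per package.
HARDWARE_THRESHOLD_USD = 100_000_000
BLENDED_ASP_USD = 50_000

threshold_packages = HARDWARE_THRESHOLD_USD // BLENDED_ASP_USD
print(threshold_packages)  # 2000 packages
```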
Projected vs. realized
Eighteen named pipeline deals total approximately 4.3 million announced Blackwell packages in commitments. Realization rates are the share of announced deals that actually deliver. Applying historical rates by buyer type to the pipeline produces an expected figure of approximately 2.7 million packages. [5]
Realization rates vary by buyer. Hyperscaler internal capex (GPUs funded from a company's own balance sheet) converts at 85 to 95 percent. Anchor neocloud offtakes (contracts where a single dominant customer commits to most of a cluster's capacity) convert at 70 to 85 percent. Sovereign AI programs convert at 40 to 70 percent.
Forward projections
Three inputs drive the 2026-2027 outlook: realization of the contracted pipeline, organic new orders, and expanding capacity at CoWoS-L. CoWoS-L is TSMC's advanced packaging process that fuses the two Blackwell dies onto a single substrate, and it is the primary supply bottleneck.
| Metric | EOY 2025 | EOY 2026 | EOY 2027 |
|---|---|---|---|
| Cumulative packages shipped | 3.2M | ~6.7M | ~9.1M |
| Range (bear to bull) | 3.0-3.4M | 5.8-7.3M | 8.0-10.3M |
| Net-new shipments in year | 3.2M | ~3.5M | ~2.4M |
| Operational clusters (256+ GPUs) | 220-330 | 400-500 | 550-700 |
| Large clusters (10,000+ GPUs) | 12 | 25-40 | 40-60 |
| NVL72 racks cumulative | ~28,000 | ~55,000 | ~90,000 |
Cumulative shipments approximately double by end of 2026 and grow by another 2.4 million in 2027, while large clusters more than triple from 12 today to 40-60 by end of 2027. Rubin architecture begins shipping in H2 2026 per NVIDIA's disclosed production schedule, adding to Blackwell demand rather than replacing it.
Risks to realization
Several risks could compress the forward projections, chief among them two-to-four-year grid interconnection queues in major markets such as Virginia, HBM and CoWoS-L packaging constraints, and expanded export restrictions affecting sovereign programs.
Calculations
3.2M packages shipped
Three sources triangulate. Jensen Huang stated 3.0M packages cumulative through October 2025 at GTC Washington DC. [1] Morgan Stanley channel checks counted 28,000 NVL72 racks shipped in CY2025, which at 72 packages per rack yields 2.0M in NVL72 form factor. [4] The remaining 1.2M ship as HGX boards and DGX systems. Epoch AI's chip sales model gives a 3.4M median with a 90 percent confidence interval of 2.9 to 4.0M.
10 percent in staging
Q4 CY2025 was the peak shipment quarter at approximately 1.0M packages, of which 35 percent remained in staging or transit at December 31. Earlier quarterly cohorts deployed fully. The weighted lag across the installed base is approximately 10 percent.
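The weighted lag can be reproduced directly from the cohort assumptions above (only the Q4 cohort retains units in staging; earlier cohorts deployed fully):

```python
# Weighted staging lag across the installed base, per the text's cohorts.
total_shipped_m = 3.2    # million packages, cumulative through EOY 2025
q4_cohort_m = 1.0        # million packages shipped in Q4 CY2025 (peak quarter)
q4_staging_share = 0.35  # share of the Q4 cohort in staging/transit at Dec 31

in_staging_m = q4_cohort_m * q4_staging_share  # 0.35M packages
weighted_lag = in_staging_m / total_shipped_m  # ~0.109
print(f"{weighted_lag:.0%}")  # 11%, which the text rounds to ~10 percent
```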
1.9 million clustered (59 percent of shipped)
NVL72 systems are purpose-built for multi-rack clusters: 85 percent deploy in clusters of four or more racks. HGX and DGX systems are more varied: 35 percent deploy in 256-plus GPU clusters, with the majority in single-node inference, development, or edge configurations. Weighted: (2.0M × 0.85) + (1.2M × 0.35) = 2.12M in clusters before staging lag. After applying the 10 percent staging lag, roughly 1.9 million packages are operationally clustered, or 59 percent of the 3.2 million shipped.
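The weighting above can be checked in a few lines, using only the rates and volumes stated in the text:

```python
# Clustered-share arithmetic as stated in the text.
nvl72_m, hgx_dgx_m = 2.0, 1.2  # million packages by form factor
nvl72_cluster_rate = 0.85      # share of NVL72 deployed in 4+ rack clusters
hgx_cluster_rate = 0.35        # share of HGX/DGX in 256+ GPU clusters
staging_lag = 0.10             # weighted staging lag across the fleet

pre_lag = nvl72_m * nvl72_cluster_rate + hgx_dgx_m * hgx_cluster_rate
clustered = pre_lag * (1 - staging_lag)
share = clustered / (nvl72_m + hgx_dgx_m)
print(round(pre_lag, 2), round(clustered, 2), f"{share:.0%}")  # 2.12 1.91 60%
```

The text rounds the 1.91M result to 1.9 million and the share to 59 percent of the 3.2 million shipped.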
49 percent hyperscaler share
Jensen Huang stated cloud service providers represent approximately 50 percent of data center revenue at the Q4 FY2026 earnings call. [6] After accounting for 8 to 15 percent volume discounts on hyperscaler purchases, this maps to roughly 49 percent of clustered Blackwell units.
170 to 215 sub-$100M clusters
Approximately 48 percent are directly observed: Tier 2 cloud providers tracked by cloud-gpus.com (48 clusters), named small neoclouds (34), and anchor neocloud satellite sites (10). [7] Another 17 percent are partially observed through enterprise deployments at Block, IBM, Tesla, and smaller Mistral nodes. The remaining 35 percent is modeled from segment share, capturing the enterprise residual, research and government clusters, small sovereign deployments, and regulated enterprise private cloud. [8] Independent trackers from Omdia and Dell'Oro confirm a defensible range of 178 to 215. Consulting firms frequently report 240 to 280 by conflating announced and contracted capacity with operational deployment.
Pipeline realization rates
Rates differ by buyer type:
- Hyperscaler internal capex: 85 to 95 percent. Board-approved, funded, with land and power secured. Meta Prometheus and Hyperion, Microsoft Fairwater, and AWS internal builds sit here.
- Anchor neocloud offtakes: 70 to 85 percent. Contractual obligations with named offtakers. CoreWeave backlog, Nscale-Microsoft, Lambda-Microsoft, and Nebius-Microsoft sit here.
- Sovereign AI: 40 to 70 percent. Subject to political cycles, permitting delays, and export license uncertainty. Historical midpoint of 55 percent based on prior procurement cycles.
- Frontier labs: 60 to 80 percent. Fundraising-dependent. xAI's Colossus 2 target sits here.
- Enterprise: 50 to 70 percent. ROI reassessment is common between announcement and deployment.
Saudi HUMAIN's 600,000-GPU announcement at 50 percent yields 300,000 expected. UAE G42 Stargate's 500,000 at 60 percent yields 300,000. Cross-checked against NVIDIA's reported 3.6M unit backlog (December 2025), the 2.7M realization-adjusted pipeline accounts for approximately 75 percent of backlog. [5] The remaining 0.9M sits in MOUs, LOIs, and earlier-stage commitments not captured as named pipeline deals.
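The realization adjustment is a per-deal expectation: announced packages times the realization rate for that buyer type. A sketch using the two sovereign deals named above:

```python
# Expected deliveries = announced packages x realization rate, per deal.
# Rates are the values the text applies to each sovereign program.
deals = {
    "Saudi HUMAIN": (600_000, 0.50),
    "UAE G42 Stargate": (500_000, 0.60),
}

expected = {name: round(units * rate) for name, (units, rate) in deals.items()}
print(expected)  # {'Saudi HUMAIN': 300000, 'UAE G42 Stargate': 300000}
```

Summing the same calculation over all eighteen named deals produces the ~2.7M realization-adjusted pipeline.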
Forward projection build
EOY 2026: 3.2M base plus approximately 2.2M organic net-new shipments (NVIDIA and Morgan Stanley guidance) plus approximately 1.3M from pipeline realization equals approximately 6.7M cumulative. Operational clusters approximately double from 275 to 400-500 as named pipeline sites energize.
EOY 2027: approximately 2.4M net-new shipments (including approximately 0.8M from remaining pipeline realization) brings cumulative to approximately 9.1M. The deceleration relative to 2026 reflects conservatism around Rubin transition timing. If Rubin launches on schedule and cannibalizes Blackwell Ultra orders, 2027 Blackwell-only figures would trend lower while total NVIDIA DC shipments remain on plan.
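The two-year build reduces to a running sum of the components stated above:

```python
# Cumulative projection build, per the text: base + organic net-new
# shipments + realization of the contracted pipeline.
base_2025 = 3.2                        # million packages, EOY 2025
organic_2026, pipeline_2026 = 2.2, 1.3

eoy_2026 = base_2025 + organic_2026 + pipeline_2026  # ~6.7M cumulative
net_new_2027 = 2.4                     # includes ~0.8M remaining pipeline
eoy_2027 = eoy_2026 + net_new_2027                   # ~9.1M cumulative
print(round(eoy_2026, 1), round(eoy_2027, 1))  # 6.7 9.1
```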
Supply constraints
TSMC CoWoS-L capacity expanded from approximately 75,000 wafers per month in 2025 to 120,000-130,000 in 2026 and 140,000-160,000 in 2027. NVIDIA secured approximately 70 percent of allocation. At Blackwell die-per-wafer yields of 25 to 30 and two dies per package, this supports approximately 13 to 16 million packages annually at the 2026 ceiling, well above the 2.2M net-new 2026 projection.
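The supply ceiling follows from the stated inputs: wafers per month, NVIDIA's allocation share, dies per wafer, and two dies per package. A sketch of the 2026 bounds (pairing low-with-low and high-with-high across the two ranges is my assumption):

```python
# CoWoS-L package ceiling at the 2026 capacity range, per the text's inputs.
wafers_per_month = (120_000, 130_000)  # TSMC CoWoS-L capacity, 2026
nvidia_share = 0.70                    # NVIDIA's allocation
dies_per_wafer = (25, 30)              # Blackwell die yield per wafer
dies_per_package = 2                   # two dies fused per Blackwell package

for wpm, dpw in zip(wafers_per_month, dies_per_wafer):
    packages_m = wpm * 12 * nvidia_share * dpw / dies_per_package / 1e6
    print(round(packages_m, 1))  # 12.6 then 16.4, i.e. the ~13-16M cited
```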
HBM3e supply from SK Hynix, Samsung, and Micron tightens at higher volumes. Grid power and permitting are the binding constraints for large-cluster deployment in 2026, with Virginia (Dominion Energy service territory) the tightest market due to moratorium pressure. Texas, Ohio, Wisconsin, and international deployments in the Nordics and Malaysia have capacity more readily available for 2026 energization.
References
1. Jensen Huang, GTC Washington DC keynote, session DC1204, October 28, 2025. Slide footnote clarified that the Blackwell architecture uses two GPU dies per chip; the "six million Blackwell" figure thus refers to dies, implying approximately 3 million complete packages. https://www.nvidia.com/en-us/on-demand/
2. NVIDIA CFO Colette Kress, Q3 FY2026 earnings call, November 19, 2025: "GB300 crossed over GB200" during that quarter; cumulative B200 still dominates through October.
3. TrendForce, CY2025 Blackwell rack mix estimate of approximately 81 percent GB200 versus 19 percent GB300, July 24, 2025.
4. Morgan Stanley supply chain research, NVL72 rack count analysis, 2025. At 72 packages per rack, yields 2.0 million packages in NVL72 form factor.
5. NVIDIA reported approximately 3.6 million unit backlog across major cloud providers in December 2025 press coverage.
6. NVIDIA Q4 FY2026 earnings call, CFO commentary, February 25-26, 2026. Jensen Huang statement on data center revenue split by customer type: approximately 50 percent cloud service providers.
7. cloud-gpus.com, reserved 1-year B200 pricing as of April 2026. 31 providers tracked, 15 or more offering B200. https://cloud-gpus.com
8. SemiAnalysis, ClusterMAX 2.0, 209 GPU cloud providers tracked. Captures the small neocloud and Tier 2 cloud provider universe.
Frequently Asked Questions
How many NVIDIA Blackwell GPUs have been shipped?
NVIDIA shipped approximately 3.2 million Blackwell GPU packages globally through December 31, 2025. Each package fuses two silicon dies, so Jensen Huang's "six million Blackwell" figure at his October 2025 GTC keynote counts dies, equivalent to roughly three million packages at that point. The cumulative installed base skews roughly 75 percent B200 and 25 percent B300. NVL72 racks account for 2.0 million of the 3.2 million; the remaining 1.2 million ship as HGX boards and DGX systems.
What is the difference between B200 and B300 GPUs?
Both are NVIDIA Blackwell architecture. The B200 began shipping mid-2025 in the HGX server form factor (eight GPUs per board) and the NVL72 system (72 GPUs wired together in a rack). The B300 is the higher-clocked successor, cannot be air-cooled, began volume shipments around September 2025, and crossed over B200 in quarterly production by November 2025. When a B200 or B300 is paired with NVIDIA's Grace CPU in the NVL72 rack, the combined module is sold as GB200 or GB300. The standalone HGX version drops the "G" prefix.
How many Blackwell GPUs are actually deployed and clustered?
Roughly 10 percent of shipped packages sit in staging at any given time. The path from factory to operational cluster runs through installation, power commissioning, and network fabric integration, and typically takes 4 to 12 weeks. About a third of deployed packages run as standalone nodes or micro-clusters for inference, development, or edge workloads. The remaining 1.9 million sit in clusters of 256 or more interconnected GPUs, approximately 59 percent of total shipments.
Who are the largest operators of Blackwell GPU clusters?
Ten operators control 67 percent of clustered Blackwell compute. Five hyperscalers (Microsoft, Meta, Google, Amazon, Oracle) hold 49 percent and five anchor neoclouds (CoreWeave, Nscale, Lambda, Nebius, Crusoe) hold 18 percent. Enterprise deployments take 11 percent, including Mistral AI, Tesla Cortex, IBM, and a reported 18,000-package Apple order. Frontier labs (6 percent, effectively xAI alone), Tier 2 cloud providers (6 percent), small neoclouds (5 percent), sovereign AI (3 percent), and research, government, and academic sites (2 percent) split the remainder. Roughly 125 operators hold clustered Blackwell compute across all eight segments.
What are the largest Blackwell GPU clusters in the world?
Twelve operational large clusters (10,000 or more packages under a single operator) exist globally at December 31, 2025. The two largest are Oracle/OpenAI Stargate in Abilene, Texas and Meta's internal training fleet, each at approximately 100,000 GPUs. CoreWeave operates 50,000 across 32 data centers. xAI's Colossus 1 in Memphis runs 30,000 GB200, and Colossus 2 is ramping toward a 550,000-GPU target. Mistral AI holds approximately 14,000 GB300 packages in Paris. Between these and the long tail sit roughly 140 medium clusters of 1,000 to 9,999 GPUs and 170 to 215 sub-$100 million clusters.
How reliable are NVIDIA's announced Blackwell pipeline deals?
Eighteen named pipeline deals total approximately 4.3 million announced Blackwell packages. Realization rates vary by buyer type: hyperscaler internal capex converts at 85 to 95 percent, anchor neocloud offtakes at 70 to 85 percent, frontier labs at 60 to 80 percent, enterprise at 50 to 70 percent, and sovereign AI programs at 40 to 70 percent. Saudi HUMAIN's 600,000-GPU announcement yields approximately 300,000 expected. UAE G42 Stargate's 500,000 yields approximately 300,000 expected. Applied across the pipeline, the realization-adjusted figure is approximately 2.7 million packages, about 75 percent of NVIDIA's reported 3.6 million unit backlog in December 2025.
How will the Blackwell installed base grow through 2027?
Cumulative shipments approximately double to 6.7 million by end of 2026 and reach 9.1 million by end of 2027. Large clusters (10,000 or more GPUs) more than triple from 12 to 40-60, and operational clusters of 256-plus GPUs grow from 220-330 to 550-700. Rubin architecture begins shipping in H2 2026 per NVIDIA's disclosed production schedule and adds to Blackwell demand rather than replacing it. The primary risks are two-to-four-year grid interconnection queues in major markets like Virginia, HBM and CoWoS-L packaging constraints, and expanded export restrictions affecting sovereign programs.