AI Cluster Cost Breakdown: OpEx, TCO, and Payback (2026)

Bernie Margulies

[Interactive simulator omitted. Default-scenario outputs: total CapEx $26.5M; cash to launch $9.3M; monthly debt service $643K; 5-year TCO $45.6M ($1.81/GPU/hr); equity payback 45 months; IRR 28.7%; operator's MOIC 2.44x; DSCR 1.36x; gross margin 87.1%; EBITDA margin 77.1%; net cash margin 20.3%; break-even utilization 62%; facility power 1,166 kW.]

What it costs to operate a GPU cluster

As of Q1 2026, a 576-GPU B200 cluster costs roughly $26.5M in hardware and about $8.5M in equity (the cash you put up) to launch, even with a loan. The simulator above uses this default scenario: 72 servers with 8 NVIDIA B200 GPUs each (HGX platform), air-cooled, deployed to a colocation facility.

The defaults reflect the most common deployment we see: a small-to-mid-size inference cluster. Hardware, power, facility, and financing dominate at this scale. The shared storage and orchestration costs that drive large training clusters are a minor line item here.

The financing assumes 70% loan-to-value (LTV), meaning the lender covers 70% of the hardware cost and you put up the remaining 30%. The loan charges 15% annual interest over a 3-year term with a 3% origination fee paid upfront. That puts $18.5M financed, $7.9M in equity, plus a $556K origination fee. Monthly debt service (loan repayments) runs ~$643K.
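The financing split and monthly payment can be reproduced with the standard amortization formula. This is a sketch using the article's default inputs; the simulator's internal rounding may differ slightly.

```python
# Financing structure at the article's defaults (all figures approximate).
capex = 576 * 46_000           # ~$26.5M hardware CapEx
financed = capex * 0.70        # ~$18.5M loan principal (70% LTV)
equity = capex - financed      # ~$7.9M operator cash (30%)
origination = financed * 0.03  # ~$556K fee, paid upfront
cash_to_close = equity + origination  # ~$8.5M out of pocket

# Fully amortizing monthly payment: P * r / (1 - (1 + r)**-n)
r, n = 0.15 / 12, 36           # 15% annual rate, 3-year term
payment = financed * r / (1 - (1 + r) ** -n)  # ~$643K/month
```

Multiplying the payment by 36 and subtracting the principal recovers the ~$4.6M of total interest quoted later in the financing section.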

Revenue starts after a 2-month ramp. The ramp period is driven by procurement lead times: the weeks it takes to receive, rack, cable, and burn in (stress-test) the hardware. Because there is no live cluster to operate yet, staffing and admin overhead are minimal, but you still pay interest and facility costs. Year 1 rental rates are $3.44/GPU/hr, dropping about 10% in the first year and flattening to 2-3% annual declines after that, following the pattern in historical GPU rental data.

At 80% utilization with 0.5% goodput loss (revenue lost to hardware failures and job restarts), the cluster runs cash-flow positive throughout and hits equity payback around month 45. Years 1-3 are tight, with the bulk of returns arriving in years 4 and 5 after the debt is paid off.

What actually drives profitability

Some assumptions move the payback period by months. Others are rounding errors.

| Driver | Impact | Why it matters |
|---|---|---|
| Utilization (% of GPU-hours generating revenue) | Very high | The single biggest lever. Dropping from 80% to 60% wipes out roughly $3.5M/year in revenue. Below ~61%, most clusters cannot cover monthly costs. Having contracted offtake (a signed agreement where a customer commits to renting capacity) before deployment is the difference between a profitable cluster and an expensive rack of idle hardware. |
| Rental rate | Very high | Every $0.50/hr move in the B200 rate changes annual revenue by ~$2M. Prices falling as next-gen GPUs arrive (Rubin, expected H2 2026) is the primary risk to long-term returns. |
| Residual value (hardware sale price at end of life) | Moderate | Going from 5% to 15% residual adds $2.6M to total returns at year 5. Matters most for IRR (internal rate of return, the annualized return on equity) and MOIC (multiple on invested capital, total cash back divided by cash invested), but is highly uncertain at 60-month horizons. |
| Electricity rate | Moderate | A $0.04/kWh swing is roughly $409K/year, material but secondary to revenue-side drivers. |
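The utilization sensitivity quoted above falls straight out of the revenue arithmetic. A short sketch using the article's default Y1 rate and goodput loss:

```python
# Annual revenue as a function of utilization, at the default Y1 rate.
gpus, hours_per_year = 576, 8760
rate, goodput_loss = 3.44, 0.005  # $/GPU/hr, fraction of revenue lost

def annual_revenue(util):
    """Billable GPU-hours times the rental rate, net of goodput loss."""
    return gpus * hours_per_year * util * rate * (1 - goodput_loss)

delta = annual_revenue(0.80) - annual_revenue(0.60)  # ~$3.5M/year lost
```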

Loan terms are a minor driver. Going from 15% to 10% interest barely changes the total return. Minor OpEx items and ramp-up delays matter even less.

Utilization and rental rate dominate the model. Clusters without pre-signed offtake agreements (where a customer commits to renting capacity) carry spot-market pricing risk from day one. The lender in a 3-year loan is still safe even when utilization and rates are lowered; it is mainly the operator who suffers the slow payback.

The default scenario

The simulator's defaults are calibrated from three independently sourced private deals that converge on the same cluster price.[1]

| Parameter | Default |
|---|---|
| GPU | NVIDIA B200 (HGX platform, NVIDIA's standard 8-GPU server design) |
| GPU count | 576 (72 servers x 8 GPUs) |
| Cooling | Air-cooled |
| Total CapEx | $26.5M ($46,000/GPU all-in; includes servers, networking, optics, installation, and 3-year warranty) |
| LTV / equity (loan-to-value ratio) | The lender covers 70% of the hardware cost. You put up the remaining 30% ($7.9M) plus a $556K origination fee, for $8.5M out of pocket at closing |
| Interest rate | 15% (3-year fully amortizing: each payment covers both interest and principal, so the loan is paid off at the end) |
| Electricity | $0.07/kWh (Texas average) |
| PUE (power usage effectiveness) | 1.35 (35% cooling and distribution overhead, air-cooled) |
| Colocation (facility hosting fee, excl. power) | $150/GPU/month |
| Rental rate (Y1) | $3.44/GPU/hr |
| Utilization (% of GPU-hours generating revenue) | 80% |
| Operating period | 5 years |
| Ramp-up (time before revenue starts) | 2 months (procurement, racking, and burn-in; you pay interest and facility costs, but staffing and admin are minimal) |
| Residual value (hardware sale price at end of life) | 5% of CapEx at year 5 |

The 5-year operating period extends two years beyond the loan term. Years 4 and 5 are debt-free, and for most operators that is where the real equity return shows up.

Equipment (CapEx)

CapEx (capital expenditure) is the upfront hardware cost. As of Q1 2026, the $46,000/GPU default is a turnkey, all-in cost per GPU: server hardware, networking gear, racking, structured cabling, burn-in, cluster configuration, and 3-year next-business-day (NBD) support from the server manufacturer (OEM). A single 8-GPU server costs roughly $368,000.[1]
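The per-GPU, per-server, and cluster totals are simple multiples of each other. A quick check using the article's defaults:

```python
# All-in CapEx at the defaults (approximate; real quotes vary by deal).
per_gpu = 46_000               # turnkey all-in cost per GPU
per_server = per_gpu * 8       # ~$368K per 8-GPU B200 server
total_capex = per_gpu * 576    # ~$26.5M for the full 72-server cluster
```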

Many cost models separate servers from networking or break out maintenance as a separate line. The simulator combines them because NBD support is bundled into the purchase price and networking is bought alongside the servers.

What the per-GPU cost includes

GPU servers are 85-90% of CapEx. The remaining 10-15% covers InfiniBand switches (the high-speed network connecting servers), fiber optic transceivers, management servers, racks, power distribution units, installation, and 3-year NBD OEM support from Dell, Lenovo, or Supermicro.[1]

Enterprise GPU procurement runs 55-70% off MSRP (manufacturer's suggested retail price), based on procurement deals we have reviewed as of Q1 2026.[1] Fiber optic transceivers carry even heavier markups. List price is not a useful reference for modeling real costs.

The simulator default of $46,000 is the average across private B200 deals we have reviewed as of Q1 2026.[1] B300 servers run roughly $62,000/GPU all-in, a 35% premium driven by the move to liquid cooling and higher-TDP components.

Electricity

Most colocation contracts bundle electricity, labor, and services into one monthly per-kW rate. The simulator splits electricity out as a separate line. That way you can compare different power markets (Texas vs. Pacific Northwest vs. California) without changing the facility cost for space, cooling infrastructure, security, and remote hands. In a real contract those costs move together, but isolating electricity makes location trade-offs visible.

The default is $0.07/kWh, a typical Texas data center power rate. Texas is the most active US market for new GPU deployments because of deregulated power and available capacity. National average for commercial data center power is $0.08-$0.12/kWh. California runs $0.14+. Pacific Northwest hydro markets dip below $0.035.[2]

Electricity cost depends on three variables: GPU power draw, the PUE (Power Usage Effectiveness) of the facility, and the utility rate.

PUE

PUE is the ratio of total facility power to IT equipment power. A PUE of 1.35 means 35% overhead for cooling and power distribution.

The industry average is 1.58, flat for six consecutive years. Weighted by capacity, it drops to 1.47.[3] Best-in-class liquid-cooled sites hit 1.10-1.15. Air-cooled GPU facilities run 1.3-1.5. The default of 1.35 represents a typical air-cooled GPU colocation facility. The simulator adjusts PUE automatically when you switch cooling types.

GPU power draw

The B200 ships at two power profiles. Air-cooled: 1,000W per GPU. Liquid-cooled: 1,200W per GPU.[4] The simulator uses the air-cooled figure, which means each 8-GPU server draws roughly 12 kW. The per-GPU figure includes server overhead: 1.5 kW/GPU for B200 air, 1.75 kW/GPU for B200 liquid or B300 air, 2.25 kW/GPU for B300 liquid, 1.25 kW/GPU for H100/H200. Multiply by GPU count for IT load, then by PUE for total facility power.
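The power math above can be sketched in a few lines. The per-GPU figures are the article's estimates (including server overhead), not measured values:

```python
# IT load and total facility power at the defaults.
kw_per_gpu = {                 # per-GPU draw incl. server overhead
    "B200 air": 1.5, "B200 liquid": 1.75, "B300 air": 1.75,
    "B300 liquid": 2.25, "H100/H200": 1.25,
}
gpus, pue = 576, 1.35
it_load_kw = gpus * kw_per_gpu["B200 air"]  # 864 kW of IT equipment
facility_kw = it_load_kw * pue              # ~1,166 kW total facility draw
```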

Monthly power cost

At the defaults (576 GPUs x 1.5 kW x PUE 1.35 = 1,166 kW facility load), the cluster costs about $60,000/month in electricity at $0.07/kWh. That is $715K per year, or roughly 5.2% of year-one revenue.

Move the same cluster to a market paying $0.10/kWh and that bill jumps to $85K/month, $1.02M/year. Electricity is the single most location-sensitive cost in the model.

| Location | $/kWh | Annual power cost |
|---|---|---|
| Texas (default) | $0.07 | ~$715K |
| Pacific Northwest hydro | $0.0325 | ~$332K |
| US national average | $0.10 | ~$1.02M |
| California | $0.14 | ~$1.43M |
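The annual figures by location follow from the facility load and hours in a year. A sketch at the default load:

```python
# Annual electricity cost across power markets (approximate).
facility_kw = 576 * 1.5 * 1.35       # ~1,166 kW facility load
annual_kwh = facility_kw * 8760      # ~10.2M kWh per year
rates = {"Texas": 0.07, "PNW hydro": 0.0325,
         "US average": 0.10, "California": 0.14}
annual_cost = {name: annual_kwh * r for name, r in rates.items()}
```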

Colocation

The colocation fee covers rack space, physical security, connectivity, and facility operations. Most real contracts bundle these with electricity into a single per-kW rate. The simulator splits electricity out so you can model different power markets without changing the facility cost for labor and services.

The default is $150/GPU/month, or $86,400/month for 576 GPUs, roughly $1.04M per year.[2] Adding the simulator's electricity cost back in (~$60K/month at $0.07/kWh), total facility cost is about $253/GPU/month, mid-market for bundled colocation pricing.
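The ~$253/GPU/month all-in figure is the colo fee plus each GPU's share of the power bill. A sketch at the defaults:

```python
# Effective all-in facility cost per GPU per month (approximate).
gpus = 576
colo_per_gpu = 150                            # facility fee, excl. power
power_month = gpus * 1.5 * 1.35 * 730 * 0.07  # ~$60K/month electricity
all_in = colo_per_gpu + power_month / gpus    # ~$253/GPU/month
```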

How colo pricing works in the real world

Most colocation providers quote per-kW rates that include power. As of Q1 2026, AI/HPC colocation across US markets runs $100-$200/kW/month for Tier 3 (high-availability) facilities.[5] At 12-16 kW per server (a reasonable B200 estimate with overhead), that translates to $1,600-$3,200/server/month, or $200-$400/GPU/month, power included.

Some managed infrastructure providers quote all-in rates closer to $2,500-$3,200/server/month that include power, cooling, internet, physical security, and repair labor.[1]

ScenarioColo $/GPU/moIncludes power?
Simulator default$150No (power billed separately)
Mid-market (typical)$250-$325Yes (bundled)
Managed infrastructure$325-$400Yes (incl. remote hands)

Staffing

The simulator defaults to $150,000/year for staffing. That covers one infrastructure engineer (salary, benefits, and overhead) for a lean operation.

With 3-year OEM warranty support and remote hands from the colocation provider (on-site staff who handle physical tasks like racking servers and swapping cables), one dedicated engineer is enough. The OEM handles hardware failures. The colo handles physical work. The engineer handles cluster operations, monitoring, and network management. ML workload management sits on the customer side.

Insurance

The simulator includes insurance at 0.36% of CapEx per year.

This rate reflects an inland marine policy, all-risk property insurance that covers hardware against fire, theft, water damage, and most other physical damage while it sits in the data center.[6] The annual bill includes the base premium plus a small TRIA (Terrorism Risk Insurance Act) surcharge, typically about 4% of the base premium.
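As a rough illustration (assuming the 0.36% figure is the all-in annual rate, with the ~4% TRIA surcharge already inside it):

```python
# Annual insurance premium at the default rate (a sketch, not a quote).
capex = 26_496_000
annual_premium = capex * 0.0036  # ~$95K/year on the default cluster
```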

Transit insurance (covering hardware during shipping) is a one-time cost and small enough that the simulator skips it. Colocation agreements also require commercial general liability and workers' compensation coverage, but most businesses already carry those policies.[6]

Maintenance and OEM support

The model does not break out maintenance separately. Three-year NBD onsite support is bundled into the server purchase price from every major OEM (Dell ProSupport, Lenovo, Supermicro).[1] That cost is already in the $46,000/GPU CapEx figure.

Once the warranty expires, operators can buy extended support contracts (typically 5% of hardware value per year) or keep spare parts on hand to swap in when something fails. To stress-test post-warranty costs, reflect the extra spend by adjusting the software or staffing lines.[1]

Software

The simulator defaults to $50,000/year for software licensing: monitoring, firewall subscriptions, and miscellaneous tooling.

The largest potential software cost is NVIDIA AI Enterprise at $4,500/GPU/year, or $2.6M/year for 576 GPUs. Most operators skip it and use NVIDIA's free CUDA programming toolkit, open-source drivers, and community-supported containers.[1]

The $50K default assumes primarily open-source tooling with a few commercial subscriptions for monitoring and security. Some licenses are bundled into the bill of materials and captured in the per-GPU CapEx cost.

Distribution

The simulator charges 7.5% of revenue as a distribution fee. This covers broker commissions, marketplace take rates, and payment processing.

The range is wide. A take-or-pay contract (where the customer pays whether they use the GPUs or not) with a single primary customer might cost 0-3%. Selling through GPU marketplace platforms or channel partners typically runs 5-10%. Full-service GPU cloud providers that handle sales, billing, and customer support charge 10-15% on top.[7]

At default settings (80% utilization, $3.44/GPU/hr), 7.5% comes to about $1.0M per year at the year-one run rate.
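The ~$1.0M figure follows from the revenue run rate. A sketch at the defaults:

```python
# Distribution fee at the year-one run rate (approximate).
billable_hours = 576 * 8760 * 0.80 * (1 - 0.005)  # GPU-hours/year
revenue = billable_hours * 3.44                   # ~$13.8M annualized
distribution = revenue * 0.075                    # ~$1.0M/year
```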

The simulator's "Admin & Legal" line defaults to $50,000/year, covering ongoing legal fees, accounting, and general corporate overhead for a lean operation. Initial setup (forming the LLC, negotiating colocation contracts and financing agreements) can run $50-150K as a one-time cost, which the simulator does not model separately.

Financing

GPU hardware loans are a form of asset-backed private credit. As of early 2026, typical terms run 2-4 years, 60-80% LTV, and 8-15% interest.[1]

The simulator defaults to the expensive end: 15% interest, 70% LTV, 3-year term, 3% origination. This reflects what a small-to-mid operator without credit history actually pays. First-time operators typically see 12-15%. Operators with contracted revenue and a track record get better terms.[1]

The ramp period: interest-only before revenue

During the ramp period (2 months by default), the cluster generates zero revenue but the loan is already outstanding. The simulator switches the loan to interest-only during ramp: you pay ~$232K/month in interest on the $18.5M balance but no principal. Once the cluster is live, payments switch to fully amortizing (~$643K/month).

Staffing, admin, insurance, and software costs are excluded during ramp. The ramp period is driven by procurement lead times: ordering, shipping, racking, cabling, and burn-in. There is no live cluster to staff yet. Facility costs (power and colo) still run. Total pre-revenue cash burn during the 2-month ramp is roughly $756K on top of the $8.5M equity at closing.
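The ~$756K ramp burn is two months of interest-only payments plus facility costs. A sketch at the defaults:

```python
# Pre-revenue cash burn during the 2-month ramp (approximate).
financed = 26_496_000 * 0.70
interest_only = financed * 0.15 / 12          # ~$232K/month, no principal
power = 576 * 1.5 * 1.35 * 730 * 0.07         # ~$60K/month electricity
colo = 576 * 150                              # ~$86K/month facility fee
ramp_burn = 2 * (interest_only + power + colo)  # ~$756K total
```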

Default financing math

| Line item | Amount |
|---|---|
| Hardware CapEx | $26.5M |
| Financed (70%) | $18.5M |
| Equity (30%) | $7.9M |
| Origination fee (3%) | $556K (paid at closing) |
| Total cash to close | $8.5M |
| Monthly payment | ~$643K (fully amortizing) |
| Total interest over 3 years | ~$4.6M |

Revenue and rental rates

The simulator uses $3.44/GPU/hr as the Year 1 B200 rental rate, the median on-demand price on Vast.ai as of early 2026.[8] On-demand rates from Lambda, RunPod, and Verda (DataCrunch) fall in a similar range.[9]

Rates drop about 10% in the first year as the generation matures, then flatten to 2-3% annual declines: $3.44 in Y1, $3.10 in Y2, $2.99 in Y3, $2.89 in Y4, $2.82 in Y5. American Compute's analysis of pricing data from GPU rental index providers shows this pattern: rates fall sharply when a new generation launches, then plateau as the market stabilizes. H100 on-demand rates dropped from ~$3.00 to ~$2.00 over 12 months, then held steady for the next 6 months.[7]
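The cumulative declines implied by that schedule, measured against the Year 1 rate:

```python
# Cumulative rate decline vs. year 1 for the default schedule.
rates = [3.44, 3.10, 2.99, 2.89, 2.82]  # $/GPU/hr, years 1-5
decline = [round(100 * (1 - r / rates[0])) for r in rates]
# decline -> [0, 10, 13, 16, 18] percent below the Y1 rate
```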

Utilization and Goodput

The default utilization is 80%. Utilization is the single most important assumption in the model.

Goodput loss is revenue lost to GPU failures, checkpoint replays (re-running training work from the last saved state after a crash), and job restarts. The simulator defaults to 0.5%, as it assumes the cluster is meant for inference workloads. Top-tier operators achieve 2.0-2.8% goodput loss on training workloads, while mid-tier operators run 3.4-10.7%.[10]

Residual value

The simulator defaults to 5% residual value at the end of the 5-year operating period. On a $26.5M cluster, that recovers $1.3M through hardware resale.

Going from 5% to 15% at year 5 adds $2.6M to total returns. The simulator does not model residual value insurance, an optional policy that guarantees a minimum resale price for the hardware.
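The sensitivity quoted above is straight percentage-of-CapEx arithmetic:

```python
# Residual value sensitivity at year 5 (approximate).
capex = 26_496_000
base = capex * 0.05           # ~$1.3M recovered at 5% residual
upside = capex * 0.15 - base  # ~$2.6M added by moving to 15%
```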

Costs the simulator bundles, and costs it skips

Bundled into other line items

  • OEM warranty (years 1-3): included in the $46,000/GPU equipment cost.[1]
  • Local storage: 76.8TB raw NVMe (fast solid-state drives) per server, included in CapEx.
  • InfiniBand network management software licenses: in the bill of materials, captured in per-GPU CapEx.
  • Bandwidth and cross-connects (cables between your racks and the facility network): typically included in the colocation contract.
  • Insurance, admin, and legal: visible as separate sliders but grouped under the Operations bar.
  • On-site labor: part of the colocation contract, known as "remote hands".

Omitted entirely

  • Transit insurance: ~$65K one-time, too small to model.[6]
  • Shared storage systems (VAST, Weka, Lustre): $10-30K/month for training workloads with 12+ TB/GPU of checkpoints and datasets. Inference needs under 1 TB/GPU for model weights and logs, so shared storage is a minor line item at this scale.[10]
  • Taxes: corporate, state, sales, and property tax all vary. Bonus depreciation (a tax benefit letting you deduct the full hardware cost in year one, reinstated at 100% in January 2025) significantly impacts after-tax returns.[11]
  • Initial legal and setup: $50-150K one-time, pre-operating period.
  • Residual value insurance: optional, deal-specific pricing.
  • Post-warranty maintenance (years 4-5): 5-10% of hardware value per year.

References

  1. Based on American Compute’s review of real Bills of Materials (BOMs) from GPU infrastructure vendors and private customer deals (2025-2026)
  2. IREN Limited, GPU cluster economics tool and public investor presentations (2025-2026)
  3. Uptime Institute, "Global Data Center Survey Results 2024": industry average PUE 1.58, capacity-weighted 1.47 (July 2024)
  4. NVIDIA DGX B200 System User Guide, Power Specifications (2025)
  5. QuoteColo, AI/HPC colocation marketplace pricing data (accessed March 2026)
  6. Based on American Compute’s proprietary research on GPU cluster insurance, including review of example inland marine / EDP policies from private deals (2026)
  7. American Compute internal pricing data: 2,326 pricing observations across 8 providers and 10 GPU models, July 2022 through March 2026
  8. Vast.ai GPU marketplace, live pricing data (accessed March 2026)
  9. Lambda, RunPod, Verda (DataCrunch), on-demand and contract GPU pricing pages (accessed March 2026)
  10. SemiAnalysis, "Calculating the Total Cost of a GPU Cluster," commissioned by Nebius (February 2026)
  11. IRS Publication 946, "How To Depreciate Property" (2024)