Who Is Building Compute and Why Is It So Lucrative
Private equity and entrepreneurial operators are building sub-$100M GPU clusters focused on inference, the workload that runs trained AI models in production. High demand for AI compute lets operators lock in long-term offtake agreements, contracts where a customer commits to buying capacity in advance, making 2-3x multiples on invested capital realistic over a 3-5 year hold.
Who is building sub-$100M compute clusters
The compute buildout has three tiers. The largest by far are hyperscalers, the large cloud providers that build their own infrastructure and spend billions. Microsoft committed $80 billion in capital expenditure on AI for 2025 alone.[1] Meta is building a $30 billion facility.[2] New neocloud operators like CoreWeave are also spending on hardware and facilities, carrying $21.4 billion in debt as of February 2026.[3]
Who builds compute infrastructure

| Tier | Examples | Scale | Workload | Capital |
|---|---|---|---|---|
| Hyperscalers | Microsoft, Meta, Google | $10B-$80B+ | Training + inference | Cash flow, bonds |
| Neoclouds | CoreWeave, Lambda, Crusoe | $100M-$10B+ | Training + inference | VC, GPU-backed debt |
| Sub-$100M operators | PE firms, family offices, entrepreneurs | $10M-$100M | Inference | Equity + GPU-backed debt |
Below them, a different market is forming: operators deploying clusters of a few thousand GPUs, with total project costs under $100 million and power requirements under 5 megawatts.
Private equity firms and family offices are deploying fund capital into GPU clusters the same way they deploy into infrastructure assets: buy hardware, finance it with debt, lock in revenue through contracts, and distribute profits after several years. Entrepreneurial operators raise from both, bringing the technical team and the data center relationships.
Depending on the deal, the operator might earn a management fee and carry, a share of the profits above a return threshold. The capital partners collect the rest of the profits.
Renting space
These smaller operators don't always build out the data centers; just as often they rent space inside them. Colocation providers like Equinix, QTS, or CyrusOne own the facilities. The operator signs a colocation agreement and installs its GPU servers, and the colo provider supplies power, cooling, and network connectivity.
The operator owns the GPUs and the customer contracts. The colo provider owns the building. This split is what makes sub-$100M deployments possible: you don't need to spend $500 million and two years constructing a facility.
These smaller operators cannot serve AI training workloads as well as the top neocloud operators, but they are well positioned for AI inference workloads, running already-trained models in production. The inference market is growing rapidly as AI applications mature.
What the returns look like
As of 2026, a 576-GPU B200 cluster costs roughly $36 million. The operator finances it at 65% loan-to-value (LTV), borrowing about $23.4 million and putting up roughly $12.6 million in equity to buy the hardware.
Typical GPU-backed loan terms as of early 2026:
| Term | Typical range | Notes |
|---|---|---|
| LTV | 60-70% | Lenders advance 60-70% of hardware purchase price |
| Interest rate | ~15% | Higher for smaller operators with limited track records |
| Loan term | 3 or 5 years | Depends on the operator's plans |
| Amortization | Straight-line | Loan payment schedule is spread evenly over the loan term |
| Collateral | GPUs + contracts | Hardware plus assigned customer revenue |
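The debt service implied by these terms can be sketched directly. The $36M price, 65% LTV, and ~15% rate come from the figures above; the 4-year term is one assumed choice, and "straight-line" is read here as equal principal payments with interest accruing on the declining balance.

```python
# Straight-line amortization sketch for the loan terms above.
# Assumptions: 4-year term, equal annual principal payments,
# interest charged on the outstanding balance.
price = 36.0e6
loan = 0.65 * price            # $23.4M borrowed at 65% LTV
equity = price - loan          # $12.6M operator equity
rate = 0.15                    # ~15% annual interest
years = 4

principal_per_year = loan / years
balance = loan
payments = []
for year in range(1, years + 1):
    interest = balance * rate                  # interest on what's still owed
    payments.append(principal_per_year + interest)
    balance -= principal_per_year

for year, pmt in enumerate(payments, 1):
    print(f"year {year}: debt service ${pmt / 1e6:.2f}M")
```

First-year debt service comes to roughly $9.4M and declines each year as principal is retired, which is why margins widen over the hold.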
B200 rates on marketplaces like Vast.ai run around $2-4 per GPU-hour as of early 2026.[4] A GPU-hour is one GPU running for one hour, the standard billing unit. At a blended $3.50/hr and 80% utilization, that cluster generates roughly $14.1 million per year.
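The revenue figure can be checked directly; the blended rate and utilization are the assumptions stated above, not fixed market facts.

```python
# Annual revenue for a 576-GPU cluster at the assumed blended
# rate and utilization.
gpus = 576
rate_per_gpu_hour = 3.50      # blended $/GPU-hour (assumption)
utilization = 0.80            # assumption
hours_per_year = 24 * 365     # 8760

annual_revenue = gpus * rate_per_gpu_hour * hours_per_year * utilization
print(f"${annual_revenue / 1e6:.1f}M per year")
```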
After debt service, the combined interest and principal payments on the loan, plus colocation, power, and a small technical team, the operator keeps a thin margin in the early years. As the loan pays down, more revenue turns into profit. When the operator eventually sells or refinances the GPUs, the remaining hardware value adds to the total return.
Over a 3-5 year hold, that math produces a 2-3x MOIC (multiple on invested capital), the ratio of total money returned to total money invested.
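One way to see how that multiple arises: sum the free cash flows over the hold plus the net proceeds at exit, divided by the equity invested. Everything below except the $12.6M equity figure is a hypothetical placeholder, not a number from this article.

```python
# Illustrative MOIC over a 4-year hold. Cash flow ramp and exit
# proceeds are hypothetical placeholders.
equity_in = 12.6e6                                      # equity from the financing example
annual_free_cash_flow = [1.0e6, 2.0e6, 3.0e6, 4.0e6]   # hypothetical ramp as debt pays down
exit_proceeds = 18.0e6                                  # hypothetical GPU sale minus remaining debt

total_returned = sum(annual_free_cash_flow) + exit_proceeds
moic = total_returned / equity_in
print(f"MOIC: {moic:.1f}x")
```

Under these placeholder inputs the deal returns about 2.2x, inside the 2-3x range the article describes.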
MOIC waterfall: 576-GPU B200 cluster, 4-year hold
CoreWeave posted a 60% adjusted EBITDA margin in 2025.[3] Smaller operators running leaner clusters at similar utilization see comparable unit economics per GPU, though they lack CoreWeave's scale advantages on procurement and colocation pricing, and CoreWeave borrows at far lower interest rates than smaller operators can.
The biggest risk is utilization
The return math depends on keeping GPUs busy. Every 20-point drop in utilization costs roughly $3.5 million per year in revenue on a 576-GPU cluster.
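That sensitivity falls straight out of the revenue formula: each point of utilization is worth a fixed slice of full-capacity revenue.

```python
# Revenue sensitivity to utilization for the cluster above.
def annual_revenue(utilization, gpus=576, rate=3.50, hours=8760):
    # gpus * rate * hours is full-capacity revenue; utilization scales it
    return gpus * rate * hours * utilization

drop = annual_revenue(0.80) - annual_revenue(0.60)
print(f"20-point utilization drop: ${drop / 1e6:.1f}M/yr")
```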
Annual free cash flow by utilization (576-GPU cluster)
Below roughly 60% utilization, most clusters cannot cover monthly costs. Interest payments on the loan are due every month even when revenue isn't coming in. CoreWeave paid $1.2 billion in interest in 2025 on $21.4 billion in debt.[3] Those costs exist proportionally for smaller operators.
The primary way operators manage this is to lock in long-term offtake agreements with termination penalties. A take-or-pay contract, where the customer pays whether they use the capacity or not, turns variable market demand into predictable revenue. Operators often structure these agreements in tranches: deliver this much compute by one date, more by a later date, and so on. CoreWeave, for example, has a $66.8 billion backlog of contracted future revenue.[3]
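A take-or-pay clause is simple to express: the invoice is driven by committed capacity, not actual usage. The function below is an illustrative sketch, with overage beyond the commitment billed at the same rate (an assumption, not a standard contract term).

```python
# Illustrative take-or-pay billing: the customer pays for the full
# commitment even when capacity sits idle; usage above the
# commitment is billed as overage at the same rate (assumption).
def take_or_pay_invoice(committed_gpu_hours, used_gpu_hours, rate=3.50):
    return max(committed_gpu_hours, used_gpu_hours) * rate

print(take_or_pay_invoice(1000, 400))    # idle capacity is still billed
print(take_or_pay_invoice(1000, 1200))   # overage billed on top
```

From the operator's side, revenue is floored at the committed amount, which is exactly what makes the contract bankable collateral.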
For lenders, this helps contain the risk of not getting paid back. In a 3-year GPU-backed loan, the lender will be covered as long as utilization lands above the 60% range. Lenders require operators to have offtake agreements in place before closing the loan.
Utilization risk is really customer risk. If a customer churns, downsizes, or goes bankrupt, your GPUs sit idle while debt service accrues. Operators who rely on a single customer face concentration risk: CoreWeave faced this with Microsoft, which accounted for 62% of its revenue in 2024 and 67% in 2025.[3]
Infrastructure booms that attracted small operators with debt
| Boom | Period | Outcome |
|---|---|---|
| Railroads | 1860s-1870s | Massive overbuilding, 1873 panic, consolidation into ~7 major lines |
| Fiber / telecom | 1998-2001 | 90%+ went bankrupt when demand lagged, $2T in market cap destroyed |
| Shale oil | 2010-2015 | Oil crashed from $100 to $30, overleveraged operators wiped out |
| GPU inference | 2024-now | Demand exceeds supply; market still in early buildout phase |
Why offtake is "easy" to lock in
Demand for compute outstrips supply as of early 2026.
Anthropic's revenue run rate grew from roughly $1 billion in early 2025 to nearly $20 billion by March 2026.[5] Other AI applications are following, and the inference demand is building. As proof of demand, NVIDIA Blackwell B200 and GB200 GPUs are sold out through mid-2026, with an estimated backlog of 3.6 million units according to industry analysts.[6]
Customers are willing to sign 1-3 year offtake agreements because they cannot get capacity elsewhere. A company building an AI product needs inference compute to serve its users; if it cannot get capacity, its product cannot launch. If hyperscalers like AWS and Azure are 2-3x the price, that company will commit to a 2-year contract with a smaller operator, as long as the operator can deliver on time within the next few months.
The typical offtake process for a sub-$100M cluster:
- Operator identifies a customer who needs compute (an AI company, a SaaS company deploying AI features, an enterprise running internal models).
- Customer signs an offtake agreement committing to 1-3 years of GPU-hours at a fixed rate.
- Operator uses the signed contract to secure GPU-backed financing from a private credit lender.
- Operator deploys hardware on time and begins billing the customer.
Lending
Private credit firms are funding these cluster operators. CoreWeave raised $2.3 billion from Magnetar and Blackstone in August 2023 and $7.5 billion from Blackstone in May 2024.[7] Apollo provided $3.5 billion in financing for the xAI/Valor compute infrastructure transaction in January 2026.[8] The next month, a large neocloud, Nscale, announced a $1.4 billion loan from PIMCO, Blue Owl, and LuminArx Capital Management.[9]
Smaller operators are borrowing from a similar profile, but on a smaller scale. They are typically funded by boutique private credit firms and family offices that are speculating in the space. Private credit direct lending to these small operators typically yields 12-15%, well above the 4-5% on investment-grade corporate bonds.
Traditional banks largely stepped back from this kind of mid-market lending after post-2008 regulations. Basel III and Dodd-Frank forced banks to hold more capital against risky loans, making syndicated lending less profitable. Private credit firms filled the gap. The market has grown to over $2.1 trillion as of 2023,[10] and GPU-backed lending is one of the newer asset classes absorbing that capital.
What can go wrong
How timeline risk cascades
The biggest execution risk is timeline. Offtake agreements have clear milestones and tranches; miss a milestone and the customer can end the contract.
- Of 110 data center projects slated for 2025, 26% were delayed.[11]
- Smaller GPU procurement orders can face 6-9 month lead times.
- GPU-backed loans can take 3-6 months to close.
- Colocation with enough power density for modern power-hungry GPUs, 40-80 kW per rack, is scarce.
- As of end 2024, about 10,300 energy projects of all types were waiting to connect to the U.S. power grid.[12] Data centers depend on new power generation coming online to get the capacity they need. Any delay in those energy projects means a delay for the data center.
Customer churn is a serious risk. Customers sign letters of intent with multiple operators and go with whoever delivers first. Revenue goes to the operator who shipped on time.
Offtake can fall apart in two ways: the customer defaults (e.g. a startup runs out of money and cannot pay), or the operator misses deployment milestones and the customer walks without breaching the contract.
If either happens and the operator cannot find new customers, the lenders have to rely on GPU residual value to recover their loan. The GPU residual value is what the operator can get from selling the used GPUs on the secondary market.
The problem is that GPU technology moves fast. CoreWeave depreciates servers over 6 years, Lambda uses 5, Nebius uses 4.[13] A lender holding a 3-year loan against B200 GPUs needs those GPUs to hold enough value through the term to cover the outstanding balance if the borrower defaults.
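Using book depreciation as a rough proxy for resale value (secondary-market prices can fall faster), one can check whether a straight-line loan stays covered through its term:

```python
# Does estimated residual value cover the outstanding loan balance?
# Assumptions: 5-year straight-line depreciation as a proxy for
# resale value, 3-year straight-line loan amortization at 65% LTV.
price = 36.0e6
loan = 0.65 * price
dep_years, loan_years = 5, 3

schedule = []
for year in range(1, loan_years + 1):
    residual = price * (1 - year / dep_years)   # book value proxy
    balance = loan * (1 - year / loan_years)    # remaining principal
    schedule.append((year, residual, balance))
    print(f"year {year}: residual ${residual / 1e6:.1f}M "
          f"vs loan balance ${balance / 1e6:.1f}M")
```

Under these assumptions the residual stays above the balance throughout; the risk the paragraph describes appears when secondary-market prices fall well below book value.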
References
1. CNBC, "Microsoft expects to spend $80 billion on AI-enabled data centers in fiscal 2025" (January 2025)
2. NPR, "Meta is building a massive data center" (December 2025)
3. CoreWeave Q4/FY2025 earnings report and SEC S-1 filing
4. Vast.ai B200 marketplace pricing (as of early 2026)
5. Bloomberg, "Anthropic Nears $20 Billion Revenue Run Rate" (March 2026)
6. TokenRing via Financial Content, "NVIDIA Blackwell B200 and GB200 Sold Out Through Mid-2026" (December 2025)
7. CoreWeave press release, "$2.3 Billion Debt Financing Facility" (August 2023); Blackstone, "$7.5 Billion Debt Financing Facility" (May 2024)
8. Apollo, "Apollo Backs $5.4 Billion Valor and xAI Data Center Compute Infrastructure Transaction" (January 2026)
9. Nscale, "$1.4bn Delayed Draw Term Loan Backed by GPUs" (February 2026)
10. IMF, "Fast-Growing $2 Trillion Private Credit Market Warrants Closer Watch" (April 2024)
11. Sightline Climate, Data Center Outlook report (2025); Latitude Media coverage
12. Lawrence Berkeley National Laboratory, "Queued Up: 2025 Edition" (2025)
13. theCUBE Research, "Resetting GPU Depreciation: Why AI Factories Bend But Don't Break Useful Life Assumptions" (2025)
Frequently Asked Questions
Who is building GPU compute infrastructure outside the hyperscalers?
Private equity firms, family offices, and entrepreneurial operators are deploying sub-$100M GPU clusters focused on inference workloads. They rent space in colocation facilities, finance hardware with GPU-backed debt at 60-70% loan-to-value, and lock in revenue through long-term offtake agreements with AI companies and enterprises.
What returns can GPU infrastructure investors expect?
Operators target a 2-3x MOIC over a 3-5 year hold. Returns come from three sources: operating cash flow from GPU-hour rentals, equity built up as the loan amortizes, and residual hardware value at exit. The math depends on maintaining 80%+ utilization and securing take-or-pay offtake contracts.
Why is inference compute demand so strong?
Every company deploying AI into production needs inference capacity. Anthropic's revenue run rate grew from roughly $1 billion in early 2025 to nearly $20 billion by March 2026, almost entirely inference compute. NVIDIA B200 and GB200 GPUs are sold out through mid-2026 with a 3.6 million unit backlog. Hyperscaler GPU instances have waitlists.
What is the biggest risk in GPU infrastructure investing?
Timeline risk. GPU procurement, colocation buildout, power availability, and loan closing can all slip. When timelines slip, operators miss deployment milestones and lose offtake deals to competitors who deliver first. Without a performing offtake contract, the investment falls back on GPU residual value, which compresses every time NVIDIA ships a new architecture.
Residual-value coverage sets a floor on what your GPUs are worth at a future date. If they sell below the floor, the policy pays you the difference.