Who Is Building Compute and Why Is It So Lucrative
Private equity and entrepreneurial operators are building sub-$100M GPU clusters focused on inference, the workload that runs trained AI models in production. High demand for AI compute lets operators lock in long-term offtake agreements, contracts where a customer commits to buying capacity in advance, making 2-3x multiples on invested capital realistic over a 3-5 year hold.
Who is building sub-$100M compute clusters
The compute buildout has three tiers. The largest by far are hyperscalers, the large cloud providers that build their own infrastructure and spend billions. Microsoft committed $80 billion in capital expenditure on AI for 2025 alone.[1] Meta is building a $30 billion facility.[2] New neocloud operators like CoreWeave are also spending on hardware and facilities, carrying $21.4 billion in debt as of February 2026.[3]
Who builds compute infrastructure

| Tier | Examples | Scale | Workload | Capital |
|---|---|---|---|---|
| Hyperscalers | Microsoft, Meta, Google | $10B-$80B+ | Training + inference | Cash flow, bonds |
| Neoclouds | CoreWeave, Lambda, Crusoe | $100M-$10B+ | Training + inference | VC, GPU-backed debt |
| Sub-$100M operators | PE firms, family offices, entrepreneurs | $10M-$100M | Inference | Equity + GPU-backed debt |
Below them, a different market is forming: operators deploying clusters of a few thousand GPUs, with total project costs under $100 million and power requirements under 5 megawatts.
Private equity firms and family offices are deploying fund capital into GPU clusters the same way they deploy into infrastructure assets: buy hardware, finance it with debt, lock in revenue through contracts, and distribute profits after several years. Entrepreneurial operators raise from both, bringing the technical team and the data center relationships.
Depending on the deal, the operator might earn a management fee and carry, a share of the profits above a return threshold. The capital partners collect the rest of the profits.
Renting space
These smaller operators don't always build out the data centers; just as often they rent space inside them. Colocation providers like Equinix, QTS, or CyrusOne own the facilities. The operator signs a colocation agreement and installs its GPU servers, and the colo provider supplies power, cooling, and network connectivity.
The operator owns the GPUs and the customer contracts. The colo provider owns the building. This split is what makes sub-$100M deployments possible: you don't need to spend $500 million and two years constructing a facility.
These smaller operators cannot serve AI training workloads as well as the top neocloud operators, but they are well positioned for AI inference workloads, running already-trained models in production. The inference market is growing rapidly as AI applications mature.
What the returns look like
As of 2026, a 576-GPU B200 cluster costs roughly $36 million. The operator finances it at 65% loan-to-value (LTV), borrowing about $23.4 million and putting up roughly $12.6 million in equity to buy the hardware.
Typical GPU-backed loan terms as of early 2026:
| Term | Typical range | Notes |
|---|---|---|
| LTV | 60-70% | Lenders advance 60-70% of hardware purchase price |
| Interest rate | ~15% | Higher for smaller operators with limited track records |
| Loan term | 3 or 5 years | Depends on the operator's plans |
| Amortization | Straight-line | Loan payment schedule is spread evenly over the loan term |
| Collateral | GPUs + contracts | Hardware plus assigned customer revenue |
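The debt service implied by these terms can be sketched directly. The $36M price, 65% LTV, and ~15% rate come from the figures above; the 4-year term is one assumed choice, and "straight-line" is read here as equal principal payments with interest accruing on the declining balance.

```python
# Straight-line amortization sketch for the loan terms above.
# Assumptions: 4-year term, equal annual principal payments,
# interest charged on the outstanding balance.
price = 36.0e6
loan = 0.65 * price            # $23.4M borrowed at 65% LTV
equity = price - loan          # $12.6M operator equity
rate = 0.15                    # ~15% annual interest
years = 4

principal_per_year = loan / years
balance = loan
payments = []
for year in range(1, years + 1):
    interest = balance * rate                  # interest on what's still owed
    payments.append(principal_per_year + interest)
    balance -= principal_per_year

for year, pmt in enumerate(payments, 1):
    print(f"year {year}: debt service ${pmt / 1e6:.2f}M")
```

First-year debt service comes to roughly $9.4M and declines each year as principal is retired, which is why margins widen over the hold.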
B200 rates on marketplaces like Vast.ai run around $2-4 per GPU-hour as of early 2026.[4] A GPU-hour is one GPU running for one hour, the standard billing unit. At a blended $3.50/hr and 80% utilization, that cluster generates roughly $14.1 million per year.
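The revenue figure can be checked directly; the blended rate and utilization are the assumptions stated above, not fixed market facts.

```python
# Annual revenue for a 576-GPU cluster at the assumed blended
# rate and utilization.
gpus = 576
rate_per_gpu_hour = 3.50      # blended $/GPU-hour (assumption)
utilization = 0.80            # assumption
hours_per_year = 24 * 365     # 8760

annual_revenue = gpus * rate_per_gpu_hour * hours_per_year * utilization
print(f"${annual_revenue / 1e6:.1f}M per year")
```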
After debt service, the combined interest and principal payments on the loan, plus colocation, power, and a small technical team, the operator keeps a thin margin in the early years. As the loan pays down, more revenue turns into profit. When the operator eventually sells or refinances the GPUs, the remaining hardware value adds to the total return.
Over a 3-5 year hold, that math produces a 2-3x MOIC (multiple on invested capital), the ratio of total money returned to total money invested.
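One way to see how that multiple arises: sum the free cash flows over the hold plus the net proceeds at exit, divided by the equity invested. Everything below except the $12.6M equity figure is a hypothetical placeholder, not a number from this article.

```python
# Illustrative MOIC over a 4-year hold. Cash flow ramp and exit
# proceeds are hypothetical placeholders.
equity_in = 12.6e6                                      # equity from the financing example
annual_free_cash_flow = [1.0e6, 2.0e6, 3.0e6, 4.0e6]   # hypothetical ramp as debt pays down
exit_proceeds = 18.0e6                                  # hypothetical GPU sale minus remaining debt

total_returned = sum(annual_free_cash_flow) + exit_proceeds
moic = total_returned / equity_in
print(f"MOIC: {moic:.1f}x")
```

Under these placeholder inputs the deal returns about 2.2x, inside the 2-3x range the article describes.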
MOIC waterfall: 576-GPU B200 cluster, 4-year hold
CoreWeave posted a 60% adjusted EBITDA margin in 2025.[3] Smaller operators running leaner clusters at similar utilization see comparable unit economics per GPU, though they lack CoreWeave's scale advantages on procurement and colocation pricing, and CoreWeave borrows at far lower interest rates than smaller operators can.
The biggest risk is utilization
The return math depends on keeping GPUs busy. Every 20-point drop in utilization costs roughly $3.5 million per year in revenue on a 576-GPU cluster.
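That sensitivity falls straight out of the revenue formula: each point of utilization is worth a fixed slice of full-capacity revenue.

```python
# Revenue sensitivity to utilization for the cluster above.
def annual_revenue(utilization, gpus=576, rate=3.50, hours=8760):
    # gpus * rate * hours is full-capacity revenue; utilization scales it
    return gpus * rate * hours * utilization

drop = annual_revenue(0.80) - annual_revenue(0.60)
print(f"20-point utilization drop: ${drop / 1e6:.1f}M/yr")
```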
Annual free cash flow by utilization (576-GPU cluster)
Below roughly 60% utilization, most clusters cannot cover monthly costs. Interest payments on the loan are due every month even when revenue isn't coming in. CoreWeave paid $1.2 billion in interest in 2025 on $21.4 billion in debt.[3] Those costs exist proportionally for smaller operators.
The primary way operators manage this is to lock in long-term offtake agreements with termination penalties. A take-or-pay contract, where the customer pays whether they use the capacity or not, turns variable market demand into predictable revenue. Operators often structure these agreements in tranches: deliver this much compute by one date, more by a later date, and so on. CoreWeave, for example, has a $66.8 billion backlog of contracted future revenue.[3]
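A take-or-pay clause is simple to express: the invoice is driven by committed capacity, not actual usage. The function below is an illustrative sketch, with overage beyond the commitment billed at the same rate (an assumption, not a standard contract term).

```python
# Illustrative take-or-pay billing: the customer pays for the full
# commitment even when capacity sits idle; usage above the
# commitment is billed as overage at the same rate (assumption).
def take_or_pay_invoice(committed_gpu_hours, used_gpu_hours, rate=3.50):
    return max(committed_gpu_hours, used_gpu_hours) * rate

print(take_or_pay_invoice(1000, 400))    # idle capacity is still billed
print(take_or_pay_invoice(1000, 1200))   # overage billed on top
```

From the operator's side, revenue is floored at the committed amount, which is exactly what makes the contract bankable collateral.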
For lenders, this helps contain the risk of not getting paid back. In a 3-year GPU-backed loan, the lender will be covered as long as utilization lands above the 60% range. Lenders require operators to have offtake agreements in place before closing the loan.
Utilization risk is really customer risk. If a customer churns, downsizes, or goes bankrupt, your GPUs sit idle while debt service accrues. Operators who rely on a single customer face concentration risk: CoreWeave faced this with Microsoft, which accounted for 62% of its revenue in 2024 and 67% in 2025.[3]
Infrastructure booms that attracted small operators with debt
| Boom | Period | Outcome |
|---|---|---|
| Railroads | 1860s-1870s | Massive overbuilding, 1873 panic, consolidation into ~7 major lines |
| Fiber / telecom | 1998-2001 | 90%+ went bankrupt when demand lagged, $2T in market cap destroyed |
| Shale oil | 2010-2015 | Oil crashed from $100 to $30, overleveraged operators wiped out |
| GPU inference | 2024-now | Demand exceeds supply; market still in early buildout phase |
Why offtake is "easy" to lock in
Demand for compute outstrips supply as of early 2026.
Anthropic's revenue run rate grew from roughly $1 billion in early 2025 to nearly $20 billion by March 2026.[5] Other AI applications are following, and the inference demand is building. As proof of demand, NVIDIA Blackwell B200 and GB200 GPUs are sold out through mid-2026, with an estimated backlog of 3.6 million units according to industry analysts.[6]
Customers are willing to sign 1-3 year offtake agreements because they cannot get capacity elsewhere. A company building an AI product needs inference compute to serve its users; if it cannot get capacity, its product cannot launch. If hyperscalers like AWS and Azure are 2-3x the price, that company will commit to a 2-year contract with a smaller operator, as long as the operator can deliver on time within the next few months.
The typical offtake process for a sub-$100M cluster:
- Operator identifies a customer who needs compute (an AI company, a SaaS company deploying AI features, an enterprise running internal models).
- Customer signs an offtake agreement committing to 1-3 years of GPU-hours at a fixed rate.
- Operator uses the signed contract to secure GPU-backed financing from a private credit lender.
- Operator deploys hardware on time and begins billing the customer.
Lending
Private credit firms are funding these cluster operators. CoreWeave raised $2.3 billion from Magnetar and Blackstone in August 2023 and $7.5 billion from Blackstone in May 2024.[7] Apollo provided $3.5 billion in financing for the xAI/Valor compute infrastructure transaction in January 2026.[8] The next month, a large neocloud, Nscale, announced a $1.4 billion loan from PIMCO, Blue Owl, and LuminArx Capital Management.[9]
Smaller operators are borrowing from a similar profile, but on a smaller scale. They are typically funded by boutique private credit firms and family offices that are speculating in the space. Private credit direct lending to these small operators typically yields 12-15%, well above the 4-5% on investment-grade corporate bonds.
Traditional banks largely stepped back from this kind of mid-market lending after post-2008 regulations. Basel III and Dodd-Frank forced banks to hold more capital against risky loans, making syndicated lending less profitable. Private credit firms filled the gap. The market has grown to over $2.1 trillion as of 2023,[10] and GPU-backed lending is one of the newer asset classes absorbing that capital.
What can go wrong
How timeline risk cascades
The biggest execution risk is timeline. Offtake agreements have clear milestones and tranches; miss a milestone and the customer can end the contract.
- Of 110 data center projects slated for 2025, 26% were delayed.[11]
- Smaller GPU procurement orders can face 6-9 month lead times.
- GPU-backed loans can take 3-6 months to close.
- Colocation with enough power density for modern power-hungry GPUs, 40-80 kW per rack, is scarce.
- As of end 2024, about 10,300 energy projects of all types were waiting to connect to the U.S. power grid.[12] Data centers depend on new power generation coming online to get the capacity they need. Any delay in those energy projects means a delay for the data center.
Customer churn is a serious risk. Customers sign letters of intent with multiple operators and go with whoever delivers first. Revenue goes to the operator who shipped on time.
Offtake can fall apart in two ways: the customer defaults (e.g. a startup runs out of money and cannot pay), or the operator misses deployment milestones and the customer walks without breaching the contract.
If either happens and the operator cannot find new customers, the lenders have to rely on GPU residual value to recover their loan. The GPU residual value is what the operator can get from selling the used GPUs on the secondary market.
The problem is that GPU technology moves fast. CoreWeave depreciates servers over 6 years, Lambda uses 5, Nebius uses 4.[13] A lender holding a 3-year loan against B200 GPUs needs those GPUs to hold enough value through the term to cover the outstanding balance if the borrower defaults.
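Using book depreciation as a rough proxy for resale value (secondary-market prices can fall faster), one can check whether a straight-line loan stays covered through its term:

```python
# Does estimated residual value cover the outstanding loan balance?
# Assumptions: 5-year straight-line depreciation as a proxy for
# resale value, 3-year straight-line loan amortization at 65% LTV.
price = 36.0e6
loan = 0.65 * price
dep_years, loan_years = 5, 3

schedule = []
for year in range(1, loan_years + 1):
    residual = price * (1 - year / dep_years)   # book value proxy
    balance = loan * (1 - year / loan_years)    # remaining principal
    schedule.append((year, residual, balance))
    print(f"year {year}: residual ${residual / 1e6:.1f}M "
          f"vs loan balance ${balance / 1e6:.1f}M")
```

Under these assumptions the residual stays above the balance throughout; the risk the paragraph describes appears when secondary-market prices fall well below book value.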
References
1. CNBC, "Microsoft expects to spend $80 billion on AI-enabled data centers in fiscal 2025" (January 2025)
2. NPR, "Meta is building a massive data center" (December 2025)
3. CoreWeave Q4/FY2025 earnings report and SEC S-1 filing
4. Vast.ai B200 marketplace pricing (as of early 2026)
5. Bloomberg, "Anthropic Nears $20 Billion Revenue Run Rate" (March 2026)
6. TokenRing via Financial Content, "NVIDIA Blackwell B200 and GB200 Sold Out Through Mid-2026" (December 2025)
7. CoreWeave press release, "$2.3 Billion Debt Financing Facility" (August 2023); Blackstone, "$7.5 Billion Debt Financing Facility" (May 2024)
8. Apollo, "Apollo Backs $5.4 Billion Valor and xAI Data Center Compute Infrastructure Transaction" (January 2026)
9. Nscale, "$1.4bn Delayed Draw Term Loan Backed by GPUs" (February 2026)
10. IMF, "Fast-Growing $2 Trillion Private Credit Market Warrants Closer Watch" (April 2024)
11. Sightline Climate, Data Center Outlook report (2025); Latitude Media coverage
12. Lawrence Berkeley National Laboratory, "Queued Up: 2025 Edition" (2025)
13. theCUBE Research, "Resetting GPU Depreciation: Why AI Factories Bend But Don't Break Useful Life Assumptions" (2025)
Frequently Asked Questions
Who is building GPU compute infrastructure outside the hyperscalers?
Private equity firms, family offices, and entrepreneurial operators are deploying sub-$100M GPU clusters focused on inference workloads. They rent space in colocation facilities, finance hardware with GPU-backed debt at 60-70% loan-to-value, and lock in revenue through long-term offtake agreements with AI companies and enterprises.
What returns can GPU infrastructure investors expect?
Operators target a 2-3x MOIC over a 3-5 year hold. Returns come from three sources: operating cash flow from GPU-hour rentals, equity built up as the loan amortizes, and residual hardware value at exit. The math depends on maintaining 80%+ utilization and securing take-or-pay offtake contracts.
Why is inference compute demand so strong?
Every company deploying AI into production needs inference capacity. Anthropic's revenue run rate grew from roughly $1 billion in early 2025 to nearly $20 billion by March 2026, almost entirely inference compute. NVIDIA B200 and GB200 GPUs are sold out through mid-2026 with a 3.6 million unit backlog. Hyperscaler GPU instances have waitlists.
What is the biggest risk in GPU infrastructure investing?
Timeline risk. GPU procurement, colocation buildout, power availability, and loan closing can all slip. When timelines slip, operators miss deployment milestones and lose offtake deals to competitors who deliver first. Without a performing offtake contract, the investment falls back on GPU residual value, which compresses every time NVIDIA ships a new architecture.
Residual-value coverage sets a floor on what your GPUs are worth at a future date. If they sell below the floor, the policy pays you the difference.