NVIDIA is set to release earnings after the closing bell on August 23, and I think its top-line results will be the data point of the week. Analysts' consensus revenue estimate for the quarter is $11.17b, up from $7.19b last quarter.
Betting on Wednesday’s announcement is a gamble (although my gut tells me they’ll beat on top-line). Beyond Wednesday, NVIDIA remains a key AI play and is likely to continue experiencing massive sales growth.
Massive Demand for NVIDIA’s H100 GPUs
Drug dealers go to war with other drug dealers over territory in order to maintain a monopoly position in their market. This allows them to control supply and prices with little interference.
The same general principle works for any other good or service – particularly one with inelastic demand, like meth, insulin, or NVIDIA's H100 GPUs.
How big is the demand for H100s?
According to Elon Musk, OpenAI's GPT-5 might require 30,000–50,000 of NVIDIA's H100 GPUs. Is it any wonder NVIDIA stock has nearly quadrupled over the past several months?
Of course, OpenAI isn't the only company thirsty for GPUs. All of big tech is at the front of the line, along with dozens of startups. Startups building LLMs might need hundreds if not thousands of their own H100 GPUs (or its predecessor, the A100).
According to one source, here’s what we might be looking at in terms of overall demand:
OpenAI might want 50k. Inflection wants 22k. Meta maybe 25k (I’m told actually Meta wants 100k or more). Big clouds might want 30k each (Azure, Google Cloud, AWS, plus Oracle). Lambda and CoreWeave and the other private clouds might want 100k total. Anthropic, Helsing, Mistral, Character, might want 10k each. Total ballparks and guessing, and some of that is double counting both the cloud and the end customer who will rent from the cloud. But that gets to about 432k H100s. At approx $35k a piece, that’s about $15b worth of GPUs. That also excludes Chinese companies like ByteDance (TikTok), Baidu, and Tencent who will want a lot of H800s.
There are also financial companies each doing deployments starting with hundreds of A100s or H100s and going to thousands of A/H100s: names like Jane Street, JP Morgan, Two Sigma, Citadel.
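The source's ballpark can be reproduced with simple arithmetic. The sketch below uses the quoted per-company guesses (taking Meta's higher 100k figure, which is what makes the total come to ~432k) – these are the source's estimates, not official numbers:

```python
# Back-of-the-envelope reproduction of the demand ballpark quoted above.
# All figures are the source's guesses, not official data.
demand_h100s = {
    "OpenAI": 50_000,
    "Inflection": 22_000,
    "Meta": 100_000,  # the "actually 100k or more" figure, not the 25k one
    "Azure": 30_000,
    "Google Cloud": 30_000,
    "AWS": 30_000,
    "Oracle": 30_000,
    "Lambda/CoreWeave/other private clouds": 100_000,
    "Anthropic": 10_000,
    "Helsing": 10_000,
    "Mistral": 10_000,
    "Character": 10_000,
}

total_units = sum(demand_h100s.values())
price_per_gpu = 35_000  # approx price per H100, per the source
total_value = total_units * price_per_gpu

print(f"{total_units:,} H100s")                    # 432,000 H100s
print(f"${total_value / 1e9:.1f}b worth of GPUs")  # $15.1b worth of GPUs
```

As the source notes, some of this double counts clouds and the end customers who rent from them, so treat the total as an order-of-magnitude figure.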
How much does all this cost? A fortune according to the same source:
1x DGX H100 (SXM) with 8x H100 GPUs is $460k including the required support. $100k of the $460k is required support. Startups can get the Inception discount, which is about $50k off and can be used on up to 8x DGX H100 boxes, for a total of 64 H100s.
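Working those box-level figures down to a per-GPU cost (a rough sketch; the source doesn't state whether the ~$50k Inception discount applies per box, so that line is an assumption):

```python
# Rough per-GPU cost of a DGX H100 box, using the figures quoted above.
box_price = 460_000      # 1x DGX H100 (SXM), required support included
support_share = 100_000  # portion of the price that is required support
gpus_per_box = 8

per_gpu_all_in = box_price / gpus_per_box
per_gpu_hw_only = (box_price - support_share) / gpus_per_box
print(f"${per_gpu_all_in:,.0f} per H100, support included")   # $57,500
print(f"${per_gpu_hw_only:,.0f} per H100, hardware only")     # $45,000

# Assumption: the ~$50k Inception discount applies per box.
inception_discount = 50_000
per_gpu_discounted = (box_price - inception_discount) / gpus_per_box
print(f"${per_gpu_discounted:,.0f} per H100 with the discount")  # $51,250
```

Either way, a packaged H100 lands well above the ~$35k street price used in the demand estimate, since the DGX figure bundles the chassis, interconnect, and support.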
Elon Musk: Easier to Buy Drugs than GPUs Right Now…
Chip shortages are causing a slowdown in AI development right now. With the flow of releases slowing, some of the hype around AI development has died down. Google search trends for AI peaked at the end of April and have moved sideways since.
However, the development pipeline and hype will likely resurge once companies can get their hands on GPUs.
“One reason the AI boom is being underestimated is the GPU/TPU shortage. This shortage is causing all kinds of limits on product rollouts and model training but these are not visible. Instead all we see is Nvidia spiking in price. Things will accelerate once supply meets demand.”
— Adam D’Angelo, CEO of Quora, Poe.com, former Facebook CTO
Will NVIDIA Retain its Leadership Position?
NVIDIA doesn’t have a regulated monopoly position, so there’s nothing (except resources, talent, capacity, infrastructure and capital investment) preventing competitors from launching their own GPUs.
Although it currently seems NVIDIA's monopoly has at least a couple of years left to play out, here's a list of possible "monopoly breakers" I'm going to write about in another post. Some of these are things people are using today, some are available but don't have much user adoption, some are technically available but very hard to purchase or rent, and some aren't yet available:
* Software: OpenAI's Triton (you might've noticed it mentioned in some of TheBloke's model releases and as an option in the oobabooga text-generation-webui), Modular's Mojo (built on top of MLIR), OctoML (from the creators of TVM), geohot's tiny corp, CUDA porting efforts, and PyTorch as a way of reducing reliance on CUDA
* Hardware: TPUs, Amazon Inferentia, cloud companies' in-house chips (Microsoft Project Athena, AWS Trainium, TPU v5), chip startups (Cerebras, Tenstorrent), AMD's MI300A and MI300X, Tesla Dojo and D1, Meta's MTIA, Habana Gaudi, LLM ASICs, [+ Moore Threads]
A/H100s with InfiniBand are still the most common request from startups doing LLM training, though.
The AI Arms Race is in Full Swing
In response to the rising strategic threat, on August 9th President Biden signed an executive order meant to restrict China's access to strategically important technology. The order authorizes the U.S. Treasury Secretary to prohibit or restrict U.S. investments in Chinese entities in three sectors: semiconductors and microelectronics, quantum information technologies, and certain artificial intelligence systems.
At the same time, the UK announced it seeks to acquire 5,000 H100 chips as it strives to build a dominant position on the global AI stage. This is happening while other countries – such as Saudi Arabia and UAE – clamor for their own supply.
After 20 years of technological transfer and enablement to trading partners with questionable motives, western allies will fight to retain their AI lead through massive investment and geopolitical firewalls.
In my opinion, it is not unreasonable to suggest we're at the beginning of a secular AI-driven bull market – one that will occasionally crash and burn, but could lift capital markets for years to come.