Nvidia's earnings and the LLM scaling laws slowdown?
Hi all,
Given Nvidia's importance to the whole AI market and the ongoing debate about slowing scaling laws for LLMs, I decided to publish my overview of the company's latest earnings and my take on what we should expect next from Nvidia and the industry.
Headline numbers
Nvidia reported its earnings this Wednesday: $35.1 billion in revenue, up 17% sequentially and up 94% YoY, well above the $32.5B outlook. Data Center revenue was again the crown jewel at $30.8B, up 112% YoY. The growth was mainly driven by Nvidia's Hopper platform, with H200 sales in the quarter growing to double-digit billions, the fastest production ramp in the company's history. Net income increased 109% YoY to $19.3B. Ironically, Nvidia's revenue in the same quarter last year was lower than its net income this quarter. Surprisingly for many, even though Blackwell is ramping up, Nvidia says Hopper demand continues to be strong and is expected to remain so in the coming quarters.
Are cloud service providers still the main driver of Data Center revenue?
Cloud service providers (CSPs) like Amazon, Microsoft, Google, and Oracle are still the main drivers of Data Center revenue. Nvidia said that approximately half of its Data Center sales came from CSPs this quarter and that this revenue more than doubled year over year. This means Nvidia's reliance on CSPs is growing again.
CSPs as % of Data Center revenue:
Q4 2023 – over 50%
Q1 2024 – mid-40s%
Q2 2024 – 45%
Q3 2024 – 50%
The above figures confirm what many worry about: Nvidia's growth is heavily dependent on the CapEx spending of CSPs.
Throughout the call, Nvidia pointed to other customer groups contributing to growth, such as consumer internet companies like Meta and xAI, and Sovereign AI. The latter are governments, state-owned companies, or large local companies from certain regions building their own AI infrastructure:
»So our sovereign AI and our pipeline going forward is still absolutely intact as those are working to build these foundational models in their own language, in their own culture, and working in terms of the enterprises within those countries. And I think you'll continue to see this be growth opportunities that you may see with our regional clouds that are being stood up.«
The problem with growth being so dependent on CSPs is that, at some point, CSPs will run into limits on how much yearly CapEx their investor bases will tolerate.
Looking at this chart of CSP CapEx growth, we can see that we are getting into the territory of very big numbers, even for the biggest companies on Earth. We should also note that all of these companies have already raised their CapEx guidance for next year, so the numbers will get even bigger.
Looking at their free cash flow, we can see that they are generating roughly as much FCF as they are currently spending on CapEx.
How much more CapEx growth cloud providers' investors are willing to tolerate is a complicated question. It also depends on how much AI workloads contribute to top-line growth. Last quarter, we saw cloud revenue growth accelerate at all the CSPs, but that growth and the trajectory of CapEx growth are still on very different curves.
A slight sequential dip in networking revenue?
The less positive news for Nvidia came, quite surprisingly, from networking revenue. Nvidia said networking revenue increased 20% YoY but was down sequentially. I find this surprising given how customers have so far preferred to buy Nvidia's full stack: GPU, CPU, and networking, everything bundled together in one supercomputer. When asked about the sequential decline in networking revenue, management avoided the question and instead reaffirmed that networking will return to sequential growth in Q4:
»So this quarter is just a slight dip down and we're going to be right back up in terms of growing. We're getting ready for Blackwell and more and more systems that will be using not only our existing networking but also the networking that is going to be incorporated in a lot of these large systems we are providing them to.«
Gross margin
A very important metric that investors focus on when it comes to Nvidia is the gross margin. It is a barometer showing whether competition is heating up. This quarter's gross margin was 74.6%, down sequentially, primarily driven by a mix shift from H100 systems to more complex and higher-cost systems within the Data Center segment.
Nvidia also explained that we should expect the gross margin to continue to moderate into the low-70s as Blackwell ramps up. Once Blackwell is fully ramped, it should be back in the mid-70s. The CFO hinted that assuming a full ramp, and with it higher gross margins, in the second half of 2025 is reasonable.
When it comes to margins, a topic that is becoming more important as the years progress is what happens with the older generations of Nvidia chips that clients have purchased. Right now, we are witnessing a drastic fall in the price of renting the Nvidia H100, which was a highly sought-after chip just a year ago.
»If you look across the ecosystem, there are full data centers of H100 that they don't have any interest in it. A year ago, there was a mad scramble. People were paying ridiculous prices just for the H100 hardware, locking customers into anywhere from one- to three-year deals from CSP standpoint, at ridiculous rates like $6 an hour per H100. Now, because the H200 is out there at the same price and the Blackwells are coming, nobody wants the H100. It's going to be really interesting to see what's the price of an H100 drops to.«
(Former high-ranking Intel employee)
source: AlphaSense (get a 14-day free trial and access the full interview)
Jensen also made an interesting comment that running the OpenAI GPT-3 benchmark requires only 64 Blackwell GPUs compared to 256 H100s, a 4x reduction in cost. He also noted that the H200 delivers up to 2x faster inference performance and up to 50% improved TCO (total cost of ownership).
The older generations, like the H100, seem to be becoming irrelevant. However, some people expect them to be used for inference. That doesn't seem so straightforward to me, as Nvidia requires anyone who uses its chips for AI to have an AI Enterprise license. These licenses are costly (a few thousand dollars per GPU per year) and transferable, so if a client upgrades to a newer chip, they can move the license from the older one and get more compute for the same license cost. But if you want to keep using the older chip as well, you have to buy another license, which increases your total costs and adds to the cost of inference. On top of that, the older chip tends to be less performance- and power-efficient, which is key for inference, given that energy is limited and expensive.
»The license cost, it's per year. AMD I don't believe is charging anything. It's going to be interesting from NVIDIA's standpoint because you're going to get into a situation where customers are going to be encouraged because of the licensing cost if they're on a license to move faster to the next generation because essentially, at the rate of which they're increasing performance, it would be cheaper to replace four old GPUs with one new one because you only need one license versus the four, and the licenses are transferable. It's a nice way for NVIDIA to make sure people stay on the leading-edge hardware.«
(Former Intel Director)
source: AlphaSense (get a 14-day free trial and access the full interview)
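To make the licensing dynamic described above concrete, here is a minimal sketch of the trade-off between keeping older GPUs and consolidating onto a newer one. All figures (the license price, power draws, electricity cost, and the 4-to-1 consolidation ratio, which echoes Jensen's Blackwell-vs-H100 benchmark comment) are illustrative assumptions, not actual Nvidia pricing.

```python
# Illustrative sketch of how transferable, per-GPU, per-year licenses push
# customers toward newer hardware. All prices below are hypothetical
# assumptions for illustration, not actual NVIDIA AI Enterprise pricing.

license_per_gpu_per_year = 4_500   # assumed annual license cost per GPU ($)
years = 3                          # assumed holding period

old_gpus_needed = 4                # assumption: four older GPUs...
new_gpus_needed = 1                # ...deliver the same work as one newer GPU (echoes the 4x comment)

old_gpu_power_kw = 0.7             # assumed average draw per older GPU (kW)
new_gpu_power_kw = 1.0             # assumed average draw for the newer GPU (kW)
electricity_per_kwh = 0.10         # assumed electricity cost ($/kWh)
hours = 24 * 365 * years

def running_cost(gpu_count: int, power_kw: float) -> float:
    """Licenses plus electricity over the holding period (hardware cost excluded)."""
    licenses = gpu_count * license_per_gpu_per_year * years
    power = gpu_count * power_kw * hours * electricity_per_kwh
    return licenses + power

old_path = running_cost(old_gpus_needed, old_gpu_power_kw)
new_path = running_cost(new_gpus_needed, new_gpu_power_kw)

print(f"Keep four older GPUs:  ${old_path:,.0f} over {years} years")
print(f"Move to one newer GPU: ${new_path:,.0f} over {years} years")
```

Under these assumptions, most of the gap comes from paying for one license instead of four, which is exactly the incentive the former Intel director describes.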
Are LLM scaling laws slowing down?
This is a question Jensen couldn't avoid, as it was the first question on his earnings call, and it is something the whole AI community has been asking for the last few weeks: are scaling laws slowing down for LLMs? Bloomberg came out with an article saying that LLM developers are falling short of internal pre-training targets, and prominent figures in the AI community, including Ilya Sutskever (a co-founder of OpenAI who has since left), have hinted at a slowdown in the improvements coming from pre-training these LLMs. In his answer, Jensen shifted the question toward the new scaling paradigms that are being brought up.
He explained that LLMs now scale with pre-training, with post-training via reinforcement learning, and now with inference-time scaling, which OpenAI's o1 model first introduced. (For those interested in how inference is the next scaling paradigm, you can check this recent article I posted.) However, the fact that Jensen shifted his answer toward post-training and inference as scaling paradigms also suggests to me that there is some truth to pre-training scaling laws slowing down.
It is essential to understand this shift, as it can signal significant changes for the industry. If pre-training no longer brings big incremental improvements to LLMs and the focus shifts to post-training and inference scaling, then the gravity of investments might shift as well. Post-training scaling means improving the model with reinforcement learning via human or machine feedback. Here, the story is not about how many GPUs you have but more about the quality of your data. While some argue that synthetic data produced by older generations of LLMs can fill that gap, others don't think synthetic data will be good enough for major improvements. A former Google DeepMind Sr. Scientist said this about synthetic data:
»Now they're all just doing synthetic data to fix things here and here, but scaling laws does not want synthetic data. It actually wants what's called independent and identically distributed data, IID data. It wants new data. It doesn't want to just get a reframing of the old data. I think in the LLM world there is a sense in which the next-generation models are not going to be much better than the previous models if you just talking about pure text LLMs«
(Former Google DeepMind Sr. Scientist)
source: AlphaSense (get a 14-day free trial and access the full interview)
The above comment suggests that companies with the best real-world or siloed, closed online data might have a considerable advantage over companies that lack such data.
Also, regarding the inference scaling paradigm, the biggest question for Nvidia is its competitive position in that market. As I wrote some time ago when o1 came out, Nvidia is and will remain in the inference market, but its competitive position there differs from the pre-training market, where it has no real competition. In inference, given there is more competition and less complexity, I expect margins to reflect that over the longer term. A comment from the COO of a big AI compute company:
»I think in the next 12 months, you will see that the people will be able to figure out inference jobs running on NVIDIA or AMD almost equally other than some few major exceptions, so that will catch up«
source: AlphaSense (get a 14-day free trial and access the full interview)
The biggest advantage Nvidia can carve out is that, if context windows for inference-time compute become very large, customers would again need larger inference clusters, where Nvidia might have an edge. The other way Nvidia would benefit is if demand for inference really takes off and competitors cannot secure enough supply from TSMC, while Nvidia, with its close relationship with TSMC, can.
All of this leads us to another thing that, as expected, happened during the Nvidia call: Jensen talked a lot more about inference.
Here are just a few of his comments from the call mentioning inference:
»Our upcoming release of NVIDIA NIM will boost Hopper inference performance by an additional 2.4x.«
»NVIDIA Blackwell architecture with NVLink Switch enables up to 30x faster inference performance and a new level of inference scaling, throughput and response time that is excellent for running new reasoning inference applications like OpenAI's o1 model.«
»But remember, simultaneously, we're seeing inference really starting to scale up for our company. We are the largest inference platform in the world today because our installed base is so large«
For Nvidia, positioning itself firmly in inference is the next key step, as CapEx in 2025 will, in my view, shift from training to inference to a larger degree than is expected today.
Before we continue with the article, a quick section for our partner AlphaSense.
In my research process, I use the AlphaSense platform as my primary research tool. I made a short video explaining some of the features they provide and how I use the platform:
Now, back to the article…
Future growth vectors for Nvidia and the AI chip market
We also got some hints from the call about potential new growth verticals for Nvidia. Two things stood out to me. First, Jensen talked about robotics and industrial AI use cases, saying:
»Industrial AI and robotics are accelerating. This is triggered by breakthroughs in physical AI, foundation models that understand the physical world, like NVIDIA NeMo for enterprise AI agents. We built NVIDIA Omniverse for developers to build, train, and operate industrial AI and robotics. Some of the largest industrial manufacturers in the world are adopting NVIDIA Omniverse to accelerate their businesses, automate their workflows, and to achieve new levels of operating efficiency«
He even went as far as saying: »The age of robotics is coming.«
If this is true and we do get a near-term breakthrough in AI robotics, then this is another huge growth vector for Nvidia and the chip industry at large.
Second, Jensen talked about multimodal models being important, especially in the context of video:
»Now we have multimodality foundation models and the amount of petabytes video that these foundation models are going to be trained on, it's incredible.«
This is something I also agree with: text models are slowing down when it comes to pre-training, but video and other modalities seem to be in their early stages, even at the pre-training level. A former Google DeepMind scientist agrees with this take:
»However, if in other modalities, let's say, we start doing images or video or even agents, which is what I'm working on, then scaling laws holds. One, we're in the early stages of the scaling laws for agents and video. We have pretty much infinite video at this point.
Obviously, at some point we will exhaust that too but right now we have a lot of video and we're not training on them. Very few people train on big chunks of YouTube. Sora was one example. Also in agents a lot of action data is generated in the world, but it's not captured, it's not recorded, it's not trained on.«
(Former Google DeepMind Sr. Scientist)
source: AlphaSense (get a 14-day free trial and access the full interview)
Here, Jensen again pointed to an essential advantage of GPUs over ASICs: the flexibility of the GPU, as there is still a lot of innovation happening on the video and multimodal front:
»On the other hand, the models are getting larger. They're multimodality. Just the number of dimensions that inference is innovating is incredible. And this innovation rate is what makes NVIDIA's architecture so great because we -- our ecosystem is fantastic. Everybody knows that if they innovate on top of CUDA and top of NVIDIA's architecture, they can innovate more quickly and they know that everything should work.«
Nvidia's TAM
As often before, we got Jensen's take on how big he thinks the AI market is and on when and where Nvidia's demand should start to slow down. Jensen said:
»We are really at the beginnings of 2 fundamental shifts in computing that is really quite significant. The first is moving from coding that runs on CPUs to machine learning that creates neural networks that runs on GPUs.
And that fundamental shift from coding to machine learning is widespread at this point. There are no companies who are not going to do machine learning. And so machine learning is also what enables generative AI. And so on the one hand, the first thing that's happening is $1 trillion worth of computing systems and data centers around the world is now being modernized for machine learning. On the other hand, secondarily, I guess, is that on top of these systems are going to be -- we're going to be creating a new type of capability called AI.«
He later added the following:
»And let's assume that over the course of 4 years, the world's data centers could be modernized as we grow into IT, as you know, IT continues to grow about 20%, 30% a year, let's say.
And so -- but let's say by 2030, the world's data centers for computing is, call it, a couple of trillion dollars. We have to grow into that. We have to modernize the data center from coding to machine learning. That's number one. The second part of it is generative AI.«
As I often like to do with CEO statements, I did some back-of-the-napkin math on what that could mean for Nvidia.
The model here covers the next 4 years for Nvidia, which Jensen hinted could be the period in which the old infrastructure gets replaced, after which Nvidia continues to grow at 20-30%, in line with data center growth.
If we look at when Nvidia started to scale into data centers with its newest chips (2023), we can see that it has already sold approximately $150B of those systems to date, which means around $850B of the installed base still needs to be replaced over the next 4 years. If we add the 20-30% growth rate of data centers over this period, we get roughly $1,650B of data center revenue potential for Nvidia in those 4 years.

The implied CAGR at which Nvidia would grow its revenue over those 4 years would be around 59%. Nvidia's revenue in 2028 would, in that case, be around $600B, and if the net margin stays at today's high levels, net income would be around $300B. At that point, Nvidia would be a business growing at 20-30% going forward, so a 28x P/E might be an appropriate multiple, meaning the stock would trade at around $8.5T in market cap, close to a 25% CAGR over 4 years from today's price.

But in this scenario, given that CSPs are 50% of Nvidia's revenue, if that dependency remains the same, the CSPs would have to grow their cloud CapEx by close to 40% per year over those 4 years. Despite their high ambitions and the arms race they are in, this seems too high, especially considering the already high absolute level of CapEx we are at now.
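To make the back-of-the-napkin math above easier to follow, here is a minimal sketch of the same arithmetic. The installed-base, end-state revenue, margin, and multiple figures are the article's own assumptions; the ~$94B starting revenue is my added assumption, chosen only to reproduce the ~59% CAGR figure.

```python
# Minimal sketch of the bull-case, back-of-the-napkin math above.
# All inputs are the article's assumptions (or my own, marked below),
# not Nvidia guidance or consensus estimates.

years = 4

installed_base = 1_000          # $B of data center infrastructure to modernize (Jensen's ~$1T)
already_sold = 150              # $B of systems Nvidia has shipped since 2023
remaining = installed_base - already_sold            # ~$850B still to be replaced

revenue_2028 = 600              # $B, the article's end-state revenue estimate
base_revenue = 94               # $B, assumed starting revenue (my assumption, picked to match ~59% CAGR)
revenue_cagr = (revenue_2028 / base_revenue) ** (1 / years) - 1

net_margin = 0.50               # article's ~$300B on $600B implies ~50% (today's margin is ~55%)
net_income_2028 = revenue_2028 * net_margin

pe_multiple = 28                # multiple the article deems appropriate for 20-30% growth
implied_market_cap = net_income_2028 * pe_multiple   # in $B

current_market_cap = 3_600      # $B, roughly today's ~$3.6T valuation
stock_cagr = (implied_market_cap / current_market_cap) ** (1 / years) - 1

print(f"Remaining replacement opportunity: ${remaining}B")
print(f"Implied revenue CAGR: {revenue_cagr:.0%}")
print(f"Implied 2028 net income: ${net_income_2028:.0f}B")
print(f"Implied market cap: ${implied_market_cap / 1_000:.1f}T")
print(f"Implied 4-year stock CAGR: {stock_cagr:.0%}")
```

Running this reproduces roughly the numbers quoted above: a ~59% revenue CAGR, ~$300B of net income, an implied market cap of around $8.4T, and roughly a 24% four-year stock CAGR from a $3.6T starting point.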
Current analyst consensus estimates for Nvidia's revenue growth are also much lower: next year, Nvidia's revenue is projected to grow 48.5%, and growth is then expected to drop to around 20%. That is nowhere near the 59% CAGR the above calculation implies.
The calculation above is probably the »if all goes perfect« bull case and, for my taste, is too optimistic, as it presumes a complete replacement cycle of cloud IT infrastructure. It also assumes no competition for Nvidia. More importantly, in my view, it assumes that scaling laws for pre-training continue to hold, which I don't think is the case.
For now, I believe the reality is that, because we are shifting from pre-training scaling to post-training and inference scaling, the unknowns are still too big to estimate Nvidia's future growth correctly. The inference market will be crucial, and there the demand/supply and competition landscape looks much different from the pre-training market. Customers are also much more price-sensitive on inference, as the amount of inference compute needed is orders of magnitude bigger than the compute for training. So, while revenue and demand might grow at a high clip, keeping the ~55% net income margin is the biggest challenge for Nvidia.
In summary, Nvidia's quarter was really strong; seeing this incredible demand for Hopper and the H200 persist despite Blackwell ramping up is a rare thing to witness. At the same time, based on the call and recent conversations in the AI space, my take remains that we are in a period where essential shifts are happening in compute and architecture, and Nvidia's position in this new phase is strong, but less dominant than before. The expectations baked into the company's $3.6T valuation are not small.
As always, I hope you found this insightful, and until next time.
PS: If you are interested in the impact of AI on the investment industry and the investment research process, I kindly invite you to join this free webinar that Patrick O’Shaughnessy from the Invest Like The Best podcast is hosting:
If you found this article insightful, I would appreciate it if you could share it with people you know who might find it interesting. Thank you!
Disclaimer:
I own Amazon (AMZN), Meta (META), TSMC (TSM), and Oracle (ORCL) stock.
Nothing contained in this website and newsletter should be understood as investment or financial advice. All investment strategies and investments involve the risk of loss. Past performance does not guarantee future results. Everything written and expressed in this newsletter is only the writer's opinion and should not be considered investment advice. Before investing in anything, know your risk profile and if needed, consult a professional. Nothing on this site should ever be considered advice, research, or an invitation to buy or sell any securities.