<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[UncoverAlpha]]></title><description><![CDATA[Deep dives/analyses of the technology companies and tech sub-industries. Mostly about AI, semiconductor, cloud, software, and ad tech sectors.]]></description><link>https://www.uncoveralpha.com</link><image><url>https://substackcdn.com/image/fetch/$s_!YsyF!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28f3c972-e191-4eec-857d-3ccfda10a107_227x227.png</url><title>UncoverAlpha</title><link>https://www.uncoveralpha.com</link></image><generator>Substack</generator><lastBuildDate>Sat, 13 Jun 2026 14:37:35 GMT</lastBuildDate><atom:link href="https://www.uncoveralpha.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Rihard Jarc]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[uncoveralpha@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[uncoveralpha@substack.com]]></itunes:email><itunes:name><![CDATA[UncoverAlpha]]></itunes:name></itunes:owner><itunes:author><![CDATA[UncoverAlpha]]></itunes:author><googleplay:owner><![CDATA[uncoveralpha@substack.com]]></googleplay:owner><googleplay:email><![CDATA[uncoveralpha@substack.com]]></googleplay:email><googleplay:author><![CDATA[UncoverAlpha]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Memory per Token Optimizations. 
Where are we, and how much more room do we have?]]></title><description><![CDATA[A look at software optimization techniques for lowering HBM usage, my estimates of how much of that optimization is already captured and how much reduction is still on offer, hardware angles from SRAM accelerators like Groq and Cerebras, and the hardware split between prefill and decode and what that opens up.]]></description><link>https://www.uncoveralpha.com/p/memory-per-token-optimizations-where</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/memory-per-token-optimizations-where</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Thu, 11 Jun 2026 13:12:43 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Q2-s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328e7b20-36b8-4545-85e3-1a6423897e9e_1408x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>In the AI compute buildout phase, especially for inference, which is where the industry is shifting most of its capex and where the actual revenue gets generated, the binding constraint is increasingly not compute. It&#8217;s memory. Capacity and bandwidth.</p><p>I covered the supply side of this story in my memory cycle article &#8220;<a href="https://www.uncoveralpha.com/p/every-memory-cycle-ends-the-same">Every Memory Cycle Ends the Same. Until It Doesn&#8217;t</a>,&#8221; where I argued that HBM has turned memory from a gadget component into a raw input for intelligence. In that article, I also wrote that the real risk for the memory cycle is a technical breakthrough that lets models get by on orders of magnitude less memory. 
Today&#8217;s article is about the other side of that coin: what the labs and inference providers are doing in software and model architecture to need less memory per token, how much of that optimization potential is already captured, and what it means for the hardware stack, including some unconventional setups like using depreciated H100s and A100s as dedicated decode machines.</p><p>The reason this matters now and not in two years is simple: companies are starting to hit their token spend limits. Agentic workloads (coding agents, research agents, computer-use agents) consume tokens at a rate that makes the chatbot era look like a rounding error. A single coding agent session can chew through millions of tokens of context, and companies are growing increasingly frustrated with the cost.</p><p>In this article, I cover software optimization techniques for lowering HBM usage, estimate how much of that optimization is already captured, give my view on the best solution and how much reduction it could offer, and cover the hardware angles: SRAM accelerators like Groq and Cerebras, and the splitting of prefill and decode onto separate hardware and what that opens up.</p><p>Let&#8217;s start.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q2-s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328e7b20-36b8-4545-85e3-1a6423897e9e_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q2-s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328e7b20-36b8-4545-85e3-1a6423897e9e_1408x768.png 424w, 
https://substackcdn.com/image/fetch/$s_!Q2-s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328e7b20-36b8-4545-85e3-1a6423897e9e_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!Q2-s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328e7b20-36b8-4545-85e3-1a6423897e9e_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!Q2-s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328e7b20-36b8-4545-85e3-1a6423897e9e_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Q2-s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328e7b20-36b8-4545-85e3-1a6423897e9e_1408x768.png" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/328e7b20-36b8-4545-85e3-1a6423897e9e_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2009669,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/201591402?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328e7b20-36b8-4545-85e3-1a6423897e9e_1408x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Q2-s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328e7b20-36b8-4545-85e3-1a6423897e9e_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!Q2-s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328e7b20-36b8-4545-85e3-1a6423897e9e_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!Q2-s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328e7b20-36b8-4545-85e3-1a6423897e9e_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!Q2-s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F328e7b20-36b8-4545-85e3-1a6423897e9e_1408x768.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p><strong>Prefill and decode: the two jobs inside every AI request</strong></p><p>To understand why memory is the bottleneck, you first need to understand that every LLM request is actually two completely different workloads stapled together: prefill and decode.</p><p><strong>Prefill</strong> is what happens when you send the model your input: the prompt, the document, the codebase, the conversation history. The model reads all of it and processes every input token in parallel, in one big pass. Think of it as an analyst who gets handed a 300-page data room before a meeting. He reads the whole thing in one sitting, takes structured notes on every page, and files those notes away. This is brute-force work; the limiting factor is how fast his brain works, not how fast he can pull pages out of the binder. In GPU terms, prefill is compute-bound: the chip&#8217;s arithmetic units (the FLOPs) are the bottleneck, and the memory system can keep up.</p><p><strong>Decode</strong> is what happens when the model generates its answer, one token at a time. To generate each single token, the model has to read essentially all of its weights from memory, plus all of the notes it took during prefill (the so-called KV cache, more on this later in the article). Then it does a comparatively tiny amount of math, produces one token, and does the entire memory read again for the next token. Our analyst is now in the meeting, and before he speaks each individual word, he has to re-skim his entire stack of notes. The bottleneck is no longer his brainpower. It&#8217;s how fast he can flip pages. 
Decode is memory-bandwidth-bound.</p><p>You can put numbers on this. An Nvidia H100 delivers roughly 989 TFLOPS of dense BF16 compute against 3.35 TB/s of HBM3 bandwidth. That ratio (989 / 3.35) means the chip needs to perform roughly 295 floating-point operations for every byte it pulls from memory just to keep its compute units fed. Prefill, processing thousands of tokens in parallel, easily clears that bar. Decode doesn&#8217;t come close: roofline analyses of autoregressive generation put its arithmetic intensity at roughly 1 FLOP per byte, about two orders of magnitude below the compute-bound ridge point. In plain English: during decode, the most expensive compute engines on the planet sit idle 95%+ of the time, waiting for memory.</p><p>The simplest illustration: a 70B-parameter model in FP16 is ~140GB of weights. On an H100 with 3.35 TB/s of bandwidth, the theoretical single-user decode ceiling is about 3,350 GB/s / 140 GB = 24 tokens per second. On an A100 with 2 TB/s, it&#8217;s about 14 tokens per second. Notice what&#8217;s not in that equation: FLOPs. You could double the H100&#8217;s compute and single-stream decode speed wouldn&#8217;t move at all. The only lever is bandwidth, or reading fewer bytes.</p><p>This is the &#8220;memory wall,&#8221; and it&#8217;s also why a lot of inference revolves around batching. If reading 140GB of weights produces one token for one user, that&#8217;s terrible economics. If the same 140GB read produces one token each for 200 users simultaneously, your cost per token just dropped ~200x. The weights are read once, amortized across the batch. So the game every inference provider plays is: cram as many concurrent users as possible onto each GPU. And what limits how many users you can cram on? Memory capacity. Because every user brings their own luggage.</p><p><strong>The luggage: KV cache, and why agents made it explode</strong></p><p>That luggage is the KV cache. 
During prefill, the model stores intermediate &#8220;key&#8221; and &#8220;value&#8221; representations of every token in the context, so that during decode it doesn&#8217;t have to re-process the whole prompt for every new token. Those are the analyst&#8217;s notes. The catch: the notes grow linearly with context length, and they have to sit in the same precious HBM as the model weights.</p><p>In a standard transformer, Llama 3.1 405B needs 516 KB of KV cache per token of context; Qwen-2.5 72B needs 327 KB per token. Run that forward: a single user with a 128K-token context on a 70B-class model is carrying roughly 40GB of KV cache. At one million tokens of context, a Llama-70B-scale model would need ~330GB in BF16 for the KV cache alone, which doesn&#8217;t fit in any single GPU. For reference, the H100 has 80GB total. The B300 has 288GB.</p><p>In the chatbot era, this was manageable because most conversations were a few thousand tokens. Agentic workloads change that. An agent doing a long coding task holds the repo, the tool outputs, the execution traces, the full plan, all of it, in context, for hours. Context lengths of 100K-1M tokens went from research demo to daily production workload. And every one of those tokens occupies HBM for the entire duration of the session. The KV cache, not the model weights, becomes the dominant consumer of memory. Which means batch sizes collapse, which means the weight-read amortization collapses, which means cost per token explodes.</p><p>So what is the industry doing about it? A lot, actually. 
Let&#8217;s go through the software optimization stack, and, importantly for investors trying to model how much efficiency is still on the table, my estimate of how much of each technique&#8217;s potential has already been captured.</p><p><strong>The optimization stack: where we are on each curve</strong></p><p>A quick framing note: the percentages below are my own estimates of &#8220;captured potential&#8221; at the frontier (the major labs and serious inference providers), based on what&#8217;s publicly documented. The long tail of enterprise deployments is far behind the frontier on all of these, which is itself an investment-relevant point: there is a lot of &#8220;free&#8221; efficiency still sitting unused in corporate AI deployments.</p><p><strong>1. Continuous batching </strong></p>
      <p>
          <a href="https://www.uncoveralpha.com/p/memory-per-token-optimizations-where">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Most of the Economy Won't Run on the Best Model]]></title><description><![CDATA[It&#8217;s a thesis about where the money in AI actually goes once we transition to scaled-up AI workloads and how that might look different from today&#8217;s expectations.]]></description><link>https://www.uncoveralpha.com/p/most-of-the-economy-wont-run-on-the</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/most-of-the-economy-wont-run-on-the</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Thu, 04 Jun 2026 12:49:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!TR1s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51126957-2f3d-4f00-be4b-a267700e792b_1408x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>I want to share some of my thoughts on why I think a significant change is going to happen in the AI industry, and the market is blind to it. It&#8217;s a thesis about where the money in AI actually goes once we transition to scaled-up AI workloads and how that might look different from today&#8217;s expectations.</p><p>Let me start with an analogy that I keep coming back to.</p><p>When a company hires an accountant, it does not go out and hire a PhD in pure mathematics to reconcile the ledgers. Not because the PhD couldn&#8217;t do it &#8212; they obviously could, and probably faster &#8212; but because it makes no economic sense. The PhD is overqualified, which is just another way of saying they are too expensive for the value the task produces. The economic output of bookkeeping is capped. There is only so much upside in getting the books done. So you hire the cheapest person who clears the quality bar, and you pocket the difference. And you can take this analogy and apply it to multiple other jobs.</p><p>Now flip it. 
If you are running a drug-discovery program, you absolutely want the PhD &#8212; in fact you want five of them, plus a Nobel laureate consulting on the side. Why? Because the economic output of a single discovery is enormous, almost unbounded. The expected value of a breakthrough is measured in tens of billions, so the cost of the smartest possible person working on it rounds to zero against the prize. Here, intelligence is the only thing that matters, and cost is an afterthought.</p><p>This is, I think, exactly how the AI model market is going to bifurcate. And we are right at the inflection point where it starts to happen.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TR1s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51126957-2f3d-4f00-be4b-a267700e792b_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TR1s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51126957-2f3d-4f00-be4b-a267700e792b_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!TR1s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51126957-2f3d-4f00-be4b-a267700e792b_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!TR1s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51126957-2f3d-4f00-be4b-a267700e792b_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!TR1s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51126957-2f3d-4f00-be4b-a267700e792b_1408x768.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!TR1s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51126957-2f3d-4f00-be4b-a267700e792b_1408x768.png" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/51126957-2f3d-4f00-be4b-a267700e792b_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1801680,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/200606531?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51126957-2f3d-4f00-be4b-a267700e792b_1408x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TR1s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51126957-2f3d-4f00-be4b-a267700e792b_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!TR1s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51126957-2f3d-4f00-be4b-a267700e792b_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!TR1s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51126957-2f3d-4f00-be4b-a267700e792b_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!TR1s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51126957-2f3d-4f00-be4b-a267700e792b_1408x768.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><strong>The metric is not intelligence. It&#8217;s intelligence per dollar</strong></p><p>Today, essentially everyone uses the state-of-the-art (SOTA) model for everything. You want to summarize an email? SOTA model. Classify a support ticket? SOTA model. Extract three fields from an invoice? SOTA model. We do this for one simple reason: the frontier models have only just crossed the threshold of being broadly truly impactful for knowledge work, and when something has only just started working, you reach for the best version of it you can find. 
You don&#8217;t optimize cost on a capability you weren&#8217;t sure you had last quarter.</p><p>But I believe this is a transitional behavior, not a stable equilibrium. And there are a few reasons for that.</p><p>The Stanford HAI AI Index found that the inference cost for a system performing at GPT-3.5 level dropped more than 280-fold between November 2022 and October 2024, from roughly $20 per million tokens to about $0.07. Andreessen Horowitz, looking at the same phenomenon across the whole performance spectrum, concluded that for a model of equivalent performance, cost falls by roughly 10x every year: faster than compute fell during the PC revolution, faster than bandwidth fell during the dotcom build-out. Epoch AI, slicing it by benchmark, found the price to hit GPT-4-level performance on PhD-level science questions fell by about 40x per year, with the range across benchmarks running anywhere from 9x to 900x annually. And Sarah Friar, OpenAI&#8217;s CFO, recently said the following at the All-In conference:</p><div class="pullquote"><p>&#8220;The good news on compute is that there is a massive deflationary curve on cost, right? From ChatGPT... uh, [GPT] 4 to 5.4, I think the deprecation of cost was something like 97%. It&#8217;s like kind of an amazing curve, actually..but that happened in like two years.&#8221;</p></div><p>Pick whichever number you find least aggressive. They all say the same thing: the capability you are paying a premium for today becomes nearly free in about a year.</p><p>On top of that, many companies and enterprises are starting to burn through their planned annual token consumption in just a few months. This trend is accelerating, and I have been hearing about it all across the industry. Yesterday, a comment from Sam Altman was published:</p><div class="pullquote"><p>&#8220;Probably the second biggest theme is around cost. 
People are really saying, that&#8217;s kind of become a meme now, but &#8220;my company spent my entire 2026 budget in Q1. Can you make this more efficient?&#8221;...that went from at the beginning of this year, an issue that never came up - I know people were totally happy with the amount they were spending - to all of a sudden a huge issue.&#8221;</p></div><p>What this means is that companies have no choice but to optimize costs, and that will soon mean using models other than the SOTA model for specific tasks.</p><p>The second force is that the frontier itself is getting smaller, not just cheaper. Epoch AI has pointed out that frontier models are now roughly an order of magnitude smaller in parameter count than GPT-4 was, because once inference becomes the dominant cost, you stop training huge models and start over-training small ones on far more data. Distillation compounds this: a teacher model&#8217;s capability gets compressed into a student a fraction of its size. This is exactly what Meta does internally when it uses its AI models to power its ads and content platforms: the student model is the one applied at scale, with its knowledge distilled from the teacher model.</p><p>Putting all of this together: we have a rapidly falling price for any given level of capability, a frontier that is already shrinking in the size of what is actually being deployed, and companies burning through their annual token budgets in a matter of months.</p><p>As such, I believe that for the overwhelming majority of economically valuable knowledge work, the correct model is not the SOTA model. It&#8217;s the cheapest model that clears the task&#8217;s quality bar. 
And as pilots move into full production (which is the stage we are in today), where you&#8217;re suddenly paying for millions or billions of tokens a day instead of running a demo, intelligence-per-dollar becomes the only metric that survives contact with a CFO.</p><p>At the same time, the SOTA model and its use cases don&#8217;t disappear. They go where the economic ceiling is unbounded: frontier R&amp;D, drug discovery, novel mathematics, and the hardest agentic reasoning chains. But that is a smaller slice of token volume in today&#8217;s economy. The accountant&#8217;s quadrant (classification, extraction, summarization, routine code, customer support, the boring profitable middle of the economy) is where the majority of tokens actually are, and that quadrant is going to run on cheaper, distilled, often fine-tuned, frequently &#8220;older&#8221; models.</p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><p><strong>The investment angle: the money moves to the owners of installed compute</strong></p><p>If the thesis above is right, where does the capital go?</p><p>The intuitive answer, the one the market is currently screaming, is &#8220;buy the picks and shovels.&#8221; Buy Nvidia, buy Broadcom, buy the ASIC co-designers, buy memory, buy anything that sells new compute. But my view is that while that sector still might do well, there is a different part of the tech stack that will benefit even more, looking at the rate of change from the current state.</p><p>The sellers of new compute (semis) are only winners in a world of continued high-cadence spending on new compute. 
And my thesis specifically questions whether that cadence is necessary. So let me lay out the two states the world can be in, because the asymmetry between them is the whole argument.</p><p><strong>Scenario 1: Capex falls or stabilizes.</strong> If you can squeeze an order of magnitude more useful tokens out of the hardware you already own &#8212; because models got smaller, cheaper, more efficient and verticalized &#8212; then you no longer need to spend $100bn+ every single year just to stay relevant. In this world, the owners of the installed base win and the sellers of new compute lose. Hyperscaler free cash flow inflects sharply upward, because capex was the one thing suppressing it. Multiples re-rate higher as the cloud business converts from a capex incinerator into a cash machine running largely paid-for, partly-depreciated hardware. And the semis de-rate, because the market finally realizes the upgrade treadmill has slowed.</p><p><strong>Scenario 2: Capex stays high &#8212; and revenue explodes.</strong> This is the Jevons-paradox-on-steroids case. Demand is so strong that hyperscalers do both: they extract enormous output from cheap, long-lived existing hardware and keep buying new gear. Here everyone wins at once &#8212; but the hyperscalers win more, because their incremental revenue now lands on a cost base that is partly depreciated and dramatically more efficient per token. Operating leverage goes vertical.</p><p>The interesting thing is that the market is currently priced for neither. 
The market is currently pricing only a future in which CapEx continues to grow for the foreseeable future and the semiconductor industry benefits, but at the same time, the hyperscalers are making a losing bet with spending on this CapEx, as the market is questioning the return on that spend.</p><p>I think this market premise is very wrong, as we are actively transitioning to production-scale AI workloads where the economics are different from those in the pilot world, where we mostly lived for the last few months.</p><p>As always, I hope you found this article valuable. I would appreciate it if you could share it with people you know who might find it interesting. I also invite you to become a paid subscriber, as paid subscribers get additional articles covering both big tech companies in more detail, as well as mid-cap and small-cap companies that I find interesting.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/subscribe&quot;,&quot;text&quot;:&quot;Subscribe to Paid&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/subscribe"><span>Subscribe to Paid</span></a></p><p>Thank you!</p><p><strong>Disclaimer:</strong></p><p>I own hyperscaler Meta (META), Amazon (AMZN), Microsoft (MSFT), Google (GOOGL) stock.</p><p>Nothing contained in this website and newsletter should be understood as investment or financial advice. All investment strategies and investments involve the risk of loss. Past performance does not guarantee future results. Everything written and expressed in this newsletter is only the writer&#8217;s opinion and should not be considered investment advice. Before investing in anything, know your risk profile and if needed, consult a professional. 
Nothing on this site should ever be considered advice, research, or an invitation to buy or sell any securities.</p>]]></content:encoded></item><item><title><![CDATA[The Harness: The Moat for AI Model Providers?]]></title><description><![CDATA[The real moat that the frontier labs are building right now is not the model. It is the harness around the model and the cost of serving the model.]]></description><link>https://www.uncoveralpha.com/p/the-harness-the-moat-for-ai-model</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/the-harness-the-moat-for-ai-model</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Fri, 22 May 2026 14:03:01 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!aul4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7c689d9-6577-4899-aa84-70bac90561a3_953x828.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>I want to walk through what is, in my opinion, an important structural shift happening in AI right now.</p><p>For years, we have valued AI model providers by benchmarks of their models&#8217; raw performance. The narrative has been &#8220;whoever has the best benchmark model wins.&#8221;</p><p>I think this framing is becoming increasingly wrong. The real moat that the frontier labs are building right now is not the model. It is the harness around the model and the cost of serving the model. In this article, I will focus on the harness. 
And once you understand what a harness is, you start to see why Anthropic&#8217;s enterprise revenue keeps compounding even when its raw benchmark scores are not always the best, and why the labs that own both the model and the harness are setting themselves up for the kind of platform lock-in that historically produced high gross margin businesses.</p><p>Let&#8217;s start.</p><p><strong>What is a harness?</strong></p><p>You can picture a harness as the entire system that wraps the model and turns it from a text generator into something that can do real work. Tools, memory, system prompts, permission policies, sandboxes, subagent dispatch, context management, the agent loop itself.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aul4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7c689d9-6577-4899-aa84-70bac90561a3_953x828.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aul4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7c689d9-6577-4899-aa84-70bac90561a3_953x828.png 424w, https://substackcdn.com/image/fetch/$s_!aul4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7c689d9-6577-4899-aa84-70bac90561a3_953x828.png 848w, https://substackcdn.com/image/fetch/$s_!aul4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7c689d9-6577-4899-aa84-70bac90561a3_953x828.png 1272w, 
https://substackcdn.com/image/fetch/$s_!aul4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7c689d9-6577-4899-aa84-70bac90561a3_953x828.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aul4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7c689d9-6577-4899-aa84-70bac90561a3_953x828.png" width="953" height="828" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e7c689d9-6577-4899-aa84-70bac90561a3_953x828.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:828,&quot;width&quot;:953,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:73021,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/198827305?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7c689d9-6577-4899-aa84-70bac90561a3_953x828.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aul4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7c689d9-6577-4899-aa84-70bac90561a3_953x828.png 424w, https://substackcdn.com/image/fetch/$s_!aul4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7c689d9-6577-4899-aa84-70bac90561a3_953x828.png 848w, https://substackcdn.com/image/fetch/$s_!aul4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7c689d9-6577-4899-aa84-70bac90561a3_953x828.png 1272w, 
https://substackcdn.com/image/fetch/$s_!aul4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7c689d9-6577-4899-aa84-70bac90561a3_953x828.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>The cleanest one-line definition comes from LangChain: &#8220;Agent = Model + Harness. If you&#8217;re not the model, you&#8217;re the harness.&#8221;</p><p>A useful analogy is to think of a model as a very smart contractor who can only communicate through notes. 
A harness is the tool setup &#8212; the computer, the phone, the filing cabinet, the rulebook, the permissioning &#8212; that lets the contractor actually do the job.</p><p>Mechanically, even the most sophisticated harnesses are just a while loop. The model emits a tool call, the harness executes it, the result gets fed back, and the loop continues. Anthropic literally calls their runtime a &#8220;dumb loop&#8221; where all the intelligence lives in the model. But the complexity is in everything that loop manages: which tools exist, what their schemas look like, what gets injected into context on each turn, what gets compacted away, what gets remembered across sessions, how subagents are spawned, how errors are recovered, when the model is allowed to take a destructive action without asking. None of that is &#8220;the model.&#8221; All of it determines whether your agent actually finishes the task or burns 100,000 tokens going in circles. That affects both the quality of the output and the cost.</p><p><strong>The empirical evidence: the harness is moving benchmarks even more than the model is in some cases</strong></p><p>We now have some hard data showing that the harness can move model performance by more than a full model generation upgrade.</p><p>Three independent data points from the last six months, all converging on the same conclusion:</p><p><strong>Data point 1 - LangChain on Terminal-Bench 2.0.</strong> Terminal-Bench 2.0 is now the standard benchmark for evaluating coding agents on real terminal tasks (89 tasks across machine learning, debugging, biology, infrastructure). LangChain kept the model fixed (GPT-5.2-Codex) and only changed the harness. The score went from <strong>52.8% to 66.5%</strong>. That moved them from outside the Top 30 to <strong>Top 5</strong> on the leaderboard. Same model, same weights, different scaffolding. 
A 13.7 percentage point jump from harness alone.</p><p><strong>Data point 2 - Cursor on Terminal-Bench 2.0.</strong> Cursor&#8217;s research team published a piece on April 30, 2026 reporting that they took their own coding agent from <strong>Top 30 to Top 5</strong> by only changing the harness. Same conclusion, different team, different harness. A 25-position jump on a public leaderboard, attributable to scaffolding alone.</p><p><strong>Data point 3 - Claude Opus 4.6, same weights, very different harnesses.</strong> This is the cleanest one. Look at the Terminal-Bench 2.0 leaderboard from late April 2026:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HNYZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F848847aa-07f2-48ee-947b-f03541587073_787x140.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HNYZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F848847aa-07f2-48ee-947b-f03541587073_787x140.png 424w, https://substackcdn.com/image/fetch/$s_!HNYZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F848847aa-07f2-48ee-947b-f03541587073_787x140.png 848w, https://substackcdn.com/image/fetch/$s_!HNYZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F848847aa-07f2-48ee-947b-f03541587073_787x140.png 1272w, https://substackcdn.com/image/fetch/$s_!HNYZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F848847aa-07f2-48ee-947b-f03541587073_787x140.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!HNYZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F848847aa-07f2-48ee-947b-f03541587073_787x140.png" width="787" height="140" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/848847aa-07f2-48ee-947b-f03541587073_787x140.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:140,&quot;width&quot;:787,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14503,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/198827305?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F848847aa-07f2-48ee-947b-f03541587073_787x140.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HNYZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F848847aa-07f2-48ee-947b-f03541587073_787x140.png 424w, https://substackcdn.com/image/fetch/$s_!HNYZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F848847aa-07f2-48ee-947b-f03541587073_787x140.png 848w, https://substackcdn.com/image/fetch/$s_!HNYZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F848847aa-07f2-48ee-947b-f03541587073_787x140.png 1272w, https://substackcdn.com/image/fetch/$s_!HNYZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F848847aa-07f2-48ee-947b-f03541587073_787x140.png 1456w" sizes="100vw" 
loading="lazy"></picture><div></div></div></a></figure></div><p>Same model. Different harnesses. A 4.5 percentage point spread between ForgeCode and Capy, on a benchmark where teams are fighting for tenths of a point.</p><p>The interesting observation is that the labs that trained the models do not always have the best harness for their own models, which I believe shows just how much untapped potential there still is in this area. ForgeCode is a third-party harness, and it lands three of the top six entries on Terminal-Bench 2.0 by routing across model families. LangChain summarized this as: &#8220;Opus 4.6 in Claude Code scores far below Opus 4.6 in other harnesses.&#8221; Anthropic&#8217;s flagship model in Anthropic&#8217;s flagship harness gets beaten by the same weights running in third-party scaffolding.</p><p>Now compare this to model generation upgrades. Going from Claude Opus 4.5 to Opus 4.7 in April 2026 took SWE-bench Verified from 80.9% to 87.6%, about a 6.7-point upgrade. The harness can move you 13.7 points on Terminal-Bench from scaffolding alone. The harness is moving the score by more than a model generation upgrade.</p><p>Don&#8217;t get me wrong, I am not trying to say that raw AI model performance doesn&#8217;t matter. The model floor is rising, and that floor matters a lot. The point is that on top of any given model, the harness is now the largest single performance variable in agent quality. Anyone shipping a coding agent in 2026 who picks the model first and the harness second is leaving a lot of performance and cost-effectiveness on the table.</p><p>This section from an interview with a former high-ranking Meta employee is very interesting. 
It explains how much room for improvement remains around the harness, and that because raw model improvements are so fast right now, resources are not yet focused on it enough:</p><div class="pullquote"><p>&#8220;The biggest barrier, I would say it&#8217;s just the modeling evolution is so quick. We want to push the model architecture faster but we don&#8217;t really have the time to really do parameter tuning to find fast architecture for our data or for different use cases. Right now, it&#8217;s more like we have one safe home. That&#8217;s our goal at that time. We have the foundation model. The foundation model powers roughly four or five different orgs&#8217; ranking models.<br><br> Although it&#8217;s a foundation model, it&#8217;s a Wikipedia, knows everything, but still, how to find an optimal maybe adapter or optimization, tuning for each of the five use cases that are under its cloud. I think that&#8217;s one of the biggest challenges. We have to chase our targets and there are some ways to hit a target a little bit easier than really going deep into understanding this model, what the model does and what&#8217;s the best parameter.&#8221;</p><p>Source: Former Meta employee found on AlphaSense</p></div><p><strong>Why the harness forms a moat</strong></p><p>It is important to understand that models are post-trained against a specific harness. They are not generic. Every modern frontier model was fine-tuned, with reinforcement learning, against a specific tool surface, a specific schema, a specific memory ritual, a specific citation format, a specific system prompt structure. The model&#8217;s instincts &#8212; what it reaches for when it needs to edit a file, what tag it wraps citations in, how it structures a plan, how it handles subagent dispatch &#8212; are baked into the weights during post-training. And they are baked in against one specific harness.</p><p>The clearest concrete example is the file-editing tool. 
From Cursor&#8217;s harness team:</p><div class="callout-block" data-callout="true"><p>&#8220;OpenAI&#8217;s models are trained to edit files using a patch-based format, while Anthropic&#8217;s models are trained on string replacement. Either model could use either tool, but giving it the unfamiliar one costs extra reasoning tokens and produces more mistakes. So in our harness, we provision each model with the tool format it had during training.&#8221;</p></div><p>Cursor is saying that the wrong tool format produces a measurable cost in reasoning tokens and an observable increase in error rate, recorded at scale across millions of agent turns. The model performs worse because the wire format it sees at runtime does not match the wire format it was trained on.</p><p>This is becoming a challenge for everyone trying to become an orchestration layer on top of AI model providers, as Microsoft is. I am not saying it is an impossible task, but you need a different harness for each AI model you use. It also suggests that the products these AI labs are offering are not just the model, but the model together with the custom harness, as the model and the harness were fused over months of post-training, and you cannot pull them apart without giving up performance. In the end, this might mean the moat becomes stronger for AI labs, as switching AI models becomes increasingly complex.</p><p><strong>Why this is a co-evolution, not a one-time thing</strong></p><p>The more powerful version of the moat argument is that the post-training and the harness feed back into each other on every model generation. This is what is starting to create compounding lock-in over time.</p><p>For example, suppose Anthropic&#8217;s harness team ships a new primitive, say a smarter subagent dispatch verb, a new way to compact context, or a memory file convention. By month three, that primitive shows up in millions of real agent traces from Claude Code users. 
By month six, those traces are training data for the next model generation. By month twelve, the next model has the primitive baked into its instincts, and the harness can now lean on it as something the model does natively.</p><p>Anthropic puts it this way in their March engineering blog: &#8220;every component in a harness encodes an assumption about what the model can&#8217;t do on its own. Those assumptions go stale.&#8221; When a model upgrade kills an old assumption, that piece of scaffolding gets retired, freeing up the harness to push on the next ceiling.</p><p>A concrete example from Anthropic&#8217;s own work: when Claude Sonnet 4.5 was the frontier model, the agent had &#8220;context anxiety&#8221; &#8212; it would wrap up tasks prematurely as it approached what it thought was its context limit. The harness compensated by aggressively resetting context between sessions and using structured handoff artifacts to carry state across the boundary. When Opus 4.6 shipped, that behavior was largely gone in the model itself, so Anthropic dropped the entire context-reset machinery and ran continuous sessions over two hours. The harness shrank because the model swallowed a chunk of its work.</p><p>The matched pair is not static. It moves with each model generation. And the labs that own both sides of the pair are the only ones who can move it cleanly. A third-party harness builder is always reacting to a model release; the labs are designing the next model with the next harness in mind.</p><p>This is genuinely structural moat behavior. It looks a lot like the way Microsoft built lock-in through tightly coupling Windows + Office + Exchange in the 1990s and 2000s - each layer made the others stickier, and competing on any single layer in isolation got harder every year.</p><p>There is a counter-example worth knowing because it sharpens the moat argument. 
In December 2025, Vercel published an engineering post-mortem: their internal text-to-SQL agent had 16 specialized tools - schema lookup, query validation, error recovery, intent clarification, join-path finders, syntax validators. They deleted 80% of them and replaced everything with a single capability: execute arbitrary bash commands against a file system.</p><p>Results: success rate went from 80% to 100%. Speed improved 3.5x. Token usage dropped 37%. The worst-case run improved from 724 seconds, 100 steps, and 145,463 tokens (failing) to 141 seconds, 19 steps, 67,483 tokens (succeeding).</p><p>The lesson is more than just that fewer tools are better. It is that the right harness depends on the model&#8217;s current capabilities, and those shift with every model generation. When the model gets smart enough to use bash + grep + cat directly, your custom tool layer becomes overhead. GitHub&#8217;s Copilot team hit the same wall from the opposite direction: they cut Copilot&#8217;s tool count from 40+ to 13 core tools, and pre-expansion accuracy jumped from 19% to 72%.</p><p>This is exactly why owning both the model and the harness is so valuable. You can retire scaffolding as your model gets smarter, faster than anyone else can. A third party building on top of your model is always reacting to your release notes. You are designing both sides.</p><p><strong>Where the harness wars are headed</strong></p><p>A few things I am watching that I think will define the next 12-18 months of this space:</p><ol><li><p><strong>The harness becomes the product, not the model.</strong> This is already happening. Anthropic does not sell &#8220;Claude&#8221; anymore as the main enterprise product &#8212; they sell Claude Code, Cowork, the Claude Agent SDK, and Claude Managed Agents. OpenAI sells Codex CLI, the Agents SDK, and the Codex cloud agent. 
Google just announced that they are deprecating Gemini CLI entirely and replacing it with Antigravity CLI, which shares a unified server-side harness with their Antigravity desktop IDE. The model is the engine; the harness is the car. Customers buy the car.</p></li><li><p><strong>&#8220;Harness-as-a-Service&#8221; (HaaS) is the next API layer.</strong> The Claude Agent SDK, the OpenAI Agents SDK, and Google&#8217;s new Antigravity SDK all point the same way: you do not get a model API anymore, you get a harness API &#8212; the loop, the tools, the context management, the hooks, the sandbox primitives, the memory layer, all out of the box. This is a bigger and stickier surface than &#8220;completions.&#8221;</p></li><li><p><strong>There are important implications for SaaS companies.</strong> The harness will absorb more of what is currently sold as SaaS. A managed harness with built-in code execution, browser control, file storage, memory, and orchestration is the runtime for AI-native software. That competes directly with a stack of SaaS tools previously bought separately: developer environments, browser automation, vector databases, observability layers, sandboxing services. Anthropic just shipped Claude Managed Agents, which puts the entire harness behind an API.</p></li><li><p><strong>Harnesses will increasingly diverge by vertical.</strong> The highest-quality harnesses today are all coding harnesses, because the ROI is most obvious. The same primitives apply to legal, finance, healthcare, and support. Whoever builds the best vertical harness for a domain will own that domain&#8217;s spend, even if the underlying model is shared. Providers that specialize and build harnesses for specific verticals can overcome even deficiencies in raw model performance because the harness is better. 
An example would be a company like Meta specializing in the shopping, social media, and healthcare verticals.</p></li></ol><p><strong>The investment angle</strong></p><p>This is where the moat argument actually matters for capital allocation: which sectors it affects, and which public companies it matters most for.</p>
      <p>
          <a href="https://www.uncoveralpha.com/p/the-harness-the-moat-for-ai-model">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Market Is Pricing Meta Like It's the AI Loser. Big Mistake.]]></title><description><![CDATA[In this article, I want to walk through why I think the market is making one of its bigger mistakes of this cycle with Meta. The stock is down roughly 24% from its high of $796.25, trading around $602, with a forward P/E of 19x &#8212; well below its 10-year average of around 26-27x and below the S&P 500 multiple. The dominant narrative is that Meta is &#8220;spending too much&#8221; on AI data centers without a clear path to monetization.]]></description><link>https://www.uncoveralpha.com/p/the-market-is-pricing-meta-like-its</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/the-market-is-pricing-meta-like-its</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Mon, 11 May 2026 14:14:50 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!E-vS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74d35c16-727c-4e21-b16e-dff47d7ed3b6_1408x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi everyone,</p><p>In this article, I want to walk through why I think the market is making one of its bigger mistakes of this cycle with Meta. The stock is down roughly 24% from its high of $796.25, trading around $602, with a forward P/E of 19x &#8212; well below its 10-year average of around 26-27x and below the S&amp;P 500 multiple. The dominant narrative is that Meta is &#8220;spending too much&#8221; on AI data centers without a clear path to monetization. 
Every quarter, the market punishes the stock harder on CapEx prints &#8212; the post-Q1 2026 reaction was a 10% drawdown on a 57% EPS beat, the worst price reaction Meta has had in its last six earnings reports, despite delivering the largest earnings surprise in that window.</p><p>The thesis I lay out has four parts:</p><p>- Meta&#8217;s AI CapEx is already showing up in revenue and engagement in a significant way and, more importantly, will continue to do so (supported by an expert interview with a former Meta employee in this field);</p><p>- The data centers are not &#8220;moonshots&#8221; (and far from the metaverse spend analogy) &#8212; they are Meta&#8217;s new workforce, and the more compute Meta has, the more it can improve products and ship new ones;</p><p>- Meta has the single most underappreciated asset in tech right now, which is distribution, and the market is assigning zero value to it;</p><p>- The valuation has compressed to a point where the bar for an aggressive re-rate is much lower than people think.</p><p>Let&#8217;s dive in.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!E-vS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74d35c16-727c-4e21-b16e-dff47d7ed3b6_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!E-vS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74d35c16-727c-4e21-b16e-dff47d7ed3b6_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!E-vS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74d35c16-727c-4e21-b16e-dff47d7ed3b6_1408x768.png 848w, 
https://substackcdn.com/image/fetch/$s_!E-vS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74d35c16-727c-4e21-b16e-dff47d7ed3b6_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!E-vS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74d35c16-727c-4e21-b16e-dff47d7ed3b6_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!E-vS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74d35c16-727c-4e21-b16e-dff47d7ed3b6_1408x768.png" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/74d35c16-727c-4e21-b16e-dff47d7ed3b6_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2503432,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/197218225?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74d35c16-727c-4e21-b16e-dff47d7ed3b6_1408x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!E-vS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74d35c16-727c-4e21-b16e-dff47d7ed3b6_1408x768.png 424w, 
https://substackcdn.com/image/fetch/$s_!E-vS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74d35c16-727c-4e21-b16e-dff47d7ed3b6_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!E-vS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74d35c16-727c-4e21-b16e-dff47d7ed3b6_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!E-vS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74d35c16-727c-4e21-b16e-dff47d7ed3b6_1408x768.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2>The CapEx that the market hates is already showing up in the P&amp;L</h2><p>Meta did $200.97 billion in revenue in 2025, up 22% YoY, with operating income of $83.28 billion, up 20%. In Q1 2026, they did $56.31 billion of revenue, up 33% YoY &#8212; an acceleration from a $200 billion base. Ad impressions grew 19% YoY and average price per ad grew 12% YoY in Q1 2026, with the Q2 guide of $58-61 billion implying continued acceleration.</p><p>Now think about what that means in context. The bear narrative is that Meta is spending $125-$145 billion in 2026 CapEx with nothing to show for it. But the company is putting up 33% growth on a $200 billion base while also compounding the underlying business with double-digit price-per-ad gains. When advertisers pay more per impression, and impression volume also grows 19%, that shows that the system is getting better at allocating attention to higher-value placements.</p><p>The AI in the P&amp;L is most visible in three specific areas that I want to walk through, because management actually quantified them on the calls.</p><p>First, the ad-ranking models. Meta has been rolling out a model called GEM (Generative Ads Recommendation Model), which is essentially their LLM-style foundation model for ads. On the Q4 2025 call, management said this directly:</p><div class="pullquote"><p><em>In Q4, we doubled the number of GPUs we used to train our GEM model for ads ranking. We also adopted a new sequence learning model architecture, which is capable of using longer sequences of user behavior and processing much richer information about each piece of content. 
The GEM and sequence learning improvements together grew a 3.5% lift in ad clicks on Facebook and a more than 1% gain in conversions on Instagram in Q4.</em></p></div><p>A 3.5% lift in ad clicks on a base of roughly $200 billion is multiple billions of incremental dollars, and that&#8217;s from one model improvement in one quarter.</p><p>Meta also said on the Q3 2025 call that GEM is now &#8220;4x more efficient at driving ad performance gains&#8221; compared to the original ranking models. This is exactly the scaling law dynamic you want to see &#8212; more compute thrown at the model translates to more revenue, and the elasticity of that conversion is improving, not deteriorating.</p><p>A <a href="https://www.alpha-sense.com/uncoveralpha/">recent interview</a> with a high-ranking former Meta employee who worked in this field was very useful. He explained some details about Meta&#8217;s internal metric called Internal Revenue per Engagement (iREV). According to him, Meta&#8217;s internal goal is a minimum 1.5-2% improvement of that metric every 6 months. So far, they have been hitting that goal, and he is very confident that Meta will keep hitting it, as there is a lot of room for improvement in three areas that affect iREV: model architecture, more data, and transfer learning. He even quantified it: 30-40% from model architecture changes, 20-30% from more data, and 30-40% from transfer learning. While model architecture and more data are quite straightforward, transfer learning is something many of you might not be familiar with. To explain the context as simply as possible: serving a large LLM at Meta&#8217;s scale is too expensive, so they had to figure out a structure with a teacher LLM (the big one) and a student LLM that distills knowledge from the teacher into a smaller model that is cheaper to run inference on at the scale Meta needs. 
So improving knowledge transfer between the teacher and the student model is key for Meta, as it will be for any other company running large-scale production workloads (the phase of AI adoption we are now in). Every time Meta makes improvements in this area, it translates to better ad performance and rankings, as performance is essentially determined by the quality of the student model.</p><p>What I found really insightful was his answer when asked what the biggest barrier is to improving ad ranking and performance even faster than 2% per 6 months:</p><div class="pullquote"><p>&#187;The biggest barrier, I would say it&#8217;s just the modeling evolution is so quick. We want to push the model architecture faster but we don&#8217;t really have the time to really do parameter tuning to find fast architecture for our data or for different use cases. Right now, it&#8217;s more like we have one safe home. That&#8217;s our goal at that time. We have the foundation model. The foundation model powers roughly four or five different orgs&#8217; ranking models.<br><br>Although it&#8217;s a foundation model, it&#8217;s a Wikipedia, knows everything, but still, how to find an optimal maybe adapter or optimization, tuning for each of the five use cases that are under its cloud. I think that&#8217;s one of the biggest challenges. We have to chase our targets and there are some ways to hit a target a little bit easier than really going deep into understanding this model, what the model does and what&#8217;s the best parameter.<br><br>Even what I said, what&#8217;s the teacher model capacity and the student model capacity ratio? What&#8217;s the optimal ratio between the two models? That even what was studied during my time there. 
I feel that at that time, the big challenge, we are just chasing the whole statistically or too aggressively and ignore those foundational things, all those long-term things a little bit less.&#171;</p><p>source: <a href="https://www.alpha-sense.com/uncoveralpha/">AlphaSense</a></p></div><p>What this means is that because model performance is moving so fast, Meta hasn&#8217;t had the time or resources internally to optimize the other levers. That shows just how early we still are in these improvements and how much more growth Meta&#8217;s core business can get from this ad AI tailwind, not just from scaling laws and model improvements but also from the data and the student-teacher mechanism.</p><p>Second, engagement. On the Q3 2025 call, Zuckerberg said:</p><div class="pullquote"><p><em>Across Facebook, Instagram, and Threads, our AI recommendation systems are delivering higher quality and more relevant content, which led to 5% more time spent on Facebook in Q3 and 10% on Threads. Video is a particular bright spot, with video time spent on Instagram up more than 30% since last year. As video continues to grow across our apps, Reels now has an annual run rate of over $50 billion.</em></p></div><p>A 5% lift in time spent on Facebook &#8212; a 21-year-old app that everyone wrote off as dead &#8212; is huge. The platform was supposed to be in decline. AI ranking systems brought it back. In Q4 2025, the optimizations drove a 7% lift in views of organic feed and video posts on Facebook, which Susan Li called &#8220;the largest quarterly revenue impact from Facebook product launches in the past two years&#8221;.</p><p>Third, the end-to-end AI ad tools. The annual run-rate of revenue going through Meta&#8217;s fully AI-powered ad tools (Advantage+) passed $60 billion in Q3 2025. The video generation tools alone hit a $10 billion run rate by Q4 2025, with quarter-over-quarter growth outpacing the broader ad revenue increase by nearly 3x. 
Click-to-WhatsApp ads grew revenue 60% YoY in Q3. None of this exists without the AI infrastructure that the market is currently punishing the company for building.</p><p>The way to think about this is the same logic I laid out in my <a href="https://www.uncoveralpha.com/p/the-market-hates-big-cloud-spending">February article</a> about the hyperscalers: the CapEx Meta is spending this year doesn&#8217;t show up in this year&#8217;s revenue. A data center takes around 2 years to build and operationalize, so the revenue acceleration Meta is showing today is the return on 2023 CapEx (~$28 billion), not 2025 CapEx ($72.2 billion). When 2025&#8217;s CapEx starts showing up in 2027 revenue, the operating leverage will be far more aggressive than what we&#8217;re seeing now.</p><h2>Data centers are Meta&#8217;s new workforce</h2><p>Here is the part I think most investors are missing, and it&#8217;s where the analogy needs to shift.</p><p>For the last 15 years, Meta&#8217;s growth engine was: hire engineers, ship product, get more users, monetize via ads. Headcount was the input that scaled output. That framework is now different, because the marginal unit of &#8220;intelligence&#8221; inside Meta is no longer an engineer. 
It&#8217;s a GPU.</p><p>Zuckerberg essentially said this on the Q3 2025 call when he framed Meta&#8217;s strategy around three &#8220;giant transformers&#8221; running Facebook, Instagram, and ads, with the goal of merging them into one unified system:</p><div class="pullquote"><p><em>At the same time, we&#8217;re also working on combining these three major AI systems into a single unified AI system that will effectively run our family of apps and business &#8212; using increasing intelligence to improve the trillions of recommendations that it will make for people every day.</em></p></div><p>Meta is openly saying that the entire company &#8212; the feeds, the ads, the recommendations across 3.56 billion daily users &#8212; is going to be run by AI systems whose performance scales with compute. The CapEx number is the headcount number for the AI era.</p><p>And here is the kicker &#8212; Meta is compute-starved on the current business. Zuckerberg said it directly on Q3 2025:</p><div class="pullquote"><p><em>We are sort of perennially operating the Family of Apps and ads business in a compute-starved state at this point, which is on the one hand sort of an odd thing to say, given the compute that we built up. But we really are taking a lot of the resources and using them to advance future things that we&#8217;re doing. And we think that there&#8217;s a lot more compute that we could put towards these that would just unlock a huge amount of opportunity in the core business as well.</em></p></div><p>Meta CFO Susan Li doubled down on this on the same call:</p><div class="pullquote"><p><em>We&#8217;re certainly seeing that we wish we had more capacity today than we do. 
We would be able to put it towards good use, certain not only would the MSL team appreciate having more capacity, but we&#8217;d be able to put it towards good and ROI positive use in the core business as well.</em></p></div><p>This is not a company building speculative infrastructure for products that might monetize in 2030. This is a company that has more profitable use cases for compute than it has compute, and is rationing GPU hours between training the next frontier model and improving the ranking systems that drive a quantifiable lift in conversions every quarter. Investors who treat this CapEx like a moonshot are misreading the situation.</p><h2>Meta&#8217;s distribution is slept on</h2><p>There is a more general thesis floating around in the market right now that says: &#8220;If AI commoditizes, then everyone with compute can build products, from software to something like a Meta platform.&#8221; I think this gets the second-order logic backward. If models commoditize and anyone with enough GPUs can ship a product, then the question becomes: who can get that product in front of users? Distribution becomes the bottleneck, not the model. And Meta&#8217;s distribution machine is arguably the single biggest in the world.</p><p>Meta&#8217;s Family of Apps had 3.56 billion daily active people in March 2026. Instagram crossed 3 billion monthly active users in September 2025. WhatsApp also has over 3 billion users across 180+ countries. Facebook still serves billions of people daily. That is unmatched at this scale anywhere in tech &#8212; Google has Search, but the engagement-per-user profile is fundamentally different (people come to Search, do a query, leave; people stay on Meta apps for 30+ minutes a day).</p><p>Here&#8217;s how that distribution muscle has shown up historically. Meta bought Instagram for $1 billion in 2012. It is now a 3-billion-MAU asset that is the cultural center of gravity for an entire generation. 
They bought WhatsApp for $19 billion in 2014. Its user base has since grown more than 6x, to over 3 billion users. They built Threads from scratch in mid-2023 &#8212; a product that, frankly, was not particularly differentiated from X &#8212; and rode Instagram&#8217;s social graph to 400 million MAUs and 150 million daily actives in roughly 2.5 years. Similarweb data shows Threads passed X in daily mobile active users in January 2026.</p><p>Threads was a clone. The product was almost identical to X. There was nothing technically novel about it. And in 2.5 years, by being plugged into Instagram&#8217;s distribution graph, it overtook a 19-year-old product with deep cultural roots. The question every investor should ask themselves: what other company on earth could have done that?</p><p>Now apply this to AI products. Meta AI hit 1 billion monthly active users by May 2025, doubling from 500 million in roughly 8 months, and let&#8217;s be honest, the product wasn&#8217;t even good. ChatGPT took roughly 2 years to reach similar scale. Meta did it by embedding the assistant into search bars and chat interfaces inside WhatsApp, Instagram, Facebook, and Messenger. Roughly 63% of Meta AI&#8217;s usage comes from WhatsApp alone. Meta did not need to convince anyone to download an app, learn a new interface, or change a habit. The distribution infrastructure was already there.</p><p>If you believe &#8212; and I do &#8212; that the next phase of AI is going to produce a wave of consumer products (AI-generated content, personalized AI assistants, business AI, voice agents, creator tools, AI shopping experiences), then the company that can ship each of those products to 3.56 billion people on day one has a structural advantage that the market is not pricing in. 
Zuckerberg said it himself on the Q3 2025 call:</p><div class="pullquote"><p><em>I would guess that Meta has the best track record of any company out there of taking a new product that people love and getting it to billions of people in terms of usage. So I think that the ability to plug in leading models is going to, I would predict, lead to a very large amount of use of these things over the coming years.</em></p></div><p>The market is essentially treating Meta as if distribution is free. It&#8217;s not free. It is the single hardest moat to build in consumer technology, and Meta is the only company that has built three of them in parallel (Facebook, Instagram, WhatsApp), then bolted on a fourth (Threads) using the first three as the launchpad.</p><div><hr></div><h2>The AI sentiment changes on a dime</h2><p>The market right now is running on AI sentiment more than fundamentals. Companies are being bucketed as &#8220;AI winners&#8221; or &#8220;AI losers&#8221;, and the valuation gaps between those buckets are enormous. The reason is that the marginal flows of capital in public markets are still controlled by investors with financial-domain expertise but a relatively shallow understanding of how AI actually works at the technical level. So the signal that gets weighted most heavily is: did this company ship a frontier model? Did they show up on the benchmark leaderboards?</p><p>This is exactly the gap that creates opportunity. Meta released Muse Spark on April 8 &#8212; the first model from Meta Superintelligence Labs (MSL). 
Muse Spark scored 52 on the Artificial Analysis Intelligence Index, behind Gemini 3.1 Pro and GPT-5.4 (both at 57) and Claude Opus 4.6. On absolute benchmark terms, it&#8217;s not SOTA.</p><p>But look at what it is good at. Muse Spark used only 58 million output tokens on the Intelligence Index evaluation, versus 157 million for Claude Opus 4.6 and 120 million for GPT-5.4 &#8212; meaning Meta is delivering near-frontier intelligence with less than half the inference compute of competitors. Meta also said the model achieves the same capability level as the older mid-size Llama 4 Maverick using an order of magnitude less compute. For a company that&#8217;s about to deploy this model to 3 billion daily users, inference efficiency at this scale is a multi-billion-dollar economic advantage. And the model is particularly strong in vision, health, and what Meta is calling &#8220;personal intelligence&#8221; use cases &#8212; exactly the domains that map onto consumer apps.</p><p>Now think about what happens when the larger frontier model lands. Meta has been working on a next-generation flagship, a bigger model internally named &#8220;Watermelon&#8221;. If Meta lands a model that is genuinely competitive on benchmarks with frontier models from Anthropic, OpenAI, and Google, the market will re-rate the company aggressively. The current setup is that Meta is being priced as if it can&#8217;t compete at the frontier. The forward P/E of 19x reflects that. Compare that to Google, which trades at a meaningfully higher multiple after the TPU/Gemini repricing in late 2025. The asymmetry is real. If Meta merely catches up to where the market is already pricing Google and labs like Anthropic and OpenAI, the implied upside is substantial &#8212; and that&#8217;s without assigning incremental value to the distribution moat or to any of the new AI products.</p><h2>The market hates the data center spend. 
Meanwhile, everyone else is desperate for compute.</h2><p>The market is currently assigning negative value to Meta&#8217;s data center buildout. Every time CapEx goes up, the stock goes down. Meta&#8217;s CapEx jumped from $39 billion in 2024 to $72 billion in 2025 to a guided $125-145 billion in 2026.</p><p>At the same time, the rest of the AI ecosystem is screaming that we don&#8217;t have enough compute. Anthropic just signed a deal to take all 300+ MW of compute capacity at xAI&#8217;s Colossus 1 data center in Memphis &#8212; roughly 222,000 Nvidia GPUs including H100, H200, and GB200 systems. That deal is worth billions. xAI has effectively pivoted to a neocloud model, renting GPUs to Anthropic. The CoreWeave and Nebius backlogs continue to grow. Oracle&#8217;s cloud business is capacity-constrained. AWS, Google Cloud, and Azure are all selling everything they have available and have multi-year backlogs.</p><p>So the market believes simultaneously that (a) there is a multi-year compute shortage that will continue at least through 2027, and (b) Meta is wrong to be building data centers, which deserve a negative value. These two beliefs cannot both be true. If there is a compute shortage, then in the worst case the terminal value of Meta&#8217;s buildout should be based on what those data centers would be worth if Meta sold that compute on the market. Zuckerberg made this point explicitly on the Q3 2025 call:</p><div class="pullquote"><p><em>To date, we keep on seeing this pattern where we build some amount of infrastructure to what we think is an aggressive assumption and then we keep on having more demand to be able to use more compute&#8230; any compute that we don&#8217;t need for that, we feel pretty good that we&#8217;re going to be able to absorb a very large amount of that to just convert into more intelligence and better recommendations in our Family of Apps and ads in a profitable way. 
Now, I mean, it&#8217;s of course possible to overshoot that, right&#8230; If we do, this is what I mentioned in my comments then we see that there&#8217;s just a lot of demand for other new things that we build internally, externally. Like almost every week, people come to us from outside the company asking us to stand up an API service or asking if we have different compute that they could get from us. And we haven&#8217;t done that yet, but obviously if you got to a point where you overbuilt, you could have that as an option.</em></p></div><p>The fact that the market is ignoring this and assigning a negative value to the CapEx is, in my view, a significant mistake.</p><p>Here&#8217;s why this matters even more: Meta is arguably the only company outside the three hyperscalers (AWS, Microsoft, Google) that has the operational capability to run hyperscale data centers for both training and inference at the level required. They&#8217;ve been operating planetary-scale infrastructure for over a decade. They know how to manage multiple types of AI accelerators: Nvidia GPUs, AMD GPUs, and custom ASICs (their MTIA). If Meta wanted to offer compute externally tomorrow, the renters lining up would include some of the largest AI labs and enterprises in the world, and the unit economics would look more like AWS than like a cap-on-cost neocloud.</p><h2>Valuation: What should the real value be?</h2><p>The company is </p>
      <p>
          <a href="https://www.uncoveralpha.com/p/the-market-is-pricing-meta-like-its">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Amazon, Google, Microsoft, Meta Q1 earnings: AI profits are here, custom silicon is winning]]></title><description><![CDATA[I argued that the market was wrong to punish big tech for raising CapEx because the returns on the 2023-2024 spend were already showing up in the P&L. This quarter, that argument got significantly stronger. Margins on cloud businesses expanded again, the core ad businesses at Meta and Google went into a higher gear]]></description><link>https://www.uncoveralpha.com/p/amazon-google-microsoft-meta-q1-earnings</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/amazon-google-microsoft-meta-q1-earnings</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Thu, 30 Apr 2026 11:59:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!oQvP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81438849-f526-4f23-a56b-a192a1657845_1408x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>We just got Q1 2026 earnings from Meta, Microsoft, Google, and Amazon, and after spending hours on the calls and the prints, I want to share what I think is the most important takeaway.</p><p>I argued that the market was wrong to punish big tech for raising CapEx because the returns on the 2023-2024 spend were already showing up in the P&amp;L. This quarter, that argument got significantly stronger. 
Margins on cloud businesses expanded again, the core ad businesses at Meta and Google went into a higher gear (Meta +33% YoY, Google Search +19% YoY - both fastest in years), and we got hard data on what custom silicon is doing to the unit economics of inference.</p><p>There are a few patterns from this earnings season:</p><ul><li><p>The core ad businesses at Meta and Google are accelerating because of AI, not in spite of it</p></li><li><p>Operating margins on cloud are expanding even as AI workloads scale</p></li><li><p>Custom ASICs are no longer a side project - they are the next big business segment</p></li><li><p>The era of &#8220;subsidized&#8221; compute is ending, and we are seeing the first hints of pricing power coming through</p></li><li><p>Compute supply is still the binding constraint</p></li></ul><p>Let&#8217;s get into it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oQvP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81438849-f526-4f23-a56b-a192a1657845_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oQvP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81438849-f526-4f23-a56b-a192a1657845_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!oQvP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81438849-f526-4f23-a56b-a192a1657845_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!oQvP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81438849-f526-4f23-a56b-a192a1657845_1408x768.png 1272w, 
https://substackcdn.com/image/fetch/$s_!oQvP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81438849-f526-4f23-a56b-a192a1657845_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oQvP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81438849-f526-4f23-a56b-a192a1657845_1408x768.png" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/81438849-f526-4f23-a56b-a192a1657845_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2346019,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/195985479?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81438849-f526-4f23-a56b-a192a1657845_1408x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oQvP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81438849-f526-4f23-a56b-a192a1657845_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!oQvP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81438849-f526-4f23-a56b-a192a1657845_1408x768.png 848w, 
https://substackcdn.com/image/fetch/$s_!oQvP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81438849-f526-4f23-a56b-a192a1657845_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!oQvP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81438849-f526-4f23-a56b-a192a1657845_1408x768.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Google: Cloud +63%, Search +19%, And The Margin Story</strong></p><p>Google delivered a really strong quarter. 
Two key numbers: Google Cloud up 63% YoY and the &#8220;old left for dead&#8221; Search up 19% YoY.</p><p>Google Cloud revenue hit $20.0B, growing 63% YoY, an acceleration from the 48% growth we saw in Q4 2025. To put that in context, this is the fastest growth rate Google Cloud has ever posted, and they are doing it on a $70B+ annual run-rate base. Google&#8217;s Cloud backlog also nearly doubled sequentially to $462B, with management telling us:</p><div class="pullquote"><p>&#187;The majority of the backlog is related to typical GCP contracts and we expect to recognize just over 50% of the backlog as revenue over the next 24 months&#171;</p></div><p>That last part is important. A lot of the bear case on backlog numbers is that they are stuffed with one mega-deal that won&#8217;t translate into revenue for years (the Microsoft/OpenAI dynamic). Google is essentially telling investors: half of $462B is coming through the P&amp;L in the next 8 quarters.</p><p>But the bigger story for me on the Google call was the margin commentary on Cloud:</p><div class="pullquote"><p>&#187;Cloud operating income was $6.6 billion, tripling year-over-year and operating margin increased from 17.8% in the first quarter of last year to 32.9%&#171;</p></div><p>The bear thesis on hyperscaler AI workloads has been: &#8220;Yes, revenue is growing, but margins on AI workloads will be lower than legacy cloud workloads&#8221;. If that thesis were correct, you would expect to see operating margin compression at Google Cloud, given that AI workloads are now the majority of new revenue growth there. Instead, we saw the opposite: operating margin nearly doubled in a single year, from 17.8% to 32.9%. It is important to understand that not all of this is direct GCP infrastructure; a lot of it now comes from Gemini products:</p><div class="pullquote"><p>&#187;Our enterprise AI solutions have become our primary growth driver for cloud for the first time. 
In Q1, revenue from products built on our GenAI models grew nearly 800% year-over-year.&#171;</p></div><p>The part of Cloud that is growing fastest (GenAI products, +800% YoY) is also the part driving the margin expansion. Google can do this for three reasons: scale-driven optimization (Google said earlier this year that it reduced Gemini&#8217;s serving cost by 78% in 2025), custom silicon (TPUs), and owning its own frontier model.</p><p>On TPUs, we got another important update:</p><div class="pullquote"><p>&#187;we&#8217;ll begin to deliver TPUs to a select group of customers in their own data centers in the hardware configuration to expand our addressable market opportunity.&#171;</p></div><p>This is a massive strategic shift that I have been writing about for over a year now. Google is essentially saying: the TPU is so valuable to certain customers (think Anthropic, but also potentially Apple, Meta, etc.) that we will sell or lease the chips outside of GCP. This is the same direction Amazon is going with Trainium. The implication is that the cloud providers are starting to compete with NVIDIA directly at the chip layer, not just at the cloud layer.</p><p>And on Search, the &#8220;Search is dead&#8221; narrative is looking more and more distant:</p><div class="pullquote"><p>&#187;Turning to Search. AI continues to drive search usage, and queries are at an all-time high&#171;</p></div><p>19% YoY growth in Search advertising on a $240B+ annual run rate is, candidly, an absurd number.</p><p><strong>Microsoft: The AI Business Hit $37B ARR, Up 123%, And Margins Are Holding</strong></p><p>Microsoft printed a strong quarter, and one of the most important numbers was this:</p><div class="callout-block" data-callout="true"><p>&#187;Our AI business surpassed $37 billion ARR, up 123%.&#171;</p></div><p>Just on this number alone: Microsoft&#8217;s AI business is now larger than ServiceNow, Workday, or Datadog as standalone businesses. And it is growing at 123% YoY. 
There is no public software company at scale growing faster.</p><p>But the more important nuance came from the management commentary on margins, which is essentially the same story Google told. From Amy Hood:</p><div class="pullquote"><p>&#187;Thanks, Brent. I think -- we&#8217;ve been talking about sort of where this AI business of ours has been in the cycle compared to even the cycle we saw with the cloud, which now seems very long ago. And how margins were actually better and they remained better in our AI business versus where we saw in the cloud transition, looking back.&#171;</p></div><p>I want to highlight this because it is being completely missed by the market. Microsoft is telling us that AI workload margins are BETTER than the early cloud workload margins were.</p><p>On Copilot, the seat numbers continued to ramp and were much better than the quarter before:</p><div class="pullquote"><p>&#187;In knowledge work, it was another record quarter for Microsoft 365 Copilot seat ads, which increased 250% year-over-year, representing our fastest growth since launch. Quarter-over-quarter, we continue to see acceleration and now have over 20 million Microsoft 365 Copilot paid seats. The number of customers with over 50,000 seats quadrupled year-over-year and Accenture now has over 740,000 seats, our largest Copilot win to date.&#171;</p></div><p>That is 20M paid seats up from 15M last quarter (~33% sequential growth) and 250% YoY. At a $30/user/month list price, that is a roughly $7B+ ARR business just from M365 Copilot.</p><p>Engagement on Copilot is the other piece of the story:</p><div class="pullquote"><p>&#187;We have seen a surge in usage of our first-party agents with monthly active usage up 6x year-to-date. Copilot queries per user were up nearly 20% quarter-over-quarter. 
To put this momentum in perspective, weekly engagement is now at the same level as Outlook, as more and more users make Copilot a habit.&#171;</p></div><p>Weekly engagement at the level of Outlook is a wild data point.</p><p>GitHub Copilot also continues to scale:</p><div class="pullquote"><p>&#187;We see this even with GitHub Copilot. Nearly 140,000 organizations now use GitHub Copilot and enterprise subscribers have nearly tripled year-over-year.&#171;</p></div><p>And here is where it gets interesting - Microsoft is moving GitHub Copilot to a usage-based pricing model:</p><div class="pullquote"><p>&#187;And earlier this week, we announced our move to usage-based pricing model for GitHub Copilot as we align pricing to actual usage and cost&#171;</p><p>&#187;Microsoft Cloud gross margin percentage should be roughly 64%, down year-over-year, driven by continued investments in AI and increased GitHub Copilot usage. Just this week, we announced a business model transition in GitHub Copilot that will align pricing with usage and value that takes effect on June 1 of this year.&#171;</p></div><p>Microsoft is admitting that GitHub Copilot adoption is now so heavy that it is dragging down Microsoft Cloud gross margins because they were charging a flat per-seat fee while costs scale with usage. So they are switching to usage pricing on June 1. This is a small story right now, but it is a leading indicator for the entire industry: per-seat pricing on AI products is going to be replaced with consumption pricing, because the cost-to-serve scales with intensity of use, not with seat count. The bigger investor takeaway is what management said elsewhere on the call:</p><div class="pullquote"><p>&#187;Bookings growth was impacted by weaker renewals as customers balance spend between the traditional per-seat and the emerging seats-plus-consumption model.&#171;</p></div><p>So enterprises are already adjusting their procurement around this. 
Hybrid pricing models are becoming the standard.</p><p>On capacity, Microsoft was, again, very direct:</p><div class="pullquote"><p>&#187;Even with these additional investments and continued efforts to bring GPU, CPU and storage capacity online faster, we expect to remain constrained at least through 2026. Despite these constraints, and the continued need to balance incoming supply, we expect Azure growth to show modest acceleration in the second half of the calendar year compared with the first half.&#171;</p></div><p>And then this:</p><div class="pullquote"><p>&#187;I think in so many ways, this just reminds us of the last cycle. And when the TAM is so expansive and when shortages are generally, I think, growing seems to be the sentiment between supply and demand&#171;</p></div><p>The &#8220;supply shortage growing&#8221; line is important because it is the third quarter in a row where Microsoft has said this.</p><p>The other big tell came on Foundry, Microsoft&#8217;s model marketplace:</p><div class="pullquote"><p>&#187;Over 10,000 customers have used more than one model on Foundry. 5,000 have used open source models, and the number who have used Anthropic and OpenAI models increased 2x quarter-over-quarter.&#171;</p><p>&#187;The majority of users leverage multiple models.&#171;</p></div><p>Note the Anthropic mention here. Microsoft is now a meaningful Anthropic distribution channel. This matters because a year ago, the bull case for Microsoft was &#8220;Microsoft + OpenAI.&#8221; Today, the company is hosting Anthropic, OpenAI, and open-source models in the same product. The orchestration layer, not the model, could become the moat, and Microsoft could benefit from it.</p><p>The CFO also gave us perhaps the most important signal for what to expect in fiscal Q4 (calendar Q2):</p><div class="pullquote"><p>&#187;We&#8217;re guiding for that to be better again in Q4. I think that&#8217;s where you&#8217;re starting to see, right? 
I think the thing that investors have been asking and Mark, you&#8217;re asking about is when we&#8217;ll start to see that show up in revenue growth. And I think that&#8217;s the first place you point to. We can also point to it, and I think you&#8217;ll start to see it in GitHub, right, where you see revenue growth rates and usage consumption models result in acceleration in the top line.&#171;</p></div><p>In other words, the next two quarters are when Microsoft expects AI products to start showing up as accelerating revenue, not just bookings.</p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><p><strong>Amazon: AWS Reaccelerates To 28% (Fastest In 15 Quarters), And The Chips Business Is Now Top-3 In The World</strong></p><p>AWS, which a lot of people had been writing off as the &#8220;loser&#8221; of the cloud AI race, posted its strongest growth rate since 2022:</p><div class="pullquote"><p>&#187;Starting with AWS, growth continued to accelerate, up 28% year-over-year, the fastest growth rate in 15 quarters, up $2 billion quarter-over-quarter, the largest Q4 to Q1 AWS revenue increase ever. AWS is now a $150 billion annualized revenue run rate business&#171;</p></div><p>So AWS is now growing 28% YoY (vs 24% last quarter) on a $150B run-rate base. 
The sequential add of $2B is the largest Q4-to-Q1 increase in AWS history.</p><p>Bedrock is the key to it:</p><div class="pullquote"><p>&#187;high-performance inference with the leading selection of frontier models in Bedrock, which saw 170% growth in customer spend quarter-over-quarter and processed more tokens in Q1 than all prior years combined.&#171;</p></div><p>Bedrock processed more tokens in Q1 2026 than in all prior years combined. Customer spend on Bedrock grew 170% QoQ. There has been a real perception in the market that AWS was behind on the AI inference story, and that is just no longer true based on these numbers.</p><p>Second, Trainium. This is the segment of the call that I think investors really need to focus on, because it is reshaping the unit economics of AWS:</p><div class="pullquote"><p>&#187;We saw nearly 40% quarter-over-quarter growth in Q1, and our annual revenue run rate is now over $20 billion and growing triple-digit percentages year-over-year, but this somewhat masks the size. If our chips business was a stand-alone business and sold chips produced this year to AWS and other third parties as other leading chip companies do, our annual revenue run rate would be $50 billion. As best as we can tell, our custom silicon business is now one of the top 3 data center chip businesses in the world, the speed at which we&#8217;ve gotten here is extraordinary. And we have momentum.&#171;</p></div><p>Jassy already mentioned this in his recent shareholder letter, and I have written about it, but still, Amazon is now a top-3 data center chip business globally. 
That is a statement most investors are not pricing in, because they think of Amazon as a retailer + cloud company, not a chip company.</p><p>And the demand signal on Trainium specifically is just enormous:</p><div class="callout-block" data-callout="true"><p>&#187;And we now have over $225 billion in revenue commitments for Trainium.&#171;</p></div><p>$225B in commitments just for Trainium is insane and was the most shocking number for me among all the earnings calls yesterday.</p><div class="pullquote"><p>&#187;Amazon Bedrock, which is used expansively by over 125,000 customers, runs most of its inference on Trainium and almost 80% of the Fortune 100 companies are using Bedrock.&#171;</p><p>&#187;While the largest number of AI chips we&#8217;re bringing in are Trainium, we continue to have a deep partnership with NVIDIA&#171;</p></div><p>The fact that AWS is now bringing in more Trainium chips than NVIDIA chips on a unit basis is also an important shift. It is going to flow through to AWS margins over time. The CFO basically said this directly:</p><div class="pullquote"><p>&#187;Different companies will offer different benefits for customers and the uniquely strong price performance that Trainium offers is compelling to our external and internal customers. For perspective, at scale, we expect Trainium will save us tens of billions of dollars of CapEx each year and provide several hundred basis points of operating margin advantage versus relying on others&#8217; chips for inference.&#171;</p></div><p>Several hundred basis points of operating margin advantage. AWS operating margin in Q1 was already strong at 37.7% (up from 35% the last quarter), and management is telling us Trainium adds several hundred more bps over time. This is exactly the same dynamic Google has been showing with TPUs, where Google Cloud operating margin went from 17.8% to 32.9% YoY. 
The custom silicon advantage will be crucial in the coming years.</p><p>The other under-appreciated comment was on the relationship between AI and core cloud spend:</p><div class="pullquote"><p>&#187;We continue to see customers increase cloud migrations and scale their use of AWS core services. Customers seeking the full benefit of AI are accelerating their transition to the cloud. We also see a strong correlation between AI spend and core growth. As customers spend more on AI, we see a corresponding demand increase in core.&#171;</p></div><p>This is the &#8220;AI lock-in&#8221; effect I have written about before. Companies that adopt AI workloads end up moving more of their non-AI workloads to the cloud as well, because the data needs to be co-located. This is why core (non-AI) cloud spend is also accelerating - which is something a lot of people miss when they look at AWS AI growth in isolation.</p><p>Backlog at AWS:</p><div class="pullquote"><p>&#187;On the backlog, the backlog for Q1 is $364 billion. That does not include the recent deal that we announced with Anthropic for over $100 billion. There&#8217;s reasonable breadth in that as well. It&#8217;s not just 1 customer or 2 customers.&#171;</p></div><p>$364B in backlog ex-Anthropic deal. The &#8220;reasonable breadth&#8221; comment is important - this is not a one-customer book like Microsoft&#8217;s relationship with OpenAI. The AWS book is more diversified.</p><p>And on selling chips outside the cloud, similar to the Google TPU pivot:</p><div class="pullquote"><p>&#187;On the question about Trainium and the notion of our selling racks over time, I do think that&#8217;s very much a possibility. Always, we have to balance -- we have such demand right now for Trainium, and we have such demand from various companies who will consume as much as we make that we have to decide how much we&#8217;re going to allocate to the existing demand and customers and how much we&#8217;re going to save to sell as racks. ... 
But I expect over time, there&#8217;s a good chance we&#8217;re going to sell racks over the next couple of years.&#171;</p></div><p>The most important framing on AI workloads from the entire AWS call:</p><div class="pullquote"><p>&#187;Most of the value companies derive from AI will be through agents&#171;</p></div><p>The chatbot era was really just us trying to use a new technology in the way we already knew (information retrieval, as with Google Search). Agentic AI - where models execute multi-step workflows on behalf of users autonomously - is where the real economic unlock of LLMs lies, and it requires orders of magnitude more compute per task. We are still very early in this shift.</p><p><strong>Meta: Revenue Up 33%, The Fastest In 4 Years, Because The Ad Engine Is An AI Engine Now</strong></p><p>Revenue growth of 33% is the fastest in the last 4 years.</p><p>$56.3B in Q1 revenue, +33% YoY. To put that in context, in Q1 2025, Meta grew 16%. So the company has roughly doubled its growth rate in 12 months, on a $200B+ annualized base. That is essentially impossible without a structural change in the underlying engine, and the structural change is that Meta&#8217;s ad ranking and content recommendation systems are now LLM-scale.</p><p>On the engagement side:</p><div class="pullquote"><p>&#187;On Instagram, the ranking improvements that we made in Q1 drove a 10% lift in reel time spent.&#171;</p><p>&#187;On Facebook, total video time increased more than 8% globally in Q1, the largest quarter-over-quarter gain in 4 years. Within the U.S. and Canada, ranking improvements we made drove a 9% increase in video watch time on Facebook in Q1.&#171;</p></div><p>Note the Facebook number specifically. Facebook is a 20-year-old product, and it is posting the largest QoQ gain in video time in 4 years. That doesn&#8217;t happen organically. 
That happens because Meta deployed a new ranking model that materially improves the relevance of what users see.</p><p>The under-appreciated piece here is what Zuck said about how ad ranking actually works now:</p><div class="pullquote"><p>&#187;In the second half of last year, we began rolling out our new adaptive ranking model, which is an LLM scale adds recommender model that we use for inference. This model improves our inference ROI by routing requests to more compute-intensive inference models when it determines there is a higher probability of conversion.&#171;</p></div><p>And then this longer segment, which I think is one of the most important pieces of technical commentary:</p><div class="pullquote"><p>&#187;Historically, we haven&#8217;t used larger model architectures like GEM for inference, because their size and complexity would make them too cost prohibitive. And the way we drive performance from those models is by using them to transfer knowledge to smaller, more lightweight models that are used at run time. The inference models are bound by strict latency requirements since they need to find the right ad within milliseconds, and that has, again, historically prevented us from meaningfully sizing up -- scaling up their size and complexity. But in the second half of last year, we introduced a new adaptive ranking model, which enables us to leverage LLM scale model complexity of 1 trillion parameters, and we made advances in the model architecture and codesigned the system with the underlying silicon, so it maintains the sub-second speed that is required to serve ads at scale. 
We also developed an approach that intelligently routes requests more compute-intensive inference models if it determines that there is a higher probability of conversion and that lets us drive both better performance and increase inference ROI.&#171;</p></div><p>This is important for the whole industry.</p><p>For years, Meta could not use LLM-scale models (think GPT-style models with hundreds of billions to trillions of parameters) for ad ranking because the latency would be too high. When you load a Facebook feed, the ad selection has to happen in well under 100ms. LLM-scale models can take seconds to respond, which is way too slow.</p><p>What Meta has now done is two things: (1) they have re-architected the model and co-designed it with their custom silicon (this is the Broadcom-partnered ASIC Zuck referenced) so that a 1 trillion parameter model can run within sub-second latency, and (2) they have built a &#8220;router&#8221; that decides when to use the big expensive model vs. the small cheap one, based on the predicted probability that the ad will convert. So if you are clearly not going to click on an ad about a car, Meta won&#8217;t waste compute running the big model on you. But if you are showing strong purchase intent, they will use the most expensive model to find the absolute best ad to show you.</p><p>This is a fundamental change in the unit economics of advertising. It is why the average price per ad on Meta increased 12% YoY while ad impressions grew 19% YoY - Meta is showing more ads AND each ad is more valuable.</p><p>Now, on the macro AI strategy, Zuck is doubling down. From the call:</p><div class="pullquote"><p>&#187;we are increasing our infrastructure CapEx forecast for this year. 
Most of that is due to higher component costs, particularly memory pricing, but every sign that we&#8217;re seeing in our own work and across the industry gives us confidence in this investment.&#171;</p></div><p>The memory pricing call-out is something I have written about extensively. HBM is sold out through 2026, prices are spiking, and Meta is essentially saying: yes, our CapEx is going up, but it is partly because the price of the input is going up, not because we are buying more capacity than we expected.</p><p>On the custom ASIC story:</p><div class="pullquote"><p>&#187;That said, we are very focused on increasing the efficiency of our investments, and as part of that, we are rolling out more than 1 gigawatt of our own custom silicon that we&#8217;re developing with Broadcom, as well as a significant amount of AMD chips to complement the new NVIDIA systems that we&#8217;re rolling out as well.&#171;</p></div><p>A gigawatt of custom silicon is meaningful - that is a very serious deployment. This is the same playbook Google ran with TPUs and Amazon ran with Trainium. Meta is now in the custom silicon game in a real way, with Broadcom as the partner. That has implications both for Meta&#8217;s long-term margin profile and for Broadcom&#8217;s revenue trajectory.</p><p>On the broader AI investment thesis from Zuck:</p><div class="pullquote"><p>&#187;So you&#8217;re getting to a point where today, the models are still able to learn from people and then I think at some point, the models will have to improve themselves. And that&#8217;s how the growth is going to -- an improvement in the models is going to happen. And if you don&#8217;t -- if we don&#8217;t have an ability to do that, then we or anyone else, I think the companies that don&#8217;t do that are not going to be leading labs, then they&#8217;re not going to produce leading products. 
So I think that, that&#8217;s like -- that is a table stakes thing that we are focused on.&#171;</p><p>&#187;but then the model improvement, I think, is going to be something that&#8217;s going to go on for a very long time.&#171;</p></div><p>Zuck&#8217;s view that model improvement continues for a long time is a meaningful statement against the &#8220;scaling laws are over&#8221; thesis. He is essentially saying the opposite - he sees a long runway of improvement, and Meta is committing capital accordingly.</p><p>One thing I do want to flag for investors is the nuance of how Zuck is positioning Meta&#8217;s AI strategy versus other labs:</p><div class="pullquote"><p>&#187;That&#8217;s why we believe that we need to be a company that builds frontier models in addition to building the agents. And then in order to do that, you, of course, need to build your infrastructure in order to be able to do that well. So we&#8217;re undertaking this large investment to be able to do that top to bottom.&#171;</p><p>&#187;But I don&#8217;t hear any other labs out there talking about how they&#8217;re building an AI that&#8217;s really good at shopping. And I think that the reason for that is like not because shopping is the most important thing by itself, but because like empowering people to do the things that matter in their lives, whether that&#8217;s local or understanding social context, or shopping or personal health things or understanding what&#8217;s going on around them visually&#171;</p></div><p>This is Meta differentiating its AI strategy from OpenAI/Anthropic. Those labs are building horizontal foundation models. Meta is building vertical AI that is really good at the things people do on Meta&#8217;s surfaces - shopping, social, content discovery. 
It is a different bet, and arguably a much more defensible one given Meta&#8217;s distribution.</p><p><strong>The Pattern: AI Workloads Are Now Margin-Accretive At Scale, And Custom Silicon Is The Reason</strong></p><p>If I step back and try to find the single most important pattern across all four prints, here it is:</p><p>The bear case on hyperscaler AI spending - that AI workloads have structurally lower margins than legacy cloud workloads, and therefore CapEx returns will be poor - has been wrong.</p><p>Look at the hard data:</p><ul><li><p>Google Cloud operating margin: 17.8% Q1 2025 &#8594; 32.9% Q1 2026 (+1,510bps YoY)</p></li><li><p>AWS Q1 operating margin: 31.4% (Q1 2025) &#8594; 37.7% (Q1 2026)</p></li><li><p>Microsoft saying explicitly &#8220;margins were actually better and they remained better in our AI business versus where we saw in the cloud transition&#8221;</p></li><li><p>Amazon saying Trainium gives them &#8220;several hundred basis points of operating margin advantage versus relying on others&#8217; chips for inference&#8221;</p></li></ul><p>One of the structural reasons margins are holding up is custom silicon: TPUs and Trainium. The cloud providers have figured out that they can&#8217;t allow NVIDIA to keep 75% gross margins on the most valuable workloads of the next decade, so they are vertically integrating into chips.</p><p>The other pattern is that AI is a significant core growth driver in the ad business at Meta and Google.</p><p>Both companies now run their ad ranking on LLM-scale models with adaptive routing. Both companies are showing direct evidence that AI-driven ranking improvements are translating into both more ad inventory consumed (impressions up) and higher revenue per impression (price per ad up). The ad business is no longer a separate thing from the AI business - the ad business is the AI business now.</p><p>For Microsoft and Amazon, the equivalent flywheel is happening in cloud + agents. 
Microsoft&#8217;s AI ARR hit $37B at +123% YoY. Amazon&#8217;s Bedrock token volume in Q1 alone exceeded all prior years combined.</p><p><strong>The Bigger Picture: The Era Of Subsidized Compute Is Ending</strong></p><p>A theme I want to leave you with, because I think it is the most important macro shift for the next 12-18 months:</p><p>It really comes down to what the end cost of intelligence is and how much companies are willing to pay for it. The whole industry would benefit from an architectural change that would bring new efficiencies to serving these models. It&#8217;s not about who has the best model but who can solve the economic task at the lowest cost (across hardware, cloud infra, models, and harness). The era of subsidizing compute is over - you can even see it from GitHub. Companies will have to shift budgets from other OpEx items towards compute. The moment is here, and this will only increase as Anthropic and OpenAI go public and have to produce &#187;passable&#171; gross margins.</p><p>The GitHub Copilot pricing change Microsoft announced is the canary in the coal mine here. You can&#8217;t have flat per-seat pricing for a product whose cost-to-serve scales with usage intensity. Either prices go up for heavy users or the seller bleeds gross margin. Microsoft chose to raise prices on the heavy users via consumption pricing. Anthropic is doing the same thing - their API pricing has been moving in this direction, and Claude Code&#8217;s heavy users have been hitting rate limits and seeing throttling for months now. The only real question, which I believe we will get the answer to soon, is how much value end users get and how much they are willing to pay for it.</p><p>As always, I hope you found this article valuable. I would appreciate it if you could share it with people you know who might find it interesting. 
I also invite you to become a paid subscriber, as paid subscribers get additional articles covering both big tech companies in more detail, as well as mid-cap and small-cap companies that I find interesting.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/subscribe&quot;,&quot;text&quot;:&quot;Subscribe to Paid&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/subscribe"><span>Subscribe to Paid</span></a></p><p>Thank you!</p><p><strong>Disclaimer:</strong></p><p>I own Google (GOOGL), Amazon (AMZN), Microsoft (MSFT), Meta (META) stock.</p><p>Nothing contained in this website and newsletter should be understood as investment or financial advice. All investment strategies and investments involve the risk of loss. Past performance does not guarantee future results. Everything written and expressed in this newsletter is only the writer&#8217;s opinion and should not be considered investment advice. Before investing in anything, know your risk profile and if needed, consult a professional. 
Nothing on this site should ever be considered advice, research, or an invitation to buy or sell any securities.</p>]]></content:encoded></item><item><title><![CDATA[Q1 2026 Channel Checks & Alternative Data: Cloud is on Fire]]></title><description><![CDATA[For this report, I covered cloud providers Google, Microsoft, and Amazon in terms of their cloud business, and some insightful signals on Microsoft Copilot, which is a pressure point for Microsoft.]]></description><link>https://www.uncoveralpha.com/p/q1-2026-channel-checks-and-alternative</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/q1-2026-channel-checks-and-alternative</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Fri, 24 Apr 2026 12:02:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!4VQ9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a4daaec-27a8-4395-a938-9dc63c6d872e_682x441.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>I am posting my regular channel check &amp; other alternative data report before we start the big tech earnings.</p><p>For this report, I covered cloud providers Google, Microsoft, and Amazon in terms of their cloud business, and some insightful signals on Microsoft Copilot, which is a pressure point for Microsoft.</p><p>Let&#8217;s dive in.</p><p><strong>Cloud is on fire</strong></p><p>Let&#8217;s start with my alt data on the most relevant channel-check interviews from clients, cloud consultants, integrators, and former employees.</p><p>When bulk analyzing these interviews, demand is high for Q1 2026 across all three hyperscalers: AWS, Azure, and GCP. Looking at the future demand pipeline for the next 3-6 months, the results are even more impressive: almost 60% of these experts see demand exceeding their expectations, 26% see it meeting expectations, and 15% see it below expectations. 
Keep in mind that expectations were already high going into this year and quarter, so with almost 60% of experts seeing higher-than-expected demand, the signal is very strong. The driver of demand, as expected, is AI workloads and the move from test to production environments, especially with Agentic AI starting to roll out.</p><p>Going deeper, let&#8217;s look at the individual hyperscaler level and what the data shows. As always, I broke down the percentage of experts who think AWS, Azure, or GCP is accelerating the fastest. Here are the results:</p><p>62% think GCP is growing the fastest, 41% think Azure is growing the fastest, and 27% think AWS is growing the fastest (important note: the sum is greater than 100% because some experts mentioned two platforms as growing at a faster pace than the third).</p><p>Now, this data doesn&#8217;t add much value until we compare it to my historical data from past quarters, as we did in the last reports, to truly understand whether anything shifted significantly in Q1. Here is the data:</p>
      <p>
          <a href="https://www.uncoveralpha.com/p/q1-2026-channel-checks-and-alternative">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Left for Dead on AI, Meta and Amazon Are About to Have the Last Laugh]]></title><description><![CDATA[I break down some significant fundamental shifts when it comes to AI efforts from Amazon and Meta, and why I think both are the next two big AI beneficiaries. Based on what we are seeing, both companies are on a path to reaccelerating their efforts, while still perceived by the market as &#8220;AI laggards&#8221;. We believe this premise will be proven wrong in the coming months.]]></description><link>https://www.uncoveralpha.com/p/left-for-dead-on-ai-meta-and-amazon</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/left-for-dead-on-ai-meta-and-amazon</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Fri, 17 Apr 2026 13:01:57 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Jhj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88752802-60c3-4379-ac50-7468e34b7f81_1408x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi everyone,</p><p>In this article, I break down some significant fundamental shifts when it comes to AI efforts from Amazon and Meta, and why I think both are the next two big AI beneficiaries. Based on what we are seeing, both companies are on a path to reaccelerating their efforts, while still perceived by the market as &#8220;AI laggards&#8221;. We believe this premise will be proven wrong in the coming months.</p><p>Let&#8217;s start.</p>
      <p>
          <a href="https://www.uncoveralpha.com/p/left-for-dead-on-ai-meta-and-amazon">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Era of Subsidized AI Model Usage is Over, the IPOs are coming]]></title><description><![CDATA[There are four interconnected themes I want to walk through today, and they all converge on a single conclusion: the era of subsidizing AI model usage is coming to an end.]]></description><link>https://www.uncoveralpha.com/p/the-era-of-subsidized-ai-model-usage</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/the-era-of-subsidized-ai-model-usage</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Fri, 10 Apr 2026 14:38:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Qcs3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f14f027-f39a-445a-8dd1-62668b239c10_1408x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>The AI industry is approaching an inflection point that will reshape the priorities of AI model companies and the entire space. 
There are four interconnected themes I want to walk through today, and they all converge on a single conclusion: the era of subsidizing AI model usage is coming to an end.</p><p>Here&#8217;s what I cover in this article:</p><ul><li><p>Anthropic is taking over the enterprise &#8212; but the curse of the best model is real</p></li><li><p>OpenAI is losing the enterprise race to Anthropic and facing structural problems heading into its IPO</p></li><li><p>The era of subsidized AI model usage is ending as both companies prepare for public markets</p></li><li><p>The IPO race: who lists first matters more than most people realize</p></li></ul><p>Let&#8217;s get into it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qcs3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f14f027-f39a-445a-8dd1-62668b239c10_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qcs3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f14f027-f39a-445a-8dd1-62668b239c10_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!Qcs3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f14f027-f39a-445a-8dd1-62668b239c10_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!Qcs3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f14f027-f39a-445a-8dd1-62668b239c10_1408x768.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Qcs3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f14f027-f39a-445a-8dd1-62668b239c10_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qcs3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f14f027-f39a-445a-8dd1-62668b239c10_1408x768.png" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f14f027-f39a-445a-8dd1-62668b239c10_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2176992,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/193796773?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f14f027-f39a-445a-8dd1-62668b239c10_1408x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Qcs3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f14f027-f39a-445a-8dd1-62668b239c10_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!Qcs3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f14f027-f39a-445a-8dd1-62668b239c10_1408x768.png 848w, 
https://substackcdn.com/image/fetch/$s_!Qcs3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f14f027-f39a-445a-8dd1-62668b239c10_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!Qcs3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f14f027-f39a-445a-8dd1-62668b239c10_1408x768.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><strong>Anthropic Is Taking Over Enterprise: And the Curse of the Best Model Is Coming for Them</strong></p><p>Anthropic 
announced that its annualized revenue had surpassed $30 billion, up from $9 billion at the end of 2025. That&#8217;s more than tripling in roughly four months. Anthropic has now supposedly surpassed OpenAI&#8217;s run-rate revenue of approximately $25B.</p><p>The enterprise composition is what separates Anthropic from the rest. Approximately 80% of revenue comes from business customers. The number of customers spending over $1 million annually has doubled to more than 1,000, up from just over 500 two months ago. Business subscriptions to Claude Code have quadrupled since the start of 2026.</p><p>And then there&#8217;s Mythos. Just a few days ago, Anthropic announced Claude Mythos Preview, a new general-purpose model that sits in an entirely new tier above Opus. The draft described it as &#8220;by far the most powerful AI model we&#8217;ve ever developed&#8221; and said it is &#8220;very expensive for us to serve, and will be very expensive for our customers to use.&#8221;</p><p>The benchmarks are quite telling: 93.9% on SWE-bench Verified (vs. Opus 4.6&#8217;s ~80.9%), 77.8% on SWE-bench Pro, 82% on Terminal-Bench 2.0, 97.6% on USAMO 2026, and 83.1% on CyberGym vs. Opus 4.6&#8217;s 66.6%, a 16.5 percentage-point jump on cybersecurity tasks. On Anthropic&#8217;s internal zero-day exploit benchmark, Opus 4.6 had a near 0% success rate at autonomous exploit development. Mythos succeeded 181 times out of several hundred attempts on the same Firefox vulnerability task. It also found thousands of zero-day vulnerabilities across every major operating system and browser, including a 27-year-old bug in OpenBSD and a 16-year-old bug in FFmpeg that automated testing had missed across 5 million test runs.</p><p>Mythos is not being made generally available: citing cybersecurity risks, Anthropic wants to roll it out to selected companies first. 
It&#8217;s deployed through Project Glasswing to 12 partner organizations (Amazon, Apple, Broadcom, Cisco, CrowdStrike, Linux Foundation, Microsoft, Palo Alto Networks, and others) plus about 40 additional organizations, with Anthropic committing $100 million in usage credits. While I am not dismissing any cyber risks that a model like this could bring, it is also convenient for model providers to &#8220;release&#8221; these models to a small group of companies, because the reality is that with their current compute, they can&#8217;t even serve Opus 4.6 to their user base, let alone Mythos, which is even more expensive to run. Based on Anthropic&#8217;s pricing, the current Mythos Preview that these companies got access to is around 5x more expensive than Opus 4.6 after Anthropic&#8217;s initial $100M credit commitment. It is also rumored that OpenAI will &#8220;release&#8221; its newest model in a similar fashion, again citing cybersecurity risks.</p><p>There has been a lot of frustration among Claude users lately, as many have started hitting the rate limits in their subscription plans much faster while Anthropic manages this surge in demand with the compute it has.</p><p>This is what I call the Inference Trap, and both OpenAI and Anthropic have now been caught in it.</p><p>The pattern is simple: build the best model &#8594; users surge &#8594; inference compute explodes &#8594; you either throttle users, raise prices, or cannibalize training compute. OpenAI experienced it during the Ghibli moment in March 2025, when ChatGPT gained 1 million new users in a single hour and 100 million signups in a week. Sam Altman admitted they were &#8220;forced to do a lot of unnatural things,&#8221; specifically borrowing compute capacity from OpenAI&#8217;s research division and slowing down the release of new features.</p><p>Anthropic is living through its own version right now. 
In March 2026, the company experienced five major platform outages in a single month. Claude Code users reported burning through 5-hour sessions in under 90 minutes. The problem for Anthropic is that a model provider in this AI race does not want to cannibalize training compute, because that can mean losing the race for the next model.</p><p>This brings me to another point I want to make: pricing increases on frontier AI models are inevitable. When Anthropic eventually deploys Mythos-class models at scale, the inference cost per query will be higher than for Opus. And they already can&#8217;t serve Opus at current demand levels without throttling. The math only works if prices go up, or if the compute infrastructure grows fast enough to meet demand, which it can&#8217;t in the short term.</p><p>Moving to OpenAI.</p><p><strong>OpenAI Seems to Be Losing the Enterprise Race and Heading Into an IPO With Some Headwinds</strong></p><p>While Anthropic is sprinting ahead on enterprise revenue, OpenAI is dealing with a set of problems that are becoming hard to ignore.</p><p>The revenue gap has flipped. A year ago, OpenAI was at roughly $6 billion ARR, and Anthropic was at $1 billion. The gap looked huge. Today, Anthropic is at $30 billion, and OpenAI is at $25 billion or a level similar to Anthropic&#8217;s, but its pace of growth is slower. Anthropic added roughly $21 billion in net new annualized revenue in just three months. 
OpenAI&#8217;s enterprise business now makes up 40% of revenue (up from ~30% last year) and is &#8220;on track to reach parity with consumer by the end of 2026&#8221; &#8212; but Anthropic has been enterprise-first from the start, with 80% enterprise revenue and structurally higher retention.</p><p>To add to this, SensorTower data now show that ChatGPT&#8217;s monthly active users in the US have started to fall slightly, adding to the headwinds.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lGin!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a1ee6f2-9457-4399-b285-8009e38ff46d_970x549.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lGin!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a1ee6f2-9457-4399-b285-8009e38ff46d_970x549.jpeg 424w, https://substackcdn.com/image/fetch/$s_!lGin!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a1ee6f2-9457-4399-b285-8009e38ff46d_970x549.jpeg 848w, https://substackcdn.com/image/fetch/$s_!lGin!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a1ee6f2-9457-4399-b285-8009e38ff46d_970x549.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!lGin!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a1ee6f2-9457-4399-b285-8009e38ff46d_970x549.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!lGin!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a1ee6f2-9457-4399-b285-8009e38ff46d_970x549.jpeg" width="970" height="549" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4a1ee6f2-9457-4399-b285-8009e38ff46d_970x549.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:549,&quot;width&quot;:970,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:53591,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/193796773?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a1ee6f2-9457-4399-b285-8009e38ff46d_970x549.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lGin!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a1ee6f2-9457-4399-b285-8009e38ff46d_970x549.jpeg 424w, https://substackcdn.com/image/fetch/$s_!lGin!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a1ee6f2-9457-4399-b285-8009e38ff46d_970x549.jpeg 848w, https://substackcdn.com/image/fetch/$s_!lGin!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a1ee6f2-9457-4399-b285-8009e38ff46d_970x549.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!lGin!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a1ee6f2-9457-4399-b285-8009e38ff46d_970x549.jpeg 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Even without this data, we have been waiting for quite some time for OpenAI to announce that it has reached 1 billion weekly active users, as the 800M mark was announced 6 months ago. This suggests that user growth has slowed down.</p><p>The funding structure also isn&#8217;t perfect. OpenAI closed a $122 billion round at an $852 billion valuation at the end of March, the largest private funding round in history. But the structure is telling. Amazon committed $50 billion, but only $15 billion arrived as upfront cash. 
The remaining $35 billion is conditional, tied to milestones that some indicate may include achieving certain AI capability thresholds or pursuing an initial public offering by the end of 2026.</p><p>SoftBank pledged $30 billion structured in three equal tranches of $10 billion each, arriving in April, July, and October. SoftBank&#8217;s structure essentially assumes a liquidity event within that window. About $3 billion came from retail investors through bank channels, and OpenAI was included in ARK Invest ETFs.</p><p>In other words, a significant portion of OpenAI&#8217;s headline $122 billion raise is conditional on an IPO actually happening. This means OpenAI is under pressure to go public regardless of whether the timing is optimal. There are now reports from The Information that Altman and OpenAI&#8217;s CFO are on different sides over the IPO, as Altman is pushing for an IPO this year, while the CFO believes OpenAI is not ready yet. OpenAI has already denied this, but no company would be expected to confirm such rumors, even if they were true.</p><p>The profitability picture is the key thing. According to different reports, OpenAI&#8217;s gross margins sit at approximately 40%, constrained by variable compute costs. The company is generating upwards of $2-3 billion in revenue per month but losing more than $14 billion per year. Reported internal documents project that compute costs will reach $121 billion by 2028, with a cumulative loss trajectory that doesn&#8217;t reach breakeven until 2029-2030. Compare this to Anthropic, which projects positive free cash flow by 2027-2028 while spending roughly 4x less on training.</p><p>OpenAI also has an alternative to Claude Code called Codex, but adoption there, although growing, doesn&#8217;t seem to be at the same pace as Claude Code. 
It&#8217;s telling that OpenAI is even offering users more token usage, while Anthropic is limiting it.</p><p>As this data from Ramp shows, the AI model share of first-time enterprise customers has heavily tilted towards Anthropic in the last few months:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iYpU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671f46db-dbea-4572-ac97-ab2539f0ea67_999x692.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iYpU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671f46db-dbea-4572-ac97-ab2539f0ea67_999x692.png 424w, https://substackcdn.com/image/fetch/$s_!iYpU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671f46db-dbea-4572-ac97-ab2539f0ea67_999x692.png 848w, https://substackcdn.com/image/fetch/$s_!iYpU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671f46db-dbea-4572-ac97-ab2539f0ea67_999x692.png 1272w, https://substackcdn.com/image/fetch/$s_!iYpU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671f46db-dbea-4572-ac97-ab2539f0ea67_999x692.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iYpU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671f46db-dbea-4572-ac97-ab2539f0ea67_999x692.png" width="999" height="692" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/671f46db-dbea-4572-ac97-ab2539f0ea67_999x692.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:692,&quot;width&quot;:999,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:56198,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/193796773?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671f46db-dbea-4572-ac97-ab2539f0ea67_999x692.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iYpU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671f46db-dbea-4572-ac97-ab2539f0ea67_999x692.png 424w, https://substackcdn.com/image/fetch/$s_!iYpU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671f46db-dbea-4572-ac97-ab2539f0ea67_999x692.png 848w, https://substackcdn.com/image/fetch/$s_!iYpU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671f46db-dbea-4572-ac97-ab2539f0ea67_999x692.png 1272w, https://substackcdn.com/image/fetch/$s_!iYpU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671f46db-dbea-4572-ac97-ab2539f0ea67_999x692.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The key moat that OpenAI has is the ChatGPT brand, which, as a first mover, created a verb similar to Google when it comes to consumers. In the last months, however, Claude and Anthropic have become &#8220;the verb&#8221; when it comes to enterprises and AI use cases for work. On the consumer end, OpenAI will probably have to shift hard towards an ad-supported business model to get some revenue from its big free user base. Building an efficient ad platform is much more complex than most think and requires time. At the same time, OpenAI is trying to stop Anthropic in the enterprise market, but so far, it doesn&#8217;t seem to be working as Anthropic is capturing the market at a faster pace. The question is whether OpenAI&#8217;s strategy of trying to capture both markets at once and &#8220;doing everything&#8221; is really the right one. 
I would argue that it is not. Now you even have Meta entering the AI arena again, with its first AI model since the formation of Meta's superintelligence unit. While its model is not SOTA, there are specific use cases where it is very competitive. Meta focused on use cases like health, social media, games, and shopping. This pattern will become more dominant in the coming years as the AI model market matures and you see model specialization rather than just general models. In this environment, having a narrow focus matters even more.</p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><p>According to The Information, OpenAI is now telling investors that it believes one of its important advantages over Anthropic going forward is the availability of compute. OpenAI said it believes Anthropic had 1.4 GW of capacity at the end of last year, while OpenAI had 1.9 GW. But OpenAI said it plans to ramp its capacity more steeply, with total gigawatts in the mid-single-digit range at the end of this year and more than 10 GW in 2027.</p><p>In contrast, it believes Anthropic will have 3 to 4 GW in 2026 and 7 to 8 GW by the end of 2027. While I agree that availability of compute is a big factor going forward, I believe the even more important factor is inference economics: the cost of serving the models to your clients. As users shift workloads to production, availability and reliability become key. 
Nobody wants a product that is unstable, working great at some times and unavailable at others.</p><p><strong>The End of Subsidized AI Model Usage</strong></p><p>Both Anthropic and OpenAI are preparing for IPOs. And an IPO changes everything about how an AI company thinks about compute costs.</p><p>When you&#8217;re private and burning venture capital, you can subsidize inference. You can run models at a loss. You can offer $20/month unlimited plans that cost you more than $100/month to serve. You can double the rate limits as promotions. You can hand out credit packages. The goal is growth at all costs, because the next funding round values you on revenue, not margins.</p><p>When you&#8217;re public, the scrutiny shifts to unit economics. Gross margins, operating margins, cash burn trajectory, path to profitability: these become the metrics that determine your stock price. An S-1 filing forces you to disclose all of this in audited detail.</p><p>Both OpenAI and Anthropic are probably operating at approximately 40% gross margins, constrained by the variable cost of running inference. And while neither company will show a net profit, since profits are only projected towards the end of 2030, the gross margin will be something that investors keep a particularly close eye on, especially the gross margin on inference. 
While the margin math works on API usage, usage in the subscription packages is still often subsidized by the model companies, and this is something they will look to tweak in the coming months before they file their S-1s.</p><p>This has a ripple effect across the entire supply chain. For the past three years, the question has been: &#8220;Which chip delivers the most FLOPS?&#8221; The answer was always Nvidia, and companies paid whatever Nvidia charged because performance was the bottleneck, and money was abundant.</p><p>Going forward, the question becomes: &#8220;Which chip gives me the cheapest tokens and the best total cost of ownership (TCO)?&#8221; It&#8217;s not even about watts anymore, as you can see from recent podcast comments from Nvidia&#8217;s Jensen Huang and Google&#8217;s Sundar Pichai. It&#8217;s about cost-per-token, because that&#8217;s what directly determines your gross margin as a public company, and that is what investors will be laser-focused on. This shift also means the end of subsidized usage in these subscription packages, through either usage limits or higher prices. We will also see even more resources devoted to software optimizations for running the models. Savings on memory and getting more from existing hardware will be the focus in the coming months for both labs, as they are constrained by compute.</p><p>Both of these companies also need to do the hard math of treating the IPO as the &#8220;last&#8221; funding round: after the IPO, they need enough capital to support their growth and cash burn for the coming years. Issuing additional stock to raise capital once you are a public company is never viewed positively by the market, so nobody wants to go down that route.</p><p>Anthropic is uniquely well-positioned here because it runs Claude on a diversified hardware stack across three suppliers: Nvidia GPUs, Google TPUs, and Amazon Trainium. 
This gives it real negotiating leverage and the ability to route workloads to whichever chip offers the best price-performance for each model tier. The company just announced a deal with Google and Broadcom for approximately 3.5 gigawatts of next-generation TPU capacity starting in 2027. This is on top of its existing AWS Trainium partnership and Nvidia GPU deployments. It is worth noting that AWS&#8217;s CEO just mentioned in an interview on CNBC yesterday that all of Anthropic&#8217;s AI models were trained on Amazon Trainium (even Mythos).</p><p>OpenAI, by contrast, has been more dependent on Nvidia through its Azure partnership with Microsoft, though it has been diversifying toward custom silicon.</p><p><strong>The IPO Race: Who Lists First Gets the Biggest Check</strong></p><p>There&#8217;s one final dynamic that ties all of this together: both Anthropic and OpenAI know that whoever goes public first has a significant advantage, and the window for both is narrowing. On top of that, SpaceX (which now includes xAI) is also racing towards an IPO at a valuation of more than $1T. OpenAI is targeting an IPO at more than $1T as well. Anthropic just closed a funding round at a $380 billion valuation, but because of the surge in usage and revenue, it is already valued at around $500-$700 billion in secondary markets, so its IPO could be in the $800B-$1T range as well.</p><p>Between these three companies alone, we&#8217;re looking at potentially more than $200 billion in capital being raised from public markets within a 6-12 month window. That&#8217;s an enormous liquidity event. For context, the entire US IPO market raised approximately $33 billion in 2024. Even in the hot 2021 market, total US IPO proceeds were around $140 billion.</p><p>This is why the race to go first matters so much. The first to market captures the freshest investor capital and sets the valuation benchmark. The second has to compete for the same institutional allocation. 
The third might struggle if the market has indigestion from the first two.</p><p>OpenAI&#8217;s board is reportedly concerned that if Anthropic lists first, it could set a valuation benchmark that makes OpenAI&#8217;s $1 trillion target look stretched, especially now that Anthropic has higher revenue, better enterprise concentration, and a more credible path to profitability. On the other hand, if OpenAI lists first, it establishes itself as the &#8220;AI category-defining IPO&#8221; and benefits from a first-mover premium in public market pricing.</p><p>Both companies know this. Both are preparing in parallel. And both are racing against time, because every month that passes, compute costs pile up, margins need to improve, and the public market window could shift with macro conditions.</p><p><strong>Summary</strong></p><p>We are entering a new key period in AI where unit economics take center stage. At the same time, I expect a rapid pace of software optimizations for serving these models more efficiently in the coming months, as AI labs put their best talent towards this task, because it has now become the most important thing limiting growth and profitability. The software optimization will focus on resolving key bottlenecks, such as memory (KV cache, context window) and wafer availability. Model distillation and the trend toward smaller models will also grow faster than before because of this. If I were to speculate, the hardware companies might have a &#8220;less golden&#8221; time than the era they have had so far, while cloud providers might benefit the most, as these software optimizations mean that they get more juice out of their existing infrastructure while demand for compute keeps surging because of wider adoption of AI.</p><p>Until next time,</p><p>Next week, we are publishing an article on some key developments in the AI space when it comes to Meta and Amazon, exclusive for paid subscribers. 
If you are not yet a paid subscriber, consider signing up.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/subscribe&quot;,&quot;text&quot;:&quot;Become paid subscriber&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/subscribe"><span>Become paid subscriber</span></a></p><p>Thank you!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/p/the-era-of-subsidized-ai-model-usage?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/p/the-era-of-subsidized-ai-model-usage?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p><strong>Disclaimer:</strong></p><p>I own Google (GOOGL), Amazon (AMZN), Microsoft (MSFT), Meta (META) stock.</p><p>Nothing contained in this website and newsletter should be understood as investment or financial advice. All investment strategies and investments involve the risk of loss. Past performance does not guarantee future results. Everything written and expressed in this newsletter is only the writer&#8217;s opinion and should not be considered investment advice. Before investing in anything, know your risk profile and if needed, consult a professional. Nothing on this site should ever be considered advice, research, or an invitation to buy or sell any securities.</p>]]></content:encoded></item><item><title><![CDATA[Every Memory Cycle Ends the Same. Until It Doesn't.]]></title><description><![CDATA[I&#8217;ve studied every major memory cycle of the last 30 years. In this article, we look at them and the numbers. 
But then I am going to make a case for why the AI era may fundamentally break that pattern.]]></description><link>https://www.uncoveralpha.com/p/every-memory-cycle-ends-the-same</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/every-memory-cycle-ends-the-same</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Thu, 12 Mar 2026 12:19:35 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!nCTv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc048e138-3836-4705-8e7f-9d9b255003f7_1408x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>For three decades, the memory semiconductor industry has followed a brutal and predictable pattern: prices boom, manufacturers over-invest, supply floods in, prices crash, everyone bleeds red ink, and then the whole thing starts over. It&#8217;s been one of the most reliably cyclical businesses in all of technology. The cycle has destroyed shareholder value, bankrupted companies, and taught every investor the same lesson: never trust the words &#8220;this time is different&#8221; when it comes to DRAM.</p><p>And yet, here I am, writing an article arguing exactly that.</p><p>Let me be clear, I know the history. I&#8217;ve studied every major memory cycle of the last 30 years. In this article, we look at them and the numbers. But then I am going to make a case for why the AI era may fundamentally break that pattern, not because demand will be infinite (it won&#8217;t), but because the nature of what memory serves has changed in a way that most investors haven&#8217;t fully internalized.</p><p>Memory is no longer just a component inside your gadget. Memory is becoming a raw input for intelligence. 
And the demand curve for intelligence looks a lot more like the demand curve for energy, electricity, than it does the demand curve for smartphones.</p><p>Let&#8217;s start.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nCTv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc048e138-3836-4705-8e7f-9d9b255003f7_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nCTv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc048e138-3836-4705-8e7f-9d9b255003f7_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!nCTv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc048e138-3836-4705-8e7f-9d9b255003f7_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!nCTv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc048e138-3836-4705-8e7f-9d9b255003f7_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!nCTv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc048e138-3836-4705-8e7f-9d9b255003f7_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nCTv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc048e138-3836-4705-8e7f-9d9b255003f7_1408x768.png" width="1408" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c048e138-3836-4705-8e7f-9d9b255003f7_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2475077,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/190710023?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc048e138-3836-4705-8e7f-9d9b255003f7_1408x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nCTv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc048e138-3836-4705-8e7f-9d9b255003f7_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!nCTv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc048e138-3836-4705-8e7f-9d9b255003f7_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!nCTv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc048e138-3836-4705-8e7f-9d9b255003f7_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!nCTv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc048e138-3836-4705-8e7f-9d9b255003f7_1408x768.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><strong>The history of memory economics</strong></p><p>For those less familiar with the space, the memory semiconductor market is dominated by three players: Samsung Electronics (South Korea), SK Hynix (South Korea), and Micron Technology (United States). Together, these three companies control approximately 95% of global DRAM production. This is an oligopoly, but not one that has historically behaved like one. Unlike OPEC, these companies can&#8217;t (legally) coordinate output. And unlike logic chips, memory is essentially a commodity&#8212;a bit is a bit. The differentiation comes from process technology, cost structure, and increasingly, product mix (more on HBM later).</p><p>The fundamental problem with memory economics is the mismatch between demand elasticity and supply inelasticity. 
Building a new DRAM fab costs $15-20 billion and takes 2-3 years. Once built, the economics favor running it at maximum utilization because fixed costs are enormous. So when demand rises, prices spike because supply can&#8217;t respond quickly. When manufacturers finally bring new capacity online, they tend to overshoot, because everyone is building at the same time based on the same rosy demand signals. Prices crash, margins collapse. Some companies go bankrupt or get acquired. The survivors cut capex, and the cycle begins anew.</p><p>This is the pattern. And it has repeated with remarkable consistency.</p><p><strong>Cycle 1: The Windows PC supercycle (1993-1996)</strong></p><p>The first modern memory supercycle was driven by the explosion of Windows PCs and graphical operating systems. Average DRAM content per PC jumped from roughly 1-2MB to 4-8MB&#8212;a 4x increase per device&#8212;while PC unit shipments were growing at double-digit rates.</p><p>During 1993 and 1994, DRAM demand outpaced supply despite most fabs running at full utilization. Spot and contract prices for 4Mb and 16Mb DRAM rose sharply, and gross margins for leading suppliers surged well above 50%. Korean memory makers like Samsung and Hyundai (now SK Hynix) posted record profits. Semiconductors accounted for 13.4% of Korea&#8217;s total exports. It was hailed as the greatest boom in Korean industrial history.</p><p>Then reality hit. Roughly 50 fab construction plans were announced during 1995-1996 alone. Capex as a percentage of semiconductor production exceeded 30%. The inevitable happened: DRAM prices peaked in late 1995 and then collapsed&#8212;falling 51% in 1996 and another 65% in 1997. Korea&#8217;s Big Three chipmakers suffered from overexpansion, and the resulting shock contributed to the Asian Financial Crisis that pushed Korea into a deep recession. 
Stock prices of memory companies fell 60-80% from peak to trough.</p><p>Looking at the data: the cycle lasted around 2 years from peak to trough, prices declined 51% in year one and another 65% in year two, and stocks declined around 60-80%.</p><p><strong>Cycle 2: The cloud and smartphone era (2016-2019)</strong></p><p>Fast forward two decades, and the cast of characters had changed, but the script was the same. By 2016, the DRAM market had consolidated from roughly 20 players to just three. This was supposed to introduce discipline. And for a while, it seemed like it did.</p><p>The 2016-2018 &#8220;supercycle&#8221; was driven by a convergence of factors: smartphone storage capacity upgrades, the early cloud buildout, and a supply-side twist where manufacturers were shifting capacity to 3D NAND production, which temporarily constrained conventional DRAM output.</p><p>The numbers were spectacular, especially for Micron, the only publicly traded pure-play memory company in the U.S.:</p><p>Micron 2016: Revenue of $12.4 billion, gross margin of 20.2%, operating income of just $168 million (1.4% operating margin). The company was barely above breakeven.</p><p>Micron 2017: Revenue surged 64% to $20.3 billion. Gross margin expanded to 41.5%. Operating income hit $5.87 billion (28.9% margin).</p><p>Micron 2018: Revenue jumped another 50% to $30.4 billion. Gross margin peaked at 58.9%. Operating income reached an astonishing $15.0 billion&#8212;a 49.3% operating margin. From barely profitable to printing nearly 50 cents of operating profit on every dollar of revenue in two years.</p><p>SK Hynix followed a similar trajectory. At its Q3 2018 peak, SK Hynix posted an operating profit of 6.47 trillion Korean won, which at the time was a record.</p><p>DDR4 retail RAM prices doubled over the course of 2017 into early 2018. 
Industry inventories fell to 3-4 weeks, well below the normal 8-week average.</p><p>Micron&#8217;s stock peaked at roughly $64 in May 2018. But notice: revenue and margins didn&#8217;t peak until Q4 of calendar 2018. The stock topped out approximately two quarters before the fundamental peak. This is a classic pattern in cyclical stocks: the market discounts the turn before it shows up in the numbers.</p><p>Then came the crash:</p><p>Micron 2019: Revenue fell to $23.4 billion (-23%). Gross margin compressed to 45.7%.</p><p>Micron 2020: Revenue dropped further to $21.4 billion. Gross margin fell to 30.6%. Operating income was $3.0 billion, down 80% from the 2018 peak.</p><p>By December 2018, Micron&#8217;s stock had fallen to approximately $28&#8212;a 56% decline from the May high. The stock was pricing in the downturn even as the company was still reporting near-peak earnings.</p><p>Cycle duration (peak to trough in fundamentals): ~6-7 quarters. Revenue decline (peak to trough): ~30%. Gross margin decline: from 59% to 27% (at the Q1 FY2020 low). Stock decline (peak to trough): ~56%.</p><p><strong>Cycle 3: The COVID cycle (2020-2023)</strong></p><p>The pandemic created an unexpected demand surge. PC shipments exploded as the world went remote. Server demand spiked as cloud usage accelerated. 5G phones launched with higher per-device memory content. The upcycle lasted approximately 14 months before the familiar reversal kicked in.</p><p>By 2022-2023, the downturn was severe. Bloated inventories from pandemic over-ordering met weakening consumer demand. SK Hynix posted a full-year 2023 net margin of approximately negative 28%. Micron&#8217;s 2024 revenue dropped to around $25 billion with gross margins compressing toward the low 20s.</p><p>Memory stocks cratered. Micron fell from around $98 in early 2022 to roughly $49 by late 2022&#8212;a 50% haircut. 
SK Hynix fell similarly.</p><p>Cycle duration (peak to trough): 6-8 quarters of margin compression. Operating margins went from 30%+ to deeply negative for SK Hynix. Stock decline: ~50%.</p><p>The pattern across all three cycles is strikingly consistent: a demand-driven boom lasting 4-7 quarters, followed by an oversupply-driven bust lasting 4-8 quarters, with revenue declines of 25-40%, margin compression from peak levels above 50% to the low 20s or even negative, and stock price declines of 50-80% that lead the fundamental downturn by 1-2 quarters.</p><p>The history is clear, but now let me tell you why I think this cycle might be structurally different.</p><p><strong>From gadget component to intelligence input</strong></p><p>In every previous memory cycle, the demand driver was the same: humans buying devices. PCs in the 1990s. Smartphones in the 2010s. Laptops during COVID. The demand function was ultimately capped by the number of humans and the number of devices each human needs. One person buys one phone. Maybe one laptop. Perhaps a tablet. The DRAM content per device grows, but the number of endpoints is bounded.</p><p>This meant that once the initial adoption or upgrade wave passed&#8212;once everyone who needed a new PC had bought one, or every smartphone had been upgraded to the latest generation&#8212;demand would flatten. Supply, which was ramped during the boom, would overshoot. Prices would crash.</p><p>In the AI era, the demand function for memory has fundamentally changed. Memory is no longer predominantly serving a fixed number of &#8220;human endpoints.&#8221; Memory, especially HBM, is now a critical input for generating intelligence.</p><p>Think about what HBM (High Bandwidth Memory) actually does inside an AI accelerator. 
When you ask ChatGPT a question or run an inference on a large language model, the model&#8217;s parameters&#8212;billions or trillions of numerical weights&#8212;need to be loaded from memory into the GPU&#8217;s compute cores. The KV cache, which stores the context of your conversation, grows linearly with context length; even with Grouped Query Attention (GQA), it consumes roughly 0.06-0.12 MB per token in a 7B-parameter model. A model with 70 billion parameters needs roughly 140GB at 16-bit precision for the weights alone, more than a single 80GB GPU&#8217;s worth of HBM.</p><p>Here&#8217;s the simplified version: More memory = the ability to run larger models, with longer context, serving more users simultaneously. Memory is not a peripheral component in AI&#8212;it is the binding constraint. The so-called &#8220;memory wall&#8221; is the single biggest bottleneck limiting AI inference performance today. GPUs often sit idle, waiting for data to be fetched from memory. More bandwidth and more capacity mean more intelligence output per second.</p><p>This is where the analogy to energy becomes powerful. Think about oil. When oil prices drop, what happens? Demand for oil increases because cheaper energy enables more economic activity. The demand curve for energy is downward-sloping: lower prices stimulate consumption. There&#8217;s always more work that could be done, more goods that could be transported, more heat that could be generated, if only energy were cheaper.</p><p>I believe AI inference demand behaves similarly. If memory costs drop and inference becomes cheaper, that doesn&#8217;t mean demand for inference drops. It means more applications become economically viable. More AI agents get deployed. More models get served. More context windows get extended. The demand for intelligence, like the demand for energy, is essentially elastic in response to price declines. 
Cheaper intelligence leads to more consumption of intelligence, not less.</p><p>This is the polar opposite of the gadget cycle. When DRAM prices dropped after the 2018 boom, it didn&#8217;t cause people to go buy a second smartphone. The number of endpoints was fixed. But when the cost of running an AI inference call drops by 50%, you can bet that the number of inference calls per day will more than compensate. Every enterprise that was waiting on the sidelines because of cost will deploy its AI project. Every startup that couldn&#8217;t afford the compute will spin up their service.</p><p>Here&#8217;s a human analogy I think captures this well.<strong> </strong>Imagine two people: one is a genius with poor memory, and the other is of average intelligence but has extraordinary memory and recall. In many real-world tasks&#8212;medicine, law, engineering, customer service&#8212;the person with superior memory will outperform the genius. Why? Because most practical work isn&#8217;t about raw reasoning power. It&#8217;s about retrieving the right piece of information at the right time. An AI model with more memory (longer context, more parameters accessible, faster retrieval) will outperform a theoretically smarter model that is memory-constrained. Memory is intelligence in many practical applications.</p><p>This is not a theoretical argument. The industry data supports it. HBM capacity per GPU has been scaling aggressively: NVIDIA&#8217;s A100 had 80GB of HBM2e. The H200 moved to 141GB of HBM3e. The upcoming Blackwell Ultra configurations push toward 288GB. And the Rubin Ultra platform is targeting 288GB - 576GB of HBM4E per GPU. The trajectory is exponential, and every generation of GPU is constrained by memory, not compute.</p><p><strong>Where we are today</strong></p><p>The current memory cycle is already historic in scale.</p><p>DRAM prices have surged dramatically. By Q4 2025, DRAM spot prices were nearly triple their level from a year earlier. 
DDR5 prices jumped 30-50% per quarter through H2 2025. Samsung has raised memory prices by up to 60% since September 2025. DRAM inventories at major suppliers fell to just 3.3 weeks by the end of Q3 2025&#8212;matching the 2018 supercycle lows. SK Hynix and Micron had roughly 2 weeks of inventory each.</p><p>AI is expected to consume nearly 20% of global DRAM wafer capacity in 2026 when adjusted for HBM&#8217;s 4x wafer intensity.</p><p><strong>The valuation: The market doesn&#8217;t believe in the durability of this cycle</strong></p><p>Here&#8217;s where it gets really interesting from an investment perspective.</p><p>Despite the strongest fundamental setup the memory industry has ever seen&#8212;sold-out HBM capacity through 2026, record margins, structural demand from AI, and a three-player oligopoly with pricing discipline&#8212;the market is still pricing these stocks as if a classic downturn is imminent.</p><p>Micron trades at a forward P/E of about 10x, SK Hynix trades at approximately 5.2x forward P/E, and Samsung trades at a forward P/E of roughly 5x-7x, although that figure covers the whole company, which is much more than just memory.</p><p>The PEG ratio makes the mismatch even clearer. Micron&#8217;s PEG is approximately 0.16x, Samsung&#8217;s is 0.17x, and SK Hynix&#8217;s is 0.10x&#8212;meaning the market is pricing almost zero growth premium into the stocks.</p><p>But at these valuation levels, the question is not whether these companies will continue to grow; it&#8217;s more about how long the current demand signals will last. If these memory demand levels and margins stay here for a few more years, that would be a scenario that markets are not pricing in.</p><p>Why? Because the market has been burned by memory cyclicality before. Investors remember that in the 2017-2018 supercycle, Micron stock peaked at ~$64 with a forward P/E of about 4-5x at the top, and then the stock fell 56% even though earnings were still rising. 
The conditioned response is &#8220;memory is peaking, get out before the crash.&#8221;</p><p>But this framing assumes the old cycle repeats. It assumes that the demand driver (AI infrastructure buildout and inference scaling) behaves like the demand driver in previous cycles (consumer device upgrades). And I believe that assumption could be wrong.</p><p><strong>Why the downturn, when it comes, might be shallower</strong></p><p>I&#8217;m not arguing that memory prices will never decline. They will. At some point, new fab capacity from current investment plans will come online. At some point, HBM4 yields will improve, and supply will catch up. The 2017-2018 cycle teaches us that supply response is inevitable.</p><p>But I believe the depth and duration of the downturn will be structurally different this time (dangerous words, I know):</p><p><strong>1. The end market is not bounded by human endpoints. </strong>In the PC cycle, once every household had a PC, demand plateaued. In the smartphone cycle, once penetration hit saturation, annual unit growth went to zero. But the number of AI inference calls per day is growing exponentially and is nowhere near saturation. Every enterprise, every consumer app, every autonomous vehicle, every AI agent is an incremental consumer of memory bandwidth.</p><p>This view is also shared by many industry experts. Here is a former high-ranking employee from ASML on this topic:</p><div class="pullquote"><p>&#8220;The current conditions actually have made us move away from cyclicality simply because the ratio of the chips that go into laptops and cell phones and other personal-use devices is getting lower each day as the capacity gets transferred to AI-related infrastructure. We may not be able to predict the condition or state of these memory manufacturers based on cyclicality anymore.&#8221;</p><p>Source: <a href="https://www.alpha-sense.com/uncoveralpha/">AlphaSense</a></p></div><p><strong>2. 
Memory content per AI unit is growing exponentially, not linearly. </strong>DRAM content per PC grew from maybe 4GB to 16GB over a decade&#8212;a 4x increase. HBM content per GPU is going from 80GB (A100) to 288GB-576GB (Rubin Ultra) in just a few years, a 3.6x to 7x increase. And the number of GPUs being deployed is also growing at 30-40% annually. The compounding effect of more units &#215; more memory per unit is producing demand growth rates the industry has never seen.</p><p><strong>3. HBM is structurally supply-constrained. </strong>One gigabyte of HBM consumes approximately 4x the wafer capacity of standard DRAM. HBM also requires advanced packaging (CoWoS or its equivalents), which has its own supply bottleneck. You can&#8217;t just flip a switch and convert commodity DRAM lines to HBM production. The manufacturing complexity acts as a natural supply governor that didn&#8217;t exist in previous cycles.</p><p><strong>4. Long-term contracts are dampening volatility. </strong>In a major shift from past cycles, memory companies are increasingly locking in multi-year supply agreements with hyperscalers. SK Hynix has finalized its 2026 HBM supply plan with major clients and expects supply to remain tight through 2027. Micron has sold out its 2026 HBM capacity and has pricing agreements already in place. These contracts reduce the spot market&#8217;s influence and provide revenue visibility that the memory industry has never had before.</p><p>On top of the long-term contracts, memory providers are much more careful about investing in new capacity this time, as the scars of past cycles are a strong reminder. Here is a comment from a current Microsoft employee on what they expect in terms of memory supply coming online:</p><div class="pullquote"><p>&#8220;I don&#8217;t think anyone on the buying side assumes memory suppliers will automatically rush to add unlimited supply just because demand is strong. 
The history of boom-bust cycles is very real, and suppliers remember that just as well as buyers do.<br><br>From my perspective, the expectation isn&#8217;t that all suppliers aggressively overbuild, but that they add capacity in much more controlled stages than in past cycles. What is different this time is the nature of demand. A lot of AI-driven demand is tied to long-lived infrastructure programs rather than short consumer cycles, which gives the suppliers more confidence but not enough to blindly overspend.&#8221;</p><p>Source: <a href="https://www.alpha-sense.com/uncoveralpha/">AlphaSense</a></p></div><p>Perhaps even more telling is this comment from a former high-ranking Micron employee on the internal cultural scars that the memory cycles have left:</p><div class="pullquote"><p>&#8220;Micron has always positioned themselves as not the cheapest. Like I said, in the past, yes, when it was under Steve Appleton, Mark Durcan, Mark Adams, they&#8217;ve been trying to gain market share by reducing prices, but with the new CEO Sanjay, he is more focused on profitability rather than market share. Market share also is important, but if you were to choose between market share and profitability, he chooses profitability.&#8221;</p><p>Source: <a href="https://www.alpha-sense.com/uncoveralpha/">AlphaSense</a></p></div><p><strong>5. The price elasticity of AI demand works in memory&#8217;s favor. </strong>If DRAM prices decline 20-30% (as they inevitably will at some point), the cost of running AI inference drops with them. This makes AI deployment cheaper, expanding the addressable market, which in turn supports memory demand. 
The demand floor is higher than in past cycles because cheaper memory creates new demand, rather than simply being absorbed by a fixed number of devices.</p><p>At some point, we will see a correction, but one that looks more like a 15-25% revenue decline and margins compressing to the 35-40% range, rather than the historic 30-40% revenue declines and sub-25% margins of previous busts. And crucially, I think the trough will be shorter, because AI inference demand will continue growing even during the cyclical correction, providing a demand floor that didn&#8217;t exist in the consumer device era.</p><p><strong>The bottom line</strong></p><p>The memory industry has spent 30 years teaching investors the same lesson: the cycle always turns, the crash always comes, and &#8220;this time is different&#8221; are the four most expensive words in investing. I respect that history deeply, and I&#8217;ve laid out the data to show you exactly how brutal those turns have been.</p><p>But I&#8217;m willing to bet against that lesson&#8212;partially&#8212;because the underlying demand driver has genuinely changed. That is why I also own stakes in SK Hynix and Samsung. Memory was a component in your gadget. Now it&#8217;s a substrate for intelligence. And the demand for intelligence&#8212;like the demand for energy, for computing, for connectivity&#8212;doesn&#8217;t follow the same saturation dynamics as consumer electronics.</p><p>The real risk for the memory cycle at the current stage is a technical breakthrough that would require orders-of-magnitude less memory and HBM, or a change that would bypass memory altogether. 
The chances of that happening today are low, but it is something to keep a close eye on all the time.</p><p>In the next section of this article, for paid subscribers, I analyze in detail how long I think this memory shortage and cycle will last, the timing of new supply coming online from memory makers, including Chinese memory providers, and its possible effect on the market. Here is my take:</p>
      <p>
          <a href="https://www.uncoveralpha.com/p/every-memory-cycle-ends-the-same">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Amazon's value in the Age of AI Agents]]></title><description><![CDATA[The changes caused by AI on Amazon: E-Commerce, AWS & Ads. How valuable each looks in a world where AI agents sit between humans and the services they use.]]></description><link>https://www.uncoveralpha.com/p/amazons-value-in-the-age-of-ai-agents</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/amazons-value-in-the-age-of-ai-agents</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Thu, 05 Mar 2026 13:51:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!JkjN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3645c396-e7ab-4c14-845b-d53d4fb45e7c_1408x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi everyone,</p><p>In this article, I&#8217;m breaking down my current thinking on Amazon. My goal here is to explain in detail the changes AI is causing across three pillars of the business: E-Commerce, AWS, and Advertising, and specifically how valuable each looks in a world where AI agents increasingly sit between humans and the services they use. 
At the end, I&#8217;ll do a sum-of-parts valuation that I think gives a useful anchor for where the stock sits today.</p><p>Let&#8217;s start.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JkjN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3645c396-e7ab-4c14-845b-d53d4fb45e7c_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JkjN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3645c396-e7ab-4c14-845b-d53d4fb45e7c_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!JkjN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3645c396-e7ab-4c14-845b-d53d4fb45e7c_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!JkjN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3645c396-e7ab-4c14-845b-d53d4fb45e7c_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!JkjN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3645c396-e7ab-4c14-845b-d53d4fb45e7c_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JkjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3645c396-e7ab-4c14-845b-d53d4fb45e7c_1408x768.png" width="1408" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3645c396-e7ab-4c14-845b-d53d4fb45e7c_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2826536,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/189993139?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3645c396-e7ab-4c14-845b-d53d4fb45e7c_1408x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JkjN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3645c396-e7ab-4c14-845b-d53d4fb45e7c_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!JkjN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3645c396-e7ab-4c14-845b-d53d4fb45e7c_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!JkjN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3645c396-e7ab-4c14-845b-d53d4fb45e7c_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!JkjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3645c396-e7ab-4c14-845b-d53d4fb45e7c_1408x768.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p><strong>E-Commerce - the agentic threat and the logistics moat</strong></p><p>Amazon captures roughly 40% of all U.S. e-commerce spending. It has 240+ million Prime subscribers globally (analyst estimates; Amazon last officially disclosed &#8220;over 200 million&#8221; in 2021), of which approximately 180&#8211;185 million are in the United States, representing penetration in about 80% of U.S. households. The Prime flywheel is well-documented: members spend on average $1,400/year, compared with $600 for non-Prime customers, and the retention rate after the first year is 99%, according to CIRP data. Amazon delivered over 8 billion items same or next day to U.S. Prime members in 2025, a 30%+ increase year-over-year.</p><p>This is the business everyone knows. 
But here&#8217;s the question that matters for the next 3&#8211;5 years: what happens when AI agents start shopping for consumers?</p><p><strong>The agentic shopping risk</strong></p><p>I strongly believe that in the future, most e-commerce shopping will be done by AI agents acting as personal assistants to consumers, rather than by consumers directly. I am not alone in that expectation. McKinsey projects agentic commerce could generate $1 trillion in U.S. retail revenue by 2030. Morgan Stanley expects nearly 50% of American shoppers will use AI agents by then, potentially adding $115 billion in e-commerce spending. Bain research shows that 30&#8211;45% of U.S. consumers already use GenAI for product research and comparison. During Cyber Week 2025, roughly 1 in 5 orders on Shopify involved an AI agent. AI-driven traffic to retailer sites has surged 7x since January 2025, according to Shopify data, with AI-driven orders up 11x.</p><p>What&#8217;s happening is this: instead of opening the Amazon app, a consumer tells ChatGPT, Claude, or Gemini what they need. The agent searches across retailers, compares prices, checks reviews, and either completes the purchase or presents a shortlist. OpenAI has already embedded checkout directly into ChatGPT. Perplexity launched its Comet browser agent. Google is rolling out agentic AI shopping tools.</p><p>This is a major shift in consumer behaviour, and Amazon knows it. In November 2025, Amazon sued Perplexity over its AI browser agent making purchases on Amazon&#8217;s marketplace. The company has blocked 47 AI bots from crawling its site. 
But at the same time, CEO Andy Jassy acknowledged on the company&#8217;s recent earnings call that agentic commerce &#8220;has a chance to be really good for e-commerce.&#8221; Amazon recently even posted a job for a principal corporate development officer specifically for &#8220;agentic commerce&#8221; partnerships.</p><p>Forrester retail analyst Sucharita Kodali captured the tension perfectly: </p><div class="pullquote"><p>&#8220;With an agent on ChatGPT, retailers risk relinquishing transactions on their site to pay a toll on someone else&#8217;s highway.&#8221;</p></div><p><strong>Amazon&#8217;s shot at owning the application layer</strong></p><p>That said, Amazon isn&#8217;t conceding the front end. It has several assets that give it a legitimate shot at being a surface where agentic shopping starts:</p><p>Rufus &#8212; Amazon&#8217;s AI shopping assistant, used by more than 300 million customers in 2025. Customers using Rufus complete purchases at a 60% higher rate. It can now auto-purchase items when prices hit thresholds.</p><p>The most interesting recent project is &#187;Buy For Me&#171;, Amazon&#8217;s experimental agent that can purchase from other retailers within the Amazon app. This is a smart flip from Amazon: instead of being the store that other agents shop at, Amazon becomes the agent that shops everywhere else. Amazon does have some unique assets that make it valuable as the front-end touchpoint, and the key one is Prime subscriptions.</p><p>Prime Video &#8212; 315 million ad-supported viewers globally, up from 200 million in early 2024. This is a massive surface for product discovery and agentic commerce integration, especially through interactive shoppable ads during live sports (Thursday Night Football averaged 15.3 million viewers, +16% YoY).</p><p>Twitch &#8212; 105+ million monthly users, heavily Gen Z. An engaged, commerce-friendly audience.</p><p>Alexa &#8212; still the most widely deployed voice assistant in smart home devices. 
If agentic commerce moves to a voice-first or ambient-first paradigm, Alexa has a head start.</p><p>The risk here is that those surfaces might not be enough and that Amazon might not be aggressive enough at this early stage. From today&#8217;s vantage point, if I had to choose, the dominant surfaces would still be the smartphone assistant or a standalone AI app (similar to ChatGPT, Gemini, or Claude), and later on AI glasses with the personal assistant offered by that provider. Prime Video and Twitch will still serve as important discovery platforms and could turn out to be much more valuable in terms of ads in a world where it becomes increasingly hard to reach a human via digital channels, as internet usage will be dominated by AI agents instead of humans. Still, that doesn&#8217;t solve the fact that the application layer, where most e-commerce starts, moves to other providers. Even if Amazon were to launch an independent AI shopping assistant app, I don&#8217;t think that would be &#187;moaty&#171; enough in the long term. My view is that the dominant provider will be the one that can offer a full AI personal assistant, with shopping as one of its features, not the only or main one. For that to be Amazon, it would need to make an aggressive pivot from current levels and possibly a strong shift into consumer hardware, which I don&#8217;t think is its plan.</p><p>With all that said, my base case is that Amazon will not be the application layer of agentic shopping and that its e-commerce business will move to the backend part of the shopping experience (while remaining important). Even in this scenario, Amazon still makes a decent margin given the logistics, payment, and fulfillment infrastructure that it offers at scale.</p><p><strong>Advertising</strong></p><p>Amazon&#8217;s advertising revenue hit $68.6B in 2025, growing 22% YoY in Q4. This is now 9.6% of Amazon&#8217;s total revenue, up from 5.9% in 2021. 
To put it in context, Amazon&#8217;s ad business alone is larger than the total revenue of companies like Netflix, Uber, or Salesforce.</p><p>But here&#8217;s the nuance that most analysts don&#8217;t discuss: Amazon&#8217;s ad business is really two very different businesses glued together.</p><p><strong>Search ads</strong></p><p>The vast majority of Amazon&#8217;s advertising revenue comes from Sponsored Products: essentially search ads within Amazon&#8217;s marketplace. When you search for &#8220;wireless headphones&#8221; on Amazon, the first several results are paid placements. Amazon doesn&#8217;t break this out precisely, but based on WARC data, the retail media component (primarily search ads) accounts for roughly $60.6B of the estimated total, with Prime Video and other upper-funnel formats making up the incremental portion.</p><p>Here is my concern: search ads on Amazon are fundamentally tied to humans browsing Amazon&#8217;s website and app. If an AI agent shops for you, it doesn&#8217;t look at sponsored listings. It doesn&#8217;t scroll past display ads. It skips right to the product that best matches your criteria and places the order. As Bain research noted, about 65% of retail media spending still occurs onsite, and that entire bucket is at risk if product discovery shifts to AI-driven search.</p><p>This is why I think the search ad portion of Amazon&#8217;s advertising business is on a disruption clock. 
Not tomorrow, not next quarter, but over a 3&#8211;5 year horizon, the economics of Sponsored Products face a structural headwind as agentic interfaces capture more of the purchase journey. As discussed in the previous section, I assign a low probability to Amazon capturing the AI assistant application layer, so the eyeballs shift from Amazon&#8217;s site and apps toward the owners of the AI assistants.</p><p><strong>Prime Video ads</strong></p><p>The other side of Amazon&#8217;s ad business is Prime Video advertising, and this is the piece I think is defensible. Amazon introduced ads on Prime Video in January 2024. S&amp;P Global Market Intelligence Kagan estimated Prime Video&#8217;s ad revenue at $433M in 2024 and forecast it to reach $806M in 2025. This is still a small fraction of total ad revenue, but it&#8217;s growing fast and serves a different function: brand advertising through streaming video is not susceptible to agentic disintermediation the same way search ads are.</p><p>Prime Video reaches 315 million monthly ad-supported viewers globally. That&#8217;s larger than Netflix&#8217;s ad-supported tier at 190 million. Thursday Night Football alone averaged 15.3 million viewers with 16% growth YoY, and the Packers-Bears wild-card playoff game drew 31.6 million viewers, the most-streamed NFL game in history. Amazon has also integrated Netflix and Spotify inventory into its Amazon DSP, giving advertisers a broader programmatic buying platform.</p><p>My estimate is that by 2027&#8211;2028, Prime Video ads could reasonably be a $3&#8211;5B annual revenue stream, growing at 40%+ rates as ad loads increase and live sports inventory expands (the NBA deal kicks in, international sports expand). 
This business is much more structurally defensible because people watch content &#8212; AI agents don&#8217;t.</p><p>But even that revenue doesn&#8217;t materially change my thesis that the majority of Amazon&#8217;s ad business is at risk of serious disruption.</p><p>For the sum-of-parts analysis in the last part of this article, I&#8217;m splitting the ad business into two buckets. For the search/retail media portion (~$60&#8211;63B), I&#8217;m valuing it as if profits last only 4 more years, with zero terminal value after that. That&#8217;s deliberately punitive: I&#8217;m assuming this revenue stream is structurally impaired. Prime Video ads I&#8217;ll fold into the e-commerce/subscription ecosystem, where they have long-term durability.</p><p><strong>AWS - the cloud business</strong></p><p>AWS is the most important reason why I own Amazon stock and why it has now become my biggest portfolio position.</p><p>The biggest fear around AWS has been that AI-related capital expenditures would permanently compress margins. And yes, there was a dip: AWS&#8217;s operating margin fell to 32.9% in Q2 2025 as the company ramped up spending aggressively. But by Q4, it had recovered to 35.0%, and the full-year margin was 35.4%.</p><p>Here is my core argument: we will be severely compute-constrained for the foreseeable future. Amazon invested $131.8B in capex in 2025 and has guided to approximately $200B for 2026, predominantly for AWS infrastructure. The company added more than 1 gigawatt of data center capacity in Q4 alone and 3.9 gigawatts in the trailing 12 months, which is double what AWS had in total in 2022. And Andy Jassy expects to double power capacity again by the end of 2027.</p><p>Despite this massive buildout, demand continues to outstrip supply. Jassy noted on the Q1 call that GPU and motherboard shortages were limiting the pace of AI workload onboarding. 
Bedrock (Amazon&#8217;s managed AI service) reached a multi-billion-dollar annualized run rate, with customer spend growing 60% quarter-over-quarter across a base of over 100,000 customers. Trainium2 is fully subscribed with 1.4 million chips deployed.</p><p>In this environment, there is no incentive for hyperscalers to engage in a pricing war. When every chip you install is immediately monetized, you don&#8217;t cut prices &#8212; you add capacity. Until compute supply catches up with demand (which I don&#8217;t expect before 2029 at the earliest), AWS can maintain mid-30%+ operating margins without sacrificing growth. The margin should hold around pre-AI era levels (AWS operated in the 28&#8211;35% range historically, with 2024 averaging 37%) because the scarcity dynamic supports pricing power.</p><p><strong>Trainium and custom silicon are key to long-term margins</strong></p><p>This is a point I don&#8217;t think gets enough attention. NVIDIA&#8217;s gross margin sits at roughly 73&#8211;75%. Every cloud provider that is 100% dependent on NVIDIA for AI compute is paying that tax on every GPU. That cost flows through to the cloud provider&#8217;s cost of revenue and structurally limits the margin it can earn on AI workloads.</p><p>Amazon, through its Annapurna Labs subsidiary, has developed Trainium and Inferentia custom ASICs, as well as Graviton CPUs for general compute. Combined, these custom chips have surpassed a $10B annualized revenue run rate, growing at triple-digit percentages YoY. According to Amazon, Graviton provides 40% better price-performance than x86 processors and has been adopted by 90% of AWS&#8217;s top 1,000 customers.</p><p>Trainium2 powers Project Rainier, the world&#8217;s largest operational AI compute cluster with 500,000+ Trainium2 chips, which Anthropic uses to train its Claude models. 
Trainium3 is in preview with broader volumes expected in early 2026, and Trainium4 is targeted for 2027.</p><p>I am sharing here the chart that we made some months ago in our detailed Amazon <a href="https://www.uncoveralpha.com/p/amazon-trainium-scaling-ai-without">Trainium piece</a>, where we calculated the manufacturing costs of Amazon Trainium, Google TPUs, and Nvidia&#8217;s Blackwell B200:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!H8dN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff26a195-f827-4296-8139-ccd03db52b7b_761x162.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!H8dN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff26a195-f827-4296-8139-ccd03db52b7b_761x162.webp 424w, https://substackcdn.com/image/fetch/$s_!H8dN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff26a195-f827-4296-8139-ccd03db52b7b_761x162.webp 848w, https://substackcdn.com/image/fetch/$s_!H8dN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff26a195-f827-4296-8139-ccd03db52b7b_761x162.webp 1272w, https://substackcdn.com/image/fetch/$s_!H8dN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff26a195-f827-4296-8139-ccd03db52b7b_761x162.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!H8dN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff26a195-f827-4296-8139-ccd03db52b7b_761x162.webp" width="761" height="162" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff26a195-f827-4296-8139-ccd03db52b7b_761x162.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:162,&quot;width&quot;:761,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15010,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/189993139?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff26a195-f827-4296-8139-ccd03db52b7b_761x162.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!H8dN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff26a195-f827-4296-8139-ccd03db52b7b_761x162.webp 424w, https://substackcdn.com/image/fetch/$s_!H8dN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff26a195-f827-4296-8139-ccd03db52b7b_761x162.webp 848w, https://substackcdn.com/image/fetch/$s_!H8dN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff26a195-f827-4296-8139-ccd03db52b7b_761x162.webp 1272w, https://substackcdn.com/image/fetch/$s_!H8dN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff26a195-f827-4296-8139-ccd03db52b7b_761x162.webp 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>You can see the most significant difference: while it costs Amazon $3,000-$3,500 to produce a Trainium3 chip, it costs them $35k-$40k to buy an Nvidia B200 chip. 
Even though B200 is much more performant, Trainium3 gives B200 a run for its money from a cost-of-ownership perspective.</p><p>The margin math is straightforward. When you design and manufacture your own silicon (using TSMC for fabrication and a design partner such as Broadcom, Marvell, or MediaTek), your cost per unit of compute is significantly lower than when buying merchant silicon from NVIDIA at a 73&#8211;75% gross margin. This gives AWS a structural margin advantage for AI workloads vs. a competitor that sources 100% from NVIDIA. It doesn&#8217;t mean AWS abandons NVIDIA (it still offers NVIDIA instances), but having an alternative lets AWS capture more of the AI value chain and maintain margin in ways that someone who is entirely dependent on NVIDIA simply cannot.</p><p>This key difference will prove even more important in the coming years, especially once demand/supply for compute is more in balance and the hyperscalers&#8217; focus shifts from capturing revenue growth to profitability and customer optimization.</p><p><strong>Traditional cloud demand is actually accelerating because of AI agents</strong></p><p>There&#8217;s a narrative that AI is all that matters for AWS growth. That misses something important: AI agents themselves create enormous demand for traditional cloud services, as we already discussed in part in our <a href="https://www.uncoveralpha.com/p/the-forgotten-chip-cpus-the-new-bottleneck">The Forgotten Chip: CPU the New Bottleneck of the Agentic AI era</a> article. Every AI agent needs storage (S3), compute (EC2, powered increasingly by Graviton), databases, networking, and monitoring. The more AI agents there are in production, the more traditional cloud infrastructure gets consumed.</p><p>The number of AI agents in deployment is surging right now. 
Here is data from an alternative-data provider that tracks adoption of the Model Context Protocol (MCP), an open-source standard for connecting AI applications to external systems.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2fB-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F413d3f3e-93e1-45df-bab0-17352c13ef4b_808x411.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2fB-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F413d3f3e-93e1-45df-bab0-17352c13ef4b_808x411.png 424w, https://substackcdn.com/image/fetch/$s_!2fB-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F413d3f3e-93e1-45df-bab0-17352c13ef4b_808x411.png 848w, https://substackcdn.com/image/fetch/$s_!2fB-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F413d3f3e-93e1-45df-bab0-17352c13ef4b_808x411.png 1272w, https://substackcdn.com/image/fetch/$s_!2fB-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F413d3f3e-93e1-45df-bab0-17352c13ef4b_808x411.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2fB-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F413d3f3e-93e1-45df-bab0-17352c13ef4b_808x411.png" width="808" height="411" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/413d3f3e-93e1-45df-bab0-17352c13ef4b_808x411.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:411,&quot;width&quot;:808,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:62841,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/189993139?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F413d3f3e-93e1-45df-bab0-17352c13ef4b_808x411.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2fB-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F413d3f3e-93e1-45df-bab0-17352c13ef4b_808x411.png 424w, https://substackcdn.com/image/fetch/$s_!2fB-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F413d3f3e-93e1-45df-bab0-17352c13ef4b_808x411.png 848w, https://substackcdn.com/image/fetch/$s_!2fB-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F413d3f3e-93e1-45df-bab0-17352c13ef4b_808x411.png 1272w, https://substackcdn.com/image/fetch/$s_!2fB-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F413d3f3e-93e1-45df-bab0-17352c13ef4b_808x411.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">source: <a href="https://bloomberry.com/">Bloomberry</a></figcaption></figure></div><p>The number of MCP servers being set up every month is growing exponentially, and the MoM pace is accelerating. The market is still in very early stages, as the current number of MCP servers is probably less than 1% of the API market. But the interesting thing was comparing where these MCP servers were being deployed with where API deployments are. 
Azure&#8217;s and GCP&#8217;s shares of MCP server deployments were lower than their shares of API deployments, while AWS&#8217;s share of MCP deployments was higher than its share of API deployments:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Z5Pt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98469a4a-a09e-4294-ac99-e58080c5c9bc_821x522.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Z5Pt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98469a4a-a09e-4294-ac99-e58080c5c9bc_821x522.png 424w, https://substackcdn.com/image/fetch/$s_!Z5Pt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98469a4a-a09e-4294-ac99-e58080c5c9bc_821x522.png 848w, https://substackcdn.com/image/fetch/$s_!Z5Pt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98469a4a-a09e-4294-ac99-e58080c5c9bc_821x522.png 1272w, https://substackcdn.com/image/fetch/$s_!Z5Pt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98469a4a-a09e-4294-ac99-e58080c5c9bc_821x522.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Z5Pt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98469a4a-a09e-4294-ac99-e58080c5c9bc_821x522.png" width="821" height="522" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/98469a4a-a09e-4294-ac99-e58080c5c9bc_821x522.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:522,&quot;width&quot;:821,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:37571,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/189993139?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98469a4a-a09e-4294-ac99-e58080c5c9bc_821x522.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Z5Pt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98469a4a-a09e-4294-ac99-e58080c5c9bc_821x522.png 424w, https://substackcdn.com/image/fetch/$s_!Z5Pt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98469a4a-a09e-4294-ac99-e58080c5c9bc_821x522.png 848w, https://substackcdn.com/image/fetch/$s_!Z5Pt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98469a4a-a09e-4294-ac99-e58080c5c9bc_821x522.png 1272w, https://substackcdn.com/image/fetch/$s_!Z5Pt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98469a4a-a09e-4294-ac99-e58080c5c9bc_821x522.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">source: <a href="https://bloomberry.com/">Bloomberry</a></figcaption></figure></div><p>While the data set is still small, it could indicate that more small companies are on AWS than on the other two hyperscalers and that they are early adopters. The data also hints at the importance of AWS&#8217;s &#187;legacy&#171; cloud infrastructure, which is very much needed in the Agentic AI phase.</p><p>The demand for traditional cloud infrastructure is skyrocketing. You can see it in comments from the CEOs of AMD and Intel, who are basically sold out of CPUs, and you can see it when talking to other industry experts.</p><p>This is a former Amazon employee talking about the usage of the AWS S3 service (storage):</p><div class="pullquote"><p>&#187;Right now, S3, nobody is thinking about how S3 is exploding. 
It&#8217;s quite an explosion because what are AI systems doing? They&#8217;re generating embeddings, they&#8217;re storing the prompts and the responses. Where are they storing this? S3. They&#8217;re logging every interaction for auditing, tuning, safety. All of this is going into S3. Remember when we used to think of S3 as just their cheap storage? The storage is still cheap, but you&#8217;re using more of it&#171;.</p><p>source: <a href="https://www.alpha-sense.com/uncoveralpha/">AlphaSense</a></p></div><p>The key takeaway from this comment is the need to store this data for audit and safety purposes. Companies running these AI agents need clear, auditable trails of what an AI agent has done, so they can monitor its activity and fix problems as they arise. Nobody wants to give an AI agent full permission to run freely across the company&#8217;s stack, making changes and running tasks that nobody has visibility into. This need for human visibility means that usage of services like storage grows even more.</p><p>A big tell was also the November OpenAI-AWS deal. The press release stated that OpenAI would access &#8220;hundreds of thousands of state-of-the-art NVIDIA GPUs, <strong>with the ability to expand to tens of millions of CPUs </strong>to rapidly scale agentic workloads.&#8221;</p><p>The GPU part is known, but the CPU part is the most interesting one. We need &#187;legacy&#171; cloud workloads and CPUs to enable the AI agent economy; that is just the way it is, and this is a big uplift for AWS, which has the largest fleet of optimized cloud services out there.</p><p>Amazon noted that more of the top 500 U.S. startups use AWS as their primary cloud provider than the next two providers combined. That startup and scale-up cohort is building AI-native applications that are heavily cloud-intensive.</p><p><strong>The on-prem fallacy and SMB lock-in</strong></p><p>Some investors argue that AI inference will eventually move to the edge or on-prem, killing the cloud growth story. 
Let me push back on this.</p><p>First, even if 90% of personal AI assistant use cases eventually run on edge devices (phones, laptops, local hardware), the remaining 10% that stays on cloud or on-prem infrastructure is still an enormous market. These are the &#8220;god-like AI&#8221; use cases: complex enterprise reasoning, multi-step agentic workflows, financial modeling, drug discovery, and code generation at scale. These require the kind of compute density and model size that doesn&#8217;t fit on a phone. And these use cases are the most profitable, as their outputs are the most valuable.</p><p>Second, on-prem AI infrastructure is radically more complex than anything businesses have managed before. Running an AI inference cluster on-prem means managing GPU and CPU servers, networking fabric, cooling systems, model deployment pipelines, and monitoring at a level of sophistication that most IT departments have never dealt with. For any small or medium-sized business, the cost and complexity of running your own AI infrastructure to have your &#8220;AI accountant&#8221; or &#8220;AI customer service agent&#8221; simply don&#8217;t make sense when you can rent it from AWS for a fraction of the upfront cost with zero operational hassle.</p><p>The cloud is the natural home for AI workloads for the vast majority of companies, and that reality isn&#8217;t changing anytime soon. If anything, as AI becomes more central to knowledge work, more companies will move to the cloud specifically to access AI capabilities they can&#8217;t build or run themselves.</p><p><strong>The revenue trajectory of AWS</strong></p><p>With all that in mind, I believe AWS, with the available power capacity I discussed in my previous articles, is well-positioned for multiple quarters of accelerating growth. I believe that, despite AWS&#8217;s size, we will soon see the segment grow by more than 30% YoY. 
AWS also exited Q4 2025 with a backlog of $244B and a weighted average remaining life of 4.1 years. Capacity is being installed and monetized as fast as it comes online.</p><p>If AI agents truly absorb a meaningful portion of knowledge work over the next 5&#8211;10 years &#8212; and companies like Anthropic ($19B ARR, up from $9B just two months ago) and OpenAI are building the models to do exactly that &#8212; then the total demand for cloud inference is going to be multiples of what it is today. Every AI-powered accountant, lawyer, engineer, customer service agent, and analyst running in the cloud creates recurring compute demand.</p><p><strong>The Anthropic and OpenAI stakes are hedges</strong></p><p>Besides the segments already mentioned, Amazon also has other important assets, such as Project Kuiper, its subscription business, and its stakes in Anthropic and OpenAI, which are becoming increasingly important.</p><p>Amazon has invested approximately $8B in Anthropic (capped below 33% ownership) and recently announced a strategic partnership with OpenAI that includes an investment of up to $50B (starting with an initial $15B commitment, with the remainder tied to milestones and a potential OpenAI IPO).</p><p>Anthropic just closed a $30B funding round at a $380B post-money valuation in February 2026. If Amazon holds roughly 20% of Anthropic (estimates vary given the cap structure), that stake is worth $76B on paper. But in the last few weeks, Anthropic has accelerated its adoption and revenue growth so much that a $500B valuation is nothing extraordinary for a company that will probably exit 2026 at a $50B ARR, growing 5x YoY and disrupting the whole knowledge work economy. At that valuation, the same roughly 20% stake would be worth $100B, or almost 5% of Amazon&#8217;s current market cap.</p><p>For OpenAI, the proposed $100B funding round would value the company at approximately $830B. 
Amazon&#8217;s $50B investment at those terms would represent roughly a 6% stake.</p><p>Combined, these stakes could be worth more than $145B. And here&#8217;s the real value: in a world where Anthropic, OpenAI, and Gemini become the application layer, significant stakes in two of those companies aren&#8217;t just financial investments. They are Amazon&#8217;s guarantee that the biggest AI consumers remain AWS customers. OpenAI has committed to spending $100B on AWS over the next eight years. Anthropic is using Project Rainier (500,000+ Trainium2 chips) for training. Both are locked in as massive cloud customers.</p><p><strong>Valuation</strong></p><p>Now let&#8217;s put the numbers together. I&#8217;m deliberately being conservative in places and factoring in serious disruption risk. Here are my numbers:</p>
      <p>
          <a href="https://www.uncoveralpha.com/p/amazons-value-in-the-age-of-ai-agents">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Forgotten Chip: CPUs the New Bottleneck of the Agentic AI Era]]></title><description><![CDATA[For three years, GPUs have been the only chip that mattered in AI. CPUs? The boring, commodity chip that just sat next to the GPU and passed data along. Nobody cared. That&#8217;s changing fast.]]></description><link>https://www.uncoveralpha.com/p/the-forgotten-chip-cpus-the-new-bottleneck</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/the-forgotten-chip-cpus-the-new-bottleneck</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Mon, 23 Feb 2026 13:50:17 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!zo6V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbba3c69d-b335-4e71-94ba-046238903fac_1540x1809.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>For three years, GPUs have been the only chip that mattered in AI. Every investor pitch, every earnings call, every CapEx headline was about who could get more Nvidia GPUs. </p><p>CPUs? An afterthought. The boring, commodity chip that just sat next to the GPU and passed data along. Nobody cared. That&#8217;s changing fast. And if you&#8217;re not paying attention to the &#8220;CPU renaissance&#8221; happening right now, you&#8217;re missing what I believe is one of the more important infrastructure shifts in this AI cycle.</p><p>In this article, I will break down exactly why agentic AI is changing the CPU demand, how exactly CPUs are used in agentic AI, how big the CPU market can become because of AI agents, and which public companies stand to benefit. 
I&#8217;ll also discuss whether we&#8217;re heading into a genuine CPU bottleneck and how long it could last.</p><p><strong>Why Agentic AI Changes Everything for CPUs</strong></p><p>To understand why CPUs suddenly matter, you need to first understand how agentic AI workloads are fundamentally different from the &#187;classic&#171; chatbot-style AI we&#8217;ve been running for the past three years.</p><p><strong>The old workflow &#8212; chatbot:</strong></p><p>When you use ChatGPT or any standard AI chatbot, the process is straightforward. You type a question, the CPU tokenizes it (converts your text into numerical tokens the model can process), ships it over to the GPU, the GPU runs the tokens through the model and generates a response, then ships the output back to the CPU, which de-tokenizes it and delivers the answer. In this workflow, the CPU does very little. Maybe 5-10% of the total compute. The GPU is doing all the heavy lifting with its matrix multiplications, attention calculations, and token generation. This is why, for three years, the entire industry was laser-focused on GPUs.</p><p><strong>The new workflow &#8212; agentic AI:</strong></p><p>Agentic AI is fundamentally different. Instead of a simple question-answer loop, you&#8217;re dealing with autonomous systems that plan, execute, use tools, browse the web, query databases, make API calls, write and run code, and then reflect on whether they did a good job before deciding what to do next. A single user request can spin off dozens or even hundreds of sub-agents, each running their own loops of reasoning and action in parallel.</p><p>All of that orchestration, tool calling, API handling, memory management, and coordination between sub-agents happens on the CPU, not the GPU.<strong> </strong>The GPU still handles the inference (the &#8220;thinking&#8221; part), but between each inference call, the CPU is doing an enormous amount of work. 
It&#8217;s parsing responses, deciding which tool to call next, managing the execution plan, handling file I/O, running code, making network requests, and coordinating which sub-agents depend on which other sub-agents&#8217; results.</p><p>In an interview, a VP at Intel explained:</p><div class="pullquote"><p>&#8220;Agentic AI is nothing but a combination of independent agents... If there are in workflow, say, 10, 20, 30, 40, 100 agents, and they all need to talk to them, then they need different locations to operate. When I say location, I talk about CPUs.&#8221;</p><p>source: AlphaSense</p></div><p>A Georgia Tech and Intel research paper from November 2025 quantified this, and the findings are striking: tool processing on CPUs accounts for between 50% and 90% of total latency in agentic workloads.<strong> </strong>In many agentic workflows, the CPU is responsible for the majority of the wait time, not the GPU. The GPU sits idle, waiting for the CPU to finish its work before it gets the next batch of tokens to process.</p><p>This completely inverts the infrastructure economics we&#8217;ve been operating under. In the chatbot era, you needed a small number of high-end CPUs paired with massive GPU clusters. In the agentic era, you potentially need more CPUs than GPUs, and the CPU-to-GPU ratio in a rack or cluster needs to go up significantly.</p><div class="pullquote"><p>&#8220;For every GPU workload, there is a supporting CPU demand. 
The CPU is going to handle the data processing, the orchestration, the API layers, post processing.&#8221;</p><p>Source: AWS employee on AlphaSense</p></div><p><strong>Breaking Down the CPU Workload in Agent Systems</strong></p><p>Let me walk through what the CPU actually does in an agentic workflow, because I think understanding the details here is important for appreciating why this demand is structural and not a temporary blip.</p><p><strong>Step 1: Planning: </strong>The user gives a broad instruction (e.g., &#8220;Research the competitive landscape of the DRAM industry and write me a report&#8221;). The CPU tokenizes this and sends it to the GPU for an initial inference call. The GPU generates a plan of execution, not a final answer. That plan comes back to the CPU.</p><p><strong>Step 2: Orchestration: </strong>The CPU now breaks that plan into sub-tasks and assigns them to multiple agents. This is pure CPU work. It&#8217;s managing a directed acyclic graph of tasks, determining which ones can run in parallel, which depend on others, and in what order they should execute. If you have 10 research sub-topics, you might have 10 sub-agents that can all run simultaneously.</p><p><strong>Step 3: Tool execution: </strong>Each sub-agent starts working. This is where CPUs get extremely busy. Sub-agent 1 might make a web search API call, wait for results, parse the JSON response, extract relevant text, and package it for another inference call. Sub-agent 2 might query a database, run a SQL query, and process the results. Sub-agent 3 might open a file, read its contents, and prepare them for analysis. All of this &#8212; the API calls, network I/O, file handling, data parsing, JSON processing &#8212; is CPU work. The GPU is idle during these operations.</p><p><strong>Step 4: Inference loops: </strong>Each sub-agent may also run its own chain-of-thought reasoning, sending multiple inference requests to the GPU. 
Between each inference call, the CPU processes the output, decides if the agent is done, and either feeds the next prompt or moves to the next step.</p><p><strong>Step 5: Reflection: </strong>Once all sub-agents complete, the CPU gathers all their outputs and sends them to the GPU for a reflection inference loop, essentially asking the model, &#8220;did we answer the original question well enough?&#8221; If not, the whole cycle restarts.</p><p>The key characteristics a CPU needs for this kind of workload are: high single-core clock speed (to minimize orchestration latency), high core count (to run many agents in parallel), fast memory access and large caches (to manage all the context and intermediate state), and strong I/O connectivity (PCIe lanes for network and storage, because agents are constantly hitting APIs and databases).</p><p>The AI server factories sitting above your general-purpose compute infrastructure don&#8217;t replace those traditional CPU servers. They create more demand for them. Because now, instead of one human slowly browsing the web and running a few apps, you have hundreds of AI agents aggressively consuming CPU resources at machine speed.</p><p><strong>The demand for CPUs is already showing up in earnings calls</strong></p><p>This new CPU demand is already visible in recent earnings calls.</p><p>On its Q4 2025 earnings call, AMD reported record data center segment revenue of $5.4 billion, up 39% year-over-year and 24% sequentially.</p><p>The key, though, wasn&#8217;t the GPUs but the CPUs. 
Lisa Su explicitly called out CPUs as a major growth driver, stating:</p><div class="pullquote"><p>&#8220;demand for EPYC CPUs is surging as agentic and emerging AI workloads require high-performance CPUs to power head nodes and run parallel tasks alongside GPUs.&#8221;</p></div><p>AMD&#8217;s 5th Gen EPYC Turin CPUs accounted for more than half of total server CPU revenue by the end of Q4, and the number of EPYC cloud instances grew more than 50% year-over-year to nearly 1,600 instances. The number of large enterprises deploying EPYC on-premises more than doubled in 2025. Su specifically highlighted that in agentic workflows, when AI agents spin off work in an enterprise, &#8220;they&#8217;re actually going to a lot of traditional CPU tasks.&#8221; She expects the server CPU market to grow by &#8220;strong double digits&#8221; in 2026.</p><p>Su also noted that &#8220;x86 processors have a particular edge in agentic workloads where AI agents spin off work to traditional CPU tasks, with the vast majority of such tasks running on x86 today.&#8221;</p><p>Looking ahead, Su guided for data center segment revenue to grow more than 60% annually over the next three to five years and for AMD&#8217;s AI business to scale to tens of billions in annual revenue by 2027. CPUs are a meaningful piece of that equation, not just GPUs.</p><p>And it&#8217;s not just the earnings calls. You can also see it in multiple conversations with industry experts.</p><p>A former CTO of an HP competitor highlights that infrastructure is moving from static policy-based routing to &#8220;inference-based&#8221; routing. 
An AI-powered controller layer, running on CPUs, dynamically analyzes incoming workloads to determine whether they require expensive GPU cycles or can be offloaded to traditional x86 CPUs, optimizing resource allocation.</p><p>Agentic AI often involves deterministic tasks (such as following a specific rule set or executing a defined API call) that do not require the probabilistic power of a GPU. A director at a global consultancy notes that these deterministic aspects of agentic workflows are most efficiently executed by CPUs, reinforcing the need for a balanced infrastructure where GPUs handle the &#8220;thinking&#8221; and CPUs handle the &#8220;doing.&#8221;</p><p><strong>The CPU demand was a shock for Intel</strong></p><p>If AMD saw the CPU demand wave coming, Intel was genuinely surprised by it. Intel&#8217;s Q4 revenue came in at $13.7 billion, above guidance, with data center and AI revenue rising 15% sequentially, the fastest sequential growth this decade. But here&#8217;s the key: Intel admitted it couldn&#8217;t meet all the demand.</p><p>CEO Lip-Bu Tan said the company &#8220;delivered these results despite supply constraints, which meaningfully limited our ability to capture all of the strength in our underlying markets.&#8221; CFO David Zinsner was even more direct, admitting that Intel &#8220;misjudged&#8221; the pace of data center CPU demand and that the company is now &#8220;shifting as much as we can over to the data center&#8221; by reallocating wafer capacity from client (PC) CPUs to server CPUs.</p><p>Zinsner acknowledged that Intel is &#8220;absolutely constrained&#8221; and is deprioritizing the low-end client market to push capacity into data center products. Intel expects its supply to hit a low point in Q1 2026 before improving in Q2, but in the meantime, revenue &#8220;would have been higher if we had more supply.&#8221; 
Management explicitly positioned CPUs as &#8220;central to AI orchestration and scaling inference.&#8221;</p><p><strong>The AWS-OpenAI Deal was the tell</strong></p><p>The most interesting data point on CPU demand came not from a chip company but from a cloud infrastructure deal back in November 2025. AWS and OpenAI announced a $38 billion, seven-year strategic partnership.<strong> </strong>The press release stated that OpenAI would access &#8220;hundreds of thousands of state-of-the-art NVIDIA GPUs, <strong>with the ability to expand to tens of millions of CPUs </strong>to rapidly scale agentic workloads.&#8221;</p><p>People wrongly focused on the Nvidia GPU part, but the CPU part is far more interesting. Tens of millions of CPUs. For agentic workloads. They didn&#8217;t have to include that detail. The fact that it&#8217;s in the official announcement tells you how seriously the frontier AI labs are thinking about CPU compute as a scaling requirement. All capacity under this agreement was targeted for deployment before the end of 2026, with options to expand into 2027.</p><p><strong>Nvidia &#8212; The Vera CPU</strong></p><p>Nvidia itself is making a big bet on the CPU side. Its upcoming Vera CPU, part of the Rubin platform announced at CES 2026, is specifically designed for agentic reasoning workloads. Vera delivers up to 2x the performance of the previous Grace CPU, with 88 cores per die and significant uplifts in memory and chip-to-chip bandwidth.</p><p>What&#8217;s particularly notable is that Nvidia announced Vera can be deployed as a standalone platform for agentic processing, separate from the GPU. CoreWeave is set to use standalone Vera CPUs, and Jensen hinted in a Bloomberg interview that &#8220;there are going to be many more&#8221; standalone CPU deployments. 
And it didn&#8217;t take long. The Meta &amp; Nvidia deal was announced a few days ago:</p><div class="pullquote"><p>&#187;This partnership will enable the large-scale deployment of NVIDIA CPUs and millions of NVIDIA Blackwell and Rubin GPUs, as well as the integration of NVIDIA Spectrum-X&#8482; Ethernet switches for Meta&#8217;s Facebook Open Switching System platform&#8230;The collaboration represents the first large-scale NVIDIA Grace-only deployment.&#171;</p></div><p>This is Nvidia essentially confirming the thesis: in agentic AI, the CPU-to-GPU ratio needs to go up, and some workloads may be purely CPU-bound.</p><p><strong>Are We Heading Into a CPU Bottleneck?</strong></p><p>We&#8217;re already in one. The server CPU supply chain is under significant stress, and the constraints are coming from multiple directions simultaneously.</p><p>Intel is struggling with yield issues at some of its fabs, slowing the production ramp for newer Xeon parts. The company has admitted it cannot meet demand and is reallocating capacity from PC CPUs to server CPUs, meaning the PC segment will take a hit. Intel expects supply to improve starting Q2 2026, but the situation remains &#8220;acute&#8221; in Q1.</p><p>TSMC is prioritizing AI accelerators, which means less capacity for CPUs. AMD&#8217;s server CPUs are manufactured by TSMC, but TSMC is aggressively prioritizing its advanced node capacity for higher-margin AI accelerator chips (GPUs and custom ASICs). TSMC chairman C.C. Wei publicly stated that advanced-node capacity is &#8220;about three times short&#8221; of what major customers plan to consume. 
When TSMC&#8217;s 3nm process is running at 160,000 wafers per month and that&#8217;s still not enough, and when CoWoS advanced packaging capacity is sold out through 2026, CPU wafer allocation gets squeezed as a collateral effect.</p><p>Intel has also already warned Chinese customers of delivery lead times of up to six months for certain server CPUs. AMD&#8217;s lead times have stretched to 8-10 weeks for some products. Intel server chip prices in China have risen more than 10%. China represents over 20% of Intel&#8217;s total revenue, and major customers like Alibaba and Tencent are affected.</p><p>An additional supply problem is the memory-driven pull-forward. The severe global memory shortage is creating a rush effect on CPU purchases. When memory prices started rising in China in late 2025, customers accelerated CPU purchases to lock in system-level pricing before costs spiraled further. This pull-forward exacerbated the existing supply tightness.</p><p>A cloud computing materials manager reports:</p><div class="pullquote"><p>&#8220;Our supply chain was a constraining factor... GPU, CPU, and RAM were the top three drivers for us being constrained&#8221; as customers convert to &#8220;more powerful CPUs that can run higher AI workloads.&#8221;</p><p>Source: AlphaSense</p></div><p>A global IT distributor reports CPU shortages are &#8220;directly driving a 30% increase in average selling prices (ASPs) during the fourth quarter of 2025&#8221; with &#8220;increased backlogs&#8221; as order intake exceeds expectations.</p><p>So the CPU bottleneck is already here; the question now is how long it will last.</p><p>In the next section, I analyze how many CPUs we will need in this agentic AI era and give a timeline for when supply could meet demand, along with which companies stand to benefit most from this trend:</p><p></p>
      <p>
          <a href="https://www.uncoveralpha.com/p/the-forgotten-chip-cpus-the-new-bottleneck">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Market Hates Big Cloud Spending. The Data Says The Market Is Wrong.]]></title><description><![CDATA[There has been an emergence of fear after big tech earnings related to CapEx AI spending. I decided to share my views on this topic and why I believe the fears around it are wrong at this point.]]></description><link>https://www.uncoveralpha.com/p/the-market-hates-big-cloud-spending</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/the-market-hates-big-cloud-spending</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Wed, 11 Feb 2026 14:17:23 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!tWTP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8fea0-2c07-46ea-9bee-99b880ebee18_5370x4163.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>Because there has been an emergence of fear after big tech earnings related to CapEx AI spending, I decided to share my views on this topic and why I believe the fears around it are wrong at this point.</p><p>We had earnings from Meta, Microsoft, Google, and Amazon, and all of them increased AI CapEx substantially. Microsoft CapEx went from $63 billion in 2025 to a guide of over $100 billion for 2026; Google (Alphabet) went from $91 billion to a range of $175 billion to $185 billion; Meta went from $72 billion to a range of $115 billion to $135 billion; and Amazon went from $131 billion to a $200 billion guidance for 2026.</p><p>At this point, given the massive increases in these CapEx numbers, as an investor you are either in the camp that doesn&#8217;t believe the companies will be able to deliver revenues and profits on these investments, or in the camp that does, in which case the future revenue and profit outlooks are very high. 
Given that the stocks mostly sold off on this CapEx news, the &#187;bears&#171; took over, but I don&#8217;t agree with them, and this article explains why. In the last part of the article, I also break down which hyperscaler looks best positioned to further accelerate its growth in 2026 and 2027.</p><p>The CEOs of all these businesses are not only telling you they believe the profits will be there, they are already showing you growth that came from 2023 AI investments (which, keep in mind, were also often criticized as outlandish), and more importantly, they are showing you PROFITS on these AI investments.</p><p>Here is what the market has missed.</p><p><strong>The Q4 2025 AI Earnings profits were overshadowed by future CapEx numbers</strong></p><p>Before we go into the actual numbers, the fact is that we got really strong commentary from nearly all the big tech CEOs on AI revenue and returns from these investments. One might think that they are saying this because it is in their interest, but that is not really true. For most big tech companies like Google and Microsoft, it is actually in their interest for AI progress to grow at a more gradual rate than the exponential one it has today. The reason is that a lot of their business lines face disruption risk (Google Search, Microsoft software business, etc.). So the strong commentary from these CEOs should be read differently than comments from startups like OpenAI, Anthropic, and xAI, which need to raise new capital rounds and therefore naturally have to project confidence, both in their own companies and in the AI market in general.</p><p>We got some interesting comments from Amazon, which has a history of being very strict and efficient in its data center business. 
Andy Jassy repeatedly confirmed his confidence in the return on invested capital:</p><div class="pullquote"><p>&#187;We have deep experience understanding demand signals in the AWS business and then turning that capacity into strong return on invested capital. We&#8217;re confident this will be the case here as well.&#171;</p><p>&#187;We have, I think, a fair bit of experience over the years in AWS of forecasting demand signals and doing it in such a way that we don&#8217;t have a lot of wasted capacity and that we also have enough capacity to serve the demand that&#8217;s there.&#171;</p><p>&#187;And I think we&#8217;ve also proven with AWS over the years in how we build data centers and how we run them and how we invent in there, if you think about our chips and our hardware and our networking gear and how we&#8217;ve invented in power that this isn&#8217;t some sort of quixotic top line grab, we have confidence that we -- that these investments will yield strong returns on invested capital. We&#8217;ve done that with our core AWS business. I think that will very much be true here as well.&#171;</p></div><p>Jassy even confirmed that as soon as they bring new capacity online, it&#8217;s essentially sold out:</p><div class="pullquote"><p>&#187;And what we&#8217;re continuing to see is as fast as we install this capacity, this AI capacity, we are monetizing it. And so it&#8217;s just a very unusual opportunity. And so we see that following the same sorts of patterns we saw in the early days of our core AWS investment. 
I&#8217;m very confident we&#8217;re going to have strong return on invested capital here.&#171;</p></div><p>Historically, Amazon has tended to underhype in its public commentary, so a comment like this was very telling:</p><div class="pullquote"><p>&#187;I think this is an extraordinarily unusual opportunity to forever change the size of AWS and Amazon as a whole.&#171;</p></div><p>Remember, just before AI, there was a big trend of companies moving workloads from the cloud back to on-prem because they thought many cloud workloads were too expensive. Now companies are realizing that AI workloads will need to be in the cloud, because they don&#8217;t have the resources, or even the ability, to manage complex data centers with liquid cooling requirements (most data centers don&#8217;t have the option of liquid cooling), keep GPU utilization rates high, and manage multiple AI accelerators (Nvidia GPUs, AMD GPUs, and ASICs like TPUs and Trainium). Because of this, they have also started moving non-AI workloads to the cloud, as the data needs to be close to the AI workloads for them to run properly.</p><div class="pullquote"><p>&#187;We&#8217;re continuing to see strong growth in core non-AI workloads as enterprises return to focusing on moving infrastructure from on-premises to the cloud&#171;</p></div><p>Because of AI, the cloud providers have increased their cloud &#187;lock-in&#171; and are growing even non-AI workloads.</p><p>I already talked about this trend just a few weeks ago in my <a href="https://www.uncoveralpha.com/p/q4-2025-channel-checks-and-alternative">Q4 alternative report article</a>, where we showed this chart confirming that companies are going to move to the cloud at an accelerated pace again over the next 2 years:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!5JAM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb11d3cc0-5b95-4b2b-bdfc-f03f471ead9d_861x563.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5JAM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb11d3cc0-5b95-4b2b-bdfc-f03f471ead9d_861x563.png 424w, https://substackcdn.com/image/fetch/$s_!5JAM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb11d3cc0-5b95-4b2b-bdfc-f03f471ead9d_861x563.png 848w, https://substackcdn.com/image/fetch/$s_!5JAM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb11d3cc0-5b95-4b2b-bdfc-f03f471ead9d_861x563.png 1272w, https://substackcdn.com/image/fetch/$s_!5JAM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb11d3cc0-5b95-4b2b-bdfc-f03f471ead9d_861x563.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5JAM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb11d3cc0-5b95-4b2b-bdfc-f03f471ead9d_861x563.png" width="861" height="563" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b11d3cc0-5b95-4b2b-bdfc-f03f471ead9d_861x563.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:563,&quot;width&quot;:861,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:49197,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/187624962?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb11d3cc0-5b95-4b2b-bdfc-f03f471ead9d_861x563.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5JAM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb11d3cc0-5b95-4b2b-bdfc-f03f471ead9d_861x563.png 424w, https://substackcdn.com/image/fetch/$s_!5JAM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb11d3cc0-5b95-4b2b-bdfc-f03f471ead9d_861x563.png 848w, https://substackcdn.com/image/fetch/$s_!5JAM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb11d3cc0-5b95-4b2b-bdfc-f03f471ead9d_861x563.png 1272w, https://substackcdn.com/image/fetch/$s_!5JAM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb11d3cc0-5b95-4b2b-bdfc-f03f471ead9d_861x563.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>But it wasn&#8217;t just Amazon talking about AI returns; the other Big Tech companies were, too. Meta gave a lot of color on how AI investments are already showing up in its business results:</p><div class="pullquote"><p>&#187;In Q4, we doubled the number of GPUs we used to train our GEM model for ads ranking. We also adopted a new sequence learning model architecture, which is capable of using longer sequences of user behavior and processing much richer information about each piece of content. The GEM and sequence learning improvements together drove a 3.5% lift in ad clicks on Facebook and a more than 1% gain in conversions on Instagram in Q4.&#171;</p><p>&#187;Instagram Reels had another strong quarter with watch time up more than 30% year-over-year in the U.S. 
Engagement is benefiting from several optimizations we made to improve the quality of recommendations including simplifying our ranking architecture to enable more efficient model scaling.&#171;</p></div><p>&#187;On Facebook, video time continued to grow double digits year-over-year in the U.S., and we&#8217;re seeing strong results from our ranking and product efforts on both feed and video surfaces.&#171;</p><div class="pullquote"><p>&#187;The optimizations we made in Q4 drove a 7% lift in views of organic feed and video posts on Facebook, resulting in the largest quarterly revenue impact from Facebook product launches in the past two years.&#171;</p></div><p>Meta is seeing results from AI in both better ad targeting and engagement trends. The results actually &#187;revived&#171; Meta&#8217;s core and oldest platform, Facebook, which is seeing growth rates it hasn&#8217;t seen in years. But AI is opening up other avenues of growth at Meta:</p><div class="pullquote"><p>&#187;Another area we&#8217;re deploying AI to improve performance is ad creative. The combined revenue run rate of video generation tools hit $10 billion in Q4, with quarter-over-quarter growth outpacing the increase in overall ads revenue by nearly 3x.&#171;</p></div><p>The returns are affecting not only Meta&#8217;s revenue but also the productivity of its teams:</p><div class="pullquote"><p>&#187;Since the beginning of 2025, we&#8217;ve seen a 30% increase in output per engineer with the majority of that growth coming from the adoption of agentic coding, which saw a big jump in Q4. We&#8217;re seeing even stronger gains with power users of AI coding tools, whose output has increased 80% year-over-year. We expect this growth to accelerate through the next half. 
&#171;</p></div><p>But despite these gains, Meta is telling us that it&#8217;s still very early. Its use of LLMs remains limited: it either has to fall back on smaller models (SLMs) at run time because of compute limitations, or it is still in the early stages of deploying LLMs across its product stack:</p><div class="pullquote"><p>&#187;We&#8217;re also working on merging LLMs with the recommendation systems that power Facebook, Instagram, Threads and our ad system. Our world-class recommendation systems are already driving meaningful growth across our apps and ads business, but we think that the current systems are primitive compared to what will be possible soon.&#171;</p><p>&#187;We don&#8217;t typically use our larger model architectures like GEM for inference because their size and complexity would make it too cost prohibitive. So the way that we drive performance from those models is by using them to transfer knowledge to smaller lightweight models used at run time. But I would say that we think that there is room for our larger models to benefit from having more compute.&#171;</p></div><p>All of this resulted in Meta giving its highest revenue growth guide in almost 5 years. And despite the higher CapEx guide and higher costs from both OpEx (new AI team costs + compute costs on public cloud providers) and amortization, Meta confirmed that it expects 2026 operating income to come in above 2025.</p><p>At Google, Google Cloud grew 48% YoY, one of the highest growth rates among businesses of this scale. Google Search grew 17% YoY, a growth rate Search has not seen for quite some time. 
On the call, management even commented that Search saw more usage in Q4 than ever before, as &#187;AI continues to drive an expansionary moment for Search&#171;.</p><p>But even ignoring all the commentary from these companies&#8217; management, let&#8217;s look at the hard numbers.</p><p>First, starting with revenue. All three hyperscalers are essentially selling all the compute they have available; if they had more, they would grow revenue even faster.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tWTP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8fea0-2c07-46ea-9bee-99b880ebee18_5370x4163.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tWTP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8fea0-2c07-46ea-9bee-99b880ebee18_5370x4163.png 424w, https://substackcdn.com/image/fetch/$s_!tWTP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8fea0-2c07-46ea-9bee-99b880ebee18_5370x4163.png 848w, https://substackcdn.com/image/fetch/$s_!tWTP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8fea0-2c07-46ea-9bee-99b880ebee18_5370x4163.png 1272w, https://substackcdn.com/image/fetch/$s_!tWTP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8fea0-2c07-46ea-9bee-99b880ebee18_5370x4163.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!tWTP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8fea0-2c07-46ea-9bee-99b880ebee18_5370x4163.png" width="1456" height="1129" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e0c8fea0-2c07-46ea-9bee-99b880ebee18_5370x4163.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1129,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:753267,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/187624962?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8fea0-2c07-46ea-9bee-99b880ebee18_5370x4163.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tWTP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8fea0-2c07-46ea-9bee-99b880ebee18_5370x4163.png 424w, https://substackcdn.com/image/fetch/$s_!tWTP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8fea0-2c07-46ea-9bee-99b880ebee18_5370x4163.png 848w, https://substackcdn.com/image/fetch/$s_!tWTP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8fea0-2c07-46ea-9bee-99b880ebee18_5370x4163.png 1272w, https://substackcdn.com/image/fetch/$s_!tWTP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0c8fea0-2c07-46ea-9bee-99b880ebee18_5370x4163.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>AWS grew 24% YoY, Azure grew 39% YoY, and Google Cloud grew 48% YoY. 
Their backlogs are growing even faster.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s877!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cee2578-ee2c-466b-87be-270f68fb43c1_4099x1968.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s877!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cee2578-ee2c-466b-87be-270f68fb43c1_4099x1968.png 424w, https://substackcdn.com/image/fetch/$s_!s877!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cee2578-ee2c-466b-87be-270f68fb43c1_4099x1968.png 848w, https://substackcdn.com/image/fetch/$s_!s877!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cee2578-ee2c-466b-87be-270f68fb43c1_4099x1968.png 1272w, https://substackcdn.com/image/fetch/$s_!s877!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cee2578-ee2c-466b-87be-270f68fb43c1_4099x1968.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s877!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cee2578-ee2c-466b-87be-270f68fb43c1_4099x1968.png" width="1456" height="699" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4cee2578-ee2c-466b-87be-270f68fb43c1_4099x1968.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:699,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:344389,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/187624962?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cee2578-ee2c-466b-87be-270f68fb43c1_4099x1968.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!s877!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cee2578-ee2c-466b-87be-270f68fb43c1_4099x1968.png 424w, https://substackcdn.com/image/fetch/$s_!s877!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cee2578-ee2c-466b-87be-270f68fb43c1_4099x1968.png 848w, https://substackcdn.com/image/fetch/$s_!s877!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cee2578-ee2c-466b-87be-270f68fb43c1_4099x1968.png 1272w, https://substackcdn.com/image/fetch/$s_!s877!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cee2578-ee2c-466b-87be-270f68fb43c1_4099x1968.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>From the current revenue growth and the growing backlogs, we can clearly see that the hyperscalers are again growing significantly due to AI workloads. The standout in the quarter, as we correctly pointed out in our alternative data report before earnings, was Google Cloud. It is clear that past AI spend is translating into real revenue growth. So the notion that these companies are spending only on CapEx and we can&#8217;t see revenue from it is false. 
Now, the question and the narrative in the market is that the profits won&#8217;t come from this revenue stream.</p><p>This thesis rests on two arguments: first, that AI workloads will have a lower long-term margin profile, and second, return calculations that compare projected CapEx guides with current revenues.</p><p>Let&#8217;s first tackle the CapEx argument. It is important to understand that the CapEx a hyperscaler spends on a data center this year only starts generating revenue roughly 2 years later, as it takes around 2 years to build and operationalize a data center. So when people look at 2025 revenue growth for the hyperscalers, they should tie it to CapEx spent in 2023, not in 2024 or 2025. In a period like today, when estimated 2026 YoY CapEx growth is +53% (AWS), +93% (Google Cloud), and +59% (Microsoft Cloud), the math doesn&#8217;t make much sense when compared to 2025 revenues, because we should really be comparing 2023 CapEx to 2025 revenue growth.</p><p>If we look at 2023 CapEx numbers, we can see that both Microsoft and Google increased CapEx in 2023 by 17.5% vs 2022 levels, to $32.3B and $28.1B, while Amazon reduced CapEx by 17% YoY to $52.7B, although, based on my calculations, the reduction at AWS was only 10%, to $24.8B. Now, if we compare those CapEx numbers to the revenues generated by hyperscalers in 2025, the math makes a lot of sense, as yearly revenue additions are outpacing CapEx spending.</p><p>Even for the most conservative investors out there, we can take the example of Google Cloud and use the more recent 2024 CapEx, comparing it to the Q4 2025 results:</p><p>Google&#8217;s 2024 CapEx was $52.5 billion, with roughly $42 billion going to technical infrastructure (cloud/AI). 
Google Cloud grew from $48 billion (2024) to $70.8 billion (2025), a $22.8 billion increase.</p><p>At the new 30.1% operating margin:</p><p>$6.9 billion in first-year operating income from 2024 CapEx ($22.8 billion &#215; 30.1%)</p><p>Add back depreciation (operating income is already net of it): +$7.0 billion ($42 billion on Google&#8217;s 6-year schedule)</p><p>Total first-year cash: $13.9 billion</p><p>First-year return: 33% ($13.9 billion on $42 billion invested)</p><p>But here&#8217;s where Google&#8217;s trajectory gets interesting. Google Cloud went from 5% margins (2023) to 17.5% (Q4 2024) to 30.1% (Q4 2025). If margins stabilize at 30% (I actually think they will grow even further) and Google runs that 2024 infrastructure for five years:</p><p>Cumulative OI: around $45 billion</p><p>Add depreciation: +$42 billion</p><p>Residual value (data center shell): +$8 billion</p><p>Total: $95 billion on $42 billion invested</p><p>ROI: 126% over 5 years, or roughly 18% annualized</p><p>And that still assumes growth moderates significantly from the current 48% YoY pace, while the margin stays at the 30% level and doesn&#8217;t improve.</p><p>With the increased pace of 2026 CapEx growth, the hyperscalers are essentially telling us what the revenue additions and, with them, growth rates will be for 2028.</p><p>Now to the argument that long-term margins on AI workloads will be worse than in the pre-AI period: the numbers so far do not suggest this at all. 
Here is a look at AWS and Google Cloud&#8217;s operating margins over the last quarters, where AI workloads accounted for the majority of growth.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6Dpv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc20176-2d56-4dac-b89f-9d18b144c617_4768x2973.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6Dpv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc20176-2d56-4dac-b89f-9d18b144c617_4768x2973.png 424w, https://substackcdn.com/image/fetch/$s_!6Dpv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc20176-2d56-4dac-b89f-9d18b144c617_4768x2973.png 848w, https://substackcdn.com/image/fetch/$s_!6Dpv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc20176-2d56-4dac-b89f-9d18b144c617_4768x2973.png 1272w, https://substackcdn.com/image/fetch/$s_!6Dpv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc20176-2d56-4dac-b89f-9d18b144c617_4768x2973.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6Dpv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc20176-2d56-4dac-b89f-9d18b144c617_4768x2973.png" width="1456" height="908" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4cc20176-2d56-4dac-b89f-9d18b144c617_4768x2973.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:908,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:368786,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/187624962?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc20176-2d56-4dac-b89f-9d18b144c617_4768x2973.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6Dpv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc20176-2d56-4dac-b89f-9d18b144c617_4768x2973.png 424w, https://substackcdn.com/image/fetch/$s_!6Dpv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc20176-2d56-4dac-b89f-9d18b144c617_4768x2973.png 848w, https://substackcdn.com/image/fetch/$s_!6Dpv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc20176-2d56-4dac-b89f-9d18b144c617_4768x2973.png 1272w, https://substackcdn.com/image/fetch/$s_!6Dpv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc20176-2d56-4dac-b89f-9d18b144c617_4768x2973.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The operating margin either held up in the same range, as at AWS (where AI workloads are still a smaller share of the mix), or increased significantly, as at Google Cloud, where AI workloads are a bigger piece of the pie. Here, we have to acknowledge that Google Cloud is not only GCP. Nonetheless, management commentary in all the latest quarters has been that GCP is growing even faster than Google Cloud as a whole, so we should have seen a trend of lower operating margins, not higher, if AI workloads carried a low margin profile. 
Another point to consider is that a lot of the AI workload spend at GCP in this period came from Anthropic, a single client with much more negotiating power on pricing than the many smaller clients that cloud providers are now shifting toward, as inference workloads take up more space and companies move their AI use cases to production. Also important in the context of margin is the statement Google made on its last earnings call:</p><div class="pullquote"><p>&#8220;We were able to lower Gemini serving unit cost by 78% over 2025 through model optimizations, efficiency and utilization improvements.&#8221;</p></div><p>What this tells us is that as these hyperscalers get even larger, they can optimize and squeeze more out of existing infrastructure. While some of those cost optimizations will be passed on to the cloud client, it is very clear that the ones with the most scale will also be able to use them to further expand their margin profile. Scale plays a key role here, but so do custom ASICs.</p><p><strong>Custom ASIC is the key</strong></p><p>Another strong argument that I already laid out in many of my previous articles is the custom silicon that cloud providers are designing. I continue to believe that this will be a critical element for any cloud provider to maintain healthy margins in the long term and avoid becoming overly dependent on a provider like Nvidia, which now has gross margins of almost 75%. In terms of custom ASICs, Google is best positioned with its TPUs, as we already laid out in the <a href="https://www.uncoveralpha.com/p/the-chip-made-for-the-ai-inference">TPU article</a>, followed by Amazon with Trainium. While Microsoft&#8217;s efforts here lag those of the other two, it is important to note that Microsoft also owns full IP rights to the custom ASICs that OpenAI will develop.</p><p>No surprise that on the Amazon earnings call, Trainium was mentioned 27 times, while Nvidia was not mentioned at all. 
It went so far that the CEO specifically called out Amazon&#8217;s chip business and broke out its revenue for us as a separate category:</p><div class="pullquote"><p>&#187;I think people know about our chips capability and our chips business, but I&#8217;m not sure folks realize how strong a chips company we&#8217;ve become over the last 10 years.</p><p>If you look at what we&#8217;ve done with Trainium, if you look at what we&#8217;ve done with Graviton, which is our CPU chip, which is about 40% better price performance than comparable x86 processors, 90% of the top 1,000 AWS customers are using Graviton very expansively. If you combine Trainium and Graviton, it&#8217;s well over a $10 billion annualized run rate business, and it&#8217;s still very early there.&#171;</p></div><p>Even though it lags from a product perspective, Microsoft also talked about its custom ASIC business very early in the call:</p><div class="pullquote"><p>&#187;Earlier this week, we brought online our Maia 200 accelerator. Maia 200 delivers 10-plus petaFLOPS at FP4 precision with over 30% improved TCO compared to the latest generation hardware in our fleet. We will be scaling this starting with inferencing and synthetic data gen for our Superintelligence Team as well as doing inferencing for Copilot and Foundry.&#171;</p></div><p>Custom silicon is what ensures hyperscalers can control their margin profile and market share, even in a more heated market where neoclouds and companies like Oracle have entered.</p><p><strong>Investors are questioning the AI compute demand, but in reality, we are just getting started</strong></p><p>A lot of investors are looking at the $600+ billion in combined hyperscaler CapEx projected for 2026 and questioning whether this is too much. What most people are missing is that we are still in the very early innings of AI compute demand, and the data backs this up. Right now, coding and developer tools have emerged as the single breakout vertical for AI. 
For those who don&#8217;t follow the industry closely or took a break in January, the difference in usage in 1 month is staggering. Daily install counts on VS Code more than doubled in just one month, and usage is growing even faster. Here is the VS Code data for Anthropic&#8217;s Claude Code and OpenAI Codex. Demand is going off the charts as developers no longer use these LLMs as tools, but as junior to mid-level programmers whose code they now only review:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KgI7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3b97732-fafd-4b27-82cb-e12704db06e3_601x485.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KgI7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3b97732-fafd-4b27-82cb-e12704db06e3_601x485.png 424w, https://substackcdn.com/image/fetch/$s_!KgI7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3b97732-fafd-4b27-82cb-e12704db06e3_601x485.png 848w, https://substackcdn.com/image/fetch/$s_!KgI7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3b97732-fafd-4b27-82cb-e12704db06e3_601x485.png 1272w, https://substackcdn.com/image/fetch/$s_!KgI7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3b97732-fafd-4b27-82cb-e12704db06e3_601x485.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!KgI7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3b97732-fafd-4b27-82cb-e12704db06e3_601x485.png" width="601" height="485" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c3b97732-fafd-4b27-82cb-e12704db06e3_601x485.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:485,&quot;width&quot;:601,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:66451,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/187624962?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3b97732-fafd-4b27-82cb-e12704db06e3_601x485.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KgI7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3b97732-fafd-4b27-82cb-e12704db06e3_601x485.png 424w, https://substackcdn.com/image/fetch/$s_!KgI7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3b97732-fafd-4b27-82cb-e12704db06e3_601x485.png 848w, https://substackcdn.com/image/fetch/$s_!KgI7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3b97732-fafd-4b27-82cb-e12704db06e3_601x485.png 1272w, https://substackcdn.com/image/fetch/$s_!KgI7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3b97732-fafd-4b27-82cb-e12704db06e3_601x485.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a></figure></div><p>Here&#8217;s the thing, though: coding is essentially one vertical. And it&#8217;s already consuming an enormous share of the available inference compute. Now think about what happens when finance, legal, healthcare, customer operations, and other enterprise verticals start scaling their AI workloads to the same degree. According to Menlo Ventures, enterprise AI investment tripled from $11.5 billion to $37 billion in just one year, yet only 16% of enterprise deployments today qualify as true AI agents&#8212;most are still fixed-sequence workflows. We are nowhere close to saturation. 
McKinsey&#8217;s data shows 78% of organizations are now using AI in at least one business function, but the actual conversion to heavy inference workloads across non-coding departments is still nascent. These numbers are tiny compared to where coding already is.</p><p>The market is pricing in CapEx as if coding-level adoption is the ceiling, when in reality, it is the floor.</p><p>These aren&#8217;t businesses lighting money on fire. These are businesses generating 30-35% operating margins on the largest infrastructure buildout in corporate history.</p><p>The custom chip businesses (Trainium, Graviton, TPUs) are growing at triple-digit rates and creating structural moats that compound over time.</p><p>The market is treating this like the 2000 fiber glut. That was infrastructure built for demand that didn&#8217;t exist.</p><p>This is infrastructure being absorbed as fast as it&#8217;s deployed. Hyperscaler CapEx isn&#8217;t irrational exuberance. It&#8217;s the most rational investment decision these companies can make. Amazon, Microsoft, and Google aren&#8217;t hoping for AI to work out. They&#8217;re reporting the P&amp;L that shows it already has.</p><p><strong>Not all hyperscalers will be able to capture market share this year, though. The limiting factor is availability.</strong></p><p>Based on past capacity commitments, I calculated which cloud provider should grow the fastest in 2026 and beyond, and here are the numbers:</p>
      <p>
          <a href="https://www.uncoveralpha.com/p/the-market-hates-big-cloud-spending">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Great SaaS Unbundling: Why AI Will Destroy Half the Industry and Supercharge the Other Half]]></title><description><![CDATA[Everyone&#8217;s talking about how AI will &#8220;transform&#8221; software, but I think most people are getting it wrong. The real story isn&#8217;t about transformation&#8212;it&#8217;s about bifurcation]]></description><link>https://www.uncoveralpha.com/p/the-great-saas-unbundling-why-ai</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/the-great-saas-unbundling-why-ai</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Mon, 02 Feb 2026 14:46:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!VTCB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F263462e3-7978-4a86-880e-fcb111dcf621_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>I&#8217;ve been thinking a lot about the AI disruption narrative in SaaS. Everyone&#8217;s talking about how AI will &#8220;transform&#8221; software, but I think most people are getting it wrong. The real story isn&#8217;t about transformation&#8212;it&#8217;s about bifurcation. Some SaaS companies are about to get absolutely demolished, while others will emerge stronger than ever. 
It&#8217;s not about looking at valuation levels for some of these SaaS companies and buying what is cheap on a valuation metric.</p><p>The determining factor for survival isn&#8217;t the brand, or even the data that the SaaS companies have&#8212;it&#8217;s whether their core system is deterministic or probabilistic.</p><p>Let me explain what I mean and why this matters for investors.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VTCB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F263462e3-7978-4a86-880e-fcb111dcf621_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VTCB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F263462e3-7978-4a86-880e-fcb111dcf621_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!VTCB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F263462e3-7978-4a86-880e-fcb111dcf621_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!VTCB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F263462e3-7978-4a86-880e-fcb111dcf621_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!VTCB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F263462e3-7978-4a86-880e-fcb111dcf621_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VTCB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F263462e3-7978-4a86-880e-fcb111dcf621_1024x1024.png" width="1024" 
height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/263462e3-7978-4a86-880e-fcb111dcf621_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1731291,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/186616042?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F263462e3-7978-4a86-880e-fcb111dcf621_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VTCB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F263462e3-7978-4a86-880e-fcb111dcf621_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!VTCB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F263462e3-7978-4a86-880e-fcb111dcf621_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!VTCB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F263462e3-7978-4a86-880e-fcb111dcf621_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!VTCB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F263462e3-7978-4a86-880e-fcb111dcf621_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p><strong>The Core Thesis: Deterministic vs. Probabilistic Systems</strong></p><p>Deterministic systems are those where precision is critical, state management is complex, and errors cascade into serious consequences. Think accounting software, ERP systems, compliance platforms, healthcare systems, payment processors, and sophisticated workflow engines. These systems need to be right 100% of the time&#8212;not 95%, not 99%, but 100%. 
When you&#8217;re reconciling a billion-dollar balance sheet or processing payroll for 50,000 employees, &#8220;close enough&#8221; isn&#8217;t acceptable.</p><div class="pullquote"><p>&#8220;Traditional enterprise functions, such as HR, are inherently deterministic; decisions like employee termination are binary and require rigid logic where specific inputs trigger precise, unvarying sequences. In contrast, LLMs are inherently probabilistic, determining the confidence level of the next token rather than following a hard-coded decision tree.&#8221;</p><p>Source: Employee at Rippling (AlphaSense)</p></div><p>Probabilistic systems are those where the core value proposition is pattern recognition, content generation, basic automation, or simple decision-making. Think chatbots, content recommendation engines, basic customer support automation, simple workflow tools, and generic productivity software. These systems can tolerate errors and are often based on &#8220;good enough&#8221; outputs.</p><p>Most likely, AI is going to eat the probabilistic category, while some deterministic systems will become more valuable by integrating AI as a complementary layer and starting to expand into other layers.</p><p><strong>Why Deterministic Systems Are Actually Strengthened by AI</strong></p><p>This might seem counterintuitive. If AI is so powerful, why wouldn&#8217;t it disrupt complex systems?</p><div class="pullquote"><p>&#8220;AI succeeds when autonomy is constrained, execution is owned, and determinism is treated as an asset rather than a limitation.&#8221;</p><p>Jens Eriksvik</p></div><p>When you look at how enterprises are actually deploying AI agents in 2025/2026, they&#8217;re not replacing their systems of record&#8212;they&#8217;re building orchestration layers on top of them. 
As a former Microsoft manager put it:</p><div class="pullquote"><p>&#187;A &#8216;reality check&#8217; is occurring among CIOs as they realize LLMs lack the deterministic consistency required for critical industries like financial services. For use cases such as underwriting, a system that provides a correct answer &#8216;six out of ten times&#8217; is insufficient; these processes demand 100% consistency, which current probabilistic models struggle to guarantee without extensive re-engineering.&#171;</p></div><p>LLMs interpret human intent; deterministic systems execute the actual work. The deterministic systems are not being disrupted; the operator (you) is. This is the architecture that&#8217;s winning in production environments.</p><p>Why does this matter? Because the companies that own these deterministic platforms become more valuable in an AI world, not less. They become the essential execution layer that AI needs to actually accomplish tasks.</p><p>The use of these deterministic platforms should rise substantially as more people gain valuable information from them with the help of AI. Usage goes up, but only with platforms that integrate these AI tools well into their deterministic platform cores.</p><p>But even with deterministic systems, there are challenges. Seat-based pricing must be converted to usage-based pricing. SaaS companies right now have to aggressively cut costs, specifically labour costs, SBC, etc., and get ahead of the curve. As a deterministic platform, you can charge a premium for your deterministic core offering. On top of that, you will be able to offer probabilistic tools that complement the deterministic core. Here, the pricing logic is simple: you price it at inference cost plus a 30% margin. 
Over time, as you build out your sticky offering as a platform, you can gradually try to expand that margin once again, but that time has not come yet.</p><p>As a deterministic platform provider, the goals are clear: provide a clear deterministic core, execute great probabilistic offerings that enhance the core, and cut costs AGGRESSIVELY in terms of labour as you increase OpEx spend on cloud infrastructure to reach mass scale and then negotiate better inference costs because of that scale.</p><p>The companies that do this will come out as big winners, as they will be able to consolidate and offer probabilistic features on top of their offering, priced at inference cost plus a 30% margin, and, with it, expand their TAM.</p><p><strong>The Probabilistic SaaS Bloodbath</strong></p><p>Now let&#8217;s talk about the other side of this equation&#8212;the SaaS companies that are in trouble.</p><p>If your core value proposition can be replicated by an LLM with 90% of the quality at 1% of the cost, and you provide a probabilistic product, you don&#8217;t have a sound business model anymore. The problem arises when your core value proposition is pattern matching, content generation, recommendations, or simple automation. Foundation models have gotten so good at these exact tasks that they can replicate your entire product in a few lines of code. The problem is not only cost (which is a big one); it also extends to the user interface, data, integration, and brand moats.</p><p>Having a &#8220;great UX&#8221; as a SaaS provider is irrelevant when natural language becomes the interface. Users would rather type &#8220;generate 10 marketing emails for our Q1 launch&#8221; into ChatGPT than navigate through HubSpot&#8217;s 47-screen workflow builder. While some call out proprietary data as the strong moat for these kinds of businesses, I would argue otherwise. Modern LLMs can learn from a small example set and perform as well as a model that has thousands of examples. 
The accelerating progress of LLMs and the emergence of synthetic data also hurt the incumbent data holders. Research from Meta in 2024 showed that models trained on synthetic data generated by GPT-4 perform within 2% of models trained on real data for most classification tasks. And that was in 2024; since then, this has only gotten better. Even if the proprietary data gives you your own model with 2-3% better accuracy, because of the probabilistic nature of your business, customers are not willing to pay 100x premiums for 2-3% better outcomes. They might do that if you had a deterministic system, however.</p><p>Moving on to the &#187;integration moat&#171;. The argument here is that these SaaS solutions are integrated with thousands of other apps and that this ecosystem is hard to replicate. However, most SaaS products have well-documented APIs, and AI excels as an integration layer without the need for pre-built connectors. With AI agents, these integrations and connections will become even more seamless as adoption accelerates and the SaaS companies, wanting to stay &#8220;useful&#8221; in the age of agentic AI, make their APIs even more open and clear.</p><p>Now to the moat called the brand. There is some merit to the idea that enterprises, to some extent, are loyal to brands as they build trust in those brands. But with probabilistic systems, that trust is weaker than it is with deterministic systems, where you know the results are 100% accurate. Enterprises are loyal only until the cost gap becomes too big. If the discount is 20-30%, most won&#8217;t switch, but if that discount grows to 50-70% or more, switching starts. The trust factor is also very fluid. 
AI startups with low-cost probabilistic system solutions gain trust via media coverage, raising billions in new VC funds, and hiring high-profile people from the incumbents.</p><p>The cutting of probabilistic SaaS is already underway, and this is more than just cutting seats.</p><p>Publicis Sapient reports actively reducing traditional SaaS licenses by approximately 50%&#8212;including major platforms like Adobe&#8212;by substituting them with generative AI tools and chatbots. An executive at the firm in an expert interview explains that AI agents are &#8220;10x faster, 100x smarter&#8221; than junior staff, creating a redundancy that directly cannibalizes the seat-based revenue underpinning commercial SaaS models.</p><p>For probabilistic SaaS, the only viable model is to cut costs to a minimum and price your product with a 30%+ margin on inference, but even that might not be sticky enough, especially if you don&#8217;t have any deterministic offering and if your clients are primarily SMBs. These companies will not be disrupted directly by AI, but by deterministic systems competing with them, offering AI-generated probabilistic offerings and bundling them into a single offering with a core deterministic system holding it together. If your ERP provider starts offering you a customer service system that works flawlessly with your ERP and uses your inference credits for both use-cases, you will likely switch over rather than have a separate customer service offering even if it is at the same cost.</p><p><strong>Valuation Compression is Already Here but it&#8217;s across the board</strong></p><p>Right now, the market is hitting SaaS across the board as it sees the risk of AI disruption. 
As of December 2025, the median EV/Revenue multiple for public SaaS companies stands at 5.1x, down from the pandemic peak of 18-19x and much lower than the historical average.</p><p>What the market hasn&#8217;t fully priced in yet is the difference between deterministic and probabilistic platforms that I laid out here, so the opportunity to own a deterministic SaaS platform at a reasonable price is definitely here.</p><p>Based on the criteria laid out in this article, I made a list of public SaaS companies and ranked them in deterministic/probabilistic order. Here are some of the ones I would highlight as being the least at risk of AI disruption:</p>
      <p>
          <a href="https://www.uncoveralpha.com/p/the-great-saas-unbundling-why-ai">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Anthropic's Claude Code is having its "ChatGPT" moment]]></title><description><![CDATA[I am posting an article on Anthropic Claude Code, which has been growing very significantly lately and, I believe, has developed an important product fit in its category.]]></description><link>https://www.uncoveralpha.com/p/anthropics-claude-code-is-having</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/anthropics-claude-code-is-having</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Mon, 26 Jan 2026 16:34:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Yr4h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f157c5-6b80-47c6-b480-3774aa5b90ac_1306x628.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>I am posting an article on Anthropic Claude Code, which has been growing very significantly lately and, I believe, has developed an important product fit in its category.</p><p>Claude Code is going from just another AI coding assistant to a fundamental new architecture that developers need to stay competitive.</p><p>In the final months of 2025 and opening weeks of 2026, Claude Code reached a $1 billion annualized run rate just six months after launch&#8212;a velocity that even ChatGPT didn&#8217;t match. Based on my analysis and data, which I will share in this article, I believe that Claude Code is today closer to $2B ARR than $1B, as it has accelerated significantly in January.</p><p>At the same time, Anthropic&#8217;s overall annualized revenue jumped from approximately $1 billion at the start of 2025 to $5 billion by August&#8212;a 5x increase in eight months&#8212;with projections reaching $9 billion by year-end 2025.</p><p>But raw revenue growth, while impressive, misses the deeper structural shift. 
Claude Code has achieved what competitors couldn&#8217;t: it&#8217;s become the tool developers reach for when facing their hardest problems. At a Seattle meetup in mid-January 2026, over 150 engineers packed the house to trade use cases. One Google principal engineer publicly acknowledged that Claude reproduced a year of architectural work in one hour. Microsoft&#8212;which sells GitHub Copilot&#8212;has widely adopted Claude Code internally across major engineering teams, with even non-developers reportedly encouraged to use it.</p><p>Let&#8217;s dive in.</p><p><strong>Anthropic is building a defensible moat in enterprise AI.</strong></p><p>Anthropic reached 300,000+ business customers by August 2025, up from fewer than 1,000 businesses two years prior. According to Thunderbit, Claude&#8217;s enterprise AI assistant market share rose from 18% in 2024 to 29% in 2025&#8212;a 61% year-over-year increase&#8212;closing the gap with ChatGPT.</p><p>Anthropic just recently signed a term sheet for a $10 billion funding round at a $350 billion valuation&#8212;nearly double the $183 billion valuation from September 2025. That September round itself represented a massive step up from the $61.5 billion valuation in March 2025. The valuation has grown nearly six-fold in ten months&#8212;a trajectory that few technology companies have ever achieved, and the success is mostly tied to their developer clients.</p><p><strong>Why are developers choosing Claude?</strong></p><p>The market is littered with AI coding tools&#8212;GitHub Copilot, Cursor, Amazon CodeWhisperer, Tabnine, Codex, and dozens more. Yet Claude Code captured the developer community in ways its competitors haven&#8217;t.</p><p>The Architecture!</p><p>Claude Code&#8217;s distinguishing characteristic isn&#8217;t its AI model&#8212;though Claude 4&#8217;s coding capabilities are state-of-the-art. It&#8217;s the architectural decision to operate directly in the terminal with full file system and command-line access. 
This matters because it changes the fundamental relationship between developer and AI.</p><p>Traditional coding assistants like GitHub Copilot work as IDE extensions, offering autocomplete suggestions and chat interfaces. They&#8217;re stateless&#8212;every interaction starts fresh, with limited context beyond the current file. Claude Code operates differently. It reads and writes files directly, executes bash commands, maintains state across sessions, and coordinates multi-step processes spanning days.</p><p>As Noah Brier, an early LLM adopter who discussed the tool on Bloomberg&#8217;s Odd Lots podcast, explained:</p><div class="pullquote"><p><em>&#8220;it&#8217;s more like hiring a junior developer than using autocomplete.&#8221;</em></p></div><p>The terminal-native design solves two problems that plague competing tools. First, it enables persistent state management. Claude Code stores information in files, building up context and knowledge over time. When working on a multi-day refactor, it remembers architectural decisions, maintains to-do lists, and tracks completed work&#8212;capabilities that chat-based assistants simply can&#8217;t match. Second, it leverages composable Unix commands. Instead of reinventing wheels, Claude Code chains together grep, sed, git, and other standard tools that developers already trust.</p><p>This architectural choice has profound implications for adoption. Developers don&#8217;t need to learn new interfaces or workflows. They work in the environment they already use&#8212;the terminal&#8212;with a tool that speaks their language. And because Claude Code operates as a true agent rather than an assistant, it can handle entire projects autonomously while developers focus on architecture and business logic.</p><p><strong>The model advantage: Claude 4 and Sonnet 4.5</strong></p><p>Of course, the underlying AI models matter enormously as well. 
Anthropic released Claude 4 (Opus and Sonnet) in May 2025, introducing what the company called &#8220;the world&#8217;s best coding model.&#8221; The benchmarks backed up the claim:</p><p>&#8226; Claude Opus 4: 72.5% on SWE-bench (measuring ability to solve real GitHub issues), 43.2% on Terminal-bench (command-line tasks)</p><p>&#8226; Claude Sonnet 4: 72.7% on SWE-bench, balancing performance with cost-efficiency</p><p>&#8226; Extended thinking with tool use: Models can now alternate between reasoning and tool use (like web search) during extended thinking sessions</p><p>&#8226; Memory capabilities: When given file access, Claude 4 creates and maintains &#8216;memory files&#8217; to store key information, dramatically improving performance on long-running agent tasks</p><p>Then in September 2025, Anthropic released Claude Sonnet 4.5, which became their most powerful model to date. The improvements were dramatic:</p><p>&#8226; 77.2% on SWE-bench Verified (82.0% with parallel compute)</p><p>&#8226; Code editing error rate: Dropped from 9% to 0% on Anthropic&#8217;s internal benchmarks</p><p>&#8226; Long-horizon task performance: Maintains focus for more than 30 hours on complex, multi-step tasks (vs. ~7 hours for Opus 4)</p><p>&#8226; 61.4% on OSWorld (desktop/browser interaction), up from 42.2% just four months prior</p><p>In November 2025, Anthropic released Claude Opus 4.5, which achieved 80.9% on SWE-bench Verified while using up to 65% fewer tokens than previous models. This efficiency translates directly to cost savings for developers running complex workflows.</p><p>Critically, these weren&#8217;t just benchmark improvements&#8212;they showed up in production. GitHub integrated Claude Sonnet 4 to power GitHub Copilot&#8217;s new coding agent. 
Cursor called Opus 4 &#8220;state-of-the-art for coding and a leap forward in complex codebase understanding.&#8221; Replit reported &#8220;dramatic advancements for complex changes across multiple files.&#8221; Block noted it was &#8220;the first model to boost code quality during editing and debugging.&#8221;</p><p>Bloomberry conducted research on over 45k companies, and the results are very insightful into which industries Anthropic dominates vs OpenAI.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0jcA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4655461-dd0e-429c-b995-5b8fdddb490b_1011x674.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0jcA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4655461-dd0e-429c-b995-5b8fdddb490b_1011x674.png 424w, https://substackcdn.com/image/fetch/$s_!0jcA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4655461-dd0e-429c-b995-5b8fdddb490b_1011x674.png 848w, https://substackcdn.com/image/fetch/$s_!0jcA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4655461-dd0e-429c-b995-5b8fdddb490b_1011x674.png 1272w, https://substackcdn.com/image/fetch/$s_!0jcA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4655461-dd0e-429c-b995-5b8fdddb490b_1011x674.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!0jcA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4655461-dd0e-429c-b995-5b8fdddb490b_1011x674.png" width="1011" height="674" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d4655461-dd0e-429c-b995-5b8fdddb490b_1011x674.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:674,&quot;width&quot;:1011,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:81315,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/185856188?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4655461-dd0e-429c-b995-5b8fdddb490b_1011x674.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0jcA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4655461-dd0e-429c-b995-5b8fdddb490b_1011x674.png 424w, https://substackcdn.com/image/fetch/$s_!0jcA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4655461-dd0e-429c-b995-5b8fdddb490b_1011x674.png 848w, https://substackcdn.com/image/fetch/$s_!0jcA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4655461-dd0e-429c-b995-5b8fdddb490b_1011x674.png 1272w, https://substackcdn.com/image/fetch/$s_!0jcA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4655461-dd0e-429c-b995-5b8fdddb490b_1011x674.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">source: <a href="https://bloomberry.com/blog/we-analyzed-44k-companies-to-see-who-uses-claude-or-perplexity/">Bloomberry</a></figcaption></figure></div><p>The software development vertical is especially interesting, as companies are 2.3 times more likely to be Claude-only than OpenAI-only.</p><p>On the other hand, the industries where OpenAI dominates Anthropic are marketing services, real estate, advertising, and business consulting.</p><p>In another survey of 99 professional developers, conducted by UC San Diego and Cornell University in January, Claude 
Code (58 respondents) appeared alongside GitHub Copilot (53) and Cursor (51) as one of the three most widely adopted platforms, with 29 respondents using multiple agents simultaneously.</p><p>In 2026, Claude is accelerating even faster with the launch of Cowork, which proved particularly significant. Users had already been using Claude Code for non-coding tasks (vacation research, spreadsheet work via Slack, oven control). By launching Cowork, Anthropic showed that Claude Code&#8217;s total addressable market extends far beyond the 28 million professional developers globally.</p><p>Now, in addition to Cowork, there is a new trend: a personal assistant called Clawd bot. Although Clawd bot is an open-source project rather than an Anthropic product, it has become the &#187;ChatGPT&#171; moment for personal intelligence. For most users, Clawd works best when paired with Claude, which is causing a surge in Claude Code usage.</p><p>This is the most eye-opening chart in this article. It shows the daily install counts of AI coding assistants in Visual Studio Code. 
For those non-technical, VS Code is the industry standard for code editors and the primary host of AI coding agents:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Yr4h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f157c5-6b80-47c6-b480-3774aa5b90ac_1306x628.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Yr4h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f157c5-6b80-47c6-b480-3774aa5b90ac_1306x628.png 424w, https://substackcdn.com/image/fetch/$s_!Yr4h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f157c5-6b80-47c6-b480-3774aa5b90ac_1306x628.png 848w, https://substackcdn.com/image/fetch/$s_!Yr4h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f157c5-6b80-47c6-b480-3774aa5b90ac_1306x628.png 1272w, https://substackcdn.com/image/fetch/$s_!Yr4h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f157c5-6b80-47c6-b480-3774aa5b90ac_1306x628.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Yr4h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f157c5-6b80-47c6-b480-3774aa5b90ac_1306x628.png" width="1306" height="628" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/21f157c5-6b80-47c6-b480-3774aa5b90ac_1306x628.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:628,&quot;width&quot;:1306,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:169401,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/185856188?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f157c5-6b80-47c6-b480-3774aa5b90ac_1306x628.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Yr4h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f157c5-6b80-47c6-b480-3774aa5b90ac_1306x628.png 424w, https://substackcdn.com/image/fetch/$s_!Yr4h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f157c5-6b80-47c6-b480-3774aa5b90ac_1306x628.png 848w, https://substackcdn.com/image/fetch/$s_!Yr4h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f157c5-6b80-47c6-b480-3774aa5b90ac_1306x628.png 1272w, https://substackcdn.com/image/fetch/$s_!Yr4h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f157c5-6b80-47c6-b480-3774aa5b90ac_1306x628.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Since the start of 2026, Claude Code has been surging! It went from 17.7M daily installs (30-day moving average), similar to where OpenAI&#8217;s Codex was, to 29M, and it continues to rise exponentially. 
This really shows that Claude Code is having its own &#187;ChatGPT&#171; moment TODAY.</p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><p><strong>Why does coding matter so much as an AI vertical?</strong></p><p>In short, because the results are measurable, and companies can put serious investment behind these productivity gains. The academic research and enterprise case studies paint a consistent picture: AI coding tools deliver 26-55% productivity improvements, with experienced developers seeing the largest gains.</p><p>GitHub Copilot baseline: A 2022 controlled experiment found that developers using GitHub Copilot completed tasks 55.8% faster (95% confidence interval: 21-89%) than control groups. Subsequent enterprise deployments confirmed these gains:</p><p>&#8226; GitHub&#8217;s own research: Developers code up to 51% faster for certain tasks</p><p>&#8226; Accenture randomized trial: 8.69% increase in pull requests per developer, 11% increase in merge rates, 84% increase in successful builds</p><p>&#8226; Developer satisfaction: Up to 75% higher job satisfaction, 88% code retention rate (developers keep nearly all AI-generated suggestions)</p><p>&#8226; Success rates: 78% of developers complete tasks using Copilot vs. 
70% without it, with 53.2% more likely to pass all unit tests</p><p>Claude Code&#8217;s reported gains exceed Copilot&#8217;s: Internal data from Anthropic and partner companies suggests even stronger performance for complex, long-horizon tasks:</p><p>&#8226; Developers report running 5-15 Claude Code instances concurrently (multiple in terminals, plus additional browser sessions)</p><p>&#8226; Rakuten validated capabilities with a demanding open-source refactor running independently for 7 hours with sustained performance</p><p>&#8226; Boris Cherny (Head of Claude Code at Anthropic): <em>&#8220;Claude Code generated roughly 80% of its own code&#8221;</em> (with human direction, review, and architectural decisions)</p><p>A software engineer in the US costs $200,000-$400,000+ annually. If AI coding tools deliver even a conservative 20-30% productivity gain, that translates to roughly $40,000-$120,000 in annual value per developer. For a company with 1,000 engineers, we&#8217;re talking $40-120 million in annual productivity gains, justifying substantial spending on AI coding infrastructure.</p><p><strong>Anthropic&#8217;s Business Momentum</strong></p><p>The AI industry&#8217;s narrative has fixated on OpenAI&#8217;s consumer dominance: ChatGPT&#8217;s 800 million weekly active users, 2.5-3 billion daily prompts, and $500 billion valuation. But an important story for investors is playing out in enterprise adoption, where Anthropic is systematically outmaneuvering its larger rival.</p><p>This growth trajectory is unprecedented. For context, OpenAI&#8217;s 2025 revenue is estimated at $10-12 billion, larger in absolute terms but growing more slowly from a higher base. More critically, Anthropic is projected to break even by 2028, while OpenAI isn&#8217;t expected to turn a profit until 2030, according to November 2025 WSJ reporting. 
OpenAI faces approximately $74 billion in projected losses in 2028 due to massive compute costs, while Anthropic&#8217;s enterprise focus and efficiency gains position it for profitability much sooner.</p><p>While ChatGPT dominates consumer attention, Anthropic has systematically captured the enterprise market, where switching costs are high and revenue is sticky.</p><p>According to Views4You, Claude has high penetration rates in different industries:</p><p>&#8226; Healthcare: 61% usage growth in early 2025, with Claude assisting in medical documentation and patient communication</p><p>&#8226; Legal: 18% of AI-enhanced litigation tools rely on Claude</p><p>&#8226; Finance: 24% of major banks use Claude, with 34% of enterprise AI research teams integrating it</p><p>&#8226; Retail/E-commerce: 38% of chatbots employ Claude</p><p>&#8226; Real Estate: 25% of listing analysis tools powered by Claude</p><p>This enterprise penetration is what separates Anthropic from consumer-focused competitors. Enterprise customers sign multi-year contracts, integrate deeply into workflows, and face high switching costs. Revenue from these customers is predictable, recurring, and premium-priced.</p><p>Anthropic, with Claude Code, is having its own ChatGPT moment, and that matters because coding is a big part of the economy and the job market, especially given the salaries. If there are 36M developers worldwide and their average salary is $48k per year, that translates to roughly $1.75T in developer salaries each year. Even if we assume only 20-30% productivity gains, we are talking about $350B to $525B of value created each year by these tools, and I would argue that the productivity gains are much higher than 20-30%. 
</p><p>Anthropic&#8217;s TAM is bigger than many imagine. Its narrow focus on the enterprise and coding markets could prove to be a great strategy as things become more specialized, and it has already built a strong head start and developer brand.</p><p>If you enjoyed this article, please consider upgrading to the paid subscription, where I share more in-depth analysis of AI companies and the industry trends I am seeing:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/subscribe&quot;,&quot;text&quot;:&quot;Subscribe to Paid&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/subscribe"><span>Subscribe to Paid</span></a></p><p>Until next time,</p><p>I hope you found this article valuable. I would appreciate it if you could share it with people you know who might find it interesting.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/p/anthropics-claude-code-is-having?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/p/anthropics-claude-code-is-having?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>Thank you!</p><p><strong>Disclaimer:</strong></p><p>I own Google (GOOGL), Amazon (AMZN), and Microsoft (MSFT) stock, all of which have stakes in Anthropic.</p><p>Nothing contained in this website and newsletter should be understood as investment or financial advice. All investment strategies and investments involve the risk of loss. Past performance does not guarantee future results. Everything written and expressed in this newsletter is only the writer&#8217;s opinion and should not be considered investment advice. 
Before investing in anything, know your risk profile and if needed, consult a professional. Nothing on this site should ever be considered advice, research, or an invitation to buy or sell any securities.</p>]]></content:encoded></item><item><title><![CDATA[Q4 2025 Channel checks & alternative data: The memory crunch is getting worse & one hyperscaler stands out]]></title><description><![CDATA[For this report, I got the most interesting data on cloud providers Google, Microsoft, and Amazon, as well as semiconductor memory providers SK Hynix, Samsung, and Micron.]]></description><link>https://www.uncoveralpha.com/p/q4-2025-channel-checks-and-alternative</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/q4-2025-channel-checks-and-alternative</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Wed, 21 Jan 2026 14:49:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ZQAN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01375be4-e4ef-4d30-a1fd-c878279c6e13_1200x737.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>I am posting my regular channel check &amp; other alternative data report before we start earnings of big tech and semiconductor names.</p><p>For this report, I got the most interesting data on cloud providers Google, Microsoft, and Amazon, as well as semiconductor memory providers SK Hynix, Samsung, and Micron.</p><p>Let&#8217;s dive in.</p><p><strong>Memory is in a historic crunch</strong></p><p>The memory market crunch has spread not only to HBM but also to DRAM and NAND. We can see from this Ornn chart that spot DRAM prices for DDR5 16GB have risen by 366% from the start of Q4 to today. 
Even since the start of this year, they are up 20.5%.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZQAN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01375be4-e4ef-4d30-a1fd-c878279c6e13_1200x737.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZQAN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01375be4-e4ef-4d30-a1fd-c878279c6e13_1200x737.png 424w, https://substackcdn.com/image/fetch/$s_!ZQAN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01375be4-e4ef-4d30-a1fd-c878279c6e13_1200x737.png 848w, https://substackcdn.com/image/fetch/$s_!ZQAN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01375be4-e4ef-4d30-a1fd-c878279c6e13_1200x737.png 1272w, https://substackcdn.com/image/fetch/$s_!ZQAN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01375be4-e4ef-4d30-a1fd-c878279c6e13_1200x737.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZQAN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01375be4-e4ef-4d30-a1fd-c878279c6e13_1200x737.png" width="1200" height="737" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/01375be4-e4ef-4d30-a1fd-c878279c6e13_1200x737.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:737,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42336,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/185278501?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01375be4-e4ef-4d30-a1fd-c878279c6e13_1200x737.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZQAN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01375be4-e4ef-4d30-a1fd-c878279c6e13_1200x737.png 424w, https://substackcdn.com/image/fetch/$s_!ZQAN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01375be4-e4ef-4d30-a1fd-c878279c6e13_1200x737.png 848w, https://substackcdn.com/image/fetch/$s_!ZQAN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01375be4-e4ef-4d30-a1fd-c878279c6e13_1200x737.png 1272w, https://substackcdn.com/image/fetch/$s_!ZQAN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01375be4-e4ef-4d30-a1fd-c878279c6e13_1200x737.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">source: <a href="https://www.ornnai.com/">Ornn</a></figcaption></figure></div><p>Flash MLC 64GB is also up more than 15% since the start of Q4, and SLC2G is up 59% in the same period.</p><p>A bulk analysis of relevant Q4 expert interviews on AlphaSense confirms that industry demand is skyrocketing.</p><p>Memory demand growth accelerated substantially in Q4 2025, with consensus growth ranges expanding from 5%-10% in Q3 to 12%-17% in Q4.</p><p>High-bandwidth memory demand maintained strong momentum, with 65%-80% year-over-year growth, lead times extended to 20-30 weeks, and zero inventory availability, compared with a 14%-35% growth range and 12-24 week lead times in the prior quarter.</p><p>According to the analysis of expert interviews, lead times for DRAM expanded from 16 to 40 weeks 
quarter-over-quarter! Many experts also noted that customers are now placing long-term orders, compared with the shorter-term orders seen just one quarter earlier.</p><p>HBM pricing premiums are reaching 5x over DDR5, an acceleration from the prior quarter.</p><p>Looking ahead, the forward guidance comments from these experts suggest things will get even tighter:</p><p>Many mention customers panic buying and hoarding inventory in anticipation of supply shortages. Customer orders are shifting to 9-12 month advance commitments, with redundant orders across suppliers, compared to the standard quarterly planning cycles of the previous quarter.</p><p>Lead times deteriorated substantially across memory categories, with DRAM now extending to 52-56 weeks, driven by overwhelming AI data center infrastructure demand and capacity shifts toward advanced memory products.</p><p>In terms of pricing, rebate programs largely disappeared in Q4. In the prior quarter, suppliers like SK Hynix and Micron were most aggressive with incentives and rebates, offering 20-25% discounts. Price increases now run +30-100% across product categories, accelerating from the prior quarter&#8217;s general increases of +5-10%.</p><p>All 2026 capacity is sold out to hyperscalers, and suppliers are moving to long-term agreements only. In the prior quarter, experts were already mentioning full bookings through 2026. Customers are locking in supply early because they fear shortages and because each accelerator requires multiple HBM stacks.</p><p>I am expecting skyrocketing earnings results from all three (SK Hynix, Samsung, and Micron), in terms of revenue and especially profitability, as customer negotiating power is essentially zero at this point.</p><p>Moving now to the cloud industry.</p><p><strong>Cloud is accelerating, but one cloud provider stands out in Q4&#8230;</strong></p><p></p>
      <p>
          <a href="https://www.uncoveralpha.com/p/q4-2025-channel-checks-and-alternative">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[2026 AI landscape who benefits the most?]]></title><description><![CDATA[Here are UncoverAlpha&#8217;s 2026 top forecasts in the AI sector, including which companies stand to benefit most from these trends, and the biggest risk pressure points we are monitoring in the AI market for 2026.]]></description><link>https://www.uncoveralpha.com/p/2026-ai-landscape-who-benefits-the</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/2026-ai-landscape-who-benefits-the</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Thu, 08 Jan 2026 16:02:43 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Lr21!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa9492c-c3b3-45f2-844a-4690bab35dbb_1024x1024.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>Here are UncoverAlpha&#8217;s 2026 top forecasts in the AI sector, including which companies stand to benefit most from these trends, and the biggest risk pressure points we are monitoring in the AI market for 2026.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Lr21!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa9492c-c3b3-45f2-844a-4690bab35dbb_1024x1024.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Lr21!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa9492c-c3b3-45f2-844a-4690bab35dbb_1024x1024.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!Lr21!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa9492c-c3b3-45f2-844a-4690bab35dbb_1024x1024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Lr21!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa9492c-c3b3-45f2-844a-4690bab35dbb_1024x1024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Lr21!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa9492c-c3b3-45f2-844a-4690bab35dbb_1024x1024.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Lr21!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa9492c-c3b3-45f2-844a-4690bab35dbb_1024x1024.jpeg" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5aa9492c-c3b3-45f2-844a-4690bab35dbb_1024x1024.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:199248,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/183919145?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa9492c-c3b3-45f2-844a-4690bab35dbb_1024x1024.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Lr21!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa9492c-c3b3-45f2-844a-4690bab35dbb_1024x1024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Lr21!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa9492c-c3b3-45f2-844a-4690bab35dbb_1024x1024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Lr21!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa9492c-c3b3-45f2-844a-4690bab35dbb_1024x1024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Lr21!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa9492c-c3b3-45f2-844a-4690bab35dbb_1024x1024.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><strong>The power in AI shifts from Nvidia to HBM suppliers and advanced packaging, as both bottlenecks will last longer than most people expect</strong></p><p>While many recognize bottlenecks in both HBM and advanced packaging, we believe both will persist longer than expected. Let&#8217;s start with HBM.</p><p>Memory has historically been an industry with big supply/demand cycles. Because of that, investors are very wary and are not fully &#187;bought in&#171; when a memory bottleneck forms. History shows that buying a memory company at a low P/E was often a bad strategy (frequently the top of the cycle), while buying when the P/E was high worked better. I am not saying this time is different, but I do believe we are still early in the HBM bottleneck cycle. As ASICs like Google TPUs and Amazon Trainium gain steam, their need for HBM keeps growing, just as Nvidia&#8217;s does. 2026 is also the year we will get a &#187;full AI system&#171; from AMD with its MI400 series. The success of TPUv7 in performance per cost, along with its delivery of a frontier model (Gemini), is driving many other companies to continue investing heavily in this space. HBM providers Micron, Samsung, and SK Hynix are receiving calls from big tech companies seeking to secure their HBM supply. Because HBM production is not increasing substantially, the bottleneck keeps tightening, and Micron, Samsung, and SK Hynix can now get better prices out of everyone, including Nvidia, since Nvidia is no longer their only big buyer. 
HBM is already sold out for 2026.</p><p>According to Korean media outlets, big tech companies like Microsoft, Google, and Meta are practically stationed in Korea, pleading for any additional capacity from SK Hynix or Samsung. The problem escalated to the point that Google&#8217;s management dismissed the procurement personnel responsible, holding them accountable for creating supply-chain risk by failing to sign long-term agreements in advance. And the HBM problem will get even worse as we transition to HBM4.</p><p>Nvidia&#8217;s Vera Rubin uses an 8-stack HBM4 configuration with a memory bandwidth of 22TB/s and a per-pin data rate of around 10.7Gbps. AMD&#8217;s MI455X opts for a 12-stack HBM4 configuration (so even more stacks than Vera Rubin), but at a lower bandwidth of 19.6TB/s, with a per-pin data rate of around 6.4Gbps. AMD is betting on using slower HBM4 and stacking more of it together. Nvidia&#8217;s Vera Rubin NVL72 will have 1.5x the HBM capacity of Blackwell and 2.8x the HBM bandwidth. But Vera Rubin is just the appetizer when it comes to HBM capacity. In 2027, Nvidia plans to launch Rubin Ultra with an enhanced version of HBM4, HBM4e, which will enable 12- or 16-high stacks, potentially reaching up to 1TB of memory per GPU (with an NVL576 system).</p><p>Keep in mind that ASICs, to remain competitive, will need to follow similar HBM patterns, which will worsen the crunch.</p><p>To top it all off, now that Nvidia can sell its H200 into China, Chinese demand for the H200 is putting additional pressure on memory makers, as the H200 uses HBM3e. All three memory providers are building some new fabs for HBM4, but they are also converting some HBM3 and even DDR4 and DDR5 lines to HBM4.</p><p>The problem is that HBM4 requires about 3x more wafer area than standard DRAM for the same amount of memory. 
As these manufacturers convert product lines to HBM, the supply of traditional DRAM decreases, and now we have a bottleneck there too. The AI industry is still in its early stages, and adoption isn&#8217;t yet at a point where we have humanoids, edge AI, AR smart glasses, or AVs in mass use, all of which require massive amounts of memory.</p><p>Moving now to the second huge bottleneck, advanced packaging.</p><p><strong>The Advanced Packaging Bottleneck will get worse</strong></p><p>Similar to the HBM bottleneck, I expect conditions will only worsen here. Since the industry moved from monolithic SoCs to chiplets, much more advanced packaging is required. You can think of advanced packaging as stitching together the different components of an AI accelerator to make them work as one. The goal is also to &#187;stitch&#171; them together as densely as possible to minimize latency and energy losses. Advanced packaging is required for Nvidia GPUs, AMD GPUs, Google TPUs, Amazon Trainiums, etc.</p><p>So naturally, with Nvidia GPU demand and, on top of that, ASIC programs reaching scale, the bottleneck is severe. The biggest and most important advanced packaging program is TSMC&#8217;s CoWoS.</p><p>According to Samsung Securities, TSMC&#8217;s CoWoS production capacity (converted to wafers) increased from 35,000 wafers per month in 2024 to about 70,000 last year, and is expected to rise to about 110,000 this year. However, this is still assessed as insufficient. Given that TSMC&#8217;s CoWoS allocation to NVIDIA is approximately 55%, the calculation indicates that only 8.91 million &#8220;Blackwell&#8221; AI accelerators can be produced this year. This volume can support data centers with a maximum capacity of 18 gigawatts (GW), which represents only 50% of the global data center capacity slated for investment this year. 
Samsung Securities noted, &#8220;There is a possibility that TSMC will not be able to meet even NVIDIA&#8217;s demand this year.&#8221;</p><p>Here are Goldman&#8217;s estimates for annual CoWoS capacity. Even with capacity doubling in 2026 relative to 2025, it is still not enough to meet demand, as 2026 TSMC CoWoS capacity is already essentially sold out.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jcmr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e9c2f1-53a6-418f-9232-7eee03a47e4e_1884x748.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jcmr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e9c2f1-53a6-418f-9232-7eee03a47e4e_1884x748.jpeg 424w, https://substackcdn.com/image/fetch/$s_!jcmr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e9c2f1-53a6-418f-9232-7eee03a47e4e_1884x748.jpeg 848w, https://substackcdn.com/image/fetch/$s_!jcmr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e9c2f1-53a6-418f-9232-7eee03a47e4e_1884x748.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!jcmr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e9c2f1-53a6-418f-9232-7eee03a47e4e_1884x748.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jcmr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e9c2f1-53a6-418f-9232-7eee03a47e4e_1884x748.jpeg" width="1456" height="578" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/37e9c2f1-53a6-418f-9232-7eee03a47e4e_1884x748.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:578,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:88655,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/183919145?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e9c2f1-53a6-418f-9232-7eee03a47e4e_1884x748.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jcmr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e9c2f1-53a6-418f-9232-7eee03a47e4e_1884x748.jpeg 424w, https://substackcdn.com/image/fetch/$s_!jcmr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e9c2f1-53a6-418f-9232-7eee03a47e4e_1884x748.jpeg 848w, https://substackcdn.com/image/fetch/$s_!jcmr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e9c2f1-53a6-418f-9232-7eee03a47e4e_1884x748.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!jcmr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e9c2f1-53a6-418f-9232-7eee03a47e4e_1884x748.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The crunch has gotten so bad that there are now rumors that Meta has handed some of the CoWoS allocation for its own ASIC to Google for its TPUs, as Meta appears to be preparing to start using Google TPUs.</p><p>The problem for all ASIC programs is that Nvidia is the dominant client and has most of the capacity reserved. If you are Google, Amazon, or Meta and your ASIC program can&#8217;t secure sufficient CoWoS capacity because Nvidia and AMD control most of it, you start considering advanced packaging alternatives to CoWoS, as you don&#8217;t have time to wait. 
I believe other advanced packaging players will benefit this year (I explain more, including company names, in the company section of this article).</p><p><strong>The transition to Co-Packaged Optics (CPO) from Pluggable Modules</strong></p><p>2026 will also mark an important year of transition, as networking shifts from the pluggable era to CPO. We are seeing this transition because AI models keep growing in size, resulting in AI clusters with over 100k connected GPUs. Clusters of that size need a lot of pluggable transceivers. This is a problem, as that many transceivers can consume 15-20% of the total power in an AI data center. Switching to CPO can significantly reduce energy consumption, and the signal is more reliable.</p><p>The problem with CPO is that it requires advanced packaging, which is already a bottleneck in chip manufacturing. While TSMC is establishing a dedicated zone for this type of packaging through its COUPE platform, the bottleneck remains the same, and the two constraints affect each other. More on this in the company section, where I explain which companies benefit from the surge in interest in CPO.</p><p><strong>Nvidia&#8217;s acquisition of Groq opens up a new path for additional AI chips for specific AI use-cases that open up the SRAM supply chain or combinations of SRAM and other memories</strong></p><p>Nvidia&#8217;s acquisition of Groq is a big signal to the market, as I wrote in <a href="https://www.uncoveralpha.com/p/the-20-billion-admission-why-nvidia">this article</a>. With the move, Nvidia confirms that the HBM and advanced packaging bottlenecks will likely persist, and it seeks to secure growth beyond them. This doesn&#8217;t mean that people will stop using HBM. Nvidia&#8217;s move signals that it expects the HBM bottleneck to be widespread and long-lasting, so the industry will sell all the HBM it can make over the coming period. 
At the same time, Nvidia needs to consider other memory options to keep growing and address the compute gap.</p><div class="pullquote"><p>Groq asset acquisition won&#8217;t impact core business, could spark something new</p><p>Jensen Huang</p></div><p>The main shift here is SRAM. You can fit an AI model in SRAM without HBM, but the model then has to be about 100x smaller. This means SRAM use cases will be limited, but they do exist.</p><p>There are many workloads where latency matters a lot but the model doesn&#8217;t need to be &#187;god-like&#171; (serving ad copy, for example). With agentic work, an agent could decide, based on the task, what quality of model it needs: if a fast, small model can answer, it uses SRAM-based hardware, and only if it needs more does it go to HBM.</p><p>Robotics also needs more SRAM because it demands low latency, and you don&#8217;t need the big model there. If you give the humanoid a complex task, it can go to the cloud and use HBM-based compute to get a more complex answer.</p><p>The point I am making is that we will see a mix of new memory variants emerge as everyone, including Nvidia, seeks ways to move beyond HBM. This trend brings a new supply chain of companies that will benefit and that have caught my attention; I share them in the company section of this report.</p><div><hr></div><p><strong>Google Gemini will continue to take market share from OpenAI</strong></p><p>Since Gemini 3, Google has gained significant momentum in Gemini adoption. That momentum only accelerated after December 17, when Google launched Gemini 3 Flash. 
Gemini 3 Flash is very affordable and arguably the best current intelligence-per-cost model for many use cases. Not surprisingly, data from Similarweb shows that Gemini&#8217;s web market share increased to 21.5%, up from 13.7% three months ago and 5.7% 12 months ago. Over the same 12 months, OpenAI&#8217;s market share fell from 86.7% to 64.5%.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yxRI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26a5ca82-9fef-4b3c-bb1d-243eb7e6d49e_870x692.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yxRI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26a5ca82-9fef-4b3c-bb1d-243eb7e6d49e_870x692.png 424w, https://substackcdn.com/image/fetch/$s_!yxRI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26a5ca82-9fef-4b3c-bb1d-243eb7e6d49e_870x692.png 848w, https://substackcdn.com/image/fetch/$s_!yxRI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26a5ca82-9fef-4b3c-bb1d-243eb7e6d49e_870x692.png 1272w, https://substackcdn.com/image/fetch/$s_!yxRI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26a5ca82-9fef-4b3c-bb1d-243eb7e6d49e_870x692.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yxRI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26a5ca82-9fef-4b3c-bb1d-243eb7e6d49e_870x692.png" width="870" height="692" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/26a5ca82-9fef-4b3c-bb1d-243eb7e6d49e_870x692.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:692,&quot;width&quot;:870,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:145472,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/183919145?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26a5ca82-9fef-4b3c-bb1d-243eb7e6d49e_870x692.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yxRI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26a5ca82-9fef-4b3c-bb1d-243eb7e6d49e_870x692.png 424w, https://substackcdn.com/image/fetch/$s_!yxRI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26a5ca82-9fef-4b3c-bb1d-243eb7e6d49e_870x692.png 848w, https://substackcdn.com/image/fetch/$s_!yxRI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26a5ca82-9fef-4b3c-bb1d-243eb7e6d49e_870x692.png 1272w, https://substackcdn.com/image/fetch/$s_!yxRI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26a5ca82-9fef-4b3c-bb1d-243eb7e6d49e_870x692.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>I expect Gemini to continue gaining market share, as the performance gap with ChatGPT that persisted over the past two years has now closed, and Gemini has taken the lead. Google&#8217;s product execution also improved significantly in the fall of last year, as management appears to have learned that shipping products matters more than benchmark evaluations. On top of that, Google&#8217;s unique infrastructure advantage, derived from its TPU ASIC unit, gives it cost and scale advantages that it can leverage to put pricing pressure on the whole market. OpenAI acknowledged that it needed to allocate more compute to inference as its user base grew, and that it had to reallocate some of that capacity from its research (training) operations. 
Google&#8217;s DeepMind AI research unit also has an advantage over OpenAI: it is backed by a strong, free-cash-flow-generating business, so it doesn&#8217;t depend on raising external capital.</p><p><strong>The application layer of AI is becoming &#187;investible&#171;.</strong></p><p>I expect that in 2026 we will see renewed interest in companies in the application layer of AI. Meta&#8217;s recent acquisition of Manus, a fast-growing AI agent company, is a strong signal to the market. Until now, it was hard to invest in this layer, as the perception was always that &#187;you are one core model upgrade away from being irrelevant,&#171; but things are changing. Manus and many other &#187;AI wrapper&#171; companies are showing that there is value in acquiring users, along with the behavior patterns and data that their usage generates. Methods such as fine-tuning, RAG, and RLHF can strengthen a moat built on top of a foundation model, especially as we see frequent improvements in the post-training phase. I think this trend will accelerate in 2026, and opportunities in the application layer will finally emerge.</p><p>On top of the application layer, I also believe there will be more distribution deals, partnerships, or revenue M&amp;A done. In distribution, the prime example is Snap partnering with Perplexity to offer Perplexity within Snapchat and, in return, receiving payment from Perplexity. I think the market will begin to view distribution companies more favorably. The pressure to strike distribution deals will be greatest for players outside Google, as Google already has the most distribution points available with Chrome, Gmail, Workspace, Maps, YouTube, and many others. 
I expect Google to rely more heavily on those distribution points to support Gemini, which will put additional pressure on others.</p><p><strong>Key market pressure points for me for both the AI market and macro</strong></p><p>There are a few things that I consider important for this AI trend to continue, and I will be monitoring them very closely in 2026. On the industry-specific side, number one is funding rounds for AI labs and startups, especially OpenAI and Anthropic. If either of those companies fails to raise the amount it targets, or to raise at the valuation it sets, I will view that as a very dangerous signal and may significantly reduce my exposure to the market. Currently, a lot of the ecosystem still hinges on those two companies continuing their usage and spending.</p><p>The second thing is architectural changes in how models are trained and served (inference). This ranges from model sizes to pre- and post-training methods, and includes memory requirements. If there is any significant change here, one needs to be very careful and reassess, as it might shift demand and supply chains elsewhere.</p><p>There is another pressure point at the macro level: the latest developments in Venezuela and their impact on the US-China relationship, especially regarding Taiwan. If anything happens there, even something like a blockade, everything changes.</p><p><strong>Companies for 2026</strong></p><p>Here are the companies I am invested in or have on my watchlist that are aligned with these 2026 trends:</p>
      <p>
          <a href="https://www.uncoveralpha.com/p/2026-ai-landscape-who-benefits-the">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The $20 Billion Admission: Why NVIDIA Just Bought Into the ASIC Revolution with Groq]]></title><description><![CDATA[We got news that Nvidia was &#187;acquiring&#171; (more acquihire) the chip ASIC startup Groq for around $20B. What does this mean for Nvidia and the industry?]]></description><link>https://www.uncoveralpha.com/p/the-20-billion-admission-why-nvidia</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/the-20-billion-admission-why-nvidia</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Fri, 26 Dec 2025 12:33:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!zfi7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90a0a08e-1918-4f1c-8b14-a5c326b634bd_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>The AI industry never sleeps: yesterday we got news that Nvidia was &#187;acquiring&#171; (more of an acquihire) the ASIC chip startup Groq for around $20B. If you have been a reader of our publication for some time, you know we have mentioned Groq multiple times. 
A little more than a year ago, I also did an exclusive interview with my friend Sunny Madra, Groq&#8217;s General Manager, which you can <a href="https://www.uncoveralpha.com/p/the-next-tectonic-shift-in-ai-inference">go back and check out.</a></p><p>While many people are speculating on why Nvidia would essentially buy (&#8220;license&#8221; is the formal term used) a $20B ASIC startup, I wanted to add my thoughts to the mix, as I believe the Groq acquisition is highly strategic for Nvidia and sends an important signal to the market.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zfi7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90a0a08e-1918-4f1c-8b14-a5c326b634bd_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zfi7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90a0a08e-1918-4f1c-8b14-a5c326b634bd_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!zfi7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90a0a08e-1918-4f1c-8b14-a5c326b634bd_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!zfi7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90a0a08e-1918-4f1c-8b14-a5c326b634bd_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!zfi7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90a0a08e-1918-4f1c-8b14-a5c326b634bd_1024x1024.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!zfi7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90a0a08e-1918-4f1c-8b14-a5c326b634bd_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90a0a08e-1918-4f1c-8b14-a5c326b634bd_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1544273,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/182622807?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90a0a08e-1918-4f1c-8b14-a5c326b634bd_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zfi7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90a0a08e-1918-4f1c-8b14-a5c326b634bd_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!zfi7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90a0a08e-1918-4f1c-8b14-a5c326b634bd_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!zfi7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90a0a08e-1918-4f1c-8b14-a5c326b634bd_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!zfi7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90a0a08e-1918-4f1c-8b14-a5c326b634bd_1024x1024.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><strong>How is the Groq chip different from a GPU/TPU?</strong></p><p>First, let&#8217;s dismiss the argument that Nvidia bought Groq because its CEO, Jonathan Ross, is one of the creators of Google&#8217;s TPU. Groq&#8217;s chip, called the Language Processing Unit (LPU), is very different from a TPU or a GPU.</p><p>Let me quickly explain how the GPU, the TPU, and the LPU differ:</p><p><strong>The GPU</strong></p><p>The GPU architecture was originally designed for graphics, calculating thousands of pixels at once. 
For AI, it treats a Large Language Model (LLM) as a massive parallel processing job.</p><p>The Bottleneck: GPUs rely on HBM (High Bandwidth Memory), which sits outside the processing core. Every time the GPU needs to calculate a word (token), it has to &#8220;fetch&#8221; the model weights from that external memory. This creates a &#8220;memory wall&#8221; where the processor often sits idle, waiting for data to arrive.</p><p>The Logic: It uses a &#8220;hub and spoke&#8221; model. It is incredibly versatile and can do everything from training to gaming, but it isn&#8217;t &#8220;perfectly&#8221; efficient for the specific sequential nature of generating text.</p><p><strong>The TPU</strong></p><p>You can read <a href="https://www.uncoveralpha.com/p/the-chip-made-for-the-ai-inference">my piece on Google TPU</a> for a detailed understanding, but to summarize the key points from that article, the TPU is an ASIC (Application-Specific Integrated Circuit) designed specifically for tensor math (linear algebra). It uses a systolic array. Imagine a &#8220;heart&#8221; that pumps data through a grid of processors. Once a piece of data enters the grid, it is passed from one processor to the next without needing to go back to main memory.</p><p>The Logic: TPUs are much more efficient than GPUs for massive batches of data. This makes them very effective in training and complex inference (similar to the GPU), where you are feeding the machine billions of data points at once. However, for a single user asking a question (inference), they often still face latency issues.</p><p><strong>The Groq LPU</strong></p><p>Groq&#8217;s LPU is a complete departure from the other two. It doesn&#8217;t use HBM (external memory) at all. Instead, it uses SRAM (Static Random Access Memory), which is built directly into the silicon of the chip.</p><p>The biggest differentiator that follows from this is speed. 
SRAM is up to 100x faster than the HBM found in GPUs. Because the data is right there on the chip, there is practically no &#8220;fetch time.&#8221;</p><p>In a GPU, the hardware decides at runtime when to process data (dynamic, non-deterministic scheduling). In an LPU, the software/compiler decides exactly where every piece of data will be at every billionth of a second (deterministic). It&#8217;s like a perfectly timed assembly line where no one ever has to wait for a part. The unique part of the LPU is that Groq first designed an automated compiler, and only then designed the chip. The reason is that Jonathan, who worked at Google on the TPU, knew the software was the biggest pain point and that a startup like Groq couldn&#8217;t compete with 10k Nvidia software engineers who write low-level routines (kernels) all day. Because of that automated compiler, you don&#8217;t write any manual kernel optimizations for LPUs, as every token&#8217;s path is predetermined.</p><p>So where does the LPU excel? LLMs generate text one token at a time. The LPU is designed to stream these tokens through its &#8220;conveyor belt&#8221; architecture, which is why you see Groq generating hundreds of tokens per second for a single request while GPUs struggle to hit 50.</p><p>But the LPU is not the &#187;GPU killer&#171; some might think.</p><p>The LPU&#8217;s tiny memory capacity is a strength for some use cases but a weakness for others. An Nvidia H200 GPU has 141GB of HBM3e memory. A single Groq LPU chip has only 230MB of SRAM. Because 230MB isn&#8217;t enough to hold even a small AI model, you have to link hundreds of LPU chips together just to run one model. For example, to run Llama-3 70B at full speed, you might need hundreds of LPUs across multiple server racks (at 8-bit precision, 70B parameters is roughly 70GB of weights, or about 300 chips&#8217; worth of 230MB SRAM before any working memory), whereas you can fit that same model onto just two or four Nvidia GPUs in a single small box. 
Because you need so many LPU chips to handle the memory requirements of modern models, the initial hardware investment can be large and the data center footprint much bigger than that of a GPU deployment.</p><p>The LPU is also deterministic: because the software must map out every single calculation before it starts, it is harder to handle dynamic workloads or a change in the underlying model architecture (from the Transformer to something else).</p><p>But there is an upside to the LPU. Even though a single Groq LPU system (a GroqRack) is more expensive to buy than a single Nvidia server, it can be significantly cheaper to run if you have high-volume traffic.</p><p>To get ultra-low latency on a GPU, you have to use a &#8220;Batch Size of 1&#8221; (meaning you process only one user&#8217;s request at a time). This makes the GPU incredibly expensive per token because most of its processing power is sitting idle while it waits for memory to move. But the LPU is designed for a Batch Size of 1. It achieves 300&#8211;500 tokens per second while keeping its internal &#8220;assembly line&#8221; nearly 100% full.</p><p>And then there is the very important energy aspect.</p><p>Because the LPU doesn&#8217;t have to power external HBM (High Bandwidth Memory), it is fundamentally more energy-efficient for the actual math it performs. Moving data from external HBM to a GPU core costs about 6 picojoules per bit. Retrieving it from Groq&#8217;s local SRAM costs only 0.3 picojoules per bit, a 20x difference for data movement alone. On an architectural level, Groq is roughly 10x more energy-efficient per token than a GPU for inference.</p><p>But as we talked about before, the downside is that while LPUs are cheaper to run, you are paying more for floor space, networking cables, and physical maintenance. 
So why did Nvidia decide to buy Groq?</p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><p><strong>The Groq strategic play from Nvidia</strong></p><p>There are five main reasons Nvidia bought Groq: the energy bottleneck, the HBM bottleneck, the CoWoS bottleneck, the liquid-cooled data center bottleneck, and the competition aspect.</p><p>We already discussed the energy benefits of LPU vs. GPU in the previous section, and we are now in an age where energy is the limiting factor for Nvidia&#8217;s growth. Having a second option that is more energy-efficient, especially for simpler inference workloads, is important. To add context, Groq&#8217;s LPUs don&#8217;t require liquid cooling, which is an important aspect of the whole deal. Worldwide, there are far more air-cooled data centers than liquid-cooled ones. Nvidia&#8217;s latest Blackwell chips, as well as its future products, will be mostly liquid-cooled, as they are meant for maximum performance. In the cloud industry, many air-cooled data centers that can&#8217;t be repurposed for liquid cooling are being left vacant. In fact, in a recent interview, Groq CEO Ross mentioned that Groq has just landed a big European data center project where LPUs will be hosted. The data center had actually been left vacant by a hyperscaler that didn&#8217;t want to extend the lease because the site couldn&#8217;t be converted into a liquid-cooled DC.</p><p>In Nvidia&#8217;s perfect world, all DCs would be liquid-cooled, but the reality is different, as securing a reliable water source is often a problem and will take time. 
Nvidia&#8217;s reliance on liquid-cooled DCs could also lead to growth problems, as liquid cooling adds complexity that many DC operators struggle with (the latest CoreWeave delay is just one example). Groq therefore adds an air-cooled option for Nvidia to sell in the future and capture more short-term revenue. The fact that Groq LPUs take up more data center footprint is not a problem, as they can be used in air-cooled DCs that are currently underutilized. In my view, an air-cooled option is also important for Nvidia because many competing chips, such as AWS&#8217;s Trainium, which is a strong alternative as I discussed in the <a href="https://www.uncoveralpha.com/p/amazon-trainium-scaling-ai-without">last article</a>, are air-cooled.</p><p>Moving to another key aspect of this deal: the HBM bottleneck. HBM has been a bottleneck for some time now, and with Google TPUs, AMD MI400s, and AWS Trainium 3 and 4 becoming more competitive and &#187;eating&#171; more and more HBM, availability keeps getting worse. HBM for 2026 is sold out, and a real question is how long it will take for 2027 to sell out, too. The three players, SK Hynix, Samsung, and Micron, are also not eager to expand capacity too much in the future, as they know their industry is cyclical and has recently seen major overbuilds. Now that more chip design companies are competing strongly for HBM capacity, the negotiating power of Micron, SK Hynix, and Samsung will only increase. For Nvidia, securing a viable option for non-complex inference workloads, such as the LPU, is a big positive, as LPUs don&#8217;t use any HBM. Again, the play for Nvidia here is to continue its revenue growth and sales of compute units, without being 100% constrained by available HBM.</p><p>Another strategic advantage is that Groq&#8217;s chips perform well even when fabbed on older nodes. 
The reason for this is SRAM: since they don&#8217;t have external memory, they don&#8217;t need the densest transistors to achieve high speed. Groq&#8217;s latest generation of LPUs is, in fact, fabbed on a 14nm node at GlobalFoundries. While they are transitioning to newer nodes at Samsung, the fact that you can produce capable chips on an older node, and not at TSMC, is another big advantage for someone like Nvidia, as it bypasses another bottleneck: TSMC and CoWoS. The chances of a state-of-the-art Groq chip being fabbed outside of TSMC are much higher than for a B300 or Vera Rubin. So, again, with this move, Nvidia is opening a new avenue for growth that doesn&#8217;t face the same bottlenecks as Blackwell or Vera Rubin.</p><p>Now, to the last point: competition. Nvidia knows that if the HBM, energy, liquid cooling, and CoWoS bottlenecks squeeze the market and cause a significant shortage of compute, customers and competitors will start looking for alternatives to bypass those bottlenecks, and Groq, with a supply chain not bottlenecked by the same factors, is a prime candidate for that. Groq, going into this &#187;acquisition&#171;, was growing fast, and more importantly, its capacity was growing fast.</p><p>Groq CEO 4 months ago:</p><div class="pullquote"><p>&#187;18 months ago, we had 1/10000 of the token capacity. Today we have about 20M tokens per second capacity a month and a half ago we had 10M&#171;</p></div><p>So, rather than Meta or Microsoft buying Groq and opening an alternative path beyond the limited GPU path, Nvidia decided to pull the trigger itself.</p><p><strong>What does this mean for Nvidia?</strong></p><p>Did Nvidia acknowledge that GPUs are not the best hardware for every AI workload? Yes. At the same time, Nvidia is signaling that it expects its GPUs to be completely sold out for years and that it wants to grow outside of its bottlenecks.</p><p>More inference revenue will also mean a different margin. 
Inference margins for Nvidia will not be as high, as even the Groq CEO acknowledged recently:</p><div class="pullquote"><p>&#187;Inference is going to be a high-volume, low-margin market. Nvidia is going to build every single GPU that they can physically manufacture this year, AMD is going to do the same thing; they are limited by the HBM, and they are going to sell every single GPU that they build. The thing is that it is not enough. On top of that, every time they sell for inference, when you are paying the 70-80% margin on a GPU, you have to charge that to your end users. Inference is a high-volume, low-margin business. Now, when we start deploying a large number of inference chips, Nvidia, AMD, they can sell their chips for training, which they are really good at, and they can keep that margin high as you can amortize that over 10-20x more compute that you are going to need for inference.&#171;</p></div><p>What does this mean for you as an investor? In the next few days, I will publish my 2026 outlook and the most interesting names I am investing in or watching. Nvidia's Groq move definitely added a new subsector to my list, as a new supply chain is opening up. If you have not done so yet, consider becoming a paid subscriber, as most of that list of names will be for paid subs only.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/subscribe&quot;,&quot;text&quot;:&quot;Subscribe to Paid&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/subscribe"><span>Subscribe to Paid</span></a></p><p>Until next time, happy holidays!</p><p>As always, I hope you found this article valuable. 
I would appreciate it if you could share it with people you know who might find it interesting.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/p/the-20-billion-admission-why-nvidia?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/p/the-20-billion-admission-why-nvidia?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p>Thank you!</p><p><strong>Disclaimer:</strong></p><p>I own Meta (META), Google (GOOGL), Amazon (AMZN), Microsoft (MSFT), TSMC (TSM), Intel (INTC) stock.</p><p>Nothing contained in this website and newsletter should be understood as investment or financial advice. All investment strategies and investments involve the risk of loss. Past performance does not guarantee future results. Everything written and expressed in this newsletter is only the writer&#8217;s opinion and should not be considered investment advice. Before investing in anything, know your risk profile and if needed, consult a professional. Nothing on this site should ever be considered advice, research, or an invitation to buy or sell any securities.</p>]]></content:encoded></item><item><title><![CDATA[Amazon Trainium: Scaling AI Without Breaking the Bank]]></title><description><![CDATA[In this article, I am publishing a comprehensive deep dive into Amazon&#8217;s custom ASIC chip, Trainium. 
I will cover the technical details, as well as the performance, costs, and strategic factors of this unit, and what they mean for Amazon and the broader semiconductor ecosystem.]]></description><link>https://www.uncoveralpha.com/p/amazon-trainium-scaling-ai-without</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/amazon-trainium-scaling-ai-without</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Thu, 18 Dec 2025 14:04:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kztT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b2b731-b5f6-4077-abfd-54e3e5fce591_1024x1024.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>In this article, I am publishing a comprehensive deep dive into Amazon&#8217;s custom ASIC chip, Trainium. I will cover the technical details, as well as the performance, costs, and strategic factors of this unit, and what they mean for Amazon and the broader semiconductor ecosystem.</p><p>Topics covered:</p><ul><li><p>How Amazon&#8217;s custom chip businesses started</p></li><li><p>How does Trainium work</p></li><li><p>The software optimization layer</p></li><li><p>Performance of Trainium</p></li><li><p>Why is Trainium this cheap?</p></li><li><p>The biggest thing holding Trainium back</p></li><li><p>AWS&#8217;s Trainium Business Strategy and Competitive Positioning</p></li></ul><p>Let&#8217;s dive into it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kztT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b2b731-b5f6-4077-abfd-54e3e5fce591_1024x1024.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!kztT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b2b731-b5f6-4077-abfd-54e3e5fce591_1024x1024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!kztT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b2b731-b5f6-4077-abfd-54e3e5fce591_1024x1024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!kztT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b2b731-b5f6-4077-abfd-54e3e5fce591_1024x1024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!kztT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b2b731-b5f6-4077-abfd-54e3e5fce591_1024x1024.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kztT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b2b731-b5f6-4077-abfd-54e3e5fce591_1024x1024.jpeg" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c9b2b731-b5f6-4077-abfd-54e3e5fce591_1024x1024.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:216205,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/181986282?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b2b731-b5f6-4077-abfd-54e3e5fce591_1024x1024.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kztT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b2b731-b5f6-4077-abfd-54e3e5fce591_1024x1024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!kztT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b2b731-b5f6-4077-abfd-54e3e5fce591_1024x1024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!kztT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b2b731-b5f6-4077-abfd-54e3e5fce591_1024x1024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!kztT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9b2b731-b5f6-4077-abfd-54e3e5fce591_1024x1024.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" 
fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>How Amazon&#8217;s custom chip businesses started</strong></p><p>Amazon Web Services (AWS) entered custom silicon development after acquiring the Israeli chip startup Annapurna Labs in 2015. This acquisition paved the way for AWS&#8217;s in-house chips, such as the Graviton CPU family and Nitro virtualization cards, and later for its machine learning accelerators. AWS&#8217;s machine learning chips comprise Inferentia (for ML inference) and Trainium (for large-scale model training), names that directly reflect their intended use cases. AWS&#8217;s recent Trainium generations, 2 and 3, can handle high-end inference use cases in addition to training.</p><p>The first-generation AWS Trainium was unveiled at re:Invent 2020 as Amazon&#8217;s first in-house training accelerator. Built on a 7 nm process with roughly 55 billion transistors, Trainium1 began powering EC2 Trn1 instances by 2022. AWS then announced Trainium2 (second generation) in late 2023, fabricated at 5 nm and featuring a new NeuronCore-v3 architecture. 
Trainium2 dramatically scaled up the core count &#8211; quadrupling the number of compute cores per chip &#8211; and introduced support for structured sparsity, achieving about 3.5&#215; higher throughput than Trainium1 despite slightly lower per-core frequencies.</p><p>By late 2024, Trainium2 was available via EC2 Trn2 instances and UltraServer systems, delivering 30&#8211;40% better price-performance than contemporary GPU-based instances (such as Nvidia A100/H100 instances), according to AWS.</p><p>Most recently, at AWS re:Invent 2025, Amazon announced Trainium3, its third-generation AI training chip built on an advanced 3 nm node. The new chip powers the EC2 Trn3 UltraServer &#8211; a 144-chip rack-scale system &#8211; and offers up to 4.4&#215; more compute performance, ~4&#215; higher energy efficiency, and nearly 4&#215; more memory bandwidth compared to the prior Trainium2 generation.</p><p><strong>How does Trainium work</strong></p><p>Under the hood, Trainium chips are highly specialized ASICs focused on matrix math and parallelism. Each chip contains multiple NeuronCores, AWS&#8217;s term for its AI-optimized compute cores. Notably, starting with Trainium, Annapurna Labs added dedicated Collective Communication cores alongside the scalar, vector, and tensor engines in each NeuronCore (The Next Platform has a good technical write-up of this). These communication engines accelerate distributed training operations (e.g., all-reduce for gradients), reflecting a &#8220;system-first&#8221; design that tightly couples compute with networking. As one AWS architect explained, they &#187;first designed the full system and [worked] backwards... to specify the most optimal chip&#171; rather than treating the chip in isolation. 
This co-design philosophy (developing silicon alongside software and systems) enables AWS to tailor Trainium&#8217;s architecture to improve large-scale training efficiency.</p><p>Data Types and Throughput:</p><p>Trainium supports a range of numeric formats commonly used in AI: FP32, BF16/FP16, and, notably, a configurable FP8 format designed to boost throughput.</p><p>It is essential to understand the term FP (floating point) because, later in the article, we will compare the performance of Nvidia&#8217;s Blackwell, Trainium3, and Google&#8217;s TPUv7 at specific FP formats. For less technical readers, FP formats are the &#187;resolution&#171; of AI math. Just as a 4K video requires more data and a faster internet connection than a 720p video, FP32 requires more power and time than FP4. By moving to lower FP formats, chip makers are effectively reducing the &#8216;data weight&#8217; of AI, allowing it to run faster with less electricity. Still, the trade-off can be lower accuracy (increased risk of errors).</p><p>Trainium2 and later chips also implement 4:1 structured sparsity, i.e., 16:4 and similar patterns in which only 4 out of every 16 weights are non-zero and the rest are skipped, to exploit model sparsity for additional speedups. According to analysis by The Register, Trainium3&#8217;s hardware can leverage 16:4 sparsity to quadruple effective throughput on supported workloads. This means a single Trainium3 chip, which delivers about 2.5 petaFLOPS of dense FP8 performance, can reach roughly 10 petaFLOPS of effective throughput on sparse models. For higher-precision tasks (such as BF16 training), Trainium still offers competitive performance while focusing on FP8/FP16 for maximum speed where acceptable.</p><p>Memory and Interconnect:</p><p>Each Trainium generation has pushed memory limits to handle ever-larger models. Trainium2 packed 16 GB HBM stacks (HBM3) per chip (total ~96 GB/chip), whereas Trainium3 uses faster HBM3E with 12&#8208;high stacks, giving 144 GB per chip at 4.9 TB/s bandwidth. 
This 50% increase in memory capacity (and a ~70% increase in bandwidth) enables Trainium3 to feed its compute units efficiently for training massive models. AWS also engineered a proprietary high-speed interconnect called NeuronLink (chip-to-chip links) and a switching fabric (NeuronSwitch) for scaling out. For networking, AWS has also left the door open to other options, as it wants to optimize for maximum efficiency and vendor flexibility, even on the networking layer.</p><p>Trainium2-based systems used a 3D torus topology. Trainium3 introduces an all-to-all switched fabric with NeuronSwitch-v1, which roughly doubles intra-node bandwidth and reduces latency between chips. Thanks to this fabric, a single Trn3 UltraServer can unite 144 Trainium3 chips into a single coherent system, and AWS&#8217;s UltraCluster 3.0 can further connect &#8220;up to 1 million Trainium chips&#8221; across multiple racks to scale the cluster. In testing, AWS reported that these improvements enable up to 4&#215; faster model training and lower inference latency when comparing Trainium3 UltraServers to the previous generation.</p><p><strong>The software optimization layer</strong></p><p>As most of you know by now, any software optimization layer not called CUDA has its hurdles. AWS has tried to mitigate this by integrating its Neuron SDK with popular ML frameworks (TensorFlow, PyTorch, JAX, Hugging Face libraries, etc.) to ease porting. Given recent moves, AWS is increasingly leaning into opening up the software ecosystem to the open-source community to accelerate adoption. Anthropic is key to maturing the Neuron software stack for broader external adoption. A high-ranking Amazon employee made an interesting comment regarding the strategy here:</p><div class="pullquote"><p>&#187;To answer your question, for us in five years, we hope on inference size we can at least address more than 50% of the pure play external customers. 
That&#8217;s the reason we are trying so hard to attract those leading companies or investing leading companies like Anthropic to work on accelerator because they are invest by Google and us. They are training their model on both TPUs and Trainium.<br><br>I think, basically, they are the trailblazer for all other external customers. Once they test out everything, they develop all the SDKs, those things then future other customer adoption will come in the next five years. This, I think, our conviction. I think this will be a great success going forward. This is what we think.&#171;</p><p>source: <a href="https://www.alpha-sense.com/uncoveralpha/">AlphaSense</a></p></div><p>So Amazon is betting heavily on Anthropic and its engineers, who have become highly proficient in optimizing Trainium, to help build the software library base for broader adoption of Trainium chips. Pairing Anthropic with an embrace of the open-source ecosystem seems like a clever approach.</p><p>Given that CUDA is entrenched in engineers&#8217; mindshare, this strategy, combined with a full embrace of the open-source community, seems the only viable option for the software stack.</p><p><strong>Performance of Trainium</strong></p><p>A Trainium3 chip provides ~2.5 PFLOPS of dense FP8 compute (10 PFLOPS sparse) and 144 GB of memory. Hence, a fully populated 144-chip UltraServer delivers on the order of 360 PFLOPS (dense FP8), or more than 1.4 exaFLOPS with sparsity, and over 700 TB/s of aggregate memory bandwidth. This puts the Trainium3 UltraServer in the same class as the largest GPU-based systems.</p><p>According to AWS, early customers have reported substantial performance and cost benefits. For instance, Amazon&#8217;s Bedrock service (which offers foundation models) is already running production workloads on Trainium3, and others, such as Anthropic, have achieved 50% cost reductions and multi-fold throughput gains by switching from GPUs to Trainium hardware. 
This claim is from AWS, so take it with a grain of salt, as we don&#8217;t know which specific workloads these numbers were tested on.</p><p>The real number we are looking for is the total cost of ownership (TCO) per performance.</p><p>Before we go to Trainium 3 (Trn3), I did find some interesting information on Trainium 2:</p><p>An Amazon employee in May mentioned the following:</p><div class="pullquote"><p>&#187; We offer as a price per FLOPS. It&#8217;s probably 30%-40% cheaper equivalent to their leading NVIDIA instance in our data center. The Trainium2 is about 30% cheaper than upper H200. We sell those. We incentivize customers to use it. Our cost perspective, because NVIDIA enjoys such a hefty margin, we all know it.&#171;</p><p>source: <a href="https://www.alpha-sense.com/uncoveralpha/">AlphaSense</a></p></div><p>Different customers offer similar takes.</p><p>A customer in February noted that cost-conscious startups using TPUs or Trainium can reduce costs to 1/5 of those of NVIDIA clusters if longer, less time-critical training runs are allowed.</p><p>In April, an executive provided granular hourly pricing data, reporting that while NVIDIA H100 chips cost approximately $3 per hour per chip (via providers like CoreWeave), Trainium chips were available for roughly $1 per hour. They further noted that AWS offered potential discounts for long-term contracts that could bring the effective price down to $0.50 per hour, representing roughly 1/6 to 1/7 of the cost of an H100.</p><p>Another customer mentioned in August that Amazon is offering &#187;massive discounts&#171; on Trainium processors even within their own cloud instances to undercut NVIDIA GPU spot pricing.</p><p>A director at Tenstorrent also highlighted the utilization benefits of ASICs, noting that GPU utilization for training often sits at only 30-40% due to data movement bottlenecks. 
In contrast, AI accelerators (ASICs) like Trainium can achieve near 100% utilization because they are explicitly architected for these workloads.</p><p>Most experts consistently cite a 30-50% cost advantage for Trainium over comparable NVIDIA instances, driven by lower unit costs and aggressive pricing strategies.</p><p>Now moving to the performance of Trainium 3. Looking at Trainium3 at FP8 precision, a Trn3 UltraServer is roughly on par with Nvidia&#8217;s latest 72-GPU &#8220;Blackwell Ultra&#8221; system in total throughput. However, at ultra-low precision FP4 for inference, Nvidia&#8217;s system still leads by ~3&#215;.</p><p>SemiAnalysis also ran the numbers on the TCO/performance of Trn3.</p><p>They similarly found that Trainium3&#8217;s TCO per marketed performance is 30% better than that of the GB300 NVL72 on FP8, but much worse on FP4.</p><p>What does this mean? To put that in perspective, FP4 is currently harder to use for training workloads because &#8220;low precision&#8221; can cause the model to &#8220;diverge&#8221; (basically, the AI becomes confused during learning).</p><p>However, NVIDIA has recently proven with NVFP4 that you can train with 4-bit precision by using clever scaling. This could potentially reduce training costs by an additional 30&#8211;40% over FP8. At least for the next few months, it isn&#8217;t expected that the big AI labs will switch to FP4 for training.</p><p>In inference, the story is slightly different, as AI labs are aggressively adopting FP4. FP4 enables massive models (such as a 1-trillion-parameter MoE) to fit within the memory of fewer chips. If a model that used to require 16 chips now fits on 8 chips due to FP4, your cost per token drops by roughly half. 
No surprise that Amazon has already announced that, for Trainium4, FP4 performance should be 6x that of Trainium 3.</p><p>This data suggests that Trainium3 can be a very good alternative to Nvidia for training workloads if you know how to use the Trainium software stack.</p><p>Trainium 3&#8217;s operating cost also matters in this calculation, as Trn3 runs at ~1,000W per chip, while Nvidia&#8217;s GB300 runs at ~1,400W, so it&#8217;s not just about the upfront CapEx.</p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><p><strong>Why is Trainium this cheap?</strong></p><p>This is a calculation I derived from multiple sources, including BOM and industry expert interviews on the manufacturing costs of Trainium3, TPUv7, and Nvidia B200:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P2S5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbaa7a4b-08e0-4f88-8e40-cc9aaad85824_761x162.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P2S5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbaa7a4b-08e0-4f88-8e40-cc9aaad85824_761x162.png 424w, https://substackcdn.com/image/fetch/$s_!P2S5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbaa7a4b-08e0-4f88-8e40-cc9aaad85824_761x162.png 848w, 
https://substackcdn.com/image/fetch/$s_!P2S5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbaa7a4b-08e0-4f88-8e40-cc9aaad85824_761x162.png 1272w, https://substackcdn.com/image/fetch/$s_!P2S5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbaa7a4b-08e0-4f88-8e40-cc9aaad85824_761x162.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P2S5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbaa7a4b-08e0-4f88-8e40-cc9aaad85824_761x162.png" width="761" height="162" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dbaa7a4b-08e0-4f88-8e40-cc9aaad85824_761x162.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:162,&quot;width&quot;:761,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14281,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/181986282?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbaa7a4b-08e0-4f88-8e40-cc9aaad85824_761x162.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P2S5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbaa7a4b-08e0-4f88-8e40-cc9aaad85824_761x162.png 424w, https://substackcdn.com/image/fetch/$s_!P2S5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbaa7a4b-08e0-4f88-8e40-cc9aaad85824_761x162.png 848w, 
https://substackcdn.com/image/fetch/$s_!P2S5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbaa7a4b-08e0-4f88-8e40-cc9aaad85824_761x162.png 1272w, https://substackcdn.com/image/fetch/$s_!P2S5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbaa7a4b-08e0-4f88-8e40-cc9aaad85824_761x162.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">source: own estimates</figcaption></figure></div><p>A Trainium 3 chip is half the price of an Nvidia B200 when looking at the pure manufacturing cost, but looking at the price that Amazon would have to pay, which includes the Nvidia margin ($35k-$40k), the difference is staggering, and the reason why the Trainium has the TCO/performance advantage starts to make sense. While Amazon also applies a margin to those costs for external clients, it is nowhere close to Nvidia&#8217;s margin. These estimates for the costs are somewhat confirmed by an Amazon executive&#8217;s comment back in May:</p><div class="pullquote"><p>&#187;Typically, in our internal chip, without considering all these R&amp;D investments, because that&#8217;s going to be spread over chip, just from a manufacturing cost perspective, our Trainium chips are typically 1/3 cheaper than NVIDIA. Of course, [half of the] NVIDIA&#8217;s price is margined. Just from an acquisition of price perspective, we are around 1/3 of the cost when we buy NVIDIA chip, similar generation.&#171;</p><p>source: <a href="https://www.alpha-sense.com/uncoveralpha/">AlphaSense</a></p></div><p>Amazon&#8217;s significant cost advantage is also a result of how it optimizes chips and manages its supply chain. 
As an Alchip Technologies (Amazon supplier) employee explained:</p><div class="pullquote"><p>&#187;It&#8217;s become so important that I think at this point, Annapurna has a lot of their own design team. They feel they can pretty much do not use a design partner for front-end design, and they can license a high-speed SerDes IP from a third party like Synopsys, Cadence and achieve lower cost. That&#8217;s why I think now in the third generation onward, they&#8217;re more focusing on the cost instead of the other aspect&#171;</p><p>source: <a href="https://www.alpha-sense.com/uncoveralpha/">AlphaSense</a></p></div><p><strong>The biggest thing holding Trainium back</strong></p><p>The most significant factor holding back Amazon&#8217;s Trainium is&#8230; </p>
      <p>
          <a href="https://www.uncoveralpha.com/p/amazon-trainium-scaling-ai-without">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The chip made for the AI inference era – the Google TPU]]></title><description><![CDATA[I am publishing a comprehensive deep dive, not just a technical overview, but also strategic and financial coverage of the Google TPU.]]></description><link>https://www.uncoveralpha.com/p/the-chip-made-for-the-ai-inference</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/the-chip-made-for-the-ai-inference</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Mon, 24 Nov 2025 13:54:19 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!H_F9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff173524-b98d-4a2b-9f77-7a130ad395a7_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>As I find the topic of Google TPUs extremely important, I am publishing a comprehensive deep dive, not just a technical overview, but also strategic and financial coverage of the Google TPU.</p><p>Topics covered:</p><ul><li><p>The history of the TPU and why it all even started?</p></li><li><p>The difference between a TPU and a GPU?</p></li><li><p>Performance numbers TPU vs GPU?</p></li><li><p>Where are the problems for the wider adoption of TPUs</p></li><li><p>Google&#8217;s TPU is the biggest competitive advantage of its cloud business for the next 10 years</p></li><li><p>How many TPUs does Google produce today, and how big can that get?</p></li><li><p>Gemini 3 and the aftermath of Gemini 3 on the whole chip industry</p></li></ul><p>Let&#8217;s dive into it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!H_F9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff173524-b98d-4a2b-9f77-7a130ad395a7_1024x1024.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!H_F9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff173524-b98d-4a2b-9f77-7a130ad395a7_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!H_F9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff173524-b98d-4a2b-9f77-7a130ad395a7_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!H_F9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff173524-b98d-4a2b-9f77-7a130ad395a7_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!H_F9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff173524-b98d-4a2b-9f77-7a130ad395a7_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!H_F9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff173524-b98d-4a2b-9f77-7a130ad395a7_1024x1024.png" width="1024" height="1024" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff173524-b98d-4a2b-9f77-7a130ad395a7_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1683293,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/179815720?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff173524-b98d-4a2b-9f77-7a130ad395a7_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!H_F9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff173524-b98d-4a2b-9f77-7a130ad395a7_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!H_F9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff173524-b98d-4a2b-9f77-7a130ad395a7_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!H_F9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff173524-b98d-4a2b-9f77-7a130ad395a7_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!H_F9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff173524-b98d-4a2b-9f77-7a130ad395a7_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p><strong>The history of the TPU and why it all even started?</strong></p><p>The story of the Google Tensor Processing Unit (TPU) begins not with a breakthrough in chip manufacturing, but with a realization about math and logistics. Around 2013, Google&#8217;s leadership&#8212;specifically Jeff Dean, Jonathan Ross (the CEO of Groq), and the Google Brain team&#8212;ran a projection that alarmed them. They calculated that if every Android user utilized Google&#8217;s new voice search feature for just three minutes a day, the company would need to double its global data center capacity just to handle the compute load.</p><p>At the time, Google was relying on standard CPUs and GPUs for these tasks. 
While powerful, these general-purpose chips were inefficient for the specific heavy lifting required by deep learning: massive matrix multiplications. Scaling up with existing hardware would have been a financial and logistical nightmare.</p><p>This sparked a new project. Google decided to do something rare for a software company: build its own custom silicon. The goal was to create an <strong>ASIC (Application-Specific Integrated Circuit)</strong> designed for one job only: running TensorFlow neural networks.</p><p>Key historical milestones:</p><ul><li><p>2013-2014: The project moved quickly: Google hired a very capable team and, to be honest, had some luck in its first steps. The team went from design concept to deploying silicon in data centers in just 15 months, a very short cycle for hardware engineering.</p></li><li><p>2015: Before the world knew they existed, TPUs were already powering Google&#8217;s most popular products. They were silently accelerating Google Maps navigation, Google Photos, and Google Translate.</p></li><li><p>2016: Google officially unveiled the TPU at Google I/O 2016.</p></li></ul><p>This urgency to solve the &#8220;data center doubling&#8221; problem is why the TPU exists. It wasn&#8217;t built to sell to gamers or render video; it was built to save Google from its own AI success. With that in mind, Google has been thinking about the &#187;costly&#171; AI inference problem for over a decade now. This is also one of the main reasons why the TPU is so good today compared to other ASIC projects.</p><p><strong>The difference between a TPU and a GPU?</strong></p><p>To understand the difference, it helps to look at what each chip was originally built to do. A GPU is a &#8220;general-purpose&#8221; parallel processor, while a TPU is a &#8220;domain-specific&#8221; architecture.</p><p>GPUs were designed for graphics. 
They excel at parallel processing (doing many things at once), which is great for AI. However, because they are designed to handle everything from video game textures to scientific simulations, they carry &#8220;architectural baggage.&#8221; They spend significant energy and chip area on complex tasks like caching, branch prediction, and managing independent threads.</p><p>A TPU, on the other hand, strips away all that baggage. It has no hardware for rasterization or texture mapping. Instead, it is built around an architecture called a systolic array.</p><p>The systolic array is the key differentiator. In a standard CPU or GPU, the chip moves data back and forth between the memory and the computing units for every calculation. This constant shuffling creates a bottleneck (the von Neumann bottleneck).</p><p>In a TPU&#8217;s systolic array, data flows through the chip like blood through a heart (hence &#8220;systolic&#8221;):</p><ol><li><p>It loads data (weights) once.</p></li><li><p>It passes inputs through a massive grid of multipliers.</p></li><li><p>The data is passed directly to the next unit in the array without writing back to memory.</p></li></ol><p>In essence, the systolic array drastically reduces the number of reads from and writes to HBM. As a result, the TPU can spend its cycles computing rather than waiting for data.</p><p>Google&#8217;s newest TPU design, called Ironwood, also addressed some of the key areas where the TPU was lacking:</p><ul><li><p>It enhanced the SparseCore for efficiently handling large embeddings (good for recommendation systems and LLMs).</p></li><li><p>It increased HBM capacity and bandwidth (up to 192 GB per chip). 
For comparison, Nvidia&#8217;s Blackwell B200 has 192 GB per chip, while Blackwell Ultra, also known as the B300, has 288 GB per chip.</p></li><li><p>It improved the Inter-Chip Interconnect (ICI) for linking thousands of chips into massive clusters, also called TPU Pods (needed for AI training as well as some test-time compute inference workloads). ICI is very performant, with a peak bandwidth of 1.2 TB/s vs. 1.8 TB/s for Blackwell&#8217;s NVLink 5. Despite the lower peak figure, Google&#8217;s ICI, together with its specialized compiler and software stack, still delivers superior performance on some specific AI tasks.</p></li></ul><p>The key thing to understand is that because the TPU doesn&#8217;t need to decode complex instructions or constantly access memory, it can deliver significantly more operations per joule.</p><p>For scale-out, Google uses Optical Circuit Switches (OCS) and its 3D torus network, which compete with Nvidia&#8217;s InfiniBand and Spectrum-X Ethernet. The main difference is that OCS is extremely cost-effective and power-efficient, as it eliminates electrical switches and O-E-O conversions, but because of this, it is not as flexible as the other two. So again, the Google stack is extremely specialized for the task at hand and doesn&#8217;t offer the flexibility that GPUs do.</p><p><strong>Performance numbers TPU vs GPU?</strong></p><p>Now that we have defined the differences, let&#8217;s look at real numbers showing how the TPU performs compared to the GPU. Since Google isn&#8217;t revealing these numbers, it is really hard to get details on performance. 
I studied many articles and alternative data sources, including interviews with industry insiders, and here are some of the key takeaways.</p><p>The first important thing is that there is very limited information on Google&#8217;s newest TPUv7 (Ironwood), as Google introduced it in April 2025 and it is just now starting to become available to external clients (internally, it is said that Google has already been using Ironwood since April, possibly even for Gemini 3.0). Why is this important? Compare TPUv7 with the older but still widely used TPUv5p, based on SemiAnalysis data:</p><ul><li><p>TPUv7 produces 4,614 TFLOPS (BF16) vs 459 TFLOPS for TPUv5p</p></li><li><p>TPUv7 has 192 GB of memory capacity vs 96 GB for TPUv5p</p></li><li><p>TPUv7 memory bandwidth is 7,370 GB/s vs 2,765 GB/s for TPUv5p</p></li></ul><p>The leap from v5p to v7 is very significant: roughly 10x the compute, 2x the memory capacity, and about 2.7x the memory bandwidth. To put that in context, most of the comments that we will look at are more focused on TPUv6 or TPUv5 than v7.</p><p>Based on analyzing a ton of interviews with former Google employees, customers, and competitors (people from AMD, Nvidia, and others), the summary of the results is as follows.</p><p>Most agree that TPUs are more cost-effective than Nvidia GPUs, and most agree that the performance per watt of TPUs is better. This view does not hold across all use cases, though.</p><p>A former Google Cloud employee:</p><div class="pullquote"><p>&#187;If it is the right application, then they can deliver much better performance per dollar compared to GPUs. They also require much lesser energy and produces less heat compared to GPUs. 
They&#8217;re also more energy efficient and have a smaller environmental footprint, which is what makes them a desired outcome.</p><p>The use cases are slightly limited to a GPU, they&#8217;re not as generic, but for a specific application, they can offer as much as 1.4X better performance per dollar, which is pretty significant saving for a customer that might be trying to use GPU versus TPUs.&#171;</p><p>source: <a href="https://www.alpha-sense.com/uncoveralpha/">AlphaSense</a></p></div><p>Similarly, a former unit head at Google made a very insightful comment on TPUs materially lowering AI-search cost per query vs GPUs:</p><div class="pullquote"><p>&#187;TPU v6 is 60-65% more efficient than GPUs, prior generations 40-45%&#171;</p></div><p>This interview was in November 2024, so the expert is probably comparing the v6 TPU with Nvidia&#8217;s Hopper. Today, the comparison is already Blackwell vs TPUv7.</p><p>Many experts also mention the speed benefit that TPUs offer, with a former Google head saying that TPUs are 5x faster than GPUs for training dynamic models (such as search-like workloads).</p><p>There was also a very eye-opening interview with a client who used both Nvidia GPUs and Google TPUs, in which he describes the economics in great detail:</p><div class="pullquote"><p>&#187;If I were to use eight H100s versus using one v5e pod, I would spend a lot less money on one v5e pod. In terms of price point money, performance per dollar, you will get more bang for TPU. If I already have a code, because of Google&#8217;s help or because of our own work, if I know it already is going to work on a TPU, then at that point it is beneficial for me to just stick with the TPU usage.</p><p>In the long run, if I am thinking I need to write a new code base, I need to do a lot more work, then it depends on how long I&#8217;m going to train. 
I would say there is still some, for example, of the workload we have already done on TPUs that in the future because as Google will add newer generation of TPU, they make older ones much cheaper.<br><br>For example, when they came out with v4, I remember the price of v2 came down so low that it was practically free to use compared to any NVIDIA GPUs.</p><p>Google has got a good promise so they keep supporting older TPUs and they&#8217;re making it a lot cheaper. If you don&#8217;t really need your model trained right away, if you&#8217;re willing to say, &#8220;I can wait one week,&#8221; even though the training is only three days, then you can reduce your cost 1/5.&#171;</p><p>source: <a href="https://www.alpha-sense.com/uncoveralpha/">AlphaSense</a></p></div><p>Another valuable interview was with a current AMD employee, who acknowledged the benefits of ASICs:</p><div class="pullquote"><p>&#187;I would expect that an AI accelerator could do about probably typically what we see in the industry. I&#8217;m using my experience at FPGAs. I could see a 30% reduction in size and maybe a 50% reduction in power vs a GPU.&#171;</p></div><p>We also got some numbers from a former Google employee who worked in the chip segment:</p><div class="pullquote"><p>&#187;When I look at the published numbers, they (TPUs) are anywhere from 25%-30% better to close to 2x better, depending on the use cases compared to Nvidia. Essentially, there&#8217;s a difference between a very custom design built to do one task perfectly versus a more general purpose design.&#171;</p></div><p>What is also known is that the real edge of TPUs lies not in the hardware but in the software and in the way Google has optimized its ecosystem for the TPU.</p><p>Many people point to the problem that every Nvidia &#187;competitor&#171;, the TPU included, faces: Nvidia develops so quickly that rivals are constantly &#187;catching up&#171;. 
This month, a former Google Cloud employee addressed that concern head-on; he believes TPUs are improving at a faster rate than Nvidia&#8217;s GPUs:</p><div class="pullquote"><p>&#187;The amount of performance per dollar that a TPU can generate from a new generation versus the old generation is a much significant jump than Nvidia&#171;</p></div><p>In addition, recent data from Google&#8217;s presentation at the Hot Chips 2025 event backs that up, as Google stated that the TPUv7 is 100% better in performance per watt than its TPUv6e (Trillium).</p><p>Even hardcore Nvidia advocates should not shrug off TPUs easily, as even Jensen Huang thinks very highly of them. In a podcast with Brad Gerstner, he mentioned that when it comes to ASICs, Google with TPUs is a &#187;special case&#171;. A few months ago, the WSJ also reported that after The Information published a report stating that OpenAI had begun renting Google TPUs for ChatGPT, Jensen called Sam Altman to ask whether it was true and signaled that he was open to getting the investment talks back on track. It is also worth noting that Nvidia&#8217;s official X account posted a screenshot of an article in which OpenAI denied plans to use Google&#8217;s in-house chips. To say the least, Nvidia is watching TPUs very closely.</p><p>OK, but after looking at some of these numbers, one might ask: why aren&#8217;t more clients using TPUs?</p><div><hr></div><p><strong>Where are the problems for the wider adoption of TPUs</strong></p><p>The main problem for TPU adoption is the ecosystem. 
Nvidia&#8217;s CUDA is ingrained in the minds of most AI engineers, as they have been learning CUDA at university. Google has developed its ecosystem internally but not externally, as it has used TPUs only for its internal workloads until now. TPUs use a combination of JAX and TensorFlow, while the industry skews to CUDA and PyTorch (although TPUs also support PyTorch now). While Google is working hard to make its ecosystem better supported and more compatible with other stacks, libraries and a surrounding ecosystem take years to develop.</p><p>It is also important to note that, until recently, the GenAI industry&#8217;s focus has largely been on training workloads. In training workloads, CUDA is very important, but when it comes to inference, even reasoning inference, CUDA matters much less, so the chances of expanding the TPU footprint in inference are much higher than in training (although TPUs do really well in training too, with Gemini 3 being the prime example).</p><p>The fact that most clients are multi-cloud also poses a challenge for TPU adoption, as AI workloads are closely tied to data and its location (cloud data transfer is costly). Nvidia is accessible via all three hyperscalers, while TPUs are available only on GCP so far. A client who uses TPUs and Nvidia GPUs explains it well:</p><div class="pullquote"><p>&#187;Right now, the one biggest advantage of NVIDIA, and this has been true for past three companies I worked on is because AWS, Google Cloud and Microsoft Azure, these are the three major cloud companies.</p><p>Every company, every corporate, every customer we have will have data in one of these three. All these three clouds have NVIDIA GPUs. 
Sometimes the data is so big and in a different cloud that it is a lot cheaper to run our workload in whatever cloud the customer has data in.</p><p>I don&#8217;t know if you know about the egress cost that is moving data out of one cloud is one of the bigger cost. In that case, if you have NVIDIA workload, if you have a CUDA workload, we can just go to Microsoft Azure, get a VM that has NVIDIA GPU, same GPU in fact, no code change is required and just run it there.</p><p>With TPUs, once you are all relied on TPU and Google says, &#8220;You know what? Now you have to pay 10X more,&#8221; then we would be screwed, because then we&#8217;ll have to go back and rewrite everything. That&#8217;s why. That&#8217;s the only reason people are afraid of committing too much on TPUs. The same reason is for Amazon&#8217;s Trainium and Inferentia.&#171;</p><p>source: <a href="https://www.alpha-sense.com/uncoveralpha/">AlphaSense</a></p></div><p>These problems are well known at Google, so it is no surprise that whether to keep TPUs inside Google or start selling them externally is a constant internal debate. Keeping them internal strengthens the GCP moat, but many former Google employees believe that at some point, Google will start offering TPUs externally as well, maybe through some neoclouds rather than through its two biggest competitors, Microsoft and Amazon. Opening up the ecosystem, providing support, and making it more widely usable are the first steps toward making that possible.</p><p>A former Google employee also mentioned that Google formed a more sales-oriented team last year to push and sell TPUs, so it&#8217;s not as if Google has been pushing hard to sell TPUs for years; it is a fairly new dynamic in the organization.</p><p><strong>Google&#8217;s TPU is the biggest competitive advantage of its cloud business for the next 10 years</strong></p><p>The most valuable thing for me about TPUs is their impact on GCP. 
As we witness the transformation of cloud businesses from the pre-AI era to the AI era, the biggest takeaway is that the industry has gone from an oligopoly of AWS, Azure, and GCP to a more commoditized landscape, with Oracle, CoreWeave, and many other neoclouds competing for AI workloads. The problem with AI workloads is this competition combined with Nvidia&#8217;s 75% gross margin, which leaves cloud providers with low margins on AI workloads. The cloud industry is moving from a 50-70% gross margin industry to a 20-35% gross margin industry. For cloud investors, this should be concerning, as the future profile of some of these companies is more like that of a utility than an attractive, high-margin business. But there is a way to avoid that future and return to normal margins: the ASIC.</p><p>The cloud providers who can control the hardware and are not beholden to Nvidia and its 75% gross margin will be able to return to the world of 50% gross margins. It is no surprise that all three, AWS, Azure, and GCP, are developing their own ASICs. The most mature by far is Google&#8217;s TPU, followed by Amazon&#8217;s Trainium, and lastly Microsoft&#8217;s Maia (although Microsoft owns the full IP of OpenAI&#8217;s custom ASICs, which could help it in the future).</p><p>Even with ASICs, you are not 100% independent, as you still have to work with someone like Broadcom or Marvell, whose margins are lower than Nvidia&#8217;s but still not negligible. Here, Google is again in a very good position. Over the years of developing TPUs, Google has managed to bring much of the chip design process in-house. According to a current AMD employee, Broadcom no longer knows everything about the chip. At this point, Google is the front-end designer (the actual RTL of the design), while Broadcom is only the back-end physical design partner. On top of that, Google owns the entire software optimization stack for the chip, which is what makes it as performant as it is. 
According to the AMD employee, based on this work split, Broadcom is lucky if it gets a 50-point gross margin on its part.</p><p>Without having to pay Nvidia for the accelerator, a cloud provider can either price its compute similarly to others and maintain a better margin profile, or lower its prices and gain market share. Of course, all of this depends on having a very capable ASIC that can compete with Nvidia. So far, it looks like Google is the only one that has achieved that, as the top-performing model, Gemini 3, was trained on TPUs. According to some former Google employees, Google is also using TPUs internally for inference across its entire AI stack, including Gemini and models like Veo. Google buys Nvidia GPUs for GCP because clients want them and are familiar with them and their ecosystem, but internally, Google is all-in on TPUs.</p><p>As the complexity of each generation of ASICs increases, similar to the complexity and pace of Nvidia, I predict that not all ASIC programs will make it. I believe that outside of TPUs, the only real hyperscaler shot right now is AWS Trainium, but even that faces much bigger uncertainties than the TPU. With that in mind, Google and its cloud business can come out of this AI era as a major beneficiary and market-share gainer.</p><p>Recently, we even got comments from the SemiAnalysis team praising the TPU:</p><div class="pullquote"><p>&#187;Google&#8217;s silicon supremacy among hyperscalers is unmatched, with their TPU 7<sup>th</sup> Gen arguably on par with Nvidia Blackwell. TPU powers the Gemini family of models which are improving in capability and sit close to the pareto frontier of $ per intelligence in some tasks&#171;</p><p>source: <a href="https://semianalysis.com/">SemiAnalysis</a></p></div><p><strong>How many TPUs does Google produce today, and how big can that get?</strong></p><p>Here are the numbers that I researched:</p>
      <p>
          <a href="https://www.uncoveralpha.com/p/the-chip-made-for-the-ai-inference">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Q3 earnings: Google's AI muscle, Meta Goes All in, Microsoft shows its cards]]></title><description><![CDATA[I wanted to share my thoughts and the key highlights from yesterday&#8217;s earnings calls from Google, Microsoft, and Meta, because I believe we got some very important signals for these companies and the industry at large.]]></description><link>https://www.uncoveralpha.com/p/q3-earnings-googles-ai-muscle-meta</link><guid isPermaLink="false">https://www.uncoveralpha.com/p/q3-earnings-googles-ai-muscle-meta</guid><dc:creator><![CDATA[UncoverAlpha]]></dc:creator><pubDate>Thu, 30 Oct 2025 12:00:43 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!PABX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F903345c2-11a3-42b7-9121-ff3e40def5d7_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>I wanted to share my thoughts and the key highlights from yesterday&#8217;s earnings calls from Google, Microsoft, and Meta, because I believe we got some very important signals for these companies and the industry at large.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PABX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F903345c2-11a3-42b7-9121-ff3e40def5d7_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PABX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F903345c2-11a3-42b7-9121-ff3e40def5d7_1536x1024.png 424w, 
https://substackcdn.com/image/fetch/$s_!PABX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F903345c2-11a3-42b7-9121-ff3e40def5d7_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!PABX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F903345c2-11a3-42b7-9121-ff3e40def5d7_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!PABX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F903345c2-11a3-42b7-9121-ff3e40def5d7_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PABX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F903345c2-11a3-42b7-9121-ff3e40def5d7_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/903345c2-11a3-42b7-9121-ff3e40def5d7_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2771784,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.uncoveralpha.com/i/177555929?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F903345c2-11a3-42b7-9121-ff3e40def5d7_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!PABX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F903345c2-11a3-42b7-9121-ff3e40def5d7_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!PABX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F903345c2-11a3-42b7-9121-ff3e40def5d7_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!PABX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F903345c2-11a3-42b7-9121-ff3e40def5d7_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!PABX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F903345c2-11a3-42b7-9121-ff3e40def5d7_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><strong>Google earnings</strong></p><p>I would summarize the key findings from the call as follows:</p><ul><li><p>Google Search is much better than expected, and the future of AI Search monetization is getting clearer</p></li><li><p>GCP continues to impress and win deals</p></li><li><p>Google is executing on AI</p></li></ul><p><strong>Search</strong></p><p>Google Search delivered impressive results, generating $56.6B in revenue, up 14.6% YoY.</p><p>Sundar explained:</p><div class="pullquote"><p>&#187;AI is driving an expansionary moment for Search. As people learn what they can do with our new AI experiences, they&#8217;re increasingly coming back to Search more.&#171;</p></div><p>Google already said on the Q2 earnings call that overall queries and commercial queries on Search continue to grow YoY. Yesterday, they added even more color:</p><div class="pullquote"><p>&#187;During the Q2 call, we shared that overall queries and commercial queries continue to grow year-over-year. This growth rate increased in Q3, largely driven by our AI investments in Search, most notably AI Overviews and AI Mode.&#171;</p></div><p>So AI Overviews in particular are prompting people to search more, which is enhancing the Search experience and giving us a first glimpse of what future AI search will look like. 
Google also said that AI Mode is resonating well with users and that it now has 75M DAUs.</p><p>Ad clicks in the quarter were up 7% YoY, and CPCs were up 7% YoY.</p><p>A very interesting segment of the call was when Google discussed the monetization of AI Search.</p><div class="pullquote"><p>&#187;And as I&#8217;ve shared before, for AI Overviews, even at our current baseline of ads below and within the AI&#8217;s response, overall, we see the monetization at approximately the same rate.&#171;</p></div><p>But the bigger AHA moment for me on the call was when Google hinted at a path where AI Search can also make traditionally non-monetizable search queries monetizable:</p><div class="pullquote"><p>&#187;You could also argue that on queries, that historically have not been well monetized. We think there is a potential opportunity here where you can obviously imagine that we can build this out with smart AI integration.&#171;</p><p>&#187;there&#8217;s an opportunity to actually take, let&#8217;s say, queries that are not fully commercial but could have an adjacent commercial relationship to basically expand this into more attractive ads offerings without -- while really creating a really interesting user experience at the same time.&#171;</p></div><p>This is a very important point for investors, as only about 20% of traditional Search queries are commercial. If AI Search can unlock more of that pie while AI Overviews monetize at a rate similar to traditional Search, the ad TAM for AI Search could be even bigger than the traditional Search TAM, which most investors don&#8217;t expect right now.</p><p><strong>GCP</strong></p><p>As expected, and as I <a href="https://www.uncoveralpha.com/p/q3-2025-channel-checks-and-other">shared in my last article</a>, which covered the alt data and my expectations for GCP, AWS, and Azure, GCP delivered a really strong quarter.</p><p>Google Cloud revenue was $15.2B, up 34% YoY, but within Google 
Cloud, GCP continued to grow at a rate much higher than Cloud&#8217;s overall revenue growth rate, as management mentioned. Based on this comment and other data that I am seeing, my guess is that GCP is growing 40%+ YoY.</p><p>Another impressive stat was Google Cloud&#8217;s backlog, given that they don&#8217;t have a prime customer such as OAI and that Anthropic&#8217;s usage is split between AWS and GCP:</p><div class="pullquote"><p>&#187;Google Cloud&#8217;s backlog increased 46% sequentially (quarter over quarter) and 82% year-over-year, reaching $155 billion at the end of the third quarter.&#171;</p></div><p>As I already shared, and as management has now confirmed, GCP is winning big with new clients:</p><div class="pullquote"><p>&#187;The number of new GCP customers increased by nearly 34% year-over-year. Two, we are signing larger deals. We have signed more deals over $1 billion through Q3 this year than we did in the previous 2 years combined.&#171;</p></div><p>Client diversification also seems very healthy, especially as I am more and more concerned about the industry&#8217;s concentration on two clients:</p><div class="pullquote"><p>&#187;Over the past 12 months, nearly 150 Google Cloud customers each processed approximately 1 trillion tokens with our models for a wide range of applications.&#171;</p></div><p><strong>AI execution</strong></p><p>An important data point was that the Gemini app now has over 650M MAUs, and that queries increased by 3x from Q2 of this year. 
Just to put it in context, ChatGPT probably has around 1B MAUs, so Gemini has made significant gains in the last quarter.</p><p>Gemini adoption is also strong among enterprises, not just end users:</p><div class="pullquote"><p>&#187;Our first-party models, like Gemini, now process 7 billion tokens per minute via direct API used by our customers.&#171;</p></div><p>On the important question of whether models are advancing at a slower pace, Sundar acknowledged it but also hinted that this is why Gemini 3.0 is coming out a few months later than expected:</p><div class="pullquote"><p>&#187;I&#8217;m incredibly impressed by the pace at which the teams are executing and the pace at which we are improving these models. But it also is true at the same time that each of the prior model you&#8217;re trying to get better over is now getting more and more capable. So I think both the pace is increasing, but sometimes we are taking the time to put out a notably improved model, so I think -- and that may take slightly longer.&#171;</p></div><p>And yes, we got confirmation: Gemini 3.0 is coming out THIS year.</p><p><strong>Google Summary</strong></p><p>All in all, a great quarter by Google. It showed not only that GCP is taking the most market share right now and is uniquely positioned with its TPU offering, but also what the future of Google Search is going to look like. That future might be even better than the past, which is a big message.</p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><p><strong>Microsoft earnings</strong></p><p>With Microsoft, my main focus is on what's most important: Azure. 
We got a strong Azure number, with Azure growing 40% (39% in constant currency), but the questions and concerns were focused on customer concentration and the relationship with OpenAI.</p><p>And we got some really great insights on that topic and how Satya and Microsoft are thinking about this going forward.</p><p>First of all, as you might expect just from having OAI as a client, forward bookings are off the charts:</p><div class="pullquote"><p>&#187;Commercial bookings increased 112% and 111% in constant currency and were significantly ahead of expectations, driven by Azure commitments from OpenAI as well as continued growth in the number of $100 million-plus contracts for both Azure and M365. These results do not include any impact from the incremental $250 billion Azure commitments from OpenAI announced yesterday. Commercial remaining performance obligation increased to $392 billion and was up 51% year-over-year.&#171;</p></div><p>But the question is not just how big your bookings are, but whether the most important client, OAI, can pay for those orders, and what happens if Microsoft overbuilds because of that one client.</p><p>Microsoft gave us an answer to that and showed how they are viewing things.</p><p>Throughout the call, management was very careful to send the following message: Yes, we will expand and build new data centers at a high pace over the next 2 years, but we are doing so based on the short-term demand we already have, meaning the next 2 years.</p><p>So they are saying that Microsoft is not taking much risk because it is matching its buildout to short-term demand, since OAI&#8217;s ability to honor short-term orders is much easier to judge than its long-term projections.</p><p>It was also clear from the call that Microsoft's biggest risk is the short life expectancy of GPUs. In a way, they actually hinted that they see them as 2-year duration assets. 
Here is an important segment on that from their CFO:</p><div class="pullquote"><p>&#187;Let me talk a little bit about maybe connecting a couple of the dots because with $400 billion of RPO, that&#8217;s sort of short-dated as we talked about, our needs to continue to build out the infrastructure is very high. And that&#8217;s for booked business today. That is not any new booked business we started trying to accomplish on October 1, right?</p><p>And so the way to think about that, and you saw it this quarter in particular, and as we talked about &#8216;26, the remainder, number one, we&#8217;re pivoting toward -- increasingly, we talked about this short-lived assets, both GPUs and CPUs, Again, we talk about all these workloads are burning both in terms of app building. Now when that happens, short-lived assets generally are done to match sort of the duration of the contracts or the duration of your expectation of those contracts. And so I sometimes think when people think about risk, they&#8217;re not realizing that most of the lifetimes of these and the lifetime of the contracts are very similar.</p><p>And so when you think about having revenue and the bookings and coming on the balance sheet, the depreciation of short-lived assets, they&#8217;re actually quite matched, Mark. And as you know, we&#8217;ve spent the past few years not actually being short GPUs and CPUs per se, we were short the space or the power is the language we used to put them in. So we spent a lot of time building out that infrastructure. Now we&#8217;re continuing to do that also using leases. 
Those are very long-lived assets, as we&#8217;ve talked about 15 to 20 years.</p><p>And over that period of time, do I have confidence that we&#8217;ll need to use all of that, it is very high.</p><p>And so when I think about sort of balancing those things, seeing the pivot to GPU, CPU short-lived, seeing the pivot in terms of how those are being utilized, we are -- and I said this now, we&#8217;ve been short now for many quarters. I thought we were going to catch up, we are not. Demand is increasing. It is not increasing in just one place. It is increasing across many places.</p><p>We&#8217;re seeing usage increases in products. We are seeing new products launch that are getting increasing usage, and increasing usage very quickly. When people see real value, they actually commit real usage.</p><p>And I sometimes think this is where this cycle needs to be thought through completely is that when you see these kind of demand signals and we know we&#8217;re behind, we do need to spend. But we&#8217;re spending with a different amount of confidence in usage patterns and in bookings, and I feel very good about that. I have said we are now likely to be short capacity to serve the most important things we need to do, which is Azure, our first-party applications. We need to invest in product R&amp;D and we&#8217;re doing end-of-life replacements in the fleet. So we&#8217;re going to spend to make sure that happens.</p><p>It&#8217;s about modernization. It&#8217;s about high quality. It&#8217;s about service delivery, and it&#8217;s about meeting demand.&#171;</p></div><p>Satya added to this, confirming the worry about the useful life of GPUs when much more capable GPUs arrive every year:</p><div class="pullquote"><p>&#187;The second thing that we&#8217;re also doing is continually modernizing the fleet. It&#8217;s not like we buy one version of, say, NVIDIA and load up for all the gigawatts we have. 
Each year, you buy, you ride the Moore&#8217;s Law, you continuously modernize and depreciate it. And that means you also use software to grow efficiency.&#171;</p></div><p>Satya also communicated between the lines that they don&#8217;t want to customize any of their data centers for OpenAI, as they want to hedge:</p><div class="pullquote"><p>&#187;But it&#8217;s great to have the hit first-party apps in the beginning because you can build scale that then if it&#8217;s a fungible and that&#8217;s where the key is. You don&#8217;t want to build for a digital native in -- as if you&#8217;re just doing hosting for them. You want to build. That&#8217;s where -- I think some of the decision-making of ours is probably getting better understood. What do we say yes to, what do we say no to.&#171;</p></div><p>On Azure&#8217;s 40% growth, we also got a hint that a lot of that growth actually comes from OAI:</p><div class="pullquote"><p>&#187;Results were ahead of expectations, driven by better-than-expected growth in our core infrastructure business, primarily from our largest customers.&#171;</p></div><p>Another important piece of information Microsoft laid out on the call was how they will expand capacity over the next 2 years and what this means for Azure revenue:</p><div class="pullquote"><p>&#187;We will increase our total AI capacity by over 80% this year and roughly double our total data center footprint over the next 2 years, reflecting the demand signals we see.&#171;</p></div><p>The last comment I would highlight is Satya&#8217;s remark that, in the end, Microsoft&#8217;s success depends not on OpenAI but on their own models, which I found quite interesting:</p><div class="pullquote"><p>&#187;And then we have to fund our own R&amp;D and model capability because in the long run, that&#8217;s what&#8217;s going to differentiate us.&#171;</p></div><p><strong>Microsoft Summary</strong></p><p>It was a great quarter for Microsoft and Azure based on the numbers, but there is growing 
concern about the concentration on one client. It would be interesting to see what Azure growth would be like ex-OAI.</p><p><strong>Meta earnings</strong></p><p>I can summarize the earnings exactly as I laid out in <a href="https://www.uncoveralpha.com/p/q3-2025-channel-checks-and-other">this post</a> a few days ago. The core business of the family of apps is on fire, but Zuck has his eyes set on AI and wants to have an OpenAI inside of Meta. What this brings, at least in the short term, is heavy pressure on Meta&#8217;s profits and FCF, as OpenAI&#8217;s business model is a heavy cash-burn one.</p><p><strong>The core family of apps</strong></p><p>Let&#8217;s first look at the core business. Revenue was $51.2B, up 26% YoY.</p><div class="pullquote"><p>&#187;Across Facebook, Instagram and Threads, our AI recommendation systems are delivering higher quality and more relevant content, which led to 5% more time spent on Facebook in Q3 and 10% on Threads.&#171;</p></div><p>AI is having a profound effect on Meta across both ad targeting and engagement trends.</p><p>Reels now has an annual run rate of over $50 billion.</p><div class="pullquote"><p>&#187;And now the annual run rate going through our completely end-to-end AI-powered ad tools has passed $60 billion.&#171;</p></div><p>The big WOW moment for me on the call, when it comes to the core, was:</p><div class="pullquote"><p>&#187;In the U.S., overall time spent on Facebook and Instagram grew double digits year-over-year, driven by continued video strength as well as healthy growth in non-video time on Facebook.&#171;</p></div><p>Time spent on both Facebook and Instagram is accelerating in Q3!</p><p>We also saw the strong growth trends continue in direct subscriptions, WhatsApp paid messaging, and click-to-WhatsApp ads:</p><div class="pullquote"><p>&#187;Family of Apps other revenue was $690 million, up 59%, driven by WhatsApp paid messaging revenue growth as well as Meta Verified 
subscriptions&#171;</p><p>&#187;We&#8217;re seeing strong growth across our portfolio of solutions, including with click-to-WhatsApp ads, which grew revenue 60% year-over-year in Q3.&#171;</p></div><p>We also got word that Meta&#8217;s Ray-Ban Display glasses are sold out.</p><p><strong>CapEx bomb</strong></p><p>But then we moved to the portion where Zuck is going all in and ready to burn a ton of cash. The 2026 &#187;soft&#171; guidance was the key for many investors and their fears:</p><div class="pullquote"><p>&#187;As we have begun to plan for next year, it&#8217;s become clear that our compute needs have continued to expand meaningfully, including versus our own expectations last quarter. We are still working through our capacity plans for next year, but we expect to invest aggressively to meet these needs, both by building our own infrastructure and contracting with third-party cloud providers. We anticipate this will provide further upward pressure on our CapEx and expense plans next year. As a result, our current expectation is that CapEx dollar growth will be notably larger in 2026 than 2025. 
We also anticipate total expenses will grow at a significantly faster percentage rate in 2026 than 2025, with growth primarily driven by infrastructure costs, including incremental cloud expenses and depreciation.&#171;</p></div><p>Zuck added things like:</p><div class="pullquote"><p>&#187;We&#8217;re also building what we expect to be an industry-leading amount of compute.&#171;</p><p>&#187;I think that it&#8217;s the right strategy to aggressively frontload building capacity so that way we&#8217;re prepared for the most optimistic cases.&#171;</p><p>&#187;If it takes longer, then we&#8217;ll use the extra compute to accelerate our core business which continues to be able to profitably use much more compute than we&#8217;ve been able to throw at it.&#171;</p><p>&#187;But any compute that we don&#8217;t need for that we feel pretty good that we&#8217;re going to be able to absorb a very large amount of that to just convert into more intelligence and better recommendations in our family of apps and ads in a profitable way.&#171;</p></div><p>Meta is saying: in the best-case scenario, we have the compute and become the next OAI; in the worst case, we are frontloading some of the CapEx that our core business will need in the future anyway:</p><div class="pullquote"><p>&#187;So we&#8217;re really trying to plan ahead not only to ensure that we have the capacity we need in 2026, but also to give ourselves the sort of flexibility and option value to have the capacity that we think we could need in &#8216;27 and &#8216;28&#171;</p></div><p>This strategy could be risky: a data center investment is fine because it is a long-duration asset, but frontloading too many GPUs could be dangerous. Microsoft is doing the opposite, matching its short-lived GPU purchases to short-term demand. 
It all comes down to the fact that Meta wants to be OpenAI, and when you want to be OpenAI, you also have to have a similar P&amp;L profile in the coming years.</p><p>A comment that made me jump came when Mark, trying to calm investors, hinted that if Meta overbuilds, they are open to becoming a compute provider to others:</p><div class="pullquote"><p>&#187;Now I mean, it&#8217;s, of course, possible to overshoot that, right? And if we do, I mean, this is what I mentioned in my comments, then we see that there&#8217;s just a lot of demand for other new things that we build internally, externally, like almost every week, people come to us from outside the company asking us to stand up an API service or asking if we have different compute that they could get from us and we haven&#8217;t done that yet. But obviously, if you got to a point where you overbuilt, you could have that as an option.&#171;</p></div><p><strong>Meta Summary</strong></p><p>For me, this quarter shows just how strong Meta's core business is and how AI is a huge enabler of it. If Zuck weren&#8217;t so ambitious about becoming a frontier LLM builder, the stock would probably be up this quarter, but the fact is, he is. The reasons are quite obvious: if Meta delivers and becomes the frontier LLM provider, its long-term margin and growth profiles are much better than if it does not; but in the short term, this means Meta has to risk current profits and cash flow to even have a chance at becoming that.</p><p>As I have said before, this outcome was expected for me, and my position in Meta was as small as it has ever been coming into this quarter. But as investors digest this new short-term reality for Meta, where profits and FCF will shrink drastically over the next 2-3 years, I will be looking for chances to build up my position again and make it a core holding of mine.</p><p>As always, I hope you found this article valuable. 
I would appreciate it if you could share it with people you know who might find it interesting. I also invite you to become a paid subscriber, as paid subscribers get additional articles covering big tech companies in more detail, as well as mid-cap and small-cap companies that I find interesting.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/subscribe&quot;,&quot;text&quot;:&quot;Subscribe to Paid&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/subscribe"><span>Subscribe to Paid</span></a></p><p>Thank you!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.uncoveralpha.com/p/q3-earnings-googles-ai-muscle-meta?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.uncoveralpha.com/p/q3-earnings-googles-ai-muscle-meta?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p><strong>Disclaimer:</strong></p><p>I own Meta (META), Google (GOOGL), and Microsoft (MSFT) stock.</p><p>Nothing contained in this website and newsletter should be understood as investment or financial advice. All investment strategies and investments involve the risk of loss. Past performance does not guarantee future results. Everything written and expressed in this newsletter is only the writer&#8217;s opinion and should not be considered investment advice. Before investing in anything, know your risk profile and if needed, consult a professional. Nothing on this site should ever be considered advice, research, or an invitation to buy or sell any securities.</p>]]></content:encoded></item></channel></rss>