NVIDIA has emerged as the leader, Google, Amazon, and Huawei are solidly profitable, and AMD faces unexpected losses. AI inference is not only a technological revolution but also a business that can be precisely calculated and is highly rewarding.
AI inference is an astonishingly profitable business.
Morgan Stanley's latest major report, for the first time, has calculated the economic return of the global AI computing power competition using precise financial models. The conclusion is that a standard 'AI inference factory,' regardless of which giant's chip is used, generally has an average profit margin exceeding 50%.
Among them, the GB200 from $NVIDIA (NVDA.US)$ has claimed the crown with a profit margin of nearly 78%, while the chips from $Alphabet-C (GOOG.US)$, Amazon, and Huawei are also 'guaranteed to be profitable.' By contrast, although the market has high expectations for $Advanced Micro Devices (AMD.US)$, its AI platforms recorded significant losses in inference scenarios.


Profitability Rankings: A Tale of Two Extremes
Morgan Stanley's model calculations reflect the divergence in profitability among AI hardware giants in real business scenarios, a clear 'song of ice and fire.'
The flames belong to NVIDIA, Google, Amazon, and Huawei.
The report shows that an AI factory using NVIDIA's flagship GB200 NVL72 achieves a striking 77.6% profit margin, leading the pack. This is due not only to its unmatched compute, memory, and networking performance, but also to continuous innovations such as FP4 precision and the strength of the CUDA software ecosystem, underscoring its absolute market dominance.

Google's self-developed TPU v6e pod follows closely with a profit margin of 74.9%, proving that top cloud providers are fully capable of building highly cost-effective AI infrastructure through optimized hardware-software collaboration.
Similarly, AWS's Trn2 UltraServer achieved a profit margin of 62.5%, while Huawei's Ascend CloudMatrix 384 platform recorded a profit margin of 47.9%.
The ice water, unexpectedly, was poured on AMD.
The most disruptive conclusion of the report is AMD's financial performance in inference scenarios. Data calculated by Morgan Stanley indicates that the AI factories using its MI300X and MI355X platforms have profit margins of -28.2% and -64.0%, respectively.
The core reason for the losses lies in the severe imbalance between high costs and output efficiency. The report indicates that the annual total cost of ownership (TCO) for one MI300X platform reaches as high as $774 million, comparable to NVIDIA's GB200 platform at $806 million.
In other words, the upfront investment and ongoing expenses of operating the AMD solution are top-tier, yet in the inference workloads the model simulates (which the report expects to account for 85% of the future AI market), the revenue generated by its token output falls far short of covering those high costs.
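The relationship between margin, cost, and revenue can be checked with simple arithmetic. The sketch below is an illustrative back-of-the-envelope calculation based only on the TCO and margin figures quoted above, not on any formula published in the report; the implied revenue is derived, not reported.

```python
def implied_annual_revenue(tco_usd: float, margin: float) -> float:
    """Solve margin = (revenue - tco) / revenue for revenue."""
    return tco_usd / (1.0 - margin)

# MI300X: annual TCO ~$774M at a -28.2% margin implies annual revenue of
# roughly $604M, i.e. about $170M short of covering its costs each year.
revenue = implied_annual_revenue(774e6, -0.282)
shortfall = 774e6 - revenue
print(f"implied revenue: ${revenue/1e6:.0f}M, shortfall: ${shortfall/1e6:.0f}M")
```

The same identity run in reverse on the GB200's $806 million TCO and 77.6% margin implies annual revenue of roughly $3.6 billion, which shows how strongly token output efficiency, not cost alone, drives the rankings.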

The "100MW AI Factory" Model: Quantifying Investment Returns
Supporting the above conclusion is a standardized analytical framework pioneered by Morgan Stanley—the "100MW AI Factory Model." It quantitatively assesses AI solutions across different technological paths within the same commercial dimension, centered around three main pillars:
1. Standardized "Computational Power Unit": The model uses 100 megawatts (MW) of power consumption as the baseline unit for the "AI factory." This represents the typical power consumption of a medium-sized data center, sufficient to drive approximately 750 high-density AI server racks.
2. Detailed "Cost Ledger": The model comprehensively calculates the Total Cost of Ownership (TCO), primarily including:
Infrastructure Costs: Approximately $660 million in capital expenditure per 100MW for the construction of data centers and supporting power facilities, depreciated over 10 years.
Hardware costs: $367 million to $2.273 billion in total for server systems (including AI chips), depreciated over 4 years.
Operating costs: Ongoing electricity expenses calculated based on the power usage effectiveness (PUE) of different cooling schemes and the global average electricity price.
A comprehensive estimate suggests that the annual average total cost of ownership (TCO) for a 100MW 'AI factory' ranges between $330 million and $807 million.
3. Market-oriented 'revenue formula': Revenue is directly linked to token output. The model derives tokens per second (TPS) from publicly available performance data for each platform, sets a fair price of $0.20 per million tokens by referencing the pricing of mainstream APIs such as OpenAI's and Gemini's, and applies a 70% equipment utilization rate to bring revenue forecasts closer to commercial reality.
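Putting the three pillars together, the model's revenue side reduces to a few lines of arithmetic. In the sketch below, the $0.20-per-million-tokens price, the 70% utilization rate, and the TCO band come from the report as summarized above; the 500M tokens/s throughput and $570M TCO in the example are illustrative placeholders, not figures from the report.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
PRICE_PER_MILLION_TOKENS = 0.20  # USD, benchmarked to mainstream API pricing
UTILIZATION = 0.70               # effective equipment utilization per the report

def annual_revenue(tokens_per_second: float) -> float:
    """Annual token revenue for a 100MW factory at a sustained throughput."""
    tokens = tokens_per_second * SECONDS_PER_YEAR * UTILIZATION
    return tokens / 1e6 * PRICE_PER_MILLION_TOKENS

def profit_margin(revenue: float, tco: float) -> float:
    """Profit margin as a fraction of revenue."""
    return (revenue - tco) / revenue

# Hypothetical example: a factory sustaining 500M tokens/s against a
# mid-range annual TCO of $570M (inside the report's $330M-$807M band).
rev = annual_revenue(500e6)
print(f"revenue: ${rev/1e9:.2f}B, margin: {profit_margin(rev, 570e6):.1%}")
```

The example makes the report's core point concrete: at a fixed price per token and a fixed power budget, margin is dominated by how many tokens a platform can emit per watt, which is why platforms with similar TCOs can land at opposite ends of the profitability table.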
The Future Battlefield: Ecosystem Competition and Product Roadmaps
Behind the profitability lies a deeper strategic game. The report reveals that the future battleground for AI will focus on the construction of technological ecosystems and the layout of next-generation products.
In the non-NVIDIA camp, a war over 'connection standards' has already begun. Manufacturers led by AMD are vigorously promoting UALink, arguing that its strict low-latency specifications are crucial for AI performance, while a camp represented by $Broadcom (AVGO.US)$ advocates a more open and flexible Ethernet solution. The outcome of this debate will determine who can build an open ecosystem capable of competing with NVIDIA's NVLink.
Meanwhile, NVIDIA is solidifying its leading position with a clear roadmap. The report mentions that its next-generation platform, "Rubin," is progressing as planned and is expected to enter mass production in the second quarter of 2026, with related servers anticipated to begin scaling in the third quarter of the same year. This sets a continually moving, ever-higher target for all competitors.
In summary, Morgan Stanley's report injects a dose of 'commercial rationality' into the fervent AI market. It eloquently demonstrates that AI inference is not just a technological revolution but also a business that can be precisely calculated and yields substantial returns.
For global decision-makers and investors, the profitability comparisons above offer considerable reference value for computing-power investments in the AI era.
Editor/Rocky