A Dependency Investigation (#05) – May 2026
About This Series
The Magician’s Hands is a series of dependency investigations. Each report examines a single case in which a structural dependency (between a state and an infrastructure owner, a farmer and a seed company, a continent and an energy supplier) was created, normalised, leveraged, and converted into power. The cases span domains and decades. The grammar beneath them does not change.
The series takes its name from a simple observation: the most consequential things happening in the world are rarely the things that take centre stage. While we watch the visible hands, something else is being built in the structural layer underneath. These reports are an attempt to make that layer legible.
The founding article sets out the full grammar. Each investigation that follows applies it to a specific case.
Scene-setter
The story that is publicly told about Nvidia is a story about vision rewarded. A graphics chip company, founded in 1993 to serve the gaming market, made a series of bets on parallel computation that turned out to be exactly what the artificial intelligence revolution required. By the time the world understood what large-scale AI training demanded, Nvidia had already built the hardware, the software, and the developer community to supply it. The narrative is one of entrepreneurial foresight meeting historical contingency: the right product, in the right place, at the right moment.
That story is not false. It is incomplete in a way that is structurally significant. What it foregrounds is innovation and adoption. What it backgrounds is the architecture of irreversibility that was being assembled simultaneously. While the visible hands moved — chip releases, benchmark victories, soaring valuations — something else was being built in the layer beneath: a dependency so thoroughly embedded in the toolchains, workflows, institutional knowledge, and research cultures of everyone doing serious AI work that the question of whether to use Nvidia hardware had quietly ceased to be a question at all.
This investigation examines how that dependency was created, normalised, leveraged, and obscured, and what has been converted in the process. The moves that follow will show that Nvidia’s dominance is not primarily a hardware story. It is a software and institutional knowledge story to which hardware happens to be attached, and that distinction matters enormously for understanding both the depth of the dependency and the near-impossibility of exit.
Move 1: Creating the Dependency
The entry point is creation. Nothing was captured under duress; no existing flow was quietly infiltrated. Nvidia built something, offered it as a tool, and the field of machine learning walked through the door with full awareness and rational intent. The dependency that followed was not a trap. It was a gift that turned structural.
The founding act was CUDA, the Compute Unified Device Architecture, launched in 2007. CUDA was a programming platform that allowed developers to write general-purpose code that ran on Nvidia’s graphics processing units, exploiting their capacity for massively parallel computation. At launch, the primary audience was scientific computing: physics simulations, financial modelling, computational biology. Machine learning was a minor presence at the frontier of academic research, not yet the consuming preoccupation of the technology industry it would become.
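What that platform shift looked like at the level of code is worth a glance. Below is a minimal sketch of the CUDA programming model, written here against Numba’s CUDA bindings in Python rather than Nvidia’s own C++ toolchain; the kernel, array sizes, and launch parameters are illustrative, and running it assumes an Nvidia GPU with the CUDA toolkit installed.

```python
# A minimal sketch of the CUDA programming model (illustrative;
# assumes an Nvidia GPU and the CUDA toolkit, via Numba).
import numpy as np
from numba import cuda

@cuda.jit
def vector_add(a, b, out):
    # One lightweight GPU thread per element: the massively parallel
    # pattern that made graphics hardware useful far beyond graphics.
    i = cuda.grid(1)
    if i < out.shape[0]:
        out[i] = a[i] + b[i]

n = 1_000_000
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
out = np.zeros_like(a)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
vector_add[blocks, threads_per_block](a, b, out)  # launch on the GPU
```

The pattern, one thread per data element, is the entire premise: physics simulations, financial models, and eventually neural network training all decompose this way, and so all could ride on hardware built for rendering games.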
The inflection point arrived in 2012. A neural network called AlexNet, developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton at the University of Toronto, won the ImageNet Large Scale Visual Recognition Challenge by a margin that was not incremental but categorical: a top-5 error rate of 15.3%, against 26.2% for the runner-up. AlexNet was trained on two Nvidia GTX 580 GPUs in Krizhevsky’s bedroom at his parents’ house. The result demonstrated empirically that GPU-accelerated deep learning could solve problems that prior approaches had failed to crack. The field reoriented almost immediately. Researchers who had been sceptical of neural networks began running experiments. Experiments required GPUs. GPUs that could run CUDA code were Nvidia GPUs. The adoption that followed was rational at every individual decision point. A researcher choosing to build on CUDA in 2013 or 2015 was choosing the platform with the largest library of optimised code, the most active developer community, the most comprehensive documentation, and the hardware that benchmarks consistently showed to be fastest for training. Each of those reasons was genuine. None of them was visible as a dependency at the moment of adoption. They were visible as advantages.
The hook was real utility. CUDA genuinely accelerated AI research. The structural consequence that individual adopters could not see at the moment of adoption was that the choice of platform was not merely a technical preference. It was an institutional commitment. Doctoral students trained on CUDA-based workflows carried those workflows into their first research positions, then into industry roles, then into the institutions they founded or led. The toolchain became the curriculum. The curriculum became the professional standard. By the time Nvidia’s A100 GPU was released in 2020 and the H100 in 2022, the question facing an AI research lab was not whether to use Nvidia hardware. It was how many units to secure and how quickly. The moment at which the dependency became costly to reverse had passed years earlier, quietly, during a period in which no individual actor experienced it as a consequential choice.
Move 2: Normalising the Dependency
Normalisation operated through both mechanisms the grammar identifies, and they worked in close alignment.
The narrative was supplied early and generously. Nvidia positioned CUDA not as a platform strategy but as an act of community investment: tooling made available to researchers, libraries maintained and expanded, developer conferences held annually, academic partnerships funded. The GTC conference, Nvidia’s primary developer event, constructed an identity for the AI development community as participants in a shared technical project rather than customers of a single vendor. Jensen Huang, Nvidia’s co-founder and chief executive, became a figure of unusual cultural authority within the AI research world, a founder-technologist whose keynotes were followed with the attention usually reserved for scientific announcements. The dominant narrative foregrounded the platform as infrastructure for discovery. It did not foreground the terms on which that infrastructure was held. Counter-voices who raised concerns about concentration risk, about the dangers of a critical research stack being held by a single corporation with a single set of commercial incentives, were typically categorised as technically uninformed or commercially motivated. The concerns did not land because the platform was, in fact, delivering results. The results made the concerns seem like the complaints of people who did not understand how science worked.
Operational normalisation ran deeper. CUDA became not just the standard library but the cognitive substrate of AI development. Frameworks including TensorFlow and PyTorch were built to run on CUDA. Optimisation techniques, debugging tools, profiling instruments, and deployment pipelines all assumed a CUDA-compatible environment. A researcher working in 2018 did not think of themselves as choosing Nvidia each morning. They thought of themselves as opening their development environment. The dependency had ceased to be a relationship and had become infrastructure: invisible because it was load-bearing. Switching cost at this point was not merely financial. It was the cost of retraining teams, rebuilding toolchains, forgoing years of accumulated optimisation, and accepting degraded performance benchmarks during a period when benchmark performance was directly tied to the ability to attract funding, talent, and publication outcomes. Exit had become expensive before anyone had attempted to use the dependency as leverage.
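The invisibility is easy to demonstrate. The following is the idiom that sits at the top of countless research scripts (a sketch in PyTorch; the model and tensor shapes are illustrative):

```python
# The everyday idiom of AI research code (illustrative PyTorch).
import torch

# "cuda" is the literal string baked into millions of scripts.
# Choosing Nvidia is not a decision here; it is the default.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(512, 10).to(device)
x = torch.randn(32, 512, device=device)
y = model(x)  # runs on the GPU if one is present, silently on the CPU if not
```

Note what the vendor dependency has become in this snippet: a single lowercase string, named not after a company but after its proprietary platform, functioning as the generic word for GPU. That is what infrastructure looks like from the inside.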
Move 3: Leveraging the Dependency
The leverage, when it came, arrived from an unexpected direction. It was not Nvidia that first moved to exploit the structural position. It was the United States government.
On October 7, 2022, the US Department of Commerce’s Bureau of Industry and Security issued new export controls restricting the sale of Nvidia’s A100 and H100 chips to China. The policy was framed publicly as a national security measure, preventing advanced AI capabilities from reaching an adversarial state. What it revealed in structural terms was that the US government understood Nvidia’s position as a strategic instrument. A dependency that had been built through commercial activity and normalised through technical adoption had become, at the geopolitical scale, a chokepoint. The ability to participate in frontier AI development, for any actor anywhere, now ran through a single company operating under US jurisdiction.
Nvidia’s response to the export controls illustrates the direct leverage form. The company developed modified chips — the A800, with its NVLink interconnect bandwidth reduced from 600 to 400 gigabytes per second, and the H800, with its bandwidth reduced from 900 to approximately 400 gigabytes per second — intended to satisfy the regulatory threshold for the Chinese market. On October 17, 2023, the Bureau of Industry and Security tightened the controls further, switching from interconnect speed as the defining criterion to total processing performance and performance density, rendering the modified chips subject to restrictions as well. The sequence demonstrated that the rules of participation in AI development were being written by parties other than those doing the development, and that the rules could be changed faster than the dependent parties could adapt.
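To make the October 2023 criterion concrete, a rough sketch of the control logic as publicly reported is below. The formula (total processing performance as twice the multiply-accumulate throughput scaled by operand bit length) and the thresholds are paraphrased from coverage of the BIS rule and should be treated as assumptions for illustration, not as the legal text; the H100-class figures are likewise approximate.

```python
# Rough sketch of the October 2023 control criterion as publicly
# reported (formula and thresholds are assumptions for illustration,
# not the legal text).

def total_processing_performance(mac_tops: float, bit_length: int) -> float:
    # TPP = 2 x MacTOPS x bit length: a multiply-accumulate counts
    # as two operations.
    return 2 * mac_tops * bit_length

# Approximate H100-class figures: ~495 dense FP16 MacTOPS, ~814 mm^2 die.
tpp = total_processing_performance(mac_tops=495, bit_length=16)  # ~15,840
density = tpp / 814                                              # ~19.5

# Reported thresholds: TPP >= 4800, or TPP >= 1600 with density >= 5.92.
controlled = tpp >= 4800 or (tpp >= 1600 and density >= 5.92)
print(controlled)  # True: performance itself, not interconnect, triggers control
```

The structural point survives any imprecision in the numbers: once the criterion keys on processing performance rather than a single interconnect parameter, there is no A800-style workaround left to engineer.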
Structural leverage operated across the commercial layer simultaneously. Cloud providers including Amazon, Microsoft, and Google found that their ability to offer competitive AI infrastructure services was directly constrained by their access to Nvidia hardware. During the peak demand period of 2023, lead times for H100 access reached eight to eleven months, according to analyst tracking at the time. The pricing power Nvidia exercised during this period was not incidental to the market position; it was the market position being expressed commercially. Nvidia’s gross margin reached 72.7% for fiscal year 2024, with the fourth quarter hitting 76% — figures that analysts consistently described as extraordinary for a hardware company and that reflected the structural rather than merely competitive nature of the pricing.
Option-space leverage was operating at the national policy level simultaneously. Governments designing national AI strategies found that the realistic path to competitive AI capability ran through Nvidia procurement. The question of whether to build sovereign AI infrastructure was, in practical terms, the question of whether to spend the money required to secure Nvidia hardware and whether to accept the geopolitical alignment that came with it. No coercion was applied. The structural position had already reshaped what felt like a realistic option.
Move 4: Obscuring the Mechanism
The temporal gap in this case is compressed relative to many dependency stories, but it was sufficient. The critical period of CUDA adoption ran roughly from 2012 to 2018. During this period, the decisions that created the dependency were being made at the level of individual researchers, university laboratories, and early AI companies. The parties making those decisions were not the parties who would face the geopolitical consequences. A doctoral student choosing a GPU cluster in 2014 was not making a foreign policy decision. They were choosing the tool that worked. The connection between that choice, aggregated across thousands of laboratories and companies over a decade, and the strategic chokepoint that the US government identified in 2022, was not a connection that any institutional architecture existed to make at the time it would have needed to be made to be actionable.
The narrative that made the dependency unaskable was the narrative of technical meritocracy. The GPU that won the benchmarks was the right GPU. The platform with the most libraries was the right platform. The company with the best developer relations was building the ecosystem everyone wanted to be inside. These claims were true, and their truth was precisely what prevented the structural question from being asked. The structural question — which was not about whether CUDA was good but about what it meant for critical research infrastructure to be consolidated under a single commercial actor operating under a single national jurisdiction — was not technically illiterate. It was simply not the question the field was organised to ask. The benchmark culture of AI research, in which progress is measured by performance on standardised tasks, structurally favoured tools that optimised for those benchmarks. Nvidia’s hardware and software were optimised for those benchmarks. The culture and the dependency reinforced each other.
Structural opacity operated through distribution. The dependency was not created by a single procurement decision or a single policy choice. It was assembled from millions of individual technical decisions made by researchers and engineers who had no visibility into the aggregate picture and no institutional mechanism for seeing it. National governments, which might in principle have monitored the consolidation of critical AI infrastructure under a single vendor, were in most cases not yet treating AI compute as an infrastructure category requiring the oversight they applied to energy or telecommunications. The information that would have made the dependency legible was distributed across procurement records, academic toolchain choices, and cloud service agreements. The architecture to interpret it collectively did not exist.
Move 5: Converting the Value
The conversions that Nvidia’s structural position enabled are operating simultaneously across financial, strategic, and normative registers.
The financial conversion is the most visible. Nvidia’s data centre revenue grew from approximately $3 billion in fiscal year 2020 to $47.5 billion in fiscal year 2024. Nvidia’s market capitalisation first crossed $3 trillion on June 5, 2024, and the company became briefly the most valuable publicly traded company in the world on June 18, 2024, at over $3.3 trillion. These numbers represent the financial expression of a structural position: the conversion of toolchain lock-in and institutional knowledge dependency into cash flows that were, for a sustained period, essentially captive. The organisations paying were not paying because Nvidia had won a competitive tender in an open market. They were paying because the alternative was not to do frontier AI work.
The strategic conversion is more consequential and less frequently named. The United States government’s use of export controls as a policy instrument presupposes that Nvidia’s position is durable enough to make the controls effective. A restriction on a product that can be substituted within twelve months is not a strategic instrument. A restriction on a product embedded in a decade of institutional knowledge, toolchain architecture, and developer culture is a different kind of tool. The conversion here is from commercial dominance to geopolitical leverage, and it was performed not by Nvidia but by the state that holds jurisdiction over it. US allies conducting AI development on Nvidia hardware have, through that technical choice, implicitly aligned themselves with US technology policy on a matter they did not vote on and were not consulted about. The dependency converts into policy compliance without a policy ever having been imposed.
The normative conversion is the deepest and the least visible. Nvidia, through CUDA and the ecosystem it anchors, exercises meaningful influence over what AI development looks like, what it is optimised for, what hardware assumptions are baked into the frameworks that researchers use, and what kinds of computation are easy versus difficult. Standards that emerge from a single dominant platform are not neutral technical standards. They are the preferences, incentives, and architectural assumptions of the platform holder, crystallised into defaults that everyone else inherits. The field does not experience this as Nvidia setting norms. It experiences it as how AI development works.
Who paid is a layered question. AI research labs and cloud providers paid directly, in procurement costs and constrained margins. National governments paid in strategic optionality, accepting geopolitical alignment as the price of AI capability. The Chinese technology sector paid in developmental setback, forced into an accelerated domestic chip program under conditions of significant disadvantage. Future developers and researchers inherit a technical landscape whose architecture was shaped by one company’s commercial decisions during a period of exceptional leverage. The costs of that inheritance are not yet fully visible.
Move 6: Eliminating Alternatives
This move applies. The resistance cases are analytically important not because they succeeded but because their failure is part of the structural story.
AMD’s ROCm platform represents the most sustained Western attempt to offer a CUDA alternative. ROCm is technically functional and supports major AI frameworks, but it has not achieved comparable developer adoption, library depth, or hardware optimisation. The reasons are partly technical and partly institutional: the CUDA ecosystem’s head start translated into a developer community that had no strong incentive to rebuild existing workflows on an alternative platform, particularly when the alternative could not guarantee equivalent performance on the benchmarks that determined funding and publication outcomes. The elimination of AMD as a viable alternative was not achieved through acquisition or legal action. It was achieved through the compounding advantage of being first in a network-effects market, where the value of the platform increases with the number of developers on it and the depth of the library ecosystem, creating conditions that challengers can approach but not easily breach.
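One detail makes the asymmetry vivid. On ROCm builds of PyTorch, AMD GPUs are addressed through the same "cuda" device string that Nvidia hardware uses; the challenger ships inside the incumbent’s namespace. A sketch (assumes a ROCm build of PyTorch and an AMD GPU):

```python
# On a ROCm build of PyTorch, AMD support hides behind the incumbent's
# name (illustrative; assumes ROCm PyTorch and an AMD GPU).
import torch

print(torch.version.hip)          # set on ROCm builds, None on CUDA builds
x = torch.ones(4, device="cuda")  # under ROCm this allocates on an AMD GPU
```

Compatibility with existing workflows required adopting the rival’s vocabulary wholesale, which is as clear a measure of lock-in as any benchmark.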
Intel’s Gaudi processors and the custom silicon programs at Google, Microsoft, Amazon, and Meta represent a second category of resistance: the attempt by major cloud providers and technology companies to reduce their own dependency by developing proprietary alternatives. Google’s TPUs are the most mature of these efforts and have demonstrated genuine performance advantages for specific workloads. Yet even Google continues to offer Nvidia GPU access as the primary AI compute product for customers who have not specifically optimised for TPU architectures. The custom silicon programs reduce the dependency at the margins for their developers but do not address it for the broader ecosystem of researchers and companies that cannot maintain custom hardware programs.
The Chinese domestic response — including Biren, Cambricon, and most significantly the Huawei Ascend line — represents the most urgent exit attempt and the most instructive failure. These programs accelerated sharply after the 2022 export controls. Huawei’s Ascend 910C has been assessed by DeepSeek researchers as performing at approximately 60% of the Nvidia H100 in real-world benchmarks. The programs face compounding disadvantages: limited access to advanced semiconductor manufacturing processes due to separate controls on chip manufacturing equipment, a software ecosystem that must be rebuilt largely from scratch, and the need to attract developer communities away from established CUDA workflows. The gap is narrowing in hardware terms in some segments, but the software and institutional knowledge gap is a separate and larger problem. The dependency was never primarily in the silicon.
Analytical Notes
Two dependency architectures are operating simultaneously in this case, and the collision between them is itself the most consequential dynamic for actors outside the US-China bilateral.
The first dependency is commercial: the global AI development community’s dependency on Nvidia’s CUDA ecosystem, created through rational adoption and normalised through institutional habituation. The second dependency is geopolitical: the dependency of any state or institution operating on Nvidia hardware on the technology policy decisions of the United States government, which holds jurisdiction over the platform and has demonstrated willingness to use that jurisdiction as a strategic instrument.
The party caught between these two architectures is not China, whose collision with the second dependency is direct and visible. It is the set of states that consider themselves neither adversaries nor fully aligned partners of the United States: the European Union, India, the Gulf states, South Korea, Japan, and others conducting national AI programs on infrastructure that operates under a foreign jurisdiction’s control. These actors adopted the CUDA ecosystem because it was the rational technical choice. They now find that their AI development capability is structurally contingent on a geopolitical relationship they do not control. They cannot easily exit the first dependency, because the switching costs are prohibitive. They cannot fully insulate themselves from the second dependency, because it is a consequence of the first. Their option space is shaped by the collision: they must either accept the geopolitical alignment that comes with Nvidia dependency, invest the very substantial resources required to build alternative ecosystems, or find a middle path of partial diversification that reduces but does not eliminate exposure. None of these options was visible as an option at the moment the dependency was being created.
Closing
What this case reveals about the grammar is the relationship between speed and legibility. The Nvidia dependency was assembled in the period between 2012 and approximately 2020, during which it was both too new to appear as an infrastructure problem and too technically specialised to attract the institutional oversight that is typically applied to energy, water, or telecommunications infrastructure. By the time the dependency was legible at the scale that would have justified a policy response, it had already been converted into a geopolitical instrument. The temporal gap between creation and consequence, compressed here relative to agricultural or energy dependencies measured in generations, was nonetheless sufficient. Eight years is enough time for a software ecosystem to become load-bearing.
The founding article argues that the most consequential things happening are rarely the things that take centre stage. In this case, the thing taking centre stage was the AI revolution itself: the models, the benchmarks, the capabilities, the societal implications. The structural layer being assembled beneath it — the quiet consolidation of the compute substrate into a single company operating under a single jurisdiction — was visible to those who looked but not to the institutional architectures that would have needed to see it to act. The magician’s hands were the chips. The thing being built while we watched them was the chokepoint.
The question the reader should carry forward is this: if a dependency of this depth and geopolitical consequence was assembled during a period when the technology receiving global attention was the application layer above it, what is being assembled now in the infrastructure layers beneath the next thing we are all watching? The hands never stop moving. The question, as ever, is whether we are watching the right ones.
