On the south side of Austin, Texas, engineers at semiconductor maker Advanced Micro Devices designed an artificial intelligence chip called MI300 that was released a year ago and is expected to generate more than US$5bil in sales in its first year.
Not far away in a north Austin high-rise, designers at Amazon developed a new and faster version of an AI chip called Trainium. They then tested the chip in creations including palm-size circuit boards and complex computers the size of two refrigerators.
Those two efforts in the capital of Texas reflect a shift in the rapidly evolving market of AI chips, which are perhaps the hottest and most coveted technology of the moment.
The industry has long been dominated by Nvidia, which has leveraged its AI chips to become a US$3 trillion behemoth.
For years, others tried to match the company’s chips, which provide enormous computing power for AI tasks, but made little progress. Now the chips that Advanced Micro Devices, known as AMD, and Amazon have created – as well as customer reactions to their technology – are adding to signs that credible alternatives to Nvidia are finally emerging.
For some crucial AI tasks, Nvidia’s rivals are proving they can deliver much faster speed, and at prices that are much lower, said Daniel Newman, an analyst at Futurum Group.
“That’s what everybody has known is possible, and now we’re starting to see it materialise,” he said.
The shift is being driven by an array of tech companies – from large competitors such as Amazon and AMD to smaller startups – that have started tailoring their chips for a particular phase of AI development that is becoming increasingly important.
That process, called “inferencing”, happens after companies use chips to train AI models. It allows them to carry out tasks such as serving up answers with AI chatbots.
“The real commercial value comes with inference, and inference is starting to gain scale,” said Cristiano Amon, chief executive of Qualcomm, a mobile chipmaker that plans to use Amazon’s new chips for AI tasks.
“We’re starting to see the beginning of the change.”
Nvidia’s rivals have also started taking a leaf out of the company’s playbook in another way.
They have begun emulating Nvidia’s tactic of building complete computers – and not just the chips – so that customers can wring the maximum power and performance out of the chips for AI purposes.
The increased competition was evident recently, when Amazon announced the availability of computing services based on its new Trainium 2 AI chips, along with testimonials from potential users including Apple.
The company also unveiled computers containing either 16 or 64 of the chips, with ultrafast networking connections that particularly accelerate inferencing performance. Amazon is even building a kind of giant AI factory for the startup Anthropic, which it has invested in, said Matt Garman, chief executive of Amazon Web Services.
That computing “cluster” will have hundreds of thousands of the new Trainium chips and will be five times as powerful as any that Anthropic has ever used, said Tom Brown, a founder and the chief compute officer of the startup, which operates the Claude chatbot and is based in San Francisco.
“This means customers will get more intelligence at a lower price and at faster speeds,” Brown said.
In total, data centre operators, which provide the computing power needed for AI tasks, are expected to increase their spending on computers without Nvidia chips by 49% in 2024, to US$126bil, according to Omdia, a market research firm.
Even so, the increased competition does not mean Nvidia is in danger of losing its lead.
A spokesman for the company pointed to comments made by Jensen Huang, Nvidia’s chief executive, who has said his company has major advantages in AI software and inferencing capability.
Huang has added that demand is torrid for the company’s new Blackwell AI chips, which he says perform many more calculations per watt of energy used, despite an increase in the power they need to operate.
“Our total cost of ownership is so good that even when the competitor’s chips are free, it’s not cheap enough,” Huang said in a speech at Stanford University.
The changing AI chip market has partly been propelled by well-funded startups such as SambaNova Systems, Groq and Cerebras Systems, which have lately claimed big speed advantages in inferencing, with lower prices and power consumption.
Nvidia’s current chips can cost as much as US$15,000 each, and its Blackwell chips are expected to cost tens of thousands of dollars each. That has pushed some customers toward alternatives.
Dan Stanzione, executive director of the Texas Advanced Computing Center, a research centre, said the organisation planned to buy a Blackwell-based supercomputer next year but would most likely also use chips from SambaNova for inferencing tasks because of their lower power consumption and pricing.
“That stuff is just too expensive,” he said of Nvidia’s chips.
AMD said it expected to target Nvidia’s Blackwell chips with its own new AI chips arriving next year.
In the company’s Austin labs, where it exhaustively tests AI chips, executives said inferencing performance was a major selling point.
One customer is Meta, the owner of Facebook and Instagram, which says that it has trained a new AI model, called Llama 3.1 405B, using Nvidia chips but that it uses AMD’s MI300 chips to provide answers to users.
Amazon, Google, Microsoft and Meta are also designing their own AI chips to speed up specific computing chores and achieve lower costs, while still building big clusters of machines powered by Nvidia’s chips.
Google also plans to begin selling services based on a sixth generation of internally developed chips, called Trillium, which is nearly five times as fast as its predecessor.
Amazon, sometimes seen as a laggard in AI, seems particularly determined to catch up. The company allocated US$75bil this year for AI chips and other computing hardware, among other capital spending.
It is optimistic about the new Trainium 2 chips, which are four times as fast as its previous chips. Amazon also plans another chip, Trainium 3, which is set to be even more powerful.
Eiso Kant, a founder and the chief technology officer of Poolside, an AI startup in San Francisco, estimated that Trainium 2 would provide a 40% improvement in computing performance per dollar compared with Nvidia-based hardware.
Amazon also plans to offer Trainium-based services in data centres across the world, which helps with inferencing tasks, Kant added.
“The reality is, in my business, I don’t care what silicon is underneath,” he said. “What I care about is that I get the best price performance and that I can get it to the end user.” — ©2024 The New York Times Company