The AMD MI300X is a Power Hog, Consuming Up to 750 Watts
We are still hot on the heels of AMD’s formal introduction of its artificial intelligence data center accelerator, the MI300X. AMD intends to use it as a club to knock Nvidia from its perch as the dominant player in AI acceleration, and it is, without a doubt, a formidable processing force that should not be underestimated. Each new architecture typically improves power efficiency (using less energy for the same unit of work), but increasing performance can still result in a higher total power draw. And AMD’s OAM-based (OCP Accelerator Module) MI300X is a power hog. In fact, at 750 W, it carries the highest rated thermal design power (TDP) of any product ever produced in its form factor.
However, there is no need to be concerned: OAM systems are specified for up to 1000 W of deliverable power, which means there is still room to scale performance even further. Even though 750 W may seem like an excessive amount of power for a single piece of computer hardware (at least from an individual user’s point of view), keep in mind that those watts drive technology that is significantly faster and more specialized than even AMD’s most powerful graphics cards. AMD claims that, for the wattage on offer, the MI300X is the most performant accelerator for AI workloads, both in generative AI and in Large Language Model (LLM) processing.
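To put that headroom in numbers, here is a minimal sketch that uses only the two figures quoted above (a 750 W TDP and a 1000 W OAM ceiling). The function name `oam_headroom` is purely illustrative and does not come from AMD or the OCP.

```python
def oam_headroom(tdp_w: float, oam_limit_w: float = 1000.0) -> tuple[float, float]:
    """Return (spare watts, fraction of the OAM ceiling already in use)."""
    if tdp_w > oam_limit_w:
        raise ValueError("TDP exceeds the OAM power ceiling")
    return oam_limit_w - tdp_w, tdp_w / oam_limit_w


# Figures taken from the article: 750 W TDP against a 1000 W ceiling.
spare_w, used = oam_headroom(750.0)
print(f"Spare power: {spare_w:.0f} W, ceiling used: {used:.0%}")
# prints "Spare power: 250 W, ceiling used: 75%"
```

In other words, the MI300X already uses three quarters of what the form factor can deliver, leaving 250 W for future parts.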
That claim gains some credence when we consider that AMD packed 153 billion transistors onto 12 chiplets built on two different fabrication nodes (eight 5 nm GPU dies and four 6 nm I/O dies). There is also the fact that AMD was able to run a 40-billion-parameter LLM (Falcon-40B) on a single MI300X. That is quite remarkable, especially considering that AMD plans to scale the MI300X to configurations of up to eight accelerators in a single box.
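A rough back-of-the-envelope sketch shows why fitting a 40-billion-parameter model on one accelerator is plausible. It counts the memory for the weights alone (no activations or cache), and it assumes a 192 GB HBM3 capacity for the MI300X, a figure that is not stated in this article. The helper name and the precision table are illustrative.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

# Assumption: MI300X on-package HBM3 capacity in GB (not given in the article).
HBM_GB = 192


def weight_footprint_gb(params: float, precision: str) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    gb = weight_footprint_gb(40e9, precision)  # 40 billion parameters
    print(f"{precision}: {gb:.0f} GB of weights, fits: {gb <= HBM_GB}")
```

At 16-bit precision the weights come to about 80 GB, which is why a single large-memory accelerator can hold the model without splitting it across devices. Real deployments need additional memory beyond the weights, so treat this as a lower bound.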
As we can see from the table above, AMD’s emphasis on boosting power efficiency has not been enough to offset the rising demand for compute in High Performance Computing (HPC) scenarios. These scenarios increasingly involve processing LLMs, which appear to be cropping up left and right. Because of those performance requirements, even with AMD’s most recent power-saving technologies and TSMC’s most recent fabrication processes, the power envelope still had to grow by 190 W.
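The size of that jump can be worked out from the two figures in the article, the 750 W TDP and the 190 W increase. The sketch below assumes the increase is measured against the previous-generation part listed in the table; the function name is illustrative.

```python
def implied_baseline(current_w: float, increase_w: float) -> tuple[float, float]:
    """Return (implied earlier power envelope, relative increase over it)."""
    baseline_w = current_w - increase_w
    return baseline_w, increase_w / baseline_w


# Figures from the article: 750 W today, 190 W more than before.
baseline_w, relative = implied_baseline(750.0, 190.0)
print(f"Implied earlier envelope: {baseline_w:.0f} W, increase: {relative:.1%}")
# prints "Implied earlier envelope: 560 W, increase: 33.9%"
```

Under that assumption, the power envelope grew by roughly a third in a single generation.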