It looks like AI has followed Crypto chip wise in going CPU > GPU > ASIC
GPUs, while dominant in training large models, are often too power-hungry and costly for efficient inference at scale. This is opening new opportunities for specialized inference hardware, a market where startups like Untether AI were early pioneers.
In April, then-CEO Chris Walker had highlighted rising demand for Untether’s chips as enterprises sought alternatives to high-power GPUs. “There’s a strong appetite for processors that don’t consume as much energy as Nvidia’s energy-hungry GPUs that are pushing racks to 120 kilowatts,” Walker told CRN. Walker left Untether AI in May.
Hopefully the training part of AI goes to ASIC’s to reduce costs and energy use but GPU’s continue to improve inference and increase VRAM sizes to the point that AI requires nothing special to run it locally
The ASICs will come as soon as the architecture stabilizes. The problem with building specialized hardware today is it won’t necessarily be capable of running models that come out tomorrow.
With AMD’s IP, they could make a hybrid chip, eg a (for example) bitnet ASIC hanging off a GPU for flexible, cuda-compatible compute where needed.
Nvidia sorta does this now (with tensor cores being a separate part of the die), but with their history of MCM designs, AMD could take it to an extreme.