For some weird reason, in my country it’s easier to order a Beelink or a Framework than an HP. They will sell everything else, except what you want to buy.
I ordered a Beelink GTR9 Pro which should hopefully arrive next month.
Really excited to play around with it; the 24 GB of VRAM in my 7900 XTX just don’t cut it for local LLMs.
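To see why 24 GB runs out fast, here's a rough back-of-envelope estimate of weight memory for quantized models. The bits-per-weight figures are illustrative assumptions, not exact GGUF file sizes (real files carry extra overhead, and the KV cache grows with context length):

```python
# Rough VRAM estimate for quantized LLM weights (sketch, not exact).
# bits_per_weight is an assumed average; real quant formats vary.

def est_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB for a quantized model."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# A ~70B model at ~4.5 bits/weight needs roughly 39 GB of weights alone:
# too big for a 24 GB card, but comfortable in 96+ GB of unified memory.
print(round(est_weights_gb(70, 4.5), 1))  # ~39.4
print(round(est_weights_gb(8, 4.5), 1))   # ~4.5
```

That's the appeal of the 395's large unified memory: models that simply don't fit on a single consumer GPU.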
There are a lot of benchmarks for the 395 processor here: https://kyuz0.github.io/amd-strix-halo-toolboxes/
They are leaving a lot of performance (and VRAM) on the table by doing this on Windows.
Seems pretty decent, but I wonder how it compares to an AI-optimized desktop build with the same $2000 budget.
It will probably kick the ass of that desktop. $2000 won’t get you far with a conventional build.
Well, that’s why I said “AI optimized”.
Even my 5-year-old $900 rig can output around 4 tps.
There is nothing “optimized” at $2000 that will get you better inference performance on medium/large models.
With what model? GPT oss or something else?
Llama 3 8B Instruct: 25 tps
DeepSeek R1 Distill Qwen 14B: 3.2 tps
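For context, those tps figures are typically just generated tokens divided by wall-clock time. A minimal sketch of that measurement, with `generate` as a hypothetical stand-in for whatever inference call you're timing:

```python
import time

def tokens_per_second(generate, n_tokens: int) -> float:
    """Time a generation callable and return tokens/sec.
    `generate` is a placeholder for your actual inference call."""
    start = time.perf_counter()
    generate(n_tokens)
    elapsed = time.perf_counter() - start
    return n_tokens / elapsed

# At 3.2 tps, a 500-token answer takes about 156 seconds,
# which is why low-single-digit tps feels unusable interactively.
print(round(500 / 3.2))  # ~156
```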
To be fair: I bought the motherboard, CPU, and RAM 6 years ago along with an Nvidia 1660, then bought the Radeon RX 6600 XT on release in 2021, so 4 years ago. But it’s a generic gaming rig.
I would be surprised if $2000 worth of modern hardware, picked for this specific task, would be worse than that mini PC.
I promise. It’s not possible. But things change quickly of course.
(Unless you’re lucky/pro and get your hands on some super cheap used high end hardware…)
To be honest that is pretty good. Thanks!