It depends on how powerful and fast you want your model. Yeah, a 500b parameter model running at 20 tokens per second is gonna require an expensive GPU cluster server.
If you happen to not have PewDiePie levels of cash lying around but still want to get in on local AI, you need one powerful GPU inside any desktop with a reasonably fast CPU. A used 24GB 3090 was about $700 USD last I checked on eBay, and we'll say another $100 for an upgraded power supply to run it. A lot of people have an old desktop just lying around in the basement, but an entry-level iBUYPOWER should be no more than $500. So realistically it's more like $1500-2000 USD to get you into comfy hobbyist status. I make my piece of shit 10-year-old 1070 Ti 8GB work running 8-32B quant models. I've heard people say 70B is a really good sweet spot, and that's totally attainable without a $15k investment.
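Rough math on why that works, if anyone's curious. Just a back-of-the-envelope sketch: it assumes 4-bit quantization, that weights dominate VRAM, and a guessed ~1.5 GB for KV cache/runtime overhead (ballpark figures, not measurements):

```python
# Rough VRAM estimate for running a quantized model locally.
# Assumption: weights dominate; ~1.5 GB extra guessed for KV cache and overhead.

def vram_gb(params_billion: float, bits_per_weight: float, overhead_gb: float = 1.5) -> float:
    """Approximate VRAM needed: weight bytes plus a fixed overhead."""
    weight_gb = params_billion * bits_per_weight / 8  # 1B params at 8 bits ~= 1 GB
    return weight_gb + overhead_gb

for size in (8, 32, 70):
    print(f"{size}B @ 4-bit: ~{vram_gb(size, 4):.1f} GB VRAM")

# 8B  @ 4-bit: ~5.5 GB  -> fits an 8 GB 1070 Ti
# 32B @ 4-bit: ~17.5 GB -> fits a 24 GB 3090, or partial CPU offload on smaller cards
# 70B @ 4-bit: ~36.5 GB -> two 24 GB cards, or heavy offload to system RAM
```

Offloading layers to system RAM works but tanks your tokens per second, which is why people chasing 70B usually end up with a second card.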
Only 20k nbd
It’s a shame Moore’s law doesn’t seem to hold anymore.