Qwen3-32b: Windows95 starfield screensaver web app with warp drive on click

xodoh74984@lemmy.world · edit-2 2 months ago

Qwen3-32b: Windows95 starfield screensaver web app with warp drive on click

SmokeyDope@lemmy.world · 2 months ago

You’re welcome. Also, whats your gpu and are you using cublas (nvidia) or vulcan(universal amd+nvidia) or something else for gpu postprocessing?

xodoh74984@lemmy.world · 2 months ago

It’s a 4090 using cublas. I just run the stock llama.cpp server with CUDA support. Do you know if there’d be any advantage to building it from source or using something else?

SmokeyDope@lemmy.world · 2 months ago

If you were running amd GPU theres some versions of llama.cpp engine you can compile with rocm compat. If your ever tempted to run a huge model with partial offloaded CPU/ram inferencing you can set the program to run with highest program niceness priority which believe it or not pushes up the token speed slightly