

- Postgres
- Azure
I haven’t seen this mentioned, but apart from 8K being expensive, requiring new production pipelines, being unwieldy for storage and bandwidth, unneeded, and not fixing existing problems with 4K, it requires MASSIVE screens to reap the benefits.
There are several similar posts, but suffice it to say, 8K content is only perceptible to average eyesight at living-room distances when screens are OVER 100 inches diagonal at the bare minimum. That’s about 7 feet wide.
Source: https://www.rtings.com/tv/reviews/by-size/size-to-distance-relationship
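You can sanity-check that number yourself. Here’s a back-of-envelope sketch, assuming 20/20 acuity (about 1 arcminute per resolvable pixel) and a 16:9 panel: 8K only adds visible detail once the 4K pixel grid is itself resolvable at your seating distance.

```python
import math

ARCMIN = math.radians(1 / 60)  # ~1 arcminute: the limit of 20/20 acuity

def min_diagonal_for_8k_benefit(distance_in: float) -> float:
    """Smallest 16:9 diagonal (inches) at which 8K shows detail 4K can't."""
    # 8K only helps once a single 4K pixel subtends more than ~1 arcminute,
    # i.e. width / 3840 >= distance * tan(1 arcminute).
    min_width = 3840 * distance_in * math.tan(ARCMIN)
    # 16:9 width -> diagonal: diag = width * sqrt(16^2 + 9^2) / 16
    return min_width * math.hypot(16, 9) / 16

for feet in (6, 8, 10):
    diag = min_diagonal_for_8k_benefit(feet * 12)
    print(f'{feet} ft away -> need ~{diag:.0f}" diagonal before 8K helps')
```

At a typical 8–10 ft couch distance that works out to roughly 120–150 inches, which lines up with the rtings chart.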
It depends on what you mean.
To me, Ollama feels like it’s designed to be a developer-first, local LLM server with just enough functionality to get you to a POC, from where you’re intended to use someone else’s compute resources.
llama.cpp actually supports more backends, with continuous performance improvements and support for more models.
ROCm is a software stack which includes a bunch of SDKs and APIs.
HIP is a subset of ROCm which lets you program AMD GPUs, with a focus on portability from Nvidia’s CUDA.
Ollama does use ROCm; however, so does llama.cpp. Vulkan happens to be another available backend supported by llama.cpp.
GitHub: llama.cpp Supported Backends
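As a concrete example, here’s a minimal sketch using the llama-cpp-python bindings; the model path is a placeholder, and which backend actually does the work (Vulkan, ROCm/HIP, CUDA, Metal, CPU) is decided when the library is compiled, not in this code.

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# The backend (Vulkan, ROCm, CUDA, ...) is baked in at build time;
# n_gpu_layers only controls how much of the model gets offloaded to it.
llm = Llama(
    model_path="./models/some-model.gguf",  # placeholder path
    n_gpu_layers=-1,  # -1 = offload every layer the backend can handle
)

out = llm("Q: Name two llama.cpp backends. A:", max_tokens=32)
print(out["choices"][0]["text"])
```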
There are old PRs which attempted to bring Vulkan support to Ollama - a logical and helpful move, given that the Ollama engine is based on llama.cpp - but the Ollama maintainers weren’t interested.
As for performance vs ROCm, it does fine. Against CUDA, it also does well unless you’re in a multi-GPU setup. Its magic trick is compatibility. Pretty much everything runs Vulkan, and Vulkan is compatible across generations of cards, architectures, AND vendors. That’s how I’m running a single PC with Nvidia and AMD cards together.
My Eee PC is also 32-bit with 2 GB of RAM. I did Debian 12 with LXDE from the net installer and it works really well.
Talent, passion, skill, and a worthwhile cause, all coming together on display, beautifully. It’s so impressive what people can make.
He’s very good!
Just to be clear, I think T-SQL is fine, and apparently they added a string-agg function (STRING_AGG) so you don’t have to hack it with FOR XML, so… something improved. But stinky spaghetti SQL is unfixable.
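For anyone who hasn’t suffered the old way, here’s the difference side by side, using a hypothetical Orders table; STRING_AGG shipped in SQL Server 2017, and before that the usual workaround was abusing FOR XML PATH.

```sql
-- Hypothetical table: Orders(CustomerId, ProductName)

-- The old hack: abuse FOR XML PATH('') to concatenate rows,
-- then STUFF() away the leading comma
SELECT o.CustomerId,
       STUFF((SELECT ',' + p.ProductName
              FROM Orders AS p
              WHERE p.CustomerId = o.CustomerId
              FOR XML PATH('')), 1, 1, '') AS Products
FROM Orders AS o
GROUP BY o.CustomerId;

-- SQL Server 2017+: just say what you mean
SELECT CustomerId,
       STRING_AGG(ProductName, ',') AS Products
FROM Orders
GROUP BY CustomerId;
```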
Oh god this turned into a vent session
I think back on what I left behind. And I feel bad.
But then I feel better, because I remember the reason I left was that we outgrew our processes and codebase and we desperately needed a restructure, but I got no support in doing so.
I bitched for years that it was a continuity risk and a performance nightmare. But no. “Deliver more features. Add more junk for use cases that brought us no business value.” Never consider governance or security. Never consider best practices. Just more.
I knew eventually something bad would happen and I would be thrown under the bus. So I split. It was a good decision.
But yeah. Someone inherited a lot of turd code.
I wish I’d been working on your stuff back when I supported stinky 2008 T-SQL where everything was dynamic and sequential. I would have called you just for moral support.
It picked up a lot of use by Gen Z+, especially through the phrase “New {something} just dropped”.
This is so on point.
I get not wanting to compile your code. It’s extra work and, if you’re already catering to a very tech-savvy crowd, you can let them deal with the variance and extra compile time.
BUT if you’re releasing your code for others TO USE and you don’t provide reproducible instructions, what’s the point?!?
It depends on your goals and your use case.
Do you want the most performance per dollar? You will never touch what the big datacenters can achieve.
Do you want privacy? Buy it yourself.
Do you want quality output? Go to the online providers or expect to pay more to build it yourself.
I am actively trying to work on non-Nvidia hardware because I’m a techno-masochist. It’s very uphill especially at the cutting edge. People are building for CUDA.
I can do amazing image generation on a 7900 XTX with 24 GB of VRAM. One of those is under $900 in the US, which is great. A 3090 would probably be easier and is more expensive, although it’s less performant hardware.
If your video card has 16+ GB of memory, you will be able to run it on Ollama with NVIDIA GTX 10-series cards or later.
Ollama is easy, but it leaves a lot of performance on the table.
If you have less than 16 GB, you may be able to get good performance using llama.cpp or especially ik_llama.cpp.
Possibly. Vulkan would be compatible with the system and would be able to take advantage of iGPUs. You’d definitely want to look into whether or not you have any dedicated VRAM that’s DDR5, and just use that if possible.
Explanation: LLMs are extremely bound by memory bandwidth. They are essentially giant gigabyte-sized stores of numbers which have to be read from memory and multiplied by a numeric representation of your prompt…for every new word you type in and every word you generate. To do this, these models constantly pull data in and out of [v]RAM. So, while you may have plenty of RAM, and decent amounts of computing power, your 780M probably won’t ever be great for LLMs, even with Vulkan, because you don’t have the memory bandwidth to keep it busy.
Roughly, for a small model, you can put numbers on it:
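A quick sketch, with assumed (not measured) figures: a ~7B model quantized to ~4 bits is roughly 4 GB of weights, and dual-channel DDR5 shared with the CPU gives on the order of 80 GB/s. Since every generated token has to stream all the active weights through memory, bandwidth divided by model size is a hard ceiling on decode speed.

```python
def max_tokens_per_sec(model_gb: float, bandwidth_gbs: float) -> float:
    """Upper bound on decode speed: each token streams all weights once."""
    return bandwidth_gbs / model_gb

# Assumed numbers, not measurements:
print(max_tokens_per_sec(model_gb=4.0, bandwidth_gbs=80.0))   # iGPU on DDR5: ~20 tok/s ceiling
print(max_tokens_per_sec(model_gb=4.0, bandwidth_gbs=960.0))  # 7900 XTX-class card: ~240 tok/s ceiling
```

Real throughput lands well under both ceilings, but the ratio is the point: the iGPU runs out of bandwidth long before it runs out of compute.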
Ollama doesn’t depend on Vulkan for its backend. The Vulkan backend is the most widely compatible GPU-accelerated backend.
This is to the detriment of its users. Ollama works on top of llama.cpp, which can run on Vulkan. People have created merge requests for including Vulkan in Ollama, but the maintainers wouldn’t accept them.
The OP’s site is hosted on Neocities. They aim to foster that 2000s vibe. Check them out here.
The OSX idea is very much an edge case for me. I’ve heard of it but not something I know much about.