

Oh, good to know. Last time I checked around WASM this wasn’t really an option.


AI workflows aren’t limited to LLMs, you know.
For example, TTS and STT models are usually small enough (15-30MB) to be loaded directly into V-Cache. I was thinking of such small-scale local models, especially when you consider AMD’s recent forays into providing a mixed-environment runtime for their hardware (the GAIA framework, which can dynamically run your ML models on CPU, NPU and GPU, all automagically).
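To give a sense of scale, running a small local STT model really is just a few lines these days (a minimal sketch assuming faster-whisper is installed; the “tiny” model is just one example of a small model and the audio filename is a placeholder, nothing here is tied to V-Cache or GAIA specifically):

```python
# Minimal local speech-to-text sketch using faster-whisper (pip install faster-whisper).
# "tiny" is an example of a small STT model; "meeting.wav" is a placeholder input file.
from faster_whisper import WhisperModel

model = WhisperModel("tiny", device="cpu", compute_type="int8")  # small, CPU-only, int8 weights
segments, info = model.transcribe("meeting.wav")

for seg in segments:
    print(f"[{seg.start:.1f}s -> {seg.end:.1f}s] {seg.text}")
```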


See, the main issue with that is that you need to bundle everything into the app.
Modern computing is inherently cross-dependent on runtimes and shared libraries and whatnot, to save space. Why bundle the same 300MB runtime into five different apps when you can download it once and share it between them? Or even better, have a newer, backwards-compatible version of the runtime installed and still be able to share it between apps.
With WASM you’re looking at bundling every single dependency, runtime and framework into the final binary. Which is fine for one-off small things, but when everything is built that way, you’re sacrificing tons of storage and bandwidth unnecessarily.


Disappointing but not unexpected. Most Chinese companies still operate on the “absolute secrecy because competitors might steal our tech” ideology. Which hinders a lot of things…


What, you don’t have a few spare photonic vacuums in your parts drawer?


Well, yeah, when management is made up of dumbasses, you get this. And I’d argue some 90% of all management is absolute waffles when it comes to making good decisions.
AI can and does accelerate workloads if used right. It’s a tool, not a person replacement. You still need someone who can utilise the right models, research the right approaches and so on.
What companies need to realise is that AI accelerating things doesn’t mean you can cut your workforce by 70-90% and still keep the same deadlines, but that with the same workforce you can deliver things 3-4 times faster. And faster delivery means new products (be it a new feature or a truly brand-new standalone product) have a lower cost basis even though the same number of people worked on them: if the same team ships in three months instead of twelve, the labour cost baked into that product is roughly a quarter of what it would have been. And the quicker cadence means a quicker idea-to-profit timeline.


It actually makes some sense.
On my 7950X3D setup the main issue was always making sure to pin games to a specific CCD, and AMD’s tooling is… quite crap at that. Identifying the right CCD was always problematic for me.
Eliminating this by adding V-Cache to both CCDs, so it doesn’t matter which one you pin to, is a good workaround. And IIRC V-Cache also helps certain (local) AI workflows, meaning running a game next to such a model won’t cause issues, as each gets its own CCD to run on.
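For reference, manual pinning ends up looking something like this (a rough sketch using psutil; the CPU numbers assume CCD0 shows up as logical CPUs 0-7 plus SMT siblings 16-23, which is not universal - check lscpu or your OS’s topology info - and the game path is obviously a placeholder):

```python
# Rough sketch of manually pinning a game to one CCD with psutil (pip install psutil).
# The CPU list below is an assumption about how CCD0 is numbered on this machine;
# verify against your own topology before copying it.
import subprocess
import psutil

CCD0_CPUS = list(range(0, 8)) + list(range(16, 24))  # assumed CCD0 cores + SMT siblings

proc = subprocess.Popen(["/path/to/game_binary"])    # launch the game (placeholder path)
psutil.Process(proc.pid).cpu_affinity(CCD0_CPUS)     # then restrict it to CCD0
```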


Doing a level of local computing on certain devices (especially ones you interact with directly and where voice interfacing matters, say, like a TV) is useful.
I think the best approach is connected edge computing - combining some on-device computing with an edge hub, and changing which side takes care of business depending on the needs of the task.
Say, having the ability to turn off the oven when you can smell smoke (or remembering you haven’t set a timer and the food is ready), simply by talking to your washing machine while you’re loading it, is a useful perk. Sure, an edge case, but the moment it becomes needed, even just once, you’ll appreciate it.
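In practice the local-vs-hub split can be as dumb as a routing table (a toy sketch only; the task names, hub URL and local stub are all made up for illustration, not a real protocol):

```python
# Toy "connected edge" dispatcher: handle a task on-device if it's cheap enough,
# otherwise forward it to the edge hub. All names/URLs here are illustrative.
import requests  # pip install requests

LOCAL_TASKS = {"wake_word", "set_timer", "appliance_control"}
EDGE_HUB_URL = "http://edge-hub.local:8080/run"  # hypothetical hub endpoint

def run_locally(task: str, payload: dict) -> dict:
    # stand-in for whatever tiny on-device model or rule engine handles it
    return {"task": task, "handled_on": "device", "payload": payload}

def handle(task: str, payload: dict) -> dict:
    if task in LOCAL_TASKS:
        return run_locally(task, payload)
    resp = requests.post(EDGE_HUB_URL, json={"task": task, "payload": payload}, timeout=5)
    resp.raise_for_status()
    return resp.json()

# e.g. handle("appliance_control", {"device": "oven", "action": "off"})
```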


But human existence is suffering. Buddhism teaches that if you do incredibly well, you’ll be reborn as a being with a worry-free life. Being a plankton sounds exactly like that.
IMO human existence, with all its benefits, is waaaaay below plankton.


No worries mate, we can’t all be experts in every field and every topic!
Besides, there are other AI models that are relatively small and depend on processing power more than RAM. For example there’s a bunch of audio-analysis tools that don’t just transcribe speech but also diarise it (split it up by speaker), extract emotional metadata (e.g. certain models can detect sarcasm quite well, others spot general emotions like happiness, sadness or anger), and so on. Image categorisation models are also super tiny, though usually you’d want to load them into the DSP-connected NPU of appropriate hardware (e.g. a newer-model “smart” CCTV camera would use an SoC with an NPU to load detection models into, and do the processing for detecting people, cars, animals, etc. onboard instead of on your NVR).
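Just to show how little code one of those tiny categorisation models needs (a sketch with torchvision’s MobileNetV3-Small standing in for the kind of small model a camera SoC might run; the frame path is a placeholder and none of this is tied to any particular camera or NPU):

```python
# Tiny image-classification sketch: MobileNetV3-Small from torchvision (a few MB
# of weights) classifying a single frame on CPU. "frame.jpg" is a placeholder.
import torch
from torchvision.models import mobilenet_v3_small, MobileNet_V3_Small_Weights
from PIL import Image

weights = MobileNet_V3_Small_Weights.DEFAULT
model = mobilenet_v3_small(weights=weights).eval()
preprocess = weights.transforms()

img = Image.open("frame.jpg").convert("RGB")   # e.g. a frame grabbed from a camera
with torch.no_grad():
    logits = model(preprocess(img).unsqueeze(0))
print(weights.meta["categories"][logits.argmax(dim=1).item()])
```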
Also, by my count, even somewhat larger training workloads, such as micro wake-word training, would fit into the 192MB of V-Cache.