
Oh, is it open-source on Windows?


This is the real future of neural networks. Trained on supercomputers - runs on a Game Boy. Even in comically large models, the majority of weights are negligible, and local video generation will eventually be taken for granted.
Probably after the crash. Let’s not pretend that’s far off. The big players in this industry have frankly silly expectations. Ballooning these projects to the largest sizes money can buy has been illustrative, but DeepSeek already proved LLMs can be dirt cheap. Video’s more demanding… but what you get out of ten billion weights nowadays is drastically different from six months ago. A year ago, video models barely existed. A year from now, the push toward training on less and running on less will presumably be a lot more pressing.
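To put a rough number on “negligible,” here’s a toy magnitude-pruning sketch in Python. The matrix is random noise standing in for one layer’s weights, and the 70% threshold is an arbitrary pick for illustration; trained networks aren’t Gaussian and need more careful pruning, but the same idea holds.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(2048, 2048))   # toy stand-in for one layer's weight matrix

# Zero out the 70% of entries with the smallest magnitudes.
cutoff = np.quantile(np.abs(W), 0.70)
W_pruned = np.where(np.abs(W) > cutoff, W, 0.0)

# Compare the layer's output before and after pruning on a random input.
x = rng.normal(size=2048)
full, pruned = W @ x, W_pruned @ x
print(np.corrcoef(full, pruned)[0, 1])  # still ~0.88 with 70% of the weights gone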


“Just” read documentation, says someone assuming past documentation is accurate, comprehensible, and relevant.
I taught myself QBASIC from the help files. I still found Open Watcom’s documentation frankly terrible, bordering on useless. There are comments in the original Doom source code lamenting how shite the dead-tree books were.


Due to some disagreements—some recent; some tolerated for close to 2 decades—with how collaboration should work, we’ve decided that the best course of action was to fork the project
Okay, that was always allowed!
Programming is the weirdest place for kneejerk opposition to anything labeled AI, because we’ve been trying to automate our jobs for most of a century. Artists will juke from ‘the quality is bad!’ to ‘the quality doesn’t matter!’ the moment their field becomes legitimately vulnerable. Most programmers would love it if the robot did the thing we wanted. That’s like 90% of what we’re looking for in the first place. If writing ‘is Linux in dark mode?’ counted as code, we’d gladly use that, instead of doing some arcane low-level bullshit. I say this as someone who has recently read through IBM’s CGA documentation to puzzle out low-level bullshit.
You have to check if it works. But if it works… what is anyone bitching about?


Excellent news. It’s ridiculous that matrix algebra was turned into proprietary software.


The flipside of putting an actor’s face on smut is that you could just as easily put any face on an actor. Cate Blanchett’s characters won’t have to look like one another any more than Grey DeLisle’s characters look like one another. Cast your leads from This Person Does Not Exist. Like if Iron Man had Robert Downey Jr’s acting, but looked like the Avengers video game version.
Stubborn prick misidentified parts of a computer forty years ago, and clung to relevance by never admitting programs don’t need the processor to understand a damn thing. He invented bigotry against robots. His one hit is a thought experiment on par with Descartes saying ‘the dog only acts like it feels pain,’ as he slides the dagger deeper.
A computer could be indistinguishable from a human - it could simulate a real person’s brain scan at a subatomic level - and John Searle would insist it’s no different from Eliza. May we never hear his bullshit touted again.


… you should probably check, before you go selling the what-ifs.
Diffusion is a denoising algorithm. It’s just powerful enough that “noise” can mean all the parts that don’t look like Shrek eating ramen. Show it a blank page and it’ll squint until it sees that. It’s pretty good at finding Shrek. It’s so-so at finding “eating.” You’re better off starting from a rough approximation, like video of a guy eating ramen. And it probably doesn’t hurt if he’s painted green.
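A schematic sketch of that difference, in Python. The predict_noise function is a placeholder for a trained model, and the single strength knob is a simplification of how img2img-style pipelines actually schedule noise; the point is only that starting from an existing frame leaves the model much less to invent.

```python
import numpy as np

def predict_noise(x, t):
    # Placeholder for a trained model that guesses "everything that
    # doesn't look like the prompt." Returns zeros so the sketch runs.
    return np.zeros_like(x)

def denoise(frame, steps=50, strength=1.0):
    # strength=1.0: bury the input under pure noise and invent everything.
    # strength=0.4: keep most of the input and only run the last 40% of steps,
    #               so the reference footage does the heavy lifting.
    noise = np.random.randn(*frame.shape)
    x = (1 - strength) * frame + strength * noise
    for t in reversed(range(int(steps * strength))):
        eps = predict_noise(x, t)
        x = x - eps / steps  # crude update; real schedulers are fancier
    return x

frame = np.random.rand(64, 64, 3)              # stand-in for a real video frame
from_text_only = denoise(frame, strength=1.0)  # model squints at pure static
from_reference = denoise(frame, strength=0.4)  # model just nudges the ramen guy
```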


I would be shocked if any diffusion model could do that based on a description. Most can’t overfill a wine glass.
Rendering over someone demonstrating the movement, as video-to-video, is obviously easier than firing up Blender. But: that’s distant from any dream of treating the program like an actress. Each model’s understanding is shallow and opinionated. You cannot rely on text instructions.
The practical magic from video models, for the immediate future, is that your video input can be real half-assed. Two stand-ins can play a whole cast, one interaction at a time. Or a blurry pre-vis in Blender can go straight to a finished shot. At no point will current technologies be more than loose control of a cartoon character, because to these models, everything is a cartoon character. They don’t know the difference between an actor and a render. They just know shinier examples with pinchier proportions move faster.
You can navigate back to it sometimes, if you figure out the last four tangents you went on.


Oh it’s definitely not just Google. Apple’s been this fucked since 2007. But since this is the Android community, it’s helpful to stay on-message.


Shatter this corporation.


Voice acting is acting.
The name is a hint.
But acting skill alone won’t let Idris Elba sound like Tilda Swinton. AI can.


Uh huh. So it’s more than naive pitch-shifting, but less than somehow fixing “oh god oh man oh man oh god.” Like how someone sounds is more complex than playback speed, but still distinct from how they choose to say things.
You can figure this out. I believe in you.
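For reference, naive pitch-shifting is about three lines; this sketch uses librosa, with a placeholder file path. It will move a voice up five semitones, and it will do absolutely nothing about pacing, emphasis, or a flat read.

```python
import librosa
import soundfile as sf

# Placeholder path; any mono recording of a line read will do.
y, sr = librosa.load("take.wav", sr=None)

# Shift the whole take up ~5 semitones. The timbre changes,
# but every pause, mumble, and dead-eyed delivery survives untouched.
shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=5)
sf.write("take_shifted.wav", shifted, sr)
```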


This whole thing was one guy trying to put his cartoon OC into a labor union. It’s a stunt that’s wrong for reasons unrelated to anybody’s hate-boner against all things AI.
The “digital actress” from Final Fantasy: The Spirits Within would be identically unqualified. Characters aren’t persons.


Like a character’s just a face.
Everyone’s got a Homer Simpson impression. Very few of them sound like Dan Castellaneta. This tech fixes how your vocal cords are shaped - not whether you can pull off an American accent.


Changing your voice won’t fix bad acting.


I fully endorse photorealistic cartoons. Characters can look like anything, without having to find a specific guy and hope he fits the role.
Getting the cartoon onscreen can still involve an actual actor. Diffusion turns whatever you have into whatever you describe. Turning a guy into another guy is not a big ask. It’s how some of this mess started, with Nicolas Cage deepfaked as Superman, and defictionalizing the Stallone version of The Terminator. The target face does not need to be a real person. Three actors can stage nearly any script.
Same goes for voice-acting. VAs are understandably concerned about being cloned. Nobody’s talking about the opposite: making up what characters sound like, so any actor can play anybody. Or everybody. You can even substitute, when a scene needs extra oomph - like a band featuring a guitarist for a solo. Same sound… distinct performance.
The bubble continuing ensures the current paradigm soldiers on, meaning hideously expensive projects shove local models into people’s hands for free, because everyone else is doing that.
And once it bursts, there’s gonna be an insulating layer of dipshits repeating “guess it was nothing!” over the next decade of incremental wizardry. For now, tolerating the techbro cult’s grand promises of obvious bullshit means the unwashed masses are interpersonally receptive to cool things happening.
Already the big boys have pivoted toward efficiency instead of raw speed at all costs. The closer they get to a toaster matching current tech with a model trained for five bucks, the better. I’d love for VCs to burn money on experimentation instead of scale.