• KingRandomGuy@lemmy.world · 9 hours ago

    Yeah, I agree that it helps for approaches that require a lot of VRAM. If you’re not on a tight schedule, this kind of machine might be good enough to just get a model running.

    I don’t personally do anything that large; even the diffusion methods I’ve developed fit on a 24 GB card. But I know that with the hype around multimodal models, VRAM needs can get pretty high.

    I suspect this machine will be popular with hobbyists for running really large open weight LLMs.
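
    To put rough numbers on “really large”: below is a back-of-envelope sketch of whether a big open-weight model fits in a unified-memory pool. The model shape, quant level, and pool size are illustrative assumptions, not specs of any particular model or machine.

    ```python
    # Back-of-envelope memory footprint for a quantized open-weight LLM.
    # All figures here are illustrative assumptions, not hardware/model specs.

    def weight_gb(params_b: float, bits_per_weight: float) -> float:
        """Approximate size of the quantized weights in GB."""
        return params_b * 1e9 * bits_per_weight / 8 / 1e9

    def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                    context: int, bytes_per_elem: int = 2) -> float:
        """Approximate FP16 KV-cache size in GB for one sequence (K and V)."""
        return 2 * layers * kv_heads * head_dim * context * bytes_per_elem / 1e9

    # Hypothetical 70B-class dense model at ~4.5 bits/weight (a Q4-style quant)
    weights = weight_gb(params_b=70, bits_per_weight=4.5)                  # ~39 GB
    kv = kv_cache_gb(layers=80, kv_heads=8, head_dim=128, context=32768)   # ~11 GB

    pool_gb = 96  # assumed unified-memory pool size
    print(f"weights ~ {weights:.1f} GB, KV cache ~ {kv:.1f} GB, "
          f"fits in {pool_gb} GB: {weights + kv < pool_gb}")
    ```

    A model that size doesn’t come close to fitting on a single 24 GB card, which is the appeal for hobbyists.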

    • brucethemoose@lemmy.world · edited · 7 hours ago

      > I suspect this machine will be popular with hobbyists for running really large open weight LLMs.

      Yeah.

      It will probably spur a lot of development! I’ve seen a lot of batch-size-1 (bs=1) speedup “hacks” shelved because GPUs are fast enough and memory efficiency is the real bottleneck. But suddenly all these devs are going to have a 48–96 GB pool that’s significantly slower than a 3090. And multimodal becomes much more viable.
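
      On the “significantly slower” point, here’s a rough sketch of why bs=1 decode speed tracks memory bandwidth: each generated token has to stream roughly all of the active weights once, so tokens/s is capped near bandwidth divided by weight size. The 3090’s ~936 GB/s is its published spec; the unified-pool figure is an assumed placeholder, not a measurement of this machine.

      ```python
      # Bandwidth-bound ceiling for batch-size-1 (bs=1) decode:
      # tok/s <= memory bandwidth / bytes of weights read per token.
      # The unified-pool bandwidth below is an assumption for illustration.

      def decode_tps_ceiling(weight_gb: float, bandwidth_gbps: float) -> float:
          """Upper bound on tokens/sec if decode were purely bandwidth limited."""
          return bandwidth_gbps / weight_gb

      weights_gb = 39.4  # hypothetical 70B model at ~4.5 bits/weight

      for name, bw in [("RTX 3090, ~936 GB/s GDDR6X", 936),
                       ("unified memory pool, ~256 GB/s (assumed)", 256)]:
          print(f"{name}: <= {decode_tps_ceiling(weights_gb, bw):.0f} tok/s")
      ```

      The roughly 3–4x bandwidth gap is the point: the 3090 is much faster per byte, but it can’t hold a model this size at all, while the big, slower pool can.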

      Not to mention better ROCm compatibility. AMD should have done this ages ago…
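
      For anyone checking that side of things, a minimal sketch, assuming a ROCm build of PyTorch: it exposes the GPU through the usual torch.cuda API (backed by HIP), so most existing CUDA code paths run unchanged.

      ```python
      # Quick sanity check on a ROCm build of PyTorch.
      import torch

      print("HIP/ROCm build:", torch.version.hip is not None)  # None on CUDA builds
      print("GPU visible:   ", torch.cuda.is_available())
      if torch.cuda.is_available():
          print("device name:   ", torch.cuda.get_device_name(0))
      ```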