• Hackworth@piefed.ca
      12 hours ago

      The Firefly image generator is a diffusion model, and the Firefly video generator is a diffusion transformer. LLMs aren’t involved in either process; rather, the models learn image-text relationships from meta tags. I believe there are some ChatGPT integrations with Reader and Acrobat, but that’s unrelated to Firefly.

      • utopiah@lemmy.world
        2 hours ago

        Surprising. I would have expected it to rely at some point on something like CLIP in order to be prompted.