• ThirdConsul@lemmy.ml · 7 hours ago

      According to OpenAI’s internal test suite and system card, the hallucination rate is about 50%, and the newer the model, the worse it gets.

      And the same holds for other LLMs.

    • frongt@lemmy.zip · 14 hours ago

      For words, it’s pretty good. For code, it often invents a reasonable-sounding function or model name that doesn’t exist.
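
      For instance, here’s a minimal sketch of that failure mode (`pd.Timestamp.from_natural_language` is a hypothetical, plausible-sounding method that pandas does not actually provide):

      ```python
      import pandas as pd

      # An LLM might confidently suggest a line like this:
      #   ts = pd.Timestamp.from_natural_language("next Tuesday at noon")
      # It reads like idiomatic pandas, but the method is hallucinated:
      print(hasattr(pd.Timestamp, "from_natural_language"))  # False
      ```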

      • Xenny@lemmy.world · 5 hours ago (edited)

        It’s not even good for words. AI just writes the same stories over and over and over and over and over and over. It’s the same problem as with coding: it can’t come up with anything novel. Hell, it can’t even think. I’d argue the best and only real use for an LLM is as a rough-draft editor, correcting punctuation and grammar. We’ve gone way, way, way beyond the scope of what it’s actually capable of.

        • Flic@mstdn.social · 5 hours ago

          @Xenny @frongt it’s definitely not good for words with any technical meaning, because it creates references to journal articles and legal precedents that sound plausible but don’t exist.
          Ultimately it’s a *very* expensive replacement for the lorem ipsum generator keyboard shortcut.