• Serinus@lemmy.world
      English
      arrow-up
      2
      ·
      8 hours ago

      It works well when you use it for small (or repetitive), explicit tasks that you can easily check.

      • ThirdConsul@lemmy.ml
        English
        arrow-up
        1
        ·
        7 hours ago

        According to OpenAI’s internal test suite and system card, the hallucination rate is about 50%, and the newer the model, the worse it gets.

        And the same holds for other LLMs.

      • frongt@lemmy.zip
        English
        arrow-up
        8
        arrow-down
        1
        ·
        14 hours ago

        For words, it’s pretty good. For code, it often invents a reasonable-sounding function or model name that doesn’t exist.
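One cheap guard against invented names is to check that each suggested name actually resolves before trusting the code. The following is only a minimal sketch for the shell case, assuming the suggestions are command names; `names_exist` is a hypothetical helper, not part of any tool mentioned in this thread.

```shell
# Hypothetical helper: report whether each suggested command name
# resolves to something runnable in the current shell.
# Returns 0 if every name exists, 1 if at least one is missing.
names_exist() {
  missing=0
  for name in "$@"; do
    # `command -v` succeeds for builtins, functions, aliases and executables on PATH.
    if command -v "$name" >/dev/null 2>&1; then
      echo "ok: $name"
    else
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `names_exist docker mktemp` lists each name as `ok` or `missing`. The same idea carries over to other languages by looking the name up in the real library instead of taking the generated code at face value.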

        • Xenny@lemmy.world
          English
          arrow-up
          2
          ·
          edit-2
          5 hours ago

          It’s not even good for words. AI just writes the same stories over and over and over and over and over and over. It’s the same problem as coding. It can’t think of anything novel. Hell, I can’t even think. I’d argue the best and only real use for an LLM is as a rough-draft editor that corrects punctuation and grammar. We’ve pushed it way, way, way beyond the scope of what it’s actually capable of.

          • Flic@mstdn.social
            arrow-up
            1
            ·
            5 hours ago

            @Xenny @frongt it’s definitely not good for words with any technical meaning, because it creates references to journal articles and legal precedents that sound plausible but don’t exist.
            Ultimately it’s a *very* expensive replacement for the lorem ipsum generator keyboard shortcut.

    • ptu@sopuli.xyz
      English
      arrow-up
      6
      arrow-down
      1
      ·
      17 hours ago

      I use it for things that are simple and monotonous to write. This way I’m able to deliver results on tasks I couldn’t have been arsed to do otherwise. I’m a data analyst and mostly use MySQL and Power Query.

    • dogdeanafternoon@lemmy.ca
      English
      arrow-up
      3
      ·
      15 hours ago

      What’s your preferred Hello World language? I’m gonna test this out. The more complex the code you need, the more they suck, but I’ll be amazed if it doesn’t manage to simply print hello world on the first try.

      • xthexder@l.sw0.com
        English
        arrow-up
        9
        ·
        edit-2
        15 hours ago

        Malbolge is a fun one

        Edit: Funnily enough, ChatGPT fails to get this right, even with the answer right there on Wikipedia. When I tried running ChatGPT’s output, the first few characters were correct, but then it errored with invalid char at 37.

        • dogdeanafternoon@lemmy.ca
          English
          arrow-up
          2
          ·
          14 hours ago

          Cheeky, I love it.

          Got correct code first try. Failed to get a working Docker command first try. Second try worked.

          tmp="$(mktemp)"; cat >"$tmp" <<'MBEOF'
          ('&%:9]!~}|z2Vxwv-,POqponl$Hjig%eB@@>}=<M:9wv6WsU2T|nm-,jcL(I&%$#"
          `CB]V?Tx<uVtT`Rpo3NlF.Jh++FdbCBA@?]!~|4XzyTT43Qsqq(Lnmkj"Fhg${z@>
          MBEOF
          docker run --rm -v "$tmp":/code/hello.mb:ro esolang/malbolge malbolge /code/hello.mb; rm "$tmp"
          

          Output: Hello World!
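A check like this is easy to automate, which is what makes the task verifiable in the first place. Below is a minimal sketch, assuming all you want is to compare a command's standard output with an expected string; `check_output` is a hypothetical helper, and the `printf` call is only a stand-in for the `docker run` line above.

```shell
# Hypothetical helper: run a command and compare its standard output
# with an expected string.
# Usage: check_output EXPECTED COMMAND [ARGS...]
check_output() {
  expected="$1"
  shift
  # Command substitution strips trailing newlines, so a program that
  # prints "Hello World!" followed by a newline still matches.
  actual="$("$@" 2>/dev/null)"
  if [ "$actual" = "$expected" ]; then
    echo "PASS"
  else
    echo "FAIL: expected '$expected', got '$actual'"
    return 1
  fi
}

# Stand-in command; swap in the docker invocation to test the real program.
check_output 'Hello World!' printf 'Hello World!\n'
```

The helper returns a non-zero status on a mismatch, so it can gate a script or a CI step rather than relying on someone eyeballing the output.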

          • xthexder@l.sw0.com
            English
            arrow-up
            5
            ·
            edit-2
            13 hours ago

            I’m actually slightly impressed it got both a working program and a different one from Wikipedia’s. The Wikipedia one prints “Hello, world.”

            I guess there must be another program floating around the web with “Hello World!”, since there’s no chance the LLM figured it out on its own (it kinda requires specialized algorithms to do anything)

            • dogdeanafternoon@lemmy.ca
              English
              arrow-up
              1
              ·
              14 hours ago

              I’d never even heard of that language, so it was fun to play with.

              Definitely agree that the LLM didn’t actually figure anything out, but at least it’s not completely useless.