• Thorry@feddit.org
    English
    arrow-up
    52
    ·
    10 hours ago

    Also, just because the code works doesn’t mean it’s good code.

    I had to review some code the other day that was clearly written by an LLM. Two classes needed to talk to each other in a somewhat complex way, so I would expect one class to create some kind of request data object and submit it to the other class, which then returns some kind of response data object.

    What the LLM actually did was pretty shocking: it used reflection to reach from one class into the private properties of the other, where the required data lived. It then just straight up stole the data and did the work itself (wrongly as well, I might add). I just about fell off my chair when I saw this.
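
To make the contrast concrete, here is a minimal sketch of both approaches. The thread does not say which language the reviewed code was in, so Python is used purely for illustration, and every name (`OrderStore`, `InvoicePrinter`, `TotalRequest`, `TotalResponse`, `summary_by_prying`) is hypothetical. Reading the name-mangled attribute stands in for reflection on a private field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TotalRequest:
    """Hypothetical request object: what the caller wants computed."""
    customer_id: str


@dataclass(frozen=True)
class TotalResponse:
    """Hypothetical response object: the only data the caller gets back."""
    customer_id: str
    total_cents: int


class OrderStore:
    """Owns the order data and keeps it private."""

    def __init__(self) -> None:
        self.__orders: dict[str, list[int]] = {}

    def add_order(self, customer_id: str, amount_cents: int) -> None:
        self.__orders.setdefault(customer_id, []).append(amount_cents)

    def handle(self, request: TotalRequest) -> TotalResponse:
        # The work happens next to the data it needs.
        amounts = self.__orders.get(request.customer_id, [])
        return TotalResponse(request.customer_id, sum(amounts))


class InvoicePrinter:
    """Talks to OrderStore only through its public handle() method."""

    def __init__(self, store: OrderStore) -> None:
        self._store = store

    def summary(self, customer_id: str) -> str:
        response = self._store.handle(TotalRequest(customer_id))
        return f"{response.customer_id}: {response.total_cents / 100:.2f}"


def summary_by_prying(store: OrderStore, customer_id: str) -> str:
    """Anti-pattern: reach into another class's private state."""
    # Read the name-mangled private attribute and redo the owner's work
    # on the outside. This breaks as soon as OrderStore changes its internals.
    amounts = getattr(store, "_OrderStore__orders").get(customer_id, [])
    return f"{customer_id}: {sum(amounts) / 100:.2f}"
```

Both versions can produce the same string today, which is exactly why "it works" says little: only the first one survives a change to how `OrderStore` stores its data.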

    So I asked the dev. He said he didn’t fully understand what the LLM did, because he wasn’t familiar with reflection. But since it seemed to work in the few tests he did, and the unit tests the LLM generated passed, he thought it would be fine.

    Also, the unit tests were wrong. I explained to the dev that even with humans it’s usually a bad idea to have the person who wrote the code also (exclusively) write the unit tests. Whenever possible, have somebody else write the unit tests, so they don’t have the same assumptions and blind spots. With LLMs this is doubly true: they will just straight up lie in the unit tests, if the tests aren’t complete nonsense to begin with.

    I swear to the gods, LLMs don’t save time or money, they just give the illusion that they do. Some task that would normally take a few hours takes 20 minutes and everyone claps. But then another task takes twice as long and we just don’t look at that. And the quality suffers a lot, without anyone really noticing.

    • Kissaki@feddit.org
      English
      arrow-up
      3
      ·
      2 hours ago

      “So I asked the dev, he said he didn’t fully understand what the LLM did, he wasn’t familiar with reflection.”

      Big baffling facepalm moment.

      If they at least prefixed the changeset description with that, it’d be easier to interpret and assess.

    • criss_cross@lemmy.world
      English
      arrow-up
      1
      ·
      2 hours ago

      They’ve been great for me at optimizing bite-sized, annoying tasks. They’re really bad at doing anything beyond that. Like, astronomically bad.

      • Kissaki@feddit.org
        English
        arrow-up
        4
        ·
        edit-2
        2 hours ago

        They did say why they’re doing it:

        “Whenever possible have somebody else write the unit tests, so they don’t have the same assumptions and blind spots.”

        Did that not make sense to you?

        I usually wouldn’t do that, because it’s a bigger investment. But it certainly makes logical sense to me and is something teams can weigh and decide on.

    • airgapped@piefed.social
      English
      arrow-up
      7
      ·
      7 hours ago

      Great description of a problem I’ve noticed with most LLM-generated code of any decent complexity. It will look fantastic at first, but you will be truly up shit creek by the time you realise it didn’t generate a paddle.