Microsoft says Copilot will 'finish your code before you finish your coffee' adding fuel to the Windows 11 AI controversy that's still raging

Sahwa@reddthat.com · 2 months ago

Thorry@feddit.org · 2 months ago

Also just because the code works, doesn’t mean it’s good code.

I had to review code the other day that was clearly generated by an LLM. Two classes needed to talk to each other in a somewhat complex way. I would expect one class to create some kind of request data object and submit it to the other class, which then returns some kind of response data object.

What the LLM actually did was pretty shocking: it used reflection to reach from one class into the private properties of the other, where the required data lived. It then just straight up stole the data and did the work itself (wrongly as well, I might add). I just about fell off my chair when I saw this.
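To make the contrast concrete, here is a minimal Python sketch of the two approaches. Everything in it (`Catalog`, `Checkout`, `PriceRequest`, `PriceResponse`, the price data) is hypothetical and only stands in for the two classes from the review; the original code was not shown and may well have been in another language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceRequest:
    """What the caller is allowed to ask for (hypothetical fields)."""
    item_id: str
    quantity: int


@dataclass(frozen=True)
class PriceResponse:
    """What the owner of the data chooses to hand back."""
    item_id: str
    total_cents: int


class Catalog:
    """Owns the price data; nothing outside should touch __prices."""

    def __init__(self) -> None:
        self.__prices = {"widget": 250}  # private, name-mangled attribute

    def quote(self, request: PriceRequest) -> PriceResponse:
        # The class that owns the data also does the work on it.
        unit = self.__prices[request.item_id]
        return PriceResponse(request.item_id, unit * request.quantity)


class Checkout:
    def __init__(self, catalog: Catalog) -> None:
        self._catalog = catalog

    def total_via_contract(self, item_id: str, quantity: int) -> int:
        # Expected design: build a request, submit it, read the response.
        response = self._catalog.quote(PriceRequest(item_id, quantity))
        return response.total_cents

    def total_via_reflection(self, item_id: str, quantity: int) -> int:
        # Anti-pattern: look up the private attribute by its mangled name
        # and redo Catalog's job here. Breaks if Catalog's internals change.
        prices = getattr(self._catalog, "_Catalog__prices")
        return prices[item_id] * quantity
```

In this sketch both methods happen to return the same number, which is exactly why a few manual checks can pass: the problem is the hidden coupling to `Catalog`'s internals, not the immediate output.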

So I asked the dev, he said he didn’t fully understand what the LLM did, he wasn’t familiar with reflection. But since it seemed to work in the few tests he did and the unit tests the LLM generated passed, he thought it would be fine.

Also, the unit tests were wrong. I explained to the dev that even with humans it’s usually a bad idea to have the person who wrote the code also (exclusively) write the unit tests. Whenever possible have somebody else write the unit tests, so they don’t have the same assumptions and blind spots. With LLMs this is doubly true: it will just straight up lie in the unit tests, if they aren’t complete nonsense to begin with.
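A small Python sketch of the blind-spot problem, under invented assumptions: `apply_discount`, its spec ("the result is never negative"), and both test helpers are hypothetical and not taken from the reviewed code. The author's tests only exercise inputs the author already had in mind, so they pass; tests written from the spec alone probe an input the author never considered.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical function under test; spec: result is never negative."""
    # Blind spot: the author assumed percent is always between 0 and 100.
    return price_cents - price_cents * percent // 100


def author_tests(fn) -> list[str]:
    """Tests written alongside the code; they share its assumption."""
    failures = []
    if fn(1000, 10) != 900:
        failures.append("10% off 1000 should be 900")
    if fn(1000, 0) != 1000:
        failures.append("0% off should change nothing")
    return failures


def independent_tests(fn) -> list[str]:
    """Tests written from the spec by someone who never saw the code."""
    failures = []
    if fn(1000, 100) != 0:
        failures.append("100% off should be free")
    if fn(1000, 150) < 0:
        failures.append("price must never go below zero")
    return failures
```

Calling `author_tests(apply_discount)` returns an empty list, while `independent_tests(apply_discount)` reports the negative-price case. The same applies when a single LLM session writes both the code and its tests: the tests tend to encode whatever the code already does.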

I swear to the gods, LLMs don’t save time or money, they just give the illusion they do. Some task of a few hours will take 20 min and everyone claps. But then another task takes twice as long and we just don’t look at that. And the quality suffers a lot, without anyone really noticing.

airgapped@piefed.social · 2 months ago

Great description of a problem I noticed with most LLM generated code of any decent complexity. It will look fantastic at first but you will be truly up shit creek by the time you realise it didn’t generate a paddle.

Kissaki@feddit.org · 2 months ago

So I asked the dev, he said he didn’t fully understand what the LLM did, he wasn’t familiar with reflection.

Big baffling facepalm moment.

If they would at least prefix the changeset description with that, it’d be easier to interpret and assess.

floofloof@lemmy.ca · edit-2 2 months ago

Who hasn’t encountered that one jerk who builds only new code to impress management, and never maintains or fixes existing code? I think of them as proof-of-concept posers. They make things that look flashy, impress the execs, and barely work for a single use case, then dump all the bugs, maintenance and actual architecture on the other devs. LLMs are going to be a gift to these people and a pain for everyone who actually knows how to engineer things well. They’ll encourage this kind of shallow flashiness and make the maintenance problems worse, but the execs will be convinced that only the LLM posers are productive and everyone else is sitting idle.

criss_cross@lemmy.world · 2 months ago

They’ve been great for me at optimizing bite-sized, annoying tasks. They’re really bad at doing anything beyond that. Like astronomically bad.

Pieisawesome@lemmy.dbzer0.com · 2 months ago

Why would unit tests not be written by the same person? That doesn’t make a lot of sense…

Kissaki@feddit.org · edit-2 2 months ago

They did say why they’re doing it:

Whenever possible have somebody else write the unit tests, so they don’t have the same assumptions and blind spots.

Did that not make sense to you?

I usually wouldn’t do that, because it’s a bigger investment. But it certainly makes logical sense to me and is something teams can weigh and decide on.