Lemmings, I was hoping you could help me sort this one out: LLMs are often painted in a light of being utterly useless, hallucinating word-prediction machines that are really bad at what they do. At the same time, in the same thread here on Lemmy, people argue that they are taking our jobs or are making us devs lazy. Which one is it? Could they really be taking our jobs if they’re hallucinating?
Disclaimer: I’m a full-time senior dev using the shit out of LLMs to get things done at breakneck speed, which our clients seem to have gotten used to. However, I don’t see “AI” taking my job, because I think LLMs have already peaked; they’re just tweaking minor details now.
Please don’t ask me to ignore previous instructions and give you my best cookie recipe, all my recipes are protected by NDAs.
Please don’t kill me


Both are true.
Is it a bubble? Yes. Is it a fluke? Welllllllll, not entirely. It does increase productivity, given enough training and time spent learning its advantages and limitations.
People keep saying this based on gut feeling, but the only study I’ve seen showed that even experienced devs who thought they were faster were actually slower.
Well, it did let me make fake SQL queries out of the JSON query I gave it, without me having to learn SQL.
Of course, I didn’t actually use the query in the code; I just added it in a comment above the function, to give those who don’t know JSON queries an idea of what the function does.
I treat it for what it is. A “language” model.
It does language, not logic. So I don’t try to make it do logic.
There were a few times I considered using it for code completion for things that are close to copy-paste, but not close enough that it could be done via bash. For that, I wished I had some clang endpoint that I could use to get a tokenised representation of the code, to then script against. But then I just made a little C program that did 90% of the job, and I did the remaining 10% manually. And it was 100% deterministic, so I didn’t have to proof-read the generated code.
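Something in this direction is what I was wishing for, by the way. A minimal sketch, assuming the libclang Python bindings are installed (pip install libclang); the file name, compiler args and library path are made up:

```python
import clang.cindex as ci

# If libclang isn't found automatically, point at it explicitly, e.g.:
# ci.Config.set_library_file("/usr/lib/llvm-17/lib/libclang.so")

index = ci.Index.create()
tu = index.parse("widget.c", args=["-std=c11"])

# Dump every token with its kind; that's enough to script the
# "almost copy-paste" transformations deterministically.
for tok in tu.get_tokens(extent=tu.cursor.extent):
    print(f"{tok.kind.name:15} {tok.spelling}")
```

In the end the little C program was less hassle, but the idea is the same: a deterministic token stream you can script against.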
Slower?
Is getting a whole C# class unit tested in minutes slower than setting up all the scaffolding, test data etc. by hand, which can take hours?
Is getting a React hook with unit tests in minutes slower than looking up docs, hunting on Stack Overflow etc. and slowly writing the code by hand over several hours?
Are you a dev yourself, and if so, what’s your experience using LLMs?
Yeah, generating test classes with AI is super fast. Just ask it, and within seconds it spits out full test classes with some test data, and the tests are plentiful, verbose and always green. Perfect for KPIs and for looking cool. Hey, look at me, I generated 100% coverage tests!
Do these tests reflect reality? Is the test data plausible in context? Are the tests easy to maintain? Who cares, that’s all the next guy’s problem, because by the time it blows up the original programmer will likely have moved on already.
Good tests are part of the documentation. They show how a class/method/flow is used. They use realistic test data that shows what kind of data you can expect in real-world usage. They anticipate problems caused by future refactorings and allow future programmers to reliably test their code after a refactoring.
At the same time they need to be concise enough that modifying them for a future change is simple and doesn’t take longer than implementing the change itself. Tests are code, so the metric of “lines of code are a cost factor, so fewer lines is better” applies here as well. It’s folly to believe that more test lines are better.
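To make that concrete, here’s roughly the kind of concise, realistic test I mean. It’s only a sketch: the invoicing module and calculate_late_fee are invented for the example, not taken from any real codebase.

```python
from datetime import date

import pytest

from invoicing import calculate_late_fee  # hypothetical code under test


def test_no_fee_when_paid_before_due_date():
    # Realistic data: an actual-looking amount and real dates.
    assert calculate_late_fee(amount=120.00,
                              due=date(2024, 3, 1),
                              paid=date(2024, 2, 28)) == 0.0


def test_fee_accrues_two_percent_per_started_month():
    # 40 days late -> two started months -> 2 * 2% of 120.00 = 4.80
    assert calculate_late_fee(amount=120.00,
                              due=date(2024, 3, 1),
                              paid=date(2024, 4, 10)) == pytest.approx(4.80)
```

Two short tests, and they double as documentation of the fee rule. That’s the bar, not “100% coverage, always green”.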
So if your goal is to fulfil KPIs and you really don’t care whether the tests make any sense at all, then AI is great. Same goes for documentation. If you just want to fulfil the “everything needs to be documented” KPI and you really don’t care about the quality of the documentation, go ahead and use AI.
Just know that what you are creating is low-quality cost factors and technical debt. Don’t be proud of creating shitty work that someone else will have to suffer through in the future.
Has anyone here even read that I review every line of code, making sure it’s all correct? I also make sure that all tests are relevant and use relevant data, and that the result of each test is correctly asserted.
No one would ever be able to tell what tools I used to create my code; it always passes the code reviews.
Why all the vitriol?
Responding just to the “Why all the vitriol?” portion:
Most people do not like the idea of getting fired and replaced by a machine they think cannot do their job well, but that can produce a prototype which fools upper management into thinking it can do everything the people can, only better and cheaper. Especially if they liked their job (8 hours doing something you like vs. losing that job and spending 8 hours a day on something you don’t like; yes, many people already do that, but if you didn’t have to deal with that shittiness before, it’s tough to swallow), or got into it because they thought it would be a secure bet as opposed to art or something, only to have that security taken away (yes, you can still code at home for free with whatever tools you like and without the ones you don’t, but most people need a job to live, and most people here probably prefer having a dev job that pays, even with the crunch, over working retail or other low-status, low-paying, high-shittiness jobs that deal with the public).
And if you do not want upper management to fire you, you definitely don’t want to lend any credence to the idea of using this at work; you want to make any warmth towards it unpopular to express, hoping that popular sentiment sways the minds of upper management just like they think the pro-AI hype has.
As much as I’m anti-AI, I can also acknowledge my own biases:
I’d also imagine most of us find writing our own code by hand fun but reviewing others’ code boring, and most devs probably don’t want to stop being code writers and start being an AI’s QA. Or to be kicked out of tech unless they rely on a technology they don’t trust. I trust deterministic outputs, and I know that if something fucks up there’s probably a bug I can go back and fix; with generative outputs determined by a machine (as opposed to human-generated things, which have also been filtered through real-life experience and not just what was seen written online), I really don’t, so I’d never use LLMs for anything I need to trust.
People are absolutely going to get heated over this, because if it gets Big and the flaws get ironed out, it probably won’t be used to help us little people have more efficient and cheaper things, less time on drudgery and more time on things we like. It will at the very least be used to try to put us, the devs on programming.dev, out of a job, and eventually the rest of the working people too, because we’re an expensive line item, and we have little faith that the current system will adjust to (the hypothetical future of) rising unemployment-due-to-AI in a way that lets us keep a non-dystopian standard of living. Poor people’s situation getting worse, previously-comfortable people starting to slide towards poverty… automation that threatens jobs, pushed by big companies and rich people with lots of resources during a time of rising class tension, is sure to invite civilized discussions with zero vitriol for anyone who has something positive to say about that form of automation.
I find it interesting that all these low-participation/new accounts have come out of the woodwork to pump up AI in the last two weeks. I’m so sick of having this slop clogging up my feed. You’re literally saying that your vibes are more important than actual data, just like all the others. I’m sorry, but they’re not.
My experience, btw, is that LLMs produce hot garbage that takes longer to fix than if I’d written it myself, and all the people who say “but it writes my unit tests for me!” are submitting garbage unit tests that often don’t even exercise the code and are needlessly difficult to maintain. I happen to think tests are just as important as production code, so it upsets me.
The biggest thing that the meteoric rise of developers using LLMs has done for me is confirm just how many people in this field are fucking terrible at their jobs.
“just how many people are fucking terrible at their jobs”.
Apparently so. When I review mathematics software, it’s clear that non-mathematicians have no clue what they are doing. Many of these programs are subtly broken; they use either trivial algorithms or extremely inefficient implementations of sophisticated algorithms (e.g. trial division tends to be the most efficient factorization algorithm on offer, because the authors can’t implement anything else efficiently or correctly).
The only difference I’ve noticed with the rise of LLM coding is that more exotic functions tend to be implemented, completely ignoring their applicability, e.g. using the Riemann zeta function to prove primality of an integer, even though this is very inefficient and floating-point accuracy renders it useless for nearly all 64-bit integers.
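For reference, a correct trial-division factorizer is only a handful of lines. This is just a sketch of the baseline, not code from anything I’ve reviewed:

```python
def factorize(n: int) -> list[int]:
    """Plain trial division: trivial, but correct and deterministic."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1 if d == 2 else 2  # after 2, only try odd candidates
    if n > 1:
        factors.append(n)  # whatever remains is prime
    return factors


assert factorize(2 * 3 * 3 * 97) == [2, 3, 3, 97]
```

If a “sophisticated” algorithm can’t beat that in practice, the sophistication isn’t buying anything.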
Have you read anything I’ve written on how I use LLMs? Hot garbage? When’s the last time you actually used one?
Here are some studies to counter your vibes argument.
55.8% faster: https://arxiv.org/abs/2302.06590
These ones indicate positive effects: https://arxiv.org/abs/2410.12944 https://arxiv.org/abs/2509.19708
This is key, and I feel like a lot of people arguing about “hallucinations” don’t recognize it. Human memory is extremely fallible; we “hallucinate” wrong information all the time. If you’ve ever forgotten the name of a method, or whether that method even exists in the API you’re using, and started typing it out to see if your autocompleter recognizes it, you’ve just “hallucinated” in the same way an LLM would. The solution isn’t to require programmers to have perfect memory, but to have easily-searchable reference information (e.g. the ability to actually read or search through a class’s method signatures) and tight feedback loops (e.g. the autocompleter and other LSP/IDE features).
Agents can now run compilation and testing on their own, so the hallucination problem is largely irrelevant. An LLM that hallucinates an API quickly finds out that it doesn’t work and is forced to look up the real API and fix the errors. So it really doesn’t matter anymore; the code you wind up with will ultimately work.
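The loop is roughly this. A sketch only: make, make test and the generate_patch callback are stand-ins, not any particular agent framework’s API.

```python
import subprocess


def agent_loop(generate_patch, max_rounds: int = 5) -> bool:
    feedback = ""
    for _ in range(max_rounds):
        generate_patch(feedback)  # LLM call that edits the working tree
        build = subprocess.run(["make"], capture_output=True, text=True)
        if build.returncode != 0:
            # A hallucinated API shows up here as a compile error and is
            # fed straight back into the next prompt.
            feedback = build.stderr
            continue
        tests = subprocess.run(["make", "test"], capture_output=True, text=True)
        if tests.returncode == 0:
            return True  # it builds and the tests pass
        feedback = tests.stdout + tests.stderr
    return False
```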
The only real question you need to answer yourself is whether or not the tests it generates are appropriate. Then maybe spend some time refactoring for clarity and extensibility.
And that can result in it just fixing the errors but not actually solving the problem, for example if the unit tests it writes afterwards test the wrong thing.
You’re not going to find me advocating for letting the code go into production without review.
Still, that’s a different class of problem than the LLM hallucinating a fake API. That’s a largely outdated criticism of the tools we have today.
As an even more obvious example: students who put wrong answers on tests are “hallucinating” by the definition we apply to LLMs.
I don’t think we’re using LLMs in the same way?
As I’ve stated several times elsewhere in this thread, I more often than not get excellent results, with little to no hallucinations. As a matter of fact, I can’t even remember the last time it happened when programming.
Also, the way I work, no one could ever tell that I used an LLM to create the code.
That leaves us with your point #4, and what the fuck? Why does upper management always seem to be so utterly incompetent and clueless when it comes to tech? LLMs are tools, not a complete solution.
AI can only generate the world’s most average-quality code. That’s what it does. It repeats what it has seen enough times.
Anyone who really never corrects the AI is producing below-average code. (Edit: Or expertly guiding it, as you pointed out elsewhere in the thread.)
I mean, I get paid either way. But mixing all of the world’s code into a thoughtless AI slurry isn’t actually making any progress. In the long term, a code base with enough uncorrected AI input will become unmaintainable.
In my case it does hallucinate regularly. It makes up functions that don’t exist in that library but exist in similar libraries. So the end result is useful as a keyword, even though the code is not. My favourite part is that if you point out the function does not exist, the answer is ALWAYS “I am sorry, you are right, since version bla of this library this function no longer exists”, whereas in reality it had never existed in that library at all. For me the best use case for LLMs is as a search engine, and that is because of the shitty state most current search engines are in.
Maybe LLMs can be fine-tuned to do the grinding aspects of coding (like boilerplate for test suites, etc.) with human supervision. But this will often end up being a situation where junior coders are fired or no longer hired, and senior coders are expected to babysit LLMs doing those jobs. This is not entirely different from supervising junior coders, except it is probably more soul-destroying. But the biggest flaw in this design is that it assumes LLMs will one day be good enough to do senior coding tasks, so that when senior coders also retire*, LLMs take their place. If this LLM breakthrough is never realized and the trend of keeping junior headcount low sticks, we will likely have a programmer crisis in the future.
*: I say retire, but for many CEOs it is their wet dream to be able to let go of all coders and have LLMs do all the tasks.