Developer and refugee from Reddit

  • After working on a team that uses LLMs in agentic mode for almost a year, I’d say this is probably accurate.

    For a big chunk of the team, most of the work at this point is trying to figure out prompts that will make the model do what they want, without producing any user-facing results at all. The rest of us use it to generate small bits of code, such as one-off scripts to accomplish a specific task - the only area where it’s actually useful.

    The shine wears off quickly after the fourth or fifth time it “finishes” a feature by mocking the data - so many of the public-facing repos it trained on contain mock data that it seems to think that’s what a finished feature looks like.

  • So there are a few very specific tasks that LLMs are good at from the perspective of a software developer:

    1. Certain kinds of analysis tasks can be done very quickly and efficiently with Copilot in agent mode. For instance, having it assess your existing code for adherence to stylistic standards where a violation isn’t going to trigger a linting error.
    2. Quick script writing is something it excels at. There are all kinds of circumstances where you might need an independent script, such as a database seed file. It’s not part of the application itself, but it’s a useful utility to have, and Copilot is good at writing them (there’s a sketch of the kind of thing I mean right after this list).
    3. Scaffolding a new application. If you’re creating something brand new and know which tools you want to use for it, but don’t want to go through the hassle of setting everything up yourself, having Copilot do it can be a real time saver.
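
    To make the second item concrete, here’s the kind of one-off utility I mean - a minimal database seed script. Everything in it (the `users` table, the SQLite file name, the sample rows) is hypothetical; it just illustrates the shape of self-contained script Copilot handles well.

    ```python
    # seed_db.py - hypothetical one-off seed script, the kind of
    # self-contained utility an LLM tends to get right.
    import sqlite3

    # Made-up sample rows, purely for illustration.
    SAMPLE_USERS = [
        ("alice", "alice@example.com"),
        ("bob", "bob@example.com"),
    ]

    def seed(db_path: str = "dev.sqlite3") -> None:
        conn = sqlite3.connect(db_path)
        try:
            conn.execute(
                """CREATE TABLE IF NOT EXISTS users (
                       id INTEGER PRIMARY KEY AUTOINCREMENT,
                       username TEXT UNIQUE NOT NULL,
                       email TEXT NOT NULL
                   )"""
            )
            # INSERT OR IGNORE makes the script safe to re-run.
            conn.executemany(
                "INSERT OR IGNORE INTO users (username, email) VALUES (?, ?)",
                SAMPLE_USERS,
            )
            conn.commit()
        finally:
            conn.close()

    if __name__ == "__main__":
        seed()
    ```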

    And that’s… pretty much it. I’ve experimented with building applications through “prompt engineering,” and to be blunt, I think the concept is fundamentally flawed. The problem is that once the application outgrows the LLM’s context window - which is small relative to any real codebase - it makes even more mistakes than usual, because, just as an example, by the time you have it write the frontend for a new API endpoint, it’s already forgotten how that endpoint works.

    As the application approaches production size in features and functions, the number of lines of code becomes an insurmountable bottleneck for Copilot. It simply can’t maintain a comprehensive understanding of what’s already there.
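
    To put numbers on that, here’s a back-of-the-envelope sketch. The ~4-characters-per-token heuristic and the 128k-token window are assumptions for illustration, not properties of any particular model, but the conclusion holds for any realistic window size:

    ```python
    # Rough estimate of codebase size vs. an LLM context window.
    # CHARS_PER_TOKEN and the window size are illustrative assumptions.
    CHARS_PER_TOKEN = 4              # common heuristic for source code
    CONTEXT_WINDOW_TOKENS = 128_000  # assumed window size

    def estimated_tokens(lines_of_code: int, avg_chars_per_line: int = 40) -> int:
        """Very rough token count for a codebase of the given size."""
        return lines_of_code * avg_chars_per_line // CHARS_PER_TOKEN

    for loc in (10_000, 100_000, 500_000):
        tokens = estimated_tokens(loc)
        print(f"{loc:>7,} LOC ~ {tokens:>9,} tokens "
              f"({tokens / CONTEXT_WINDOW_TOKENS:.1f}x a 128k window)")
    ```

    Even at 100k lines - small by production standards - the code alone is several times larger than the window, before you count dependencies, schemas, and the conversation itself.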

  • There’s also the fact that what we’re currently calling AI isn’t actually AI, that there are better options that aren’t environmental catastrophes (I’m hopeful about small language models), and that no one seems to want all the “AI” being jammed into every goddamn thing.

    No, I don’t want Gemini in my email or messaging; I want to read messages from people myself. No, I don’t want Copilot summaries of my meetings in Teams; half the folks I work with have accents it can’t parse. Get the hell out of my way when I’m trying to interact with actual human beings.

    And I say that as someone whose job literally involves working with LLMs every day. Ugh.