• Armillarian@pawb.social
    18 hours ago

    I think it would be better if their GitHub mentioned the minimum token count required to self-host this. I don’t think it will ever reach something usable for the average self-hosting user.

    Based on your statement, I think most of your experience comes from corporate AI usage, which deploys multi-agent systems hosted in large data centers.

    • Scrubbles@poptalk.scrubbles.tech
      18 hours ago

      I do self-host my own, and even tried my hand at building something like this myself. It runs pretty well; I’m able to have it integrate with HomeAssistant and kubectl. It can be done with consumer GPUs. I have a 4000 and it runs fine. You don’t get as much context, but the point is minimizing what the LLM needs to know when calling agents. You have one LLM context that’s running a todo list; you start a new one that is in charge of step 1, which spins off more contexts for each subtask, and so on. It’s not that each agent needs its own GPU, it’s that each agent needs its own context.
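
      The pattern described above could be sketched roughly like this. Everything here is hypothetical (the function names and the `fake_llm` stub stand in for a real local-model call); it only illustrates the idea that the orchestrator keeps just a todo list in its context, and each subtask runs in its own fresh, minimal context:

      ```python
      def fake_llm(messages):
          """Stand-in for a real local LLM call; returns a canned reply."""
          return f"handled: {messages[-1]['content']}"

      def run_agent(task, system_prompt):
          # Each agent starts with its own small context (system + task only),
          # so no single context has to hold the full conversation history.
          context = [
              {"role": "system", "content": system_prompt},
              {"role": "user", "content": task},
          ]
          return fake_llm(context)

      def orchestrate(todo_list):
          # The orchestrator's context holds just the todo list and results.
          # Each step gets a fresh context; in a real system a step could
          # recurse and spin up further sub-contexts for its own subtasks.
          return [
              run_agent(step, "You are a worker handling one subtask.")
              for step in todo_list
          ]

      print(orchestrate(["check HomeAssistant lights", "run kubectl get pods"]))
      ```

      The upshot: agents run sequentially on one GPU, and VRAM only ever needs to hold the model plus one small active context at a time, rather than one giant shared context.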