• HelloRoot@lemy.lol · 5 days ago

    Well, that's what I said: “AI optimized”.

    Even my 5-year-old $900 rig can output around 4 tps.

    • mapumbaa@lemmy.zip (OP) · 4 days ago

      There is nothing “optimized” that will get you better inference performance on medium/large models for $2000.

      • HelloRoot@lemy.lol · 4 days ago

        Llama 3 8B Instruct: 25 tps

        DeepSeek R1 Distill Qwen 14B: 3.2 tps

        To be fair: I bought the motherboard, CPU, and RAM 6 years ago along with an Nvidia 1660. Then I bought the Radeon RX 6600 XT on release in 2021, so 4 years ago. But it's a generic gaming rig.

        I would be surprised if $2000 worth of modern hardware, picked for this specific task, would be worse than that mini PC.
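A rough way to sanity-check numbers like these: single-stream LLM decoding is usually memory-bandwidth bound, so tokens/sec is roughly bandwidth divided by the bytes read per token (about the quantized model size). This is a back-of-envelope sketch, not a benchmark; the 256 GB/s figure is the RX 6600 XT's rated VRAM bandwidth, and the 0.6 efficiency factor and ~4.6 GB Q4 model size are assumptions.

```python
def estimate_tps(bandwidth_gbs: float, model_size_gb: float,
                 efficiency: float = 0.6) -> float:
    """Crude upper bound on decode speed for a bandwidth-bound LLM.

    Each generated token requires streaming (roughly) the whole set of
    quantized weights through the memory bus, so:
        tps ~= effective_bandwidth / model_size
    `efficiency` is an assumed fudge factor for real-world overhead.
    """
    return bandwidth_gbs * efficiency / model_size_gb

# RX 6600 XT: ~256 GB/s VRAM bandwidth; Llama 3 8B at Q4 is ~4.6 GB.
print(round(estimate_tps(256, 4.6), 1))  # → 33.4, same ballpark as the 25 tps above
```

The same arithmetic also explains the 14B result: a Q4 14B model (~8.4 GB) doesn't fit in the card's 8 GB of VRAM, so layers spill to system RAM and the estimate collapses to CPU memory bandwidth instead.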

        • mapumbaa@lemmy.zip (OP) · 4 days ago

          I promise, it's not possible. But things change quickly, of course.

          (Unless you’re lucky/pro and get your hands on some super cheap used high end hardware…)