“We’re having trouble with AI solving problems. Let’s solve the problem with AI.”
It’s not strictly a bad idea or doomed or anything, but it makes me sad.
This framework was tested on nine complex challenges. It achieved an 85 percent success rate, whereas the best baseline only achieved a 39 percent success rate. This suggests its applications in various multistep planning tasks, such as scheduling airline crews or managing machine time in a factory.
85% isn’t good. It’s a vast improvement, but it’s not a good rate at scale. If you have 100,000 actions, 15,000 are wrong. If you have 1M customers, 150K are calling customer support.
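For anyone who wants to check that arithmetic, here's a rough sketch. It assumes every action fails at the same flat rate (100 minus the reported success percentage), which is my simplification and not something the article claims; `expected_failures` is just a made-up helper name.

```python
def expected_failures(total: int, success_percent: int) -> int:
    """Expected failed actions, assuming a flat per-action failure rate (an assumption)."""
    return total * (100 - success_percent) // 100


# 85 is the framework's reported success rate, 39 is the best baseline's.
for total in (100_000, 1_000_000):
    print(
        f"{total:>9,} actions: "
        f"{expected_failures(total, 85):,} fail at 85%, "
        f"{expected_failures(total, 39):,} fail at 39%"
    )
```

Same numbers as above: 15,000 out of 100,000 and 150,000 out of 1M at 85%, versus 61,000 and 610,000 for the 39% baseline.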
Also, even if we’re talking about smaller scales like scheduling airline crews or managing machine time, how is AI not overkill? You need a relatively massive amount of hardware to get the same payoff as a handful of people. Or a “dumb” algorithm. Or a signup sheet. And now we’re adding more computing overhead on top?
AI is still a solution in search of a problem.
Well yeah, but the article is about a paper showing a strategy that improves planning capabilities compared to using LLMs as they currently are. It’s just research. They’re not saying to use it in production now, and I’d say production readiness isn’t something the researchers are even worried about for this particular artifact.