Model Evaluation and Threat Research is an AI research charity that looks into the threat of AI agents! That sounds a bit like an AI doomsday cult, and they take funding from the AI doomsday cult organisat…
The chatbots are not just LLMs, though. They run scripts in which some steps are queries to an LLM.
ok… what are you trying to point out?
That the script could incorporate some checking mechanisms and implement an “I don't know” response for when the LLM's answer fails some tests.
They already do some of that, but for other purposes: for censoring; or, as per recent news, Grok looks up Musk's opinions before answering questions; or, to make math calculations more accurate, they call out to an actual calculator; and so on…
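To make the calculator example concrete, here's a minimal sketch of that kind of routing in Python (standard library only; `maybe_calculate` and the dispatch rule are hypothetical names of mine, not anything a real chatbot actually uses):

```python
# Sketch of the "hand math to a real calculator" pattern: if the input
# parses as plain arithmetic, compute it directly instead of asking the
# LLM. Real tool routing is of course far more elaborate than this.
import ast
import operator

_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def _eval_node(node):
    # Walk the parsed expression, allowing only numbers and basic operators.
    if isinstance(node, ast.Expression):
        return _eval_node(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("not plain arithmetic")

def maybe_calculate(user_input: str):
    """Return the computed value for arithmetic input, else None (fall back to the LLM)."""
    try:
        return _eval_node(ast.parse(user_input, mode="eval"))
    except (SyntaxError, ValueError):
        return None

# maybe_calculate("12 * (3 + 4)") -> 84
# maybe_calculate("capital of France?") -> None, so the script queries the LLM
```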
They could make the LLM produce an answer A, then look up the question on Google and ask the LLM to “compare” answer A with the main Google results, looking for inconsistencies, and return “I don't know” if it's too inconsistent. It's not a rigorous test, but it's something, and I'm sure the actual devs of those chatbots could make something much better than my half-baked idea.
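A minimal sketch of that half-baked idea, assuming hypothetical `ask_llm` and `web_search` callables (they're passed in as arguments precisely because nothing here corresponds to a real chatbot's internals):

```python
# Rough sketch of the "answer, then cross-check against search" pipeline.
from typing import Callable, List

def answer_with_consistency_check(
    question: str,
    ask_llm: Callable[[str], str],           # hypothetical: prompt -> completion
    web_search: Callable[[str], List[str]],  # hypothetical: query -> result snippets
) -> str:
    # Step 1: let the LLM produce a candidate answer A.
    answer = ask_llm(question)

    # Step 2: look the question up and collect the main results.
    snippets = web_search(question)

    # Step 3: ask the LLM to compare A against the search results.
    verdict = ask_llm(
        f"Question: {question}\n"
        f"Candidate answer: {answer}\n"
        "Search results:\n" + "\n".join(snippets) + "\n"
        "Is the candidate answer consistent with the search results? "
        "Reply with exactly CONSISTENT or INCONSISTENT."
    )

    # Step 4: refuse rather than guess when the check fails.
    if "INCONSISTENT" in verdict:
        return "I don't know."
    return answer
```

The obvious weak spot is step 3: the same model grades its own answer, so a stricter version might use a separate verifier model, or require agreement across several independent sources before trusting A.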