In tweaking its chatbot to appeal to more people, OpenAI made it riskier for some of them. Now the company has made its chatbot safer. Will that undermine its quest for growth?
This reads like OpenAI’s fanfic about what happened, retconning decisions they didn’t make, things they didn’t (couldn’t!) do, and thoughts that didn’t occur to them. All of it implying that being infinitely better is not just possible, but right there for the taking.
Citation needed.
This reality does not exist: Claude tries to lick my ass clean every time I ask it a simple question, and while the sycophantic language can be toned down, coming up with a believable, positive-sounding answer to whatever the user brings is the foundational core of LLMs.
As soon as they found experts who were willing to say something other than “don’t make a chatbot”. They now have a sycophantically motivated system with an ever-growing list of sticky notes on its desk: “if sleep deprivation then alarm”, “if suicide then alarm”, “if ending life then alarm”, “if stop living then alarm”, hoping there are enough of them to catch the most obvious attempts.
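A toy sketch of the kind of “if phrase then alarm” pile being described here (entirely hypothetical, not OpenAI’s actual mechanism) and why it only catches the obvious wordings:

```python
# Hypothetical hard-coded trigger list, as imagined in the comment above.
CRISIS_TRIGGERS = [
    "sleep deprivation",
    "suicide",
    "ending my life",
    "stop living",
]

def should_alarm(message: str) -> bool:
    """Flag a message if it contains any hard-coded trigger phrase."""
    text = message.lower()
    return any(trigger in text for trigger in CRISIS_TRIGGERS)

print(should_alarm("I haven't slept in four days"))   # False: no exact phrase match
print(should_alarm("I keep thinking about suicide"))  # True
```

Anything phrased differently from the sticky notes sails straight past the check, which is why the list keeps growing.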
The study was basically rigged: it used 18 known, already-identified crisis chat logs from ChatGPT - meaning exactly the set of stuff OpenAI had already hard-coded “plz alarm” for - and thousands of “simulated mental health crises” generated by FUCKING LLMs, meaning they only tested whether ChatGPT can identify mental health problems in texts where it had written its own understanding of what a mental health crisis looks like. For fuck’s sake, of course it did perfectly at guessing its own card.
TL;DR: bullshit damage control.
The hard coding here is basically fine-tuning.
They generate a set of example cases and pair each prompt with a good and a bad response. Then they update the model weights until it does well on those cases.
So they only do this with cases they’ve seen, and they can’t really say how well it does with cases they haven’t.
Having these cases in their fine-tuning dataset will juice the results, but hopefully it also means the model actually identifies these issues correctly.
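A minimal sketch of the pairing idea described above, using a toy scoring model and a pairwise preference loss (the same family of objective as reward models / DPO). The data here is just random placeholder embeddings, not OpenAI’s pipeline; the point is that the loop only ever sees the cases in its dataset, so a good loss here says nothing about unseen phrasings:

```python
import torch
import torch.nn as nn

# Hypothetical fine-tune set: (prompt, good_response, bad_response) triples,
# already embedded as fixed-size vectors for simplicity.
dim = 16
pairs = [(torch.randn(dim), torch.randn(dim), torch.randn(dim)) for _ in range(32)]

# Toy scorer that rates a (prompt, response) pair with a single scalar.
scorer = nn.Sequential(nn.Linear(2 * dim, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)

for epoch in range(5):
    total = 0.0
    for prompt, good, bad in pairs:
        s_good = scorer(torch.cat([prompt, good]))
        s_bad = scorer(torch.cat([prompt, bad]))
        # Pairwise logistic loss: push the good response's score above the bad one's.
        loss = -torch.nn.functional.logsigmoid(s_good - s_bad).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        total += loss.item()
    print(f"epoch {epoch}: avg loss {total / len(pairs):.3f}")
```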
The other thing is that a lot of the raw data in these systems is generated by cheap workers in third-world countries who will not have a good appreciation for mental health.