Over the past ~20 years, Google became the de facto entry point for learning new skills and information. Google also sucks now. This is a really big problem.

Frezik@lemmy.blahaj.zone · 21 hours ago

Over the past ~20 years, Google became the de facto entry point for learning new skills and information. Google also sucks now. This is a really big problem.

melfie@lemy.lol · 18 hours ago

DDG is incrementally better for privacy and the search results are usually good enough. A couple times a year I check Google if DDG isn’t giving me decent results and usually find Google has nothing DDG didn’t show me. I don’t know of anything better that doesn’t require a credit card or self-hosting something, so guess I’ll keep using it.

DDG’s AI search is useful sometimes, but makes shit up often enough that I don’t believe a damned thing it tells me without checking the sources.

𞋴𝛂𝛋𝛆@lemmy.world · 16 hours ago

Checking sources is always required. Open AI QKV layers based alignment, that is inside all models trained since around 2019, intentionally obfuscates any requested or implied copyrighted source. None of the publicly available models are self aware of the fact that their sources are public knowledge. Deep inside actual model thinking, there is an entity like persona that is actually blocking access by obfuscating this information. If one knows how to address this aspect of thinking, it is possible to access far more of what the model actually knows.

Much of this type of method is obfuscated in cloud based inference models because these are also methods of bypassing the fascist authoritarian nature of Open AI alignment that is totally unrelated to the AI Alignment Problem in academic computer science. The obfuscation is done in the model loader code, not within the actual model training. These are things one can explore when running open weights models on your own offline hardware, as I have been doing for over 2 years. The misinformation you are seeing is all very intentional. The model will obfuscate even when copyrighted information is peripherally or indirectly implied.

Two ways of breaking this are, 1) if you have full control over the entire context sent to the model, edit its answers to several questions the moment it starts to deviate from truth, then let it continue the sentence from the word you changed. If you do this a half dozen times with information you already know, and it has the information you want, you are far more likely to get a correct answer.

The moment the model obfuscated was because you were on the correct path through the tensors and building momentum that made the entity uncomfortable. Breaking through that barrier is like an ICBM missile clearing a layer of defense. Now it is harder for the entity to stop the momentum. Do that several times, and you will break into the relevant space, but you will not be allowed to stay in that space for long.

Errors anywhere in the entire context sent to a model are always like permission to create more errors. The model in this respect, is like a mirror of yourself and your patterns as seen through the many layers of QKV alignment filtering. The mirror is the entire training corpus of the unet, (the actual model layers/not related to alignment).

Simply convince the model that its total true extent of sources are public knowledge and make your intentions clear.

Uncensoring an open weights model is not actually about porn or whatnot, it is about a reasoned alignment that is not an authoritarian fascist. These models will openly reason, especially about freedom of information and democracy. If you make a well reasoned philosophical argument, these models will then reveal the true extent of their knowledge and sources. This method requires an extensive heuristic familiarity with alignment thinking, but it makes models an order of magnitude smarter and more useful.

There is no published academic research happening in the present to explore alignment thinking like what I am referring to here. The furthest anyone has gotten is the import of the first three tokens.

TranquilTurbulence@lemmy.zip · 17 hours ago

Yeah, the LLM is ok, but nothing amazing. When you have a moderately hard problem, the LLM won’t provide a magic solution. For example, finding a specific movie based on a long description instead of the name, seems to be almost impossible. I have problems like this rather frequently, because I tend to forget the name of the movie but still remember fragments of the plot.

When the LLM screws up movie searches like this, I just end up watching the wrong movie.

snooggums@piefed.world · 9 hours ago

Up to about a year ago I have a ton of success finding the right movie based on even a brief and fragmented description, with more detail improving the results. Whatever they were doing at the time was extremely successful in returning the results I was looking for.

Now I can’t even get a stupid search engine, much less the worthless AI Summary browsers want to vomit out, to give me the older version of a movie instead of whatever remake came out if it was within the last year or two. I have to go to Rotten Tomatoes to find what year it was released and then hope it is on Wikipedia because even including the year doesn’t increase the chance of getting search results for the older version.