A talk (in English) from the hacker conference 39C3 on how AI-generated content was identified using a simple ISBN checksum calculator.
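
For anyone wondering what an ISBN checksum check looks like in practice, here is a minimal Python sketch of the standard ISBN-10 and ISBN-13 check-digit rules. It is not the speaker's actual tool, and the function names (`is_valid_isbn10`, `is_valid_isbn13`, `is_valid_isbn`) are made up for illustration. The idea is that a fabricated citation will often carry an ISBN whose check digit does not add up.

```python
DIGITS = "0123456789"


def _strip(isbn: str) -> str:
    """Drop the hyphens and spaces that ISBNs are usually printed with."""
    return "".join(c for c in isbn if c not in "- ")


def is_valid_isbn10(isbn: str) -> bool:
    """ISBN-10: weights 10..1, weighted sum must be divisible by 11.

    The last character may be 'X', which stands for the value 10.
    """
    chars = _strip(isbn)
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char in DIGITS:
            value = int(char)
        elif char in "xX" and position == 9:
            value = 10
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


def is_valid_isbn13(isbn: str) -> bool:
    """ISBN-13: weights alternate 1, 3; weighted sum must be divisible by 10."""
    chars = _strip(isbn)
    if len(chars) != 13 or any(c not in DIGITS for c in chars):
        return False
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(chars))
    return total % 10 == 0


def is_valid_isbn(isbn: str) -> bool:
    """Accept either form (hypothetical convenience wrapper)."""
    return is_valid_isbn10(isbn) or is_valid_isbn13(isbn)


if __name__ == "__main__":
    for candidate in ["0-306-40615-2", "978-0-306-40615-7", "978-0-306-40615-8"]:
        print(candidate, is_valid_isbn(candidate))
```

Note that a passing checksum only shows the number is well-formed, not that the book exists. A failing one is the useful signal: a real ISBN copied correctly from a real book should always pass.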

  • fubarx@lemmy.world

    He notes that LLM vendors have been training their models on Wikipedia content. But if that content itself contains incorrect information and citations, you get the sort of circular (and wrong) referencing that spreads misinformation.

    One irony, he says, is that LLM vendors are now willing to pay for training data unpolluted by the hallucinated output their own products generate.