I don’t want to do that. Instead, I let it suggest better phrasing and word choices; basically, it acts as a better editor.
This!
Locally, I actually keep a long-context LLM that can fit all or most of the story, and sometimes use it as an “autocomplete.” For instance, when my brain freezes and I can’t finish a sentence, I see what it suggests. If I’m hunting for the next word, I let it generate one token and look at the logprobs for all the words it considered, kind of like a thesaurus sorted by contextual relevance.
This is only practical locally: with prompt caching you get instant, free responses even when ingesting (say) an 80K-word block of text.
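For anyone curious what the logprobs trick looks like in practice, here’s a minimal sketch using Hugging Face transformers; the model name is a placeholder, and any local causal LM should work the same way:

```python
# Sketch: "logprobs as a contextual thesaurus" for the next word.
# Assumption: a local causal LM via Hugging Face transformers
# (model name below is just a placeholder).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "mistralai/Mistral-7B-v0.1"  # placeholder; use whatever you run locally
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.float16, device_map="auto"
)

def next_word_candidates(story_so_far: str, k: int = 15):
    """Return the top-k next tokens with probabilities, i.e. a
    thesaurus sorted by contextual relevance."""
    inputs = tok(story_so_far, return_tensors="pt").to(model.device)
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # logits for the next token only
    probs = torch.softmax(logits, dim=-1)
    top = torch.topk(probs, k)
    return [(tok.decode(int(i)), p.item()) for i, p in zip(top.indices, top.values)]

for word, p in next_word_candidates(
    "The rain fell in sheets, and the road ahead looked"
):
    print(f"{p:6.3f}  {word!r}")
```

The instant/free part comes from keeping the prompt’s KV cache warm: servers like llama.cpp reuse the cached prefix between requests, so the 80K-word ingest is only paid once.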