davanstrien posted an update Aug 7
Is your summer reading list still empty? Curious if an LLM can generate a book blurb you'd enjoy and help build a KTO preference dataset at the same time?

A demo using Hugging Face Spaces and Gradio to collect LLM output preferences: davanstrien/would-you-read-it
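The idea is simple to sketch. Below is a hypothetical, stripped-down version of such a Space: Gradio shows a model-written blurb, and each thumbs-up/down vote is appended as a KTO-style record (prompt, completion, binary label). The canned blurbs, filename, and handlers are illustrative, not the actual Space's code:

```python
import json
import random

import gradio as gr

# Canned (prompt, blurb) pairs stand in for live LLM generations.
BLURBS = [
    ("Write a blurb for a cozy mystery set in a lighthouse.",
     "When the lamp goes dark, so does the town's last secret..."),
    ("Write a blurb for a space-opera debut.",
     "One stolen starship. Two rival empires. Zero good options."),
]

def new_blurb():
    """Pick a blurb to show the visitor."""
    return random.choice(BLURBS)

def record(prompt, completion, liked):
    """Append one KTO record: an unpaired completion plus a binary label."""
    with open("kto_prefs.jsonl", "a") as f:
        f.write(json.dumps({"prompt": prompt,
                            "completion": completion,
                            "label": liked}) + "\n")
    return "Thanks, vote saved!"

with gr.Blocks() as demo:
    prompt = gr.Textbox(label="Prompt")
    blurb = gr.Textbox(label="Blurb", lines=4)
    status = gr.Markdown()
    with gr.Row():
        yes = gr.Button("Would read")
        no = gr.Button("Would not read")
    demo.load(new_blurb, outputs=[prompt, blurb])
    yes.click(lambda p, c: record(p, c, True), [prompt, blurb], status)
    no.click(lambda p, c: record(p, c, False), [prompt, blurb], status)

demo.launch()
```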

Very interesting. As you aptly pointed out, creative writing is one of the tasks the community is currently focusing on, and I believe we can do a significantly better job than Anthropic or ClosedAI.

Making an LLM for creative writing might seem like a trivial task, but it is not. Can we create datasets or LLMs that write based on a prompt? Sure. But would the output be any good? Not quite. That's the challenging part.

I have started an ambitious project, LLAMA-3_8B_Unaligned, and I am more than 1,000 work hours into it...

My approach is quite different, though. Instead of using KTO/DPO, etc., I "just" want the LLM to follow writing instructions extremely well while completely altering the token probability distribution so that the output doesn't resemble machine-written text (aka "SLOP").
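For context on the KTO/DPO distinction: DPO learns from paired comparisons, while KTO only needs a single completion with a binary desirability label, which is why thumbs-up/down feedback maps onto it so naturally. The records below are invented, shown only to illustrate the two dataset shapes:

```python
# DPO: paired preference -- the same prompt with a chosen and a rejected answer.
dpo_record = {
    "prompt": "Write a blurb for a heist novel.",
    "chosen": "A cat burglar with a conscience takes one last, impossible job.",
    "rejected": "This book is about a heist. It is very exciting.",
}

# KTO: unpaired -- one completion and a binary "desirable?" label per example.
kto_record = {
    "prompt": "Write a blurb for a heist novel.",
    "completion": "A cat burglar with a conscience takes one last, impossible job.",
    "label": True,
}
```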

I would love to follow your project's progress, and I think the holy grail would be to see about 5-10 short books that were at least 95% written by AI.

Fun fact: only a handful of LLMs can split a LONG text into paragraphs correctly with a low error rate. That's one of many problems I didn't expect to encounter. As I said, this "trivial" task is way harder than it seems! Please keep us updated.
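A quick aside on that paragraph-splitting problem: since a correct split should only insert line breaks and change nothing else, one cheap automatic check is to compare whitespace-normalized versions of input and output. A minimal sketch (pure string handling; the model call itself is left out):

```python
import re

def only_breaks_added(original: str, split_output: str) -> bool:
    """True if the model's output differs from the input only in whitespace,
    i.e. it inserted paragraph breaks without rewriting any of the text."""
    def _norm(s: str) -> str:
        return re.sub(r"\s+", " ", s).strip()
    return _norm(original) == _norm(split_output)

text = "First idea. Second idea. A third idea that belongs with the second."
good = "First idea.\n\nSecond idea. A third idea that belongs with the second."
bad = "First idea.\n\nSecond idea, plus a third idea it quietly rewrote."

print(only_breaks_added(text, good))  # True: only breaks were inserted
print(only_breaks_added(text, bad))   # False: the model altered the text
```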