About books, machine learning, and copyright issues

Everything written by AI can be considered an amalgamation of material that already exists.

“Machines can generate new media by modeling and recombining corpora of existing media; this is not fundamentally different from the human creative process of observation and reformation.” (Elson 22)

To read about how AI is trained, click the link below:

“Artificial Intelligence.” The Johns Hopkins Guide to Digital Media

Did you know that if you’ve published a book, there is a way to find out whether it has been used to train AI models?

“The Books3 dataset contains 183,000 books, downloaded from pirate sources. We know that companies like Meta (creators of LLaMA), EleutherAI, and Bloomberg have used it to train their language models. OpenAI has not disclosed training information about GPT 3.5 or GPT 4—the models underlying ChatGPT—so we don’t know whether it also used Books3. Regardless of whether GPT was trained on Books3, the class action lawsuits against OpenAI should uncover more information on the datasets used by OpenAI, which we believe also include books obtained from pirate sources.”

To read this article, click the link below:

The Authors Guild