Controversy Surrounds OpenAI’s Training of ChatGPT on Copyrighted Works, Harry Potter Included
OpenAI, a leading artificial intelligence company, has found itself at the center of a heated debate following allegations that its language model, ChatGPT, was trained on copyrighted works, including the beloved Harry Potter series by JK Rowling. The allegations have sparked outrage, particularly from the author herself.
A recent study conducted by David Bamman and his team at the University of California, Berkeley, brought this issue to light. The study found that OpenAI’s ChatGPT and other language models were trained extensively on text from the internet. While this may seem innocuous, the concern arises from the inclusion of copyrighted books, such as those authored by Rowling.
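The researchers extracted approximately 100 passages from nearly 600 fiction books, each featuring a unique named character. After removing the names, they tasked the AI with filling in the missing names; a model that can restore a name from such a passage is likely to have encountered the book during training. Works by Lewis Carroll and Rowling’s Harry Potter series were among the books examined. The crucial difference between them is that while Carroll’s works are in the public domain, Rowling’s Harry Potter series is still protected by copyright.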
The results suggest that OpenAI’s AI was trained on a significant amount of material from both Carroll’s and Rowling’s books. Harry Potter is not the only copyrighted work potentially affected. Other works, including Ray Bradbury’s Fahrenheit 451 and George RR Martin’s A Game of Thrones, are also protected by copyright and are suspected to have been part of the AI’s training data.
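The name-masking test described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the researchers’ actual code: the function names, the [MASK] token, and the guess_name callable (a stand-in for whatever model is being queried) are all assumptions, and the sample sentences are invented placeholders rather than quotes from any book.

```python
import re
from typing import Callable, Iterable, Tuple

MASK = "[MASK]"  # assumed placeholder token, chosen for illustration


def mask_name(passage: str, name: str) -> str:
    """Replace every whole-word occurrence of `name` with the mask token."""
    pattern = r"\b" + re.escape(name) + r"\b"
    return re.sub(pattern, MASK, passage)


def cloze_accuracy(
    examples: Iterable[Tuple[str, str]],
    guess_name: Callable[[str], str],
) -> float:
    """Fraction of masked passages for which the guesser returns the right name.

    `examples` holds (passage, character_name) pairs. `guess_name` is a
    hypothetical stand-in for the model under test: it receives the masked
    passage and returns its guess for the missing name.
    """
    examples = list(examples)
    if not examples:
        return 0.0
    correct = 0
    for passage, name in examples:
        guess = guess_name(mask_name(passage, name))
        if guess.strip().lower() == name.lower():
            correct += 1
    return correct / len(examples)


if __name__ == "__main__":
    # Invented sentences, used only to exercise the functions.
    samples = [
        ("Mira opened the door and looked at the rain.", "Mira"),
        ("Nobody had seen Tobin since the harvest.", "Tobin"),
    ]
    # A trivial guesser that always answers "Mira"; a real test would query a model.
    print(cloze_accuracy(samples, lambda masked: "Mira"))
```

A high score on a given book would hint that the model had seen its text before, while a low score would be consistent with the book being absent from the training data.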
Andrés Guadamuz, an artificial intelligence specialist at the University of Sussex, acknowledges the legal complexities of the situation. He points out that OpenAI’s training process involves the use of online content that may include both legitimate quotes from various sources and possibly pirated copies.
David Bamman believes that the question of authors’ rights over works used to train OpenAI’s AI will likely be decided through court cases in the coming months.
As the controversy deepens, it remains to be seen how copyright holders will respond to the alleged use of their protected works in training this AI. The implications of this situation have raised important legal and ethical questions regarding the use of copyrighted materials in the development of artificial intelligence.
With various perspectives at play, it is essential to balance the rights of authors against the advancement of AI technology. As the legal landscape evolves, significant discussions and potential legal battles are likely to determine the boundaries of AI training and the protection of intellectual property.
This development serves as a reminder of the challenges that arise in the fast-evolving world of artificial intelligence, and highlights the need for clear regulations and guidelines to govern its development and use.