The United States Copyright Office has recently opened an investigation into the copyright law and policy concerns raised by generative AI (genAI). The agency is examining whether new federal laws or regulations are needed to address these issues. Because the language models underlying genAI are trained on copyrighted works, the Copyright Office is investigating what constitutes adequate disclosure and transparency in this context. It is also examining the legal status of AI-generated works, including works that mimic the style of human artists.
The Copyright Office’s AI Initiative
The Copyright Office has been looking into claims of copyright infringement involving genAI and other AI technologies since March, when it launched its AI initiative. To solicit input and answer questions from interested parties, the agency has so far held four public sessions and two webinars. It is now soliciting public feedback to inform its regulatory efforts, to advise Congress, and to better serve as a resource for citizens, courts, and government agencies dealing with copyright infringement.
According to Shira Perlmutter, Register of Copyrights and Director of the US Copyright Office, it is crucial to address the thorny problems raised by generative AI. “We look forward to continuing to examine these issues of vital importance to the evolution of technology and the future of human creativity,” she said.
The Media Worries About Copyright Violation
The media and entertainment industry has long been troubled by the problem of AI copyright infringement. The Writers Guild of America (WGA), Sarah Silverman, Christopher Golden, and Richard Kadrey are among those who have accused genAI developers OpenAI and Meta of copyright infringement. The WGA has even proposed legislating against the use of computer-generated media.
Uncertainty over who owns the data used in an AI model is a significant obstacle to preventing copyright violations. In many cases, tech companies simply host these AI models on their platforms without claiming any ownership of the resulting content. The lack of clarity around ownership of both the content and the output has caused many organizations to reevaluate their AI efforts, which makes a regulatory framework for these issues all the more important.
The Biden Administration’s Efforts
The risks of genAI were recently brought to light, and the Biden administration has been working to mitigate them. The administration met with the heads of AI companies including Google, Microsoft, OpenAI, and the up-and-coming Anthropic. The rules that emerged from these discussions, however, were only meant to serve as suggestions and have no binding force.
Avivah Litan, a distinguished vice president analyst at Gartner, has described the copyright problems in genAI model training as a collision between antiquated regulations and cutting-edge technologies. Litan is unsure what course of action would resolve this conflict.
Teaching Machine Learning Models and Verifying Content
GenAI generates content such as text, voice, images, and video using large language models (LLMs) that have billions or even trillions of parameters. To produce tailored results, these LLMs must be trained on data drawn from a wide range of sources, including public data from the internet and proprietary data uploaded by businesses.
The provenance of media content generated by genAI can now be certified thanks to emerging standards for training AI models and authenticating content, such as the specification from the Coalition for Content Provenance and Authenticity (C2PA). Such standards can help identify which copyrighted training materials were used and attribute them properly in AI-generated answers.
License information and references back to the original code repository are standard features of code generation tools such as CodeWhisperer and GitHub Copilot. Litan proposes that the industry apply a similar concept to copyrighted materials as a means of identifying and sourcing the copyrighted content used in genAI models.
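To make the idea of attaching license information and source references to generated content more concrete, here is a minimal Python sketch. It is purely illustrative: the `SourceReference` and `AttributedOutput` names, their fields, and the manifest layout are assumptions made for this example. It is not the C2PA manifest format, and it does not reflect how CodeWhisperer or GitHub Copilot work internally.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class SourceReference:
    """One source a generated output draws on (hypothetical schema)."""
    title: str
    url: str
    license: str  # free-text license label; the values used are assumptions


@dataclass
class AttributedOutput:
    """Generated text bundled with the sources it is attributed to."""
    text: str
    sources: List[SourceReference] = field(default_factory=list)

    def manifest(self) -> Dict[str, object]:
        # Hash the text so the record can later be matched to this exact output.
        digest = hashlib.sha256(self.text.encode("utf-8")).hexdigest()
        return {
            "content_sha256": digest,
            "sources": [asdict(source) for source in self.sources],
        }

    def needs_review(self, permitted_licenses: Set[str]) -> List[SourceReference]:
        # Return sources whose license label is not on the permitted list.
        return [s for s in self.sources if s.license not in permitted_licenses]


if __name__ == "__main__":
    output = AttributedOutput(
        text="A short generated summary.",
        sources=[
            SourceReference("Open article", "https://example.com/open", "CC-BY-4.0"),
            SourceReference("Novel excerpt", "https://example.com/novel", "All rights reserved"),
        ],
    )
    print(json.dumps(output.manifest(), indent=2))
    for source in output.needs_review({"CC-BY-4.0", "CC0-1.0"}):
        print("Needs licensing review:", source.title)
```

In this sketch the record travels with the output, so a reviewer can see which sources were cited and which of them carry a license outside the permitted set. How a real model would determine those sources in the first place is the hard part, and it is not addressed here.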
Implementing these solutions, whether retroactively or prospectively, will take time and effort. According to Litan, however, regulators must set appropriate standards and policies to guarantee the lawful use of copyrighted material in genAI.
See first source: Computerworld
Q1: What is the United States Copyright Office investigating regarding genAI?
A: The Copyright Office is investigating copyright law and policy concerns related to genAI technology. This includes issues surrounding disclosure and transparency, the legal status of AI-generated works, and copyright infringement concerns.
Q2: Why is the Copyright Office conducting this investigation?
A: The investigation aims to determine if new federal laws or regulations are necessary to address copyright issues arising from genAI technology, which uses copyrighted works for training language models and generates AI-created content.
Q3: How has the Copyright Office gathered input on this matter?
A: The Copyright Office launched an AI initiative and has held public sessions and webinars to solicit input from interested parties. They are actively seeking public feedback to inform regulatory efforts and advise Congress.
Q4: What is the stance of the Biden administration on genAI copyright issues?
A: The Biden administration has been addressing the risks of genAI by engaging with AI company leaders, but any rules resulting from these discussions are suggestions and not binding regulations.
Q5: What copyright concerns have been raised by the media and entertainment industry?
A: The media industry, including notable figures and organizations like the Writers Guild, has accused genAI developers of copyright infringement. The lack of clarity in ownership of data used in AI models has contributed to these concerns.
Q6: How can the provenance of AI-generated media content be certified?
A: Standards such as the Coalition for Content Provenance and Authenticity (C2PA) help verify the origin of media content generated by genAI. These standards aid in identifying copyrighted content and attributing it correctly.
Q7: What are some proposed solutions to address copyright concerns in genAI?
A: Proposed solutions include attaching license information and references to the original source of copyrighted content, much as code generation tools already do for code repositories. Regulators could also set standards and policies to ensure the lawful use of copyrighted material in genAI.
Q8: What is Gartner distinguished vice president analyst Avivah Litan’s perspective on copyright issues in genAI?
A: Avivah Litan describes copyright concerns in genAI as a conflict between outdated regulations and advanced technologies. She is unsure what course of action would resolve this conflict.