
Copyright Office Hosts Listening Session on Copyright in AI-Generated Audiovisual Works

Posted June 26, 2023
Photo by Jon Tyson on Unsplash

On May 17, the Copyright Office held a listening session on the topic of copyright issues in AI-generated audiovisual works. You may remember that we’ve covered the other listening sessions convened by the Office on visual arts, musical works, and textual works (in which we also participated). In today’s post, we’ll summarize and discuss the audiovisual works listening session and offer some insights on the conversation.

Participants in the audiovisual works listening session included AI developers in the audiovisual space such as Roblox and Hidden Door; trade groups and professional organizations including the Motion Picture Association, Writers Guild of America West, and National Association of Broadcasters; and individual filmmakers and game developers. 

Generative AI Tools in Films and Video Games

As was the case in the music listening session, multiple participants indicated that generative AI is already being used in film production. The representative from the Motion Picture Association (MPA) explained that “innovative studios” are already using generative AI in both the production and post-production processes. As with other creative industries, generative AI tools can support filmmakers by increasing the efficiency of various tasks that are part of the filmmaking process. For example, routine tasks like color correction and blurring or sharpening particular frames are made much simpler and quicker through the use of AI tools. Other participants discussed the ways in which generative AI can help with ideation, overcoming “creativity blocks,” eliminating some of the drudgery of filmmaking, enhancing visual effects, and lowering barriers to entry for would-be filmmakers without the resources of more established players. These examples are analogous to the various ways that generative AI can support authors, which Authors Alliance and others have discussed, like brainstorming, developing characters, and generating ideas for new works.

The representative from the MPA also emphasized the potential for AI tools to “enhance the viewer experience” by making visual effects more dramatic, and in the longer term, possibly enable much deeper experiences like having conversations with fictional characters from films. The representative from Hidden Door—a company that builds “online social role-playing games for groups of people to come together and tell stories together”—similarly spoke of new ways for audiences to engage with creators, such as by creating a sort of fan fiction world with the use of generative AI tools, with contributions from the author, the user, and the generative AI system. And in fact, this can create “new economic opportunities” for authors, who can monetize their content in new and innovative ways. 

Video games are similarly already incorporating generative AI. In fact, generative AI’s antecedents, such as “procedural content generation” and “rule-based systems,” have been used in video games since their inception.

Centering Human Creators

Throughout the listening session, participants emphasized the role of human filmmakers and game developers in creating works involving AI-generated elements, stating or implying that creators who use generative AI should own copyrights in the works they produce using these tools. The representative from Roblox, an online gaming platform that allows users to program games and play other users’ games, emphasized that AI-generated content is effective and engaging because of the human creativity inherent in “select[ing] the best output” and making other creative decisions. A representative from Inworld AI, a developer platform for AI characters, echoed this idea, explaining that these tools do not exist in isolation, but are productive only when a human uses them and makes creative choices about their use, akin to the use of a simpler tool like a camera or paintbrush. 

A concern expressed by several participants—including the Writers Guild of America West, the National Association of Broadcasters, and the Directors Guild—is that works created using generative AI tools could devalue works created by humans without such tools. The idea of markets being “oversaturated” with competing audiovisual works raises the possibility that individual creators could be crowded out. While this is far from certain, it reflects increasing concern over threats to creators’ economic livelihoods when AI-generated works compete alongside their own.

Training Data and Fair Use

On the question of whether using copyrighted training materials to train generative AI systems is a fair use, there was disagreement among participants. The representative from the Presentation Guild likened the use of copyrighted training data without permission to “entire works . . . being stolen outright.” They further said that fair use does not allow this type of use due to the commercial nature of the generative AI companies, the creative nature of the works used to train the systems (though it is worth noting that factual works, and others entitled only to “thin” copyright protection, are also used to train these tools), and because by “wrest[ing] from the creator ownership and control of their own work[,]” the market value for those works is harmed. This is not, in my view, an accurate account of how the market-effects factor works in fair use analysis, because unauthorized uses that are also fair always wrest some control from the author—this is part of copyright’s balance between an author’s rights and permitting onward fair uses. 

The representative from the Writers Guild of America (“WGA”) West—which is currently on strike over, among other things, the role of generative AI in television writing—had some interesting things to say about the use of copyrighted works as training data for generative AI systems. In contract negotiations, WGA had proposed a contract which “would . . . prohibit companies from using material written under the Guild’s agreement to train AI programs for the purpose of creating other derivative and potentially infringing works.” The companies refused to acquiesce, arguing that “the technology is new and they’re not inclined to limit their ability to use this new technology in the future.” The companies’ positions are somewhat similar to those expressed by us and many others—that while generative AI remains in its nascency, it is sensible to allow it to continue to develop before instituting new laws and regulations. But it does show the tension between this idea and creators who feel that their livelihoods may be threatened by generative AI’s potential to create works with less input from human authors. 

Other participants, such as the representative from Storyblocks, a stock video licensing company, emphasized their belief that creators of the works used to train generative AI tools should be required to consent, and should receive compensation and credit for the use of their works to train these models. This so-called “three C’s” idea has gained traction in recent months. In my view, the use of training data is a fair use, making these requirements unnecessary from a copyright perspective, but it represents an increasingly prevalent view among rightsholders and licensing groups (including the WGA, motivating its ongoing strike in some respects) when it comes to making the use of generative AI tools more ethical. 

Adequacy of Registration Guidance

Several participants expressed concerns about the Copyright Office’s recent registration guidance regarding works containing AI-generated materials, and specifically how to draw the line between human-authored and AI-generated works when generative AI tools are used as part of a human’s creative process. The MPA representative explained that the guidance does not account for the subtle ways in which generative AI tools are used as part of the filmmaking process, where it often works as a component of various editing and production tools. The MPA representative argued that using these kinds of tools shouldn’t make parts of films unprotected by copyright or trigger a need to disclose minor uses of such tools in copyright registration applications. The representative from Roblox echoed these concerns, noting that when a video game involves thousands of lines of code, it would be difficult for a developer to disclaim copyright in certain lines of code that were AI-generated. 

A game developer and professor expressed her view that in the realm of video games, we are approaching a reality where generative AI is “so integrated into a lot of consumer-grade tools that people are going to find it impossible to be able to disclose AI usage.” If creators or users do not even realize they are using generative AI when they use various creative digital tools, the Office’s requirement that registrants disclose their use of generative AI in copyright registration applications will be difficult, if not impossible, to follow.

The JCPA, Again

Posted June 15, 2023
Photo by AbsolutVision on Unsplash

For those of you following along, you’ve seen the numerous posts we’ve made about the Journalism Competition and Preservation Act, e.g., here, here, and here. The bill, which supports neither competition nor the preservation of journalism, does have a really compelling story. Its apparent goal is to bolster local newsrooms and journalists by making it easier for them to negotiate with companies like Google or Meta (which link to news content), adding revenue to aid in their operations. 

Today’s update is that the JCPA is a little closer to becoming law, with the Senate Judiciary Committee voting to move the bill forward on a 14-7 vote. We again joined a group of more than two dozen civil society organizations in opposing the bill in this letter led by Public Knowledge. We also joined a large group of organizations opposing a very similar bill that was introduced earlier this year in California. 

While the bill has some wonderful goals, it seems destined to fail at achieving them, while doing real damage to the broader online information ecosystem. As we’ve detailed before, the JCPA seems to create a pseudo-copyright regime in which platforms would have to pay for linking to news, which is a radical change in how the internet functions. It also includes provisions that would effectively force social media platforms to carry certain news outlet coverage, even when a platform disagrees with the views that those news outlets express, thus undermining Section 230 protections for platforms that want to remove false or misleading content from their websites. 

As for the actual competition issues, the bill has also been contorted so that its aims–competition and support for small news outlets–have been co-opted by the biggest commercial publishers. For example, the bill’s supporters say it doesn’t benefit the biggest news outlets, but its cap of 1,500 employees would exclude a grand total of *3* of the largest newspapers in the US, while the JCPA’s minimum threshold of $100,000 in revenue would leave out the smallest, most vulnerable newsrooms. Further, that numerical cap doesn’t apply to broadcasters at all, which means the bill actually favors companies like News Corp., Sinclair, iHeartRadio, and NBCU. 

The Senate Judiciary Committee markup earlier today (you can watch the recording here) was relatively tame, but it was clear that there was very little agreement about what the bill would actually accomplish, or what its unintended consequences might be. The recurring theme throughout was that something must be done to protect and support journalism and that it is unfair that big tech companies are reaping incredible profits while small news publishers are getting very little of the financial pie and are struggling to survive. While we agree with both of these propositions, unfortunately, the JCPA seems uniquely ineffective at fixing the problem. 

Supreme Court Announces Decision in Jack Daniel’s v. VIP Products

Posted June 8, 2023
Photo by Chris Jarvis on Unsplash

Today, the Supreme Court handed down a decision in Jack Daniel’s v. VIP Products, a trademark case about the right to parody popular brands, in which Authors Alliance submitted an amicus brief, supported by the Harvard Cyberlaw Clinic. In a unanimous decision, the Court vacated the Ninth Circuit’s decision and remanded, asking the lower courts to re-hear the case under a new, albeit very narrow, principle announced by the Court: special First Amendment review is not appropriate in cases where one brand’s trademark is used in another, even when used as a parody. In addition to the majority opinion delivered by Justice Kagan, there were two concurring opinions, by Justice Sotomayor and Justice Gorsuch, each joined by other justices. 

Case Background

The case concerns a dog toy that parodies Jack Daniel’s famous Tennessee Whiskey bottle, using some of the familiar features from the bottle, and bearing the label “Bad Spaniels.” After discovering the dog toy, Jack Daniel’s requested that VIP cease selling the toys. VIP Products refused, then proceeded to file a trademark suit, asking for a declaratory judgment that its toy “neither infringed nor diluted Jack Daniel’s trademarks.” Jack Daniel’s then countersued to enforce its trademark, arguing that the Bad Spaniels toy infringed its trademark and diluted its brand. We became interested in the case because of its implications for creators of all sorts (beyond companies making humorous parody products). 

As we explain in our amicus brief, authors rely on their ability to use popular brands in their works. For example, fiction authors might send their characters to real-life colleges and universities, set scenes where characters dine at real-life restaurant chains, and use other cultural touchstones to enrich their works and ultimately, to express themselves. While the case is about trademark, the First Amendment looms large in the background. A creator’s right to parody brands, stories, and other cultural objects is an important part of our First Amendment rights, and is particularly important for authors. 

Trademark law is about protecting consumers from being confused as to the source of the goods and services they purchase. But it is important that trademark law be enforced consistent with the First Amendment and its guarantees of free expression. And importantly, trademark litigation is hugely expensive, often involving costly consumer surveys and multiple rounds of hearings and appeals. We are concerned that even the threat of litigation could create a chilling effect on authors, who might sensibly decide not to use popular brands in their works based on the possibility of being sued. 

In our brief, we suggested that the Court implement a framework like the one established by the Second Circuit in Rogers v. Grimaldi, “a threshold test . . . designed to protect First Amendment interests in the trademark context.” Under Rogers, in cases of creative expressive works, trademark infringement should only come into play “where the public interest in avoiding consumer confusion outweighs the public interest in free expression.” It establishes that trademark law should only be applied where the use “has no artistic relevance to the underlying work whatsoever, or, if it has some artistic relevance, unless the [second work] explicitly misleads as to the source or the content of the work.”

The Supreme Court’s Decision

In today’s decision, the Court held that “[w]hen an alleged infringer uses a trademark as a designation of source for the infringer’s own goods, the Rogers test does not apply.” Without directly taking a position on the viability of the Rogers test, the Court found that, in this circumstance, where it believed that VIP Products used Jack Daniel’s trademarks as “source identifiers,” the test was inapplicable. It held that the Rogers test is not appropriate when the accused infringer has used a trademark to designate the source of its own goods—in other words, “has used a trademark as a trademark.” The fact that the dog toy had “expressive content” did not disturb this conclusion. 

Describing Rogers as a noncommercial exclusion, the Court said that VIP’s use was commercial, as it was on a dog toy available for sale (i.e. a commercial product). Further supporting this conclusion, the Court pointed to the fact that VIP Products had registered a trademark in “Bad Spaniels.” It found that the Ninth Circuit’s interpretation of the “noncommercial use exception” was overly broad, noting that the Rogers case itself concerned a film, an expressive work entitled to the highest First Amendment protection, and vacating the lower court’s decision. 

The Court instead directed the lower court to consider a different inquiry, whether consumers will be confused as to whether Bad Spaniels is associated with Jack Daniel’s, rather than focusing on the expressive elements of the Bad Spaniels toy. But the Court also explained that “a trademark’s expressive message—particularly a parodic one, as VIP asserts—may properly figure in assessing the likelihood of confusion.” In other words, the fact that the Bad Spaniels toy is (at least in our view) a clear parody of Jack Daniel’s may make it more likely that consumers are not confused into thinking that Jack Daniel’s is associated with the toy. In her concurrence, Justice Sotomayor underscored this point by cautioning lower courts against relying too heavily on survey evidence when deciding whether consumers are confused “in the context of parodies and potentially other uses implicating First Amendment concerns.” In so doing, Justice Sotomayor emphasized the importance of parody as a form of First Amendment expression. 

The Court’s decision is quite narrow. It does not disapprove of the Rogers test in other contexts, such as when a trademark is used in an expressive work, and as such, it is unlikely to have a large impact on authors using brands and marks in their books and other creative expression. Lower courts across the country that do use the Rogers test may continue to do so under VIP Products. However, Justice Gorsuch’s concurrence does express some skepticism about the Rogers test and its applications, cautioning lower courts to handle the test with care. As a concurrence, though, this opinion has much less precedential effect than the majority’s. 

Remaining Questions

All of this being said, the Court does not explain why First Amendment scrutiny should not apply in this case, but merely reiterates that Rogers as a doctrine is and has always been “cabined,” with “the Rogers test [applying] only to cases involving ‘non-trademark uses[.]’” The Court relies on that history and precedent rather than explaining the reasoning. Nor does the Court discuss the relevance of the commercial/noncommercial use distinction when it comes to the role of the First Amendment in trademark law. In our view, the Bad Spaniels toy did contain some highly expressive elements and functioned as a parody, so this omission is significant. And it may create some confusion for closer cases—at one point, Justice Kagan explains that “the likelihood-of-confusion inquiry does enough work to account for the interest in free expression,” “except, potentially, in rare situations.” We are left to wonder what those are. She further notes that the Court’s narrow decision “does not decide how far the ‘noncommercial use exclusion’ goes.” This may leave lower courts without sufficient guidance as to the reach of the noncommercial use exclusion from trademark liability and what “rare situations” merit application of the Rogers test to commercial or quasi-commercial uses.

Read your open access publishing agreements, or: how you might accidentally give Elsevier or Wiley the exclusive right to profit from your OA article

Posted June 5, 2023

Reading publishing agreements–even for short academic articles–can be extremely time-consuming. For many academic publishers, you’ll find an array of information about your rights and obligations as an author, often spread across multiple websites and guides, in addition to the publishing contract itself. It’s tempting to just assume that these terms are standard and reasonable. For open access publications, I’ve unfortunately found this attitude to be especially prevalent because authors tend to think that by publishing on an OA basis, the only contract terms that really matter are those of the Creative Commons license they choose for their article.

That can be a dangerous strategy. Elsevier and Wiley OA publishing agreements, which have long-standing issues along these lines as noted here, here, here, and here, highlight the problem really well.

Those publishing agreements do provide what many authors want in OA publishing–free online access and broad reuse rights to users. But, if authors select the wrong option, they are also giving away their own residual rights while granting Elsevier or Wiley the exclusive right to commercially exploit their work. That includes the right for those publishers to exclude the author herself from making or authorizing even the most basic of commercial uses, such as posting the article to a for-profit repository like ResearchGate or even SSRN. This is not a result I think most authors intend, but it’s hard to spot the problem unless you read these publication agreements carefully. 

Let’s dig into the agreements to understand what’s going on. 

CC License Restrictions and Some Thoughts on Why Authors Choose Them

First, a quick primer on open access licensing (you can read a longer introduction and overview of open access in our dedicated guide on the topic). Just about every major academic publisher now offers some option to make your scholarly article available open access. I won’t get into the debate about what exactly constitutes “open access.” I think it’s sufficient to say that for most authors, “open access” means at minimum free online access to the work combined with some grant of permissive reuse rights to readers. While there are some exceptions, Creative Commons licenses have emerged as the de facto default legal infrastructure through which those reuse rights are granted.

Creative Commons licenses give rightsholders a number of options to exercise control over their work even while freely distributing it. The most common and basic CC license, CC-BY, does so by allowing basically all types of reuse (copying, commercial distribution, creation of derivative works) on the condition that the reuser appropriately attribute the original work. Creative Commons also has other licenses that limit downstream reuse in a few ways. Two of the most common for scholarly works are CC-BY-NC and CC-BY-NC-ND, which respectively limit reuse to non-commercial uses (non-commercial or “NC”) and limit reuses to disallow distribution of derivative works (no derivatives or “ND”). Creative Commons also offers a CC-BY-ND license, which permits commercial uses but not the distribution of derivative works, but this is a less popular option. OpenAlex (an awesome research tool from OurResearch) indicates that there are some 5.5+ million scholarly works (mostly articles and similar) published under CC-BY-NC and CC-BY-NC-ND licenses. 

In my experience, authors select these more restrictive licenses for a few reasons. Typically, authors will select a non-derivatives (ND) license because they’re concerned about some downstream user modifying their work and creating a new work that misrepresents the original or that is just of poor quality (think of a bad translation). For those authors, they want a say in how their work is built upon to create new derivatives. I’ve found this to be especially important to authors of controversial works that could be recast or adapted in ways that don’t include appropriate context. 

For authors selecting the non-commercial (NC) license restriction, the reasons are more varied, but I typically hear concern about others profiting from the work without consent, especially from authors who are attuned to the problems of large corporate interests that may seek to republish their work for a profit without the author’s input. 

The Elsevier and Wiley OA Publishing Agreements

I have never had an author say that they selected a CC-BY-NC or CC-BY-NC-ND license because they wanted to be sure that only their large, multinational commercial publisher could profit from their article, to the exclusion of everyone else including the author herself. Yet, if you read these agreements closely, that’s exactly what some publishers’ agreements do. 

Let’s start with Elsevier. Its agreement is at least somewhat upfront about what’s going on. Elsevier’s sample CC-BY-NC publishing agreement states in the first paragraph that the author grants Elsevier “an exclusive publishing and distribution license in the manuscript identified above . . . in print, electronic and all other media (whether now known or later developed), in any form, in all languages, throughout the world, for the full term of copyright, and the right to license others to do the same[.]”

The key word in that license grant is “exclusive,” which means that Elsevier has the right to exclude everyone else (including the author) from using the article, except as agreed through the CC-BY-NC-ND license. In case there was any doubt, Elsevier makes clear on the same page that “I understand that the license of publishing rights I have granted to the Journal gives the Journal the exclusive right to make or sub-license commercial use.” The agreement does include a carve-out allowing authors some narrow categories of reuse that may go beyond the CC-BY-NC-ND license (e.g., lengthening the article to book form), but these are a far cry from the rights the author would otherwise have retained had he or she kept copyright and granted Elsevier a simple non-exclusive license to publish the article. 

The Wiley journal agreement ultimately accomplishes a similar result, though in my opinion it is a bit more misleading. First, authors will find Wiley’s OA sample publishing agreements through a page that advertises “Retain copyright with a Creative Commons license.” It states, innocently, that “with Creative Commons licenses, the author retains copyright and the public is allowed to reuse the content. You grant Wiley a license to publish the article and to identify as the original publisher.” 

If you read the sample Wiley agreements for publishing under a CC-BY-NC or CC-BY-NC-ND license, you will find that they do in fact provide that “The Contributor . . . retains all proprietary rights, such as copyright and patent rights in any process, procedure or article of manufacture described in the Contribution.” 

This sounds great! The problem comes if you keep reading the rest of the agreement. Later in the agreement, you will find that while the author “retains copyright,” that copyright is reduced to a shell of itself. You’ll see that Wiley (which actually refers to itself as the “Owner,” to set the tone) has the author agree to grant “to the Owner [Wiley], during the full term of the Contributor’s copyright and any extensions or renewals, an exclusive license of all rights of copyright in and to the Contribution that the Contributor does not grant under the CC-BY-NC-ND license.” So, if the author’s intent is to retain control over commercial reuse or derivative works, think again. 

Like Elsevier, Wiley does grant back some slivers of those rights to authors. For example, the right to make a translation as long as you only post it to your personal website, or the right to reuse the article in a collection published by a scholarly society (but, it definitely can’t be in any work with outside commercial sponsorship; Wiley seems particularly concerned with volumes sponsored by pharmaceutical companies, which they specifically target in the agreement). 

A few tips for reading your OA publishing agreement 

  1. Read (and negotiate) your publishing agreement! Clearly, reading your agreements is important. For OA agreements, you should specifically look for language that either transfers copyright to the publisher or language that grants the publisher a broad exclusive license. If it does contain such a grant or license, think about what rights you might need that go beyond the rights granted to the general public under the CC license that you chose. The best publishing agreements are simple and straightforward, granting the publisher a license to publish and otherwise leaving all rights with the author. There are lots of good examples–e.g., this is one of my favorites, from Emory and the University of Michigan for long-form scholarship. And for more tips on understanding and negotiating your publishing agreement, check out our dedicated guide on the topic. 
  2. Don’t buy the website sales pitch. If there is a conflict between what the publisher says on its website and what the contract says, the contract will absolutely control. Be careful about any assurances that exist outside the four corners of your contract. More than once I’ve seen authors ask editors via email about reuses that go beyond the agreement. Typically, editors are happy to assure authors that they can do reasonable things with their own articles, but unfortunately, the standard publishing agreements are far less reasonable than most editors. Where the editors’ assurances and the publishing agreement conflict, once again the terms of the publishing agreement will prevail. 
  3. Watch for contract language about retaining rights. Don’t be fooled into thinking that you’ll retain significant rights in your work by the sleight of hand that says you “retain copyright,” or that you will have “copyright in your name.” If a publisher is obtaining a license of exclusive rights from you, that means the publisher can exclude you and everyone else from making use of those rights unless the agreement contains an explicit grant back of rights to engage in those activities. This is actually very common in non-OA publishing agreements, but as the Elsevier and Wiley agreements illustrate, you need to watch out for it in OA publishing agreements as well. 

Copyright Office Holds Listening Session on Copyright Issues in AI-Generated Music and Sound Recordings

Posted June 2, 2023
Photo by Possessed Photography on Unsplash

Earlier this week, the Copyright Office convened a listening session on the topic of copyright issues in AI-generated music and sound recordings, the fourth in its listening session series on copyright issues in different types of AI-generated creative works. Authors Alliance participated in the first listening session on AI-generated textual works, and we wrote about the second listening session on AI-generated images here. The AI-generated music listening session participants included music industry trade organizations like the Recording Industry Association of America, Songwriters of North America, and the National Music Publishers’ Association; generative AI music companies like Boomy, Tuney, and Infinite Album; music companies like Universal Music Group and Wixen; and individual musicians, artists, and songwriters. Streaming service Spotify and collective-licensing group SoundExchange also participated. 

Generative AI Tools in the Music Industry

Many listening session participants discussed the fact that some musical artists, such as Radiohead and Brian Eno, have been using generative AI tools as part of their work for decades. For those creators, generative AI music is nothing new, but rather an expansion of existing tools and techniques. What is new is the ease with which ordinary internet users without musical training can assemble songs using AI tools—programs like Boomy enable users to generate melodies and musical compositions, with options to overlay voices or add other sounds. Some participants sought to distinguish generative tools from so-called “assistive tools,” with the latter being more established for professional and amateur musicians. 

While some established artists have long relied on assistive AI tools to create their works, AI-generated music has significantly lowered barriers to entry for music creation. Some take the view that this is a good thing, enabling creation by more people who could not otherwise produce music. Others protest that those with musical talent and training are harmed by the influx of new participants in music creation as these types of songs flood the market. In my view, it’s important to remember that the purpose of copyright, furthering the progress of science and the useful arts, is served when more people can generate creative works, including music. Yet AI-generated music may already be at or past the point where it is indistinguishable, at least to some listeners, from works created by human artists without these tools. It may be the case that, as at least one participant suggested, AI-generated audio works differ from AI-generated textual works in ways that call for different forms of regulation. 

Right of Publicity and Name, Image, and Likeness

Although the topic of the listening session was federal copyright law, several participants discussed artists’ rights in both their identities and voices—aspects of the “right of publicity” or the related name, image, and likeness (“NIL”) doctrine. These rights are creatures of state law, rather than federal law, and allow individuals, particularly celebrities, to control the uses to which aspects of their identities may be put. In one well-known right of publicity case, Midler v. Ford Motor Co., Ford used a Bette Midler “sound alike” in a car commercial, which was found to violate her right of publicity. That case and others like it have popularized the idea that the right of publicity can cover voice. This is a particularly salient issue in the context of AI-generated music due to the rise of “soundalike” or “voice cloning” songs that have garnered substantial popularity and controversy, such as the recent Drake soundalike, “Heart on My Sleeve.” Some worry that listeners could believe they are listening to the named musical artist when in fact they are listening to an imitation, potentially harming the market for that artist’s work. 

The representative from the Music Artists Coalition argued that the hodgepodge of state laws governing the right of publicity could be one reason why soundalikes have proliferated: different states offer different levels of protection, and the lack of unified guidance creates uncertainty about how these types of songs will be regulated. And the representative from Controlla argued that copyright protection should be expanded to cover voice or identity rights. In my view, expanding the scope of copyright in this way is neither reasonable nor necessary as a matter of policy (and furthermore, it would be a matter for Congress, not the Copyright Office, to address), but the proposal does show the breadth of the soundalike problem for the music industry. 

Copyrightability of AI-Generated Songs

Several listening session participants argued for intellectual property rights in AI-generated songs, while others argued that the law should continue to center human creators. The Copyright Office’s recent guidance on copyright in AI-generated works suggests that the Office does not believe there is any copyright in AI-generated materials due to the lack of human authorship, though human selection, editing, and compilation can be protected. Representatives from companies offering generative AI tools expressed a need for some form of copyright protection for the songs these programs produce, explaining that the songs cannot be effectively commercialized if they are not protected. In my view, this can be accomplished through protection for the songs as compilations of uncopyrightable materials or as original works, owing to human input and editing. Yet, as many participants across these sessions have argued, the Copyright Office’s registration guidance does not make clear precisely how much human input or editing is needed to render an AI-generated work a protectable original work of authorship. 

Licensing or Fair Use of AI Training Data

In contrast to the view taken by many during the AI-generated text listening session, none of the participants in this listening session argued outright that training generative AI programs on in-copyright musical works was fair use. Instead, much of the discussion focused on the need for a licensing scheme for audio materials used to train generative AI audio programs. Unlike the situations with many text and image-based generative AI programs, the representatives from generative AI music programs expressed an interest and willingness to enter into licensing agreements with music labels or artists. In fact, there is some evidence that licensing conversations are already taking place. 

The lack of fair use arguments during this listening session may be due to the particular participants, industry norms, or the “safety” of expressing this view in the context of the music industry. Regardless, it provides an interesting contrast to views on training text-generating programs like ChatGPT, which many (including Authors Alliance) have argued is a fair use. The contrast is particularly remarkable since at least some of these programs, in our view, use the audio data they are trained on for a highly transformative purpose. Infinite Album, for example, allows users to generate “infinite music” to accompany video games. The music reacts to events in the video game—becoming more joyful and upbeat for victories, or sad for defeats—and can even work interactively for those streaming their games, where viewers of the stream can temporarily influence the music. This seems like precisely the sort of “new and different purpose” that fair use contemplates, and like a service that is unlikely to compete directly with individual songs and records. 

Generative AI and Warhol Foundation v. Goldsmith

Many listening session participants discussed how the regulation of AI-generated music under copyright law interacts with the Supreme Court’s recent fair use decision in Warhol Foundation v. Goldsmith (you can read our coverage of that decision here), which also considered whether a use that could have been licensed was fair. Some participants argued that the decision in Goldsmith makes it clear that training generative AI models (i.e., the input stage) is not a fair use under the law. It is not clear precisely how the decision will affect the fair use doctrine going forward, particularly as applied to generative AI, and I think it is a stretch to call it a death knell for the argument that training generative AI models is a fair use. However, the Court did put a striking emphasis on the commerciality of the use in that case, somewhat deemphasizing the transformativeness inquiry. This could affect the fair use inquiry in the context of generative AI programs, as these programs tend overwhelmingly to be commercial, and the outputs they create can be, and are being, used for commercial purposes. 

Athena Unbound and Untangling the Law of Open Access

Posted May 26, 2023

A few months ago, Authors Alliance and the Internet Archive co-hosted an engaging book talk featuring historian Peter Baldwin and librarian Chris Bourg. They discussed Baldwin’s new book, Athena Unbound: Why and How Scholarly Knowledge Should be Free For All. You can watch the recording of the talk here and access the book for free in open access format here.

Today, I’m beginning a series of posts aimed at clarifying legal issues in open access scholarship. Reflecting on some key takeaways from Athena Unbound seemed like a great place to start.

Those already well-versed in the open access community know that there is an abundance of literature covering the theory, economics, and sociological dimensions of OA. But it’s easy to miss the forest for the trees. Athena Unbound stands out by providing a comprehensive, high-level explanation of how we have reached the current state of open access affairs. The book offers much more than commentary on the underlying legal structures that shape access to scholarly works. But, as we delve deeper into the legal aspects of open access in this series, I want to highlight three key takeaways on this issue:

  1. Copyright law does not cater to most academic authors.

“Open access does not seek to dispossess authors of their property nor to stint them of their rightful earnings. But authors are not all alike. Those whose creativity supplies their livelihood are entitled to the fruits of their labor. But most authors either do not make a living from their work or are already supported in other ways.” – Athena Unbound, Chapter 2, “The Variety of Authors and Their Content”

In theory, copyright law in the United States is designed to incentivize the creation of new works by granting strong and long-lasting economic rights. This framework assumes authors primarily function as independent operators (Baldwin likens them to “bohemian artistes”) who can negotiate these rights with publishers or directly with members of the public in exchange for financial support.

However, this framework does not align with the reality faced by most academic authors, who number in the millions. While scholarly authors deserve compensation for their work, their remuneration often comes from sources like university employment. Their motivation to create stems from incentives to share ideas and discoveries with the world, as well as personal gains such as recognition and career advancement. For these authors, the publishing system and the laws that govern it clash with their interests to such an extent that we now witness academic authors willingly paying thousands of dollars to persuade publishers to distribute their articles for free.

If anything, copyright law, with its excessively long duration, extensive economic control, and limited freedom for researchers to engage with creative works, hampers those authors’ goals in practice. As Baldwin explains, “the fundamental problem open access faces is worth restating. Copyright has become bloated, prey to the rent-seeking academic publishing industry… Legislators, dazzled into submission by the publishing industry’s success in portraying itself as the defender of creativity and cultural patrimony, bear much responsibility.”

As we explore the legal mechanisms that influence open access, it is crucial to remember that the default rules of the system are more often than not at odds with the goals of open access authors. 

  2. Open access must encompass more than contemporary scientific articles.

While much of the current open access discourse revolves around providing access to the latest scholarly research, particularly scientific articles, there is a vast amount of past scholarship that remains inaccessible. An inclusive approach to open access should address how to provide access to these works as well. The majority of research library holdings are not available online in any form. Baldwin uses the term “grey literature” to describe the extensive collections in research libraries that are no longer commercially available. As he points out, most books lose commercial viability rather quickly. “Of the 10,000 US books published in 1930, only 174 were still in print in 2001. Of the 63 books that won Australia’s Miles Franklin prize over the past half-century, ten are unavailable in any format.”

Many of these works have become so-called orphan works: they are so detached from the commercial marketplace that their publishers have gone out of business, authors have passed away, and any remaining rights holders who would benefit from potential sales are obscure, if they exist at all. Even Maria Pallante, former Register of Copyrights and current AAP president, agrees that in the case of true orphan works, “it does not further the objectives of the copyright system to deny use of the work, sometimes for decades. In other words, it is not good policy to protect a copyright when there is no evidence of a copyright owner.”

In addition to this issue around orphan works, a subset of what is known as the “20th Century black hole,” Athena Unbound also sheds light on the various concerns and challenges that act as barriers to open access in scholarly fields outside of the sciences. While the goals of open access may be the same across these different areas, the implementation can vary significantly. In the case of certain scholarly works, such as older books entangled in complex rights issues, we may need to settle for an imperfect form of “open,” such as read-only viewing via controlled digital lending—a far cry from what many consider true open access.

  3. The intricacies of ownership are significant.

Although this is not the primary focus of Athena Unbound, it is an important aspect that deserves attention. In simple terms, the legal pathway to open access appears straightforward: authors, often depicted as individual, independent actors, must retain sufficient rights to allow them to legally share and allow reuse of their writing.

However, reality is far more complex. Multiple-authored works, including in extreme cases thousands of joint authors on one scientific article, can complicate our understanding of who actually holds a copyright interest in a work and can therefore authorize an open license on it. 

Moreover, many if not most academic authors are employed by colleges or universities, each with its own perspective on copyright ownership of scholarly publications. In most cases, as Baldwin explains, universities have been hesitant to assert ownership of scholarly publications under the work-for-hire doctrine (a topic I will cover in a subsequent post), possibly based on the increasingly tenuous “teacher exception” to the work-for-hire doctrine. However, this approach is not universally adopted. For instance, some universities assert ownership of specific categories of scholarly work, such as articles produced under grant-funded projects. Others reserve broad licenses to use scholarly work for university purposes, albeit with ill-defined parameters.

Open access, or at least the type we commonly think of—copyrighted articles typically licensed under Creative Commons or similar licenses—depends heavily on obtaining affirmative permission from the rightsholder. But the identity of the rightsholder, whether it be the university, author, or even the funder, can vary significantly due to a wide range of factors, including state laws, university IP policies, and funder grant contracts. 

Stay tuned for more in this series, and if you have questions in the meantime, check out our open access guide and resource page.

Supreme Court Issues Decisions in Warhol Foundation and Gonzalez

Posted May 19, 2023
Photo by Claire Anderson on Unsplash

Yesterday, the Supreme Court released two important decisions in Warhol Foundation v. Goldsmith and Gonzalez v. Google—cases that Authors Alliance has been deeply invested in, submitting amicus briefs to the Court in both cases. 

Warhol Foundation v. Goldsmith and Transformativeness

First, the Court issued its long-awaited opinion in Warhol Foundation v. Goldsmith, a case Authors Alliance has been following for years, and for which we submitted an amicus brief last summer. The case concerned a series of screen prints of the late musical artist Prince created by Andy Warhol, and asked whether the creation and licensing of one of the images, an orange screen print inspired by Goldsmith’s black and white photograph (which the Court calls “Orange Prince”), constituted fair use. After the Southern District of New York found for the Warhol Foundation on fair use grounds, the Second Circuit overturned the ruling, finding that the Warhol Foundation’s use constituted infringement. The sole question before the Supreme Court was whether the first factor in fair use analysis, the purpose and character of the use, favored a finding of fair use. 

To our disappointment, the Supreme Court’s majority agreed with the Second Circuit, finding that the purpose and character of Warhol’s use favored Goldsmith, such that it did not support a finding of fair use. That said, the decision focused narrowly on the Warhol Foundation’s “commercial licensing of Orange Prince to Condé Nast,” expressing “no opinion as to the creation, display, or sale of any of the original Prince Series works.” Because the Court cabined its opinion, focusing specifically on the licensing of Orange Prince to Condé Nast rather than the creation of the entire Prince series, the decision is less likely to have a deleterious effect on the fair use doctrine generally than a broader decision would have. 

Writing for the majority, Justice Sotomayor argued that Goldsmith’s photograph and the Prince screen print in question shared the same purpose, “portraits of Prince used to depict Prince in magazine stories about Prince.” Moreover, the Court found the use to be commercial, given that the screen print was licensed to Condé Nast. Justice Sotomayor explained that “if an original work and secondary use share the same or highly similar purposes, and the secondary use is commercial, the first fair use factor is likely to weigh against fair use, absent some other justification for copying.” Justice Sotomayor found that the two works shared the same commercial purpose, and therefore concluded that factor one favored Goldsmith. 

Justice Kagan, joined by Chief Justice Roberts, issued a strongly worded dissenting opinion. The dissent admonished the majority for its departure from Campbell’s “new meaning or message” test, an inquiry that Authors Alliance advocated for in our amicus brief. Justice Kagan further criticized the majority’s shift in focus toward commerciality, arguing that the fact that the use was a licensing transaction should not be given so much weight in the analysis. While Authors Alliance agrees with these points, we are less sure that the majority’s decision goes so far as to “constrain creative expression” or “threaten[] the creative process.” And while it is uncertain what effect this case will have on the fair use doctrine more generally, one important takeaway is that the question of whether a use is commercial in nature—a consideration under the first factor—has been elevated to one of greater importance. 

While we thought this case offered a good opportunity for the Court to affirm a more nuanced approach to transformative use, we much prefer the Supreme Court’s approach to the Second Circuit’s, and applaud the Court for confining its ruling to the narrow question at issue. The holding does not, in our view, radically alter the doctrine of fair use or disrupt the bulk of established case law. Moreover, some of the arguments we made in our brief—such as the notion that transformativeness is a matter of degree, not a binary—are present in the Court’s decision. This is a good thing, in our view, as it will allow for more nuanced consideration of a use’s character and purpose, and stands in contrast to the Second Circuit’s all-or-nothing view of transformativeness. 

Gonzalez v. Google and the Missing Section 230

Also yesterday, the Court released its opinion in Gonzalez v. Google, a case that generated much attention because of its potential threat to Section 230, and another case in which Authors Alliance submitted an amicus brief. The case asked whether Google could be held liable under an anti-terrorism statute for harm caused by ISIS recruitment videos that YouTube’s algorithm recommended. In its per curiam decision (a unanimous one without a named Justice as author), the Court stated that Gonzalez’s complaint had failed to state a viable claim under the relevant anti-terrorism statute. Therefore, it did not reach the question of the applicability of Section 230 to the recommendations at issue. In other words, a case that generated tremendous concern about the Court disturbing Section 230 and harming internet creators, communities, and services that relied on it ended up saying nothing at all about the statute. 

Authors Alliance Welcomes Christian Howard-Sukhil as Text Data Mining Legal Fellow

Posted May 12, 2023

As we mentioned in our blog post on our Text Data Mining: Demonstrating Fair Use project a few weeks back, Authors Alliance is pleased to have Christian Howard-Sukhil on board as our new Text Data Mining Legal Fellow. As part of that project, generously funded by the Andrew W. Mellon Foundation, we established this fellowship to provide research and writing support. Christian will help us produce guidance for researchers and a report on the usability, successes, and challenges of the text data mining exemption to Section 1201’s prohibition on bypassing technical protection measures, which Authors Alliance obtained in 2021. Christian begins her work with Authors Alliance this week, and we are thrilled to have her. 

Christian holds a PhD in English Language and Literature from the University of Virginia, and has just completed her second year of law school at UC Berkeley. Christian has extensive digital humanities and text data mining experience, including in previous roles at UVA and Bucknell University. Her work with Authors Alliance will focus on researching and writing about the ways that current law helps or hinders text and data mining researchers in the real world. She will also contribute to our blog—look out for posts from her later this year.

About her new role at Authors Alliance, Christian says, “I am delighted to join Authors Alliance and to help support text and data mining researchers navigate the various legal hurdles that they face. As a former academic and TDM researcher myself, I saw first-hand how our complicated legal structure can deter valid and generative forms of TDM research. In fact, these legal issues are, in part, what inspired me to attend law school. So being able to work with Authors Alliance on such an important project—and one so closely tied to my own background and interests—is as exciting as it is rewarding.”

Please join us in welcoming Christian!

An Update on our Text and Data Mining: Demonstrating Fair Use Project

Posted April 28, 2023

Back in December we announced a new Authors Alliance project, Text and Data Mining: Demonstrating Fair Use, which aims to lower and overcome legal barriers for researchers who seek to exercise their fair use rights, specifically in the context of text data mining (“TDM”) research under current regulatory exemptions. We’ve heard from many of you about the need for support in navigating the law in this area. This post gives a few updates. 

Text and Data Mining Workshops and Consultations

We’ve seen a tremendous amount of interest and engagement with our offers to hold hands-on workshops and trainings on the scope of legal rights for TDM research. Already this spring, we’ve held two workshops in the Research Triangle hosted at Duke University, and a third workshop at Stanford followed by a lively lunchtime discussion. We have several more coming. Our next stop is in a few weeks at the University of Michigan, and we have plans in the works for workshops in the Boston area, New York, a few locations on the West Coast, and potentially elsewhere as well. If you are interested in attending or hosting a workshop with TDM researchers, librarians, or other research support staff, please let us know! We’d love to hear from you. The feedback so far has been really encouraging, and we have heard both from current TDM researchers and from those for whom the workshops have opened up new possibilities. 

ACH Webinar: Overcoming Legal Barriers to Text and Data Mining
Join us! In addition to the hands-on in-person workshops on university campuses, we’re also offering online webinars on overcoming legal barriers to text and data mining. Our first is hosted by the Association for Computers and the Humanities on May 15 at 10am PT / 1pm ET. All are welcome to attend, and we’d love to see you online!
Read more and register here. 

Research 

A second aspect of our project is to research how the current law can both help and hinder TDM researchers, with specific attention to fair use and the DMCA exemption that Authors Alliance obtained for TDM researchers to break digital locks when building a corpus of digital content such as ebooks or DVDs.

Christian Howard-Sukhil, Authors Alliance Text and Data Mining Legal Fellow

To that end, we’re excited to announce that Christian Howard-Sukhil will be joining Authors Alliance as our Text and Data Mining Legal Fellow. Christian holds a PhD in English Language and Literature from the University of Virginia and is currently pursuing a JD from the UC Berkeley School of Law. Christian has extensive digital humanities and text data mining experience, including in previous roles at UVA and Bucknell University. Her work with Authors Alliance will focus on researching and writing about the ways that current law helps or hinders text and data mining researchers in the real world. 

The research portion of this project is focused on the practical implications of the law and will be based heavily on feedback we hear from TDM researchers. We’ve already had the opportunity to gather some feedback from researchers including through the workshops mentioned above, and plan to do more systematic outreach over the coming months. Again, if you’re working in this field (or want to but can’t because of concerns about legal issues), we’d love to hear from you. 

At this stage we want to share some preliminary observations, based on recent research into these issues (supported by the work of several teams of student clinicians) as well as our recent and ongoing work with TDM researchers:

1) License restrictions are a problem. We’ve heard clearly that licenses and terms of use impose a significant barrier to TDM research. While researchers are able to identify uses that would qualify as fair use, and also many uses that likely qualify under the DMCA exemption, terms of use accompanying ebook licenses can override both. These terms vary, from very specific prohibitions–e.g., Amazon’s, which says that users “may not attempt to bypass, modify, defeat, or otherwise circumvent any digital rights management system”–to more general prohibitions on uses that go beyond the specific permissions of the license–e.g., Apple’s terms, which state that “No portion of the Content or Services may be transferred or reproduced in any form or by any means, except as expressly permitted.” Even academic licenses, often negotiated by university libraries to secure more favorable terms, can still impose significant restrictions on reuse for TDM purposes. Although we haven’t heard of aggressive enforcement of those terms to restrict academic uses, even the mere existence of those terms can have chilling, real-world impacts on research using TDM techniques.

The problem of licenses overriding researchers’ rights under fair use and other parts of copyright law is, of course, not limited to inhibiting text and data mining research. We wrote about the issue, and how easy it is to evade fair use, a few months ago, discussing the many ways that restrictive license terms can inhibit normal, everyday uses of works such as criticism, commentary, and quotation. We are currently working on a separate paper documenting the scope and extent of “contractual override,” and will take part in a symposium on the subject in May, hosted by the Association of Research Libraries and the American University, Washington College of Law Program on Information Justice and Intellectual Property.

2) The TDM exemption is flexible, but local interpretation and support can vary. We’ve heard that the current TDM exemption–allowing researchers to break technological protection measures such as DRM on ebooks and CSS on DVDs–is an important tool to facilitate research on modern digital works. And we believe the terms of that exemption are sufficiently flexible to meet the needs of a variety of research applications (how wide a variety remains to be seen through more research). But local understanding and support for researchers using the exemption can vary. 

For example, the exemption requires that the university with which the TDM research is associated implement “effective security measures” to ensure that the corpus of copyrighted works isn’t used for another purpose. The regulation further explains that, in the absence of a standard negotiated with content holders, “effective security measures” means “measures that the institution uses to keep its own highly confidential information secure.” University IT data security standards don’t always use the same language or define their standards to cover “highly confidential information,” so university IT offices must interpret this language and implement the standard in their own local context. This can create confusion about what precisely universities need to do to secure TDM corpora. 

Some of these definitional issues are likely growing pains: the exemption is still new, and universities need time to understand and implement standards that satisfy its terms in a reasonable way. Still, it will be important to explore further where there is confusion over similar terms and how it might best be resolved. 

3) Collaboration and sharing are important. Text and data mining projects are often conceived as part of a much larger research agenda, with multiple potential research outputs from both the initial inquiry and follow-up studies involving a number of researchers, sometimes from a number of institutions. Fair use clearly allows for collaborative TDM work: in Authors Guild v. HathiTrust, a foundational fair use case for TDM research in the US, the entire structure of HathiTrust is a collective of research institutions with shared digital assets. Likewise, the TDM exemption permits a university to provide access to “researchers affiliated with other institutions of higher education solely for purposes of collaboration or replication of the research.” The collaborative aspect of this work raises some challenging questions, both operationally and conceptually. For example, the exemption for breaking digital locks doesn’t define precisely who qualifies as an “affiliated” researcher, leaving open questions for universities implementing the regulation. More conceptually, research collaboration raises questions about how precisely the TDM purpose must be defined when building a corpus under the existing exemption, for example when researchers collaborate but investigate different research questions over time. Finally, the issue of actually sharing copies of the corpus with researchers at other institutions is important because, at least in some cases, local computing power is needed to engage effectively with the data. 

Again, this is just preliminary research, but it raises some interesting and important questions! If you are working in this area in any capacity, we’d love to talk. The easiest way to reach us is at info@authorsalliance.org.

Want to Learn More?
This current Authors Alliance project is generously supported by the Mellon Foundation, which has also supported a number of other important text and data mining projects. We’ve been fortunate to be part of a broader network of individuals and organizations devoted to lowering legal barriers for TDM researchers. This includes efforts spearheaded by a team at UC Berkeley to produce the “Legal Literacies for Text Data Mining” and its current project to address cross-border TDM research, as well as efforts from the Global Network on Copyright and User Rights, which has (among other things) led efforts on copyright exceptions for TDM globally.

Authors Alliance Joins Copyright Office Listening Session On Copyright in AI-Generated Literary Works

Posted April 20, 2023
Photo by Possessed Photography on Unsplash

Yesterday, I represented Authors Alliance in a Copyright Office listening session on copyright issues in AI-generated literary works, the first of two such sessions that the Office convened yesterday afternoon. I was pleased to be invited to share our views with the Office and to participate in a rousing discussion among nine other stakeholders representing a diverse group of industries and positions. Generative AI raises challenging legal questions, particularly for its skeptics, but it also presents some incredible opportunities for authors and other creators.

During the listening session, I emphasized the potential for generative AI programs (like OpenAI's ChatGPT, Microsoft's Bing AI, Jasper, and others) to support authorship in a number of different ways. For instance, generative AI programs can increase the efficiency of some of the practical aspects of being a working author, apart from the writing itself. But more importantly, generative AI programs can actually help authors express themselves and create new works of authorship.

In the first category, generative AI programs can support authors by, for example, helping them create text for pitch letters to send to agents and editors, produce copy for their professional websites, and develop marketing strategies for their books. Making these activities more efficient frees up time for authors to focus on their writing, particularly for authors whose writing time is limited by other commitments. 

In the second category, generative AI has tremendous potential to help authors come up with new ideas for stories, develop characters, summarize their writings, and perform early-stage edits of manuscripts. Moreover, and particularly for academic authors, generative AI can be an effective research tool for authors seeking to learn from a large corpus of texts. Generative AI programs can help authors research by providing short and simple summaries of complex issues, surveys of the landscape of various fields, or even guidance on which human works to turn to in their research. Authors Alliance is committed to protecting authors' right to conduct research, and we see generative AI tools as a new, innovative, and efficient means of conducting it. Making research easier helps authors save time, and it has a particular benefit for authors with disabilities that make it difficult to travel to multiple libraries or otherwise rely on analog forms of research.

These programs undoubtedly have the potential to serve as powerful creative tools that support authorship in these ways and more, but, when discussing the copyright implications of the programs and the works they produce, it's important to remember just how new these technologies are. Because generative AI remains in its infancy, and the costs and benefits for different segments of the creative industry have yet to be seen, it seems to me sensible to let these tools develop before crafting legal solutions to problems they might pose in the future. And in fact, in our view, U.S. copyright law already has the tools to deal with many of the legal challenges these programs might pose. When generative AI outputs look too much like the copyrighted inputs they were trained on, the substantial similarity test can be used to assess claims of copyright infringement and to vindicate an author's exclusive rights in their works when those outputs do infringe.

In any case, for generative AI programs to be effective creative tools, they must be trained on large corpora. Narrowing the corpus of works the programs are trained on, whether through compulsory licensing or other mechanisms, can have disastrous effects. For example, research has shown that narrow data sets are more likely to produce racial and gender bias in AI outputs. In our view, the "input" step, where the programs are trained on a large corpus of works, is a fair use of these texts. The holdings in Google Books and HathiTrust indicate that it is consistent with fair use to build large corpora of works, including works that remain protected by copyright, for applications such as computational research and information discovery. Additionally, the Copyright Office has recognized this principle in the context of research and scholarship, as demonstrated by its approval of Authors Alliance's petition for an exemption from DMCA restrictions for text and data mining.

The question of the copyright status of AI-generated works is an important one. Most, if not all, of the stakeholders participating in this discussion agreed with the Copyright Office's recent guidance regarding registration of AI-generated works: under ordinary copyright principles, the lack of human authorship means these texts are not protected by copyright. This being said, we also recognize that there may be challenges in reconciling existing copyright principles with these new types of works and the questions about authorship, creativity, and market competition that they might pose.

But importantly, while this technology is still in its early stages, allowing these systems to develop and confronting new legal challenges as they emerge serves the core purposes of copyright: furthering the progress of science and the useful arts by incentivizing new creation. Copyright is not only about protecting the exclusive rights of copyright holders (a concern that underlies many arguments against generative AI as a fair use), but about incentivizing creativity for the public benefit. The new forms of creation made possible through generative AI can encourage people who would not otherwise create expressive works to do so, bringing more people into creative industries and adding new creative expression to the world for the benefit of the public.

The listening sessions were recorded and will be available on the Copyright Office website in the coming weeks. And these listening sessions are only the beginning of the Office's investigation of copyright in AI-generated works. Other listening sessions on visual works, music, and audiovisual works will be held in the coming weeks, and the Office has indicated that there will be an opportunity for written public comments so that stakeholders can weigh in further. We are committed to remaining involved in these cutting-edge issues, through written comments and otherwise, and we will keep our readers informed as policy around generative AI continues to evolve.