
The AI Copyright Hype: Legal Claims That Didn’t Hold Up

Posted September 3, 2024

Over the past year, two dozen AI-related lawsuits and their myriad infringement claims have been winding their way through the court system. None have yet reached a jury trial. While we all anxiously await court rulings that can inform our future interaction with generative AI models, the past few weeks have brought a flood of news reports with titles such as “US Artists Score Victory in Landmark AI Copyright Case,” “Artists Land a Win in Class Action Lawsuit Against A.I. Companies,” and “Artists Score Major Win in Copyright Case Against AI Art Generators,” and the list goes on. The exuberant mood in these headlines mirrors the enthusiasm of the people actually involved in this particular case (Andersen v. Stability AI). The plaintiffs’ lawyer calls the court’s decision “a significant step forward for the case.” “We won BIG,” writes one of the plaintiffs on X.

In this blog post, we’ll explore the reality behind these headlines and statements. The “BIG” win in fact describes a portion of the plaintiffs’ claims surviving a pretrial motion to dismiss. If you are already familiar with the motion to dismiss per Federal Rules of Civil Procedure Rule 12(b)(6), please refer to Part II to find out what types of claims have been dismissed early on in the AI lawsuits. 

Part I: What is a motion to dismiss?

In the AI lawsuits filed over the last year, the majority of the plaintiffs’ claims have struggled to survive pretrial motions to dismiss. That may lead one to believe that claims made by plaintiffs are scrutinized harshly at this stage. But that is far from the truth. In fact, when looking at the broader legal landscape beyond the AI lawsuits, Rule 12(b)(6) motions are rarely successful.

In order to survive a Rule 12(b)(6) motion to dismiss filed by AI companies, plaintiffs in these lawsuits must make “plausible” claims in their complaint. At this stage, the court will assume that all of the factual allegations made by the plaintiffs are true and interpret everything in a way most favorable to plaintiffs. This allows the court to focus on the key legal questions without getting caught up in disputes about facts. When courts look at plaintiffs’ factual claims in the best possible light, if the defendant AI companies’ liability can plausibly be inferred based on facts stated by plaintiffs, then the claims will survive a motion to dismiss. Notably, the most important issues at the core of these AI lawsuits—namely, whether there has been direct copyright infringement and what may count as a fair use—are rarely decided at this stage, because these claims raise questions about facts as well as the law. 

On the other hand, if the AI companies will prevail as a matter of law even when the plaintiffs’ well-pleaded claims are taken as entirely true, then the plaintiffs’ claims will be dismissed by the court. Merely stating that it is possible that the AI companies have done something unlawful, for instance, will not survive a motion to dismiss; there must be some reasonable expectation that evidence can be found later during discovery to support the plaintiffs’ claims.

Procedurally, when a claim is dismissed, the court will often allow the plaintiffs to amend their complaint. That is exactly what happened with Andersen v. Stability AI (the case mentioned at the beginning of this blog post): the plaintiffs’ claims were first dismissed in October 2023, and the court allowed the plaintiffs to amend their complaint to address the deficiencies in their allegations. The amended complaint contains infringement claims that survived the new motions to dismiss, as well as breach of contract, unjust enrichment, and DMCA claims that were dismissed again.

As you may have guessed, including something like the “motion to dismiss” in our court system can help save time and money, so parties don’t waste precious resources on meritless claims at trial. One judge dismissed a case against OpenAI earlier this year, stating that “the plaintiffs need to understand that they are in a court of law, not a town hall meeting.” The takeaway: plaintiffs need to bring claims that can plausibly entitle them to relief.

Part II: What claims have been dismissed so far?

Most of the AI lawsuits are still at an early stage, and most of the court rulings we have seen so far are in response to the defendants’ motions to dismiss. From these rulings, we have learned which claims are viewed as meritless by courts. 

The removal of copyright management information (“CMI,” which includes information such as the title, the copyright holder, and other identifying information in a copyright notice) is a claim included in almost all plaintiffs’ complaints in the AI lawsuits, and this claim has failed to survive motions to dismiss without exception. DMCA Section 1202(b) restricts the intentional, unauthorized removal of CMI. Experts initially considered DMCA 1202(b) one of the biggest hurdles for non-licensed AI training. But courts so far have dismissed all DMCA 1202(b) claims, including in J. Doe 1 v. GitHub, Tremblay v. OpenAI, Andersen v. Stability AI, Kadrey v. Meta Platforms, and Silverman v. OpenAI. The plaintiffs’ DMCA Section 1202(b)(1) claims have failed because plaintiffs were not able to offer any evidence showing that their CMI had been intentionally removed by the AI companies. For example, in Tremblay v. OpenAI and Silverman v. OpenAI, the courts held that the plaintiffs did not plausibly allege that OpenAI had intentionally removed CMI when ingesting plaintiffs’ works for training. Additionally, plaintiffs’ DMCA Section 1202(b)(3) claims have failed thus far because they did not fulfill the identicality requirement. For example, in J. Doe 1 v. GitHub, the court pointed out that Copilot’s output did not tend to represent verbatim copies of the original ingested code. We now see plaintiffs voluntarily dropping the DMCA claims in their amended complaints, such as in Leovy v. Google (formerly J.L. v. Alphabet).

Another claim that has been consistently dismissed by courts is that AI models are infringing derivative works of the training materials. The law defines a derivative work as “a work based upon one or more preexisting works, such as a translation, musical arrangement, … art reproduction, abridgment, condensation, or any other form in which a work may be recast, transformed, or adapted.” To most of us, the idea that the model itself (as opposed to, say, outputs generated by the model) can be considered a derivative work seems to be a stretch. The courts have so far agreed. On November 20, 2023, the court in Kadrey v. Meta Platforms said it is “nonsensical” to consider an AI model a derivative work of a book just because the book is used for training. 

Similarly, claims that all AI outputs should be automatically considered infringing derivative works have been dismissed by courts, because the claims cannot point to specific evidence that an instance of output is substantially similar to an ingested work. In Andersen v. Stability AI, plaintiffs tried to argue “that all elements of [] Anderson’s copyrighted works [] were copied wholesale as Training Images and therefore the Output Images are necessarily derivative”; the court dismissed the argument because, beyond the fact that plaintiffs are unlikely to be able to show substantial similarity, “it is simply not plausible that every Training Image used to train Stable Diffusion was copyrighted [] or that all [] Output Images rely upon (theoretically) copyrighted Training Images and therefore all Output images are derivative images. … [The argument for dismissing these claims is strong] especially in light of plaintiffs’ admission that Output Images are unlikely to look like the Training Images.”

Several of these AI cases have raised claims of vicarious liability, that is, liability for the service provider based on the actions of others, such as users of the AI models. Because a vicarious infringement claim must be based on a showing of direct infringement, the vicarious infringement claims were also dismissed in Tremblay v. OpenAI and Silverman v. OpenAI, where plaintiffs could not point to any infringing similarity between AI output and the ingested books.

Many plaintiffs have also raised a number of non-copyright, state law claims (such as negligence or unfair competition) that have largely been dismissed based on copyright preemption. Copyright preemption bars duplicative state law claims when those claims are based on an exercise of rights that are equivalent to those provided for under the federal Copyright Act. In Andersen v. Stability AI, for example, the court dismissed the plaintiffs’ unjust enrichment claim because the plaintiffs failed to add any new elements that would distinguish their claim based on California’s Unfair Competition Law or common law from rights under the Copyright Act.

It is interesting to note that many of the dismissed claims in different AI lawsuits closely mimic one another, such as in Kadrey v. Meta Platforms, Andersen v. Stability AI, Tremblay v. OpenAI, and Silverman v. OpenAI. It turns out that the similarities are no coincidence: all these lawsuits were filed by the same law firm. These mass-produced complaints not only contain overbroad claims that are prone to dismissal but also have overbroad class designations. In the next blog post, we will delve deeper into the class action aspect of the AI lawsuits.

Clickbait arguments in AI Lawsuits (will number 3 shock you?)

Posted August 15, 2024


The booming AI industry has sparked heated debates over what AI developers are legally allowed to do. So far, we have learned from the US Copyright Office and courts that AI-created works are not protectable unless they are combined with human authorship.

As we monitor two dozen ongoing lawsuits and regulatory efforts that address various aspects of AI’s legality, we see legitimate legal questions that must be resolved. However, we also see some prominent yet flawed arguments that have been used to inflame discussions, particularly by publisher-plaintiffs and their supporters. For now, let’s focus on some clickbait arguments that sound appealing but are fundamentally baseless.

Will AI doom human authorship?

Based on current research, AI tools can actually help authors improve their creativity, their productivity, and the longevity of their careers.

When AI tools such as ChatGPT first appeared online, many leading authors and creators publicly endorsed them as useful tools like any other tech innovation that came before. At the same time, many others claimed that authors and creators of lesser caliber would be disproportionately disadvantaged by the advent of AI.

This intuition-driven hypothesis, that AI will be the bane of average authors, has so far proved to be misguided.

We now know that AI tools can greatly help authors, especially less creative ones, during the ideation stage. According to a study published last month, AI tools had minimal impact on the output of highly creative authors, but were able to enhance the works of less imaginative authors.

AI can also serve as a readily accessible editor for authors. Research shows that AI enhances the quality of routine communications. Without AI-powered tools, a less-skilled person will often struggle with the cognitive burden of managing data, which limits both the quality and quantity of their potential output. AI helps level the playing field by handling data-intensive tasks, allowing writers to focus more on making creative and other crucial decisions about their works.

It is true that entirely AI-generated works of abysmal quality are available for purchase on some platforms. Some of these works use human authors’ names without authorization. These AI-generated works may infringe on authors’ right of publicity, but they do not present commercially viable alternatives to books authored by humans. Readers prefer higher-quality works produced with human supervision and intervention (provided that digital platforms do not act recklessly towards their human authors despite generating huge profits from them).

Are lawsuits against AI companies brought with authors’ best interest in mind? 

In the ongoing debate over AI, publishers and copyright aggregators have suggested that they have brought these lawsuits to defend the interests of human authors. Consider the New York Times, for example: in its complaint against OpenAI, the NY Times describes its operations as “a creative and deeply human endeavor (¶31)” that necessitates “investment of human capital (¶196).” The NY Times argues that OpenAI has built innovation on the stolen hard work and creative output of journalists, editors, photographers, data analysts, and others. That argument is contrary to what the NY Times once argued in court in New York Times v. Tasini: that authors’ rights must take a backseat to the NY Times’ financial interests in new digital uses.

It is also hard to believe that many of the publishers and aggregators are on the side of authors when we look at how they have approached licensing deals for AI training. These licensing deals can be extremely profitable for the publishers. For example, Taylor and Francis sold AI training data to Microsoft for $10 million. John Wiley and Sons earned $23 million from a similar deal with an undisclosed tech company. Though we don’t have the details of these agreements, it seems easy to surmise that in return for the money received, the publishers will not harass the AI companies with future lawsuits. (See our previous blog post about these licensing deals and what you can do as an author.) It is ironic how an allegedly unethical and harmful practice quickly becomes acceptable once the publishers are profiting from it.

How much of the millions of dollars changing hands will go to individual authors? Limited data exist. We know that Cambridge University Press, a good-faith outlier, is offering authors 20% royalties if their work is licensed for AI training. Most publishers and aggregators are entirely opaque about how authors are to be compensated in these deals. Take the Copyright Clearance Center (CCC), for example: it offers zero information about how individual authors are consulted or compensated when their works are sold for AI training under the CCC AI training license.

This is by no means a new problem for authors. We know that traditionally published book authors receive royalties of around 10% from their publishers: a little under $2 per copy for most books. On an ebook, authors receive a similar amount for each “copy” sold. This small amount handed to authors only starts to look generous when compared to academic publishing, where authors increasingly pay publishers to have their articles published in journals. The journal authors receive zero royalties, despite the publishers’ growing profits.

Even before the advent of AI technology, most authors were struggling to make a living on writing alone. According to an Authors Guild survey in 2018, the median income for full-time writers was $20,300, and for part-time writers, a mere $6,080. Fair wages and equitable profit sharing are issues that need to be settled between authors and publishers, even if publishers try to scapegoat AI companies.

It’s worth acknowledging that it’s not just publishers and copyright industry organizations filing these lawsuits. Many of these ongoing lawsuits have been filed as class actions, with the plaintiffs claiming to represent a broad class of people who are similarly situated and (so they allege) hold similar views. Most notably, in Authors Guild v. OpenAI, the Authors Guild and its named individual plaintiffs claim to represent all fiction writers in the US who have sold more than 5,000 copies of a work. There is another case, supported by the Authors Guild, in which the plaintiff claims to represent all copyright holders of non-fiction works, including authors of academic journal articles, and several others in which an individual plaintiff asserts the right to represent virtually all copyright holders of any type.

As we (along with many others) have repeatedly pointed out, many authors disagree with the publishers’ and aggregators’ restrictive view on fair use in these cases, and don’t want or need a self-appointed guardian to “protect” their interests. We have seen the same overbroad class designation in the Authors Guild v. Google case, which caused many authors to object, including many of our own 200 founding members.

Respect for copyright and human authors’ hard work means no more AI training under US copyright law? 

While we wait for courts to figure out the key questions on infringement and fair use, let’s take a moment to remember what copyright law does not regulate.

Copyright law in the US exists to further the Constitutional goal to “promote the Progress of Science and useful Arts.” In 1991, the Supreme Court held in Feist v. Rural Telephone Service that copyright cannot be granted solely based on how much time or energy authors have expended. “Compensation for hard work” may be a valid ethical discussion, but it is not a relevant topic in the context of copyright law.

Publishers and aggregators preach that people must “respect copyright,” as if copyright were synonymous with the exclusive rights of the copyright holder. This is inaccurate and misleading. In order to safeguard the freedom of expression, copyright is designed to embody not only the rightsholders’ exclusive rights but also many exceptions and limitations to those rights. Similarly, there’s no sound legal basis to claim that authors must have absolute control over their own work and its message. Knowledge and culture thrive because authors are permitted to build upon and reinterpret the works of others.

Does this mean I should side with the AI companies in this debate?

Many of the largest AI companies exhibit troubling traits that they have in common with many publishers, copyright aggregators, digital platforms (e.g., Twitter, TikTok, YouTube, Amazon, Netflix, etc.), and many other companies with dominant market power. There’s no transparency or oversight afforded to the authors or the public. The authors and the public have little say in how the AI models are trained, just as we have no influence over how content is moderated on digital platforms, how much authors receive in royalties from the publishers, or how much publishers and copyright aggregators can charge users. None of these crucial systemic flaws will be fixed by granting publishers a share of AI companies’ revenue.

Copyright also is not the entire story. As we’ve seen recently, there are some significant open questions about the right of publicity and somewhat related concerns about the ability of AI to churn out digital fakes for all sorts of purposes, some of which are innocent, but others are fraudulent, misleading, or exploitative. The US Copyright Office released a report on digital replicas on July 31 addressing the question of digital publicity rights, and on the same day the NO FAKES Act was officially introduced. Will the rights of authors and the public be adequately considered in that debate? Let’s remain vigilant as we wait to see the first-ever AI-generated public figure in a leading role to hit theaters in September 2024.