Government-Operated Platforms and the First Amendment in Schiff v. U.S. Office of Personnel Management

Posted March 25, 2025

This is a post authored by Maria Crusey, an intern with Authors Alliance and a third-year law student at Washington University School of Law. 

Introduction

Tension between government actions and freedom of speech under the First Amendment is nothing new. Since the adoption of the First Amendment, individuals and entities have alleged government actors violate free speech rights through actions ranging from the establishment of campaign finance laws to the imposition of book bans. In the digital age, allegations of free speech violations by government actors have started to take new forms that are largely made possible by technological developments. Most recently, in Schiff v. U.S. Office of Personnel Management, filed on March 12, 2025 in the District of Massachusetts, the plaintiffs allege direct censorship by the government through the takedown of select publications posted on a government-operated platform. There are few prior lawsuits in which plaintiffs allege direct censorship by the government on a government-operated platform, and most relevant lawsuits instead allege the government engaged in “censorship by proxy” through requiring online platforms to suppress particular kinds of speech and expression. Consequently, we can look to the paradigmatic “censorship by proxy” case, Murthy v. Missouri, to anticipate how a court might assess the legal arguments posited in Schiff.

The Complaint

The alleged First Amendment violation in Schiff arises out of the takedown of the plaintiffs’ scholarly publications from Patient Safety Network (PSNet), an online platform that publishes research articles and resources about patient safety. The plaintiffs argue that the removal only of articles that may “inculcate or promote” the government’s definition of “gender ideology” pursuant to a recent Executive Order imposes a viewpoint-based restriction in a government forum, PSNet, and the removal of such articles is not reasonable in light of the purpose of PSNet. Moreover, the plaintiffs argue they suffer ongoing undue and actual hardship and irreparable injury from the removal of their articles from PSNet and have no adequate remedy at law to correct this injury. As such, the plaintiffs seek to preliminarily and permanently enjoin the defendants from further censoring their research through implementation of the executive order.

PSNet is operated by the Agency for Healthcare Research and Quality (AHRQ), an executive branch agency within the federal government and a part of the Department of Health and Human Services. PSNet is a leading resource for information on patient safety in the United States, and all content is free and accessible to the public. PSNet’s scholarly publication facet is managed by its editorial team, a group of editors and a librarian who review submissions and select content for publication in various PSNet collections.

One such collection is PSNet’s Case Studies series. Publications in the Case Studies series are sourced from online form submissions by healthcare providers that include a case description of a given medical error and a short recommendation for how healthcare providers or systems might prevent similar errors from happening in the future and thus increase patient safety. PSNet’s editorial team selects submissions for publication based on a number of criteria, including clinical interest and educational value. Following selection of a case, the editorial team invites healthcare providers to submit a commentary based on the case. All articles published by PSNet include a disclaimer that “[r]eaders should not interpret any statement in this report as an official position of AHRQ or of the U.S. Department of Health and Human Services.”

Plaintiffs Gordon Schiff, M.D., and Celeste Royce, M.D., are associate professors of medicine at Harvard Medical School who publish research articles about patient safety in their respective medical specialties. Both have separately published articles about patient safety on PSNet. PSNet accepted a commentary written by Dr. Schiff and co-authors for a case on suicide assessment and prevention. In the publication process, the authors and the editorial team exchanged multiple drafts of the commentary. The second draft of the commentary included a sentence describing “high risk groups” for suicide as including individuals who identify as lesbian, gay, bisexual, or queer/questioning. The PSNet editorial team did not substantively modify this part of the sentence prior to publication, and the case commentary was published on PSNet on January 7, 2022 under the title “Multiple Missed Opportunities for Suicide Risk Assessment” and included the aforementioned disclaimer.

In a separate publication cycle, Dr. Royce and a co-author submitted a commentary for publication in the Case Studies series on delayed diagnosis of endometriosis. Like Dr. Schiff, Dr. Royce exchanged multiple drafts of her commentary with the PSNet editorial team in the publication process. The commentary included text stating that “endometriosis can occur in trans and non-gender-conforming people” and that lack of understanding of this fact could make diagnosis more challenging. No substantive comments or changes dealt with the statement that “endometriosis can occur in trans and non-gender-conforming people.” Dr. Royce’s case commentary was published on PSNet on June 24, 2020 under the title “Endometriosis Commentary” and included the aforementioned disclaimer.

On January 20, 2025, President Trump issued Executive Order 14168. The order directed government agencies to combat “gender ideology,” which it says “replaces the biological category of sex with an ever-shifting concept of self-assessed gender identity,” by removing all statements that promote or otherwise inculcate gender ideology from federal platforms. The Office of Personnel Management subsequently issued a memo instructing all agencies, including AHRQ, which oversees PSNet, to “take down all outward facing media that inculcate or promote gender ideology” and report all steps taken to implement this instruction. AHRQ removed articles from PSNet that contained words or terms that “inculcate or promote” the government’s definition of “gender ideology” and took down the plaintiffs’ articles on January 31, 2025.

Following the takedowns, PSNet’s editorial team shared separate emails with the plaintiffs explaining that Dr. Schiff’s article was removed due to inclusion of the words “transgender” and “LGBTQ” in his article and that Dr. Royce’s article was removed for its inclusion of the phrase “endometriosis can occur in trans and non-gender-conforming people” and the description that it may make diagnosis in those populations more challenging. AHRQ subsequently offered to repost the plaintiffs’ articles on the condition that the plaintiffs would remove the language in violation of the executive order. Both plaintiffs declined after AHRQ did not agree to their proposed revisions. As of March 12, 2025, both articles remain unavailable on PSNet.

As for relief, the plaintiffs request that the court declare: (1) that the Office of Personnel Management’s internal direction to “take down all outward facing media . . . that inculcate or promote gender ideology” as applied to speech by private individuals and organizations on government-run forums is unconstitutional and unlawful and (2) that AHRQ’s implementation of the direction by removing or altering speech of private individuals and organizations on government-run forums is unconstitutional and unlawful.

“Censorship by Proxy” in Murthy v. Missouri

In recent years, the government has been sued for engaging in “censorship by proxy” through requiring digital platforms that widely share information to engage in content-moderation practices to comport with government standards. Multiple “censorship by proxy” suits are currently working their way through the federal courts, including one contesting state book bans and another contesting government-promoted “blacklist tools” that allegedly strip media organizations of advertising revenue, and the outcomes of the suits appear to be fact-dependent. The most recent Supreme Court decision on “censorship by proxy” in Murthy v. Missouri found the plaintiffs were barred from alleging First Amendment violations due to procedural deficiencies in their claims.

In Murthy v. Missouri, two states and five individual social-media users sued executive branch agencies and officials, alleging that they pressured social-media platforms to suppress protected speech. Specifically, the plaintiffs alleged that, during the COVID-19 pandemic, Facebook and Twitter deleted posts deemed to be false regarding measures to combat the spread of the virus, as well as “conspiracy theories” about the origins of the virus. The plaintiffs also alleged these same content-moderation practices were applied to content related to the 2020 United States presidential election. The district court granted the plaintiffs a preliminary injunction, finding that executive branch and agency actors likely coerced and significantly encouraged platforms to engage in content-moderation decisions that effectively were decisions of the government. The Fifth Circuit affirmed this part of the injunction, finding that both groups of plaintiffs had standing. The individuals had standing because the social media companies had suppressed the plaintiffs’ speech and were likely to do so again in the future, and the states had standing because the platforms limited the states’ “right to listen” to their citizens on social media.

On appeal, the Supreme Court reversed on the basis that neither the individual nor the state plaintiffs had standing. To have standing to bring suit in federal court, a plaintiff must demonstrate they have suffered or will suffer an injury that is concrete, particularized, and imminent; fairly traceable to the challenged action in the suit; and redressable through a favorable ruling by a court. As a preliminary determination, the Court found that the plaintiffs did not seek to enjoin the appropriate parties—the social-media platforms—for their “direct censorship injuries.” Because social-media platforms, not government actors, were alleged to implement the alleged censorship, the Court stated it was incapable of redressing injuries resulting from the conduct of third parties not present in the suit. As to the traceability of the alleged injuries, the Court found the platforms had “independent incentives” to moderate content relating to the COVID-19 pandemic and 2020 presidential election. While the government defendants played some role in the moderation choices, there was no evidence to suggest every content-moderation decision alleged to be censorship was made under government direction. As such, the plaintiffs lacked a fairly traceable injury and did not have standing to sue.

The Court also found that the lower courts improperly enjoined the government defendants given the lack of standing. The fact that the plaintiffs sought forward-looking relief in the form of an injunction made any past alleged injuries relevant only if they were predictive of future censorship activities. The record did not include any specific findings about the causation of discrete instances of content moderation. Rather, the lower courts relied on statements that the platforms censored particular viewpoints on issues and the government defendants engaged in “a years-long pressure campaign” to ensure viewpoint suppression. The Court rejected these assertions as overly broad and unsupported by the record. Additionally, the Court found the plaintiffs misplaced their reliance on past government censorship as evidence that future censorship was likely. Because the plaintiffs failed to trace the alleged censorship by the platforms to the government’s role in the platforms’ content-moderation, their prior harms could not be used to establish standing to seek an injunction to prevent future harms.

The Court concluded by considering, and declining to agree with, the plaintiffs’ counterarguments in turn. First, the Court found the plaintiffs’ allegation that they suffered ongoing harm from having to self-censor on social media did not support standing. Specifically, as provided in precedent, the plaintiffs could not “manufacture standing merely by inflicting harm on themselves based on their fears of hypothetical future harm that is not certainly impending.” Second, the plaintiffs’ argument that the platforms continued to suppress their speech per policies initially adopted at the direction of the government failed due to a lack of redressability. The Court indicated that the lack of ongoing pressure from the government made the platforms’ continued content-moderation attributable only to the platforms, not the government, even though the practices may have initially been imposed due to governmental coercion. Moreover, as the government scaled back pandemic response measures, the platforms continued to enforce COVID-19 misinformation policies to the same degree as during the pandemic. Collectively, the Court found these facts suggested that the ongoing harm could only be redressed by legal action against the platforms, not the government, and that such redress could not be pursued in the suit.

Implications of Murthy v. Missouri in Schiff

It is difficult to anticipate how a court’s analysis will proceed in Schiff, but similarities and differences between the Supreme Court’s decision in Murthy and the pending suit in Schiff provide insights as to how a court may assess government censorship imposed through a government-owned platform.

Like the plaintiffs in Murthy, the Schiff plaintiffs sued for a preliminary injunction against the government actors in the case. As a result, they undoubtedly must fulfill the standing requirements necessary to bring suit for an injunction. Several aspects of the Schiff plaintiffs’ situation may help them avoid the pitfalls the Murthy plaintiffs faced in their standing analysis on appeal. 

First, the harms alleged by the two Schiff plaintiffs appear to be supported by allegations that are more detailed than those of the Murthy plaintiffs. The Murthy plaintiffs made sweeping allegations of government censorship on behalf of multiple individuals and states in a single action, and, as noted by the Supreme Court, the Murthy plaintiffs’ amended complaint provided few specifics about the causation of individual instances of censorship by particular government defendants. Conversely, the Schiff plaintiffs plead specific facts about a few particular government agencies and particular government actors who engaged in discrete acts to take down the plaintiffs’ online publications in accordance with a particular order issued directly by the executive branch. On their face, these facts appear better positioned to allege a “concrete and particularized harm.”

Second, the traceability of the harms to the plaintiffs, as well as their imminence, may be more easily demonstrated in Schiff. The primary pitfall of the Murthy plaintiffs’ standing argument was the lack of traceability of the act of censoring through content moderation to the government defendants because the government was not involved in every content-moderation action taken by the third-party social media platforms. In Schiff, the lack of a non-government third party in the implementation and execution of the publication takedowns may strengthen an argument of traceability and prevent the disconnect between government actors and conduct like that in Murthy. Moreover, the facts that the Schiff plaintiffs’ publications were taken down pursuant to an ongoing executive order; that the plaintiffs were not offered viable alternative methods to restore their online publications in accordance with their protected speech; and that the publications have not been republished on PSNet as of March 12, 2025 could suggest the executive order poses an ongoing First Amendment violation.

Third, redressability of the harms to the plaintiffs through enjoining the defendants may be more successful in Schiff than in Murthy. In Murthy, the Supreme Court found redress could not be achieved through enjoining the government defendants given the defendants’ lack of ongoing involvement in the alleged censorship. Since there was no evidence that the platforms’ current content-moderation practices were pursuant to government directives, enjoining the government would not remedy the ongoing First Amendment violations the plaintiffs alleged. In Schiff, because the plaintiffs’ alleged ongoing First Amendment violations (removal of their online publications from PSNet) are driven by a government directive (the executive order), enjoining the government defendants from enforcing the executive order may be more likely to redress the plaintiffs’ ongoing harm.

Additional Considerations in Schiff

Another feature of Schiff that warrants discussion is the justification provided by the U.S. Office of Personnel Management for adopting guidelines that apply the executive order to the operations of AHRQ. Per the complaint, Charles Ezell, the Acting Director of the U.S. Office of Personnel Management, cited 5 U.S.C. § 1103(a)(1) and (5) as grounds for his authority to provide the guidelines, since the Director is charged with “securing accuracy, uniformity, and justice in the functions of the office” and “executing, administering, and enforcing” the civil service rules and regulations of the President and the Office and laws governing the civil service. Both of these responsibilities of the Director may be implicated in the suit as arguments advanced by the government to justify the implementation of the executive order.

First, the government could argue that the Director’s implementation of the guidelines and takedown of the plaintiffs’ publications were appropriate given his charge to secure the “accuracy” of the office. The plaintiffs’ publications that allegedly “promote or inculcate gender ideology” by suggesting there are more than two sexes could be argued to be “inaccurate” in the eyes of the executive branch, and thus takedown of the articles was necessary to further “accuracy” in the functions of the office.

Second, the government could argue that the Director’s implementation of the guidelines and takedown of the plaintiffs’ publications are necessary as an execution and enforcement of the “civil service rules and regulations of the President and Office.” This argument would hinge upon whether the executive order can be classified as a “civil service rule and regulation.”

If the plaintiffs allege a viable First Amendment claim, both of the above arguments could play into the justification the government would need to provide to excuse the First Amendment violation. The plaintiffs allege the takedown of their publications constitutes a viewpoint-discriminatory restriction on their protected speech. If the executive order is found to be a viewpoint-discriminatory restriction, the First Amendment violation would be assessed under a “strict scrutiny” standard of review, in which the government would have to demonstrate that the executive order is “narrowly tailored” and necessary to achieve a “compelling governmental interest.”

On its face, the executive order appears quite broad with its order that agencies remove all statements that “promote or inculcate gender ideology.” As such, it is uncertain whether it would satisfy the “narrowly tailored” requirement. Moreover, it is uncertain whether combating gender ideology and promoting a policy that there are only two sexes pass muster as a “compelling governmental interest.” The fact that this policy is enshrined in an executive order may heighten the importance of the interest. However, the language justifying the policy in the executive order itself is vague, providing only that “deny[ing] the biological reality of sex . . . is wrong” and “[t]he erasure of sex in language . . . has a corrosive impact not just on women but on the validity of the entire American system.” Consequently, whether the policy advanced by the executive order constitutes a compelling governmental interest may similarly be an open question.

Conclusion

As the Schiff plaintiffs await an appearance from the defendants in the suit, it is uncertain how broadly the executive order will continue to be applied. The plaintiffs in Schiff report that the Office of Personnel Management guidelines resulted in the removal of at least 20 total publications from PSNet, but the number of takedowns attributable to the executive order on other government-operated platforms is unknown. As proceedings in Schiff progress, and as additional suits alleging similar First Amendment violations under the executive order may be filed, direct censorship by the government through government-operated platforms is sure to become an even more contentious issue.

New White Paper on Federal Public Access Policies

Posted March 20, 2025
Eisenhower Executive Office Building, home to the Office of Science and Technology Policy (Official White House photo, by Carlos Fyfe; Public Domain; Source: Wikimedia Commons)

Authors Alliance and SPARC have released the second of four planned white papers addressing legal issues surrounding open access to scholarly publications under the 2022 OSTP memo (the “Nelson Memo”). The white papers are part of a larger project (described here) to support legal pathways to open access.

The first paper discussed the “Federal Purpose License” and how it supports federal public access policies under the Nelson Memo. This second paper discusses the legal landscape surrounding the Federal Purpose License and the public access policies in light of concerns that the policies are not permissible government actions. The white paper explains why they are.

The White Paper is available here. Supporting materials, previous papers, and other formats are available here.

In the last couple of months there has been a lot of change in the Federal grants space, but so far the public access policies, including the latest announced by the Nelson Memo, are still in place. Several agencies have already implemented their responses to the Nelson Memo through regulation; the rest are due to finish the task later this year.

For Federal agencies to act permissibly, their actions must be grounded in a valid Congressional delegation of authority. Congress can’t escape its limitations by delegating beyond its own authority, so for the delegation to be valid, the actions also must be permissible actions for Congress. The first part of the paper examines Congress’s constitutional power to provide grants for research and development, finding support under both the Spending and Progress clauses.

The Federal Purpose License places a condition on acceptance of grant funds. Congress doesn’t have unlimited power to place conditions on grants, as established by the Supreme Court  in South Dakota v. Dole. However, the Federal Purpose License safely falls within the limitations placed by Dole.  In particular, the Federal Purpose License as a condition does not violate either the First Amendment’s Speech Clause or the Fifth Amendment’s Takings Clause.

The second part of the paper looks at Congress’s delegation of authority and the agencies’ development of the policies. The paper explains how Congress expressly—and permissibly—delegated power and obligation to create the prototype of the public access policy to the National Institutes of Health, and how subsequent application of the policy to the rest of the grant-making agencies is strongly supported by principles of implicit delegation and was established through appropriate rulemaking. Though the recent case of Loper Bright Enterprises v. Raimondo may require agencies to satisfy a somewhat higher burden when defending their actions, the Supreme Court’s abandonment of the “Chevron doctrine” does nothing to change the permissibility of the public access policies or the use of the Federal Purpose License.

The next paper will examine the interaction between institutional intellectual property policies and federal public access policies, and the final paper will discuss issues surrounding article versioning. Watch this space for more!

Thaler v. Perlmutter: D.C. Circuit confirms that a non-human machine cannot be an author under the U.S. Copyright Act

Posted March 19, 2025
A Recent Entrance to Paradise, an image generated by Stephen Thaler’s “Creativity Machine.”

Yesterday, the U.S. Court of Appeals for the District of Columbia Circuit issued its ruling in Thaler v. Perlmutter, a case centered on the question of whether a non-human machine, without any intervention from a human, could be an author and hold copyright under the U.S. Copyright Act. The court found that a non-human machine could not be an author under the Act.

In 2018, Stephen Thaler filed an application to register a copyright claim in A Recent Entrance to Paradise with the U.S. Copyright Office. In that application, the author of the image was identified as the “Creativity Machine,” with Thaler listed as the claimant with a transfer statement: “ownership of the machine.” In his application, Thaler stated that the work “was autonomously created by a computer algorithm running on a machine” and he was “seeking to register this computer-generated work as a work-for-hire to the owner of the Creativity Machine.” The Copyright Office refused to register the work and later affirmed the denial of registration. Thaler subsequently sued the Copyright Office and lost. He appealed and has now lost again.

In virtually every way, this decision should not be surprising. While it is absolutely conceivable that the product of AI and human collaboration may result in copyrightable works, it is well-settled law that non-human authorship is not recognized under the U.S. Copyright Act. This opinion is mostly a repetition of the positions taken by the U.S. Copyright Office in its denial of registration.

That acknowledged, there are some points worth highlighting from the opinion:  

  • First, the court centers much of its analysis on the text of the Copyright Act and the myriad ways in which the statutory language is dependent on humans as authors. Taken together, these provisions show that the Act is unarguably built upon the premise of human authorship. The court says: “All of these statutory provisions collectively identify an ‘author’ as a human being. Machines do not have property, traditional human lifespans, family members, domiciles, nationalities, mentes reae, or signatures.”
  • Second, part of the court’s analysis is focused on whether the public would benefit from granting copyright to machine-authored works, and the court ultimately concludes that it would not. The court says: “But the Supreme Court has long held that copyright law is intended to benefit the public, not authors. Copyright law ‘makes reward to the owner a secondary consideration. “[T]he primary object in conferring the monopoly lie[s] in the general benefits derived by the public from the labors of authors.”’”
  • It is important to remember that this opinion is only about the narrow question of whether a machine, working in isolation and with no human intervention, can be considered the author of a work. We should be careful not to try to extend this opinion beyond that. “Those line-drawing disagreements over how much artificial intelligence contributed to a particular human author’s work are neither here nor there in this case. That is because Dr. Thaler listed the Creativity Machine as the sole author of the work before us, and it is undeniably a machine, not a human being.” 
  • Finally, the district court found that Dr. Thaler had waived the argument that, as creator of the Creativity Machine, he was the work’s author.  The Court of Appeals found that Dr. Thaler had not challenged that waiver and that it therefore could not address the question of whether works generated by Artificial Intelligence might be authored by the creator of the AI. (“Dr. Thaler argues that he is the work’s author because he made and used the Creativity Machine. We cannot reach that argument.”) This leaves some ambiguity as to whether a future creator of an AI might successfully claim copyright in a work themselves. It also leaves open questions where the human user of AI claims to be the author of an AI-generated work or portions of a work. This is the question the court will have to address head-on in Allen v. Perlmutter, a case currently pending in Colorado. We will continue to watch this space, and share with you any new developments.  

Ultimately, the Thaler v. Perlmutter decision is limited to the fact that a machine cannot be an author under copyright law. This is a sensible result and consistent with sound public policy. 

AI Licensing: An Interview with Ben Denne of Cambridge University Press

Posted March 17, 2025

We’ve heard from lots of authors with questions about AI licensing of their works by their publishers. Cambridge University Press is one that has been in the news because it has undertaken a project to ask authors to opt into a contract addendum that would allow CUP to license AI rights for their books, giving authors a royalty on AI licensing net revenue. Cambridge has shared an FAQ with authors already, along with a further explanation of its approach last September and a report in January highlighting that it had contacted some 17,000 authors, the majority of whom have opted in. 

Below is an interview with Ben Denne, Director of Publishing, Academic Books, at Cambridge University Press, answering some questions about the program. 

Dave: Thank you, Ben, for talking with me. To start off, could you say what your role is at Cambridge University Press?

Ben: I’m the Director of Publishing for the Academic Books part of the Academic Division of Cambridge. In short, I’m the director overseeing the whole of the books program, the Academic Books program for Cambridge, except for the Bibles. That’s a specialist unit that runs separately that I don’t have anything to do with, but that means our textbooks, our research and reference books, and then we have a kind of small program of more traditional academic titles that sell a bit more to a bit of a wider audience.

Dave:  Thanks. My interest in talking with you is about generative AI licensing. And we’ve had quite a few authors actually forward us some emails that they’ve gotten from Cambridge presenting an AI license addendum to sign that goes with their contract and also an FAQ. I’d like to ask just a few questions about how that’s going and how that works.

What are Cambridge University Press’ goals with AI licensing?

Ben:  That’s a really good question. Broadly speaking, when this started to come our way, which was about the same time a couple of years ago that this subject became really noisy, we were looking at it and thinking, what’s the best way through this? How do we appropriately engage in this conversation? And I think it came back to us thinking about encouraging responsible use and thinking about our role as an academic publisher.

And I think our role as an academic publisher is to push the academic debate forward, which means that we want our authors’ books to get read. We want them to get used. We want them to get cited. I think that’s really the kind of spirit we came into this conversation with is thinking, these developments are happening, right, that they’re happening anyway and the best thing we can do as a publisher is try and engage with this debate and push it in a direction that we think really helps to underline those principles of how good research is done.

Dave:  One of the things that I’ve seen with CUP’s rollout with this asking authors is, first of all, that you are asking authors.  Could you talk me through that decision? We’ve seen some other publishers in the news just announce that they have licensing deals with technology companies, and there was no outreach to authors as far as we can tell from those publishers. So could you talk through that thought process of this outreach?

Ben: Sure, so for us, when we first looked at this, we have a contract that authors sign, which is, probably in many ways, very similar to contracts that they signed with other publishers, and it includes all sorts of clauses about use and wide ranging licensing rights.  And one of the things it covers is derivative uses for content and the right to make derivatives. Our sense with that is when we looked at this in the context of these AI conversations and licensing, from a legal perspective, we looked at that and thought, well actually that derivative use clause technically does cover us for this kind of work. And I’m sure that’s the conclusion that some other people have reached too.

But we also thought, it just feels a bit like nobody knew that this kind of technology was emerging when they signed those contracts. And so from our perspective, we thought there’s a lot of noise about this subject in the whole ecosystem right now, you know, you can’t read the news without reading about AI, and people are nervous about it, understandably, and all of those kinds of things. So we felt that we should treat this as additional consent and approach it in that spirit. And that really underpins the decision to go out with the addendum for existing contracts.

I don’t want to jump onto any of your other questions, but that kind of principle, that we were going to ask for opt-ins, was important. Authors have to actively opt into this. We’re not saying to them, “If we don’t hear from you, we’ll assume you’ve opted in.” They have to actually come back to us and say that they’re happy for that use to happen.

Dave:  I think one of the things a lot of people don’t think about is how complicated rights clearance is, especially at scale, across a title list that is the size that you have. So this seems to me like a pretty big investment in just doing this process. Could you say how many of these  you have sent out? I gather that you’re doing this in batches, but do you have a sense of the scale of how many author addendum requests you anticipate making over the course of however long this process lasts?

Ben: It’s a really good question and it’s a moving target. At this stage, we have sent out multiple thousands. But I think we have about 45,000 books available in print and digitally at the moment. And we’re working our way through that list systematically. So we’re in the thousands, and you’re right: it is a pretty big undertaking. You know, it’s quite a logistical challenge to do. We had to set up a whole kind of new workflow for doing this. We have a team that are working on the addenda and addressing the questions that authors have and all of those kinds of things.

Dave: This is maybe getting in the weeds, but it seems to me like there’s a pretty big difference between figuring this out for a sole-authored, single-part monograph, for instance, which is mostly what I’ve seen come through, and edited volumes. Have you tried to figure out those more complex books with multiple authors, multiple works within them?

Ben: Yeah, so the way it’s working for us is where we have several contracted authors for a book, we’re contacting them all and all of those authors have to opt in in order for us to agree that we have the licensing rights.

For edited volumes with multiple contributors, we’re not contacting the individual contributors for opt-in and there are a couple of different reasons for that. Typically, they don’t get paid royalties and also it would just be impossible for us to do. I mean, that’s logistically, you know, that’s a huge ask. So what we are doing is we are still contacting the editors for those volumes and the editors will opt in or not. So if the editor opts in, our understanding is that they’re opting in on behalf of the contributors as well.

But for multi-authored works, we get in touch with all of them. And in fact, we have quite a sizable number of books which are stuck because we’ve had some authors opt-in and some authors not opt-in.

Dave: This is a pretty fast-moving technology and I think a lot of authors are feeling just uncertain right now. And so I wonder about the opt-in window, if an author declines to opt in right now, is that it? Is there an opportunity to come back later after the dust settles and say, oh, no, actually, you know, I’d be happy to have my work used in this way? 

Ben: Yeah, definitely. We’re in the process of putting something in place so that if authors don’t opt in now they are able to come back and opt in later. And by the way, if they don’t opt in, that’s fine for all the reasons that you just said; some people are queasy about this and that’s okay. We’re not trying to, we’re not putting a hard sell on it.

My sense with this is that for some of the people that we’re speaking to who haven’t opted in, it is because they haven’t yet really seen what the kind of use cases are for this kind of technology. Perhaps as those become more public, people will want to come back and opt in. 

I think some of the things that are out there are going to be quite powerful discovery tools in the future. So we want to make sure the authors do have the opportunity to opt in later if they want to, although we can’t, of course, be sure that if people opt in later the same opportunities will necessarily be available then, since this is quite a fast moving area.

Dave:  For your contracts moving forward for front list books, is a clause like this now a default in those agreements or will authors of new books have the option to opt-in or opt-out for AI licensing?

Ben: Good question. Currently, we have put a clause into our contracts to add AI licensing. But, where authors are asking us to remove that clause, we’re taking it out.

And again, coming back to your point before, those authors could opt in later. But for the contracts as they go out, we have it in as a clause now.

Dave: Okay. So let’s shift to if you’re gathering all of these rights from authors, presumably at some point, then you would actually engage in the licensing with technology companies or others. Could you say a little bit about that? Do you have any deals in place with tech companies already? Or, the other thing that I’ve seen is, some publishers have been in the position of not doing those deals directly, but having sort of sub-licensing deals with others. I understand ProQuest Clarivate is doing this, and I think Wiley is as well. Do you have any of those deals in place now?

Ben:  We’re still having those conversations at the moment. And we are talking to a range of different people who are looking at this kind of content. 

Dave:  Okay, that’s really helpful to know.

At the beginning, you talked a little bit about Cambridge University Press’s motivations with engaging in this space and doing licensing. Could you talk a little bit about important factors for what might show up in one of those kinds of deals with tech companies?  For instance, one of the things that I think aligns with the sort of values that you outlined at the beginning and that authors care a lot about is credit, right? We know that, especially for academic authors, credit is incredibly valuable and important. And so I wonder if you’ve thought about how ensuring author credit might factor into any sort of downstream deal that CUP might engage in?

Ben: Absolutely. So we’re having exactly those conversations at the moment with anybody that we’re talking to. And we’ve been very clear with our authors when they’ve asked questions about this, and you may have seen this alluded to in some of the information that you’ve had forwarded to you from authors, that those principles of attribution are 100% what we’re focused on.  Really, they’re kind of a red line for us. 

One of the things we’ve been discussing in lots of conversations with people around this technology is the question of at what level content needs to be attributed. Our sense with this is that any kind of meaningful extract from somebody else’s work needs to be cited.

I’m kind of repeating myself, but that’s how research works. People build on other people’s work, and so in a scenario where content is being ‘discovered’, if we can’t identify and cite that content, it can’t be accurately attributed. So that’s a red line for us.

Dave: Right.  I think figuring out that attribution, like at what level does that attribution need to kick in, is a really tricky thing. It seems to me, that if you’ve got a foundation model that is pulling in some texts and then someone’s using, say ChatGPT to write emails and somewhere in the model it gleans some structural components from sources like academic books,  I don’t think that’s the thing most authors care about – being cited for the fact that you help train this model to understand how to format citations or do other things like that. It’s the intellectual content that matters and that’s the really tricky piece of it.

Ben: Absolutely and I don’t have an easy answer for you there. So we’re having those conversations at the moment, but our sense is that any sort of direct quote, anything that could be, you know, anything that you would consider to be plagiarism or worthy of credit in a non-AI world should be attributed.

Dave:  I realize this question is asking a hypothetical because you don’t have any of these agreements in place yet, but it seems to me there’s a pretty big difference between use of Cambridge books for model training and uses such as for Retrieval Augmented Generation (RAG).

Have you thought about those distinctions in terms of how that might affect differences in Cambridge’s willingness to set a price on those things? I assume RAG would come with a higher licensing price than other uses. But could you talk me through that thought process?

Ben: So it’s kind of interesting because I think there’s a little bit of a gray area, because I think a lot of the RAG tools are combined with some aspect of LLM. So they might be looking to summarize some research or write a brief about X, Y, and Z.

I think it is quite interesting at the moment that most of the questions we get from people who are worried about this are really anxious about LLMs, but I feel like the really exciting place for academia and research is around that kind of retrieval augmented generation because that’s what’s going to help with discoverability for authors.  It is difficult to talk about at the moment because we don’t have any public deals that I can point to. But I’d say a lot of the conversations that we’re having are somewhere between those two things, you know, so it’s a combination of an  LLM that’s generating text and a citation engine or discovery engine sitting over content.

Dave: Leaving aside the legal situation for a moment, one of the things that I hear from authors pretty consistently is the sentiment that with these big technology companies coming in, they feel that these companies are sort of profiting off of content that they are exploiting. And so they ought to return something to the system and to authors.

But there’s a really different sentiment about what happens when you have, say, academic researchers using content for AI or text data mining purposes to make new discoveries or learn new things both about the texts and about the world around them. We work a lot with text data mining researchers who are interested in large aggregations of content, not so they can build the next OpenAI,  but so they can understand how language has changed over time, or how has culture changed over time.

I wonder from CUP’s perspective, how do those two different kinds of use cases factor into your thinking about downstream licensing deals for AI/ text data mining?

Ben: Yeah, I think for us that the primary thing we’re really trying to lean on, because of course the whole thing is not quite that clear cut, because a lot of the time it’s the big tech companies that are facilitating a lot of that discovery or that a lot of the kind of discovery traffic goes through them. So I think from our perspective, I’m going to say we’re not ruling out working with anyone. We would put anybody– any partner that we had– through the same diligence process that we would have with onboarding anybody else, but we wouldn’t rule out those conversations with anybody. I think for us, the most important thing is coming back to, and I’m going to sound like a stuck record here, but those principles of attribution. And we have had conversations, some preliminary conversations with people who’ve said, “Well, we don’t think it would be possible to do what you’re asking,” and at that point, we’re saying, “well, okay, then you know that’s the red line for us.”

I think there’s quite a bit of cloudy territory between those two things. And I think for us, the most important thing is to make sure that authors are being credited where their work’s being used.

Dave: All right, I have a hypothetical that I wanted to give you. So we see that it’s a 20% royalty calculated on net revenue. Let’s say you received $5 million from an AI licensing deal. Can you walk me through how that might work out for the author? How do you calculate net revenue on that? And then, for the individual author sitting there who sees CUP sign a big deal, what can they expect?

Ben: That’s a tricky one because it would depend a little bit on the terms of the deal as well. But broadly speaking, the principle is, if that’s the net revenues that we receive, so in your situation, you had five million in there, the full licensing payment, is divided out across the list of titles. Authors then earn the royalty for that sale or license type per title, as they do now with all other forms of licensing. 

But, then, where a licensee can provide accurate title-level usage within their royalty statements, this would instead be used. So in an LLM situation that you were just talking about, that would be divided among those books.  With the retrieval augmented generation tool, I think that would work much more around the basis of usage. So, depending on what searches within that tool were bringing back particular content, then we would be attributing revenue that way.

Dave: Okay, that makes a lot of sense. I think this was in the FAQ: one of your use cases is in an authoritative database that’s used on a perpetual basis. But there was somewhere that talked about the removal of content once a licensing term has ended. I wonder if you’ve developed thinking internally about what a standard term would be, how long these things might last? 

Ben: Yeah, I mean, it’s hard, isn’t it? Because where you’re licensing content to train an LLM, it would be sort of insincere to dress that up. Generally most agreements would be governed by a 2-5 year training term, and at the end of that term the training data set would be destroyed. However, they would retain the output from the specific models that were developed during the training term. If they wanted to create new models they would need to renew the license or extend the term.

For some of the other uses, that’s all being discussed at the moment. I think there is still work on this, but there would be standard partnership length terms. What I would say is that from our perspective, we think it’s quite likely that in the next few years the focus will move away from training large language models and into that area of discovery, and that these are going to become quite important revenue streams for academic publishers.

Dave: Thanks, very helpful. As you work on these deals, what level of transparency do you plan on offering authors or the general public about what these licenses might look like? At least with other publishers, it’s been quite mysterious – I think with one, we learned about an AI licensing deal in a quarterly earnings report, for instance. I think authors do really care about what the details of these deals look like. 

Ben:  It’s tricky, isn’t it? It’s hard for me to talk about a deal that hasn’t been done already, and of course, these deals can be subject to the same commercial confidentiality requirements as any other partnership. But I think it’s fair to say that Cambridge University Press would endeavor to be pretty transparent about what we’re doing generally and most importantly, be transparent about why we’re doing it. So I don’t think we’d be concealing that information from anybody. And coming back to my point before, we’ve been quite clear that we only want to enter into these kinds of conversations with people that we think are using content responsibly, and we’d always aim to be open.

Dave: A few final questions. First, CUP has published a number of open-access books. For example, I believe CUP was part of the TOME initiative.  Do you feel like this kind of addendum is necessary for those open-access books, given that they already have some sort of open license attached to them? Or do you think that this is a necessary addition to those OA licenses? 

Ben: That’s a really good question, and it’s something that we’re grappling with at the moment. Without getting into the kind of weeds around open access, some of it depends on the license. Historically for books, our default open access license was a Creative Commons CC BY NC license, which prohibits commercial reuse. I think at the moment, we’re looking at that (and I think a lot of publishers would say the same thing) and working through how that fits with AI licensing with commercial AI companies. The short answer to your question is if you have a CC BY license, then people do have a broad license to reuse that content. So at the moment, we’re not actively going after those authors for opt-ins, nor are we including those books in licensing deals that we’re doing, but that’s also a relatively small number of books.

I can say, we are now looking at using more CC-BY-NC-ND as the default, which restricts the creation of derivative works. You’ve touched on a conversation that is evolving, but we would be treating AI usage as requiring a derivative license and therefore not covered under a CC-BY-NC-ND license.

Dave:  Thanks, that’s very helpful and I think that’s something a lot of authors are trying to figure out: how does AI downstream use factor into Creative Commons licensed works? And of course, the underlying legal situation matters. I didn’t ask, but I assume that the rights that you’re asking for in this addendum are worldwide, since that affects for example whether usage might be permitted under national law. 

Ben: Yes, the rights are worldwide. 

And thinking again about that, I mean, it’s interesting, isn’t it? Because even under the CC-BY license, it doubles down on that principle of attribution as well. That’s the nature of the license so some uses even then may not be covered by that license.

Dave:  Right. That attribution piece under the CC-BY license will be an important one [note: this issue is being litigated, most prominently in the Doe v. Github suit]. And then, there’s also the underlying question of what the law allows independently even if there is no license–open license or otherwise. I know right now there’s a consultation that just closed in the UK about what the law should be, and in the US, we’re fighting these things out in the courts. I think there are 39 lawsuits right now pending about various aspects of this, and a key question in most of them is just how far fair use goes. And of course, you know, if fair use applies then you don’t have to worry too much about what the license says, whether it’s CC BY or CC BY NC ND or anything else.  This is like reading tea leaves but I think the prevailing case law indicates that model training and coming up with the weights has a pretty strong fair use case, but for the output side, that’s where I think it starts to stumble a little bit when you’ve got systems that are producing outputs that are substantially similar to the inputs. So I wouldn’t be surprised if in some of these suits, we get a ruling in favor of fair use and then in some of them we get a different outcome. And then, the landscape is just sort of messy.

And I suppose in the UK, I imagine y’all are watching what that legal landscape looks like around the world as it’s changing.

Ben:  Yeah, absolutely. 

Dave: One final question: we’ve talked a lot about licensing books for AI, but CUP has a substantial journal portfolio as well. Can you say anything about CUP’s approach to use of journal content either as AI training data or for other AI uses? 

Ben: We’ve been more focussed on books, as this is where most of the demand has been to date, but we have seen a developing interest in journal content. We are, therefore, currently exploring this form of licensing in a consultative way with our journal partners. 

Dave:  Well, thank you for talking. And this was really, really helpful. And I think that this will be useful for authors who are trying to understand just more about what’s going on.

Ben:  It’s been a pleasure.

Authors Alliance Comment on US AI Action Plan

Posted March 14, 2025

Today, we submitted a response to a Request for Information from the Office of Science and Technology Policy (OSTP). The OSTP is seeking to develop an “AI Action Plan,” to sustain and accelerate the development of AI in the United States.  As an organization dedicated to advancing the interests of authors who wish to share their works broadly for the public good, we felt it imperative to weigh in on critical copyright and policy issues impacting AI innovation and access to knowledge.

In our response, we reaffirmed our belief that the use of copyrighted works specifically for AI training (distinct from other AI uses) is a quintessential fair use. We noted that Section 1202(b) of the Copyright Act has little utility and serves as an unnecessary stumbling block to the development of AI. We also highlighted the importance of high quality training data and pointed towards the work that is already being done to develop AI training corpora.  

A Few Key Points from Our Submission

Our response to the OSTP highlights several key areas where federal policy can support both authors and a thriving AI research environment:

1. The Role of Fair Use in AI Model Training

We emphasize that fair use has long been a cornerstone of innovation in the U.S., enabling everything from web search engines to digitization projects. US copyright law has played a major role both in developing the incredible creative industries based in the US and in driving leading scientific research and commercial innovation. The key to this innovation policy has been a thoughtful balance between providing a degree of control over copyrighted works to copyright holders and allowing for flexibility when it comes to technological innovation and new transformative uses. AI development relies on the ability to analyze large datasets, many of which include copyrighted materials. The uncertainty surrounding the legal status of AI training data due to ongoing litigation threatens to slow innovation. We urge the federal government to explicitly support the application of fair use to AI training and provide much-needed clarity.

2. Addressing the Contractual Override of Fair Use

Many AI developers face contractual barriers that limit their ability to make fair use of content, particularly in text and data mining applications. We recommend legislative measures to prevent contracts from overriding fair use rights, ensuring that AI researchers and developers can continue innovating without undue restrictions.

3. Access to High Quality Datasets

Access to high-quality datasets is a foundational pillar for AI development, enabling models to learn, refine, and iteratively improve. However, the availability of such datasets is often hindered by restrictive licensing agreements, proprietary controls, and inconsistent data standards. To maximize the potential of AI while ensuring ethical and legally sound development, collaborations between academic institutions, libraries, public archives, and technology developers are essential. Government policies should facilitate public-private partnerships that allow for robust and thoughtfully curated datasets, ensuring that AI systems are trained on a rich range of representative materials.

We invite our community of authors, researchers, and policymakers to review our submission. Your engagement is crucial in shaping a responsible and forward-thinking AI policy in the U.S. You can always reach us at info@authorsalliance.org.

Visit Authors Alliance’s Booth at AWP 2025

Posted March 11, 2025
Our booth at AWP 2024

This post is by Syn Ong, an LL.M. student at U.C. Berkeley Law School. This semester, Syn has been working as our intern on a project to examine the legal landscape for text and data mining and AI research across different jurisdictions. If you’re attending AWP this year, stop by our booth to meet Syn and ask about her work!

The Authors Alliance is excited to announce our participation in the 2025 Association of Writers & Writing Programs (AWP) Conference & Bookfair, taking place March 26–29, 2025, at the Los Angeles Convention Center. We invite all attendees to visit us at Booth T524, where we will be available to connect with writers, answer questions, and discuss the latest developments in publishing, copyright, and authorship.

We’ve participated in AWP in past years, engaging with authors on issues that matter most to them. In previous conferences, we’ve hosted discussions on authors’ rights, book contracts, and publishing strategies. This year, we’re continuing these important conversations at our booth, where you can speak with us about protecting your rights as an author, expanding the reach of your work, and navigating today’s evolving publishing landscape.

What We’ve Been Working On

One of the biggest challenges we’ve addressed recently is how artificial intelligence is affecting authorship—from AI-generated content to concerns about copyright and fair compensation. We maintain a resources page on AI to help authors navigate AI’s impact on their work, and we encourage you to check out our summary of the report from the U.S. Copyright Office on copyrightability and AI.

We’ve also been working closely with academics and researchers to make open access publishing more accessible and sustainable. Whether you’re a scholar navigating institutional requirements or an independent researcher looking to share your work widely, we can help you understand your options for open-access publishing. 

Protecting Your Rights as an Author

We know that authors care about maintaining control over their work, which is why we provide guidance on areas like negotiating book contracts, rights reversion, and termination of transfer. Whether you’re signing a contract for the first time or looking to regain rights to an older work, we can guide you through the key issues to watch for. We encourage you to check out our resources on rights reversion and termination of transfer for more details. If you have questions about how to advocate for fairer publishing terms, come speak with us at Booth T524.

Stay Connected With Us

Beyond AWP, there are many ways to engage with Authors Alliance. You can become a member to access exclusive consultation and support our advocacy efforts, subscribe to our newsletter for updates on authors’ rights and policy changes, and follow us on social media to stay engaged year-round.

The AWP Conference & Bookfair is a cornerstone event for the literary community, and we’re thrilled to be part of it again this year. We look forward to meeting writers, sharing our expertise, and discussing how we can work together to protect authors’ rights and expand access to knowledge. Join us at Booth T524 March 26–29, 2025—we can’t wait to see you in Los Angeles!

Updates on AI Copyright Law and Policy: Section 1202 of the DMCA, Doe v. GitHub, and the UK Copyright and AI Consultation

Posted March 7, 2025
Some district courts have applied DMCA 1202(b) to physical copies, including textiles, which means that if you cut off parts of a fabric that contain copyright information, you could be liable for up to $25,000 in damages.

The US Copyright Act has never been praised for its clarity or its intuitive simplicity—at a whopping 460 pages long, it is filled with hotly debated ambiguities and overly complex provisions. The copyright laws of most other jurisdictions aren’t much better.

Because of this complexity of copyright law, the implications of changes to copyright law and policy are not always clear to most authors. As we’ve said in the past, many of these issues seem arcane, and largely escape public attention. Yet entities with a vested interest in maximalist copyright—often at odds with the public interest—are certainly paying attention, and often claim to speak for all authors when they in fact represent only a small subset.  As part of our efforts to advocate for a future where copyright law offers ample clarity, certainty, and real focus on values such as the advancement of knowledge and free expression, we would like to share with you two recent projects we undertook:

The 1202 Issue Brief and Amicus Brief in Doe v. GitHub

Authors Alliance has been closely monitoring the impact of Digital Millennium Copyright Act (DMCA) Section 1202. As we have explained in a previous post, Section 1202(b) creates liability for those who remove or alter copyright management information (CMI) or distribute works with removed CMI. This provision, originally intended to prevent widespread piracy, has been increasingly invoked in AI copyright lawsuits, raising significant concerns for lawful use of copyrighted materials beyond training AI. While on its face, penalties for removing CMI might seem somewhat reasonable, the scope of CMI (including a wide variety of information such as website terms of service, affiliate links, and other information) combined with the challenge of including it with all downstream distribution of incomplete copies (imagine if you had to replicate and distribute something like the Amazon Kindle terms of service every time you quoted text from an ebook) could be very disruptive for many users.

In order to address the confusion regarding the (somewhat inaptly named) “identicality requirement” by the courts in the 9th Circuit, we have released an issue brief and undertaken to file an amicus brief in the Doe v. GitHub case now pending in the 9th Circuit.

Here are the key reasons why we care—and why you should care—about this seemingly obscure issue:

  • The Precedential Nature of Doe v. GitHub: The upcoming 9th Circuit case, Doe v. GitHub, will address whether Section 1202(b) should only apply when copies made or distributed are identical (or nearly identical) to the original. Lower courts have upheld this identicality requirement to prevent overbroad applications of the law, and the appellate ruling may set a crucial precedent for AI and fair use.
  • Potential Impact on Otherwise Legal Uses: It is not entirely certain whether fair use is a defense to 1202(b) claims. If the identicality requirement is removed, Section 1202(b) could create liability for transformative fair uses, snippet reuse, text and data mining, and other lawful applications. This would introduce uncertainty for authors, researchers, and educators who rely on copyrighted materials in limited, legal ways. We advocate for maintaining the identicality requirement and clarifying that fair use applies as a defense to Section 1202 claims.
  • Possibility of Frivolous Litigation: Section 1202(b) claims have surged in recent years, particularly in AI-related lawsuits. The statute’s vague language and broad applicability have raised fears that opportunistic litigants could use it to chill innovation, scholarship, and creative expression.

To find out more about what’s at stake, please take a look at our 1202(b) Issue Brief. You are also invited to share your stories with us about how you have navigated this strange statute.

Reply to the UK Open Consultation on Copyright and AI

We have members in the UK, and many of our US-based members publish in the UK. We have been watching developments in UK copyright law closely, and have recently filed a comment in response to the UK Open Consultation on Copyright and AI. In our comment, we emphasized the importance of ensuring that copyright policy serves the public interest. Our response’s key points include:

  • Competition Concerns: We alerted policy-makers that their top objectives must include preventing monopolies from forming in the AI space. If licensing for AI training becomes the norm, we foresee power consolidating in a handful of tech companies and their unbridled monopoly permeating all aspects of our lives within a few decades, if not sooner.
  • Fair Use as a Guiding Principle: We strongly believe that the use of works in the training and development of AI models constitutes fair use under US law. While this issue is currently being tested in courts, case law suggests that fair use will prevail, ensuring that AI training on copyrighted works remains permissible. The UK does not have an identical fair use statute, but has recognized that some of its functions—such as flexibility to permit new technological uses—are valuable. We argue that the wise approach is for the UK to update its laws to ensure its creative and tech sectors can meaningfully participate in the global arena. Our comment called for a broad AI and TDM exception allowing temporary copies of copyrighted works for AI training. We emphasized that when AI models extract uncopyrightable elements, such as facts and ideas, this should remain lawful and protected. 
  • Noncommercial Research Should Be Protected: We strongly advocated for the protection of noncommercial AI research, arguing that academic institutions and their researchers should not face legal barriers when using copyrighted works to train AI models for research purposes. Imposing additional licensing requirements would place undue burdens on academic institutions, which already pay significant fees to access research materials.

Fair Use, Censorship, and Struggle for Control of Facts

Posted February 27, 2025
Caption: 451 is the HTTP status code returned when a webpage is unavailable for legal reasons; it is also the temperature at which books catch fire and burn. This public domain image was taken inside the Internet Archive.

Imagine this: a high-profile aerospace and media billionaire threatens to sue you for writing an unauthorized and unflattering biography. In the course of writing, you rely on several news articles, including a series of in-depth pieces about the billionaire’s life written over a decade earlier. Given their closeness in time to real events, you quote, sometimes extensively, from those articles in several places. 

On the eve of publication, your manuscript is leaked. Through one of his associated companies, the billionaire buys up the copyrights to the articles from which you quote. The next day the company files an infringement lawsuit against you. 

Copyright Censorship: a Time-Honored Tradition

It’s easy to imagine such a suit brought by a modern billionaire—perhaps Elon Musk or Jeff Bezos. But using copyright as a tool for censorship is a time-honored tradition. In this case, Howard Hughes tried it out in 1966, using his company Rosemont Enterprises to file suit against Random House for a biography it would eventually publish.

As we’ve seen many times before and since, the courts turned to copyright’s “fair use” right to rescue the biography from censorship. Fair use, the court explained, exists so that “courts in passing upon particular claims of infringement must occasionally subordinate the copyright holder’s interest in a maximum financial return to the greater public interest in the development of art, science and industry.” 

Singling out the biographical nature of the work and its importance in surfacing underlying facts, the court explained: 

Biographies, of course, are fundamentally personal histories and it is both reasonable and customary for biographers to refer to and utilize earlier works dealing with the subject of the work and occasionally to quote directly from such works. . . . This practice is permitted because of the public benefit in encouraging the development of historical and biographical works and their public distribution, e.g., so “that the world may not be deprived of improvements, or the progress of the arts be retarded.”

Fair use playing this role is no accident. As the Supreme Court has explained, the relationship between copyright and free expression is complicated. On the one hand, the Court has explained,  “[T]he Framers intended copyright itself to be the engine of free expression. By establishing a marketable right to the use of one’s expression, copyright supplies the economic incentive to create and disseminate ideas.” But, recognizing that such exclusive control over expression could chill the very speech copyright seeks to enable, the law contains what the Court has described as two “traditional First Amendment safeguards” to ensure that facts and ideas remain available for free reuse: 1) protections against control over facts and ideas, and 2) fair use. 

But rescuing a biography that merely quotes, even extensively, from earlier articles seems like an easy call, especially when the plaintiff has so clearly engineered the copyright suit not to protect legitimate economic interests but to suppress an unpopular narrative.

The world is a little more complicated now. Can fair use continue to protect free expression from excessive enforcement of copyright? I think so, but two key areas are at risk: 

Fair Use and the Archives

It may have escaped your notice that large chunks of online content disappear each year. 

For years, archivists have recognized and worked to address the problem. Websites going dark is an annoyance for most of us, but in some cases, it can have real implications for understanding recent history, even as officially documented. For example, back in 2013, a report revealed that well over half of the websites linked to in Supreme Court opinions no longer work, jeopardizing our understanding of just what went into why and how the Court decided an issue.  

While most websites disappear from benign neglect, others are intentionally taken down to remove records from public scrutiny.  Exhibit A may be the 8,000+ government web pages recently removed by the new presidential administration, but there are many other examples (even whole “reputation management” firms devoted to scrubbing the web of information that may cast one in an unfavorable light). 

The most well-known bulwark against disappearing internet content is the Internet Archive, which has, at this point, archived over 900 billion web pages. Over and over again, we’ve seen its Wayback Machine used to shine a light on history that powerful people would rather have hidden. It’s also why the Wayback Machine has been blocked or threatened at various times in China, Russia, India, and other jurisdictions where free expression protections are weak.

It’s not just the open web that is disappearing. A recent report on the problem of “Vanishing Culture” highlights how this challenge pervades modern cultural works. Everything from 90s shareware video games to the entirety of the MTV News Archive is at risk. As Jordan Mechner, a contributor to the report, explains, “historical oblivion is the default, not the exception” to the human record. As the report explains, it’s not just disappearing content that poses a problem: libraries and consumers must also grapple with electronic content that can be remotely changed by publishers or others. As just one example among many, in just the last few years we’ve seen surreptitious modifications to ebooks on readers’ devices, some changing important aspects of the plot, for works by authors such as RL Stine, Roald Dahl, and Agatha Christie.

The case for preservation as a foundational necessity to combat censorship is straightforward. “There is no political power without power over the archive,” Jacques Derrida reminds us. Without access to a stable, high-fidelity copy of the historical record, there can be no meaningful reflection on what went right or wrong, or holding to account those in power who may oppose an accurate representation of their past. 

What sometimes goes unnoticed is that, without fair use, a large portion of these preservation efforts would be illegal. 

In a world where century-long copyright protection applies automatically to any human expression with even a “modicum of creativity,” virtually everything created in the last century is subject to copyright. This is a problem for digital works because practically any preservation effort involves making copies—often lots of them—to ensure the integrity of the content. Making those copies means that archivists must rely on fair use to preserve these works and make them available in meaningful ways to researchers and others. 

The upshot is that every time the Internet Archive archives a website, it’s an act of faith in fair use. Is that faith well-founded? 

I think so. But the answer is complicated. 

For preservation efforts like those of the Internet Archive, fair use is a foundation, but not an unshakable one. Two recent cases highlight the risk, one against its book lending program and the other objecting to its “Great 78” record project. Both take issue with how the Archive provides access to preserved digital copies in its collections. While not directly attacking the preservation of those materials, the suits jeopardize their effective use. As archivists have long lamented, “preservation without access is pointless.”

Beyond direct challenges to fair use, archives are threatened by spurious takedown demands, content removal requests, and legal challenges. Organizations like the Internet Archive have fought back, but many institutions simply cannot afford to, leading to a chilling effect where preservation efforts are scaled back or abandoned altogether.

Compounding this uncertainty is the growing use of technological protection measures (TPMs) and digital rights management (DRM) systems that restrict access to digital works. Under the Digital Millennium Copyright Act (DMCA), circumventing these restrictions is illegal—even for lawful purposes like preservation or research. This creates a paradox where a researcher or archivist may have a clear fair use justification for accessing and copying a work, but breaking an encryption lock to do so could expose them to legal liability.

Additionally, the rise of contractual overrides—such as restrictive licensing agreements on digital platforms—threatens to sideline fair use entirely. Many modern works, including e-books, streaming media, and even scholarly databases, are governed by terms of service that explicitly prohibit copying or analysis, even for noncommercial research. These contracts often supersede fair use rights, leaving archivists and researchers with no legal recourse.

Still, there are reasons for optimism. Courts have generally ruled favorably when fair use is invoked for transformative purposes, such as digitization for research, searchability, and access for disabled users. Landmark decisions, like those in Authors Guild v. Google and Authors Guild v. HathiTrust, upheld fair use in the context of large-scale digital libraries and text-mining projects. These cases suggest that courts recognize the essential role fair use plays in making knowledge accessible, particularly in an era of vast digital information.

Fair Use and the Freedom to Extract 

One of copyright’s other traditional First Amendment protections is that the copyright monopoly does not extend to facts or ideas. Fair use is critical in giving life to this protection by ensuring that facts and ideas remain accessible, providing a “freedom to extract” (a term I borrow from law professor Molly Van Houweling’s recent scholarship) even when they are embedded within copyrighted works. 

Copyright does not and cannot grant exclusive control over facts, but in practice, extracting those facts often requires using the work in ways that implicate the rightsholder’s copyright. Whether for journalists referencing past reporting, historians identifying truths in archival materials, or researchers analyzing a vast corpus of written works, fair use provides the necessary legal space to operate without running afoul of copyright protections for rightsholders.

The need is more urgent than ever given the sheer scale of the modern historical record. In many cases, relying on individual researchers to sift through the record and extract important facts is impractical, if not impossible. Automated tools and processes, including AI and text data mining tools, are now indispensable for processing, retrieving, and analyzing facts from massive amounts of text, images, and audio. From uncovering patterns in historical archives to verifying political statements against prior records, these tools serve as extensions of human analysis, making the extraction of factual information possible at an unprecedented scale. However, these technologies depend on fair use. If every instance of text or data mining required explicit permission from rights holders, who may have economic or political incentives to deny access, the ability to conduct meaningful research and discovery would be crippled.

For example, consider a researcher studying the roots of the opioid crisis, trying to mine the 4 million documents in the Opioid Industry Documents Archive—many of them legal materials, internal company communications, and regulatory filings. These documents, made public through litigation, provide critical insights into how pharmaceutical companies marketed opioids, downplayed their risks, and shaped public policy. But making sense of such a massive trove of records is impossible without computational tools that can analyze trends, track key players, and surface hidden patterns. 

Without fair use, researchers could face legal roadblocks to applying text and data mining techniques to extract the facts buried within these documents. If copyright law were used to restrict or complicate access to these records, it would not only hamper academic research but also shield corporate and governmental actors from exposure and accountability.

Conclusion

As information continues to proliferate across digital media, fair use remains one of the few safeguards ensuring that historical records and cultural artifacts do not become permanently locked away behind copyright barriers. It allows the past to be examined, challenged, and understood. If we allow excessive copyright restrictions to limit the ability to extract and analyze our shared past and culture, we risk not only stifling innovation but also eroding our collective ability to engage with history and truth.

Fair Use Week

This is my contribution to Fair Use Week. To read the other excellent posts from this week, check out Kyle Courtney’s Harvard Library Fair Use Week blog here.

The Public Interest Corpus: An Update and Opportunities for Co-Development 

Posted February 24, 2025
A Library salute to National Photography Month and the photographer’s skill for staging eye-catching compositions

In December 2024, we announced a new project to develop a public interest AI training corpus focused on books. Over the last few months, we’ve been actively engaging a diverse set of stakeholders in the development of The Public Interest Corpus.

The Public Interest Corpus is focused on developing large-scale, high-quality AI training data from the world’s memory organizations that serve the public interest. In the aggregate, memory organizations like libraries and archives are in a prime position to address this need given a multi-century focus on developing high-quality, locally and globally comprehensive collections of books, newspapers, scholarly journals, photographs, manuscript materials, and more. We seek to prioritize uses of The Public Interest Corpus that promote learning, access to knowledge, and broad benefits to the public. 

Project Team and Advisory Board

The project team consists of Dave Hansen, Executive Director of Authors Alliance, and Dan Cohen, Vice Provost for Information Collaboration, Dean of the Library, and Professor of History at Northeastern University. In January, I joined the team as the Public Interest AI Strategist. In this capacity I will leverage extensive experience developing community around responsible computational use of memory organization collections as data and responsible AI. Giulia Taurino recently joined the team as Project Coordinator. Giulia holds a doctoral degree in Media Studies and Visual Arts from the University of Bologna and the University of Montreal and is currently a member of the NULab for Digital Humanities and Computational Social Science and of the AI & Arts interest group at The Alan Turing Institute.

The project team is guided by a strong advisory board composed of senior leaders and experts who think deeply about how authors, libraries, and AI can better serve the public interest. 

  • David Bamman, Associate Professor, UC Berkeley School of Information
  • Sandra Aya Enimil, Director of Scholarly Communications and Collection Strategy, Yale University Library
  • Mike Furlough, Executive Director, HathiTrust
  • David Smith, Associate Professor, Khoury College of Computer Sciences, Northeastern University
  • Claire Stewart, Dean of Libraries and University Librarian, University of Illinois, Urbana-Champaign 
  • Mehtab Khan, Assistant Professor of Law at Cleveland State University College of Law
  • Rachael Samberg, Director, Scholarly Communications and Information Policy, UC Berkeley Library
  • Robin Sloan, NY Times best-selling science fiction author
  • Günter Waibel, Associate Vice Provost & Executive Director, California Digital Library
  • Martha Whitehead, Vice President for the Harvard Library and University Librarian, Harvard University
  • John Wilkin, CEO, LYRASIS
  • Suzanne Wones, University Librarian, UC Berkeley Library
  • Ted Underwood, Professor of Information Science and English, University of Illinois, Urbana-Champaign

How you can get involved 

Over the next year the project team will engage a diverse set of stakeholders in a co-development process that directly informs The Public Interest Corpus priorities, strategies, and partnerships. To kick things off we are holding a working event at Northeastern University Library in Boston, Massachusetts on March 3 where a group of senior library administrators, publishers, disciplinary researchers, authors, and technical experts will workshop core legal, technical, business model, and governance challenges. 

Moving forward we intend to hold additional focused in-person and virtual working events with a broad range of communities. We strongly believe that engaging with diverse stakeholders in a co-development process for this effort will be key to success. If you are interested in participating in a future event, hosting a Public Interest Corpus event, or have other ideas for how we might collaborate please let us know via the following form.

We look forward to advancing a public interest solution with you all.

Independent Publisher’s Lawsuit Against Audible Fails, Highlighting Challenges to Receive Fair Streaming Compensation

Posted February 21, 2025
Adobe Stock Image

Last November, we covered a case where a group of authors complained about McGraw Hill’s interpretation of publishing agreements related to compensation for ebooks. As subscription-based models become increasingly dominant in the publishing industry, authors must be vigilant about how their contracts define compensation. Platforms like Kindle Unlimited, Audible, and academic ebook services are reshaping traditional royalty structures. This is not just a concern for trade books; academic publishing is also shifting towards subscription-based access, as evidenced by ProQuest’s recent announcement that it is ending print sales and moving toward a “Netflix for books” model. 

Here we see yet another case where ambiguous contractual terms resulted in financial loss for an author— 

On Feb. 19th, the Second Circuit affirmed the lower court’s dismissal of Teri Woods Publishing’s copyright infringement and breach-of-contract claims against Audible and other audiobook distributors in Teri Woods Publ’g, LLC v. Amazon.com, Inc. The Plaintiff initially granted the rights that are the subject of this dispute to Urban Audios in a licensing agreement. Thereafter, Urban Audios granted the rights under that agreement to Blackstone, which then sublicensed its rights to Amazon and Audible.

The Plaintiff in this case, Teri Woods Publishing, is an independent publisher founded by urban fiction author Teri Woods. The Plaintiff argued—and the courts ultimately disagreed—that the licensing agreement did not unambiguously permit Defendants to distribute Teri Woods’ audiobooks through the Defendants’ online audiobook streaming subscription services. More specifically, on the question of compensation for online streaming, Plaintiff and Defendants disagreed on whether (1) online streaming counted as “internet downloads” or alternatively “other contrivances, appliances, mediums and means,” and (2) the licensing terms dealing with royalties prohibit subscription streaming.

The licensing terms in question are contained in the licensing agreement Plaintiff entered into in 2018, granting Urban Audios the 

“exclusive unabridged audio publishing rights, to manufacture, market, sell and distribute copies throughout the World, and in all markets, copies of unabridged readings of the [Licensed Works] on cassette, CD, MP3-CD, pre-loaded devices, as Internet downloads and on, and in, other contrivances, appliances, mediums and means (now known and hereafter developed) which are capable of emitting sounds derived for the recording of audiobooks.”

In exchange for this grant of rights, Urban Audios, as the Licensee, must pay Plaintiff:

“(a) Ten percent (10%) of Licensee’s net receipts from catalog, wholesale and other retail sales and rentals of the audio recordings of said literary work; 

(b) Twenty Five percent (25%) of net receipts on all internet downloads of said literary work. 

(c) Twenty Five percent (25%) of net receipts on Playaway format [under certain conditions].”

In case you are not familiar with the services Amazon Audible provides: members of Audible generally pay a monthly fee to digitally stream or download audiobooks, instead of making any specific payment for the specific audiobooks they are streaming or downloading. This method of distribution, the Plaintiff argues, led to drastically lower compensation than expected, as the audiobooks were made available to subscribers at a fraction of their retail price. 

Audible has a history of relying on ambiguous contractual terms to reduce author payouts. The “Audiblegate” controversy, for instance, exposed how Audible’s return policy allowed listeners to return audiobooks after extensive use, deducting royalties from authors without transparency. That practice came under legal scrutiny in Golden Unicorn Enters. v. Audible Inc., where authors alleged that Audible deliberately structured its payment model to significantly reduce their earnings (unfortunately, the court in that case also largely sided with Audible).

Despite Audible’s track record, the courts were unsympathetic to Plaintiff’s grievance in the Teri Woods case, and held that the plain meaning of the phrase “other contrivances, appliances, mediums and means (now known and hereafter developed)” in the licensing agreement included digital streams and other future technological developments in distribution services. The courts also observed that the underlying licensing agreement did not provide for the payment of royalties on a per-unit basis; Plaintiff was only entitled to a percentage of “net receipts” received by Urban Audios for sales, rentals, and internet downloads.

The disputes over what constitutes an “internet download” and whether payment was due on a per-unit basis were ultimately resolved in favor of Audible. This case serves to remind us again of the importance of adopting clear contractual language.

Licensing agreements should be drafted with clear and precise language regarding revenue models and payment structures. Subscription-based compensation models, like those employed by Audible, fundamentally differ from traditional sales models, often leading to lower per-unit earnings for authors. By failing to anticipate and address these nuances, authors risk losing control over how their works are monetized. Ensuring that rights, distribution methods, and payment structures are clearly defined can prevent disputes and financial losses down the line.

Many authors assume that digital rights are similar to traditional print rights, but as this case demonstrates, vague phrasing can allow distributors to exploit gaps in understanding. If authors do not explicitly outline limitations on emerging distribution technologies, they may find themselves receiving significantly less compensation than they anticipated when signing the agreement. For example, authors should ensure their contracts specify whether subscription-based revenue falls under traditional royalty calculations, and whether distribution via new technological formats requires renegotiation.

Beyond the issues with ambiguous contractual terms, this case also highlights the broader issue of how digital platforms can negatively impact readers and authors alike. Readers no longer own the books they purchase; instead, they receive licensed access that can be revoked or restricted at any time. This shift undermines the traditional relationship between books and their readers. Authors are equally threatened by these digital intermediaries, who have the power to dictate distribution methods and unilaterally alter revenue models; an author’s right to fair compensation is too often sacrificed along the way. The situation is especially dire with audiobooks, where Audible dominates the market.