Category Archives: News

Athena Unbound and Untangling the Law of Open Access

Posted May 26, 2023

A few months ago, Authors Alliance and the Internet Archive co-hosted an engaging book talk featuring historian Peter Baldwin and librarian Chris Bourg. They discussed Baldwin’s new book, Athena Unbound: Why and How Scholarly Knowledge Should be Free For All. You can watch the recording of the talk here and access the book for free in open access format here.

Today, I’m beginning a series of posts aimed at clarifying legal issues in open access scholarship. Reflecting on some key takeaways from Athena Unbound seemed like a great place to start.

If you are already well versed in the open access community, you know that there is an abundance of literature covering the theory, economics, and sociological dimensions of OA. But it’s easy to lose the forest for the trees. Athena Unbound stands out by providing a comprehensive, high-level explanation of how we have reached the current state of open access affairs. The book offers much more than just commentary on the underlying legal structures that impact access to scholarly works. But as we delve deeper into the legal aspects of open access in this series, I want to highlight three key takeaways on this issue:

  1. Copyright law does not cater to most academic authors.

“Open access does not seek to dispossess authors of their property nor to stint them of their rightful earnings. But authors are not all alike. Those whose creativity supplies their livelihood are entitled to the fruits of their labor. But most authors either do not make a living from their work or are already supported in other ways.” – Athena Unbound, Chapter 2, “The Variety of Authors and Their Content”

In theory, copyright law in the United States is designed to incentivize the creation of new works by granting strong and long-lasting economic rights. This framework assumes authors primarily function as independent operators (Baldwin likens them to “bohemian artistes”) who can negotiate these rights with publishers or directly with members of the public in exchange for financial support.

However, this framework does not align with the reality faced by most academic authors, who number in the millions. While scholarly authors deserve compensation for their work, their remuneration often comes from sources like university employment. Their motivation to create stems from incentives to share ideas and discoveries with the world, as well as personal gains such as recognition and career advancement. For these authors, the publishing system and the laws that govern it have clashed with their interests to such an extent that we now witness academic authors willingly paying thousands of dollars to persuade publishers to distribute their articles for free.

If anything, copyright law, with its excessively long duration, extensive economic control, and limited freedom for researchers to engage with creative works, hampers those authors’ goals in practice. As Baldwin explains, “the fundamental problem open access faces is worth restating. Copyright has become bloated, prey to the rent-seeking academic publishing industry… Legislators, dazzled into submission by the publishing industry’s success in portraying itself as the defender of creativity and cultural patrimony, bear much responsibility.”

As we explore the legal mechanisms that influence open access, it is crucial to remember that the default rules of the system are more often than not at odds with the goals of open access authors. 

  2. Open access must encompass more than contemporary scientific articles.

While much of the current open access discourse revolves around providing access to the latest scholarly research, particularly scientific articles, there is a vast amount of past scholarship that remains inaccessible. An inclusive approach to open access should address how to provide access to these works as well. The majority of research library holdings are not available online in any form. Baldwin uses the term “grey literature” to describe the extensive collections in research libraries that are no longer commercially available. As he points out, most books lose commercial viability rather quickly. “Of the 10,000 US books published in 1930, only 174 were still in print in 2001. Of the 63 books that won Australia’s Miles Franklin prize over the past half-century, ten are unavailable in any format.”

Many of these works have become so-called orphan works: they are so detached from the commercial marketplace that their publishers have gone out of business, authors have passed away, and any remaining rights holders who would benefit from potential sales are obscure, if they exist at all. Even Maria Pallante, former Register of Copyrights and current AAP president, agrees that in the case of true orphan works, “it does not further the objectives of the copyright system to deny use of the work, sometimes for decades. In other words, it is not good policy to protect a copyright when there is no evidence of a copyright owner.”

In addition to this issue around orphan works, a subset of what is known as the “20th Century black hole,” Athena Unbound also sheds light on the various concerns and challenges that act as barriers to open access in scholarly fields outside of the sciences. While the goals of open access may be the same across these different areas, the implementation can vary significantly. In the case of certain scholarly works, such as older books entangled in complex rights issues, we may need to settle for an imperfect form of “open,” such as read-only viewing via controlled digital lending—a far cry from what many consider true open access.

  3. The intricacies of ownership are significant.

Although this is not the primary focus of Athena Unbound, it is an important aspect that deserves attention. In simple terms, the legal pathway to open access appears straightforward: authors, often depicted as individual, independent actors, must retain sufficient rights to allow them to legally share and allow reuse of their writing.

However, reality is far more complex. Multiple-authored works, including in extreme cases thousands of joint authors on one scientific article, can complicate our understanding of who actually holds a copyright interest in a work and can therefore authorize an open license on it. 

Moreover, many if not most academic authors are employed by colleges or universities, each with its own perspective on copyright ownership of scholarly publications. In most cases, as Baldwin explains, universities have been hesitant to assert ownership of scholarly publications under the work-for-hire doctrine (a topic I will cover in a subsequent post), possibly based on the increasingly tenuous “teacher exception” to the work-for-hire doctrine. However, this approach is not universally adopted. For instance, some universities assert ownership of specific categories of scholarly work, such as articles produced under grant-funded projects. Others reserve broad licenses to use scholarly work for university purposes, albeit with ill-defined parameters.

Open access, or at least the type we commonly think of—copyrighted articles typically licensed under Creative Commons or similar licenses—depends heavily on obtaining affirmative permission from the rightsholder. But the identity of the rightsholder, whether it be the university, author, or even the funder, can vary significantly due to a wide range of factors, including state laws, university IP policies, and funder grant contracts. 

Stay tuned for more in this series, and if you have questions in the meantime, check out our open access guide and resource page.

Supreme Court Issues Decisions in Warhol Foundation and Gonzalez

Posted May 19, 2023

Yesterday, the Supreme Court released two important decisions in Warhol Foundation v. Goldsmith and Gonzalez v. Google—cases that Authors Alliance has been deeply invested in, submitting amicus briefs to the Court in both cases. 

Warhol Foundation v. Goldsmith and Transformativeness

First, the Court issued its long-awaited opinion in Warhol Foundation v. Goldsmith, a case Authors Alliance has been following for years, and for which we submitted an amicus brief last summer. The case concerned a series of screen prints of the late musical artist Prince created by Andy Warhol, and asked whether the creation and licensing of one of the images, an orange screen print inspired by Goldsmith’s black and white photograph (which the Court calls “Orange Prince”), constituted fair use. After the Southern District of New York found for the Warhol Foundation on fair use grounds, the Second Circuit overturned the ruling, finding that the Warhol Foundation’s use constituted infringement. The sole question before the Supreme Court was whether the first factor in fair use analysis, the purpose and character of the use, favored a finding of fair use. 

To our disappointment, the Supreme Court’s majority agreed with the holding of the Second Circuit, finding that the purpose and character of Warhol’s use favored Goldsmith, such that it did not support a finding of fair use. This being said, the decision focused narrowly on the Warhol Foundation’s “commercial licensing of Orange Prince to Condé Nast,” expressing “no opinion as to the creation, display, or sale of any of the original Prince Series works.” Because the Court cabins its opinion, focusing specifically on the licensing of Orange Prince to Condé Nast rather than the creation of the entire Prince series, the decision is less likely to have a deleterious effect on the fair use doctrine generally than a broader decision would have.

Writing for the majority, Justice Sotomayor argued that Goldsmith’s photograph and the Prince screen print in question shared the same purpose, “portraits of Prince used to depict Prince in magazine stories about Prince.” Moreover, the Court found the use to be commercial, given that the screen print was licensed to Condé Nast. Justice Sotomayor explained that “if an original work and secondary use share the same or highly similar purposes, and the secondary use is commercial, the first fair use factor is likely to weigh against fair use, absent some other justification for copying.” Justice Sotomayor found that the two works shared the same commercial purpose, and therefore concluded that factor one favored Goldsmith. 

Justice Kagan, joined by Chief Justice Roberts, issued a strongly worded dissenting opinion. The dissent admonished the majority for its departure from Campbell’s “new meaning or message test,” an inquiry that Authors Alliance advocated for in our amicus brief. Justice Kagan further criticized the majority’s shifting focus towards commerciality, arguing that the fact that the use was a licensing transaction should not be given so much importance in the analysis. While Authors Alliance agrees with these points, we are less sure that the majority’s decision goes so far as to “constrain creative expression” or “threaten[] the creative process.” And while it’s uncertain what effect this case will have on the fair use doctrine more generally, one important takeaway is that the question of whether the use is commercial in nature (a consideration under the first factor) has been elevated to one of greater importance.

While we thought this case offered a good opportunity for the Court to affirm a more nuanced approach to transformative use, we much prefer the Supreme Court’s approach to the Second Circuit’s decision, and applaud the Court for confining its ruling to the narrow question at issue. The holding does not, in our view, radically alter the doctrine of fair use or disrupt the bulk of established case law. Moreover, some aspects of arguments we made in our brief (such as the notion that transformativeness is a matter of degree, not a binary) are present in the Court’s decision. This is a good thing, in our view, as it will allow for more nuanced consideration of a use’s character and purpose, and stands in contrast to the Second Circuit’s all-or-nothing view of transformativeness.

Gonzalez v. Google and the Missing Section 230

Also yesterday, the Court released its opinion in Gonzalez v. Google, a case that generated much attention because of its potential threat to Section 230, and another case in which Authors Alliance submitted an amicus brief. The case asked whether Google could be held liable under an anti-terrorism statute for harm caused by ISIS recruitment videos that YouTube’s algorithm recommended. In its per curiam decision (a unanimous one without a named Justice as author), the Court stated that Gonzalez’s complaint had failed to state a viable claim under the relevant anti-terrorism statute. Therefore, it did not reach the question of the applicability of Section 230 to the recommendations at issue. In other words, a case that generated tremendous concern about the Court disturbing Section 230 and harming internet creators, communities, and services that relied on it ended up saying nothing at all about the statute. 

Authors Alliance Welcomes Christian Howard-Sukhil as Text Data Mining Legal Fellow

Posted May 12, 2023

As we mentioned in our blog post on our Text Data Mining: Demonstrating Fair Use project a few weeks back, Authors Alliance is pleased to have Christian Howard-Sukhil on board as our brand new Text Data Mining legal fellow. As part of our project, generously funded by the Andrew W. Mellon Foundation, we established this new fellowship to provide research and writing support for our project. Christian will help us produce guidance for researchers and a report on the usability, successes, and challenges of the text data mining exemption to Section 1201’s prohibition on bypassing technical protection measures that Authors Alliance obtained in 2021. Christian begins her work with Authors Alliance this week, and we are thrilled to have her. 

Christian holds a PhD in English Language and Literature from the University of Virginia, and has just completed her second year of law school at UC Berkeley. Christian has extensive digital humanities and text data mining experience, including in previous roles at UVA and Bucknell University. Her work with Authors Alliance will focus on researching and writing about the ways that current law helps or hinders text and data mining researchers in the real world. She will also contribute to our blog—look out for posts from her later this year.

About her new role at Authors Alliance, Christian says, “I am delighted to join Authors Alliance and to help support text and data mining researchers navigate the various legal hurdles that they face. As a former academic and TDM researcher myself, I saw first-hand how our complicated legal structure can deter valid and generative forms of TDM research. In fact, these legal issues are, in part, what inspired me to attend law school. So being able to work with Authors Alliance on such an important project—and one so closely tied to my own background and interests—is as exciting as it is rewarding.”

Please join us in welcoming Christian!

An Update on our Text and Data Mining: Demonstrating Fair Use Project

Posted April 28, 2023

Back in December we announced a new Authors Alliance project, Text and Data Mining: Demonstrating Fair Use, which is about lowering and overcoming legal barriers for researchers who seek to exercise their fair use rights, specifically within the context of text data mining (“TDM”) research under current regulatory exemptions. We’ve heard from lots of you about the need for support in navigating the law in this area. This post gives a few updates.

Text and Data Mining Workshops and Consultations

We’ve had a tremendous amount of interest and engagement with our offers to hold hands-on workshops and trainings on the scope of legal rights for TDM research. Already this spring, we’ve been able to hold two workshops in the Research Triangle hosted at Duke University, and a third workshop at Stanford followed by a lively lunch-time discussion. We have several more coming. Our next stop is in a few weeks at the University of Michigan, and we have plans in the works for workshops in the Boston area, New York, a few locations on the West Coast, and potentially others as well. If you are interested in attending or hosting a workshop with TDM researchers, librarians, or other research support staff, please let us know! We’d love to hear from you. The feedback so far has been really encouraging, and we have heard both from current TDM researchers and from those whose eyes the workshops have opened to new possibilities.

ACH Webinar: Overcoming Legal Barriers to Text and Data Mining
Join us! In addition to the hands-on in-person workshops on university campuses, we’re also offering online webinars on overcoming legal barriers to text and data mining. Our first is hosted by the Association for Computers and the Humanities on May 15 at 10am PT / 1pm ET. All are welcome to attend, and we’d love to see you online!
Read more and register here. 

Research 

A second aspect of our project is to research how the current law can both help and hinder TDM researchers, with specific attention to fair use and the DMCA exemption that Authors Alliance obtained for TDM researchers to break digital locks when building a corpus of digital content such as ebooks or DVDs.

Christian Howard-Sukhil, Authors Alliance Text and Data Mining Legal Fellow

To that end, we’re excited to announce that Christian Howard-Sukhil will be joining Authors Alliance as our Text and Data Mining Legal Fellow. Christian holds a PhD in English Language and Literature from the University of Virginia and is currently pursuing a JD from the UC Berkeley School of Law. Christian has extensive digital humanities and text data mining experience, including in previous roles at UVA and Bucknell University. Her work with Authors Alliance will focus on researching and writing about the ways that current law helps or hinders text and data mining researchers in the real world. 

The research portion of this project is focused on the practical implications of the law and will be based heavily on feedback we hear from TDM researchers. We’ve already had the opportunity to gather some feedback from researchers including through the workshops mentioned above, and plan to do more systematic outreach over the coming months. Again, if you’re working in this field (or want to but can’t because of concerns about legal issues), we’d love to hear from you. 

At this stage we want to share some preliminary observations, based on recent research into these issues (supported by the work of several teams of student clinicians) as well as our recent and ongoing work with TDM researchers:

1) License restrictions are a problem. We’ve heard clearly that licenses and terms of use impose a significant barrier to TDM research. While researchers are able to identify uses that would qualify as fair use and also many uses that likely qualify under the DMCA exemption, terms of use accompanying ebook licenses can override both. These terms vary, from very specific prohibitions–e.g., Amazon’s, which says that users “may not attempt to bypass, modify, defeat, or otherwise circumvent any digital rights management system”–to more general prohibitions on uses that go beyond the specific permissions of the license–e.g., Apple’s terms, which state that “No portion of the Content or Services may be transferred or reproduced in any form or by any means, except as expressly permitted.” Even academic licenses, often negotiated by university libraries to have more favorable terms, can still impose significant restrictions on reuse for TDM purposes. Although we haven’t heard of aggressive enforcement of those terms to restrict academic uses, even the mere existence of those terms can have chilling and negative real-world impacts on research using TDM techniques.

The problem of licenses overriding researchers’ rights under fair use and other parts of copyright law is of course not limited to inhibiting text and data mining research. We wrote about the issue, and how easy it is to evade fair use, a few months ago, discussing the many ways that restrictive license terms can inhibit normal, everyday uses of works such as criticism, commentary, and quotation. We are currently working on a separate paper documenting the scope and extent of “contractual override,” and will be part of a symposium on the subject in May, hosted by the Association of Research Libraries and the American University, Washington College of Law Program on Information Justice and Intellectual Property.

2) The TDM exemption is flexible, but local interpretation and support can vary. We’ve heard that the current TDM exemption–allowing researchers to break technological protection measures such as DRM on ebooks and CSS on DVDs–is an important tool to facilitate research on modern digital works. And we believe the terms of that exemption are sufficiently flexible to meet the needs of a variety of research applications (how wide a variety remains to be seen through more research). But local understanding and support for researchers using the exemption can vary. 

For example, the exemption requires that the university that the TDM research is associated with implement “effective security measures” to ensure that the corpus of copyrighted works isn’t used for another purpose. The regulation further explains that in the absence of a standard negotiated with content holders, “effective security measures” means “measures that the institution uses to keep its own highly confidential information secure.” University IT data security standards don’t always use the same language or define their standard to cover “highly confidential information,” and so university IT offices must interpret this language and implement the standard in their own local context. This can create confusion about what precisely universities need to do to secure the TDM corpora.

Some of these definitional issues are likely growing pains: the exemption is still new, and universities need time to understand and implement standards to satisfy its terms in a reasonable way. It will be important to explore further where there is confusion on similar terms and how that might best be resolved.

3) Collaboration and sharing are important. Text and data mining projects are often conceived of as part of a much larger research agenda, with multiple potential research outputs both from the initial inquiry and follow-up studies with a number of researchers, sometimes from a number of institutions. Fair use clearly allows for collaborative TDM work–e.g., in Authors Guild v. HathiTrust, a foundational fair use case for TDM research in the US, we observe that the entire structure of HathiTrust is a collective of a number of research institutions with shared digital assets. And likewise, the TDM exemption permits a university to provide access to “researchers affiliated with other institutions of higher education solely for purposes of collaboration or replication of the research.” The collaborative aspect of this work raises some challenging questions, both operationally and conceptually. For example, the exemption for breaking digital locks doesn’t define precisely who qualifies as a researcher who is “affiliated,” leaving open questions for universities implementing the regulation. More conceptually, the issue of research collaboration raises questions about how precisely the TDM purpose must be defined when building a corpus under the existing exemption, for example when researchers collaborate but investigate different research questions over time. Finally, the issue of actually sharing copies of the corpus with researchers at other institutions is important because, at least in some cases, local computing power is needed to effectively engage with the data.

Again, just preliminary research, but some interesting and important questions! If you are working in this area in any capacity, we’d love to talk. The easiest way to reach us is at info@authorsalliance.org.

Want to Learn More?
This current Authors Alliance project is generously supported by the Mellon Foundation, which has also supported a number of other important text and data mining projects. We’ve been fortunate to be part of a broader network of individuals and organizations devoted to lowering legal barriers for TDM researchers. This includes efforts spearheaded by a team at UC Berkeley to produce the “Legal Literacies for Text Data Mining” and its current project to address cross-border TDM research, as well as efforts from the Global Network on Copyright and User Rights, which has (among other things) led efforts on copyright exceptions for TDM globally.

Authors Alliance Joins Copyright Office Listening Session On Copyright in AI-Generated Literary Works

Posted April 20, 2023

Yesterday, I represented Authors Alliance in a Copyright Office listening session on copyright issues in AI-generated literary works, the first of two such sessions that the Office convened yesterday afternoon. I was pleased to be invited to share our views with the Office and participate in a rousing discussion among nine other stakeholders, representing a diverse group of industries and positions. Generative AI raises challenging legal questions, particularly for its skeptics, but it also presents some incredible opportunities for authors and other creators.

During the listening session, I emphasized the potential for generative AI programs (like OpenAI’s ChatGPT, Microsoft’s Bing AI, Jasper, and others) to support authorship in a number of different ways. For instance, generative AI programs support authors by increasing the efficiency of some of the practical aspects of being a working author aside from the writing itself. But more importantly, generative AI programs can actually help authors express themselves and create new works of authorship.

In the first category, generative AI programs can support authors by, for example, helping them create text for pitch letters to send to agents and editors, produce copy for their professional websites, and develop marketing strategies for their books. Making these activities more efficient frees up time for authors to focus on their writing, particularly for authors whose writing time is limited by other commitments. 

In the second category, generative AI has tremendous potential to help authors come up with new ideas for stories, develop characters, summarize their writings, and perform early stage edits of manuscripts. Moreover, and particularly for academic authors, generative AI can be an effective research tool for authors seeking to learn from a large corpus of texts. Generative AI programs can help authors research by providing short and simple summaries of complex issues, surveys of the landscape of various fields, or even guidance on what human works to turn to in their research. Authors Alliance is committed to protecting authors’ right to conduct research, and we see generative AI tools as a new, innovative, and efficient form of conducting this research. Making research easier helps authors save time, and has a particular benefit for authors with disabilities that make it difficult to travel to multiple libraries or otherwise rely on analog forms of research. 

These programs undoubtedly have the potential to serve as powerful creative tools that support authorship in these ways and more, but, when discussing the copyright implications of the programs and the works they produce, it’s important to remember just how new these technologies are. Because generative AI remains in its infancy, and the costs and benefits for different segments of the creative industry have yet to be seen, it seems to me to be sensible to preserve the development of these tools before crafting legal solutions to problems they might pose in the future. And in fact, in our view, U.S. copyright law already has the tools to deal with many of the legal challenges that these programs might pose. When generative AI outputs look too much like the copyrighted inputs they are trained on, the substantial similarity test can be used to assess claims of copyright infringement and to vindicate an author’s exclusive rights in their works when those outputs do infringe.

In any case, in order for generative AI programs to be effective creative tools, it’s necessary that they are trained on large corpora. Narrowing the corpus of works the programs are trained on, through compulsory licensing or other mechanisms, can have disastrous effects. For example, research has shown that narrow data sets are more likely to produce racial and gender bias in AI outputs. In our view, the “input” step, where the programs are trained on a large corpus of works, is a fair use of these texts. And the holdings in Google Books and HathiTrust indicate that it is consistent with fair use to build large corpora of works, including works that remain protected by copyright, for applications such as computational research and information discovery. Additionally, the Copyright Office has recognized this principle in the context of research and scholarship, as demonstrated by its approval of Authors Alliance’s petition for an exemption from DMCA restrictions for text and data mining.

The question of the copyright status of AI-generated works is an important one. Most if not all of the stakeholders participating in this discussion agreed with the Copyright Office’s recent guidance regarding registration in AI-generated works: under ordinary copyright principles, the lack of human authorship means these texts are not protected by copyright. This being said, we also recognize that there may be challenges in reconciling existing copyright principles with these new types of works and the questions about authorship, creativity, and market competition that they might pose. 

But importantly, while this technology is still in its early stages, it serves the core purposes of copyright—furthering the progress of science and the useful arts by incentivizing new creation—to allow these systems to develop and confront new legal challenges as they emerge. Copyright is not only about protecting the exclusive rights of copyright holders (a concern that underlies many arguments against generative AI as a fair use), but incentivizing creativity for the public benefit. The new forms of creation made possible through generative AI can incentivize people who would not otherwise create expressive works to do so, bringing more people into creative industries and adding new creative expression to the world to the benefit of the public.

The listening sessions were recorded, and will be available on the Copyright Office website in the coming weeks. And these listening sessions are only the beginning of the Office’s investigation of copyright in AI-generated works. Other listening sessions on visual works, music, and audiovisual works will be held in the coming weeks, and the Office has indicated that there will be an opportunity for written public comments in order for stakeholders to weigh in further. We are committed to remaining involved in these cutting-edge issues, through written comments and otherwise, and we will keep our readers informed as policy around generative AI continues to evolve.

Book Talk: Digital Copyright with Jessica Litman

Authors Alliance is pleased to announce the next in our joint book talk series with the Internet Archive. Join us as we host Internet Archive’s founder BREWSTER KAHLE in conversation with JESSICA LITMAN to talk about her book, Digital Copyright.

In Digital Copyright (read now), law professor Jessica Litman questions whether copyright laws crafted by lawyers and their lobbyists really make sense for the vast majority of us. Should every interaction between ordinary consumers and copyright-protected works be restricted by law? Is it practical to enforce such laws, or expect consumers to obey them? What are the effects of such laws on the exchange of information in a free society?

REGISTER NOW

Read Digital Copyright now.

PROFESSOR JESSICA LITMAN, the John F. Nickoll Professor of Law, is the author of Digital Copyright and the co-author, with Jane Ginsburg and Mary Lou Kevlin, of the casebook Trademarks and Unfair Competition Law: Cases and Materials.

BREWSTER KAHLE, founder and digital librarian of the Internet Archive, has been working to provide universal access to all knowledge for more than 25 years.

Book Talk: Digital Copyright
April 20, 2023 @ 10am PT / 1pm ET
Register now for the free, virtual discussion

Authors Alliance Submits Comment to Copyright Office Regarding Ex Parte Communications

Posted April 4, 2023
Photo by erica steeves on Unsplash

Yesterday, Authors Alliance submitted a comment to the U.S. Copyright Office in response to a notice of proposed rulemaking asking for feedback from the public on new rules to govern ex parte communications. “Ex parte communications” refer to communications outside the normal, permitted channels of communication—in this case, to communications between organizations or members of the public and Copyright Office staff outside of hearings or other formal proceedings. Ex parte communications with the Copyright Office are important because they allow stakeholders and the Office to work out open questions in rulemakings or other proceedings outside of the formal channels. Authors Alliance relied on our ability to make ex parte communications during the last Section 1201 rulemaking cycle (where we obtained our text data mining exemption) in order to clarify certain issues. Now, the Office is proposing to establish formal rules for how these communications can be made, as well as transparency around them. We support this proposal, and shared our thoughts in a comment. You can read our full comment here.

Judge Rules Against Internet Archive on Controlled Digital Lending

Posted March 28, 2023
Photo by Wesley Tingey on Unsplash

On Friday, Southern District of New York Judge John Koeltl issued a much-anticipated decision in Hachette Book Group v. Internet Archive. Unfortunately, as many of our members and allies are aware, the judge ruled against the Internet Archive, finding that its CDL program was not protected by the doctrine of fair use and granting the publishers’ motion for summary judgment. You can read the 47-page decision for yourself here.

In his fair use analysis, Judge Koeltl found that each of the four fair use factors weighed in favor of the publishers, emphasizing above all else his view that IA’s controlled digital lending program was not transformative, an important consideration under the first fair use factor, which considers the purpose and character of the use. This inquiry also involves asking whether the use in question was commercial. To the surprise of many, the decision stated that IA’s use of the publishers’ works was commercial, because the Open Library is part of the IA’s website, which it uses “to attract new members, solicit donations, and bolster its standing in the library community.” The judge found this to be the case in spite of the fact that IA “does not make a monetary profit” from CDL. In other words, the judge held that the indirect, attenuated benefits the Internet Archive (which is, after all, a nonprofit) reaps from operating the Open Library make its CDL program commercial.

Judge Koeltl gave less attention to the fourth factor in the fair use analysis, “the effect of the use on the potential market for the work,” which is often held up to be of significant importance. One consideration under this factor is whether the use creates a competing substitute with the original work. Unfortunately, on this point too, the court—in our view—missed the mark. This is because the decision does not draw a distinction between CDL scans and ebooks, going so far as to call CDL scans “ebooks” throughout. As we explained in our summary of the proceedings last week, many features of both CDL and ebooks make them both functionally and aesthetically distinct from one another. By glossing over these differences, the judge reached the conclusion that CDL scans are direct substitutes for licensed ebooks.

Authors Alliance is deeply concerned about the ramifications of this decision, which was exceedingly broad in scope, striking a tremendous blow to the CDL model, rather than only IA’s implementation of it. Local libraries across the country practice CDL, and library patrons and authors alike depend on it to read, research, and participate in academic discourse. 

As it stands, this decision only applies to Internet Archive and is only about the 127 books on which the publishers based their lawsuit. It does not set a binding precedent for any other library, but if left in place (or worse, if affirmed on appeal), it could cause libraries to avoid digitizing and lending books under a CDL model, which in our view would not serve the interests of many authors. This decision makes it harder for those authors to reach wide audiences: CDL enables many authors to reach more readers than they could otherwise, and authors like our members who write to be read would not be served if fewer readers could access their books. 

The decision also hampers efforts to preserve books—aside from IA’s scanning program, there are few if any centralized efforts to preserve books in digital format once their commercial life is over. Without CDL, those books could quite literally disappear, and the knowledge they advance could be lost. IA’s scanning operations do preserve such books, which is one reason we have strongly supported them in this lawsuit. By the same token, if this decision stands, it will also limit authors’ ability to conduct efficient research online. The CDL survey we launched last year revealed that CDL is an effective research tool for authors who need to consult other books as part of their writing process, and in many cases it enables them to access far more works than they could at their local library alone. Authors who rely on CDL in this way would be harmed by this decision, as they could well be forced to undergo a more time-consuming research process, detracting from time that could be spent writing. 

The Internet Archive has already indicated that it will be appealing Judge Koeltl’s ruling, and we look forward to supporting those efforts. We will continue to keep our readers and members apprised of updates as this case moves forward.

Judge Hears Oral Arguments in Hachette Book Group v. Internet Archive

Posted March 20, 2023
Photo by Timothy L Brock on Unsplash

Earlier today, Judge John Koeltl of the Southern District of New York heard oral arguments in Hachette Book Group v. Internet Archive—a case Authors Alliance has been following since the lawsuit was first filed back in 2020. The case is about—among other things—whether Internet Archive’s controlled digital lending program qualifies as a fair use. Authors Alliance submitted an amicus brief in support of the Internet Archive back in July, arguing that CDL serves the interests of authors who write to be read. IA’s attorney cited to our brief during oral argument, and we are pleased that we were able to magnify the voices of authors who write to be read through its submission. You can learn more about the case and read our brief here.

In the hearing, the judge considered each party’s motion for summary judgment. The parties hotly contested a number of key issues in the case, including whether each side’s experts had properly demonstrated market harm (or lack thereof), what the appropriate market to consider was for purposes of fair use analysis, the commerciality of IA’s use, and what legal cases supported arguments in favor of and against fair use. Judge Koeltl asked the Internet Archive’s attorney a number of probing questions on these points, grappling with the difficult questions in this case. The judge further implied that there may be open issues of fact in this case, which could indicate the need for additional briefings or hearings.

CDL and Commerciality

The parties disagreed on the commerciality of IA’s use when it produces and makes CDL scans available. The publishers’ attorney argued that IA’s CDL operations are “intertwined” with its other functions, such as its ownership of the book vendor Better World Books, and further emphasized the argument that CDL loans result in lost revenue for the publishers—in other words, that the supposed commercial harm to the publishers that results from CDL lending makes the CDL lending itself commercial. The Internet Archive’s attorney answered that IA is a nonprofit organization that does not profit at all from its CDL program. He pointed to the fact that traditional library lending is not commercial in nature and does not provide libraries like IA with commercial benefits.

CDL and Market Effects

The plaintiffs’ attorney began by setting forth plaintiffs’ views on the issue of market harm—the fourth factor in fair use analysis, often cited as one of the most important factors in the inquiry. Plaintiffs discussed what they see as massive financial harm stemming from IA’s CDL program, which they estimated to amount to “millions of dollars in licensing revenues.” Plaintiffs also emphasized that, were CDL “given the green light,” or upheld as a fair use, the plaintiffs would suffer even greater losses. Throughout her argument, plaintiffs’ attorney emphasized that the “basic economic principle and common sense is that you cannot compete with free.” In other words, the publishers argue that the ebook library licensing market could collapse altogether if CDL were allowed to continue. Yet this misses the point that CDL is a longstanding and established practice, which has seen adoption and growth in libraries across the country while the ebook licensing market has continued to thrive.

Judge Koeltl, however, pressed the publishers on whether they had shown evidence of actual market harm, i.e., proof that IA’s CDL program had directly harmed their bottom line. In response, plaintiffs criticized the evidence offered by IA’s experts to show that no such harm had occurred. This is a difficult question because the party asserting a fair use defense typically has the burden of showing that the use has not harmed the market, but it is exceedingly difficult to prove a negative.

The judge also questioned whether CDL actually could represent such a loss: the publishers’ argument rests on the premise that libraries loan out CDL scans in lieu of paying to license ebooks, and were CDL not permitted under the law, IA and other libraries would instead choose to pay licensing fees to lend out ebooks. The judge pointed out that the result might in fact be that libraries would choose not to lend digital copies of works out at all, or would instead lend out physical books, undercutting the lost licensing revenue argument. 

IA’s attorney argued that the publishers had not offered empirical evidence of market harm in this case, focusing on the fact that when a library lends out a CDL scan, it does so in lieu of a physical book, “simulating the limitations of physical books.” This is due to CDL’s “owned to loaned” ratio requirement: a library can only loan out as many CDL scans as it has physical books in its collection, and can only loan each scan out to one patron at a time. When a library lends out a CDL scan, it does so in lieu of loaning the physical book, for which it has already paid. And while the plaintiffs mentioned harm to authors (who are, after all, the people that copyright law is intended to protect) several times during their argument, they did this in a way that linked authors with publishers as parties that are financially invested in a work’s sale—author interests and the finer details of the economics of author income and library lending were absent from the discussion.

The parties also disagreed about which market was the appropriate one to look to when discussing market harm in the context of fair use analysis. The publishers argued, and the judge seemed to assume, that the proper market is the library ebook licensing market. The judge opined that libraries could, instead of using CDL to lend out their books, simply purchase an ebook license. He seemed to view CDL scans and licensed ebooks as one and the same, despite the fact that there are several key differences between these types of loans, both in form and function, as explained in other amicus briefs in the case. Moreover, missing from the argument was the fact that, in many cases, libraries loan out CDL scans because no ebook is available to them: particularly for older books in a publisher’s backlist, or for books that are no longer available commercially, there is in many cases no ebook available, or no ebook available to libraries. Library patrons with print or mobility disabilities in need of digital copies of these kinds of works in order to read them would be greatly harmed if CDL were no longer permitted. 

CDL and Transformativeness

The publishers’ attorney started from the premise that CDL as a use was not transformative, explaining that a licensed ebook and a CDL scan served precisely the same function. In response, IA’s attorney argued that CDL is a transformative use because it “utilizes technology to achieve the transformative purpose of improving efficiency of delivering content without unreasonably encroaching on the rights of the rightsholder.” He further explained that fair uses are favored when they serve the key purpose of copyright: incentivizing new creation for the public benefit without harming the interests of rightsholders. To illustrate these benefits, he cited to Authors Alliance’s amicus brief, in which we explained the myriad ways that CDL benefits authors and can even incentivize the creation of new works.

Adding to its transformativeness argument, IA explained that, when it comes to speculative or actual market harm, such an effect must be balanced against the public benefit that results from the use. And when it comes to CDL, this public benefit is tremendous: numerous amici, as well as Authors Alliance, explained that CDL serves the interests of library patrons, authors, and the public writ large. 

What’s Next?

Now that the judge has heard both sides’ arguments, he will issue a decision in the case. While there is no way of knowing exactly when this will happen, Judge Koeltl is known for issuing decisions fairly quickly, so we may have a decision as soon as later this week. As always, we will keep our members and readers apprised of any developments in this pivotal case as it moves forward.

Copyright Office Issues Opinion Letter on Copyright in AI-Generated Images

Posted March 8, 2023
Photo by Michael Dziedzic on Unsplash

In late February, the Copyright Office issued a letter revoking a copyright registration it had previously granted artist Kristina Kashtanova for a comic that used images generated using Midjourney, a generative AI program that creates images in response to user prompts. While this may seem minor, or simply another data point in the ongoing fight about copyright protection for AI-generated works, the determination is quite significant: it comes at a moment when AI-generated art has captured public attention, and moreover shows the Copyright Office’s thoughts on the important question of whether an artist who relies on a program like Midjourney can obtain copyright protection for an original compilation of AI-generated works. In today’s post, we explain the Copyright Office letter, contextualize it within the growing debate over AI and copyright, and share our thoughts on what all of this might mean for authors who write to be read. 

Copyright and Human Authorship

As technology has advanced to allow the creation of works without the direct involvement of a human, courts have grappled with whether these creations are entitled to copyright protection. In the late 19th century, the Supreme Court established that copyright was intended to protect the products of human labors and creativity, creating the “human authorship” requirement. In an early case on the topic, the Court held that a photograph was copyrightable despite the fact that a camera literally created the image, since photographs were “representatives of original intellectual conceptions of the author.” It cautioned, however, that when it came to creations resulting from processes that were “merely mechanical,” lacking “novelty, invention, or originality” by a human author, such hypothetical works might be beyond the scope of copyright protection.

This principle was tested in the 2010s: in 2011, an Indonesian crested macaque monkey named Naruto seized a photographer’s camera and took hundreds of images of himself. The photographer, David Slater, shared some of these images online, which promptly went viral. Several websites posted these images as well, prompting Slater to assert that he owned the copyright in the images and request their removal. The Wikimedia Foundation, which had uploaded the image to Wikimedia Commons, a repository of public domain and free license content, argued that the image was a part of the public domain due to the lack of a human creator. Several years later, Slater published a book of nature photographs which included Naruto’s selfie. Then, in 2015, the People for the Ethical Treatment of Animals (PETA) filed a lawsuit in the Northern District of California on Naruto’s behalf, asserting that the macaque owned the copyright in the image and requesting damages. The district court judge held that Naruto could not own the copyright in the image due to copyright’s human authorship requirement. However, the judge did indicate that Congress might be free to do away with the human authorship requirement and permit copyright ownership by animals, suggesting that the requirement was not a constitutional one, but indicating that it was beyond the power of the judiciary to decide. The Ninth Circuit Court of Appeals later affirmed the district court’s ruling.

Currently, the Copyright Office is defending a lawsuit in the D.C. district court brought by AI system developer Dr. Stephen Thaler regarding the constitutionality of copyright law’s human authorship requirement. Thaler argues that the Copyright Act does not forbid treating AI systems as “authors” for the purpose of copyright law, and contends that the human authorship principle is unsupported by contemporary case law. While it seems unlikely that Thaler will prevail on this argument, the case will at the very least draw new attention to the human authorship requirement and how it fits into creation in the digital age.

The Creativity Requirement and Zarya of the Dawn

Kashtanova’s assertion of copyright ownership in her comic, Zarya of the Dawn, is in many ways similar to the photographer David Slater’s claim that he owned the copyright in Naruto’s selfie. In each case, the Copyright Office indicated that when a work is not the product of human authorship, a human may not claim copyright in that work (the latest compendium of Copyright Office practices lists “a photograph taken by a monkey” as an example of work that is not entitled to copyright protection since it does not meet the human authorship requirement). 

Kashtanova’s attorney had argued that Midjourney served “merely as an assistive tool,” and that Kashtanova should be considered the work’s author. But the Office likened Midjourney to a “merely mechanical process” lacking “novelty, invention, or originality” by a human creator, quoting the Supreme Court’s warning about the limits of copyright protection in the 19th century case discussed earlier in this post. And it was not only the human authorship requirement that made Zarya of the Dawn beyond the scope of copyright protection, but also copyright’s creativity requirement: for a work to be copyrightable, it must possess at least a “modicum” of creativity, a very low bar that rarely forecloses copyright protection for works of human authorship. 

The Office explained that Midjourney generates images in response to user prompts, “text commands entered in one of Midjourney’s channels.” But these are not “specific instructions” for generating an image; rather, they are input data that Midjourney compares to its training data before generating an image. The Office also argued that these images lack human authorship because the process is “unpredictable” and “not controlled by the user.” In other words, the “creativity” in these images comes not from the human entering prompts, but from the interaction between the prompt and Midjourney’s training data. This makes it different from a tool like a camera over which a user exercises total control: there is little to no unpredictability when we use digital cameras to photograph the world around us, and all creative choices come from the human using the device.

The Office also noted that this opinion was not necessarily the final word on AI-generated images, as “other [generative] AI offerings” might operate differently, such that the creativity and human authorship requirements could be met. Kashtanova argued that minor edits she had made to the images were sufficiently creative to give her copyright ownership in the work as a whole. While the Office disagreed in this specific case (the before and after images demonstrating the editing were nearly identical), it did leave this possibility intact for future cases. Moreover, the Office granted Kashtanova ownership in the comic’s text, which she alone had written, as well as copyright ownership in the compilation of Midjourney-generated images. Compilations of uncopyrightable subject matter can sometimes be protected by copyright, because both the human authorship and creativity requirements are met when a human selects and arranges the material. The copyright owner does not own a copyright in the material itself, but in the original compilation they have created.

What Does this Mean for Authors?

The Copyright Office’s denial of registration in the Midjourney-generated images has important implications for the public domain and authors’ abilities to use new forms of technology as assistive tools in the creation of their works. But the Office’s action also leaves some open questions about the copyright status of images generated by Midjourney and similar systems. One possibility is that, as was asserted by Wikimedia in the case of Naruto’s selfie, these images are a part of the public domain. Were that to be the case, it could be a boon for artists and creators. Recall that once a work is in the public domain, it becomes free for all to use without fear of copyright infringement. The case of the monkey selfie is further instructive here, as the owner of the camera in that case did not prevail on claiming his own copyright in Naruto’s selfie. By the same token, it is unlikely that the creators of Midjourney could claim a copyright in images like those used by Kashtanova, despite their role in creating and making available the “assistive tool.”

If AI systems could be used to generate infinite public domain content—whether through text-based systems like ChatGPT or image-generating systems like Midjourney—this would greatly expand public domain content. The public domain can be a boon for creators, as they are free to do anything they wish with this material. On the other hand, some have expressed fear that, should all AI-produced works be considered a part of the public domain, these public domain works could compete with works produced by human authors. It is also important to remember the practical economic realities of systems like Midjourney. Whether or not the Copyright Office and other policymakers determine that AI-generated content is a part of the public domain, the creators of those systems could employ other means to assert ownership or forbid onward uses of the content created by these systems. Contractual override, the employment of so-called “digital locks” like DRM, or other legal and technical mechanisms could conceivably limit authors’ ability to use AI-generated works the way they might use more traditional public domain materials.