Text Data Mining Research DMCA Exemption Renewed and Expanded

Posted October 25, 2024
U.S. Copyright Office 1201 Rulemaking Process, taken from https://www.copyright.gov/1201/

Earlier today, the Library of Congress, following recommendations from the U.S. Copyright Office, released its final rule adopting exemptions to the Digital Millenium Copyright Act’s prohibition on circumvention of technological protection measures (e.g.,  DRM).  

As many of you know, we’ve been working closely with members of the text and data-mining community as well as our co-petitioners, the Library Copyright Alliance (LCA) and the American Association of University Professors (AAUP), to petition for renewal of the existing TDM research exemption and to expand it to allow researchers to share their research corpora with other researchers outside of their university (something not previously allowed). The process began over a year ago and followed an in-depth review process by the U.S. Copyright Office. 

We are very pleased to see that the Librarian of Congress both approved the renewal of the existing exemption and approved an expansion that allows for research universities to provide access to TDM corpora for use by researchers at other universities. 

The expanded rule is poised to make an immediate impact in helping the TDM researchers collaborate and build upon each other’s work. As Allison Cooper,  director of Kinolab and Associate Professor of Romance Languages and Literatures and Cinema Studies at Bowdoin College, explains:

“This decision will have an immediate impact on the ongoing close-up project that Joel Burges, Emily Sherwood, and I are working on by allowing us to collaborate with researchers like David Bamman, whose expertise in machine learning will be valuable in answering many of the ‘big picture’ questions about the close-up that have come up in our work so far.”

These are the main takeaways from the new rule: 

  • The exemption has been expanded to allow “access” to corpora by researchers at other institutions “solely for purposes of text and data mining research or teaching.” There is no more requirement that access be granted as part of a “collaboration,” so new researchers can ask new and different questions of a corpus. Access must be credentialed and authenticated.
  • The issue of whether a researcher can engage in “close viewing” of a copyrighted work has been resolved—as the explanation for the revised rule puts it, researchers can “view the contents of copyrighted works as part of their research, provided that any viewing that takes place is in furtherance of research objectives (e.g., processing or annotating works to prepare them for analysis) and not for the works’ expressive value.” This is a very helpful clarification!
  • The new rule also modified the existing security requirements, which provide that researchers must put in place adequate security protocols to protect TDM corpora from unauthorized reuse and must share information about those security protocols with rightsholders upon request. That rule has been limited in some ways and expanded in others. The new rule clarifies that trade associations can send inquiries on behalf of rightsholders. However, inquiries must be supported by a “reasonable belief” that the sender’s works are in a corpus being used for TDM research.

Later on, we will post a more in-depth analysis of the new rules–both TDM and others that apply to authors. The Librarian of Congress also authorized the renewal of a number of other rules that support research, teaching, and library preservation. Among them is a renewal of another exemption that Authors Alliance and AAUP petitioned for, allowing for the circumvention of digital locks when using motion picture excerpts in multi-media ebooks. 

Thank you to all of the many, many TDM researchers and librarians we’ve worked with over the last several years to help support this petition. 

You can learn more about TDM and our work on this issue through our TDM resources page, here.