Scholars who engage in Text and Data Mining (TDM) research face a unique set of challenges. Over the years, the Authors Alliance has worked closely with researchers who engage in text and data mining, both to help clarify how existing law applies to their work and to advocate for improvements to make this kind of research easier. This has included advocating for exemptions to the Digital Millennium Copyright Act (DMCA), a law that forbids people from bypassing technical protection measures on copyrighted works, even when the underlying use is fair. The TDM exemptions enable valuable digital humanities research and teaching that would otherwise be impossible under the DMCA. Below is more information about our research, education and advocacy related to text and data mining.
- Text and Data Mining Under U.S. Copyright Law: Landscape, Flaws & Recommendations. In 2023, we interviewed over 40 TDM practitioners across different institutions and compiled a report that outlines and analyzes the current legal landscape for TDM research. The updated report now discusses the anticipated impact of the 2024 TDM exemptions. You can download the full report here.
- The current Text and Data Mining research exemptions from the DMCA can be found at 37 C.F.R. 201.40(b)(4) and (b)(5). Our blog post summarizes the key takeaways of the 2024 exemptions. The Authors Alliance petitioned for the renewal and expansion of the 2021 TDM exemptions in July 2023. You can read our petitions to renew and expand the existing TDM exemptions to Section 1201 of the DMCA for text-based works and for films. You can also read our comments in response to the U.S. Copyright Office’s Notice of Proposed Rulemaking in support of our petition. We are grateful to our co-petitioners, the Library Copyright Alliance and American Association of University Professors, for their support in petitioning the U.S. Copyright Office for the TDM exemptions.
- Previous DMCA exemption for TDM research. As background for the 2021 TDM exemptions, you can read our 2020 petition, comments in support, response to Copyright Office questions, reply comments, and a record of our ex parte meeting with the Office.
- TDM educational resources include slides from the more than two dozen in-person workshops and online webinars we have led, such as this session at Princeton in 2024. You can find an example of the slides for these workshops here, and a version of the lecture portion of this workshop is available in a recording here. If you are interested in learning more about legal aspects of TDM or think that Authors Alliance may be able to support efforts to navigate these issues at your institution (e.g., by meeting with individuals or leading a workshop), please contact info@authorsalliance.org.
If you would like to learn more about TDM, check out these external resources:
- The Data-Sitters Club, where you can learn from real-world examples of how TDM works
- LLTDM-X, a project that explores cross-border TDM issues
- Building Legal Literacies for Text Data Mining, an open educational resource (OER) that helps practitioners understand the laws, policies, and ethics of TDM