Ensuring text and data mining: remaining issues with the EU copyright exceptions and possible ways out

Rossana Ducato* (Corresponding Author), Alain Strowel

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Text and Data Mining (TDM) is a vital tool in the Big Data economy. TDM uses techniques from natural language processing, machine learning, information retrieval, and knowledge management for the automated analysis of digital content (structured and unstructured data), in order to extract information, identify patterns, discover new trends, insights or correlations.
The importance of TDM has been understood by the European legislator, which has introduced two specifically tailored exceptions in the Copyright in the Digital Single Market Directive. After a critical analysis of the new provisions, the paper argues that they still present several flaws that risk stifling AI developments in Europe. Thus, the contribution outlines an interpretative framework, based on the analysis of the infringement test, to rethink the rights of reproduction and extraction in line with the economic rationale of copyright and the database right. Furthermore, the paper makes suggestions to improve the TDM exceptions at the national level. In conclusion, it points out the remaining challenges of private ordering and trade secrets for research and AI innovation.
Original language: English
Pages (from-to): 322-337
Number of pages: 16
Journal: European Intellectual Property Review
Issue number: 5
Publication status: Published - 21 Apr 2021

Bibliographical note

This article updates and expands the work presented in A. Strowel and R. Ducato, “Artificial intelligence and text and data mining: a copyright carol” in E. Rosati (ed.), Handbook of EU Copyright Law, Routledge, forthcoming 2021.
A sincere thanks to Roberto Caso and Ula Furgal for the constructive discussion on an early draft of this article.
The authors have jointly conceived the paper and share the views expressed therein. Nonetheless, while Section 4 is attributable to Alain Strowel, Section 3 is specifically attributable to Rossana Ducato. Both authors equally contributed to the drafting of the remaining sections.


  • Copyright
  • sui generis right
  • exceptions and limitations
  • text and data mining
  • research
  • technological protection measures
  • contract

