The European Union has established an exception to copyright law to regulate automated text and data mining, including for AI training. This mechanism redefines the relationship between creators and innovators.
The European Union marked a turning point in the race for innovation by adopting an exception to copyright law known as text and data mining (TDM). By allowing digital tools to automatically explore and analyze huge volumes of data, including copyrighted content, this measure gives AI companies an unprecedented opportunity to develop disruptive solutions and strengthen their competitiveness in an ever-changing digital environment. It is now accepted that the exception applies to generative AI services such as ChatGPT, which are trained on potentially protected works. Its implementation will nevertheless require adjustments.
Exceptions to copyright exist to maintain a fair balance between, on the one hand, the interest of owners of copyright and related rights in the protection of their rights and, on the other, the interests and fundamental rights of users of protected content, as well as the general interest. The TDM exception was created to support innovation and provide greater legal certainty for artificial intelligence services trained on potentially protected works. Copyright had indeed been identified as a potential obstacle to the development of artificial intelligence.
The TDM exception preserves this balance through a twofold requirement: access to the protected content must be lawful, and rights holders retain the ability to exercise an opt-out to oppose the mining of their works.
Practical challenges
The opt-out right is an essential guarantee of the balance struck by the copyright exception, but its practical implementation raises difficulties for providers. Many rights holders express their opposition by letter or in other non-standardized formats. This lack of standardization makes such reservations difficult, if not impossible, for AI providers to process, even though the European text requires an "appropriate" opt-out, that is, one expressed through machine-readable means identifiable by state-of-the-art technologies.
To overcome these difficulties, some AI players offer rights holders their own control tools, allowing them to exercise their opt-out through processes that the providers know their machines can read.
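By way of illustration, one widely used machine-readable opt-out signal is a site's robots.txt file, which AI crawlers such as OpenAI's GPTBot state they honor. The following minimal sketch in Python shows how such a reservation can be parsed and checked; the robots.txt content, domain, and paths are hypothetical, and the standard-library parser stands in for whatever tooling a given provider actually uses:

```python
from urllib import robotparser

# A hypothetical robots.txt expressing a TDM opt-out: the AI crawler
# is blocked from the whole site, while other crawlers remain allowed.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(robots_txt.splitlines())

# The AI crawler is opted out of the whole site...
print(parser.can_fetch("GPTBot", "https://example.com/articles/essay"))    # False
# ...while other crawlers may still index it.
print(parser.can_fetch("SearchBot", "https://example.com/articles/essay"))  # True
```

Initiatives such as the W3C community group's TDM Reservation Protocol pursue the same goal at the level of HTTP headers and site metadata; the underlying idea is identical: a reservation only works in practice if machines can discover and interpret it automatically.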
Transparency, proof and legal balance
To address the concerns of rights holders, who fear being unable to demonstrate the infringement of their rights despite having exercised the opt-out, the AI Act requires AI providers to publish a sufficiently detailed summary of the training data. This transparency requirement allows creators to verify that no unauthorized exploitation followed the exercise of an opt-out, while preserving the trade secrets of AI providers. It should be remembered that the burden of proof lies, in principle, with the party making the allegation. In infringement matters, this requirement complements existing legal tools, such as infringement seizure (saisie-contrefaçon) or summary proceedings, which already allow rights holders to gather evidence of possible unlawful use. Even so, some question the effectiveness of these traditional means of proof when applied to AI models.
Where rights holders exercise their opt-out but providers nevertheless wish to use the content, particularly during the so-called fine-tuning phase of an AI model's development or for update purposes, the parties may negotiate fair remuneration. In practice, however, individual agreements for each copyrighted work are unrealistic given the immense volume of data involved, and such an approach would hamper innovation in the sector. The next challenge for rights holders and AI operators will be to define the negotiation framework best suited to reconciling their mutual interests.