Copyright and AI: 7 recommendations to the UK (and the EU alike)


On 25 February, the SAA submitted its response to the UK’s public consultation on copyright and artificial intelligence. The UK proposes to introduce an exception for AI training with the possibility of rights reservations, similar to the EU. We responded that an exception for AI training is far from helpful in achieving the UK’s (and the EU’s) objective of promoting innovation while protecting the creative sector. Collective licensing is the most acceptable solution.

This text is based on the SAA’s submission to the UK consultation and outlines our 7 key recommendations to the UK (and the EU alike).

  1. AI companies should ask for permission (licensing), not authors' objections.
  2. Collective management provides AI companies with authorisation (licences) and authors with remuneration.
  3. Transparency is key. It does not undermine trade secrets.
  4. The text and data mining exception is a copyright loophole.
  5. Copyright protects human creativity and authorship.
  6. Licensing of works should cover both AI inputs and outputs.
  7. AI generated content must be labelled.

Introduction

A study by CISAC has estimated that audiovisual authors will lose 21% of their revenues between now and 2028, while the market for AI-generated content will grow from €3 billion to €64 billion over the same period. To halt this trend, which is impoverishing the creative sector, authors should be remunerated for the use of their work in the context of AI. Collective management organisations (CMOs) are best placed to handle the ample repertoire that AI companies need to train and develop their AI models and systems, and they represent a single point of contact able to drastically reduce the cost of multiple individual licences.

Indeed, it is well known that AI companies consider copyright-protected works to be high-quality material, as these ensure optimal development of the technology. OpenAI itself has admitted as much to the UK Parliament. Conversely, synthetic works not created by humans lead to a deterioration of the technology.

As much as AI companies need copyright-protected works, the legislator should recognise that innovation cannot be encouraged by disregarding authors' rights. Letting AI companies use protected works without any form of authorisation or remuneration would lead to a constant devaluation of human-authored works. It is unacceptable to encourage innovation in an emerging technology on the shoulders, and at the expense of the future, of another sector.

As the UK explores its options, it has an opportunity to learn from the EU's mistakes and become a pioneer in the correct application of copyright in the context of AI. These are the SAA's recommendations for the UK, as well as for the EU.


1) AI companies should ask for permission (licensing), not authors' objections.

The audiovisual sector is characterised by many intermediaries, and authors are far removed from the final exploiters of their works, which are major economic operators in the digital environment. Legislators must therefore ensure that authors remain associated with the revenues generated by the exploitation of their works.

AI developers should seek licences for the works subject to an opt-out. While this should be basic practice, the situation in the EU already tells us that it is far from happening. Even in the presence of a reservation of rights, works are still being used, further feeding the growing number of court cases against AI companies and creating an atmosphere of distrust among authors towards AI companies.

The wording ‘machine-readable’ used by the EU legislator creates uncertainty and gives AI companies an opening to disregard a rights reservation declaration, even though AI should be smart enough to understand all languages, including natural language (as a judgement of the Hamburg Regional Court has specified). Moreover, legislation should clarify who is in charge of the rights reservation declaration and where it should appear: on the CMO's website? On the creator's or producer's website? In the content of the work itself? On the website of the model provider or generative AI system? All these questions would need to be answered before following a model that creates more problems than it solves.

Additionally, pushing standards such as robots.txt, or more generally establishing a rights reservation scheme, as the EU is trying to do, merely gives AI companies a way to escape their liability. It has yet to be established how an AI model can be ‘un-trained’: works that have already been used for training cannot currently be removed from a model. This fact alone should further encourage the creation of a licensing market and of liability provisions that AI companies must answer to.
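To illustrate why robots.txt-style reservations fall short, consider a hypothetical robots.txt of the kind rightsholders are pushed towards (the user-agent tokens GPTBot and Google-Extended are publicly documented by OpenAI and Google respectively; the rest is a sketch, not a recommended practice):

```text
# Hypothetical robots.txt attempting a rights reservation.
# Only crawlers identified by name are addressed; compliance is
# voluntary, and works already scraped cannot be removed from a model.

User-agent: GPTBot           # OpenAI's training crawler
Disallow: /

User-agent: Google-Extended  # Google's AI-training control token
Disallow: /

User-agent: *                # every other crawler remains unaffected
Allow: /
```

Each AI crawler must be blocked by name, so unlisted or non-compliant crawlers are untouched, and nothing in the file creates an obligation to license or to remove content already used for training.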

If a rights reservation regime similar to the EU one is established in the UK (which we strongly oppose), rightsholders should be able to express their rights reservation in all languages, including natural language. It is in fact impossible for each rightsholder to pursue every AI company that appears every other day. It is far easier for AI companies to conduct proper rights compliance before they start training their models and to ensure that, where a rights reservation declaration is in place, a licensing agreement is concluded. Encouraging a licensing market would also address the problem of un-training the AI.

2) Collective management provides AI companies with authorisation (licences) and authors with remuneration.

The current market for licensing audiovisual works for AI training is practically non-existent. The latest available data suggests that licensing is limited to a few agreements between certain news companies and AI providers. It is therefore imperative to introduce measures that oblige AI companies to conclude licences and remunerate authors for the use of copyright-protected works.

Considering the amount of content needed for optimal AI development, collective management solutions are the best way to ensure that AI companies can obtain permission from CMOs to use entire repertoires and that the revenues generated from this use are duly shared among the authors.

It is more time-consuming and burdensome for both businesses and authors to deal with authorisations work by work than to rely on an entity such as a CMO that can handle all authorisations and ensure that the revenues for the use of works flow from the AI companies to the authors. Individual licensing would only complicate the conclusion of agreements and raise costs for AI companies, which would have to conclude thousands of agreements and face the risk of copyright infringement claims from rightsholders with whom they have not concluded licences.

As in other cases where the use of large numbers of works is necessary, remuneration of authors can be ensured by relying on collective management systems. Scholarship has long agreed: see for instance Xalabarder, who considers collective management the best solution when dealing with copyright-protected works in the digital environment.

3) Transparency is key. It does not undermine trade secrets.

The EU's current approach to transparency obligations in AI matters encourages AI companies to be overly general in their information sharing and prevents rightsholders from obtaining adequate, up-to-date information about which works are being used. The current EU AI framework in fact only obliges providers to publicly disclose a summary of the content used, and AI companies shield themselves behind trade secrets to justify their unwillingness to collaborate. Rightsholders are left to assume that all protected works are being used – even works gathered from illegal sources.

Transparency obligations are a fundamental part of the AI framework and should enable rightsholders to know which works are being used, so that they can duly license those works and collect – or, in the case of CMOs, redistribute – the revenues from their use. Without such information, it is impossible for rightsholders to operate.

While CMOs are capable of handling confidential information, sharing, for example, a list of URLs has recently been considered not to undermine any trade secrets. A report commissioned by the French Ministry of Culture has specified that a list of URLs does not constitute a trade secret, since it is merely the ‘ingredient’ of the AI. Filtering methods, which rightsholders neither need nor seek to obtain, can instead be considered the ‘recipe’ and can therefore be protected under trade secrets regulations.

4) The text and data mining exception is a copyright loophole.

The EU TDM exception for non-commercial research purposes (Art. 3 DSM Directive) is defensible in light of its original scope. However, the way it is currently being abused should raise alarm about its application, especially when it comes to generative AI and its value chain. Indeed, in the EU, not only has this exception been extended to generative AI – already a questionable choice, since AI training is not equivalent to TDM [see Tim W. Dornis] – but its application in the AI context also reaches far beyond this already enlarged scope.

The problem emerged in the judgement of the Hamburg Regional Court. The case involved a photographer (the author) and the provider/developer of a dataset. The provider had created the dataset by taking advantage of the research exception for TDM. However, the dataset was later used by an AI company for commercial purposes. Unfortunately, despite the author pointing out the abuse, the court did not recognise the practice as problematic and upheld the applicability of the TDM exception for research purposes. This choice has serious consequences for the application of the exception: this TDM loophole, if exploited by dataset providers, calls into question even a system based on rights reservation.

5) Copyright protects human creativity and authorship.

One of the founding principles of copyright protection is the incentivisation of human creativity. Generative AI outputs need no such incentive: their production shows no sign of stopping. Moreover, copyright requires human authorship. When content is generated by AI alone, human authorship is absent, which further argues against the protection of AI-generated works.

The case of AI-assisted works is slightly different. Creation with the assistance of technology has long been common and has proved helpful in the creative sectors. In the audiovisual sector, AI tools have been used for years to improve visual effects and streamline post-production, enhancing the audience's visual experience. AI-assisted creation does, however, always require human intervention and direction. Eligibility for copyright protection should continue to rely on the originality and creative choices of human beings.

6) Licensing of works should cover both AI inputs and outputs.

Considering the risk of infringing AI outputs, and the difficulty of controlling end-users' behaviour, licensing of works should cover AI outputs as much as AI inputs, and AI providers should implement technological measures, including keyword filtering, to reduce the risk of works being reproduced (in full or in part) in the output. It is worth considering a framework for the liability of AI providers that results in a limited liability system for end-users, covering both input- and output-related infringements. Rules on the burden of proof and on the removal of copyright-infringing content are also worth establishing. The proposed measures would keep AI applications attractive for end-users, while providing the remuneration due to the authors of the protected works used.

7) AI generated content must be labelled.

Generative AI outputs should certainly be labelled as AI-generated. It is important that consumers can tell AI-generated content from a human-created work via a clear label, as the two are often difficult to distinguish and will become even harder to distinguish as AI develops. Beyond informing consumers, labelling matters for the value of human works: by keeping the public aware of the difference, human authorship will be able to maintain its value over time. It is welcome that the EU has included output-labelling provisions in the AI Act. However, it remains to be seen how these provisions will be applied and what effect they will have on the market.