To Bill Or Not To Bill: Royalty, Revenue In The Age Of AI

The core issue over AI content is copyright and revenue-sharing. Revisiting Fair Use provisions may allow for amicable solutions that permit AI models to access content without hurting the interests of their creators

The evolution of AI-based content, especially content generated by Large Language Models (LLMs), has created a global furore of late. Pundits are still debating its implications for businesses, livelihoods, creativity and what it actually means for society at large.

On the ground, battle lines are being sharply drawn in the content industry, with rival businesses often taking completely different routes on the issue. Some recent developments warrant introspection, as they may have long-run consequences.

One of the biggest developments in this regard is the decision by Penguin Random House (PRH) to amend its copyright wording, which now explicitly states: “No part of this book may be used or reproduced in any manner for the purpose of training artificial intelligence technologies or systems”. PRH, one of the ‘big five’ in English publishing with around 365 imprints under its banner, has clarified that this will be included in all new titles and any backlist titles that are reprinted.

Training The Beast

This stands in sharp contrast to the recent announcement by its ‘big five’ rival HarperCollins, which has signed a deal with Microsoft AI allowing the latter to use nonfiction titles from the publisher to train its AI models. The deal has a provision for authors to opt out of the arrangement and is reportedly not meant to generate new books without human authors. However, the exact details of the deal and how it is to be implemented have not been fully disclosed.

Things get even more interesting when one considers that the HarperCollins deal is an extension of its parent company News Corp’s earlier deal with OpenAI, which gives the latter full access to the former’s ‘assets’, such as The Wall Street Journal, The New York Post, The Daily Telegraph and others, for a sum of US$250 million.

In contrast, the PRH decision seems to run contrary to its parent group Bertelsmann’s position on AI. Bertelsmann has launched a global media campaign to promote opportunities for AI. 

Divided Over Content

The content industry seems divided both vertically and horizontally on how to deal with AI content. This is not just a temporary state of flux but a reflection of a bigger and more fundamental dilemma or ‘conflict’.

The core issue regarding AI-generated content is copyright and the related royalty or other revenue-sharing mechanisms. The copyright system was institutionalised at a global level over the years via the Berne Convention, the WIPO Copyright Treaty, the TRIPS Agreement under the WTO, the Paris Convention, etc.

While these arrangements have adapted well from print to multimedia to digital media, AI-generated content throws the proverbial spanner in the works. AI-generated content poses some inherent problems for copyright. A fundamental one arises from the fact that, since such content is generated by machines and not by humans, it cannot be copyrighted.

The problem is further exacerbated by the fact that AI content is generated by replicating patterns in the data on which the model is trained. Simply put, if an AI is instructed to create a classic like Hamlet, it has to ‘train’ on the writings of Shakespeare. This leads to the fundamental debate on whether such content is indeed an ‘original’ work of creativity or an artificial replication.

A related issue is the attribution of the original content. If an AI system is trained on Shakespeare’s writings to generate replicated content, some attribution of credit (or of the revenue generated from such content) is owed to those who own the rights to Shakespeare’s work. This is the genesis of the ‘conflict’: the machine cannot be considered the ‘author’ of the content, nor can it own its copyright.

Copyright Wars

The obvious outcome of this conflict is litigation. Some of the most notable ongoing global cases are Alter vs OpenAI, Andersen vs Stability AI, Daily News vs Microsoft, and Getty Images vs Stability AI, to name a few. These cover conflicts over content ranging from written material to artwork to audio-visual content.

AI firms have sought to resolve these conflicts by striking high-value deals with content producers. OpenAI, one of the pioneers in LLM applications, has signed deals with Time magazine, Financial Times, Le Monde, Conde Nast and Prisa Media, besides News Corp, even as it battles The New York Times, The Intercept and, more recently, ANI in India in court.

It is presumed that many of the ongoing litigations might eventually be settled through such contractual arrangements, with AI companies entering into commercial agreements with traditional content generators.

However, this also implies that these AI firms gain exclusive access to such content. Content is a critical input for any AI model: the success of the model depends on the data or content on which it trains.

Exclusive access then becomes an effective tool for restricting market access for competitors. This is a classic case of market distortion, where an existing business with deep pockets can create entry barriers for potential competition or prevent it from scaling.

The other extreme position, adopted by PRH, which prevents any form of AI training, risks derailing the development of AI technology.

Both exclusive contracts and total blockade by content generators are equally problematic for the future of AI. But a more fundamental problem is that in the long run, neither addresses the core conflict. That can only be solved by redesigning copyright and licensing provisions to account for machine-generated content.

Fair Use, Fair Dealing

There is ample scope for an equitable solution under copyright regulations, because copyright is not an absolute right. Most copyright regimes have extensive Fair Use and Fair Dealing clauses, which allow the reproduction, copying or use of an author’s work in certain situations without the author’s explicit consent.

Most such provisions recognise usage for research or for the public good as permissible. For instance, in India, Oxford University Press, Cambridge University Press and Taylor & Francis, three global publishing giants, filed a copyright infringement case against a small photocopy service provider that supplied photocopied content from their books to students at Delhi University. In this David vs Goliath battle, the court invoked India’s fair dealing provisions to absolve the photocopy service provider.

Article 13 of the TRIPS Agreement (which reads: “Members shall confine limitations or exceptions to exclusive rights to certain special cases which do not conflict with a normal exploitation of the work and do not unreasonably prejudice the legitimate interests of the right holder”) forms the basis for Fair Use and Fair Dealing provisions across jurisdictions.

At present, these provisions cannot be invoked for training AI models because, under the present conflict, the legitimate interests of the content owners are not duly safeguarded.

However, given that AI-generated content has already triggered a global debate on revisiting copyright rules, Fair Use provisions may also be revisited to allow for amicable solutions that will permit AI models to be developed without hurting the interests of the original creators.

(The author is a New Delhi-based economist with over a decade's experience in studying the digital sector. Views expressed are personal)
