AI’s Copyright Enigma: Indian Laws May Fall Short Of Ensuring Fair Play

India must carefully assess the copyright issue pertaining to artificial intelligence, which is crucial for the innovation ecosystem and India’s aspirations to become a global AI leader

The New York Times recently sued ChatGPT owner OpenAI for copyright infringement, claiming the latter had used millions of NYT articles to train AI chatbots without permission. Several other authors, artists, and media houses have filed similar lawsuits against AI developers in the US and the UK, though we are yet to see any legal action in India.

The Secretariat had earlier covered the developments in the legal tussle between the NYT and OpenAI, and questioned whether legacy media houses, including the NYT, are doing enough to adapt to the inevitability of artificial intelligence.

AI tools today can write novels, produce music, generate images, and even create videos. People are using DALL·E to design digital art for their brand campaigns. ChatGPT is used to create catchy advertisement slogans.

AI models need to train on massive datasets to produce such outputs. The datasets may include existing images, books, articles, and paintings, many of which are copyright protected.

For instance, one can ask Stable Diffusion, an image-generating AI tool, to produce images in the “style” of an artist. To do this, Stable Diffusion would have “learnt” the artist’s style by training on countless of their paintings.

If the image generated by Stable Diffusion is not substantially similar to the artist’s paintings, is the AI model still liable for copyright infringement only because it trained on copyrighted paintings of the artist? There is no straight answer.

AI Builds On Human Ideas: Is That A Copyright Violation?

Copyright laws do not protect ideas. They protect how an idea is expressed. AI developers thus argue that similar to how copyright laws permit humans to learn from and build upon others’ ideas and concepts, AI tools are also simply “learning” patterns, facts, and correlations from these copyrighted materials to produce new outputs.

Creators, however, disagree, arguing that AI models extract the exact expressive choices of the creator, such as the words the author used, their order of placement, and the style of writing or painting, and that these choices are reflected in the outputs generated.

Another defence taken by AI developers is that of ‘fair use’. The fair use exception under US copyright law allows the use of copyrighted content without the creator’s permission in certain cases. It is determined through a four-factor test, which looks at the purpose and character of the use, the nature of the copyrighted work, the amount of the work used, and the effect on the market value of the original.

AI developers argue that using copyrighted material for AI model training serves a purpose distinct from the original work, and that the output generated is not a mere substitute for the original. Hence, it does not affect the original’s market value. In other words, the AI tool “transforms” the original works into something different.

The slew of copyright infringement actions against AI developers has compelled countries to take steps to address this issue. The US Copyright Office has opened the issue for consultation. Singapore has introduced an exception for text and data mining in its copyright law, which can be used to make copies of copyrighted content to “teach” AI models, though the copies have to be accessed lawfully.

Canada also held a consultation, calling for evidence to demonstrate the impact of a Text and Data Mining (TDM) exception on the AI and creative industries. The UK, on the other hand, rejected an extension of its TDM exception to the commercial development of AI, on account of the adverse impact on the creative industry, though it is actively engaging with stakeholders to arrive at a workable solution. Japan, amid rising creator concerns over AI models, launched a consultation in 2023 to reconsider its information analysis exception (similar to a TDM exception).

Need For Changes In India

Legal or policy proposals are yet to be introduced in India. The fair dealing exception in Indian law is narrower than fair use in the US. It is available only for the specific purposes listed in the law, which currently do not expressly include training AI models.

One argument developers could make is that the broad exception for “research” justifies their collection of training datasets. But whether such arguments will hold would require a conclusive interpretation by a court. There is also no text and data mining exception under Indian law. Thus, there are open questions on whether, and in what circumstances, AI developers can use copyrighted content to train their AI models.

Interestingly, while responding to a parliamentary question on this issue, Minister of State for Commerce and Industry Som Prakash said that India’s existing copyright law was sufficient and there was no proposal to amend the law in the context of AI-generated works.

He added that users of generative AI must obtain permission from creators to use copyrighted content, if such use is not covered under the fair dealing exceptions. However, the response did not clarify which specific fair dealing ground would cover such use.

Industry players have also started calling for specific measures and clarity on this issue. The Digital News Publishers Association (DNPA), an industry body for digital news publishers, recently wrote to the IT Ministry, seeking compensation for the use of its members’ works to train AI models and highlighting the need to protect copyrighted works from potential violations by AI tools.

Legal uncertainty in the ecosystem means that AI developers in India have the sword of an infringement claim hanging over their heads. Alternatively, AI developers may take a conservative approach and only use licensed or copyright-free sources to train their AI models, which affects the quality of the data sets, and in turn the efficacy of the AI model itself.

This is because both the quantity and diversity of the data are crucial for the efficacy of the AI model. AI developers argue that an AI model trained on a narrow data set will not have enough information to learn the required patterns needed to generate high-quality outputs.

This particularly affects AI startups, which lack adequate resources to acquire licensed data sets. They will either have to incur huge costs to procure licensed data or limit their training to unprotected sources, thereby reducing their competitive advantage.

Promote Innovation, Protect Creators’ Interests

However, it is also important to give creators control over how their works are used. If the outputs generated by AI models directly compete with the creator’s original work, it not only affects the creator’s livelihood but also strikes at the heart of what copyright law stands for: incentivising creative expression.

Thus, there is a need to have a framework that promotes innovation while also protecting the interests of creators.

Although Minister of State for Electronics and Information Technology Rajeev Chandrasekhar has recognised that the copyright issue in AI is important, the observation has yet to translate into any formal dialogue in the country. As India positions itself as a global leader in AI innovation by undertaking initiatives for skilling, building computing infrastructure and data sharing, it must also assess this issue.

India can take a leaf out of other jurisdictions’ books and hold extensive stakeholder consultations to get views from both developers and creators. This will help it take a considered and balanced view, which is crucial for the AI innovation ecosystem and India’s aspirations to become a global AI leader.

(The authors are lawyers with Ikigai Law, a tech-focused law and policy firm. Aman Taneja of Ikigai Law also gave inputs on the article. Views expressed are personal.)
