Date: 2026-01-09

Towards the end of 2025, the Department for Promotion of Industry and Internal Trade published a working paper on 'Generative AI and copyright'. It is the report of an expert committee set up to examine the vexed question of copyright protection in the era of artificial intelligence, and its findings will most likely form the basis of India's policy on the subject in the near future.

Generative AI products like ChatGPT, Gemini and Perplexity are built on large language models (LLMs) that generate content based on prompts given by users. For instance, one can provide a brief storyline and ask ChatGPT to write a short story in the style of RK Narayan or Premchand, and it will do so. Similarly, Dall-E, Midjourney and other text-to-image and text-to-video models can, when prompted, produce a painting on a given subject in the style of, say, Jamini Roy or MF Husain, or a short movie clip in Satyajit Ray's style. Such outputs are possible because the models are trained on data drawn from a variety of sources (like the novels of Narayan or the paintings of Husain and others).

The data used for training AI models may fall into different categories: copyrighted works, works whose copyright has expired, and data in the public domain available for 'fair use'. Ever since technology firms launched commercial versions of generative AI products, the question of their use of copyrighted material, such as books, research papers, photographs, films and other forms of creative expression, has become central to the AI debate. It has posed complex legal, ethical and moral questions. Another unresolved issue is the copyrightability and authorship of AI-generated outputs.

Governments and courts in many countries have been struggling to deal with this new phenomenon: the training of AI models on data. Technology companies worldwide, including in India, have argued that AI models do not violate copyright laws because they do not copy or plagiarise copyrighted books, photographs, etc, but use them only as segregated datasets from which algorithms learn patterns, styles and structures, expressed as statistical relationships, that enable them to generate new content. This, they claim, falls under the well-accepted doctrine of 'fair use' of creative works.

To argue that AI models are not 'copying' original works but only 'learning' from them does not hold water. This is because the process of training an AI system involves multiple steps, including copying and storage of data (original works), which, in effect, constitutes infringement. This, the AI industry says, can at most be considered a 'technical infringement', not a legal one.

Rejecting the notion that no copyright licensing is required, and after studying various models being discussed or implemented elsewhere in the world, the expert committee has recommended a hybrid framework for India. It has suggested a blanket licence to AI developers for the use of lawfully accessed copyright-protected works for training AI systems, provided the copyright holders are paid a royalty. But rightsholders will not have the option to withhold their works from use in the training of AI systems.

A centralised non-profit entity consisting of rightsholder organisations and designated by the government would be responsible for collecting the payments from AI developers. This entity would have copyright societies and industry organisations as its members, and these member-organisations would be responsible for passing on the royalties to individual creators. A certain percentage of the revenue generated from AI systems trained on copyrighted content would be payable as royalties, and the rates would be fixed by a government committee.

Technology companies argue that regulating the use of creative works and enforcing new copyright laws to make licensing compulsory would hinder technological innovation. They demand complete exemption of text and data mining (TDM) from copyright laws to enable the training of AI models. The representative of the industry body, Nasscom, in the expert committee disagreed with its recommendations and submitted a dissent note. Instead of a royalty system based on the revenue of AI models, Nasscom wants a layered system for using copyrighted materials.

Nasscom has suggested that it is the responsibility of rightsholders to prevent the use of their publicly available work for TDM, and for this, they should be given an 'opt out' option at the point of availability of their work. For content which is not publicly accessible, rightsholders should be able to protect their rights through contract or licence terms. All this puts the onus on rightsholders to protect their work, which, under the present circumstances, is very difficult because copyright is being violated openly and protected material is available online illegally. Technology companies have already mined data from millions of books thus available online. The 'opt out' option also seems impractical in this scenario.

In both cases, people who create new works (rightsholders) are going to be at the receiving end. The expert panel wants copyright-protected works to be automatically available for training AI systems, giving the AI industry legal certainty while denying copyright holders the right to opt out of the TDM system. The industry, on the other hand, is not ready to accept the royalty-based system proposed by the expert panel, and wants to make copyright holders responsible for protecting their works through means like an 'opt out' or individual contracts. Both regimes are unfair to the creators of original content. In any case, the creators will have to depend on the policies and whims of the platforms (and other mediators) where their content is displayed.

AI companies have begun generating billions of dollars in business, and the volume is projected to grow. Yes, LLMs and other models are a result of technological innovation, but one that critically hinges on the digital theft of work produced by millions of creative people around the globe over decades. The copyright framework India is proposing must place the interests of creators ahead of those of the industry.