In December 2023, The New York Times sued OpenAI for copyright infringement in the training of ChatGPT. In February 2024, The Intercept, Raw Story, and AlterNet filed similar lawsuits against OpenAI.
The prevailing view in the industry is that Large Language Models are black boxes, it is not technically possible to allocate fractional royalty payments to the copyright holders of documents used in the production of an LLM response.
At Neocortix, we've invented a new technology called Deep Attribution Networks, which allow Modern AI Systems to explicitly remember their training data, so that they can identify the source of what they are talking about, on a token-by-token basis. This can be used to determine the number of tokens in an LLM response attributable to different source documents. So it is easy to compute the fractional royalties to be paid to the copyright holders who contributed to the LLM response. You can see it working in the demo, above.
Deep Attribution Networks by Neocortix are an essential technology for providing citations and allocating royalties in Large Language Models.