Google Cloud has announced in a blog post that if customers are challenged on copyright grounds for using its generative AI products (such as Duet AI), Google will offer limited indemnity and assume responsibility for the potential legal risks involved.
Why?
With many different generative AI services (such as AI chatbots and image generators) being powered by back-end neural networks / Large Language Models (LLMs) that have been trained on content from many different sources (without consent or payment), businesses that use their outputs face risks. For example, content creators like artists and writers may take legal action and seek compensation where LLMs have been trained on their work and, as a result, can appear to copy their style in their output, raising potential issues of copyright infringement, lost income, devaluation of their work, and more. Real examples include:
– In January this year, illustrators Sarah Andersen, Kelly McKernan, and Karla Ortiz filing a lawsuit against Midjourney Inc, DeviantArt Inc (DreamUp), and Stability AI, alleging that the text-to-image platforms have used their artworks, without consent or compensation, to train their algorithms.
– In February this year, Getty Images filing a lawsuit against Stability AI, alleging that it had copied 12 million images (without consent or permission) to train its AI model.
– Comedian Sarah Silverman joining lawsuits (in July 2023) accusing OpenAI and Meta of training their algorithms on her writing without permission.
– GitHub facing litigation over allegations that it scraped artists’ work for its AI products.
– Microsoft, its subsidiary GitHub, and OpenAI facing a lawsuit over alleged code copying by GitHub’s Copilot programming suggestion service.
Although all of these are lawsuits against the AI companies themselves rather than their customers, the AI companies recognise that this is also a risky area for customers, given how their AI models have been trained and where their outputs could be drawn from.
What Are The AI Companies Saying In Their Defence?
Examples of the kinds of arguments that AI companies accused of copyright infringement are using in their defence include:
– Some AI companies argue that the data used to train their models falls under the principle of “fair use.” Fair use is a legal doctrine that promotes freedom of expression by allowing the unlicensed use of copyright-protected works in certain circumstances. For example, the argument is that the vast amount of data used to train models like OpenAI’s GPT-4 (which powers ChatGPT) is processed in a transformative manner, which AI companies like OpenAI may argue means the output generated is distinct and not a direct reproduction of the original content.
– Another defence revolves around the idea that AI models, especially large ones, aggregate and anonymise data to such an extent that individual sources become indistinguishable in the final model. This could mean that, while a model might be trained on vast amounts of text, it doesn’t technically “remember” or “store” specific books, articles, or other content in a retrievable form.
– Yet another counter-argument from some AI companies is that while an AI tool has the ‘potential’ for misuse, it is up to end-users to use it responsibly and ethically. AI companies can argue that because they provide guidelines and terms of service that outline acceptable uses of their technology, and actively try to discourage or prevent uses that could lead to copyright infringement, they are (ostensibly) encouraging responsible use.
Google’s Generative AI Indemnification
Following Microsoft’s September announcement that it would defend its paying customers against any copyright lawsuits arising from their use of Copilot, Google has just announced its own AI indemnification protection for its (pay-as-you-go) Google Cloud customers. Google says that since it has embedded the always-on ‘Duet AI’ across its products, it needs to put its customers first, and that in the spirit of “shared fate” it will “assume responsibility for the potential legal risks involved.”
A Two-Pronged Approach
Google says it will be taking a “two-pronged, industry-first approach” to this indemnification, providing indemnity both for the training data used by Google for its generative AI models and for the generated output of those models – two layers of protection.
In relation to the training data, which has been the source of many lawsuits against AI companies and could be an area of risk for Google’s customers, Google says its indemnity will cover “any allegations that Google’s use of training data to create any of our generative models utilised by a generative AI service, infringes a third party’s intellectual property right.” For business users of Google Cloud and its Duet AI, this means they’ll be protected against third-party claims of copyright infringement arising from Google’s use of training data.
In relation to the generated output indemnity, Google says it will apply to Duet AI in Google Workspace and to a range of Google Cloud services, which it names as:
– Duet AI in Workspace, including generated text in Google Docs and Gmail and generated images in Google Slides and Google Meet.
– Duet AI in Google Cloud including Duet AI for assisted application development.
– Vertex AI Search.
– Vertex AI Conversation.
– Vertex AI Text Embedding API / Multimodal Embeddings.
– Visual Captioning / Visual Q&A on Vertex AI.
– Codey APIs.
Google says the generated output indemnity means that customers using the above-named products will be covered against third-party IP claims, including copyright.
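To illustrate what ‘using’ one of these covered services looks like in practice, here is a minimal sketch of calling the Vertex AI Text Embedding API through Google’s Python SDK (google-cloud-aiplatform). The project ID is a placeholder, and the model version shown is an assumption – check Google’s current documentation for what’s available:

```python
# pip install google-cloud-aiplatform
# Minimal sketch (assuming a recent SDK version) of calling the
# Vertex AI Text Embedding API, one of the services named in
# Google's generated output indemnity.
import vertexai
from vertexai.language_models import TextEmbeddingModel

# "your-project-id" is a placeholder for your own Google Cloud project.
vertexai.init(project="your-project-id", location="us-central1")

# Load a text embedding model (model version is an assumption).
model = TextEmbeddingModel.from_pretrained("textembedding-gecko@001")

# Embed a short piece of text; each result is a vector of floats.
embeddings = model.get_embeddings(["Draft a product description for a blue bicycle."])
for embedding in embeddings:
    print(len(embedding.values))
```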
One Caveat – Responsible Practices
The one caveat that Google gives is that it won’t be able to cover customers where they have intentionally created or used generated output to infringe the rights of others. In other words, customers can’t expect Google to cover them if they ask Duet AI to deliberately copy another person’s work/content.
The Difference
Google says the difference between its AI indemnity protection and that offered by others (e.g. Microsoft) is essentially that it covers the training data aspect and not just the output of its generative AI tools.
Bots Talking To Each Other?
Interestingly, in another twist in the complex and emerging world of generative AI, last week brought reports that companies are using “synthetic humans” (i.e. bots), each with characteristics drawn from ethnographic research on real people, to take part in conversations with other bots and real people and help generate new product and marketing ideas.
For example, Fantasy, a company that creates ‘synthetic humans’ for such conversations, has reported that the benefits of using them include both the creation of novel ideas for clients and prompting the real humans included in their conversations to be more creative, i.e. stimulating more creative brainstorming. However, although this sounds useful, one aspect to consider is where the bots get their ‘ideas’ from, since they’re not able to actually think. Could they potentially use another company’s ideas?
What Does This Mean For Your Business?
Since the big AI investors like Google and Microsoft have committed so fully to AI and introduced ‘always-on’ AI assistants to services for their paying business customers (thereby encouraging them to use the AI without being able to restrict all the ways it’s used), it seems right that they’d need to offer some kind of cover, e.g. for any inadvertent copyright issues.
This is also a way for Google and Microsoft to reduce the risks and worries of their business customers, aiding customer retention. Google, Microsoft, and other AI companies have also realised that they can feel relatively safe in offering indemnity at the moment, as they know that many of the legal aspects of generative AI’s outputs and the training of its models are complex areas that are still developing.
They may also feel that taking responsibility in this way at least gives them a chance to get involved in and argue the cases (particularly with their financial and legal might) that will set the precedents guiding the use of generative AI going forward. It’s also possible that many cases could take some time to resolve, given the complexities of this new, developing, and often difficult frontier of the digital world.
Some may say that many of the services Google is offering indemnity for could mostly be classed as internal-use services, whilst others may say that the company could be opening itself up to a potential tsunami of legal cases, given the list of services it covers and the fact that not all business users will be versed in the nuances of responsible use in what is a developing area. Google and Microsoft may ultimately need to build legal protection, and guidance on acceptable use, into the output of their generative AI tools.
As a footnote, it would be interesting to see whether ‘synthetic human’ bots could be used to discuss and help resolve many of the complex legal areas around AI use (AI discussing the legal aspects of itself with people, perhaps with lawyers), and whether AI will be used in research for any legal cases over copyright.
Generative AI is clearly a fast-developing and fascinating area, with both benefits and challenges.