Stability AI Sued by Getty Images and Artists for Training Stable Diffusion on Copyrighted Images
Severity
High
Getty Images and multiple artists filed lawsuits against Stability AI alleging the company trained Stable Diffusion on billions of copyrighted images without permission, seeking damages and injunctive relief.
Category
Copyright Violation
Industry
Media
Status
Litigation Pending
Date Occurred
Aug 22, 2022
Date Reported
Jan 17, 2023
Jurisdiction
US
AI Provider
Other/Unknown
Model
Stable Diffusion
Application Type
API Integration
Harm Type
Legal
Human Review in Place
No
Litigation Filed
Yes
Litigation Status
Pending
copyright, stable_diffusion, artist_rights, fair_use, training_data, intellectual_property
Full Description
In January 2023, Getty Images filed a lawsuit in Delaware federal court against Stability AI, alleging that the company unlawfully copied and processed millions of images from Getty's collection to train its Stable Diffusion AI image generator. The lawsuit claimed that Stability AI scraped Getty's copyrighted images without permission as part of training the model on the LAION-5B dataset, which contains approximately 5.85 billion image-text pairs collected from across the internet.
Separately, a class-action lawsuit was filed by artists Sarah Andersen, Kelly McKernan, and Karla Ortiz against Stability AI, Midjourney, and DeviantArt in the Northern District of California. The artists alleged that these companies violated copyright law by training their AI models on billions of copyrighted images without consent or compensation. The complaint argued that the resulting AI systems produce derivative works that compete directly in the market with the original artists whose work was used in training.
The legal challenges center on whether training AI models on copyrighted material constitutes fair use under copyright law. Getty Images claimed that Stability AI's use was commercial in nature and harmed the market for licensed images. The company pointed to instances where Stable Diffusion generated images containing corrupted versions of Getty's watermark as evidence of direct copying. Artists in the class-action suit argued that the AI systems learned to replicate their distinctive artistic styles, effectively creating unauthorized derivatives.
Stability AI defended its practices by arguing that training AI models on publicly available images constitutes fair use, similar to how humans learn by observing existing art. The company contended that the AI does not store or reproduce copyrighted images but rather learns statistical patterns to generate new, original works. However, critics pointed out that the AI can be prompted to create images "in the style of" specific artists, potentially undermining their market value.
The lawsuits seek monetary damages, injunctive relief to prevent further alleged copyright infringement, and destruction of AI models trained on copyrighted works. The cases are ongoing and could establish important precedents for how copyright law applies to AI training data. Similar lawsuits have been filed in the UK, where Getty Images is also pursuing legal action against Stability AI.
The broader implications extend beyond Stability AI to the entire generative AI industry, as most large-scale AI models rely on training datasets that include copyrighted material scraped from the internet. The outcome could significantly impact how AI companies source training data and potentially require fundamental changes to current industry practices around dataset creation and model training.
Root Cause
Stability AI allegedly trained its Stable Diffusion model on the LAION-5B dataset containing billions of images scraped from the internet without obtaining licenses or permission from copyright holders. The training process involved copying and processing copyrighted works to enable the AI to generate derivative images in similar styles.
Mitigation Analysis
Implementing robust copyright clearance processes, obtaining proper licensing agreements, using only public domain or licensed training data, and developing technical safeguards to prevent generation of works that closely replicate copyrighted material could have prevented this legal exposure. Provenance tracking of training data sources and automated copyright detection systems would also reduce risk.
Lessons Learned
The lawsuits highlight the urgent need for clear legal frameworks governing AI training on copyrighted material and demonstrate the significant legal risks companies face when using unlicensed content at scale. The cases may establish important precedents for fair use in AI contexts and could force the industry to develop new approaches to ethical data sourcing.
Sources
Getty Images lawsuit says Stability AI misused photos to train AI
Reuters · Feb 6, 2023 · news
The lawsuit that could rewrite the rules of AI copyright
The Verge · Jan 16, 2023 · news