November 30, 2022
Hello, and welcome to November’s special monthly edition of Eye on A.I. David Meyer here in Berlin, filling in for Jeremy.
Stable Diffusion is growing up so fast. Barely three months after Stability AI introduced the image generator to the world, version 2.0 is out. However, while this evolution of the system produces noticeably better-quality images, the startup’s choices are proving controversial.
First, the unalloyed good. Stable Diffusion 2.0 features new text-to-image models trained using OpenCLIP, a new text encoder developed by LAION, that provides a real step-up in quality. The resulting pictures are bigger: 768×768 is now available as a default resolution alongside 512×512, and a new upscaler model can enlarge images to 2048×2048 or higher. Also of note is a new depth-guided model, depth2img, which infers the depth of an input image and can use that structure to generate very different new images.
The controversy comes with the ways in which Stability AI has moved to address criticism of earlier versions. It is now harder to use Stable Diffusion to generate pictures of celebrities or NSFW content. And gone is the ability to ask for images “in the style of” specific artists such as the famous-for-being-ripped-off Greg Rutkowski. While the no-NSFW change came down to cleaning up Stable Diffusion’s training data, the other changes are a byproduct of how the new model encodes and retrieves data, rather than of any deliberate filtering-out of those artists, Stability AI founder Emad Mostaque told The Verge.
Regarding NSFW imagery, as Mostaque told users on Discord, Stability AI had to choose between stopping people from generating images of children and stopping them from generating pornographic images, because allowing both in one model was a recipe for disaster. That, of course, didn’t ward off accusations of censorship.
Mostaque was reportedly less keen to discuss whether the artist- and celebrity-related changes were motivated by a desire to avoid legal action, but that is a reasonable assumption to make. Copyright concerns have certainly been exercising artistic communities of late. When the venerable DeviantArt community announced its own Stable Diffusion-based text-to-image generator, DreamUp, earlier this month, it initially set the defaults so users’ art would automatically be included in third-party image datasets. Cue outrage and a same-day U-turn (though users still need to fill out a form to stop their “deviations” from being used to further train DreamUp).
It clearly isn’t possible to please everyone with these tools, but that’s to be expected when they’re developing at such breakneck speed while also being available to the general public. It’s a bit like sprinting along a tightrope, and who knows which pitfalls will become apparent in the coming months.
More A.I.-related news below.