Hello and welcome to Eye on AI.
This past week was a big one for the visual side of generative AI, thanks to launches from Microsoft/OpenAI, Canva, and Google. Altogether, they show how imperfect (and ripe for abuse) these generative AI tools still are, but also paint a picture of rapid technological advancement.
Microsoft and OpenAI made a splash by making DALL-E 3 generally available to the masses via Bing Chat. The release comes ahead of DALL-E 3’s anticipated launch within ChatGPT, which is scheduled for later this month for paying users. The integration within Bing Chat—as well as the planned ChatGPT launch—also lets users refine their images by conversing with the chatbot, rather than using the tool as a standalone product.
It didn’t take long, however, for DALL-E 3 to take center stage in a coordinated 4chan campaign to flood the internet with racist images. As reported by 404 Media, 4chan users decided to make “propaganda for fun” and created a visual guide for how to use AI tools to quickly make images with “redpill messaging” and other racist content. While the guide says people can use any program they want, such as Stable Diffusion or Photoshop, it says “Most people are using DALL-E 3” and links to the Bing tool, calling it the “QUICK METHOD.”
OpenAI has placed limitations on its AI tools to prevent the generation of racist and other offensive content, but users are of course finding ways around them. In turn, these images are now swirling around the internet, destined to inform current and future AI models as they scrape training data and increasingly browse the internet in real time.
Moving on to Canva, the online design platform launched Magic Studio, an extensive suite of AI-powered tools that includes a text-to-image generator, a text-to-video generator, and the ability to generate entire projects from a line of text, write copy in your brand voice, translate copy into different languages, and automatically switch between formats (transforming the content of a slideshow into a document, for example). That’s on top of a host of interesting new photo-editing capabilities such as Magic Grab, which lets users select the subject of a photo and turn it into a separate element that can be individually edited, repositioned, or resized.
But it’s clear some of these features, which are powered by the company’s partnership with OpenAI, still have a ways to go. I played with the text-to-video tool and the results ranged from slightly disturbing to totally nonsensical. When I prompted it to generate a video of “A cat birdwatching in a sunny windowsill,” it did—except the cat’s butt and tail were sitting beside it on the windowsill rather than attached to its body. And when I prompted the tool to generate a video of “UFOs abducting chefs from Earth,” the two-second clip it produced was so utterly confusing that I had no idea what I was looking at.
That brings us to the unveiling of Google’s new Pixel 8 and Pixel 8 Pro smartphones, which launched to much fanfare thanks to new AI-powered photo editing capabilities. Google Pixel can now erase unwanted audio from video, edit specific elements within a photo (essentially the same as Canva’s Magic Grab), and combine different frames of a photo to help you create the best shot. If one person is making an unflattering face in a group photo, for example, Google Pixel now lets you simply choose a better face from recent images or frames and swap it in. While these types of edits have always been possible for people with expertise in Photoshop, now they’re available to everyone with a click and in the palm of their hands.
“In all my years of reviewing personal technology gadgets, I can count the number of times my jaw has dropped when learning about a new product. It’s good to be a skeptical journalist! But I failed to maintain that detachment when Google demoed a few imaging tricks on its new Pixel 8 and Pixel 8 Pro smartphones,” wrote Julian Chokkattu, the reviews editor at Wired.
And indeed it is pretty impressive. It wasn’t that long ago—2016—when I covered the launch of Google’s first AI photo feature, the “enhance” tool, which basically just adjusted the lighting and sharpness of a photo. Even as recently as two months ago, I stopped in a Google store to check out the current Pixel offerings and play with the Magic Eraser tool, which neither I nor the store associate could get to work. Now, just two months later, we have a suite of AI tools that goes above and beyond Magic Eraser. And while it’s likely the new face-swapping capability is still imperfect (not to mention a bit dystopian), it sure shows just how far this technology has come.
Sage Lazzaro
sage.lazzaro@consultant.fortune.com
sagelazzaro.com
AI IN THE NEWS
OpenAI is exploring making its own AI chips and has evaluated a potential acquisition target. That’s according to Reuters. CEO Sam Altman has publicly complained about the scarcity of graphics processing units and made acquiring more AI chips a top priority for OpenAI. The company has not yet decided to move ahead on a potential deal and has also considered working more closely with chipmakers including Nvidia and diversifying its suppliers beyond Nvidia.
Microsoft is gearing up to debut its first AI chip next month. Just like OpenAI, Microsoft is hoping to reduce its reliance on Nvidia’s AI chips, which continue to be in short supply as they power the current generative AI boom, according to The Information. Obtaining a steady stream of chips is becoming increasingly vital for the growth of Microsoft's cloud business, where it powers LLMs for customers.
UK regulator moves against Snap over concerns its My AI chatbot poses data privacy risks to children. The Information Commissioner’s Office (ICO) issued a preliminary enforcement notice against Snap, reports TechCrunch, after it provisionally found that the risk assessment Snap conducted before launching My AI “did not adequately assess the data protection risks,” particularly to children. The notice is not a breach finding and gives Snap a chance to respond before the regulator formally decides whether the company has breached data protection rules.
Australia’s education officials formally back a national framework to allow AI in schools. That’s according to The Guardian. AI tools including ChatGPT will be allowed in all Australian schools starting in 2024, according to the unanimously backed framework. The move includes a $1 million investment into Education Services Australia, a not-for-profit educational technology company owned by federal, state, and territory education departments, to establish “product expectations” of generative AI technology.
The EU has ironed out significant parts of its forthcoming AI Act. Dragos Tudorache, co-rapporteur of the EU AI Act, shared on LinkedIn that he and the involved parliament members have “found agreement on large and important chunks of the text,” including an architecture for the classification of high-risk AI systems, requirements for high-risk AI systems, sandboxes, market surveillance and enforcement, and penalties and fines. “We're not there yet but we are very close and moving rapidly in that direction,” he wrote.
EYE ON AI RESEARCH
Making machines forget. In response to all the concerns (and lawsuits) about copyrighted material being used in AI training sets, a pair of researchers at Microsoft set out to investigate whether models can be made to forget information they were already trained on, and found some surprisingly encouraging results.
The concept of AI "unlearning" is one of the technology's thorniest problems and something that Fortune wrote about in August. Many researchers say it's virtually impossible to delete information from a trained AI model without resetting the model.
In the new study, the Microsoft researchers set out to make Meta’s Llama2-7b model forget everything it knows about the Harry Potter books. While stating that they originally thought this would be impossible, the researchers found success and “propose a novel technique for unlearning a subset of the training data from a LLM, without having to retrain it from scratch.” You can read the full paper here.
FORTUNE ON AI
Mira Murati, the young CTO of OpenAI, is building ChatGPT and shaping your future —Kylie Robison And Michal Lev-Ram
AI is getting ‘more hype than it deserves,’ Warren Buffett’s right-hand man Charlie Munger says —Chloe Taylor
Meta’s AI characters played by celebrities like Charli D’Amelio and Tom Brady are dissing the products and brand sponsors they work with in real life —Alexandra Sternlicht
Big Food is using AI to create healthier, tastier snacks and meals. It could super-charge nutrition in America —Erin Prater
AMD’s Lisa Su is ready to crash Nvidia’s trillion-dollar chips party —David Meyer
Will AI mean fewer discounts at your favorite retailers? Companies hope the answer is yes —Megan Arnold
Can AI fix Wall Street’s ‘spaghetti code’ crisis? Microsoft and IBM are betting that it can —Ben Weiss
Why collaboration with generative AI is so tricky—and how to make it work —François Candelon, Lisa Krayer, Saravanan Rajendran, and David Zuluaga Martínez
BRAINFOOD
To sue or to build? Both. Another interesting angle on the visual side of generative AI comes courtesy of a live interview with Getty CEO Craig Peters, which took place at the Code Conference in San Francisco and was shared on The Verge’s Decoder podcast this past week. Many might have expected Peters to spend the interview railing against the use of content to train AI models, and indeed Getty banned users from uploading AI-generated content to its platform and is currently pursuing a lawsuit against Stability AI, alleging the company stole 12 million images to train its Stable Diffusion model. But Peters made clear Getty is also bullish on generative AI and discussed the company’s own just-announced AI image generator.
The tool, called Generative AI by Getty Images, does pretty much the same thing as every other AI-powered image generator. The difference is that all of the training data used belongs to Getty. For this reason, Getty promises that its generator is 100% commercially safe for its customers: it won’t infringe on intellectual property or likeness rights, and it won’t generate images of real people (such as politicians and other public figures) that could contribute to disinformation. “It cannot produce deepfakes. It doesn’t know what the Pope is, it doesn’t know what [Balenciaga] is,” Peters said. He insisted “it can’t” generate photos of Donald Trump, Joe Biden, or Taylor Swift because “it doesn’t know who they are.”
Peters also discussed the company's plan to compensate creators whose images are used in training data, divvying up a fixed pie according to what proportion of the training set their content represents and how their content performs in Getty’s licensing world.
This is the online version of Eye on AI, a free newsletter delivered to inboxes on Tuesdays. Sign up here.