OpenAI CEO Sam Altman loves Italy, but the affection may not be mutual—at least not when it comes to OpenAI’s flagship product, ChatGPT.
Italy temporarily banned ChatGPT last week on the grounds that it violates Europe’s strict data privacy law, GDPR. OpenAI immediately complied with the ban, saying it would work with Italian regulators to “educate them” on how OpenAI’s A.I. software is trained and operates.
“We of course defer to the Italian government and have ceased offering ChatGPT in Italy (though we think we are following all privacy laws),” Altman tweeted, adding that “Italy is one of my favorite countries and I look forward to visiting again soon!”
The comments drew plenty of snark from other Twitter users for their slightly tone-deaf, ugly-American vibes. Meanwhile, Italy’s deputy prime minister took the country’s data regulator to task, saying the ban seemed excessive. But Rome’s decision may be just the start of generative A.I.’s problems in Europe. As this newsletter was preparing to go to press, there were reports Germany was also considering a ban.
Meanwhile, here in the U.K., where I’m based, the data protection regulator followed Italy’s ban with a warning that companies could very well fall afoul of Britain’s data protection laws too if they weren’t careful in how they developed and used generative A.I. The office issued a checklist for companies to use to help ensure they are in compliance with existing laws.
Complying with that checklist may be easier said than done. A number of European legal experts are actively debating whether any of the large foundation models at the core of today’s generative A.I. boom—all of which are trained on vast amounts of data scraped from the internet, including in some cases personal information—comply with GDPR.
Elizabeth Renieris, a senior researcher at the Institute for Ethics in AI at the University of Oxford who has written extensively about the challenges of applying existing laws to newly-emerging technology such as A.I. and blockchain, wrote on Twitter that she suspected GDPR actions against companies making generative A.I. “will be impossible to enforce because data supply chains are now so complex and disjointed that it’s hard to maintain neat delineations between a ‘data subject, controller, and processor’ (@OpenAI might try to leverage this).” Under GDPR, the privacy and data protection obligations differ significantly based on whether an organization is considered a controller of certain data, or merely a processor of it.
Lilian Edwards, chair of technology law at the University of Newcastle, wrote in reply to Renieris, “These distinctions chafed when the cloud arrived, frayed at the edges with machine learning and have now ripped apart with large models. No-one wants to reopen GDPR fundamentals but I am not clear [the Court of Justice of the European Union] can finesse it this time.”
Edwards is right that there’s no appetite among EU lawmakers to revisit GDPR’s basic definitions. What’s more, the bloc is struggling to figure out what to do about large general-purpose models in the Artificial Intelligence Act it is currently trying to finalize, with the hope of having key EU Parliamentary committees vote on a consensus version on April 26. (Even then, the act won’t really be finalized. The whole Parliament will get to make amendments and vote in early May, and there will be further negotiation between the Parliament, the EU Commission, which is the bloc’s executive arm, and the European Council, which represents the bloc’s various national governments.) Taken together, these issues could pose real problems for generative A.I. based on large foundation models in Europe.
At an extreme, many companies may have to follow OpenAI’s lead and simply discontinue offering these services to EU citizens. It is doubtful European politicians and regulators would want that outcome—and if it starts to happen, they will probably seek some sort of compromise on enforcement. That alone may not be enough. As has been the case with GDPR and trans-Atlantic data sharing, European courts have been quite open to citizens’ groups going to court and obtaining judgements based on strict interpretations of the law that force national data privacy regulators to act.
At a minimum, uncertainty over the legal status of large foundation models may make companies, especially in Europe, much more hesitant to deploy them, especially in cases where they have not trained the model from scratch themselves. And this might be the case for U.S. companies that have international operations too—GDPR applies not just to customer data, but also employee data, after all.
With that, here’s the rest of this week’s news in A.I.
A.I. IN THE NEWS
U.K. government releases A.I. policy white paper. The British government’s Department for Science, Innovation and Technology published a white paper on how it wants to see A.I. governed. It urges a sector- and industry-specific approach, saying regulators should establish "tailored, context-specific approaches that suit the way A.I. is actually being used in their sectors," and it calls for applying existing laws rather than creating new ones. The recommendations also lay out high-level principles in five main areas: safety, security, and robustness; transparency and explainability; fairness; accountability and governance; and contestability and redress. While some A.I. and legal experts praised the sector-specific approach the white paper advocates, arguing it will make the rules more flexible than a one-size-fits-all approach and promote innovation, others worried that different regulators might diverge in their approach to identical issues, creating a confusing and messy regulatory patchwork that will actually inhibit innovation, CNBC reported.
Bloomberg creates its own LLM, BloombergGPT, for finance. Bloomberg, where I worked before coming to Fortune, is not new to machine learning. (I’ve periodically highlighted some of the ways Bloomberg has been using large language models and machine learning in this newsletter.) The company has access to vast amounts of data, much of it proprietary. This past week, Bloomberg unveiled BloombergGPT, a 50 billion parameter LLM and the first ultra-large GPT-based language model the financial news company has ever trained. This puts it pretty far up there in the rankings of large models, although still far smaller than the largest models OpenAI, Google Brain, DeepMind, Nvidia, Baidu and some other Chinese researchers have built. The interesting thing is that 51% of the data Bloomberg used was financial data, some of it its own proprietary data, that it curated specifically to train the model. The company reported that BloombergGPT outperformed general-purpose LLMs on tasks relevant to Bloomberg’s own use cases, such as recognizing named entities in data, performing sentiment analysis on news and earnings reports, and answering questions about financial data and topics. Many think this is a path that large companies with access to lots of data will choose going forward: training their own proprietary LLMs on their own data, tailored to their own use cases, rather than relying on more general foundation models built by the big tech companies.
EYE ON A.I. RESEARCH
Research collective creates open-source version of DeepMind visual language model as step towards an open-source GPT-4 competitor. The nonprofit A.I. research group LAION released a free, open-source version of Flamingo, a powerful visual language model created by DeepMind a year ago. Flamingo is a fully multi-modal model, meaning it can take in images, videos, and text as inputs and output in all those modes too. That enables it to describe images and also answer questions about them, as well as generating images (or possibly video) from text, similar to the way Stable Diffusion, Midjourney and DALL-E can. Flamingo had some interesting twists in its architecture that enable it to do this, including a module called a perceiver resampler that reduces complex visual data to a much lower number of tokens to be used in training, the use of a frozen language model, and other clever innovations you can read about in DeepMind's research paper.
Anyway, LAION decided to copy this architecture and apply it to its own open-source, multi-modal training data. The result is Open Flamingo.
Why should you care? Because LAION explicitly says it is doing this in the hopes that someone will be able to use Open Flamingo to train a model that essentially replicates the capabilities of GPT-4 in its ability to ingest both text and images. This means everyone and anyone might soon have access to a model as powerful as OpenAI’s currently most powerful A.I., GPT-4, at essentially no cost. That could either be a great thing or a terribly dangerous thing, depending on your perspective.
There’s another subtle dynamic here that doesn’t often get discussed: One of the things that is continuing to drive OpenAI to release new, more powerful models and model enhancements (such as the ChatGPT plugins) so quickly is the competition it faces not just from other tech players, such as Google, but also, increasingly, from open-source alternatives. These open-source competitors could easily erode the market share OpenAI (and its partner Microsoft) was hoping to control.
In order to maintain a reason for customers to pay for its APIs, OpenAI is probably going to have to keep pushing to release bigger, more powerful, more capable models. If you believe these models can be dangerous (because they are good for producing misinformation at scale, because of cybersecurity risks, or because you think they just might hasten human extinction), then anything that incentivizes companies to put them out in the world with less time for testing and for installing guardrails is probably not a good thing.
FORTUNE ON A.I.
ChatGPT gave advice on breast cancer screenings in a new study. Here’s how well it did—by Alexa Mikhail
Former Google CEO Eric Schmidt says the tech sector faces a ‘reckoning’: ‘What happens when people fall in love with their A.I. tutor?’—by Prarthana Prakash
Nobel laureate Paul Krugman dampens expectations over A.I. like ChatGPT: ‘History suggests large economic effects will take longer than many people seem to expect’—by Chloe Taylor
Google CEO won’t commit to pausing A.I. development after experts warn about ‘profound risks to society’—by Steve Mollman
How should we think about the division over last week’s open letter calling for a six-month pause in the development of any A.I. system more powerful than GPT-4? I covered some of this in Friday’s special edition of Eye on A.I. But there’s a very nice essay on how politicized the discourse over A.I. risks is becoming, from VentureBeat’s A.I. reporter, Sharon Goldman. It’s worth a read. Check it out here.
Also, how should we feel about Sam Altman, the OpenAI CEO, who claims to be both “a little bit frightened” about advanced A.I. and, simultaneously, hellbent on creating it? Well, dueling profiles of Altman, one in the New York Times and one in the Wall Street Journal, try to sort this out. Both are worth a read.
The cynical take on Altman was put forth by Brian Merchant in an op-ed in the Los Angeles Times—namely, that fear-mongering about A.I., particularly about its ability to replace lots of people’s jobs, only serves to hype the power of existing technologies and OpenAI’s brand, boosting its sales.
I agree with some of Merchant’s take. I do think OpenAI has very much become a commercially motivated enterprise, and that this explains a lot about why it is releasing powerful A.I. models so quickly and why it has done things like create the ChatGPT plugins. But I’m not sure about Merchant’s take on Altman himself: that Altman's “conflicted genius” schtick is simply that, schtick. Altman’s concern with A.I. safety is not some newfound preoccupation that came about only once he had something to sell. It’s clear from those Altman profiles that AGI and its potential for good and ill have been preoccupations of Altman’s for a long time. It's what led him to cofound OpenAI with Elon Musk in the first place. And remember, when it started, OpenAI was just a nonprofit research lab, dedicated to open sourcing everything it did. Altman didn’t set out to run a commercial venture. (He may have thought there would be money to be made down the line, but making money doesn’t seem to have been his real rationale. He was already enormously wealthy at the time.) So I think Altman's simultaneous expressions of longing for AGI and fear of it are not just about hyping A.I. I’m not saying the rationale is noble. I just don’t think commercial motives explain Altman’s strange stance on advanced A.I. I think it has a lot more to do with ego and with a kind of messiah complex, or at the very least, a kind of messianic thinking.
In fact, a lot of what AGI believers say only makes sense if viewed in religious terms. AGI believers are a lot like evangelicals waiting for the rapture. They both want the second coming and wish to hasten its arrival, and yet on some level they fear it. And while some of these folks are cynical in their beliefs (they only talk about Armageddon because they have Bibles to sell, which would be Merchant's take), others are sincere believers who really do want to save souls. That doesn't mean you have to agree with these folks. But intentions do make a difference. Which do you think Altman is: Bible salesman or modern-day prophet?
This is the online version of Eye on A.I., a free newsletter delivered to inboxes on Tuesdays and Fridays. Sign up here.