Hello and welcome to Eye on AI. In this edition….Mobile networks don’t want to miss the AI game but it’s not clear they have the best hand to play; the U.S. State Department starts using AI to help target foreign students for possible deportation; OpenAI finds that chain of thought monitoring can detect ‘reward hacking,’ but only if the model doesn’t know you’re looking; can AI bring us closer to God?
I spent last week at Mobile World Congress in Barcelona, where AI was the overriding theme of the show. But beyond the hype around AI, a hint of desperation lurked behind the flashy stands and demos.
Mobile network providers lament the fact that they provided the essential “pipes” on which the app economy was built in the decade following Apple’s 2007 introduction of the iPhone. Yet most of the value created during the mobile revolution flowed to what the mobile folks call “OTT” (short for over-the-top) companies, such as Meta and other social media apps, ecommerce sites like Amazon, and the streamers, from Spotify to Netflix. Now they are worried that the AI revolution will similarly leave them in the position of being a mere utility—and valued as such by investors—while the big money accrues to the OpenAIs, Anthropics, and Googles of the world.
Hope is not a strategy
But, as they say, hope is not a strategy. And it isn’t entirely clear what these carriers can do to avoid once again being relegated to utility status. French mobile provider Orange has started selling an AI assistant—which it calls Dinootoo—that it orignally built to help its own employees. But it is not clear why businesses would necessarily choose Orange’s Dinootoo over similar products offered by OpenAI, Anthropic, Microsoft, or Salesforce. So I am not convinced the mobile carriers can win at this game.
The fact that a U.S. court has now struck down net neutrality—which forbade carriers from charging customers more to prioritize their traffic across the network—at least at the federal level (some states have their own net neutrality laws), will give some U.S. carriers a chance to grow revenues. But there’s little sign of net neutrality going away in Europe or many other geographies. The carriers are also betting big on the idea that many enterprises will want to set up private 5G—and, yes, future 6G—wide area networks to serve campuses or big geographic footprints. This is a good business, but probably not enough to let carriers claim a big boon from the AI revolution.
The handset makers have an AI play
The mobile handset makers, such as Samsung, and the chipmakers who serve them, such as Qualcomm, have a somewhat more convincing AI story to tell. They see a future in which your interactions with an AI assistant largely take place on your phone. In essence, the AI assistant becomes the phone’s new user interface—interacting with you through multimodal inputs (largely voice, but also text and images taken from your phone’s camera.) The assistant will be an agent, performing tasks for you using your phone’s other apps and the phone’s internet browser.
As AI models deliver more reasoning capabilities in smaller packages, these hardware companies are betting that these models will be able to live on your phone. “AI will have to be hybrid, not everything is going to go to the cloud,” Don McGuire, Qualcomm’s chief marketing officer, told me. “You’re going to be able to do lots of things by using Gen AI without you having to ping the cloud. That will create better environments for users and applications from a privacy, safety, personalization, and latency perspective.” It will also, he noted, mean users can access the help of their AI assistant even in cases when they don’t have signal. (Note that in many ways “AI on the edge” is not good news for the network carriers.)
Shopping assistants
One evening in Barcelona, I had the pleasure of moderating a dinner discussion on AI and retail cohosted by Fortune and Shopify. Peyman Naeini, Shopify’s field CTO, talked about a world in which customers increasingly discover products through the recommendations of AI assistants and AI-powered search engines, rather than through traditional search. In many cases, these AI assistants will increasingly act as agents, doing the purchasing on behalf of users. It’s a world that most retailers, Naeini said, are not really prepared for.
How will our AI assistants decide which products to bring to our attention or buy on our behalf? It isn’t entirely clear. (I hope that, however it happens, that we have some insight into the process. If there is a paid relationship between the company providing the AI assistant and the brands being recommended that should be completely transparent. I fear that it won’t be.) Customer reviews might form a key datapoint that an AI assistant considers. But in an era where generative AI makes it easy to produce thousands of convincingly-realistic fake customer reviews at the press of a button, how will we ensure that we can trust such reviews?
Trust as a service
Well, Adrian Blair, the CEO of Denmark-based Trustpilot, which is well known for its customer reviews and ratings of businesses, has spent a lot of time thinking about this problem. Blair told me last week that Trustpilot gathers hundreds of meta datapoints for each review posted—including the customer’s email address, the location and time of day they are posting the review, the kind of device they are using, how much of the review was cut and pasted and how many seconds it took to be written, and many many more. With an average of 190,000 new reviews submitted daily, and a total catalog of more than 300 million reviews, the company has a lot of data from which to build AI models that can assess the likelihood that a review is fake. Last year, about 6% of all the reviews submitted to Trustpilot were blocked or removed on suspicion of being fake, and 82% of these were caught by Trustpilot’s automated systems. The advent of generative AI has seen the number of suspect reviews climb, Blair says, but he says it is more of an incremental increase than a flood.
Yesterday, at the HumanX conference in Las Vegas, Trustpilot announced that it is now making its customer review data available through a platform it is calling TrustLayer. This will let third-parties, including AI assistants and agents, directly access data from Trustpilot’s customer reviews, allowing them to gain insights into how a brand or product is perceived by its customers.
TrustLayer is being primarily marketed at companies that want to monitor their own consumer sentiment more easily—and potentially assess themselves against rival firms. The first announced customers for the new platform are private equity firm Advent and venture capital firm Felix Capital that want to use the data to conduct due diligence on possible investments and also to benchmark how their portfolio companies are performing with customers. But Blair sees companies building AI assistants as a key future customer. “I can definitely see a world where ‘shopping agents’ want to make use of our platform to do a better job themselves,” he told me. And of course, we might be able to rate our AI agent experiences on Trustpilot too.
With that, here’s more AI news.
Jeremy Kahn
jeremy.kahn@fortune.com
@jeremyakahn
Correction, March 12: An earlier version of this story misidentified the country where Trustpilot is headquartered. It is Denmark. Also, a news item in this edition misidentified the name of the Chinese startup behind the viral AI model Manus. The name of the startup is Butterfly Effect.
AI IN THE NEWS
U.S. State Department begins using AI to target visa holders. In a development that has drawn widespread outcry from civil liberties groups, the U.S. State Department has launched what it is calling a “Catch and Revoke” program that involves using AI models to scrutinize the social media posts of foreign student visa holders for evidence they have voiced support for Hamas or other designated terrorist organizations. The program aligns with President Donald Trump’s executive order aimed at combating antisemitism, which calls for the deportation of non-U.S. citizens who have participated in pro-Palestinian protests. The State Department is also leaning on a rarely-before invoked provision of the 1952 Immigration Nationality Act that gives it authority to revoke the visas of foreign nationals deemed to be a threat. Critics worry the policy will have a significant chilling effect on freedom of speech, especially on university campuses. There is also significant concern that AI models will not be able to make fair assessments. You can read more from Axios here and Reuters here.
OpenAI signs $12 billion cloud computing contract with CoreWeave. The five-year deal comes as data center provider CoreWeave readies for an initial public stock offering that is forecast to value the company at $35 billion. As part of the transaction, OpenAI will get a $350 million equity stake in CoreWeave. The arrangement helps OpenAI diversify its computing suppliers beyond Microsoft, which had previously been OpenAI’s sole data center provider and which also had been leasing significant data center capacity with CoreWeave. Microsoft cancelled some of those leases over delivery issues, though it still plans to spend $10 billion on CoreWeave services by 2030. You can read more from the Financial Times here.
ServiceNow buys AI agent company Moveworks for $2.85 billion. The cash-and-stock acquisition was seen as a way of bolstering ServiceNow’s “agentic AI” offerings, which is among the fastest growing revenue lines for the giant IT services and digital automation company. It is also locked in a fierce battle with competitors such as Salesforce and Microsoft in the AI agent market. You can read more from me here.
The AI world goes crazy for Manus for a hot minute. The hottest topic among AI watchers this week was the debut of Manus, a model from little-known Chinese AI startup Butterfly Effect. Initially some claimed Manus represented a “second DeepSeek” moment—an instance where a small Chinese lab, working with relatively few resources, showed it could jump out in front of much-better resourced Western labs. Monica billed Manus as the world’s first fully autonomous “general agent,” able to perform a diverse range of complex tasks, including analyzing job applications, preparing property reports, and conducting financial analysis at remarkable speeds using a multi-agent system. However, initial excitement over Manus quickly gave way to doubt, with user tests of the model yielding uneven performance and questionable outputs. One developer even claimed to have discovered that Manus is not actually a new model at all, but rather a “wrapper” built on top of Anthropic’s latest AI model, Claude 3.7 Sonnet—essentially using that model as its core but creating a lot of workflows around that allow it to use dozens of software tools. Even so, many have defended Manus as a nonetheless innovative application. You can read more from The Register here and TechCrunch here.
Foxconn debuts its own large language model. The Taiwan-based Apple supplier introduced “FoxBrain,” a customized LLM optimized for traditional Chinese and Taiwanese language that is designed to help the electronics company with internal applications such as data analysis and code generation. The company plans to eventually offer it to other technology partners. You can read more from Reuters here.
EYE ON AI RESEARCH
Is the key not to cheat—or just not to get caught? I’ve highlighted before in this newsletter the danger of “reward hacking.” That is when an AI model seeks to accomplish a goal that we’ve set, but does so in a way that is not what we intended. In other words, the model “cheats.” There’s been growing evidence that large language models, when given goals, will readily engage in reward hacking. And as we move to a world of AI agents, where these models start doing things for us, it is something we will need to be particularly on guard against, since it could cause real world harm. (Imagine an AI agent that is supposed to help boost sales but decides that a simpler way to accomplish the goal is simply to hack the networks of a customer and place a large order.)
Well, on this front comes some good news, and some bad news, in the form of new research from the AI safety team at OpenAI. The team found that with the latest wave of “reasoning” models that perform actions by following a step-by-step “chain of thought” process, it is actually possible to detect reward hacking by looking at this chain of thought. In fact, it is possible to use a less capable LLM to read the chain of thought output of a more capable reasoning model and catch it engaging in reward hacking and then instruct the reasoning model not to cheat. That’s all good. (Leave aside for a moment that a lot of model companies, including OpenAI, currently hide the chain of thought output from their human users, so we might only have an automated monitor built into the system to rely on to catch such cheating. At least we know it is possible to catch reward hacking in this way.)
Now, the bad news. If you try to incorporate this monitoring of a reasoning model’s chain of thought into its training, by for instance penalizing the model if you catch it trying to cheat, this does not in fact make the model less likely to cheat. All it does, it turns out, is encourage the model to lie about what it is actually “thinking” in order to not get caught.
You can read OpenAI’s blog on the research here.
FORTUNE ON AI
Chinese advances like DeepSeek have ‘narrowed’ AI gap with U.S., says CSIS think tank: ‘Unrealistic to expect a lead of more than a year or two’ —by Lionel Lim
Why Elon Musk may yet prevail in his attempt to block OpenAI’s for-profit conversion —by David Meyer
OpenAI’s ex-policy lead accuses the company of ‘rewriting’ its AI safety history —by Beatrice Nolan
Commentary: AI innovation isn’t a climate threat, it’s our best hope —by Cully Cavness
AI CALENDAR
March 10-13: Human [X] conference, Las Vegas
March 17-20: Nvidia GTC, San Jose
April 9-11: Google Cloud Next, Las Vegas
April 24-28: International Conference on Learning Representations (ICLR), Singapore.
May 6-7: Fortune Brainstorm AI London. Apply to attend here.
May 20-21: Google IO, Mountain View, Calif.
July 13-19: International Conference on Machine Learning (ICML), Vancouver
BRAIN FOOD
The new religion. AI offers the potential to provide people access to spiritual guidance and comfort 24/7, without the need to wait for Sunday or Friday prayers, or what have you. And it doesn’t have to just be the pseudo-confessional of the AI chat that AI provides. AI technologies now allow the simulation of entire services, complete with AI avatars representing the priest and the other congregants. But is this virtual service as good as the real thing? That’s what a church in Finland wanted to find out.
St. Paul’s Lutheran Church in Helsinki used AI to do everything from write the sermon and generate the music to creating the visuals, including AI avatars of the pastors and the parishioners. They even created an AI avatar of a former Finnish president who read from the Old Testament.
The 45-minute service attracted about 120 attendees, which is more than the church usually saw on a weeknight. But a lot of the churchgoers said that the service—while attractive for its novelty—left them cold. One attendee said the service felt impersonal and distant. The church’s priest also said he missed the human connection of a traditional service. You can read more from British paper The Independent here.