
Why better A.I. may depend on fake data

January 4, 2022, 10:38 PM UTC

Welcome to a new year, Eye on A.I. readers! To kick off 2022, let’s focus on one of the biggest trends in artificial intelligence: synthetic data.

Training machine-learning models to flag defects in products during manufacturing or predict when customers will jump to a competitor requires tremendous amounts of high-quality data. A major problem, however, is that most businesses don’t have enough of it.

That’s why using A.I. to create synthetic data (essentially computer-generated data that substitutes for real data) to improve machine-learning models is hot. Ag giant John Deere, for instance, is creating synthetic images of plants in different weather conditions to improve the computer-vision systems that will eventually be used on tractors to spot weeds and spray weed killer on them, among other uses. Doing so saves the time and expense of manually photographing thousands of plants in every kind of lighting and surroundings, the company says. Without synthetic data, its technology would be more likely to mistake weeds for crops, or vice versa.

JPMorgan’s A.I. research head Manuela Veloso tells Fortune that her bank is also experimenting with synthetic data. Some projects involve creating synthetic datasets representing payment fraud and money laundering. The more data the bank feeds its A.I., the better it will be at spotting real-world fraud, or so the thinking goes. 

American Express is conducting similar research for identifying fraud.

Still, not everyone agrees that synthetic data is helpful. Some companies lack the technical staff and resources necessary to create synthetic data for machine learning. Additionally, startups specializing in fake-data generation services are still young and have yet to prove themselves beyond test projects, analysts tell Fortune. It’s also unclear whether training machine-learning models on synthetic data even leads to big improvements in those models. Researchers have yet to conduct formal studies comparing A.I. systems trained on real-world data with those trained on fake data.

Veloso, however, is confident that A.I.’s future will involve synthetic data. While real-world data is critical for training machine-learning models, she says, it only reflects “a copy of the past,” which increases the risk of A.I. stumbling when it encounters anomalies. Synthetic data, she says, gives A.I. a broader education.

“Simulation enables us to diverge from reality,” Veloso says. “Without that divergence, it is hard to handle a surprise.”

Jonathan Vanian 


Google bets big on security and machine learning. Google plans to buy the cybersecurity startup Siemplify for about $500 million, according to a report by Israeli tech news service CTech that cites unnamed sources. Siemplify uses machine learning software to triage security vulnerabilities in an organization so that more serious flaws are prioritized. The report said Google will use Siemplify as the search giant’s base of security operations in Israel for its growing cloud computing unit.

A.I. just helped this man become very wealthy. Tang Xiao’ou, the co-founder of Chinese A.I. company SenseTime, is now among the world’s richest people after his company went public in Hong Kong, Bloomberg News reported. The news service estimates that the executive’s wealth “jumped by $300 million to roughly $3.7 billion after SenseTime ended Thursday 7.3% above its initial public offering price.” Tang, an expert in facial recognition, worked for Microsoft Research Asia before starting SenseTime in 2014.

An A.I. auto partnership. Autonomous truck company TuSimple has partnered with Nvidia and will use the semiconductor company’s chips to power its self-driving trucks, Reuters reported. TuSimple is targeting a 2024 production date for its autonomous trucks, which automotive company Navistar will help build.

Machine learning for burns. The U.S. Army Institute of Surgical Research is partnering with the Beckman Laser Institute and Medical Clinic at the University of California at Irvine on a project that uses machine learning to assess the severity of burns, the industry publication Nextgov reported. The hope is that the research will help people who are not medical experts better care for burns.


The White House Office of Science and Technology Policy has picked Alexander Macgillivray as principal deputy CTO. Macgillivray was previously the co-founder and general counsel of the nonprofit Alloy and was deputy federal CTO for the U.S. government from 2014 through 2017.

Former Waymo CEO John Krafcik has joined the board of auto company Daimler Truck, which is partnering with Waymo to develop autonomous semis.


Move the robot with your brain. Researchers from Switzerland’s École polytechnique fédérale de Lausanne published a paper in Nature’s Communications Biology about using reinforcement learning, in which computers learn through repeated trial and error, as a method for tetraplegic patients to command robotic arms with their thoughts. The A.I. system was embedded in a headpiece that patients wear to monitor brain activity. The paper describes how patients were able to move a robotic arm on a table to avoid a glass cup just by looking at the objects, among other feats.

From the paper:
Looking towards the future, we aim to further design and develop teaching and control methods for increasing the dexterity of external prostheses whilst facilitating the interaction for the subject. Future assistive robotic manipulators should involve autonomous grasping for an increased grasp stability in a larger variety of objects. The ultimate goal will be to introduce a seamless human-machine coordination, capable for performing complex tasks in real-world environments.


The world just blew a ‘historic opportunity’ to stop killer robots—and that might be a good thing—By Jeremy Kahn

Why synthetic data is such a hot topic in the artificial intelligence world—By Jonathan Vanian

Mercedes-Benz’s futuristic concept car underpins its plan to topple Tesla from the EV throne—By Christiaan Hetzner

Data sharing and ally-shoring: Global problems require collaborative solutions—By  Sanjay Brahmawar

Virtual reality is offering designers new ways to see the world—and design for it, too—By Nicole Gull McElroy


World’s A.I. hot spots. Although San Francisco is the leader when it comes to A.I. jobs, several other cities also have a high number of machine learning experts, according to a Harvard Business Review article about A.I. talent pools. As part of their methodology, the authors incorporated factors such as gender, racial diversity, acceptance of migrants, and cost of living into their calculations.

Some of the cities that stand out in the rankings include Bangalore, Beijing, Atlanta, Singapore, New Delhi, and Melbourne. From the article:

Brazil and India, for example, have cities on this list of 50 and are hiring three times as many AI workers as they were in 2017. This is a rate of growth that matches or exceeds that in the U.S. Almost 30% of scientific research papers from India include female authors, double the proportion of female authors in the U.S. and UK. Meanwhile, the Chinese Academy of Sciences is the top publisher of AI research, with Tsinghua University and Peking University close behind. 
