Israeli startup raises $18.5 million to train A.I. with fake data

By Jeremy Kahn
Editor, AI
March 16, 2021, 3:00 AM ET

Companies interested in using artificial intelligence face a big obstacle: Having enough of the right kind of data to train their systems.

Companies need large amounts of labelled, historical examples to train A.I. systems, particularly those that work with images and videos. The demand has spawned a whole sub-industry of companies that specialize in helping other businesses annotate their data. Among them are Scale AI, which was valued at $3.5 billion in a December 2020 funding round, Hive, Sama, Labelbox, Cloudfactory, and a division of A.I. company Clarifai.

But there is another way to produce enough data to train A.I. systems: Fabricate it.

Fake it till you make it

That’s essentially what a fast-growing Israeli startup called DataGen specializes in doing. The company uses its own machine learning systems to create what’s known as “synthetic data”—in this case, artificially generated still and video images—that DataGen’s customers then use to train their own A.I.

DataGen can produce a bespoke synthetic dataset for its customers in just a few hours, says Ofir Chakon, DataGen’s founder and chief executive officer. Compare that to the months it typically takes a data labelling company to curate an equivalent real world video or image library.

Synthetic data has other advantages too, in addition to speed. With synthetic data, companies don’t need to worry about any personal identifying information in the dataset, nor need they worry about ethical considerations around how the data was collected. This feature gains significance as more and more of the world’s population is covered by data protection laws. Gartner, the technology analytics firm, says that by 2023, 65% of the world’s population will have their personal data covered by some sort of privacy regulation, up from just 10% last year.

Data bias can still be a problem though. A synthetic dataset can, in some cases, simply replicate the same biases found in a real dataset. But DataGen has ways to potentially eliminate bias. The company can shape the dataset it generates however it wishes, allowing the company to create a lot more examples of unusual or rare cases to ensure that an A.I. system will know how to handle these. For instance, what will happen to a robot that uses a video camera to “see” as it navigates around a warehouse if there is a power cut and the warehouse’s low-level emergency lighting switches on? Acquiring enough examples of these rarer cases is far more difficult with real world datasets.
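To make the idea of shaping a dataset concrete, here is a minimal sketch of how a generated dataset could be given a fixed share of rare conditions, such as emergency lighting. The lighting categories, their proportions, and the `sample_scene_configs` function are hypothetical illustrations, not DataGen's actual parameters or software.

```python
import random
from collections import Counter

# Hypothetical lighting conditions and the share of the dataset each should get.
# These names and proportions are illustrative assumptions, not DataGen settings.
TARGET_SHARES = {"normal": 0.6, "dim": 0.2, "emergency": 0.2}

def sample_scene_configs(n, shares, seed=0):
    """Return scene configs whose lighting mix follows the requested shares."""
    rng = random.Random(seed)
    configs = []
    for lighting, share in shares.items():
        # Fix the count per condition up front instead of hoping rare cases turn up.
        for _ in range(round(n * share)):
            configs.append({
                "lighting": lighting,
                "camera_height_m": round(rng.uniform(0.5, 2.0), 2),
            })
    rng.shuffle(configs)
    return configs

configs = sample_scene_configs(1000, TARGET_SHARES)
counts = Counter(c["lighting"] for c in configs)
print(counts)
```

In a real-world collection the emergency-lighting share would be whatever happened to be recorded; here it is a number the dataset's creator chooses.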

“Our customers have full control over all the parameters that go into the data they create,” Chakon says. “The real-world implication is that, once deployed, you can be sure it’s going to work well in different domains, with different ethnicities, in different geographic locations or any environment you can imagine.”

An enabler for the whole A.I. industry

DataGen has attracted some big name investors.

On Tuesday, the company announced an $18.5 million early stage funding round led by Israeli venture capital funds TLV Partners and Viola Ventures. The round also includes an impressive list of machine learning luminaries. They include Michael Black, a computer vision pioneer who is now director of the Max Planck Institute for Intelligent Systems; Gal Chechik, director of A.I. research at computer chip giant Nvidia; Anthony Goldbloom, the chief executive officer and cofounder of machine learning competition site Kaggle; and Trevor Darrell, a computer science professor at the University of California at Berkeley. Existing investor Spider Capital is participating in the new funding round too.

Rona Segev, a founding partner at TLV, says that simulated data “addresses problems which are just unsolvable without it.” She says that synthetic data is “an enabler for the whole A.I. industry. Without simulated data, the industry will slow.”

DataGen said it would use the funds to hire more machine learning experts and engineers, expanding from the 30 employees it has currently, most of them based in Israel. Chakon said the company would also expand its focus from creating training sets for machine learning to data that is also designed to test those A.I. systems once they have been trained.

The future product plans aim to address a major problem with a lot of A.I. systems: quality assurance. Oftentimes, only a small subset of available data is reserved for testing an A.I. As a result, it may be hard for a company to test enough rare situations to know how well an A.I. will perform if it encounters the same or similar situations in the real world.
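A rough back-of-the-envelope calculation shows why a small test split struggles with rare situations. The dataset size, test fraction, and rare-case rate below are made-up assumptions chosen only for illustration, and the calculation treats each example as an independent draw.

```python
def expected_rare_in_test(total_examples, test_fraction, rare_rate):
    """Expected number of rare-case examples that land in a random test split."""
    return total_examples * test_fraction * rare_rate

def prob_no_rare_in_test(total_examples, test_fraction, rare_rate):
    """Chance the test split contains no rare cases at all (independent draws)."""
    test_size = round(total_examples * test_fraction)
    return (1 - rare_rate) ** test_size

# Assumed figures: 10,000 examples, 10% held out for testing,
# and a rare situation that appears in 0.1% of examples.
expected = expected_rare_in_test(10_000, 0.1, 0.001)
p_none = prob_no_rare_in_test(10_000, 0.1, 0.001)
print(f"expected rare test examples: {expected:.1f}")
print(f"chance of none at all: {p_none:.2f}")
```

Under these assumptions the test split would hold about one rare example on average, with a better than one-in-three chance of holding none.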

Courtesy of DataGen

DataGen’s cofounders Ofir Chakon, CEO (left), and Gil Elbaz, tech chief (right), create so-called synthetic data to train A.I. systems.

The startup, which was founded in 2018, has about 10 paying customers so far, “all of them big companies,” Chakon says, although he says contractual agreements prevent him from naming them. DataGen’s data has been used to train warehouse robots to pick items off a conveyor belt, to help with factory operations for a home appliance manufacturer, and to support a number of physical security applications, such as identifying shoplifters in a retail store.

Just like the real thing

“We are experts in everything that has to do with indoor environments and human perceptions,” Chakon says, adding that the company can also simulate the way people move in indoor environments. “We generate data that looks exactly like the target domain.”

In other words, a set of DataGen-created images of various household items in a crate—a scene used to train a robot picking arm in a logistics warehouse—looks just like the real overhead video images taken of those objects in a real crate on a real warehouse conveyor. A fabricated scene of a kitchen looks as if the company had gone out and commissioned photography of a real kitchen. And a simulation of a person’s face displays all the same movement points, textures, and skin tones as would be found in a real photograph or video.

DataGen uses software that represents objects and people as a kind of three-dimensional mesh, allowing the user to easily edit and adjust their size and shape. The company pairs that visual meshwork with a physics simulator to create realistic scenes of how objects move. By doing so, the company can easily depict what happens when one object moves on top of or in front of another, potentially obscuring a clear view of that object from a certain angle.
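One reason depth information helps with occlusion can be shown with a toy depth buffer, a standard computer graphics technique. The sketch below is not DataGen's method: it replaces meshes with flat rectangles, and the object names, coordinates, and depths are invented for illustration. For each pixel it keeps only the object nearest the camera, then reports how much of each object stays visible.

```python
def visible_fractions(objects, width, height):
    """Return the fraction of each object's pixels not hidden by a nearer object.

    Each object is a dict with a name, a pixel box (x0, y0 inclusive; x1, y1
    exclusive) and a depth, where a smaller depth means closer to the camera.
    """
    nearest = [[None] * width for _ in range(height)]       # nearest object per pixel
    depth = [[float("inf")] * width for _ in range(height)]  # its depth
    for obj in objects:
        for y in range(obj["y0"], obj["y1"]):
            for x in range(obj["x0"], obj["x1"]):
                if obj["depth"] < depth[y][x]:
                    depth[y][x] = obj["depth"]
                    nearest[y][x] = obj["name"]
    result = {}
    for obj in objects:
        area = (obj["x1"] - obj["x0"]) * (obj["y1"] - obj["y0"])
        seen = sum(row.count(obj["name"]) for row in nearest)
        result[obj["name"]] = seen / area
    return result

# Hypothetical scene: a mug sits in front of a box and covers half of it.
scene = [
    {"name": "box", "x0": 0, "y0": 0, "x1": 4, "y1": 4, "depth": 2.0},
    {"name": "mug", "x0": 2, "y0": 0, "x1": 6, "y1": 4, "depth": 1.0},
]
fractions = visible_fractions(scene, width=8, height=4)
print(fractions)
```

Because the scene is simulated, a visibility label like this comes for free for every object, whereas a human annotator working from a flat photo would have to estimate it.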

DataGen uses a machine learning technique called a GAN, short for “generative adversarial network,” to create its realistic simulations. GANs also underpin the creation of so-called Deepfakes, which are a kind of synthetic data, but Deepfakes exist only in a two-dimensional representation of a person’s face, not a three-dimensional one.

Chakon says he thinks DataGen’s use of 3-D simulation gives it an advantage over other companies that are trying to use 2-D photos and videos to create synthetic data. He said it is much more difficult to simulate the interaction of objects—particularly when one object obscures or occludes another or two objects collide—accurately with just two-dimensional data.

About the Author
Jeremy Kahn is the AI editor at Fortune, spearheading the publication's coverage of artificial intelligence. He also co-authors Eye on AI, Fortune’s flagship AI newsletter.

