
Why teaching A.I. to read is a lifelong endeavor

By Jonathan Vanian
October 27, 2020, 12:34 PM ET

It’s not just tech giants that are using artificial intelligence to understand human language, so that products like digital assistants can respond to basic questions.

More conventional businesses are also increasingly using a subset of A.I. called natural language processing (NLP) to create more powerful software to help answer basic customer call center queries or create summaries of long, complicated documents. 

LexisNexis, for instance, has been using NLP to improve the legal research software that lawyers, journalists, and analysts use to find relevant court documents. It’s light years ahead of the user-unfriendly Boolean search system that I regularly used over a decade ago as a cub reporter.

With A.I., LexisNexis’ search interface is more intuitive. That’s partly because the company used Google’s free, open-source language model BERT as the foundation. The BERT model, trained on a vast amount of web data including Wikipedia pages, helps software better understand how some words mean different things depending on the context in which they appear.  
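To make the idea of context-dependent meaning concrete, here is a minimal toy sketch. It is not BERT and not anything LexisNexis is described as using: it simply counts overlapping words against a small, hand-written list of hypothetical "signature" words for two senses of the word "bank." A model like BERT learns such contextual cues from data instead of relying on lists like these.

```python
# Toy sense inventory: hypothetical signature words for two senses of "bank".
# These word lists are made up for illustration only.
SENSES = {
    "financial institution": {"money", "loan", "deposit", "account", "teller"},
    "river edge": {"river", "water", "shore", "fishing", "mud"},
}


def guess_sense(sentence: str) -> str:
    """Return the sense whose signature words overlap most with the sentence."""
    # Lowercase, drop simple punctuation, and split into a set of context words.
    cleaned = sentence.lower().replace(".", "").replace(",", "")
    context = set(cleaned.split())
    # Pick the sense that shares the most words with the surrounding context.
    return max(SENSES, key=lambda sense: len(SENSES[sense] & context))


print(guess_sense("She opened an account at the bank to deposit money."))
print(guess_sense("They sat on the bank of the river, fishing."))
```

The same word gets a different reading in each sentence purely because of its neighbors, which is the behavior the paragraph above attributes, in a far more capable form, to BERT.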

But LexisNexis can’t use BERT for all of its language needs because the company deals with information that is specific to the legal industry. This particular data can’t be found on the open web, which means the information doesn’t come baked into BERT.

Min Chen, vice president and chief technology officer for the LexisNexis Asia-Pacific and global search team, said that BERT “provides a good base model to start with.” But the company must fine-tune the technology with additional legal data so that it understands legal linguistics.

This fine-tuning is increasingly common for many companies operating in areas like finance or healthcare. Every industry has its own lingo that makes no sense in another context.

Chen said it took LexisNexis 12 months to train a version of BERT that understands case citations and even Latin. If someone wants to find a document showing that a case has been adjudicated, or closed, the technology knows to look for documents with the Latin term res judicata (claim preclusion, or a matter decided). 

As Amanda Stent, an NLP expert for financial news and information service Bloomberg, explained, technologies like BERT are important because they remove a lot of the grunt work required to train a language model from scratch. For a 10-word sentence, Stent said, “the combinations [of words] are astronomical,” and having a powerful language model like BERT as a starting point is very helpful.
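A quick back-of-the-envelope calculation shows why those combinations grow so fast. The sketch below assumes a vocabulary of 10,000 words, a round number chosen for illustration and not a figure cited by Stent, and counts every possible ordered sequence of 10 words, most of which would of course be nonsense.

```python
def count_word_sequences(vocabulary_size: int, sentence_length: int) -> int:
    """Count ordered word sequences when any word may fill any position."""
    return vocabulary_size ** sentence_length


# Assumption: a 10,000-word vocabulary (illustrative, not from the article).
total = count_word_sequences(10_000, 10)
print(f"{total:.1e} possible 10-word sequences")
```

A pretrained model spares a company from having to learn, from its own limited data, which tiny fraction of those sequences is meaningful language.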

But as other A.I. researchers have pointed out, because language models are typically trained on Internet data, they sometimes parrot back the offensive text they’ve scanned. You’ll be happy to know that companies can take precautions to make this less likely.

Stent and her colleagues recently published a set of best practices that companies can follow when training A.I.-powered language models and other machine learning systems. They recommended using human subject-matter experts to help annotate and label the text used for training (to ensure data is labeled accurately) and ensuring that product managers and engineers coordinate on big projects (to help ensure that problems don’t slip through the cracks).

The goal is to eliminate any problems before companies introduce new products. After all, no user wants to be bombarded with vile language.  

One thing companies should be prepared for is that data training projects are never done. There’s always room for improvement. 

Said Stent, “It never stops.”

Jonathan Vanian 
@JonathanVanian
jonathan.vanian@fortune.com

A.I. IN THE NEWS

Speaking of NLP. Eye on A.I.’s Jeremy Kahn takes a look at AI21 Labs, an NLP-focused startup founded by prominent machine learning researchers that aims “to fundamentally transform how we read and write.” Kahn writes that, unlike other language models such as OpenAI’s GPT-3, the startup’s “system is a fusion between neural network-based language models and an older form of artificial intelligence that seeks to represent human knowledge, like vocabulary and the meaning of words, in a graph structure.”

Enter the A.I. Threat Matrix. The nonprofit, security-focused MITRE Corporation, Microsoft, IBM, Nvidia, Bosch, and a host of other companies teamed up to release the Adversarial ML Threat Matrix, which VentureBeat described as “an industry-focused open framework designed to help security analysts to detect, respond to, and remediate threats against machine learning systems.” The goal is to help companies better secure their machine learning systems by thoroughly understanding all of the ways hackers can crack modern A.I. software. The authors of the threat matrix said via GitHub, “Data can be weaponized in new ways which requires an extension of how we model cyber adversary behavior, to reflect emerging threat vectors and the rapidly evolving adversarial machine learning attack lifecycle.”

How to bring “dead languages” back to life. MIT researchers are using machine learning to “automatically decipher lost languages that can no longer be understood,” technology publication CNET reported. The researchers created an algorithm that analyzes the patterns of how languages develop over time to help uncover the forgotten languages.  From the report: “Going forward, the team hopes to expand its work to identify the semantic meaning of words, even if they're not readable yet. It ultimately hopes to be able to resurrect lost languages using just a few thousand words.”

The FDA sounds the A.I. bias alarms. Bakul Patel, the director of the U.S. Food and Drug Administration’s new Digital Health Center of Excellence, explained during an online meeting how biased and unclean data could cause machine learning software to misfire and “negatively impact patient care,” industry publication MedTech Dive reported. “We don't want to set up a system and we would not want to figure out after the product is out in the market that it is missing a certain type of population or demographic or other aspects that we would have accidentally not realized,” Patel said.

EYE ON A.I. TALENT

Censia has picked Deborah Leff to join the enterprise software startup’s board. Leff was previously the global leader and industry chief technology officer for data science and A.I. at IBM.

Nautilus hired Garry Wiseman to be the fitness company’s senior vice president and chief digital officer. Wiseman was previously the senior vice president of digital customer experience for Dell Technologies.

EYE ON A.I. RESEARCH

When auditing A.I. research, look at the conferences. Technology analysis website TechTalks looks into a recent research paper describing the review process that researchers face when submitting their papers to the International Conference on Learning Representations. The authors of the research paper, who are currently anonymous, claim that they have found some problems with the submission process, including “evidence for a gender gap, with female authors receiving lower scores, lower acceptance rates, and fewer citations per paper than their male counterparts.”

As TechTalks notes, the research paper describes several instances of bias, including the conference organizers showing “significant preference for Carnegie Mellon, MIT, and Cornell universities.” Researchers who published their papers on the popular arXiv preprint server prior to submission also did better, especially if they came from those top-tier universities.

From TechTalks:

Interestingly, their research did not find a significant bias toward large tech companies such as Google, Facebook, and Microsoft, which house reputable AI researchers. At first glance, this is a positive finding, because big tech already has a vast influence over commercial AI and, by extension, on AI research.

But as other authors have pointed out, the same academic institutions that are very well represented at AI conferences serve as talent pools for big tech companies and receive much of their funding from those same organizations. So this just creates a feedback loop of a narrow group of people promoting each other’s work and hiring each other at the expense of others.

FORTUNE ON A.I.

Former Facebook employee’s new book exposes Big Tech’s dirty secrets—By Danielle Abril

Startup cofounded by A.I. heavy hitters debuts editing tool it hopes will ‘transform writing’—By Jeremy Kahn

Is it time for a new agency to oversee Big Tech? Many say yes—By Jeff John Roberts

Here’s what Amazon’s new Echo speakers are like—By Jonathan Vanian

How Lyft became the company with nine lives—By Beth Kowitt

A possible semiconductor shortage looms over Huawei’s new smartphone launch—By Naomi Xu Elegant

BRAIN FOOD

A.I. takes to space. Researchers have found machine learning technology to be an excellent tool for analyzing space data. In 2017, for instance, NASA and Google used neural networks to comb through imagery data captured by the Kepler space telescope, and uncovered a couple of planets far outside of our solar system. More recently, researchers from NASA’s Jet Propulsion Laboratory have used machine learning to identify recently formed craters on the surface of Mars. Space.com reports:

Scientists have fed the algorithm more than 112,000 images taken by the Context Camera on NASA's Mars Reconnaissance Orbiter (MRO). The program is designed to scan the photos for changes to Martian surface features that are indicative of new craters. In the case of the algorithm's first batch of finds, scientists think these craters formed from a meteor impact between March 2010 and May 2012. 

About the Author

Jonathan Vanian is a former Fortune reporter. He covered business technology, cybersecurity, artificial intelligence, data privacy, and other topics.
