Don’t trust everything you hear.
By now, most people know about phishing email scams, spoofed phone numbers from spammers, and hackers taking over social media accounts to steal information or money. But most people are unaware of the latest ploy by thieves: voice cloning.
In one example of the problem, a pharmaceutical company CFO, who demanded anonymity, approved a wire transfer of $200,000 in February after listening to a voicemail in which his CEO told him that a verbal contract needed urgent funding. A follow-up email from the CEO’s account reiterated the urgency and added that the CEO would be out of town and unreachable.
The CFO followed the instructions, which weren’t unusual considering the pharmaceutical company’s fast-paced growth. But it wasn’t the company’s CEO in the phone message: His voice had been cloned using deep-learning algorithms.
Unfortunately, voice cloning scams are on the rise, mostly because of improved technology that makes them easier to pull off and more convincing. Thieves have plenty of material on which to train their artificial intelligence algorithms, such as audio clips that are publicly available through YouTube, podcasts, and online presentations. All it takes is 45 minutes of audio for the technology to learn to mimic a voice. The more training time it gets, the more convincing the mimicry becomes.
“I’ve seen a 60% increase in cases in just the past four months,” says Eric Cole, founder of Secure Anchor, a cybersecurity consulting firm. He’s worked with 17 companies that have lost an average of $175,000 apiece from voice cloning scams, and in one case, the hackers gained access to a firm’s IT systems.
While the ploy is not yet widespread, the Federal Trade Commission held a workshop on the topic last year, noting that numerous consumers have fallen prey to a version of the scheme known as the “grandparent scam,” in which an elderly person receives a phone call from a “grandchild” in distress who needs cash immediately. The voice, however, is fake. Says Cole: “We must get out of the habit of assuming that email and voice are secure.”
Voice cloning first appeared several years ago in apps that let consumers message their friends in, say, the voice of a celebrity. Since then, the technology has become more widely used by businesses.
If you watched the most recent Super Bowl, you may have heard the voice of the late Vince Lombardi in an ad titled “As One,” in which he implores people to come together following the pandemic. Lombardi’s voice was regenerated by a startup called Respeecher.
Most voice cloning relies on text-to-speech tools: a user types words, and the application speaks them aloud in a cloned voice. Respeecher, however, uses a newer technique called speech-to-speech, in which a person speaks in their own voice while an algorithm converts that audio into the cloned one. This approach preserves more nuance, emotion, and inflection. For instance, any actor could use the tool to read the lines of a major star who is unavailable for a voice-over, and the audio would sound like the star rather than the unknown actor. Actors could even “license” their voices for use in commercials, audiobooks, and video games without ever uttering a single word for the projects themselves.
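To make the distinction concrete, here is a rough sketch of both approaches using the open-source Coqui TTS library. The model names and file paths are illustrative assumptions, and none of this reflects Respeecher’s actual implementation; it simply shows how the two pipelines differ in what they take as input.

```python
# Sketch of the two voice cloning styles described above, using the
# open-source Coqui TTS library (pip install TTS). Model names vary by
# library version; file paths are placeholders.
from TTS.api import TTS

# --- Text-to-speech cloning: type words, hear them in a cloned voice.
# YourTTS supports zero-shot cloning from a short reference clip.
tts = TTS("tts_models/multilingual/multi-dataset/your_tts")
tts.tts_to_file(
    text="Please come together as one.",
    speaker_wav="reference_voice.wav",   # clip of the voice to mimic
    language="en",
    file_path="tts_cloned.wav",
)

# --- Speech-to-speech conversion: an actor records the performance,
# and only the voice timbre is swapped, preserving emotion and inflection.
vc = TTS("voice_conversion_models/multilingual/vctk/freevc24")
vc.voice_conversion_to_file(
    source_wav="actor_performance.wav",  # the performance to keep
    target_wav="reference_voice.wav",    # the voice to apply
    file_path="s2s_cloned.wav",
)
```

Note the design difference: the first call needs only text, while the second consumes a full recorded performance, which is why speech-to-speech output carries the original delivery.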
Eventually, Respeecher plans to unveil a voice cloning tool that removes a speaker’s accent in real time—technology that could help Indian call center agents, for instance, command pay commensurate with that of agents in the U.S.
Meanwhile, Resemble.ai and Google offer A.I. that saves podcast producers time by automatically turning written text into audio clips, or even an entire episode, spoken in the voice of the show’s host. The technology could also create entirely new lifelike A.I.-generated voices for companies to use in audio ads, in film narration, at call centers, or for virtual assistants.
Additionally, the startup Descript applies voice technology to audio editing. Its software transcribes audio into text and lets you edit the recording by deleting words such as “um” and “uh,” or even replacing some words.
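That transcript-based workflow can be sketched in a few lines. Below, word-level timestamps (invented here, but of the kind a speech-to-text engine typically produces) drive the edit, with the pydub library doing the splicing. This illustrates the general technique, not Descript’s actual code.

```python
# Sketch of transcript-based audio editing: deleting a word from the
# transcript splices the matching span out of the audio.
# Requires pydub (pip install pydub) and an input file "episode.wav".
from pydub import AudioSegment

# Hypothetical transcription output: (word, start_ms, end_ms)
words = [
    ("welcome", 0, 420), ("um", 420, 700), ("to", 700, 850),
    ("the", 850, 980), ("uh", 980, 1300), ("show", 1300, 1750),
]
FILLERS = {"um", "uh"}

audio = AudioSegment.from_wav("episode.wav")

# Keep only the spans whose words survive the transcript edit.
edited = AudioSegment.empty()
for word, start_ms, end_ms in words:
    if word not in FILLERS:
        edited += audio[start_ms:end_ms]

edited.export("episode_edited.wav", format="wav")
```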
Like a number of voice cloning companies, Respeecher vows to use the technology in an ethical manner and is developing new tools to combat fraud and the unauthorized use of voices. “We see a lot of people with no rights to someone’s voice reaching out to us,” says Respeecher CEO and founder Alex Serdiuk.
But the company’s policy is to clone voices only for clients that have permission and voice sample data from that person. And it helps clone voices of the deceased only if their family or estate grants permission. The company doesn’t make its software available for clients to use directly. “In the wrong hands, it could provide a threat to society,” Serdiuk says.
Perhaps the best defense against voice cloning fraud is skepticism. Company leaders should create protocols and rules for when and how money is transferred, and place limits on the amounts. They should ensure all employees know that if they receive a phone call asking for money, passwords, or other critical information, they should confirm the request by calling the person back on a known number, ideally from a different phone. Company leaders should also emphasize that employees won’t be blamed or punished for refusing to follow a caller’s instructions when the situation seems fishy.
Executives may also consider implementing rotating code words for money transfers. In February, for example, everyone would know the code word is “hotdog”; by March, it might change to “goldfish,” says Justin Zeefe, president and cofounder of cybersecurity firm Nisos. “Just because they can copy your voice doesn’t mean they can answer questions only that person can answer.”
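The article doesn’t say how firms should generate or distribute such words, but one plausible approach, sketched below, derives each month’s word from a shared secret, so finance staff and executives can compute it independently and it never has to travel by email or voicemail. The word list, secret, and scheme are assumptions for illustration only.

```python
# One possible implementation of a monthly rotating code word: derive
# it with an HMAC over the current year-month, keyed by a secret that
# was distributed out of band. Both sides compute the same word locally.
import hmac
import hashlib
from datetime import date

WORDLIST = ["hotdog", "goldfish", "lantern", "maple", "copper", "violet"]
SHARED_SECRET = b"distributed-in-person-never-by-email"  # illustrative

def code_word(day: date) -> str:
    """Return the code word for the month containing `day`."""
    period = f"{day.year}-{day.month:02d}".encode()  # rotates monthly
    digest = hmac.new(SHARED_SECRET, period, hashlib.sha256).digest()
    index = int.from_bytes(digest[:4], "big") % len(WORDLIST)
    return WORDLIST[index]

# A caller requesting a wire transfer must state this month's word;
# a cloned voice alone can't answer a challenge it has never heard.
print(code_word(date.today()))
```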
In the recent case of the pharmaceutical company, executives didn’t realize they had been defrauded until 10 days later, when the hacker called again to request another $200,000. This time, the CEO was in the office, and the CFO walked down the hall, asked if the request was legitimate, and discovered the truth—that he’d previously been scammed.
Many of the hackers operate from Russia, China, and Venezuela, countries that have no laws against hacking targets abroad and no extradition treaties with the United States, which leaves U.S. authorities largely powerless.
“Voice cloning is a natural evolution—another tool in their tool bag,” says Zeefe of Nisos. “And it’s a cat-and-mouse game just like every other technology.”