YouTube video streaming now using A.I. that mastered chess and Go

February 11, 2022, 9:00 AM UTC

YouTube has begun using an algorithm first developed to conquer board games such as chess and Go to improve its video compression.

The artificial intelligence algorithm, called MuZero, was developed by YouTube’s London-based sister company within Alphabet, DeepMind, which is dedicated to advanced A.I. research. When applied to YouTube videos, the system has resulted in a 4% reduction on average in the amount of data the video-sharing service needs to stream to users, with no noticeable loss in video quality.

While that might not sound like a major improvement, at YouTube’s scale it amounts to major savings in computing power and bandwidth. It will also help people in countries with very limited broadband watch video content they would otherwise struggle to view, said Anton Zhernov, a DeepMind researcher who worked to adapt the algorithm for YouTube. Video streaming already occupies a good chunk of the world’s internet capacity, and that share is only expected to climb.

The system is now in active use across most, but not all, of the videos on YouTube, Zhernov said. The A.I. system specifically works to improve on an open-source video compression method called VP9 that is widely used by YouTube, although some of its content is compressed using other protocols.

This is the first full-scale business application for MuZero, although the algorithm has been used in other real-world contexts. In late 2020, the U.S. Air Force said it had used an open-source version of the software to control the radar systems on a modified U-2 spy plane during a simulated strike on an enemy airbase. DeepMind, which has said it will not work on military applications of A.I., did not participate in that project.

DeepMind is paid royalty fees by Google for the use of its technology. Colin Murdoch, DeepMind’s chief business officer, who leads the company’s commercial collaborations, declined to say how much YouTube paid DeepMind for adapting MuZero for YouTube’s video compression. In 2020, the last year for which figures are available, DeepMind said it was paid more than $1.1 billion by other Alphabet companies for use of A.I. algorithms it had developed, an increase of more than $760 million on the prior year’s figure.

MuZero is a kind of A.I. algorithm that learns completely by trial and error, a method called reinforcement learning. In this kind of A.I. training, the software is given an objective to try to achieve, and it is given feedback on whether its decisions are getting it closer to achieving that objective. But the A.I. is not given any past examples of effective strategies, and therefore must learn entirely through its own experience.
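
The trial-and-error loop described above can be illustrated with a toy example. This is a minimal sketch of a much simpler reinforcement learning method (an epsilon-greedy "bandit"), not MuZero itself; the function name, the two hidden success probabilities, and every parameter are assumptions made up for illustration. The learner is never shown which choice is better and receives only a 1-or-0 reward after each attempt.

```python
import random


def learn_by_trial_and_error(payoffs, steps=2000, epsilon=0.1, seed=0):
    """Estimate each action's value purely from reward feedback.

    `payoffs` holds hidden success probabilities (hypothetical values);
    the learner never reads them directly, it only sees rewards.
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(payoffs)
    counts = [0] * len(payoffs)
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action to gather new experience.
            action = rng.randrange(len(payoffs))
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(len(payoffs)), key=lambda a: estimates[a])
        # Feedback: reward 1 with the action's hidden probability, else 0.
        reward = 1.0 if rng.random() < payoffs[action] else 0.0
        counts[action] += 1
        # Running average of the rewards observed for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates


estimates = learn_by_trial_and_error([0.2, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
```

With enough attempts, the estimate for the second action should end up higher, so the learner settles on it without ever having been told it was the better choice.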

Algorithms trained in this way have the advantage of being able to figure out completely novel tactics that can surpass what humans have been able to do—but they often do so in ways that strike human experts as counterintuitive or alien. This can make it difficult, at first, for humans to trust these A.I. systems.

In the case of YouTube’s video compression, Chenjie Gu, one of the DeepMind researchers who worked on the project, said that MuZero often ignored a standard video compression rule of thumb: that the bit rate should be maximized for the first frame in a scene and again for a reference frame about 10 frames further into the sequence. Instead, MuZero found that for many video sequences, as long as the bit rate was maximized for one of these two frames, the other did not need much bandwidth, Gu said.

A.I. systems that are trained like MuZero can sometimes fail in surprising ways too. While MuZero works extremely well for complex videos that stump other compression algorithms, it struggles with a simple “slideshow” type of video, Gu said. This is because it doesn’t understand how humans experience video, he said. In a slideshow, what is important to a human viewer are the static images—the “slides”—not the transitions between the slides. But MuZero often allocated more bandwidth to the transition frames because they are more dynamic when compared with the frame sequences before and after, while skimping on the slides themselves, he said. After discovering this flaw, YouTube engineers fixed it, he said, through some hard-coded rules for that kind of video.

DeepMind originally created MuZero in 2019 to show that an A.I. system could learn to play at a superhuman level almost any game in which players have complete information about the state of the game, and that it could do so starting from zero knowledge about how to play, including not knowing the rules. MuZero learns entirely by playing games against itself, gradually discovering the rules of the game along with effective tactics and strategies. In this way, MuZero was able to master chess, Go, the Japanese strategy game shogi, and a host of classic Atari video games.

Murdoch said that his team had an intuition that video compression could be converted into a kind of gamelike environment to which MuZero could then be applied. “A video is a series of still frames and each still frame is like a step in the game,” he said.

To do so, though, DeepMind and YouTube engineers had to figure out how to give MuZero feedback about whether its decisions on how to compress each frame of a video were both saving bandwidth and not noticeably eroding video quality.

The engineers did this by having MuZero control just one of the compression protocol’s settings, called the quantization parameter, or QP. It determines the bit rate, or number of bits per second of bandwidth, that is allocated for each frame in the video. In general, more complicated scenes require a higher bit rate and more static scenes a lower one in order to maintain an acceptable quality level. To turn this into a gamelike environment, DeepMind converted a series of complicated video quality and bit rate metrics into a single combined score and then had MuZero essentially compete against its own previous attempts to compress the same video. If MuZero beat its previous best combined score, it got a point. If it failed to beat its previous best effort, it scored 0 points.
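
The scoring scheme described above can be sketched in a few lines. This is an illustration only, not DeepMind's implementation: the article does not say how the combined score is computed, so the `combined_score` formula, its `penalty` weight, and the names `SelfCompetition` and `reward` are all hypothetical. Treating the first attempt on a video as a baseline worth 0 points is likewise an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict


def combined_score(quality: float, bitrate_kbps: float, penalty: float = 0.01) -> float:
    """Hypothetical single score: reward quality, penalize bits spent."""
    return quality - penalty * bitrate_kbps


@dataclass
class SelfCompetition:
    """Tracks the best combined score seen so far for each video."""
    best: Dict[str, float] = field(default_factory=dict)

    def reward(self, video_id: str, score: float) -> int:
        previous = self.best.get(video_id)
        if previous is None:
            # Assumption: the first attempt only sets a baseline.
            self.best[video_id] = score
            return 0
        if score > previous:
            # Beat the previous best: one point, and the bar is raised.
            self.best[video_id] = score
            return 1
        # Failed to beat the previous best: zero points.
        return 0


game = SelfCompetition()
rewards = [
    game.reward("clip", combined_score(quality=40.0, bitrate_kbps=1000)),  # baseline
    game.reward("clip", combined_score(quality=40.0, bitrate_kbps=950)),   # same quality, fewer bits
    game.reward("clip", combined_score(quality=39.0, bitrate_kbps=1000)),  # lower quality, more bits
]
```

In this sketch the second attempt earns a point because it matches the earlier quality with fewer bits, while the third earns nothing. Collapsing everything into a win-or-lose signal makes the task look like the board games MuZero was built for, where the only feedback is the outcome.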

Zhernov said DeepMind is eager to see if MuZero can do even better if it is given more compression parameters to adjust than just the QP. “We are super excited with this achievement, but this is just the first step and one tiny step into the real world,” he said.
