Facebook, more than nearly any web company, knows how much people love watching videos.

So far this year, the social network’s users have watched an average of four billion videos daily, quadruple the number in 2014. That’s a lot of cute puppies, skateboarding, and goofy backyard videos.

“Think about that growth,” said Jay Parikh, Facebook’s vice president of engineering. “How do you prepare for that growth?”

The reality is that it’s never been easier to be an amateur filmmaker. Smartphones, loaded with powerful cameras, have ushered in an era in which people want to do more than message each other with simple texts. They want to record moments in their lives, like a father filming his child’s first steps, and post them online for friends and family to watch.

But with more people making home movies from their phones, a huge strain is being put on the computer networks and cables that make the Internet work. At the same time, users want instant gratification, and anything less than perfection is reason to complain.

That’s why Facebook has been steadily redesigning its infrastructure to make loading and watching videos as reliable as possible. Plus, if the company ever wants to make big money from its nascent mobile video advertising push, it must deliver the visual goods without hiccups.

On Monday, Facebook plans to take the stage at its annual @Scale infrastructure conference in San Jose and reveal what have, until now, been closely guarded secrets about its fast-growing video operations. As a prelude, several of the company’s top engineers gave Fortune a detailed account of the technology behind the vast and complex system.

In Facebook’s early days, people mostly communicated via text by sending each other status updates and trading comments on the site. Then, users went wild loading photos on the site and tagging friends and family in them, Parikh explained.

Now, Facebook is noticing its users’ focus shift to video. “Everything is moving to this richer form of experience,” Parikh said.

But not everyone has the latest smartphone to take movies, nor do they always have access to speedy Internet connections. Filming a sunset in Africa and then uploading the clip to Facebook may be a miserable experience because of slow or non-existent Internet infrastructure.

Facebook’s world map of video playback success rates, where success means a video starts playing in less than one second. A lighter shade indicates better video playback. Map by Facebook

To ensure that more people can load videos, even in remote locations, Facebook has designed a new way to transfer them even when connectivity is poor. When users upload videos from their phones, the clips undergo a process called encoding, in which they are converted into new digital files that can be played on any device.

For people who live in places with poor Internet connectivity, Facebook shrinks the size of videos during the process to limit the amount of data that must be transferred. Of course, that means video quality may suffer, but the company considers it a necessary sacrifice.
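To make that trade-off concrete, here is a minimal sketch of how an upload service might pick a smaller resolution and bitrate for a poor connection and hand the job to a stock encoder such as ffmpeg. The presets and the encode_for_connection helper are illustrative assumptions, not Facebook’s actual settings.

```python
# Minimal sketch (not Facebook's pipeline): pick a smaller output size and
# bitrate when the uploader's connection is poor, then hand the parameters
# to a stock encoder such as ffmpeg (assumed to be installed).
import subprocess

# Hypothetical presets; a real service would tune these per network class.
PRESETS = {
    "good": {"height": 720, "video_bitrate": "2500k"},
    "fair": {"height": 480, "video_bitrate": "1000k"},
    "poor": {"height": 360, "video_bitrate": "400k"},
}

def encode_for_connection(src_path: str, dst_path: str, connection: str) -> None:
    """Re-encode an upload, trading image quality for a smaller transfer."""
    preset = PRESETS.get(connection, PRESETS["fair"])
    subprocess.run(
        [
            "ffmpeg", "-i", src_path,
            "-vf", f"scale=-2:{preset['height']}",  # downscale, keep aspect ratio
            "-b:v", preset["video_bitrate"],
            "-c:a", "aac", "-b:a", "96k",
            dst_path,
        ],
        check=True,
    )

# encode_for_connection("upload.mp4", "upload_360p.mp4", "poor")
```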

Although encoding seems relatively straightforward, there are actually many different ways that companies can convert videos to a playable format, explained Mike Coward, a Facebook engineering manager. For example, a website whose users tend to upload text-heavy clips may subtly tweak the color during encoding so that the text looks sharp and is easy to read.

But doing so may hurt image quality if there are other scenes in the clip. A waterfall tumbling in a tropical jungle could end up looking a little bland.

Facebook, however, has created a complex process to minimize the problem. Rather than taking a one-size-fits-all approach to handling uploaded videos, it uses multiple encoding techniques.

The key is to automatically split videos into different scenes so that each one can be encoded differently to enhance the color. Computers then stitch the sections back together without viewers noticing any difference.

Because Facebook handles millions of videos, the job of chopping them up must be done automatically. That’s where artificial intelligence, an area of computer science that involves training computers to learn and make decisions like humans, comes into play.

Facebook built software powered by AI technology that it trained to identify different scenes using thousands of videos. In theory, the system can pick out a bleak desert landscape, a crowded concert hall, and waves lapping at a beach — and then automatically apply the proper encoding to each so that they look their best.

Once done with the touch-ups, Facebook’s vast server farms glue the clips back together. Viewers shouldn’t really notice that the final product has been slightly enhanced, according to Coward.
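Here is a rough sketch of what such a per-scene pipeline could look like, assuming ffmpeg is installed. The scene list and the profile table are hypothetical stand-ins for Facebook’s AI classifier and tuning choices; the point is only to show scenes being encoded separately and then stitched back together.

```python
# Illustrative sketch of per-scene encoding, not Facebook's code: cut the
# video at scene boundaries, encode each piece with a profile chosen for
# its content, then concatenate the pieces. Assumes ffmpeg is installed.
import os
import subprocess
import tempfile

# Hypothetical output of a scene classifier: (start_sec, end_sec, label).
SCENES = [(0.0, 12.5, "text"), (12.5, 41.0, "landscape"), (41.0, 60.0, "text")]

# Hypothetical per-label quality settings (CRF = constant rate factor).
PROFILES = {"text": "18", "landscape": "23", "default": "23"}

def encode_by_scene(src: str, dst: str) -> None:
    workdir = tempfile.mkdtemp()
    listing = os.path.join(workdir, "segments.txt")
    with open(listing, "w") as manifest:
        for i, (start, end, label) in enumerate(SCENES):
            segment = os.path.join(workdir, f"seg_{i}.mp4")
            crf = PROFILES.get(label, PROFILES["default"])
            # Encode just this scene with its own quality setting.
            subprocess.run(
                ["ffmpeg", "-i", src, "-ss", str(start), "-to", str(end),
                 "-c:v", "libx264", "-crf", crf, "-c:a", "aac", segment],
                check=True,
            )
            manifest.write(f"file '{segment}'\n")
    # Stitch the independently encoded scenes back into one file.
    subprocess.run(
        ["ffmpeg", "-f", "concat", "-safe", "0", "-i", listing, "-c", "copy", dst],
        check=True,
    )
```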

It seems so easy, but it’s actually not. Uploading a one-minute video of a family standing in front of the Grand Canyon can put as many as 100 of Facebook’s computers to work behind the scenes.

And although Facebook sped up the video uploading process, there’s still the matter of actually watching the videos where connections are slow. As a remedy, for every video upload, Facebook makes multiple copies that can accommodate different devices and Internet speeds.

A person in India watching a video on an old phone will likely see a lower-quality image than someone watching on a new laptop connected to a high-speed network. The point is to cut down on what’s known as buffering, the annoying delay that can make you want to punch the screen.
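The idea behind those multiple copies can be sketched in a few lines. The rendition ladder and the headroom rule below are illustrative assumptions, not Facebook’s actual tiers; the logic simply picks the best copy a viewer’s screen and measured bandwidth can handle without buffering.

```python
# Illustrative sketch of choosing among multiple copies (renditions) of a
# video based on the viewer's screen and measured bandwidth.
from typing import NamedTuple

class Rendition(NamedTuple):
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # video bitrate

# Hypothetical rendition ladder, best to worst.
LADDER = [
    Rendition(1080, 4500),
    Rendition(720, 2500),
    Rendition(480, 1000),
    Rendition(360, 400),
]

def pick_rendition(bandwidth_kbps: float, screen_height: int) -> Rendition:
    """Choose the highest-quality copy the connection and screen support."""
    for rendition in LADDER:
        fits_screen = rendition.height <= screen_height
        # Leave ~25% headroom so playback stays ahead of the download.
        fits_network = rendition.bitrate_kbps * 1.25 <= bandwidth_kbps
        if fits_screen and fits_network:
            return rendition
    return LADDER[-1]  # fall back to the smallest copy

# An old phone on a slow link gets the 360p copy; a fast laptop gets 1080p.
# pick_rendition(bandwidth_kbps=600, screen_height=480)    -> Rendition(360, 400)
# pick_rendition(bandwidth_kbps=8000, screen_height=1080)  -> Rendition(1080, 4500)
```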

Facebook says it is building the new infrastructure in a way that makes it easy to add any new video products it develops. For example, a new live-streaming service that features celebrities like Dwayne Johnson (a.k.a. The Rock) showing himself pumping iron took only three months to build, Parikh said.

As a precaution, the team keeps the original version of the live stream running in the background at one of its data centers, explained Abhishek Mathur, a Facebook technical program manager. It then makes temporary copies, called caches, and hosts them on servers all over the world. That way, if millions of people want to watch The Rock lift weights, they won’t bog down the system and cause the live stream to stutter.
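In outline, that caching scheme looks something like the sketch below. The EdgeCache class and its time-to-live are illustrative, not Facebook’s implementation; the key behavior is that repeat requests are served from a nearby cache, so only a cache miss ever reaches the data center running the original stream.

```python
# Illustrative sketch of edge caching for live-stream segments: temporary
# copies are kept near viewers so the origin data center isn't overwhelmed.
import time

class EdgeCache:
    def __init__(self, origin_fetch, ttl_seconds: float = 10.0):
        self._origin_fetch = origin_fetch  # callable that pulls from the data center
        self._ttl = ttl_seconds            # live segments go stale quickly
        self._store = {}                   # segment id -> (bytes, fetched_at)

    def get_segment(self, segment_id: str) -> bytes:
        entry = self._store.get(segment_id)
        if entry is not None and time.time() - entry[1] < self._ttl:
            return entry[0]                     # served from the edge cache
        data = self._origin_fetch(segment_id)   # only a miss reaches the origin
        self._store[segment_id] = (data, time.time())
        return data

# cache = EdgeCache(origin_fetch=lambda seg: b"...video bytes...")
# cache.get_segment("live-stream/segment-42")  # first viewer fills the cache
# cache.get_segment("live-stream/segment-42")  # later viewers never hit the origin
```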

Facebook must also be aware of the problems millions of videos can cause big telecommunications companies. Maintaining good relations with them is important, which means trying to avoid clogging their networks with too much video traffic.

As AT&T CEO Randall Stephenson told Fortune earlier this summer, over half of the carrier’s mobile traffic is video. That boom has led AT&T to overhaul its vast global network to keep pace.

“We want to be as thoughtful and as efficient as possible,” Parikh said.

And, yes, there is a financial reason for Facebook to invest in all of this technology. Growing video advertising, particularly on mobile devices, is a critical part of the company’s overall strategy, Parikh explained. He recognizes that for Facebook to create a big business around video advertising, the infrastructure powering it must be efficient and fast. After all, if Facebook wants more people to upload and watch videos, there can’t be any bottlenecks.
