Algorithms that can ‘see’ images on the Web

Thanks to smartphones, photos are the new language of the Web. Each day people upload 1.8 billion new images to the Internet. The only problem with all that pic-sharing? The Web’s entire infrastructure is built around text. (Even Google’s image search function relies on text to identify images.) So how do we search, sort, browse, and navigate a rapidly growing sea of images? We teach our computers to actually see them. As you can imagine, that’s no small challenge.

Nor is it new. Academic researchers built a “convolutional neural network” architecture in the 1980s, but those early computer-vision algorithms weren’t very powerful (or useful) running on conventional processors. Programming them to run on modern graphics processors—the kind used by video games rendered in 3-D—changed everything. That happened around 2009. “A super-nerdy discipline in academia is now becoming an important way of understanding the Internet,” says Shaun Zacharia, chief technology officer of TripleLift, a startup that’s using computer vision to optimize digital ads. TripleLift is one of many companies that are pulling the Web’s billions of untagged, unsearchable images out of the dark.

For example, Google Ventures-backed Clarifai developed an algorithm that analyzes several hundred images per second for clients such as social networking companies, real estate listing sites, and e-commerce firms. Clarifai’s technology can identify an image, categorize it, and group similar images together. Other startups take computer vision into the real world: Body Labs, based in New York, creates digital 3-D models of bodies, which the U.S. Army is now using to improve armor for female soldiers. Floored, also based in New York, turns 3-D models of building interiors into interactive graphics so that potential clients can “experience” a piece of real estate through a “video flythrough,” or, if they choose, a virtual reality headset.

The race is on for our computers to apply their new image-reading skills. In the minute it took you to read this article, another 1.25 million images were uploaded to the Internet.
