A top Google researcher talks about increasingly intelligent computers.
The next time you enter a query into Google’s search engine or consult the company’s map service for directions to a movie theater, remember that a big brain is working behind the scenes to provide relevant search results and make sure you don’t get lost while driving.
Well, not a real brain per se, but the Google Brain research team. As Fortune’s Roger Parloff wrote, the Google Brain research team has created over 1,000 so-called deep learning projects that, over the past few years, have supercharged many of Google’s products, such as YouTube, translation, and photos. With deep learning, researchers can feed huge amounts of data into software systems called neural nets that learn to recognize patterns within the vast information faster than humans.
In an interview with Fortune, one of Google Brain’s co-founders and leaders, Jeff Dean, talks about cutting-edge AI research, the challenges involved, and using AI in Google’s products. The following interview, conducted against the backdrop of the 50th annual Turing Award, an honor in computer science from the Association for Computing Machinery, has been edited for length and clarity.
What are some challenges researchers face in pushing the field of artificial intelligence forward?
A lot of human learning comes from unsupervised learning where you’re just sort of observing the world around you and understanding how things behave. That’s a very active area of machine-learning research, but it’s not a solved problem to the extent that supervised learning is.
So unsupervised learning refers to how one learns from observation and perception, and if computers could observe and perceive on their own that could help solve more complex problems?
Right, human vision is trained mostly by unsupervised learning. You’re a small child and you observe the world, but occasionally you get a supervised signal where someone would say, “That’s a giraffe” or “That’s a car.” And you build your natural mental model of the world in response to that small amount of supervised data you got.
We need to use more of a combination of supervised and unsupervised learning. We’re not really there yet, in terms of how most of our machine learning systems work.
Can you explain the AI technique called reinforcement learning?
The idea behind reinforcement learning is you don’t necessarily know the actions you might take, so you explore the sequence of actions you should take by taking one that you think is a good idea and then observing how the world reacts. Like in a board game where you can react to how your opponent plays. Eventually after a whole sequence of these actions you get some sort of reward signal.
Reinforcement learning is the idea of being able to assign credit or blame to all the actions you took along the way while you were getting that reward signal. It’s really effective in some domains today.
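One common way to make this credit assignment concrete is to give each action a discounted share of the rewards that followed it. The sketch below is only an illustration of that idea, not a method described in the interview; the function name `discounted_returns` and the discount factor `gamma` are assumptions introduced here.

```python
def discounted_returns(rewards, gamma=0.9):
    """Return, for each step, the reward at that step plus the
    discounted sum of all rewards that came after it."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backward so each step can reuse the total computed for the next one.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# A hypothetical five-move game where only the final move yields a reward (a win).
rewards = [0.0, 0.0, 0.0, 0.0, 1.0]
credit = discounted_returns(rewards, gamma=0.5)
print(credit)
```

Moves closer to the win receive more credit than early moves, which is one simple answer to the question of how to spread a single end-of-game signal across the whole sequence of actions.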
I think where reinforcement learning has some challenges is when the space of actions you may take is incredibly broad and large. A human operating in the real world might take an incredibly broad set of actions at any given moment. Whereas in a board game there’s a limited set of moves you can take, and the rules of the game constrain things a bit and the reward signal is also much clearer. You either won or lost.
If my goal was to make a cup of coffee or something, there’s a whole bunch of actions I might want to take, and the reward signal is a little less clear.
But you can still break the steps down, right? For instance, while making a cup of coffee, you could learn that you didn’t fully grind the beans before they were brewed, and that this resulted in bad coffee.
Right. I think one of the things about reinforcement learning is that it tends to require exploration. So using it in the context of physical systems is somewhat hard. We are starting to try to use it in robotics. When a robot has to actually take some action, it’s limited in the number of actions it can take in a given day. Whereas in computer simulations, it’s much easier to use a lot of computers and get a million examples.
Is Google incorporating reinforcement learning in the core search product?
The main place we’ve applied reinforcement learning in our core products is through collaboration between DeepMind [the AI startup Google bought in 2014] and our data center operations folks. They used reinforcement learning to set the air conditioning knobs within the data center and to achieve the same, safe cooling operations and operating conditions with much lower power usage. They were able to explore which knob settings make sense and how they reacted when you turn something this way or that way.
Through reinforcement learning they were able to discover knob settings for these 18 or however many knobs that weren’t considered by the people doing that task. People who knew about the system were like, “Oh, that’s a weird setting,” but then it turned out that it worked pretty well.
What makes a task more appropriate for incorporating reinforcement learning?
The data center scenario works well because there are not that many different actions you can take at a time. There’s like 18 knobs, you turn a knob up or down, and you’re there. The outcome is pretty measurable. You have a reward for better power usage assuming you’re operating within the appropriate margins of acceptable temperatures. From that perspective, it’s almost an ideal reinforcement learning problem.
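To see why a small action set and a measurable outcome make this a tidy problem, here is a toy sketch using tabular Q-learning, a standard reinforcement learning algorithm. It is not the system DeepMind and Google built: the three knobs, their cooling effects, the temperature limit, and the penalty of -100 for overheating are all made-up values chosen for illustration.

```python
import random

# All numbers below are hypothetical, chosen only to make a small example.
EFFECT = (4, 2, 1)     # cooling contributed per level of each knob
MAX_LEVEL = 2          # each knob can be set to level 0, 1, or 2
BASE_TEMP = 30.0       # temperature with every knob at level 0
TEMP_LIMIT = 22.0      # highest acceptable temperature
START = (2, 2, 2)      # begin with every knob turned all the way up

# Actions: leave everything alone (None), or turn one knob down or up a level.
ACTIONS = [None] + [(k, d) for k in range(len(EFFECT)) for d in (-1, 1)]


def power(state):
    """Power used, modeled simply as the sum of the knob levels."""
    return sum(state)


def is_safe(state):
    """True when the temperature is within the acceptable margin."""
    cooling = sum(e * level for e, level in zip(EFFECT, state))
    return BASE_TEMP - cooling <= TEMP_LIMIT


def step(state, action):
    """Apply an action; reward is lower power use, or a big penalty if unsafe."""
    if action is not None:
        knob, delta = action
        levels = list(state)
        levels[knob] = min(MAX_LEVEL, max(0, levels[knob] + delta))
        state = tuple(levels)
    reward = -float(power(state)) if is_safe(state) else -100.0
    return state, reward


def train(episodes=2000, steps=20, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}  # maps (state, action index) to an estimated long-run reward
    for _ in range(episodes):
        state = START
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore a random knob move
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
            nxt, reward = step(state, ACTIONS[a])
            best_next = max(q.get((nxt, i), 0.0) for i in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return q


def greedy_rollout(q, steps=10):
    """Follow the learned values without exploring and return the final settings."""
    state = START
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
        state, _ = step(state, ACTIONS[a])
    return state


q_table = train()
final_state = greedy_rollout(q_table)
print(final_state, power(final_state), is_safe(final_state))
```

In this toy setup the lowest-power configuration that still stays within the temperature limit is `(2, 0, 0)`. With only seven possible actions per step and an unambiguous reward, the agent can afford to try every knob move many times, which is the property the data center example relies on.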
An example of a messier reinforcement learning problem is perhaps trying to use it to decide which search results to show. There’s a much broader set of search results I can show in response to different queries, and the reward signal is a little noisy. Like if a user looks at a search result and likes it or doesn’t like it, that’s not that obvious.
How would you even measure if they didn’t like a certain result?
Right. It’s a bit tricky. I think that’s an example of where reinforcement learning is maybe not quite mature enough to really operate in these incredibly unconstrained environments where the reward signals are less crisp.
What are some of the biggest challenges in applying what you’ve learned doing research to actual products people use each day?
One of the things is that a lot of machine learning solutions and research into those solutions can be reused in different domains. For example, we collaborated with our Map team on some research. They wanted to be able to read all the business names and signs that appeared in street images to understand the world better, and know if something’s a pizzeria or whatever.
It turns out that to actually find text in these images, you can train a machine learning model where you give it some example data where people have drawn circles or boxes around the text. You can actually use that to train a model to detect which pixels in the image contain text.
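As a rough illustration of what that training data could look like, the sketch below turns hand-drawn boxes into a per-pixel label grid of the kind a text-detection model might be trained to predict. The function name `boxes_to_mask` and the `(top, left, bottom, right)` box format are assumptions made for this example, not details from the interview.

```python
def boxes_to_mask(height, width, boxes):
    """Build a per-pixel label grid: 1 where a pixel falls inside any
    annotated box, 0 elsewhere. Each box is (top, left, bottom, right),
    with bottom and right exclusive."""
    mask = [[0] * width for _ in range(height)]
    for top, left, bottom, right in boxes:
        # Clip the box to the image so out-of-range annotations are harmless.
        for y in range(max(0, top), min(height, bottom)):
            for x in range(max(0, left), min(width, right)):
                mask[y][x] = 1
    return mask


# A hypothetical 4x6 image with one box drawn around a sign.
mask = boxes_to_mask(4, 6, [(1, 1, 3, 4)])
for row in mask:
    print(row)
```

A model trained against grids like this learns to output, for every pixel, whether it belongs to text. The same label format would work for any other target that annotators can outline with a box.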
That turns out to be a generally useful capability, and a different part of the Map team was able to reuse that for a satellite-imagery analysis task where they wanted to find rooftops in the U.S. or around the world to estimate the location of solar panel installations on rooftops.
And then we’ve found that the same kind of model can help us on preliminary work on medical imaging problems. Now you have medical images and you’re trying to find interesting parts of those images that are clinically relevant.