How This Company Plans to Use Microsoft Cloud and Mesosphere to Get Google-like Scale
At this point, all companies, even old-school stalwarts like General Electric (GE), are software companies, and the ability to unleash computing power quickly on new problems as needed can spell the difference between success and failure.
ESRI, a specialist in geographic information systems (GIS), basically data-rich digital maps, likewise has turned to new technologies to help cities, Fortune 500 companies, and universities ramp up data collection and analysis.
This process must accelerate because of the emerging Internet of Things, in which zillions of devices collect data about trains, planes, and automobiles (and just about everything else) and funnel it to cloud computing systems for aggregation and analysis. Those systems then, hopefully, spew out useful results.
Now the company is working on a new managed service that will let clients quickly set up applications that import and parse data for their own needs. Recently ESRI demonstrated how it used Microsoft’s (MSFT) Azure cloud, Internet of Things Suite, and Azure Container Service to track all New York City taxis—by location, ID number, on/off duty status, routes etc.—using publicly available data generated by those taxis. The company will talk about this service more next week at its user conference.
The beauty of the ESRI offering is that it aggregates and visualizes the data customers need and then allows them to “replay it,” TiVo-like, as needed, said Adam Mollenkopf, who leads ESRI’s real-time GIS efforts. That combined real-time and historical perspective is very important. Perhaps more intriguingly, he noted that the goal is also to offer “predictive” GIS services: based on traffic patterns seen on a particular day over time, here’s what can be expected to happen on this day, this year or the next.
To demonstrate how the managed service will work, ESRI took public data from the New York City Taxi & Limousine Commission.
Mollenkopf picked the busiest hour of the busiest day, January 25, 2015, for which there was public data available. There were 535,000 total trips logged that day. Then for that hour, he vacuumed up the available data: the taxi company; the individual taxi ID; pickup and drop-off positions by longitude and latitude; number of passengers; trip time in seconds; and distance travelled.
He ended up with a simulation file that aggregated more than 7.1 million GPS positions from 10,086 trips. That combination of historical data with real-time data could help taxi companies deploy cabs where they’re most needed at the right times and map the most efficient routes to various destinations, pointing out bottlenecks and also where most traffic tickets are issued. That alone is probably worth it to any cab driver.
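To make the "replay" idea concrete, here is a minimal Python sketch of a trip record built from the fields listed above, plus a function that plays back the trips in a chosen time window. The class name, field names, units, and the `pickup_time_s` timestamp are illustrative assumptions; the article does not describe ESRI's actual schema or replay mechanism.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class TripRecord:
    """One taxi trip, using the fields named in the article.

    Names and units are assumptions for illustration, not ESRI's schema.
    """
    company: str
    taxi_id: str
    pickup_lon: float
    pickup_lat: float
    dropoff_lon: float
    dropoff_lat: float
    passengers: int
    trip_time_s: int      # trip time in seconds
    distance: float       # distance travelled (unit assumed)
    pickup_time_s: int    # hypothetical: seconds since the start of the day


def replay(trips: Iterable[TripRecord], start_s: int, end_s: int) -> Iterator[TripRecord]:
    """Yield trips whose pickup time falls in [start_s, end_s), oldest first."""
    in_window = [t for t in trips if start_s <= t.pickup_time_s < end_s]
    yield from sorted(in_window, key=lambda t: t.pickup_time_s)
```

Calling `replay(trips, 0, 3600)` would walk through the first hour's pickups in order, much as a recorded show can be rewound to a chosen point. A production system would presumably stream individual GPS positions rather than whole trips.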
Over time, this data could also feed into other “smart city” options like variable-rate parking and the placement of taxi stands.
ESRI’s goal is to provide real-time GIS that is flexible and easy for customers to adapt to their own needs. A video demo of this work is here. A managed service is one that is provided by a third party, ESRI in this case, that companies can use easily without having to sweat the complexities of the servers and software running below the surface. Essentially, the third party behind the managed service acts as the customer’s IT department.
What Azure and Azure Container Service, running Mesosphere’s DC/OS (short for data center operating system), provide is scale and flexibility. While ESRI’s GIS system can handle thousands of events per second when running on premises, the new cloud setup can handle hundreds of thousands or even millions of events per second if necessary, Mollenkopf said.
A cloud-based system can be ramped up quickly to handle workloads that spike and crater depending on the day (or hour, or minute).
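A rough back-of-the-envelope sketch in Python shows what those event rates mean in practice. It assumes the 7.1 million GPS positions in the simulation file are spread over roughly one hour, and it uses a made-up per-worker capacity of 5,000 events per second; neither figure comes from ESRI or Microsoft.

```python
import math


def events_per_second(total_events: int, duration_s: int) -> float:
    """Average event rate over a period of duration_s seconds."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return total_events / duration_s


def workers_needed(rate_eps: float, per_worker_eps: float) -> int:
    """Smallest number of workers able to absorb rate_eps (at least one)."""
    if per_worker_eps <= 0:
        raise ValueError("per_worker_eps must be positive")
    return max(1, math.ceil(rate_eps / per_worker_eps))


# Assumption: the simulation file covers about one hour (3,600 seconds).
taxi_rate = events_per_second(7_100_000, 3600)

# Hypothetical capacity of a single worker, chosen only for illustration.
PER_WORKER_EPS = 5000

print(round(taxi_rate), workers_needed(taxi_rate, PER_WORKER_EPS))
print(workers_needed(1_000_000, PER_WORKER_EPS))
```

Under these assumptions, the taxi feed averages a little under 2,000 events per second and fits on a single worker, while a million events per second would need 200. That gap is the kind of spike a cloud setup can absorb by adding machines on demand.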
“The key here is ESRI was able to stand up and manage infrastructure components needed for a large-scale application with a few button clicks and API calls using Azure Container Services and DC/OS,” Mark Russinovich, chief technology officer for Microsoft Azure, told Fortune in an interview. Azure IoT Suite can take in (or ingest, in tech parlance) data from tens of millions of devices without being overburdened, according to Microsoft.
“What these applications need is flexibility and power,” Russinovich added.
An API, or application programming interface, is the tech name for the standard way that different computer programs “talk to” or interact with each other. APIs are considered the lingua franca of software and are especially important now that most big enterprise applications consist of myriad “microservices,” tiny pieces of software yoked together. In the past, enterprise software consisted of massive, monolithic applications, written by a single company and updated every few years.
The pitch here is that ESRI, by using these new-age tools, was able to prototype a pretty complete system in just months, far shorter than what was once a typical multi-year timeline.
A few big things have changed in technology that enable customers to take on big new tasks.
First is the emergence over the past ten years of massive public clouds like Azure, Amazon (AMZN) Web Services, and Google (GOOG) Cloud Platform. These offer customers whole data centers full of servers, storage, and networking available for their on-demand use, without the setup costs, overhead, and expertise required to build their own.
Second is the container craze, as exemplified by Docker, which lets developers package up everything they need to run an application into one tidy, theoretically portable container that can then be deployed inside a company’s data center or out on a public cloud.
Third is a raft of container management software tools which have emerged to make it easier to deploy those containers en masse wherever they need to run. That job can get pretty complicated pretty fast with so many variables and so many containers, so these tools are becoming increasingly necessary.
In the past, ESRI would have had to rely on its own data centers and specialized proprietary (and typically expensive) software to build this managed service. Now it’s using open-source tools like Apache Spark Streaming to bring in data and Elasticsearch for a search and storage engine. These components are not only less pricey than their traditional counterparts but also allow for more scale across distributed computing systems.
Left unsaid in all this is that ESRI, which specializes in maps, faces a big competitor in Google Maps. The two companies are also partners in that Google cut its old Google Earth for Enterprise project and has referred those customers to ESRI. But given all the data that Google gathers from Android phones and other data sources to update its maps, it’s clear that a company like ESRI has to come up with its own way to cull myriad data sources around the world. In theory, this Azure implementation will help it do so.
Microsoft and Mesosphere are tight partners here. Microsoft was reportedly interested in acquiring the company, but ultimately invested in it instead last year. In any case, they want to show that they can help ordinary companies reach Google-like scale.
They’re not alone in that quest. Google (GOOG) and its allies are pushing Kubernetes as the best way to deploy and manage containers. (To muddy the waters, Mesosphere DC/OS also supports Kubernetes.) And this week at DockerCon, Docker (the company) pushed its own suite of container management and deployment tools. Solutions similar to ESRI’s may eventually pop up as more companies develop their own enterprise software and managed services.
This story was updated at 2:35 p.m. EDT to include mention of other ESRI projects.