Twitter makes it look easy to give its users a constantly updating scroll of news links, advice and whining from everyone they follow. LinkedIn seems to hardly break a sweat when it pings users about their connections landing a new job or “liking” an interesting article.
But, in fact, providing all of this instant information is a huge undertaking. And the emerging technology that makes it possible will likely play an increasingly critical role amid a growing appetite for data analysis at the blink of an eye.
In addition to Twitter and LinkedIn, so-called “stream processing” is also used by The Weather Channel to suck up and analyze weather data in real time, and by streaming-music service Spotify to recommend songs and show targeted ads to listeners. It could also play a major role in the Internet of Things, whereby commonplace devices like cars, toasters, and locomotives collect huge amounts of data for slicing and dicing.
Car makers could use the technology so that an Internet-connected vehicle can find the best possible route using data taken in from the environment. Meanwhile, trucking companies could track their 18-wheelers as they criss-cross the country.
It’s great to have huge amounts of data about your business, customers, or online users. But it’s entirely useless if you lack the data crunching power to keep pace with the amount of information coming in.
That’s where stream processing comes into play. Big web companies like Twitter (TWTR) and LinkedIn (LNKD) use this technology to make sure that the latest information is displayed when people visit their services. With stream processing, a webpage or online service doesn’t have to feel static because it’s constantly absorbing, analyzing, and displaying new data.
Traditionally, if companies wanted to analyze their data, they would have to store all that information in some sort of huge data repository, commonly referred to as a data warehouse. Data analysts could then run queries on all that information to help learn how one piece of data might influence another.
If a company executive wants to know the demographic breakdown of customers who bought a particular kind of toothpaste, for example, a data analyst could run a query on all the stored sales data and get a breakdown of sales by age group.
Stream processing makes this much faster because the data doesn’t have to be funneled into a database or data warehouse before it can be analyzed. The data can be streamed from its source, like a store cash register, and analyzed on the fly because the engine that’s helping to move the data can automatically perform the query that a data scientist would otherwise have had to run manually.
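To make the contrast concrete, here is a minimal sketch in Python of the toothpaste example handled as a stream. It does not reflect any particular company's system: the record fields (`product`, `customer_age`), the age brackets, and the `register_feed` stand-in for a cash-register feed are all assumptions made for illustration. The point is only that the breakdown is updated as each sale arrives, rather than computed later by a query over stored data.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator


def age_group(age: int) -> str:
    """Map an age to a bracket (hypothetical brackets, chosen for illustration)."""
    if age < 18:
        return "under 18"
    if age < 35:
        return "18-34"
    if age < 55:
        return "35-54"
    return "55+"


def running_breakdown(sales: Iterable[dict], product: str) -> Iterator[Dict[str, int]]:
    """Yield an updated sales-by-age-group count after every matching sale.

    Nothing is stored except the running counts, so the answer is always
    current and no warehouse query is needed.
    """
    counts: Counter = Counter()
    for sale in sales:
        if sale["product"] != product:
            continue  # ignore sales of other products
        counts[age_group(sale["customer_age"])] += 1
        yield dict(counts)  # snapshot of the breakdown so far


def register_feed() -> Iterator[dict]:
    """Stand-in for a live cash-register feed (made-up records)."""
    yield {"product": "toothpaste", "customer_age": 29}
    yield {"product": "soap", "customer_age": 41}
    yield {"product": "toothpaste", "customer_age": 62}
    yield {"product": "toothpaste", "customer_age": 31}


snapshots = list(running_breakdown(register_feed(), "toothpaste"))
for snapshot in snapshots:
    print(snapshot)
```

Production engines such as Apache Storm spread this kind of work across many machines, but the underlying idea is the same: the query runs continuously against data in motion.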
Stream processing is one of many techniques for handling huge amounts of data, sometimes called Big Data. A free, open-source stream-processing engine known as Apache Storm has been gaining traction with data scientists and is being used by companies like e-commerce site Alibaba and the Chinese search engine Baidu.
This type of technology is getting to be such a big deal that web companies like eBay (EBAY) have also been building their own stream-processing engines. Startups are sprouting up with their own take on the technology to sell to businesses. Last month, DataTorrent, one such company, raised a $15 million Series B financing round to continue building its stream-processing technology and hire more engineering staff.
Even cloud computing providers like Microsoft, Google, and Amazon have been busy unveiling stream-processing services as a way to attract customers who crave faster tools to crunch data and don’t want to run their company’s computing infrastructure in-house.
And it’s not just web companies that are interested, DataTorrent CEO Phu Hoang told Fortune. Insurance companies are looking into the technology to change how insurance claims are automatically paid out to customers. While it’s great that customers get paid quickly, the speedy nature of the process “lends itself to fraud,” Hoang said.
With stream-processing technology, insurance companies can take all the claim data in, process it faster, and scan for any anomalies that might indicate something is fishy with a particular claim. This sort of fraud detection is nothing new. But stream processing lets companies verify the information more quickly, Hoang said. As a result, it wouldn’t slow down checks getting into the hands of insurance customers.
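As a rough illustration of scanning claims for anomalies as they arrive, the Python sketch below flags any claim whose amount sits far from the running average of the claims seen before it. This is not a description of how DataTorrent or any insurer actually detects fraud: the function name, the z-score rule, the threshold of 3, and the sample amounts are all assumptions chosen to keep the example small.

```python
import math
from typing import Iterable, Iterator, Tuple


def flag_unusual_claims(
    amounts: Iterable[float],
    z_threshold: float = 3.0,  # hypothetical cutoff, not an industry standard
    warmup: int = 5,           # claims to observe before flagging anything
) -> Iterator[Tuple[float, bool]]:
    """Yield (amount, suspicious) for each claim as it streams in.

    Uses Welford's online algorithm to keep a running mean and variance,
    so no history of past claims has to be stored or re-queried.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for amount in amounts:
        suspicious = False
        if count >= warmup:
            std = math.sqrt(m2 / (count - 1))
            if std > 0 and abs(amount - mean) / std > z_threshold:
                suspicious = True
        yield amount, suspicious

        # Update the running statistics. For simplicity, flagged claims are
        # included too; a real system would likely treat them separately.
        count += 1
        delta = amount - mean
        mean += delta / count
        m2 += delta * (amount - mean)


# Made-up claim amounts; one is far larger than the rest.
claims = [1000, 1100, 950, 1050, 1020, 980, 25000, 1010]
flagged = [amount for amount, suspicious in flag_unusual_claims(claims) if suspicious]
print(flagged)
```

Because each claim is checked the moment it arrives, ordinary claims can pass straight through to payment while only the flagged ones are held for review.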
With connected objects like a car, Hoang said stream processing could help drivers make sure that a mechanic services their vehicles at the right time. Some mechanics may want drivers to check their engines every 3,000 miles. But in some situations, like daily driving in stop-and-go traffic, car owners should visit a garage before they hit that 3,000-mile threshold.
With stream processing, car makers could automatically comb through driving information they collect remotely, presumably with driver consent, and analyze it instantaneously. Drivers could be pinged when the manufacturer determines that they should take their car to the shop earlier than expected.
As more companies create connected devices, there’s a potential data flood in store for the future that requires a much faster way for businesses to sift through all that information. In this case, stream processing may just be the answer.