Survey shows huge popularity spike for Apache Spark

September 25, 2015, 7:49 PM UTC

Apache Spark is the Taylor Swift of big data software. The open source technology has been around and popular for a few years. But 2015 was the year Spark went from an ascendant technology to a bona fide superstar.

Over the past summer, Fortune has covered some of Spark’s biggest developments, including IBM’s (IBM) $300 million investment in the technology and big data company Cloudera’s switch to Spark from Hadoop MapReduce. Companies are making these moves because Spark is significantly faster and easier to use than other ways of processing large volumes of data, and it includes tools (real-time processing, machine learning and interactive SQL, for example) that are well suited to business objectives such as analyzing real-time data from connected devices, also known as the Internet of things.

But sometimes numbers speak louder than words (and even actions). One popular number often noted by the Spark community is that its roughly 600 contributors make it the most active project in the entire Apache Software Foundation, a major governing body for open source software, in terms of number of contributors. That’s no small feat considering the number of popular enterprise database and infrastructure projects currently governed by Apache.

And new numbers released this week as part of a survey from Databricks, a software startup founded by the creators of Spark, shed some new light on just how popular the technology has become. One of the standout statistics has to do with attendance at user conferences, which is usually a good sign of interest in a technology and of who’s using it. In 2015, attendance at Spark Summit events grew 156% to nearly 3,000, and the number of companies represented grew 152% to more than 1,100.

Users who presented at the most recent Spark Summit in June included NBCUniversal, Netflix (NFLX), Uber, Capital One (COF) and Baidu (BIDU). There’s also a set of technology companies, including Salesforce (CRM), that are embedding Spark into their data analysis software to provide customers with faster processing. While software and web companies are the largest group of Spark users, the Databricks survey says about 15% of Spark users are in more traditional businesses like banking, medicine and telecommunications.

That might help to explain why the number of Spark developers who use Windows nearly quadrupled, going from 6% in 2014 to 23% in 2015. While Mac OS X and Linux have always been popular among Spark users and web developers, they’re not nearly so prevalent inside Fortune 500-type companies.

Really, though, Spark’s success among big companies is just a microcosm of a larger trend over the better part of a decade—including inside Apple (AAPL), the world’s most valuable company. Open source software is getting much better, often outpacing proprietary software in terms of innovation and quality, and chief information officers everywhere are taking note. The saying “No one ever got fired for buying Oracle” once seemed like a truism, but it might soon just seem like an antiquity.
