
IBM blesses Spark in its latest big data push

Signage in front of IBM's Thomas J. Watson Research Center in Yorktown Heights, New York. Photograph by Scott Eells, Bloomberg/Getty Images

IBM is a big proponent of big data and the successful application of big data techniques to help businesses run better.

Increasingly, big data jockeys are looking beyond Hadoop, the distributed storage and processing framework that is the foundation of many big data applications, to Spark for analyzing that information.

By way of introduction, Spark is an open-source project for in-memory data processing that aims to speed up the crunching of huge amounts of data, typically data held in Hadoop. Spark’s claim to fame, in fact, is that it can do that data crunching faster and more easily than MapReduce, the tool typically used along with Hadoop for these applications.

So, on Monday, IBM (IBM), which always likes to wrap its stated intentions in press releases, said it is opening a Spark Technology Center in San Francisco. It also vowed to incorporate Spark in all of its analytics and e-commerce products and to dedicate 3,500 developers to Spark-related work, although it’s unclear where all those developers will come from. Spark itself will also be offered as a service on IBM’s Bluemix development platform.

News of the effort leaked last week but now it’s official.

IBM’s endorsement of open-source projects like Spark has proven fruitful in the past. IBM was an early booster of the Linux operating system and Eclipse Java tool set, for example.

The Spark project isn’t just about fast, in-memory processing. The overall Spark umbrella also includes file system, machine learning, and stream processing components.

IBM’s move comes as it tries to bolster revenue from data analytics projects and recruit more developers to use its tools, including Bluemix, to develop next-generation applications. In this arena it faces a raft of newer companies, like Cloudera and Hortonworks (HDP), that have already endorsed Spark and don’t have to support the legacy businesses that IBM has to keep running.
