Apr 22, 2014
The elusive promise of the Big Data app economy inched a little closer to reality on Monday after Hortonworks expanded its partnership with Concurrent to package the startup’s Cascading development framework into its flagship Hadoop distribution.
Available for free under an Apache license, Cascading serves as an abstraction layer between the batch processing platform and the applications that use it, allowing enterprise developers to tap into their organizations’ vast troves of unstructured information without getting bogged down by the inherent complexity of MapReduce.
“Building applications on top of Hadoop was very difficult. That’s why our founder Chris Wensel created a framework so you could have a separate business logic layer from the data layer, and it’s written in Java so any Java programmer can pick it up,” Gary Nakamura, the CEO of Concurrent, told SiliconANGLE in an exclusive interview on theCUBE at O’Reilly Fluent Conference 2013.
Cascading goes beyond making it easier to create data-driven applications. By supporting a broad range of enterprise technologies, including SQL and a number of popular data science tools, it aims to spare users from having to change the way they work. “The requirement for the enterprise is not to learn new skills for Hadoop but to leverage existing skills, existing systems and existing investments they already made in their infrastructure,” Nakamura explained. Upcoming versions of the framework will also include integration with Apache Tez, an emerging alternative to MapReduce that aims to deliver better performance and lower latency at large scale.
Tez runs on top of the YARN resource management and scheduling technology included in Apache Hadoop 2.0, which constitutes the core of the latest Hortonworks Data Platform (HDP) 2.0. Under the expanded partnership, the distributor is “guaranteeing the ongoing compatibility of Cascading-based applications across future releases” and offering customers dedicated support for the framework.
The partnership makes sense for both companies. Hortonworks is coming under increased pressure to deliver value higher up the stack, and enabling applications on top of Hadoop is one of the best possible ways of accomplishing that. Plus, the integration allows it to catch up with rivals Cloudera and MapR, which have long provided support for Cascading in their respective distributions.
The announcement is also good news for Concurrent. The company’s flagship framework is now compatible with all three major Hadoop distributions, making it easily accessible to the overwhelming majority of users. The partnership with Hortonworks is especially significant because the two firms have very similar business models: they both make their flagship products available at no charge and monetize their user bases through value-added solutions. But whereas the Yahoo! spin-off focuses exclusively on professional services, Concurrent sells complementary software such as its recently released Driven application performance management tool. Free while in beta, the cloud-based service provides visibility into data flows and program logic at runtime to enable test-driven development while allowing practitioners to keep tabs on information quality, according to the firm.