Concurrent aims to smash the Hadoop “black box”
Lucy Carey, JAXenter
February 20, 2014
Traditionally, one of the biggest productivity snags for app developers has been the Hadoop “black box” scenario, which makes it difficult to decipher what’s actually going on inside their project. Stepping up to solve this problem are Concurrent, with a (currently beta) solution called Driven, billed as “the industry’s first application performance management product” for Big Data apps.
Concurrent are probably best known for Cascading – a highly extensible Java app framework which makes it quick and easy for devs to tool rich Data Analytics and Data Management apps to deploy and manage across diverse computing environments. CEO Gary Nakamura describes it primarily as, “a tool that does the heavy lifting and converts your application logic built on Cascading into MapReduce jobs so that the developer doesn’t have to deal with the low level assembly language of Hadoop.”
On top of this, it delivers a computation engine, systems integration, data processing and scheduling capabilities through common interfaces, and runs on all popular Hadoop distributions – though it’s also capable of extending beyond the elephant ecosphere.
Although Hadoop may have lost some of its lustre in recent months, Gary is upbeat about the technology’s future, and sees the dimming of the “buzz” around it as a good thing.
According to Gary, “Hadoop is maturing, and the user behavior is maturing along with it. Enterprises are taking a more pragmatic approach to formulating their data strategy and looking for the right tools to ensure success.” Whilst the Hadoop ecosystem is still quite “convoluted and confusing”, he believes that “those that have taken a deliberate and informed approach are the ones that are succeeding.”
He adds that, “enterprises have moved on to building data applications on their Hadoop investments and are driving business process and strategy through these data applications.” It’s this new wealth of data and applications that will underpin innovation and business advantage going forward, he affirms, “not necessarily the fabric that an application runs on.”
For this reason, Concurrent are confident that 2014 is the year for enterprise data applications to come to the fore, as Hadoop-based apps become “business critical and essential for businesses to move forward.”
Cascading is well placed for this scenario, with a number of differentiators from rival offerings that give devs that all-important productivity edge when building robust data applications.
As Gary puts it, “With Cascading, instead of becoming an expert in MapReduce, you can now leverage existing skill sets such as enterprise Java, Scala, SQL, R, etc.” Moreover, he says, Cascading’s local mode enables test-driven development practices where developers can efficiently test code and processes on their laptops before deploying on a Hadoop cluster.
On to the newer offering – Driven, which was inspired by “years of painstaking experience with building enterprise data applications.” Its raison d’être is to make the process of developing, debugging and managing Cascading apps that bit more painless, as well as allowing for easy management of production data applications.
It’s been a long time in the making, with the key challenge being the wait for the market to mature enough for it to be a viable product, rather than any specific development hurdles. For Concurrent, the magic time for “broad applicability” of Driven has now come.
Gary cites the key “Driven difference” as being a significantly faster timeline – up to ten times, in fact – for enterprise data applications to reach production. With Driven’s real time app visualisation, instant app performance analytics, as well as data management and monitoring capabilities, Gary thinks that users will find they have “unprecedented visibility into your applications that don’t exist in the Hadoop ecosystem today.” Most crucially, this should circumvent the aforementioned “black box” scenario.
Although Concurrent don’t currently see any direct competitors for their tools, there are “indirect entities” with which they compete – for example, the open-source tools in the Apache Hadoop ecosystem.
Much of Cascading’s popularity to date is something that Gary puts down to the “attrition” of developers using Pig and Hive. For now, the hope is that devs will ultimately adopt Cascading and Driven (once it moves out of beta status) as a tandem productivity enhancement solution, continuing this upward trajectory.