How Concurrent hopes to help ID Hadoop bottlenecks

Nancy Gohring, IT World
February 4, 2014
http://www.itworld.com/cloud-computing/403167/how-concurrent-hopes-help-id-hadoop-bottlenecks

February 04, 2014, 4:00 AM — Concurrent, the company behind Cascading, the application framework for building big data apps, is now hoping to take the pain out of managing big data apps.

Concurrent today is releasing a beta of a product called Driven, which lets developers identify problems that are holding up their Hadoop applications.

“Enterprise developers will be able to see their apps running and understand where their apps are failing, down to the line of code,” said Gary Nakamura, CEO of Concurrent.

He hopes Driven will solve a problem that is eating up loads of time for developers. “Hadoop is a complete black box,” he said. “In order to find out what happened to an app running on Hadoop, you have to scrape logs from thousands of nodes and look for a needle in a haystack. That’s often an untenable task that takes weeks or months,” he said.

In fact, companies are now filling positions for a person whose entire job is to “stare at Hadoop clusters all day long and make life or death decisions,” said Chris Wensel, CTO of Concurrent.

Complicating matters is that when administrators look at Hadoop jobs, they may see one job that in fact includes multiple jobs. Without insight into those jobs, an administrator may kill a job because it’s clear it’s operating poorly and unknowingly kill an important job that’s been running for a day.

With Driven, users will be able to see not only individual jobs but also who owns them. So if an admin finds a job that’s behaving badly but it’s part of a very important, time-sensitive project that’s nearly finished, the admin can decide to let it go.

“Or if it turns out that an intern wrote some bad code, you can kill it and go talk to that intern,” Wensel said.

Driven will show enterprise developers where slow-downs are occurring so that they can fix the problem. It also includes some collaboration features designed to make it easy for developers, data scientists and operators, all of whom may work together on an app, to share views of the app in order to discuss areas that might need improvement.

Driven is a cloud service that can work on Hadoop applications running internally or in a public cloud. For now, it supports Cascading apps, which Concurrent says means it’ll be available for multiple thousands of app deployments. It’s free to use in a development environment now; a version for commercial deployments, which will include support, is scheduled to launch in the second quarter.

It’s a safe bet that more products like this will appear, given the growth of big data and the emerging challenges enterprises face in managing it.