January 8, 2015
With the New Year finally upon us it seems as good a time as any to ask where Hadoop, the open-source Big Data framework, will be heading in 2015.
SiliconANGLE pulled forecasts from an assortment of analysts and industry experts who’ve tried to second-guess the next big developments in Hadoop, and the overwhelming consensus is that adoption will accelerate within the enterprise, as more businesses build smart applications with real-time data analysis capabilities atop the platform.
1. More market consolidation
Only ten years have passed since Google published its MapReduce whitepapers, notes MapR CEO and Co-founder John Schroeder, which means Hadoop is still at a relatively youthful stage of the technology maturity life cycle. What we can expect to see throughout 2015 is Hadoop enter a period of consolidation, with the number of vendors fighting for a piece of the action narrowing down.
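The MapReduce model Schroeder dates Hadoop from is simple enough to sketch in plain Python. The toy below is illustrative only, not how Hadoop itself is implemented: a map step emits key-value pairs, and a reduce step aggregates them per key, here counting words across documents.

```python
from collections import defaultdict

def map_phase(documents):
    """Map step: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    """Reduce step: sum the emitted counts for each distinct word."""
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

docs = ["Hadoop scales out", "Hadoop stores big data"]
word_counts = reduce_phase(map_phase(docs))
print(word_counts["hadoop"])  # 2
```

In a real cluster the map and reduce phases run in parallel across many machines, with a shuffle step routing each key to one reducer; the batch-oriented nature of that flow is exactly what Schroeder says competition is pushing Hadoop beyond.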
“Hadoop is early in the technology maturity life cycle,” said Schroeder. “In 2015, we will see the continued evolution of a new, more nuanced model of OSS to combine deep innovation with community development. The open-source community is paramount for establishing standards and consensus. Competition is the accelerant transforming Hadoop from what started as a batch analytics processor to a full-featured data platform.”
Saggi Neumann, CTO of Xplenty, agreed, telling SiliconANGLE that: “2015 will see more Big Data acquisitions and buyouts than ever before – in 2014, we witnessed the acquisitions of XA Secure, Hadapt, RainStor, DataPad and a few others. Both Cloudera and Hortonworks are now billion-dollar companies and are eagerly looking to acquire more brains and technologies. Other giants such as HP, IBM, Oracle, Pivotal and Microsoft are knee-deep in Hadoop business and we’ve yet to see the end of M&As in the category.”
2. Enterprise adoption to gather pace
Forrester Research has already put its reputation on the line and said Hadoop will become an enterprise priority in 2015, and the experts tend to share that opinion. According to Gary Nakamura, CEO of Concurrent, Inc., Hadoop is all set to become a “worldwide phenomenon” in 2015.
“Hundreds of thousands of data points reported from the Cascading ecosystem support the notion that Hadoop is rapidly spreading across Europe and Asia and soon in other parts of the world,” said Nakamura. “Therefore, there will be a strong Hadoop adoption next year for enterprises ramping up their data strategy around Hadoop, creating new jobs, and further disrupting the data market worldwide.”
Laurent Bride, CTO of Talend, said a growing number of enterprises will begin deploying Hadoop in more than just proof-of-concept environments. “Hadoop will be used for day-to-day operations,” he said. “Organizations are still exploring how best to adopt Hadoop as the primary data warehouse technology. But as Hadoop is used more and the capabilities of YARN become fully realized, more useful opportunities leveraging technology like Apache Spark and Storm will emerge and quickly increase its potential. Even now, real-time/operational analytics are the fastest moving part of the Hadoop ecosystem, and it’s becoming evident that by 2020 Hadoop will be relied on for day-to-day enterprise operations.”
3. SQL to become a “must-have” with Hadoop
SQL, the data querying language so popular with developers, will become one of the most important applications for Hadoop, reckon the experts.
“Fast and ANSI-compliant SQL on Hadoop creates immediate opportunities for Hadoop to become a useful data platform for enterprises,” said Mike Gualtieri of Forrester Research. This will provide a sandbox for analysis of data that is not currently accessible.
Mike Hoskins, CTO at Actian, agreed, telling SiliconANGLE that: “SQL will be a ‘must-have’ to get the analytic value out of Hadoop data. We’ll see some vendor shake-out as bolt-on, legacy or immature SQL on Hadoop offerings cave to those that offer the performance, maturity and stability organizations need.”
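The appeal Gualtieri and Hoskins describe is that analysts can point queries they already know at data in Hadoop. The sketch below uses Python’s built-in sqlite3 purely as a stand-in for a SQL-on-Hadoop engine such as Hive or Impala; the clickstream table and its rows are invented for illustration.

```python
import sqlite3

# sqlite3 stands in here for an ANSI-SQL-on-Hadoop engine such as
# Hive or Impala; the schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (user_id TEXT, page TEXT, ms INTEGER)")
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?, ?)",
    [("u1", "/home", 120), ("u1", "/cart", 340),
     ("u2", "/home", 95), ("u3", "/home", 210)],
)

# The same ANSI SQL an analyst already writes every day, now
# notionally pointed at data stored in Hadoop.
rows = conn.execute("""
    SELECT page, COUNT(*) AS visits
    FROM clicks
    GROUP BY page
    ORDER BY visits DESC
""").fetchall()
print(rows)  # [('/home', 3), ('/cart', 1)]
```

The point of ANSI compliance is precisely that a query like this runs unchanged whether the engine underneath is a traditional database or a Hadoop cluster.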
4. No more Hadoop skills shortage
One of the more surprising developments Forrester expects is that the Hadoop skills shortage, which has been so well documented in the last couple of years, will evaporate in 2015. “CIOs won’t have to hire high-priced Hadoop consultants to get projects done,” noted Forrester’s report. “Hadoop projects will get done faster because the enterprise’s very own application developers and operations professionals know the data, the integration points, the applications and the business challenges.”
Key to this will be SQL on Hadoop, Gualtieri said, as it will open the door to familiar access to Hadoop data. Meanwhile, commercial vendors and the open-source community alike are building better tools to make Hadoop easier for everyone to use.
5. Architecting around Hadoop
Cloudera chief technologist Eli Collins told SiliconANGLE he expects to see more users “architecting around Hadoop,” for example by using it as an Enterprise Data Hub rather than just for bespoke operations. He also expects more users to consume Hadoop embedded within larger applications.
“Analytics is becoming an important part of a lot of applications, often a core part of the application itself so we’ll continue to see more of this,” said Collins. “Customers are applying Hadoop across a lot of industries as they’ve been doing for the last several years, they’re just adopting it more extensively as the platform becomes more capable, more accessible and better integrated with the other technologies they use.”
6. Rise of Hadoop + real-time analytics
As enterprises increase their contribution to the Hadoop ecosystem’s rising growth, and as Hadoop becomes a more attractive alternative to traditional database vendors, the demand for real-time and transactional analytics will rise significantly in 2015, said Ali Ghodsi, head of product management and engineering at Databricks.
“In 2015, enterprises will continue to evolve from the initial incarnation of making use of data through offline operations and significant manual intervention, to one in which organizations will make decisions on streaming data itself in real-time, whether that be through anomaly detection, internet of things, etc.,” said Ghodsi.
“Enterprises will need infrastructures that can scale and ingest any type and size of data from any source and perform a variety of advanced analytics techniques to identify meaningful insights in the necessary amount of time to make an impact on the business. The rise of compatible processing engines such as Apache Spark will further enable Hadoop to help address these needs. This year, the approach to analytics will become more predictive than operational and relational.”
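Ghodsi’s example of anomaly detection on streaming data can be made concrete with a toy sketch. In production this logic would run inside an engine such as Apache Spark; the plain-Python version below, with an invented sensor feed and threshold, just shows the shape of the idea: flag any reading that deviates sharply from the rolling statistics of recent values.

```python
from collections import deque
import math

def detect_anomalies(stream, window=5, threshold=3.0):
    """Flag values more than `threshold` standard deviations away from
    the rolling mean of the previous `window` readings."""
    history = deque(maxlen=window)
    anomalies = []
    for value in stream:
        if len(history) == window:
            mean = sum(history) / window
            variance = sum((x - mean) ** 2 for x in history) / window
            std = math.sqrt(variance)
            if std > 0 and abs(value - mean) > threshold * std:
                anomalies.append(value)
        history.append(value)
    return anomalies

# Hypothetical sensor readings with one obvious spike.
readings = [10, 11, 10, 12, 11, 10, 95, 11, 10]
print(detect_anomalies(readings))  # [95]
```

The same per-event logic, distributed across a cluster and fed by a message queue, is what real-time decisioning on streaming data looks like in practice.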