Thursday, 12 May 2016

Is Hadoop losing its spark?

A 2015 survey by Gartner Inc. revealed that only 18 percent of respondents expressed a desire to either try out or adopt Hadoop in the next few years. It is not the only report to suggest that Hadoop’s star is fading.

Newer big data frameworks such as Spark have started to gain momentum and, according to the Apache Software Foundation, companies are running Spark on clusters of thousands of nodes, with the biggest cluster encompassing nearly 8,000 nodes. Although many people rushed to write Hadoop’s obituary, market research firm MarketAnalysis.com announced in its June 2015 report that the Hadoop market was projected to grow at an annual rate of 58 percent, surpassing $1 billion by the year 2020.

The debate over whether Spark is meant to replace or enhance Hadoop is still ongoing, with a third group of professionals claiming that Spark and Hadoop should be used together for stronger analytics and storage capabilities. The reality is that companies can take advantage of Hadoop’s capabilities if they integrate Spark with it: Hadoop enables Spark workloads to be deployed on available resources anywhere in a distributed cluster and eliminates the need to manually track individual tasks.

Data expert and best-selling author Bernard Marr explained in a Forbes article that many vendors offer both Spark and Hadoop and advise companies on which of the two they will find more suitable. Marr pointed out that even though Spark is developing very quickly, its security and support infrastructure is not as advanced as Hadoop’s, since Spark is still in its infancy.

“Hadoop was madness”


JAX Finance speaker John Davies told JAXenter.com that to him “Hadoop was madness”. “Apache Spark is part of the way back to common sense but much of the big data we have today is because we’re making the data bigger than it needs to be, we’ve been lazy. By making the data smaller, leaner and faster (Fast Data) we can run Spark several orders of magnitude faster than Hadoop with a fraction of the work and complexity to get there.”
