Would the real Hadoop ecosystem savior please stand up?
In the beginning, there was HDFS. Coupled with MapReduce, this budding file system offered great promise for consuming the Web, then parceling out the many indexes that would form the backbone of Yahoo’s search engine.
Then came the distros. They numbered six or seven before the dark days came. Then consolidation occurred. Major players dropped out. Many forces rallied around Hortonworks. Others called Cloudera home. And MapR struck out on its own.
But the innovation wave was far from over. Along came Spark! Once the brainchild of AMPLab, this versatile execution engine opened its doors to all manner of data. With an in-memory focus, it quickly found a sweet spot for machine learning and other heavy processing needs.
So powerful was Spark in the early days that mega-vendors like IBM threw all their Big Data eggs into its nifty basket. And though most consumers leverage the open-source version, a company called Databricks formed in order to harden this gemstone in the Cloud.
And then? Along came Kafka. Clearly sporting the coolest name of any software product ever built, this creation of LinkedIn took the Apache Software Foundation by storm! Enabling high-powered streams — virtual firehoses of data — Kafka upended traditional data architectures.
But there was still something missing. Mind you, in naming these agile open-source ventures, we’ve just covered more than $2 billion of investment. Nonetheless, there was still a gaping hole at the center of this data-driven vortex.
Not anymore. Thanks in part to our friends at the National Security Agency (yes, the NSA), we now have an engine that could bring all these pieces together. We have what might just be the computational Rosetta Stone, the traffic cop of all things data, large and small, fast and slow.
Say hello to Apache NiFi! Modestly billed as “an easy to use, powerful, and reliable system to process and distribute data,” this fascinating construct uses directed graphs to guide the flow of data. Remarkably, NiFi enables orchestration, design, security and analytics all at once.
Yes, it really is that cool. Granted, we’re in the early days. A stable release dates back only to August of 2016. But this puppy has the chops to sink its teeth into the epicenter of data management. It fills the void that was. And this journalist predicts it will dominate the future.