Every year we highlight 10 companies and technologies to watch for the coming year. Our selection is driven primarily by the technologies being distinctive, innovative and relevant to major trends in the industry that we follow. Here is our list, arranged in alphabetic order to avoid suggesting that we have ranked the chosen companies and technologies:
Actian: Actian grew its product portfolio significantly last year. It completed its acquisition of Pervasive Software and, in doing so, added a whole suite of integration software along with another database, PSQL, and the very impressive DataRush parallel analytics platform (now called ParAccel Dataflow). Later in the year it acquired ParAccel, adding a scale-out Big Data analytical database to its portfolio of databases, which already included Vectorwise, now called ParAccel SMP. This has become a very capable and interesting software portfolio.
Ayasdi: You could describe Ayasdi as “extreme machine learning.” At least that’s the way I prefer to think of it. Ayasdi takes a big collection of data (Big Data if you like) and applies a variety of machine learning algorithms to the collection. This yields a geometric/topological picture of the data, which Ayasdi claims can reveal all the important characteristics of the data collection. The insights about all the data groups and sub-groups are, in theory at least, baked in. Topological data analysis is, in our view, a revolutionary method for analyzing and discovering important relationships within data sets.
Alpine Data Labs: One of the rapidly emerging dynamics in the analytics market is the rush to create an end-to-end analytics capability suited to business analyst level staff rather than requiring “data scientists.” This is the capability that Alpine Data Labs seeks to deliver via the cloud. End-to-end analytics is not so easy to deliver, since it spans everything from data access through data cleansing and transformation to the application of data algorithms and, in Alpine Data Labs’ case, machine learning algorithms – all the way to implementing results. We see this as a very competitive section of the market and Alpine Data Labs as an interesting player.
Calpont/InfiniDB: You could claim that 2013 was the year of the Big Data database. I think we were briefed on more “new” or revamped databases last year than on any other kind of product. The opportunity is clear in that the old, tired relational database is slowly becoming obsolete and there is, as yet, no clear indication as to whether it will be superseded by any particular database model. Among the products we’ve been briefed on, one of my favorites is Calpont’s InfiniDB. I like its scale-out column store architecture, and I particularly like the way that it can be implemented over Hadoop. It’s a competitive market, but I believe InfiniDB will do well in it.
Cirro: There’s something neat and innovative in what Cirro does. You could call it a data federation product if you wanted, but I prefer to think of it as a data optimization product, similar to data virtualization products but working in quite a different way. Cirro can spread a query across multiple data sources and use the local processing capability located where each of the sources resides to help resolve the query. This is distributed optimization, and in our view, it’s an idea whose time has come. Conceptually, in many circumstances it really does make more sense to take the processing to the data, rather than take the data to the processing, especially when the data is Big.
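The idea of taking the processing to the data can be illustrated with a minimal sketch. It is not Cirro's implementation; the function names `local_aggregate` and `federated_sum` and the sample data are hypothetical. Each source computes a small partial aggregate where its data lives, and only those partial results travel to the coordinator to be merged.

```python
from collections import Counter

def local_aggregate(rows, key, value):
    """Runs at a data source: sum `value` per `key` over the local rows only."""
    totals = Counter()
    for row in rows:
        totals[row[key]] += row[value]
    return totals

def federated_sum(sources, key, value):
    """Runs at the coordinator: merge the small partial results from each source.

    In a real system each call to local_aggregate would execute remotely;
    here the sources are plain in-memory lists (an assumption for illustration).
    """
    merged = Counter()
    for rows in sources:
        merged.update(local_aggregate(rows, key, value))  # adds per-key totals
    return merged

# Two hypothetical sources holding sales rows for the same regions.
warehouse = [{"region": "east", "amount": 10}, {"region": "west", "amount": 5}]
hadoop = [{"region": "east", "amount": 7}, {"region": "north", "amount": 3}]

totals = federated_sum([warehouse, hadoop], "region", "amount")
```

Only one total per region crosses the network from each source, rather than every row, which is where the saving comes from when the data is large.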
Hadoop – YARN: YARN stands for Yet Another Resource Negotiator. It doesn’t sound so revolutionary, does it? And yet it is. YARN changes Hadoop for the better and forever. We have reached the stage where people are beginning to ask sensible questions about Hadoop rather than just get irrationally excited about it. Hadoop is indeed useful and it does indeed have a role to play, but that role is not going to be large unless its resources can be managed by something that looks more like an operating system than a job scheduling capability. This, we believe, is what YARN will become. It has cut the link between HDFS and MapReduce, a link that desperately needed cutting, and it has made Hadoop far more versatile than it was.
NuoDB: Among the recent crop of new and relatively new database products is one which is distinctly different in quite a few ways. First of all it targets OLTP usage. “Big Data” is 90 percent about analytics. Nevertheless there are Big OLTP problems that many companies would like to address and NuoDB targets those problems precisely. We do not have sufficient space here to provide much detail except to say that it is built to be fully distributed (on a global basis) and to be very fast indeed. Its architecture is worth attention and, some time in 2014, we believe there will be a new release of the product that will also target Big Data query workloads. That could be interesting.
SiSense – IBM (BLU Acceleration): We mention SiSense and IBM together because they are on this list for the same reason. Both have built software products to exploit in-chip capabilities. With SiSense it’s their BI product. With IBM it’s DB2. The point is that nowadays x86 chips have quite a lot of data space available in L1, L2 and L3 cache and it’s just a lot faster to process the data there if you can. You can also use the vector instructions that x86 chips have available to process more than one item of data in a single instruction (which Actian’s ParAccel SMP database does). This is a trend that we expect to blossom going forward. With data compression, there’s a particular advantage in keeping the data compressed until it is on the chip and then decompressing and processing it. This can be very fast.
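The compression point can be sketched roughly as follows. This is an illustration of the general pattern only, not how SiSense or DB2 BLU work internally: the block size, function names and use of zlib are all assumptions. The column stays compressed in memory, and only one small block at a time is decompressed and processed. Python cannot show the cache effects themselves, only the shape of the approach.

```python
import zlib
from array import array

BLOCK_ROWS = 4096  # hypothetical block size standing in for "small enough to fit in cache"

def compress_column(values, block_rows=BLOCK_ROWS):
    """Split a column of integers into blocks and compress each block separately."""
    blocks = []
    for start in range(0, len(values), block_rows):
        chunk = array("q", values[start:start + block_rows])  # 64-bit signed integers
        blocks.append(zlib.compress(chunk.tobytes()))
    return blocks

def sum_column(blocks):
    """Decompress one block at a time and aggregate it before touching the next."""
    total = 0
    for block in blocks:
        chunk = array("q")
        chunk.frombytes(zlib.decompress(block))  # only this block is ever uncompressed
        total += sum(chunk)
    return total

values = list(range(10_000))
blocks = compress_column(values)
column_total = sum_column(blocks)
```

The full uncompressed column never exists in memory during the scan, so less data has to move between main memory and the processor for the same result.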
VelociData: When I think about processing speed nowadays I often think about VelociData. Its technology is both a hardware innovation and a software innovation. VelociData builds appliances from “commodity chips” – not just CPUs, but GPUs and FPGAs. If you leverage the power of all three species of silicon you can deliver enormous amounts of parallel processing power. That’s what VelociData’s appliances deliver. They target ETL workloads and can reduce the execution time of those workloads by true “orders of magnitude,” not just 10x but anywhere between 100x and 1000x. This is dramatic and disruptive to the point where most large businesses should at least be aware of the product.
VisionWaves: VisionWaves is a product which, in my opinion, should have existed a long time ago – in the sense that somebody should have taken this approach long ago. VisionWaves can be described as a model-driven environment for building enterprise-level capability. There is nothing special in that per se; pretty much all BPM products can claim to answer to that description. What makes VisionWaves unique is that it is built to satisfy real corporate goals determined by long-recognized management best practices. The software is additive to existing capability and built to allow the organization to gradually implement known effective business management and business process practices.
Keep your eyes on these products. We may cover some of them in our 2014 research program, Big Data Information Architectures, which will examine the whole area of Big Data but focus most strongly on databases.