A guide for the perplexed

As it evolves, the cloud data warehouse market will likely be shaped by some of the same forces that shaped both the market for early business intelligence (BI) tools and the recent self-service BI market. 

On the one hand, this means that the PaaS data warehouse market will inevitably consolidate as providers get acquired, sometimes by larger competitors, or, alternatively, by companies that are outside of – and perhaps even orthogonal to – the PaaS space. (Folks with random-access memories will recall that direct-mail powerhouse Pitney Bowes came out of nowhere to acquire best-in-class data-quality provider Group 1 Software.) This will put pressure on providers to make acquisitions of their own, to shop themselves around as acquisition candidates, to pursue partnerships with much larger competitors, or, most ambitiously, to recast themselves as apex acquirers, adding customers and capabilities as they devour not only competitors but companies in adjacent markets.

On the other hand, it is conceivable that a number of topics which lack cachet or currency in today’s PaaS data warehouse market could see new emphasis as providers attempt to expand their presences inside large enterprise accounts. We saw this in the self-service BI segment, where Tableau, especially, altered its messaging to focus on topics to which it had once given surprisingly short shrift – for example, data governance and the complex of data quality, data lineage, data retention, and data standardization issues that are bound up with it.

What might this look like? Right now, for example, there is little pressure on PaaS data warehouse providers to address how their platforms slot into the role of the enterprise data warehouse: that is, as a central, authoritative, time-variant repository that permits a panoptic view across all business function areas. For the most part, customers seem to be OK with this: the focus – among providers and customers alike – seems to be on getting workloads into the cloud. Again, we saw something similar to this in the self-service BI space: in the first place, early- and mid-stage adopters did not have the same priorities as late-stage adopters; in the second place, the priorities of vendors and customers alike changed – sometimes radically – as self-service BI tools penetrated into and saw uptake across the enterprise as a whole.

Over time, then, we should expect that pressure from enterprise customers will force providers to address how the PaaS data warehouse functions in the role of the traditional enterprise data warehouse – be it as a single large system (as in classic data warehouse architecture) or as a federated archipelago of marts and repositories, knit together by a complex of data integration technologies (data virtualization services, metadata catalog services, etc.) into a virtual whole. This will put the onus on providers to develop, improve, and/or acquire different types of data integration capabilities, among them data preparation and engineering, metadata cataloging, data virtualization, data quality, data standardization, and data orchestration features. 

Another thing to keep in mind is that the data warehouse is usually the most connected system in the enterprise: in one way or another, most business processes – for example, new account/new customer creation; automated credit checking; customer financing; order fulfillment, etc. – depend on its data and analytics. In this way, the data warehouse functions at once as a kind of autonomic nerve center for the business and as an indispensable support for its higher-level brain functions: decision-making, forecasting, planning, etc. All day, every day, the applications, services, and workflows that undergird critical business processes consume the data and analytics served up by the data warehouse – usually without intervention by the people who populate these processes. (This is its autonomic dimension.) At the same time, managers, directors, and executives rely on data and analytics from the warehouse to support day-to-day decision-making and to assist with long-term business strategizing. The upshot is that providers who position the PaaS data warehouse as a replacement for on-premises systems – as the larger of two hemispheres, so to speak, in a hybrid data warehouse deployment, or as an enterprise data warehouse that lives solely in the PaaS cloud – must adapt their marketing and their product development efforts to address these scenarios. Over time, then, it is likely that large enterprise customers will pressure PaaS data warehouse providers to improve the ease with which their services interoperate with (e.g., can be accessed or invoked by) the apps, services, and workflows that undergird core business processes. This is easier said than done.

The marketing stories that providers tell will change to reflect these and other new (or different) emphases. The fact of the matter is that real-world constraints tend to tamp or tame the priorities and agendas of each of the vendors that compete in a market. To cite one common example, upstart players are primed to tell variations of a story in which the physical, economic, or technological constraints that previously had determined what was possible (in a market, in reality itself) are said not to apply to them. And this might be true, contextually, insofar as what the startup is selling is packaged to appeal to the early adopters that will fuel demand in the nascent market. Over time, however, the tendency in all markets is for known constraints and iron laws to reassert themselves. This is a function not just of economics or physics, but of human behavior.

These same laws and constraints shape individual markets, too, with the result that, over time, new or disruptive markets tend to get subsumed into larger, genericized markets. One way to look at this is as a kind of perforce, market-based rationalization: the new or different market that the disruptive thing was created to address gets genericized, commoditized. We saw this in the market for BI tools, which initially consisted of primitive ad hoc query interfaces. Over time, markets for standalone BI reporting and, subsequently, OLAP tools emerged, too. And so was born a market for the standalone BI suite, which, in a relatively short period of time, got subsumed by larger, more generic markets (namely, those for enterprise applications and databases). We saw this same cycle play out with the emergence of so-called data warehouse appliances in the early 2000s, and we saw it again, very recently, in the self-service BI space.

The priorities and agendas of the vendors in each of these markets changed radically over a relatively short period of time; the markets themselves changed, too, with the result that the “disruptive” features and capabilities associated with them became genericized. Odds are, we will see a repetition of this same cycle in the PaaS data warehouse space, too. Just give it time.

About Stephen Swoyer

Stephen Swoyer is a technology writer with more than 25 years of experience. His writing has focused on data engineering, data warehousing, and analytics for almost two decades. He also enjoys writing about software development and software architecture – or about technology architecture of any kind, for that matter. He remains fascinated by the people and process issues that combine to confound the best-of-all-possible-worlds expectations of product designers, marketing people, and even many technologists. Swoyer is a recovering philosopher, with an abiding focus on ethics, philosophy of science, and the history of ideas. He venerates Miles Davis’ Agharta as one of the twentieth century’s greatest masterworks, believes that the first Return to Forever album belongs on every turntable platter everywhere, and insists that Sweetheart of the Rodeo is the best damn record the Byrds ever cut.