By next year, Gartner Inc. projects that three-quarters of all database deployments will have shifted to the cloud. Gartner’s report did not speak specifically to deployments of the data warehouse in the cloud, but several indicators – from the sheer preponderance of cloud platform-as-a-service (PaaS) data warehouse offerings to the record-setting IPO of cloud data warehousing specialist Snowflake – suggest that the cloud is an attractive destination for most data warehousing workloads, too.
There are a few reasons for this, argued Bruno Aziza, head of data and analytics with Google Cloud.
“With big data and cloud coming together people realize the benefits of being able to deploy applications faster, scale them faster in a cost effective way, and also ramp them up and down and be agile in the way that, you know, nobody could do it before,” Aziza told analyst Eric Kavanagh during a recent episode of DM Radio. He cited a couple of other important impetuses for cloud adoption, starting with the fact that enterprise customers, especially, are more comfortable with cloud security.
Probably the biggest reason encompasses all of the others, Aziza suggested: the database-in-the-cloud is going native – as in cloud-native. “I think people are starting to realize … cloud infrastructure … can be a great engine of innovation and in fact, might be even more secure than my own environment,” he told Kavanagh. “In the last few years … there’s been a shift where people are just starting out in the cloud. We’re not talking about migration; we’re [talking about] starting with cloud.”
Cloud-native is a radically different way of thinking about software
Aziza hit on a couple of other important points. For one, many businesses are engineering software that accords with cloud-native design principles. The aim of cloud-native design is to produce software that is easier to deploy, manage, maintain, and scale. So, for example, cloud-native design emphasizes loose coupling between the components that make up an application, service, workflow, etc.
This is not new; in fact, it is consistent with established paradigms, such as SOA and REST. But cloud-native design is closely associated with microservices architecture, which gives priority to a technique called “decomposition”: i.e., breaking up application “monoliths” into their constituent functions.[1] In microservices design, developers aim to produce compact software components that provide basic functions. Microservices can call, or be called by, other microservices; dozens, hundreds, or thousands of microservices can be assembled and orchestrated to form larger, composite applications.
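As a rough illustration of decomposition, the sketch below splits a hypothetical checkout “monolith” into three small functions that reach one another only by name. The registry, the service names, and the order fields are all assumptions made for the example; in a real deployment each entry would be a separately deployed service called over the network rather than an in-process function.

```python
from typing import Callable, Dict

# Hypothetical in-process "service registry". It stands in for the network
# and service discovery that real microservices would use.
REGISTRY: Dict[str, Callable[[dict], dict]] = {}


def service(name: str):
    """Register a function under a service name."""
    def decorator(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        REGISTRY[name] = fn
        return fn
    return decorator


def call(name: str, payload: dict) -> dict:
    """Invoke a service by name; callers never import the implementation."""
    return REGISTRY[name](payload)


@service("pricing")
def pricing(payload: dict) -> dict:
    # One basic function: add up the line items.
    subtotal = sum(item["price"] * item["qty"] for item in payload["items"])
    return {"subtotal": subtotal}


@service("tax")
def tax(payload: dict) -> dict:
    # Another basic function: compute tax on a subtotal.
    return {"tax": round(payload["subtotal"] * payload["rate"], 2)}


@service("checkout")
def checkout(payload: dict) -> dict:
    # A composite service assembled from the two smaller ones.
    subtotal = call("pricing", {"items": payload["items"]})["subtotal"]
    tax_due = call("tax", {"subtotal": subtotal, "rate": payload["rate"]})["tax"]
    return {"subtotal": subtotal, "tax": tax_due, "total": round(subtotal + tax_due, 2)}
```

Because `checkout` knows only the names `"pricing"` and `"tax"`, either implementation could be replaced or redeployed without touching the caller, which is the loose coupling described above.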
Another benefit of cloud-native design is that it aims to be much more tolerant of ephemerality than traditional (or, as cloud-native proponents are apt to describe it, “monolithic”) software architecture.
In traditional design, software is imagined as monolithic, stateful, and always available. In cloud-native design, by contrast, there is the expectation that software should not only be stateless – i.e., isolated and self-contained, keeping no state of its own between calls – but ephemeral, too: the emphasis shifts from an always-fully-online to an event-driven execution paradigm. So, for example, a service calls another service, which may or may not be running; if it is not running, an orchestration platform starts and monitors the requested service. If the service crashes, the orchestration platform respawns it; when the service is no longer required, the platform shuts it down. As necessary, the platform spawns new instances of the service to accommodate upticks or surges in demand.
To sum up: in the cloud-native model, software gets called into being, runs, and (usually) terminates.
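The lifecycle just described (start on demand, respawn on failure, scale with demand, shut down when idle) can be sketched as a toy reconciliation loop. This is not any real orchestration platform’s API: the `Orchestrator` class, its method names, and the figure of ten requests per instance are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Instance:
    service: str
    healthy: bool = True


@dataclass
class Orchestrator:
    """Toy stand-in for an orchestration platform (hypothetical API)."""
    max_per_instance: int = 10  # assumed number of requests one instance absorbs
    running: Dict[str, List[Instance]] = field(default_factory=dict)

    def request(self, service: str) -> Instance:
        """Route a call, starting the service first if nothing is running."""
        instances = self.running.setdefault(service, [])
        if not instances:
            instances.append(Instance(service))
        return instances[0]

    def reconcile(self, service: str, demand: int) -> int:
        """Respawn crashed instances, then scale to demand (0 means shut down)."""
        instances = self.running.setdefault(service, [])
        # Respawn: replace every instance that failed its health check.
        instances[:] = [i if i.healthy else Instance(service) for i in instances]
        # Scale: ceil(demand / max_per_instance) instances are wanted.
        wanted = -(-demand // self.max_per_instance)
        while len(instances) < wanted:
            instances.append(Instance(service))
        del instances[wanted:]
        return len(instances)
```

A caller would use `request("reports")` to reach a service that may not be running yet, while the platform periodically runs `reconcile("reports", demand)` with the currently observed demand.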
The cloud-native database is a radical departure from the status quo
The cloud-native database is no exception. It, too, is designed to be tolerant of ephemerality. In the on-premises environment, the database qua database was deployed as a fixed quantity: it consisted of one or more nodes, each configured with a finite amount of compute and storage. It was always on, its capacity always provisioned for peak demand, its resources always consuming power, even when it was not being used.
In cloud-native design, the database qua database is not just an elastically scalable resource but an event-driven one: an application or service calls it into being to perform a specific function – as in the query-as-a-service (QaaS) use case – after which it shuts down again. Far from being a software monolith, or a cupidinous resource hog squatting atop a massively over-provisioned hardware stack, the cloud-native database is gracile. To the applications that call it, it still appears always online and always available, but – like consumer electronics gear that operates in low-power “standby” mode – it ramps up its power in response to event-driven stimuli, spawning additional OLTP, query processing, analytic processing, etc. resources as needed.
This is the stuff of radical difference.
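To make the “standby” behavior concrete, here is a minimal sketch that uses Python’s built-in `sqlite3` module as a stand-in: a file plays the role of durable storage, and an open connection plays the role of compute that exists only while queries arrive. The `OnDemandWarehouse` class, the 60-second idle timeout, and the explicit `now` argument are assumptions for illustration; a real cloud-native database does this with separately provisioned compute and object storage, not a local file.

```python
import sqlite3
from typing import List, Optional, Sequence, Tuple


class OnDemandWarehouse:
    """Hypothetical sketch: storage persists; compute exists only while needed."""

    def __init__(self, path: str, idle_timeout: float = 60.0):
        self.path = path                  # durable storage (a local file here)
        self.idle_timeout = idle_timeout  # assumed auto-suspend threshold, seconds
        self._conn: Optional[sqlite3.Connection] = None  # None means suspended
        self._last_used = 0.0
        self.resumes = 0                  # how often compute was started

    def query(self, sql: str, params: Sequence = (), now: float = 0.0) -> List[Tuple]:
        """An incoming query is the event that brings compute into being."""
        if self._conn is None:
            self._conn = sqlite3.connect(self.path)
            self.resumes += 1
        self._last_used = now
        with self._conn:  # commits on success, so writes reach storage
            return self._conn.execute(sql, params).fetchall()

    def tick(self, now: float) -> bool:
        """Suspend compute after the idle timeout; return True if still running."""
        if self._conn is not None and now - self._last_used >= self.idle_timeout:
            self._conn.close()
            self._conn = None
        return self._conn is not None
```

The caller never starts or stops anything explicitly: it calls `query(...)`, and a scheduler (assumed here) calls `tick(...)` periodically. Data written before a suspension is still there when the next query resumes compute.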
“In the cloud … you’re basically snapping your fingers, and you’ve got additional resources,” deadpanned Teradata Chief Technology Officer Stephen Brobst, in response to a question from Kavanagh. Brobst was referring, first, to the ease of provisioning virtual hardware and software capacity in the cloud; however, he had in mind another advantage of cloud infrastructure – and, by implication, of cloud-native software: namely, the accessibility of discrete, function-specific services.
That is, rather than designing and maintaining their own ETL, QaaS, analytic processing, etc. services, cloud customers can design apps or workflows that exploit the different kinds of pre-fab services exposed by Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, etc.
What is more, most cloud providers are designing their PaaS and software-as-a-service products to exploit these services, too, Brobst said. In cloud-native design, he told Kavanagh, “you’re part of an ecosystem and you need to play in the ecosystem, and you need to leverage … the native object store provided by the CSP. You need to leverage the security infrastructure. [A]s the customer, you don’t want to build all that stuff yourself[; rather,] you want to leverage what the CSPs have provided.”
This cloud-native “feature” is also, potentially, a bug, Brobst cautioned: “You really have to look at … what is the trade-off in terms of the technical debt I accumulate if I use a cloud-native service … tied into a particular cloud platform versus what is the speed and agility I get … by leveraging that” service.
Cloud-native design brings it all back home – to core business and IT services
Granular abstraction, the final ingredient of the cloud-native cocktail, is perhaps the most important.
Abstracted services can be made more or less granular, such that they expose different kinds of role- or persona-based functions and interfaces that correspond to the needs and expectations of the people (or machines) that consume them. So far as a line-of-business worker is concerned, the services she interacts with need only expose the business features, functions, and settings she uses in her work, with a minimum of configuration. Her user experience is characterized by simplicity and ease-of-use; ideally, it should promote a sense of contentment, rather than frustration or resentment.
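One way to picture granular abstraction is as a set of role-scoped views over a single underlying service. In the sketch below, the setting names, the three roles, and the `ServiceView` class are all hypothetical; the point is only that each persona sees and changes the subset of settings relevant to its work.

```python
from typing import Any, Dict, Set

# Hypothetical full configuration surface of a managed data warehouse.
SETTINGS: Dict[str, Any] = {
    "warehouse_size": "medium",
    "auto_suspend_seconds": 300,
    "query_timeout_seconds": 120,
    "default_dashboard": "sales",
}

# Assumed mapping from persona to the settings that persona is shown.
ROLE_SCOPE: Dict[str, Set[str]] = {
    "dba": set(SETTINGS),
    "analyst": {"query_timeout_seconds", "default_dashboard"},
    "business_user": {"default_dashboard"},
}


class ServiceView:
    """A role-scoped view over the same underlying settings."""

    def __init__(self, role: str, settings: Dict[str, Any]):
        if role not in ROLE_SCOPE:
            raise ValueError(f"unknown role: {role!r}")
        self._scope = ROLE_SCOPE[role]
        self._settings = settings

    def visible(self) -> Dict[str, Any]:
        """Return only the settings this role is meant to see."""
        return {k: v for k, v in self._settings.items() if k in self._scope}

    def update(self, key: str, value: Any) -> None:
        """Change a setting, but only if it is exposed to this role."""
        if key not in self._scope:
            raise PermissionError(f"{key!r} is not exposed to this role")
        self._settings[key] = value
```

A `ServiceView("business_user", settings)` exposes a single knob, while `ServiceView("dba", settings)` exposes all of them; both operate on the same shared `settings` dictionary.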
Here, again, the database in the cloud is no different. Nine years ago, Snowflake introduced a cloud-native PaaS that radically simplified the work of creating, managing, and maintaining a data warehouse. Snowflake was – and is – designed for technical users in different roles, from DBAs and data warehouse architects to business analysts, data scientists, software engineers, and other less database-savvy types. Highly technical users, such as DBAs, can access low-level settings to tweak the configuration of the Snowflake data warehouse; less technical users can delegate to Snowflake’s PaaS the work of sizing, creating, deploying, and (at least in part) managing the data warehouse.
Even though Snowflake got a head start, other providers are catching up.
Today, for example, instead of installing, configuring, managing, and maintaining the RDBMS used to support the apps and services she builds, a software engineer can exploit the comparative simplicity of the cloud database-as-a-service paradigm. The database-as-a-service automates the creation of the RDBMS and exposes the features, functions, and settings the engineer needs to do her job.
“[C]oming from more of an application-development background, I really just want to get in and use the database to handle, you know, the workloads however I need to, and, you know, be able to go about my business from the application development side,” commented Rob Hedgpeth, director of developer relations with MariaDB. “With MySQL as a fully managed service and a database-as-a-service in the cloud … we’re really starting to hone in on something that developers have … keyed-in on over the last several years, which is this idea of … really balancing abstraction to where … you want those [configuration] knobs and levers available so you can just be dangerous enough with the database, but still be able to get back to … the solutions that you’re building, in order to … get the job done.”
[1] Decomposition was an integral concept in SOA, too; microservices architecture is an extreme application of the concept.
About Vitaly Chernobyl
Vitaly Chernobyl is a technologist with more than 40 years of experience. Born in Moscow in 1969 to Ukrainian academics, Chernobyl solved his first differential equation when he was 7. By the early-1990s, Chernobyl, then 20, along with his oldest brother, Semyon, had settled in New Rochelle, NY. During this period, he authored a series of now-classic Usenet threads that explored the design of Intel’s then-new i860 RISC microprocessor. In addition to dozens of technical papers, he is the co-author, with Pavel Chichikov, of Eleven Ecstatic Discourses: On Programming Intel’s Revolutionary i860.