How MDD works: conceptual, logical, and physical models and the RDBMS

If you grok relational databases, and, specifically, if you grok the difference between an RDBMS’s conceptual, logical, and physical data models, you grok the basics of model-driven development, or MDD.

In MDD, developers design models that are high-level abstractions of the objects, relationships, rules, etc. that are applicable to each business domain. An MDD tool processes these models and generates code in a supported programming language. This is similar to the way a data modeling tool uses a logical data model as a blueprint for generating SQL or DDL scripts that are optimized for specific database systems. In both cases, modeling software translates from a higher (i.e., a logical or conceptual model) to a lower (a physical model) form of abstraction

In MDD terms, a conceptual data model is a high-level representation – an abstraction – of the business and its operations. Put simply, it is the business’s representation of itself to itself – expressed, ideally, using terms and concepts that are lucid to the business: typically, business stakeholders themselves design a conceptual data model, usually with little or no input from architects, data modelers, or similar IT experts. By contrast, the logical data model, which is based on the conceptual model, is a detailed representation of business entities, their attributes, and the relationships that obtain among them. With reference to model-driven development, a logical data model could be seen as analogous to the totality of all MDD models across all business domains: i.e., it encapsulates detailed knowledge about the business and each of its distinct domains. The logical model is the collaborative product of input from business stakeholders and subject matter experts, business analysts, data modelers, architects, etc. Finally, the physical data model adapts this logical representation for a specific database target, such as Oracle or SQL Server. It defines tables, rows, columns, rules, and other core structures. Think of it as analogous to the code generated by an MDD tool.

In an ideal world, a business would design a conceptual data model and be done with everything. This Ur conceptual data model would in turn be used to generate a corresponding logical data model, which would be used to generate a physical model for a specific target database. Fini. In a slightly less ideal world, business and IT people might make a few changes to the resultant logical data model – e.g., modifying attributes or relationships – and use it to generate optimized physical structures in a target RDBMS. The point is that the conceptual data model, the business’s representation of itself to itself, would be used to generate the structure of the business that is instantiated in the database. The correspondence between business reality and business logic would be identical. Decades ago, visionaries imagined a future in which business change could be accommodated at just such a conceptual level. In this scheme, databases could easily be refactored to accommodate technological change – including, critically, database migrations. A single conceptual model could be used to generate physical models for DB2, Oracle, PostgreSQL, Redshift, SQL Server, Teradata, and so on.

This utopia did not come to pass – exactly. RDBMSs do not incorporate actionable conceptual data modeling facilities, and IT and business people cannot use them to create, modify, or port underlying logical and physical models. At best, conceptual data modeling provides a starting point for planning and discussion, especially in data warehouse design. What is more, even logical data models are only so actionable and just so portable, in spite of what vendors claim.

In other words, 40 years on, the data modeling revolution is, alas, incomplete.

Except, maybe, in the discipline of data warehouse design and maintenance.

Data warehouse automation and LCDP

In the data warehousing space, the closest thing to MDD is probably data warehouse automation, or DWA. What is more, if you grok DWA, you grok the basics of the low-code development platforms (LDCP) that are sometimes used as complements to MDD.

Unfortunately, DWA tools do not permit businesses to create conceptual data models that can automagically be tasked with deriving logical and physical data models. (Although at least one DWA vendor – Magnitude Software, the former Kalido – does achieve something proximal to this: call it a business-centric approach to logical data modeling.) All DWA tools do, however, incorporate MDD- and low-code-like features that aim to simplify the design, deployment, management, and (most important) maintenance of a data warehouse system.

For example, most DWA tools can process a logical data model and use it as a basis to generate database-specific SQL and/or DDL scripts for different types of data warehouse targets. Essentially all DWA tools provide a visual environment in which business people, data modelers, architects, DBAs, etc. can work together to make changes to a logical model; explore and modify data structures in the target data warehouse; explore, discover, and profile upstream (source) data sources or repositories; create and modify database rules; create, schedule, and modify ETL and data cleansing routines; test and validate changes; etc. Like MDD and LCDP tools, DWA tools aim to promote not only reuse but portability, too: a DWA tool that supports multiple target databases can simplify migration from one data warehouse target to another. So, for example, IT and business people could use the DWA tool to generate new database-specific DDL and related scripts (as well as, if applicable, code) for the target data warehouse. The process is not in any sense turn-key, but it can be made less frustrating.

The analogy to MDD and low-code is useful in another sense, too: no self-styled data warehouse “automation” tool totally or completely automates the lifecycle process of designing, deploying, maintaining, etc. a data warehouse system. This is impossible.

Rather, DWA tools are useful in that they can be used to accelerate common tasks, to automate repetitive tasks, to promote reusability (e.g., of pre-built ELT and data conditioning tasks), and because they provide a single environment that knowledgeable experts can go to when they need to make changes to or diagnose problems with a data warehouse system. Like MDD and low-code technologies, DWA tools aim to free these experts to focus on work that is creatively challenging, instead of burdening them with rote, tedious, repetitive, etc. tasks.

About Stephen Swoyer

Stephen Swoyer is a technology writer with more than 25 years of experience. His writing has focused on data engineering, data warehousing, and analytics for almost two decades. He also enjoys writing about software development and software architecture – or about technology architecture of any kind, for that matter. He remains fascinated by the people and process issues that combine to confound the best-of-all-possible-worlds expectations of product designers, marketing people, and even many technologists. Swoyer is a recovering philosopher, with an abiding focus on ethics, philosophy of science, and the history of ideas. He venerates Miles Davis’ Agharta as one of the twentieth century’s greatest masterworks, believes that the first Return to Forever album belongs on every turntable platter everywhere, and insists that Sweetheart of the Rodeo is the best damn record the Byrds ever cut.

How MDD works: conceptual, logical, and physical models and the RDBMS

About Stephen Swoyer

Related Articles: