Building appropriate architectures within large organizations can be a daunting challenge. Technology developers and data architects have addressed this problem for the past couple of decades with varying results. Oftentimes the debate centers on whether the appropriate approach is top-down or bottom-up. The two can be broken down as follows.
Top-down approaches to data architectures start with high-level concepts and structure information according to a given problem space (which can involve a multitude of underlying problems). Ultimately, this type of modeling is conceptual in nature and, if done correctly, addresses some of the overarching concerns people are trying to solve from a more or less theoretical perspective. Bottom-up approaches, on the other hand, start from the data itself and then attempt to build an architecture up from the component parts. The idea here is that, instead of being theoretical or hypothetical, the architecture reflects one’s actual state of affairs as it pertains to the raw data itself. In short, bottom-up approaches use what is in front of you to build a solution, while top-down approaches deal in abstractions. This is the basic distinction, leaving out the nuances involved.
But what about the actual business case? Whether employing a top-down or a bottom-up approach, how does one architect a system that really addresses the underlying goals of an enterprise, rather than just building new technology for the sake of building new technology? This question is much more difficult to answer and is often not addressed until an architecture is in place or, perhaps even worse, until a full-blown information system has been built or deployed. In many cases, the business goals of the company are poorly reflected in the discussion of technology architectures, a problem which may not surface until the system’s development has run its course and it is not producing the intended answers for decision-makers. As Andrew White of Gartner pointed out in a July 2012 post on Gartner’s blog, “Good ideas don’t just come out of the woodwork or spring forth from a data mart. Business people have to have an idea, a question, an argument to test, a theory to explore, a posit to push against.” But are the appropriate people from the business side of the enterprise consulted when it comes time to build one’s next great data architecture (one that hopefully is going to save said enterprise from impending data doom)?
All of this discussion leads to a very salient point: involve your business drivers in your architecture discussions, decisions, constructions, and deployments. Don’t wait until it’s time to align models, turn on a query engine, or fire up the hot new cloud repository to figure out what questions you want to ask of this shiny new object in which you’ve invested a lot of money. To loosely paraphrase a thought from the Greek philosopher Aristotle, in order to build a ship, you need to have the concept of sailing in mind. This means you should not just set out building the object (i.e., the ship) willy-nilly, whether from top-down schematics or bottom-up materials gathering, without thinking about its use value and what you intend to do with it. Namely, you wish to traverse a body of water using the power of the wind and ultimately move from point A to point B. The business value is the achievement of the sailing: you wish your enterprise to do something, and you want your data architecture (as the modern-day vessel) to help you achieve that goal and serve as the instrument to get you there (certainly there are other instruments needed as well, but let’s keep the metaphor simple here).
Some approaches are now surfacing that attempt to bring business value together with the technology components and architectures in order to solve these issues. This author applauds the people undertaking these types of endeavors. Ultimately it is a nice marriage between management consulting and software or service-level technology deliverables. But now the question on the tech-savvy mind should be: how do we actually get this done? What are the tools and techniques required to make good on this type of relationship between the business drivers and the technology drivers? How do they develop in a truly synergistic fashion, instead of as two separate drivers that, at some point, must be harmonized? This is a difficult set of questions and too large a topic to cover in this piece, but it is one where the current technology components may still be lacking somewhat.
Some software vendors have identified the need to drive data architectures from the business and have built this capability directly into their tools, allowing users to map data entities together more easily, integrate processes, develop customized views and dashboards, etc. However, many such tools currently on the market do this using rather old-hat methods. One such method is to utilize Entity Relationship Diagrams (ERDs). ERDs depict the logical structure of one’s data as it would be used in a relational database. Therein lies part of the current problem: the world is slowly moving away from using relational databases for everything. NoSQL databases are on the rise. Graph databases have been in existence for some time (and are arguably growing in popularity). Unstructured data sources that utilize text extraction or natural language processing revolve more around terms and their usage within a domain of interest. None of these technologies fits easily into ERD types of models. Thus, as the variety of data sources continues to increase, the usefulness of ERDs for capturing all the types of data needing to be integrated within a given enterprise will continue to decline. One last consideration is that utilizing ERDs to perform one’s integration involves point-to-point mappings, meaning specific data elements, columns, tables, etc. are mapped (often manually by a user) to one another. If the data is volatile or highly stochastic in nature, then these mappings require a lot of rework, which can be costly and inefficient over time. Certainly relational databases are not going away anytime soon, and ERDs will still be valuable tools for modeling their structures and entities, but these will not be enough to capture all of the data sources required to drive business decisions and analytics in modern enterprises.
Thus, they will not suffice to drive architectures from the business perspective – regardless, again, of whether we’re talking top-down or bottom-up design.
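To make the cost of point-to-point mappings concrete, here is a minimal Python sketch. The source systems, column names, and the `Customer.identifier` concept are all hypothetical and chosen only for illustration. It compares how many mappings exist, and how many must be reworked when one source changes, under pairwise mapping versus mapping each source once to a shared concept.

```python
from itertools import combinations

# Hypothetical sources, each exposing a customer identifier under a different column name.
SOURCES = {
    "crm": "cust_id",
    "billing": "customer_no",
    "support": "client_ref",
    "warehouse": "cust_key",
}

def point_to_point_mappings(sources):
    """One mapping per pair of sources (element-to-element integration)."""
    return [
        ((a, sources[a]), (b, sources[b]))
        for a, b in combinations(sorted(sources), 2)
    ]

def concept_mappings(sources, concept="Customer.identifier"):
    """One mapping per source, each pointing at a shared (hypothetical) business concept."""
    return [((name, column), concept) for name, column in sorted(sources.items())]

def mappings_touching(mappings, source):
    """Count the mappings that must be revisited when `source` changes its schema."""
    return sum(
        1
        for mapping in mappings
        if any(isinstance(end, tuple) and end[0] == source for end in mapping)
    )

p2p = point_to_point_mappings(SOURCES)
hub = concept_mappings(SOURCES)

print("total mappings:", len(p2p), "pairwise vs", len(hub), "concept-based")
print(
    "rework if billing changes:",
    mappings_touching(p2p, "billing"), "pairwise vs",
    mappings_touching(hub, "billing"), "concept-based",
)
```

With n sources, the pairwise count grows as n(n-1)/2 while the concept-based count grows as n, which is one way to see why volatile data makes point-to-point mapping expensive to maintain.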
Businesses should, instead, be actively looking for smart technologies that rely more on conceptual modeling, logic- or math-based analytics, and high levels of automation to drive many of the components within their data architectures. These types of architectures become hybrid in nature, meaning they revolve around several types of techniques interacting with one another and, in turn, rely on things like service-based approaches, linkages between internal and external data sources, integration of structured, semi-structured, and unstructured data types, and ETL methods, to name just a few such elements. The underlying technologies employed in these kinds of contemporary enterprise architectures can include semantic technologies, graph matching techniques, relational structures, cloud databases, statistical clustering algorithms, machine learning algorithms (and, eventually, potentially deep learning algorithms), natural language processing, and certainly much more. Utilizing these kinds of technologies can provide a means to capture not only the database components and entities of interest, but also the deeper business logic and rules that express the concepts and goals of an enterprise.
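As one possible illustration of capturing business goals alongside data entities, the Python sketch below stores a few subject-predicate-object triples, in the spirit of a semantic or graph model, and follows the links from a goal to the data sources it depends on. The goal, concept, and source names, as well as the `dependsOn` and `storedIn` predicates, are assumptions made up for this example and do not come from any particular product or standard.

```python
# Hypothetical triples (subject, predicate, object) linking goals to concepts to sources.
TRIPLES = {
    ("ReduceChurn", "dependsOn", "Customer"),
    ("ReduceChurn", "dependsOn", "SupportTicket"),
    ("Customer", "storedIn", "crm.customers"),        # relational table
    ("Customer", "storedIn", "billing.accounts"),     # relational table
    ("SupportTicket", "storedIn", "tickets_index"),   # unstructured text index
    ("Invoice", "storedIn", "billing.invoices"),      # not tied to any goal here
}

def objects(subject, predicate, triples=TRIPLES):
    """Return every object linked to `subject` by `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}

def sources_for_goal(goal, triples=TRIPLES):
    """Follow goal -> concept -> source links to find the data a goal relies on."""
    return sorted(
        source
        for concept in objects(goal, "dependsOn", triples)
        for source in objects(concept, "storedIn", triples)
    )

print(sources_for_goal("ReduceChurn"))
```

Because the goal is linked to concepts rather than to specific tables, a new source of customer data only needs one additional `storedIn` triple to become visible to every goal that depends on `Customer`.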
Applying business logic to one’s technology decisions and designs is becoming an increasingly necessary criterion for success. Now it is time to take the next step of examining and integrating new types of tools and techniques that can drive the success criteria for combining one’s technology stack with one’s underlying business drivers.