There is always excitement around the opportunities related to adopting a modified or a new system development methodology. However, one of the biggest challenges is that despite the promise of improving the manner in which the application design and development phases proceed, the focus remains on satisfying functional requirements while largely ignoring the data requirements. At the same time, the data design teams often fret about each detail of the model, resulting in designs that often do not resonate with the application development teams.
Register for the Briefing Room on October 27 when Embarcadero’s Ron Huizenga will brief analyst and article author, David Loshin, on current data modeling strategies in the webcast entitled, “Agile, Automated, Aware: How to Model for Success.”
Consider the traditional waterfall methodology. The process ratchets through requirements assessment and design phases before the data model is fully fleshed out. Yet in today’s world, we recognize the critical link between information and outcomes, suggesting a closer link between data engineering and functional design. In some ways, the Agile methodology could be employed to better facilitate information design, although bad habits may remain – you still have artificial boundaries between the data and functionality sides.
To get the best synergy between the two sides using the Agile approach, there are three challenges that must be overcome:
- Supporting a growing variety of data types.
- Managing granular changes to support incremental needs.
- Enabling views of data that cut horizontally across the enterprise.
In case it hasn’t been hammered into your head a thousand times already: we are being inundated with a broadening variety of streaming data sets in widely different formats, containing structured data, unstructured data or, often, a combination of both. Not only that, these data sources are often quite volatile; format changes are introduced with little or no notification.
Traditional design methodologies may help in developing the initial cut at tools and applications to ingest and manage information derived from these variable data sources. But when considering the speed at which data source changes erupt, there is a need for more nimble ways to rapidly adjust internal application data models to accommodate those changes. Otherwise, you run the risk of losing information or eventually becoming completely out of sync with the data source, potentially creating more severe issues for the business.
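One way to picture the risk of drifting out of sync is a small check that compares each incoming record against the fields the internal model expects. The sketch below is only an illustration under assumed names: `detect_drift`, `ingest` and the `_extras` holding area are hypothetical, not part of any particular tool. It flags added, missing and retyped fields, and it keeps unknown fields rather than silently dropping them.

```python
from typing import Any


def detect_drift(expected: dict[str, type], record: dict[str, Any]) -> dict[str, list[str]]:
    """Compare one incoming record against the fields the model expects."""
    added = sorted(k for k in record if k not in expected)
    missing = sorted(k for k in expected if k not in record)
    # A field counts as retyped when it is present, non-null and of the wrong type.
    retyped = sorted(
        k for k, t in expected.items()
        if k in record and record[k] is not None and not isinstance(record[k], t)
    )
    return {"added": added, "missing": missing, "retyped": retyped}


def ingest(expected: dict[str, type], record: dict[str, Any]):
    """Keep known fields, park unknown ones under '_extras', and report the drift."""
    drift = detect_drift(expected, record)
    known = {k: record.get(k) for k in expected}      # missing fields become None
    extras = {k: record[k] for k in drift["added"]}   # nothing is thrown away
    return {**known, "_extras": extras}, drift


if __name__ == "__main__":
    model = {"id": int, "city": str}                  # hypothetical internal model
    row, drift = ingest(model, {"id": 7, "town": "Leeds", "zip": "LS1"})
    print(row)
    print(drift)
```

Parking unexpected fields instead of rejecting the record is a design choice: it trades a tidier model for the chance to adjust the model later without having lost the information in the meantime.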
Granular Changes, Incremental Needs
In the past, it was not unusual for systems to be functionally compartmentalized. Most were designed to address specific business needs in support of particular business functions, such as an accounting system for the Accounting Department. For siloed design and development, the waterfall approach is satisfactory – assess the business expectations, identify functional requirements and engineer the application around the functional design. The design of the data environment follows the functional design to fuel the execution of transactions and the capture of the results. Once the data models are done, they remain static, with very few changes made. Any change that is required to the data model necessitates some level of disruption. The modified design is just one part of the process; all application software needs to be reviewed to determine the potential impact of the underlying model changes, the corresponding application design needs to be tweaked and a serious amount of testing needs to be done before moving the coordinated set of changes into production.
The amount of effort is great for any changes, limiting the opportunities to make changes. Today, though, app development is a more dynamic process with changes applied at a continuous rate. The evolution of microservices and API-based development demands incremental updates, both to the code base and to the underlying data environment. That means that the traditional waterfall approach is unsatisfactory for modern application design, especially from the data perspective. Supporting this agile approach to app development demands a corresponding agile approach to data modeling and data management.
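To make the idea of incremental updates to the data environment concrete, here is a minimal sketch that applies small, versioned schema changes one at a time and records which ones have already run. It uses Python's standard `sqlite3` module; the `customer` table, the `schema_version` bookkeeping table and the `apply_migrations` helper are assumptions made up for the illustration, not a description of any specific product.

```python
import sqlite3

# Hypothetical migrations: each entry is one small, self-contained change.
MIGRATIONS = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE customer ADD COLUMN city TEXT"),
    (3, "ALTER TABLE customer ADD COLUMN postal_code TEXT"),
]


def apply_migrations(conn: sqlite3.Connection, migrations=MIGRATIONS) -> list[int]:
    """Apply only the migrations newer than the recorded version; return what ran."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)")
    current = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM schema_version"
    ).fetchone()[0]
    applied = []
    for version, statement in sorted(migrations):
        if version <= current:
            continue  # already applied in an earlier iteration
        conn.execute(statement)
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()  # commit each increment on its own
        applied.append(version)
    return applied


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(apply_migrations(conn, MIGRATIONS[:2]))  # first sprint: versions 1 and 2
    print(apply_migrations(conn))                  # next sprint: only version 3 runs
```

Because each step is recorded, the same script can be rerun safely after every iteration, which is the property that lets the data model evolve at the same pace as the code.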
Enabling the Horizontal View
The success of the enterprise data warehouse is evidence of the value of integrating data sets from numerous internal (and sometimes external) sources. One might say that the integrated data warehouse provides data consumers with a horizontal view of an organization’s data. Yet this view is necessarily constrained by its own design – the warehouse is engineered as a repository with a relatively static model, and accumulated data sets are transformed so as to fit into the warehouse model. This means that users are limited to a view of the data based on the warehouse model, which may not always meet their needs.
Increasingly, business analysts understand the data sets they want to use and are ready to bypass the warehouse because of those limitations. Users may not want to see an aggregated view, they may want to see source data elements that are filtered out during the warehouse integration processes, or they may want to put their own semantic layer on top of a set of federated data sources. For any number of reasons, the desire for access to a horizontal cut of enterprise data implies a need for data federation and virtualization, requiring rapid modifications to virtual data models on an iterative cycle to provide the different horizontal views that are in increasing demand.
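As a rough illustration of a virtual model over federated sources, the sketch below attaches two separate in-memory SQLite databases to one connection and defines a view that spans both, then redefines that view to give a different horizontal cut without copying or reshaping the underlying data. The `crm` and `billing` sources, their tables and the `define_view` helper are all hypothetical stand-ins; a real federation layer would sit over actual enterprise systems.

```python
import sqlite3


def build_federated_connection() -> sqlite3.Connection:
    """Attach two stand-in source databases and load a few sample rows."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS billing")
    conn.execute("CREATE TABLE crm.customer (id INTEGER PRIMARY KEY, name TEXT, region TEXT)")
    conn.execute("CREATE TABLE billing.invoice (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO crm.customer VALUES (?, ?, ?)",
        [(1, "Acme", "East"), (2, "Globex", "West")],
    )
    conn.executemany(
        "INSERT INTO billing.invoice VALUES (?, ?)",
        [(1, 120.0), (1, 80.0), (2, 50.0)],
    )
    return conn


def define_view(conn: sqlite3.Connection, name: str, select_sql: str) -> None:
    """(Re)define a virtual view; `name` is assumed to be a trusted identifier."""
    conn.execute(f"DROP VIEW IF EXISTS temp.{name}")
    # A TEMP view may reference tables in several attached databases.
    conn.execute(f"CREATE TEMP VIEW {name} AS {select_sql}")


if __name__ == "__main__":
    conn = build_federated_connection()
    join = "FROM crm.customer AS c JOIN billing.invoice AS i ON i.customer_id = c.id"
    define_view(conn, "billing_view",
                f"SELECT c.name AS name, SUM(i.amount) AS billed {join} GROUP BY c.id")
    print(conn.execute("SELECT * FROM billing_view ORDER BY name").fetchall())
    # Next iteration: the analysts want the same data cut by region instead.
    define_view(conn, "billing_view",
                f"SELECT c.region AS region, SUM(i.amount) AS billed {join} GROUP BY c.region")
    print(conn.execute("SELECT * FROM billing_view ORDER BY region").fetchall())
```

The point of the sketch is that changing the view is a one-line change to a definition, not a reload of a repository, which is what makes iterative cycles on virtual models practical.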
Using Automation to Support Agility
Data modeling tools must evolve in lock-step with evolving development methodologies. Adopting aspects of the Agile methodology enables faster cycling, closely coupled interactions between designers, developers and their business clients, and more rapid turnaround for changes in underlying data architectures. Some facets of current data modeling approaches are ripe for renovation. Some examples include:
- Granularity: Allow for changes to be made at more granular levels of a model, such as tweaking one aspect of the “customer location” sub-model without touching the rest of the customer subject area model. Maintain change records at that same level of granularity to capture the reasons for changes that can be shared among all enterprise users.
- Business Glossary: Automate methods for reverse engineering data assets to identify key business terms, and engage the community through automated notifications and interaction to unify definitions. This will support user ability in devising models for virtualization and federation, enabling the enterprise view.
- Collaboration: Automate the sharing of enterprise metadata with the community as well as the application of data governance policies, such as adherence to naming standards, consistent use of standard modeling conventions (such as using the same data element names for the same real-world attributes), and capturing a catalog of horizontal views.
- Non-Relational Modeling: The popularity of NoSQL data management systems like MongoDB is not likely to wane, especially as more mobile apps are deployed using Mongo-based databases. Provide ways to extend the standard modeling notations to accommodate non-relational model design.
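Two of the facets above, granular change records and automated naming standards, can be sketched in a few lines. The example below is a simplified illustration, not how any modeling tool actually works: the `SubModel` and `ChangeRecord` classes, the snake_case rule and the `naming_violations` function are all assumptions chosen for the sketch.

```python
import re
from dataclasses import dataclass, field
from datetime import date

# Assumed naming standard for the sketch: lower-case snake_case element names.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


@dataclass
class ChangeRecord:
    element: str
    reason: str
    changed_on: date


@dataclass
class SubModel:
    """One small piece of a subject area, e.g. 'customer location'."""
    name: str
    elements: dict[str, str] = field(default_factory=dict)   # element name -> data type
    history: list[ChangeRecord] = field(default_factory=list)

    def change(self, element: str, data_type: str, reason: str, changed_on: date) -> None:
        """Add or alter one element and record why, at the sub-model level."""
        self.elements[element] = data_type
        self.history.append(ChangeRecord(element, reason, changed_on))


def naming_violations(model: SubModel) -> list[str]:
    """Return the element names that break the assumed naming standard."""
    return sorted(name for name in model.elements if not SNAKE_CASE.match(name))


if __name__ == "__main__":
    location = SubModel("customer_location")
    location.change("postal_code", "TEXT", "Support non-US addresses", date(2015, 10, 1))
    location.change("GeoHash", "TEXT", "Needed by the mobile app", date(2015, 10, 8))
    print(naming_violations(location))
    for record in location.history:
        print(record.changed_on, record.element, "-", record.reason)
```

Keeping the reason alongside each change, at the same granularity as the change itself, is what makes the history useful to share with the wider community of enterprise users.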
The underlying theme of these examples is teamwork and improved communication between designers and their business customers. This increased collaboration nicely suits an Agile methodology, and hopefully we will start to see data modeling tool vendors adapt their products so that functional and information designs dovetail under the Agile approach.