In many organisations today, data complexity is increasing rapidly, in both operational and analytical systems. On the operational side, online transaction processing (OLTP) systems now run both in the cloud and on-premises, with data flowing between them as business processes such as order-to-cash and procure-to-pay execute across multiple applications in a hybrid computing environment.
On the analytical side, we have moved well beyond a few OLTP system data sources feeding a data warehouse. Today, the thirst for new data to analyse has skyrocketed, with data from new internal and external sources being ingested all over the enterprise to add to what is already known. Popular new sources include clickstream data from web server logs, inbound customer email, external data from open government websites and weather data. Sensors are being deployed in manufacturing production lines, supply chains, assets and products, all under the banner of the Internet of Things (IoT), to produce the data needed to optimise operations and understand product or asset usage. All of this is in demand. The number of data sources is exploding, and many companies are struggling to cope with data of all varieties being consumed and analysed. Data volumes keep rising and the rate at which data arrives has accelerated.

The impact of all this data on analytical systems has been profound. Not only do we have multiple data warehouses, but many companies have added streaming analytics platforms, one or more Hadoop systems, cloud storage, Spark clusters and NoSQL graph databases to their portfolio of analytical systems. These different types of analytical systems are being used to analyse different types of data to produce much more comprehensive insights about customers, business operations and risks – well beyond the transactional activity recorded in a typical data warehouse. A consequence of all this new operational and analytical activity is that data management and governance have become very complex. Given that much of this data is being ingested and stored in a variety of data stores, both in the cloud and on-premises, trusted data is difficult to find. And as the number of data sources grows, connecting to data is getting more difficult.
So the question is: can something be done about it? In particular, what about data connectivity? After all, connecting to data in the data centre and on multiple clouds is challenging enough, but with the emergence of IoT are we now expecting users to figure out how to connect to potentially thousands or even millions of devices at the edge as well?
The problem here is obvious. As the pace quickens and the sources keep coming, does it really make sense to expect business users to keep up to date with, and figure out connectivity to, all these data sources? Is the picture shown in Figure 1 sustainable?
It seems that as the number of data sources grows, connectivity simply becomes harder, with the user having to know how to get at all the data he or she needs. Going back to IoT for a moment: suppose you walk into the office one morning and there are 1,000 more devices on the network than there were yesterday. What happens if you want access to the data on those new devices? Is the user expected to just know how to connect? Are we not pushing more and more complexity onto the user? Can we not simplify it? Could it not be more dynamic? Or will we continue to see software such as BI tools lengthening their lists of data source connectors, release by release?
It seems to me that we need to hide the connectors to all these data sources and connect to data at a higher level of abstraction. There are multiple ways of doing that. For example, you could use data virtualisation. Alternatively, we could look for advances in the data connectivity area itself – something that does not seem to have changed much in a couple of decades. If we pursue the latter, i.e. smarter connectivity, what exactly are we asking for?
What is needed is to keep connectivity simple from the user's perspective, so that he or she simply connects to a 'logical source' from, say, a BI tool, while a kind of gateway presents data at the level of business concepts (e.g. customers, products, orders, shipments, payments) and hides the many connectors to the underlying data sources. This is shown in Figure 2.
In that sense the user sees data in a more business-friendly way without having to understand connectivity to all the underlying data sources, irrespective of whether they sit in one cloud, multiple clouds, the data centre, or across all of them. The more we can hide that complexity and simplify access to data, the faster we can reduce time to value. One vendor trying to meet this requirement is Magnitude Software with its new Magnitude Gateway. It's definitely worth a look.
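To make the gateway idea above concrete, here is a minimal sketch in Python of how a layer of this kind might route requests for business concepts to hidden, source-specific connectors. All class names, source names and data values here are illustrative assumptions for the pattern only; they do not describe the API of Magnitude Gateway or any other real product.

```python
# Hypothetical sketch: a gateway that exposes business concepts
# ("customers", "orders") and hides which physical source serves each one.

class Connector:
    """Wraps connectivity to one physical data source (cloud or on-premises)."""

    def __init__(self, name, fetch):
        self.name = name
        self._fetch = fetch  # callable that returns rows from this source

    def query(self, concept):
        return self._fetch(concept)


class Gateway:
    """Presents business concepts; the caller never sees the connectors."""

    def __init__(self):
        self._routes = {}  # business concept -> connector

    def register(self, concept, connector):
        self._routes[concept] = connector

    def get(self, concept):
        connector = self._routes.get(concept)
        if connector is None:
            raise KeyError(f"No source registered for concept '{concept}'")
        return connector.query(concept)


# Usage: the BI tool talks only to the gateway, never to the sources.
crm = Connector("cloud-crm", lambda c: [{"id": 1, "name": "Acme Ltd"}])
erp = Connector("on-prem-erp", lambda c: [{"order_id": 101, "value": 250.0}])

gateway = Gateway()
gateway.register("customers", crm)
gateway.register("orders", erp)

customers = gateway.get("customers")
orders = gateway.get("orders")
```

The point of the design is that when a new source appears (those 1,000 new IoT devices, say), only a new connector registration is needed at the gateway; the user's view of the business concepts does not change.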
By Mike Ferguson,
Managing Director, Intelligent Business Strategies
Mike Ferguson is Managing Director of Intelligent Business Strategies Limited. As an independent analyst and consultant, he specializes in business intelligence, analytics, data management and big data. With over 35 years of IT experience, he has consulted for dozens of companies, spoken at events all over the world and written numerous articles. Formerly he was a principal and co-founder of Codd and Date Europe Limited – the inventors of the Relational Model, a Chief Architect at Teradata on the Teradata DBMS and European Managing Director of DataBase Associates.