Inside Analysis

IBM and the Modern Data Warehouse

On September 14, 2015, Bloor Group CEO Eric Kavanagh chatted with IBM data warehouse professionals Rich Hughes, Program Manager for Data Warehouse Marketing, and Dwaine Snow, Strategy Lead for IBM Big Data & Analytics Platforms and Integration. Read this interesting discussion of the evolving data warehouse.

Eric Kavanagh: Ladies and gentlemen, hello and welcome back once again to Inside Analysis, your source for in-depth insights about what’s happening in the field of information management and data. Of course, data is a hot topic these days. I am your host Eric Kavanagh. Today I’m talking to a couple of professionals from Big Blue, from IBM. We’ve got Rich Hughes and Dwaine Snow on the line and we’re going to talk about the modern data warehouse, what’s happening in this field that’s been rocking and rolling for 30 years or more. The data warehousing movement has been under a lot of change lately and I think that’s a very good thing. There have been a lot of innovations in data warehousing and how people get it done, what they use it for and it’s really catching on out there in some places where it wasn’t always the case. With that, I’m going to throw down the gauntlet and bring in Rich Hughes and Dwaine Snow. I guess Rich, maybe I’ll start with you. You’re out there in Overland, Kansas I believe that’s what you said.


Rich Hughes: Yes, Overland Park.

Eric Kavanagh: Overland Park, okay, so you’re in the Heartland and data warehousing is getting hot in the Heartland too, it’s not just in New York and Los Angeles and San Francisco, it’s all over the country. What are you seeing that’s new and interesting in the whole field of data warehousing?

Rich Hughes: Maybe we could step back and just look at what the drivers have been with data warehousing traditionally and over time and continue even through the present and that is really delivering value to the business. Speed of delivery is important. Another thing that’s crucial in gaining even more criticality is cost reduction and the way things are being delivered is through analytics. Instead of maybe in the past where there would be reports, now there’s more sophistication and more analytical thinking going on. What is part of this complexity or what’s producing more insight is that different data sources are being brought in. These data sources can be both the traditional structured and the semi-structured and unstructured data that’s more associated with the Internet and Hadoop.

As disrupting technology has come in, that’s where the businesses had to adjust. Still the purpose is how do you deliver and how do you get value to the business to make your business better and more competitive.

Eric Kavanagh: Yes, there are a lot more sources of data these days and they’re growing by the day it seems to me, certainly in the cloud but also because of mobile and because of a whole host of innovations. Like you say, there’s a whole process that has to come into place to be able to onboard that kind of data. You can’t just grab data from the cloud and dump it to the warehouse. You hear people talking about that with Hadoop schema on read, for example, just throw it in there and worry about it later. Even that’s not the best idea in Hadoop, but it’s sure not how you can do things with data warehousing, right?

Rich Hughes: Yes, for sure. Our vision of what’s going on is there are many data stores or many data sources out there and, ideally, you would have the infrastructure in place to be able to leverage and take advantage of all those different data stores in an integrated framework. When we talk about modernizing the data warehouse, we have a view of a logical data warehouse for this whole ecosystem where people, business users, data scientists can go at the data in a relatively straightforward manner.

Eric Kavanagh: Let’s talk about logical data warehouse for those who don’t understand. As I understand it, you’re talking about not necessarily the traditional warehouse where the data is quite literally centralized, but you have a more of a federated view and you have some access patterns and some rules that are baked into how all that data gets populated and gets updated and processed and so forth. But can you kind of shed some light on your definition of a logical data warehouse versus what people might consider the traditional enterprise data warehouse?

Rich Hughes: You gave a pretty good, detailed explanation of the logical data warehouse. You would also have discovery zones, where people can experiment and sandbox and look at data to determine its value, and in our view that part of the logical data warehouse sits in Hadoop and MapReduce. That’s one thing that certainly is part of the logical data warehouse. We also submit that the traditional data warehouse, the enterprise data warehouse, certainly still has a lot of good things that it can do and will continue to do over the next five to 10 years. It hasn’t lost its place: if you need reliability, let’s say in a service level agreement, or you really have a great need for speed in delivering applications and reporting analytics, then your enterprise data warehouse can still hold that value and still definitely has a place in the near and middle term for the logical data warehouse.

Eric Kavanagh: You brought up this great concept so I’d like to drill into it: discovery zone. That’s a great term. I haven’t even heard that term but I think it’s extremely compelling and very intuitive in terms of what it means. To me, this is one of the best innovations in data warehousing in some time because traditionally for those who have the battle scars, so to speak, of enterprise data warehousing 20 years ago, if you wanted to test out some new idea, onboard some new data or do some new analysis, it was a pretty long process to get that going. And you had to pull some strings; it was a political entity as well. The data warehouse still is, but not as much so perhaps. These discovery zones you’re talking about are extremely valuable for letting analysts test theories and iterate and figure out where they want to go before they go into a more hardened process of delivering that enterprise capability. Is that your assessment?

Rich Hughes: That’s a fair assessment. If you want to look back maybe five years ago, I guess in terms of terminology, I mean, what were you describing could be under the heading of data mining. It’s the same concept with these sources of data, there’s the anticipation or there’s the interest in gaining value so there are nuggets of gold buried somewhere in all this mound of data. The purpose of analysts or data scientists is that they need to go forth and look for those gems, look for the gold out there. That’s where sandbox or a discovery zone is certainly the place for that activity.

Eric Kavanagh: Yeah, that’s good stuff. I think this whole concept of logical data warehouse just kind of accepts the reality that trying to force all of your important data into one centralized location is just not a very realistic plan.

Rich Hughes: If I could jump in, the other important idea here is the iteration or how things in this discovery zone need to take place, where there is a hypothesis, it’s tested and it might beget another hypothesis if it’s proved wrong or further alteration of that hypothesis as things are discovered.

Eric Kavanagh: Yes, that’s a really good point. You want to be iterative, you want to be open to change because the analytical process is not linear, right? It’s multi-variant, I suppose, is the way to put it.

Rich Hughes: Right. I guess what comes to mind too is if you have statistical models that are being put together, if you’re, as an example, looking at your customer base and seeing if you want to rank or rate your customers, this iterative process is definitely in play for a use case or activity like that.

Eric Kavanagh: Are you seeing the use cases really expand these days because of more companies, more industries, more individuals coming in to the domain, or is it just kind of fleshed out versions of what we used to do?

Rich Hughes: I would say that from a comfort standpoint, maybe half or two-thirds of it is pretty much what has been happening in enterprise data warehousing over the years. If you want to have a 360 view of the customer, that has certainly improved, and it also continues here. If you want to improve the operational aspects of your business to perform better and optimize the performance of your business, that certainly is a continuing use case. Yes, there is carryover and there are also novel use cases that are being explored now.

Eric Kavanagh: Just maybe a couple more questions and we’ll bring in Dwaine and then all just kind of have a roundtable discussion about what’s happening out there. It seems to me you’ve touched on this that speed and time to value are real critical factors these days, for all kinds of reasons. I mean, people in the business side have really lost patience in many cases. They’ve gone through the cloud or they’ve gone kind of rogue in some organizations and just gone out and bought a solution, whether it’s a cloud-based solution or an appliance, we saw a whole craze of appliances a few years ago. That speed to delivery is really important. Can you talk about what has changed to enable companies like IBM to deliver these kinds of solutions much faster than in the past?

Rich Hughes: Yes. The appliance, the data warehouse appliance, one of its appeals was its ability to deliver applications quickly and so I think the focus of IBM has been, that’s been the inspiration, is to continue the appliance as we go forward and with the focus on having the application be able to deliver in a short amount of time. We’re not talking about weeks or months, we’re really talking about delivering applications that deliver value in terms of maybe a week or less than a week. Those are the types of capabilities that we’re striving for.

Eric Kavanagh: Yes and changing the game to that point where you go from six months to a week to two weeks, I mean, that’s a sea change. I think that’s one of the reasons why data warehousing is becoming much more mainstream these days, because if you can deliver value that quickly, that’s going to open a lot of eyes, that’s going to inspire a lot of executives to get out the pen and sign contracts, right?

Rich Hughes: Hadoop, that Apache project, has been around for at least five years now, but one of the things IBM brings with Apache Hadoop is Big SQL. SQL, Structured Query Language, has been around for a number of decades, really, and there are many, many people in your enterprise, in your company, who know SQL, so Big SQL dips into that skillset. It makes things transparent: on the user side, you can still compose and write your SQL to do a query or do some analytics, and then Big SQL is able to go get the data wherever it might be, and everybody’s happy with that arrangement.

Eric Kavanagh: Yes, that’s good stuff because like you say, there are a lot of people who know SQL and as my business partner, Dr. Robin Bloor said to me a long time ago, which I thought was very interesting, he said there are really two kinds of standards out there. You have standards that are delivered by an organization that is so powerful that you simply can’t get around it. Then there are standards that are just developed over time by the best and the brightest in a particular industry, and SQL is one of those standards. It’s not vendor-specific, it’s vendor-agnostic and for that reason, there are all kinds of people who can use it and therefore, it’s a nice tool to use even in the environment like Hadoop, which was not originally designed to support SQL. Now of course we have all these layers of abstraction that sit between the applications and reservoirs like Hadoop or data lakes, so to speak, right?

Rich Hughes: Exactly. You touched on the term “federation” and that’s kind of a loaded term, but if you can imagine within a SQL statement being able to federate or have a virtual data reality where you can go after and connect with both Hadoop data and maybe structured data in a DB2 or even Oracle or Teradata, something like that, that really extends your range of motion and really brings in a lot of other data sources that are valuable to be able to analyze.
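
To make the federation idea concrete, here is a minimal sketch of a single SQL statement that joins data held in two separate stores. It is not Big SQL or any IBM product: it assumes Python's built-in sqlite3 module, with one in-memory database standing in for the structured warehouse and a second, attached one standing in for the Hadoop side. The table names, column names and the `clicks_per_customer` helper are hypothetical.

```python
import sqlite3

# "main" stands in for the structured warehouse (an assumption for illustration).
conn = sqlite3.connect(":memory:")
# A second, separate database stands in for the Hadoop-side data.
conn.execute("ATTACH DATABASE ':memory:' AS lake")

conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE lake.clicks (customer_id INTEGER, page TEXT)")
conn.executemany(
    "INSERT INTO main.customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
conn.executemany(
    "INSERT INTO lake.clicks VALUES (?, ?)",
    [(1, "/home"), (1, "/pricing"), (2, "/home")],
)


def clicks_per_customer(connection):
    """Run one SQL statement that spans both stores."""
    sql = """
        SELECT c.name, COUNT(k.page) AS clicks
        FROM main.customers AS c
        JOIN lake.clicks AS k ON k.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """
    return connection.execute(sql).fetchall()


print(clicks_per_customer(conn))  # [('Ada', 2), ('Grace', 1)]
```

The person writing the query only needs ordinary SQL; where each table physically lives is hidden behind the schema prefix.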

Eric Kavanagh: That’s good stuff. Let’s go ahead and bring in Dwaine Snow as well, calling in from Philadelphia. Dwaine, welcome to the show. What do you see happening in the data warehousing space that’s new and interesting?

Dwaine Snow: I think you really touched on one with the hybrid warehouse, with Hadoop and unstructured data, big data, and I think the warehousing space is also being pulled in a different way, and that’s to what I would call the hybrid or pulled from the cloud. More and more data today is being created or generated on the cloud, in mobile devices, and that’s just a natural place on the cloud. If your data’s being generated there by apps that are running in the cloud, why do you want to bring them on premise? The ability to build a true hybrid analytics workspace that’s combining on-prem structured warehouse with on-prem or cloud-based, unstructured, semi-structured, Hadoop-based systems with cloud-based even structured or unstructured and bring all those together with a common interface: the ability to provide one-stop analytics.

What I mean by one-stop analytics is the ability to write and deploy anywhere. You talked about that workspace, the sandbox, and writing your query, writing your analytics some place and just being able to deploy it wherever you need to, wherever it makes sense. I think that the cloud-based system and the ability to either burst to the cloud or build your entire system on the cloud or a part of your system and part of your data on the cloud, is just another way that the world is being pulled into the world of analytics lately.

Eric Kavanagh: That’s a really great concept you just mentioned. This true hybrid analytics workspace, I love that. You’re speaking to a couple of huge trends that are taking place. These are very sensible trends that are happening for logical reasons, perhaps there was a pun there with the logical data warehouse, but the mainstay of data warehousing in the old days and still today, was this concept of ETL, extract transform load, and let’s face it, moving data is a painful process. Even with really powerful technologies, any time you move data, it’s going to be a time-intensive thing, it’s going to take resources, it’s going to cost money, it’s going to use up personal time and there are also mistakes that happen when you move data. I think a huge driver behind this whole hybrid data warehouse and the logical data warehouse concept is the reality that data does have gravity and ideally, you don’t really want to move it around unless you have to, right?

Dwaine Snow: Absolutely and you mentioned earlier when you were talking to Rich about data warehouse appliances and the first data warehouse appliance was Netezza, which IBM acquired. The concept of Netezza was move the processing to the data. I think that is the most important concept in this hybrid logical data warehouse, it’s just like you said, you can’t afford to move big data. You can’t afford to wait to move hundreds of gigabytes or terabytes or petabytes of data and transform it and extract it and move it all around. You want to send the smallest amount possible.

You want to send the processing to where the data is, where the gravity is, where it’s being kept probably in its data form, send the query to the data and only bring back the results that you need and do it that way, whether that be from on-prem to the cloud, from an on-prem relational database to Hadoop, or from the cloud back even into your on-prem system. No matter where you connect, you need that fabric across all of those to make the concept and the access transparent no matter where you exist and that’s a neat tool that we’ve developed here at IBM called IBM Fluid Query, which allows you to do that. It really makes access seamless and transparent across that logical hybrid data warehouse.
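
The difference between moving the data and moving the query can be sketched in a few lines. This is an illustration under stated assumptions, not how Netezza or IBM Fluid Query work internally: an in-memory SQLite database plays the role of the remote store, and the function names and the `events` table are hypothetical.

```python
import sqlite3


def make_remote_store(n_rows=10_000):
    """Build a stand-in for a remote data store (hypothetical data)."""
    store = sqlite3.connect(":memory:")
    store.execute("CREATE TABLE events (region TEXT, amount INTEGER)")
    rows = [("east" if i % 2 == 0 else "west", i % 10) for i in range(n_rows)]
    store.executemany("INSERT INTO events VALUES (?, ?)", rows)
    return store


def total_by_moving_data(store):
    """Pull every row back to the caller, then aggregate locally."""
    rows = store.execute("SELECT region, amount FROM events").fetchall()
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals, len(rows)  # second value: rows shipped to the caller


def total_by_moving_query(store):
    """Send the aggregation to the store and bring back only the results."""
    rows = store.execute(
        "SELECT region, SUM(amount) FROM events GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)


store = make_remote_store()
moved_totals, moved_rows = total_by_moving_data(store)
pushed_totals, pushed_rows = total_by_moving_query(store)
print(moved_totals == pushed_totals)  # True: same answer either way
print(moved_rows, pushed_rows)        # 10000 rows shipped versus 2
```

Both approaches give the same totals, but the second ships two result rows instead of ten thousand raw ones, which is the whole point of sending the processing to where the data is.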

Eric Kavanagh: Yes, that’s really good stuff. We’ve talked about distributed queries for 20 years, but it seems to me it’s just in the last few years that the collection of technologies and methods necessary have been available and deployed and understood such that we can actually start doing this stuff now. That to me is just orders of magnitude better than it used to be, because now you don’t have to move data around as much, you can deliver value much more quickly, and when you adjust the cycle time with that significant of a change, you really change the workflow of the people and the business, and that allows the analyst to stay on top of what’s happening in the business, meaning you don’t just get incremental improvement in what’s changing, you get borderline revolutionary improvement in some cases, right?

Dwaine Snow: Absolutely. Now, this doesn’t mean that you’re never going to do ETL because there’s going to be times where, like you said, you’re going to find this amazing, brand new correlation or something that you never knew before that you want to reapply and you think, “You know what? I’m going to apply this over and over again and I may even need to put it into the stream, on streaming data there.” So you may transform data and bring it in house because that’s going to be quicker if you can go to one place and get everything there; it’s always faster than having to do that and reaching out to get it. You may still be doing it and you typically still will be bringing stuff into that warehouse, and just like you said, it’s not going to disappear for a long time.

But the requirement for that is what disappears and that’s where you get just exactly what you said, there’s this inflection for work, “You know what? I don’t have to wait for my data to be delivered to me, I can access my data where it is no matter what form it’s in, in a language and in a structure that I understand, and I don’t have to learn something new.” I don’t have to learn MapReduce, I don’t have to learn REST API or whatever it is, I just use whatever the SQL that I’m used to.

Eric Kavanagh: Right. That’s a really good point and to kind of drill into this a bit, the interesting thing about the cloud and to your point earlier, I don’t think the hybrid world is going to go away ever. Quite frankly, I think we’re always going to have on-premise systems and we’re always going to have cloud-based systems from this point forward. I don’t think that’s ever going to change. It’s very different environments, when you go up to the cloud as opposed to within your own network, for example. There are obviously some similarities, but when you get up on the cloud, it’s all through these APIs and so if you have a mechanism that allows you to intelligently manage how these APIs are called, what kind of information you pull through them, the timeliness of that kind of thing, that’s the magic for the future distributed query, right?

Dwaine Snow: Absolutely. The other thing that I think is going to change, or it really has changed over the last number of years and is changing even more and more with the advent of social media, is you don’t own all of the data you’re going to be using anymore. Some of it is going to be coming from social media, some of it you’re going to be paying for from an external source and you want to be able to just access that data where it is, because you don’t need to bring it inside. You just want to access it, whether it is postal code information or GPS information about every coordinate or ZIP-code around the U.S.

All of that stuff is, more and more we’re going to get into this market place of data, market place of analytics and the ability to call these things, whether it be a data access or an algorithm or a correlation, through APIs. When we get to this point of the true hybrid environment, all of that API, I’ll call it the API economy, is what I believe is the right term for that, that’s going to become ultra important going forward.

Eric Kavanagh: Yes, that’s right.

Rich Hughes: If I could jump in here. You talk about the speed of delivery and speed of deployment, but this is where it’s crucial with the social media world we live in today. Almost every company, it behooves them to be able to monitor social media to see how their products are in the market place, how they’re being perceived. You don’t want to have to wait one or two days to be able to react to something that might adversely affect your brand.

Eric Kavanagh: That’s a really good point. You have to stay on top of what’s happening out there in the real world, and you’re kind of leading me up to another concept I’ll throw out. Maybe Richard you want to comment on this and then Dwaine as well.

The other key success factor going forward, it seems to me is this whole concept of collaboration, and near real-time collaboration or real-time collaboration, not building something for three months, throwing it over the wall, worrying about it for another month and then getting feedback for something. No, you need to be collaborating across your team and across departmental boundaries on a regular basis these days to stay on top of things, to make sure that you’re moving in the right direction, because now you can change direction much more quickly, right?

Dwaine Snow: Absolutely, I think that it goes beyond even the walls of your company. The developers today are looking on GitHub or wherever to look for code or examples that they can reuse. In the world of analytics today, a lot of the algorithms are already there. Whether it be in R or SAS or SPSS or machine learning, MLlib, wherever. A lot of those algorithms are there or somebody else is writing them. You want to be able to collaborate and reuse what everybody has done, and I think what’s going to become important is some sort of central repository. Just like we have a system catalog in a database today, some sort of a catalog that has: where’s my data, where are my algorithms, where are my correlations, where are my different things, what’s the lineage and all that, and a central repository, central catalog that provides access to all of that.

I know there are a number of Open Source projects and things that IBM’s contributing to on that, and I think that with this logical warehouse, modern warehouse that’s going to span all of these different paradigms, that’s going to become ultra important.
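
As a rough illustration of the kind of central catalog described here, the sketch below records where each dataset or algorithm lives and what it was derived from, then answers the "where is it" and "what's the lineage" questions. It does not represent any specific IBM or Open Source project; the class names, entry names and locations are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CatalogEntry:
    name: str
    kind: str       # "dataset" or "algorithm"
    location: str   # where the asset lives (free-form, hypothetical values)
    derived_from: List[str] = field(default_factory=list)


class Catalog:
    """A toy central catalog: one place to look up assets and their lineage."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def find(self, kind: str) -> List[str]:
        """List the names of all entries of one kind, alphabetically."""
        return sorted(e.name for e in self._entries.values() if e.kind == kind)

    def lineage(self, name: str) -> List[str]:
        """Return every upstream entry name, nearest ancestors first."""
        seen: List[str] = []
        queue = list(self._entries[name].derived_from)
        while queue:
            parent = queue.pop(0)
            if parent in seen:
                continue
            seen.append(parent)
            if parent in self._entries:
                queue.extend(self._entries[parent].derived_from)
        return seen


catalog = Catalog()
catalog.register(CatalogEntry("clickstream", "dataset", "hadoop cluster"))
catalog.register(CatalogEntry("customers", "dataset", "on-prem warehouse"))
catalog.register(
    CatalogEntry("customer_scores", "dataset", "cloud store",
                 derived_from=["clickstream", "customers"])
)
catalog.register(
    CatalogEntry("churn_model", "algorithm", "shared code repository",
                 derived_from=["customer_scores"])
)

print(catalog.find("dataset"))
print(catalog.lineage("churn_model"))
```

A real catalog would add access control, versioning and search, but even this small structure shows how lineage questions can be answered from one shared place rather than from each team's memory.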

Eric Kavanagh: Yes, that’s good stuff. So you bring up a really good point here, Dwaine, too about the fact that developers are going into the cloud, they’re going to GitHub, they’re trying to stay on top of what’s happening and let’s face it, it’s not an easy thing to do. There’s so much innovation happening in so many quadrants and so many places, and a lot of people hear Open Source, myself included, years ago and think “Okay, so it’s free, just go get it.” Well, yes it is free or can be free, depending upon the life cycle but knowing which bits of code to grab and how to weave them into some tapestry of functionality, that’s a whole separate ball game. This again speaks to the importance of collaboration across divisions, from your developers to your IT guys to your business people and having that constant dialogue that involves all of them. You can’t just throw something over the wall. You want to be in conversation with these teams on a regular basis, right?

Dwaine Snow: I 100% agree, and that’s really the only way, not the only way, but the path that these things are going to get done. You can sit in your cubicle or sit at your desk and try and figure that out yourself, but without that collaboration, you’re just going to lose time. It comes down to with the collaboration, not just how do I use this or what’s the right way to call this API, but what data should it be used on, where does it make sense, when does it make sense to call it and it goes way beyond that. Just that knowledge base or that sharing collaboration place or tooling, I believe is ultra important.

Again, it extends beyond the walls of the organization, I think, today because nobody is just using what’s internal anymore. There’s so much use, like you said, of Open Source and things like R today and the explosion of libraries in R or Spark, or all these new and evolving things. But the world is changing in a much faster rate today than it ever has. Without that collaboration, it’s hard to keep up and it’s hard to know what to use when.

Eric Kavanagh: That’s a really good point. The collaboration too, it seems to me, yes there is a ton of innovation happening on the very cutting edge. What you really need as an organization is an ability to triage that and a process for being able to test and then onboard and then harden some of these technologies to bring what you can into your data warehousing environment, for example, but you don’t want to just grab some Open Source code and throw it into production. You would have to have a process of vetting this stuff and talking to people, making sure that you’re going down the right path. That’s probably, let’s say two or three month process. It’s still not going to be an immediate thing because these things take time to vet.

If you watch the Open Source movement, it’s very interesting to me how this all kind of pans out because it seems that any given project will reach this peak of efficiency and then hits that inflection point and kind of fades a bit, and then some new project comes up. You really want to be staying on top of that or have either a consultant or a liaison or someone in your organization who pays attention to that stuff and really understands the impact of these new innovations such that you can intelligently weave them into your platform. I guess I’ll throw that over to Dwaine and then Rich if you too want to comment.

Dwaine Snow: I think you’re absolutely right. It really comes down to intelligently weaving it in. It is understanding where these technologies fit because there’s hype around everything. You can’t believe the hype. You need to understand where it fits in your environment and what makes sense in a way that it’s going to cooperate there to provide the most benefit. Some things in Open Source won’t, some things will provide great benefit, and I think that going forward, that modern warehouse or that logical warehouse has got to be open to all of these new technologies and not something that’s a closed environment that, “You know what? I only bought from company X.”

It has to be an open platform understanding that Open Source is going to be a part of that going forward, just because the way that Open Source can evolve and build and expand and produce so much faster, because you have hundreds or thousands or millions of people contributing to those projects.

Eric Kavanagh: Right. I guess maybe let’s kind of focus on business objectives or on advice that you guys can give.

Rich Hughes: From business objectives, IBM certainly thinks about it from the persona of the data consumer. In terms of enabling the data consumer to go after these various data sources, I’ll draw it back to IBM Fluid Query and the seamless integration. That’s how it looks to the data consumer, where they don’t have to be bothered, really, by where the data is or what system it is or what platform it is. That transparency is really an enabling technology and that’s really the philosophy that IBM has with this Fluid Query.

Dwaine Snow: That can work across both IBM, other partner or competitor products as well as Open Source projects and that’s really core to that logical or modern data warehouse being that open platform.

Eric Kavanagh: That’s a really good point and Rich, I guess I’ll kind of dive in on that a bit. It seems to me that mobile really did a pretty significant service to the whole world of application design because what happened, in my opinion at least, is that these devices, these smart phones, of course the iPhone being the first, you have the Galaxy and a bunch of other different tablets out there and other technologies, but because the surface area is so much smaller, it really forced developers to get very strategic about two things: one is the design, actually what you’re looking at on the screen, and two is the workflow, the process flow.

If you can intelligently work through what needs to be on someone’s screen now, what the options are and then decompose that business process, that’s how you come up with a really simple, intuitive design that you see these days with technologies like Twitter or Bitly or some of these other tools. They really focused in on the specific things, the options that are needed at that point in time and then made it much simpler for the end user to figure out what’s going on. It’s not like the old Windows-based applications where you’ve got File, Edit, View, etc., and you drill down and you have hundreds and thousands of options and sub-options and so forth. Like trying to learn Adobe Photoshop from scratch in the year 1999 or something. It was a pretty significant effort.

Rich Hughes: Right. That’s what makes this a really interesting and exciting time to be alive. You mentioned mobile and you mentioned just how it overturned the development because of the real estate management that was needed, but the other interesting thing to me about mobile is that it is personal, two out of three people on the planet have one of these devices and it’s also location-based. You’re carrying this device everywhere; you’re mobile. That puts in a whole other dimension as well. When you add those together, from a business standpoint, you have a lot more idea of how the customers are using your product and where they’re using your product. The possibilities that you have for gaining insight are going through a couple more dimensions that are crucial.

Dwaine Snow: There’s an interesting point that I read not too long ago and that is by 2017, about 70% of data is actually going to be generated by mobile apps. That’s to the point of cloud and hybrid cloud. Most of those servers that are running these mobile apps are on the cloud and running on the cloud because you have to have them across the world, across multiple geos, you can’t just expect them to run and connect into your data center. The only way to get that scalability and security is to do that on the cloud and have the cloud do that for you. That’s really key is, again to get the value and the insight from this mobile data because you have to do analytics on that as well. It’s not just a piece of data that you sit on the side.

The insight that you can glean from the data generated from these mobile apps, even if it’s just location, but it goes way beyond that, is key to providing the most positive business impact that you can to your organization.

Eric Kavanagh: You make a really good point there. Maybe we’ll close up with some comments from each of you on why this whole concept of a logical data warehouse/hybrid warehouse is so important going forward. With all these new data sources and with all these, let’s call them moving centers of gravity, you’re not going to be able to wrangle them all into one repository, that’s just never going to happen. It makes complete sense to have this more federated multidimensional environment with the understanding that you need to be able to see into the pipes and the different reservoirs and understand what’s moving around in order to manage it effectively, but to me, what you just mentioned there, Dwaine, is a perfect explanation for why the hybrid data warehouse, the logical data warehouse going forward is going to be the de facto way of doing all this stuff. I guess maybe Rich, first you and then Dwaine, if you want to comment on it.

Rich Hughes: Yes, I’ll try it, and it’ll be more of a continuation of that thought around mobile. It’s more or less that, just as the activities are much more mobile and out there, if you will, that is reflected in the different options as far as platforms. But what I wanted to mention was that it’s really no secret that you can go into any type of retail environment and jump on to the free Internet service that they’re providing in their store. There’s knowledge for that retailer about their consumer, about their customer, when they’re in their bricks-and-mortar store; they also can find out when the customers visit their website. That’s an example of the variety of data sources that have to be ascertained and developed so that you can get the 360-degree view of the customer if you’re in the retail business. I’ll end it with that.

Eric Kavanagh: Yes, that’s a good point. Dwaine?

Dwaine Snow: I think you’re opening up a whole new – I wouldn’t call it a can of worms from a lay person’s perspective – but a whole new way of gaining insight from an analytics perspective once you start bringing all of this mobile data together. I’ll start on the can of worms side. Did you know, and I’ve read this and looked at it, that if you download a lot of free games and have them on your smartphone, by accepting the disclaimers to play the game you’re allowing them to broadcast your location? So even though you have location services turned off for almost everything, they’re broadcasting your location. They’re selling that data, to Rich’s point, to retailers and whoever’s using that.

That’s a way that you are unknowingly broadcasting and contributing to a lot of the internal extra insight. Then there’s the ability to take a credit card and understand that “I ordered on your website three years ago. I haven’t ordered since. I was in your store, I bought something in the store and now I’m using a mobile app,” and to be able to bring all that together and have master data management across unstructured data, clickstream, all of these different types of data, mobile apps, on-prem apps. The ability to bring it all together and tie all of that together to get a true view of the customer is something you’re going to have to have in the future just to compete, not to gain advantage the way you can today. In the not-too-distant future, you’re going to need all of that just to be competitive in the marketplace.

Eric Kavanagh: You make a couple of really good points there. One, of course, is that there’s no such thing as a free lunch. Watch out for those free games and applications for sure.

Rich Hughes: On the Internet.

Eric Kavanagh: Exactly. Really, it’s all over the place. It is a can of worms; Pandora’s box has been opened. We’re not going to close it again, but you at least as a consumer need to be aware of what’s happening and just kind of do the math on stuff to realize there is no such thing as a free lunch.

You also brought up a couple of good points there, Dwaine, about just how clear a picture smart companies can have of their customers. That’s good news in a lot of ways, mostly for customer experience and also for being able to deliver the right service, the right message or the right product at the right time and not overkill, not spam people, not drive people nuts with too many messages, because you will lose customers if you don’t understand who they are and what they’re doing, right?

Dwaine Snow: Absolutely, and I think it’s not necessarily about not overkilling, but about not being creepy. We’ve probably all heard of the Target example, where their analytics identified that a young lady was pregnant before she had told her family, and they sent her specific targeted advertising for baby goods. The father complained to the store manager, and then he went home and the girl broke down and told him. Now what they’ve done is they’ve actually changed their advertising, so they’re still noticing that, but they’re not putting that advertising dead center in the flier. They’re putting it in one of the corners: top, bottom, left or right. It’s still there. It’s about understanding that you can get that insight without stepping over the line, without becoming creepy with it, and utilizing all the data you have, but in the right way.

The reality is that everybody today, no matter whether it’s an application or website, whatever, everybody is tracking you, gathering information on you, sharing information on you and that’s giving them the ability to target and market to you as an individual. I think what we’re going to see in the future is that individualized marketing. No longer will you send out a coupon to everybody offering 10% off. It’s going to be uniquely targeted to each individual and that, I believe, is the future, combining all of the data that’s available to be able to do that.

Eric Kavanagh: Yes, I think that’s exactly right. It’s a great point to end on, folks. We’ve been talking to Rich Hughes and Dwaine Snow over at IBM, who have so much stuff going on over there. Obviously these guys have really stayed on top of what’s happening in the marketplace. Look up these terms, folks: logical data warehouse, hybrid data warehouse and, of course, this IBM Fluid Query we’ve been talking about. Lots of great concepts. Discovery zone, I love that concept. We’ve talked about it as a sandbox over the years. Of course, sandbox has kind of a different connotation, but the discovery zone is exactly what the analyst needs to play around, to understand what’s happening, where things are going, to formulate those theories, those hypotheses, and then put them into action.

Big thank you to both of you gentlemen today. Folks, you’ve been listening to Inside Analysis. Take care. Bye.
