This episode of Inside Analysis features Eric Kavanagh, CEO of The Bloor Group, interviewing WhereScape USA President Mark Budzinski as he describes his data warehouse automation technology that has stunning results when it comes to operating speed and process manageability that keep both business and IT happy and engaged.
Eric: Ladies and gentlemen, hello and welcome back once again to Inside Analysis. It’s time to take the inside track and figure out what’s going on in the industry of information management. So-called big data is the hot topic, but there is a lot of medium and small data out there, and let’s face it, that’s where all the action still is, or at least 90-odd percent of it, and that’s what runs our businesses and our information systems. It’s the kind of stuff that we really have to get right, and there have been a lot of problems. We really haven’t gotten it right and big data is not going to solve those problems. Today we have a wonderful guest. We’re going to be talking to Mark Budzinski of WhereScape USA, so with that, Mark, welcome to the show.
Mark: Thanks so much for having me, Eric. Nice to be here.
Eric: First of all for people who are not familiar with who WhereScape is why don’t you give us the overview of your company: who you are, where you come from, what are you focused on these days.
Mark: Very good. WhereScape is a company that actually surprises people. We’ve been around for quite a few years, pushing a decade now. We solve a very specific and unique problem in the industry, and that is the actual building and deploying of data warehouses and other related data warehouse projects, so think marts and other business semantic layer projects and those sorts of things. When I say build out, we do it through an automation technology. We have a product called RED, as in the color red, that essentially takes on many if not most of the pesky tasks that are otherwise done by hand by humans, whether they be employees or perhaps consultants that are brought into a project to build out a Teradata environment, to build out an Oracle environment, SQL Server, Greenplum, etc. We do it through this automation technology that has stunning results when it comes to the speed that you can operate on as well as the manageability that you’re left with on the other side.
Eric: Right and one of the really cool things that you do by automating stuff is you’re also automating the documentation side of the equation, right?
Mark: It’s absolutely true. It’s one of those things where anybody that’s actually done it or, perhaps worse, managed people who have done it realize that documentation is overpromised and underdelivered almost every time. It comes to haunt you – not so much at the initial delivery because you can kind of cheat and get something up and running – but the ongoing management can be a nightmare if you don’t have good metadata that has essentially driven your environment for business rules, how tables are created, data lineage, all those sorts of things. It’s part and parcel to our solution. When you push the button to create tables and populate the tables, documentation is an outcome that happens in parallel with no further effort.
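The idea Mark describes, one set of metadata driving both table creation and documentation, can be illustrated with a minimal sketch. This is not WhereScape RED's actual mechanism or API; the `Column` and `Table` classes, the `build_ddl` and `build_docs` functions, and the `dim_customer` example are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Column:
    name: str
    sql_type: str
    source: str  # hypothetical lineage note: where the value comes from
    rule: str    # hypothetical business rule, stated in plain words


@dataclass
class Table:
    name: str
    columns: List[Column]


def build_ddl(table: Table) -> str:
    """Generate a plain CREATE TABLE statement from the metadata."""
    cols = ",\n".join(f"    {c.name} {c.sql_type}" for c in table.columns)
    return f"CREATE TABLE {table.name} (\n{cols}\n);"


def build_docs(table: Table) -> str:
    """Generate documentation from the very same metadata."""
    lines = [f"Table: {table.name}"]
    for c in table.columns:
        lines.append(f"- {c.name} ({c.sql_type}): from {c.source}; rule: {c.rule}")
    return "\n".join(lines)


# Illustrative metadata only; not taken from any real system.
dim_customer = Table(
    name="dim_customer",
    columns=[
        Column("customer_id", "INTEGER", "crm.customers.id", "copied as-is"),
        Column("full_name", "VARCHAR(200)", "crm.customers.first/last",
               "first and last name joined with a space"),
    ],
)

if __name__ == "__main__":
    print(build_ddl(dim_customer))
    print(build_docs(dim_customer))
```

Because both outputs are derived from the same description, the documentation cannot drift away from the table definition, which is the property the conversation is pointing at.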
Eric: And the cool thing is that by doing it this way, by using WhereScape, that documentation is going to be consistent. I think there are several issues with documentation. One is that nobody likes to do it. Really, people especially developers just want to develop. They want to stand stuff up. They want to get it running and then say, “thank goodness we got it running now let’s go home and have some fun.” The documentation side of it as you suggest up front might not be a huge issue but it’s going to be a big issue when you want to change something, when you want to audit something, when you want to move something around. The fact that WhereScape automates that part of the process has got to be a huge deal that the documentation is consistent in its form and its nature such that people can go back later on and really fairly quickly understand how they got here and therefore figure out where they’re going to go, right?
Mark: Well, you’re spot on. If you think about it in real life, people get promoted. They leave organizations to join different organizations. They have a family situation that requires them to abruptly leave. There’s an emergency in the health sense, whatever. Real stuff happens in real life and to have a documented environment that is really the safeguard so that you’re not dependent on an individual or a group of individuals is paramount to any kind of risk mitigation strategy that a company is serious about. I hate to see when consultants create a dependency on a particular customer base entirely because the situation and the solution are not well documented. So if you document it up front you’ve got something to manage, you’re mitigating risk for your own people as well as any external resources that you may have used along the way.
Eric: Right and I really love your kind of product. I sometimes refer to these as BASF products because it’s we don’t make the products you buy. We make the products you buy better. You’re sitting on top of or outside of or conjoining with all these other major players like Teradata. I think you’ve talked about EMC. Maybe you should list the six targeted environments that you guys work with so people understand.
Mark: There are what we think are certainly a popular six: SQL Server, Oracle, Teradata, DB2, Greenplum and Netezza are the environments that we support today. When I say support, just to be clear, we’re talking about the target data warehouse environments that we’re building to – so that’s where the data is going to live. That’s where the tables are created and populated, etc. Where does the data come from? Goodness, it can come from ERP systems. It can come from flat files. It comes from Hadoop. It comes from a spreadsheet that lives in the controller or perhaps the CFO’s laptop. It can come from all kinds of places but at the end of the day it basically gets built and deployed in that target environment.
Eric: One of the things that WhereScape does, and I think this is a very big topic and it’s going to remain a big topic for quite some time, is you really do expedite that whole development process so the time to value is much shorter. Let’s face it, the days of a two-year ERP implementation or a two-year enterprise data warehouse implementation where there is no value from day one until day 730 – those days are gone, so I think you have really helped change the game by providing this service, this automation service, and of course product, that lets people ramp up quickly, and that is just the way it has to be these days?
Mark: Well, I couldn’t agree more and the customers that end up buying into our story and our product and ultimately deliver become public and very rabid fans of the solution. A very common testimony is things happen approximately 5X faster than they would’ve using traditional methods of hand coding or traditional ETL tools: Ab Initio, Informatica, DataStage, those kinds of tools. It has entirely to do with this automation notion and the fact that everything can be developed and deployed in the database so you don’t have this extra step to have to tune the database. If you think about an ETL tool, you typically have two development cycles that are going in parallel. One is to extract and load the data and then the other is to actually model and tune the data in the chosen data warehouse. We’re doing all of that together at the same time from one tool set, and the acceleration is vast. I can tell you that the business side of the equation – those that ultimately have to consume the data for analysis, for operational reporting, whatever the case may be – they’re thrilled with a solution like ours.
In fact, I guess the converse of that statement is to say they’re completely annoyed when IT says, “sorry you don’t understand how hard this is. We have governance and we have ETL tools and we have process and these guys are on vacation and this other guy has a problem with his laptop.” It’s excuse after excuse after excuse. WhereScape can come in and just completely blow up all of those assumptions. If you have what I like to call enlightened IT folks, leadership that truly is trying in their heart of hearts to do the best job they can for business and not just trying to hide behind a wall of excuses, they embrace what we’re doing. They can collaborate with the business and work iteratively with them to develop a solution that the business will ultimately use in real life in a big way, not just let it sit on the shelf. Very rewarding. Again, there’s a lot of customer testimony to that effect that comes out of our account base.
Eric: I have to think the IT people in organizations where you deploy, even if at first they are a bit skeptical about what’s going on, are happy once they see that roll out and they wrap their heads around not having to worry about so much of the documentation and the darn troubleshooting. That’s the beauty of documentation: it allows you to be more agile in the future and it allows you to troubleshoot stuff, because if you don’t have documentation troubleshooting could take a day or six years. Who knows?
Mark: You’re exactly right. Having the right access to the right part of the problem and you can see if a particular piece of code or a column gets changed in the table. What effect does that have on the rest of the ecosystem not only within the box that you’re working on, the system that you’re working with, but actually those that are linked to it? Huge advantage. You’re absolutely right. There are so many pieces to the equation here.
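The impact question Mark raises here (if a column changes, what else is affected?) comes down to walking a lineage graph. The sketch below is a generic illustration, not a description of how WhereScape stores or queries lineage; the `LINEAGE` mapping, the table and column names, and the `downstream` function are all hypothetical.

```python
from collections import deque
from typing import Dict, List

# Hypothetical lineage metadata: each column maps to the columns fed by it.
LINEAGE: Dict[str, List[str]] = {
    "stage_orders.amount": ["fact_sales.revenue"],
    "fact_sales.revenue": ["agg_monthly.revenue_total", "cube_sales.revenue"],
    "agg_monthly.revenue_total": ["report_finance.revenue"],
}


def downstream(start: str, lineage: Dict[str, List[str]]) -> List[str]:
    """Return every column affected by a change to `start`, nearest first."""
    seen = set()
    affected = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:  # guard against revisiting shared targets
                seen.add(child)
                affected.append(child)
                queue.append(child)
    return affected


if __name__ == "__main__":
    for column in downstream("stage_orders.amount", LINEAGE):
        print(column)
```

A breadth-first walk is used so that the closest dependents are listed first, which is usually the order in which someone troubleshooting a change would want to check them.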
It depends on I guess what any vendor would say about how they experience their customers. There is the coined phrase “pain points” that customers experience. Does it have to do with manageability? Does it have to do with hair on fire: I’ve got to get a project out the door quickly and I don’t know how I’m going to do it if I don’t use something like WhereScape. We see lots of different points of view and depending on those we obviously are appealing based on that dimension.
Eric: There’s a quote I heard not too long ago that I’m guessing you’re going to love. It really kind of speaks to the disparity between what some people will think is happening in their environment and what’s actually happening. The line was this guy said the ETL developers do not read policy. What do you think about that?
Mark: Well, having your ETL, to call it that, your code and what have you, that has to do the logic of derived data and deal with joining data and all the rest of it. The closer you are to the data, first of all, the better off you’re going to be and the closer you are to the business user who’s actually going to consume that data the better off you’re going to be. So to push all of that toward the database and toward the business user is axiomatic in our approach in the way that we go about this. ETL methodologies and tools by definition have a challenge because they’re removed from the target environment. They really don’t know. I say they (those that work on the project in those tool sets) don’t have a real comprehension of what the modeling is all about in the data warehouse. They certainly don’t understand any view layer, semantic layer, cube layer.
They are completely oblivious to how business users are actually going to consume the data. “Hey, it’s extracted. It’s transformed. It’s loaded. We’re done. We’re taking the rest of the day off. It’s your problem now” is really the way that they operate. Not accusing them of being malicious in any way, understand, but in terms of the division of church and state here, that’s the way they see the world.
Well, the guy on the consumer side, the business side, who’s trying to eat the data and use it in a productive way in the form that he wants it in the time frame he wants it, what is he supposed to do with that? Well, he’ll hire shadow IT people to work around ETL is what he’ll do. You can’t live in a world where ETL says, “we’re done, your problem now” and expect a business to be happy with that. They have to take action, and you see this in the form of what the Teradata guys call BTEQ scripts and all this hand coding that goes on in Teradata. It happens in Greenplum. It happens in SQL Server at this cube layer and other aggregations of data. It happens all over the place and it’s because the business ultimately is not satisfied with the direct form of the ETL as it’s delivered by that group.
Eric: You bring up a couple of really good points: shadow IT and what is essentially the same thing, workarounds, right, because that’s exactly what happens when the businesspeople are trying to get something done. If they can’t get it done through the approved mechanism, through the governed process, going to IT, so on and so forth, they go around it and they just do. They’re going to find a way to get around it and then, guess what, you’ve just lost your governance. You’ve lost your auditability. You’ve really hampered your data quality.
All these bad things happen and it’s the old cliché of a stitch in time saves nine. If you could just address the problem at the pain point when it occurs and do so in a transparent way, you’re going to be saving yourself so many future headaches. So it’s like when and where you guys get through, and you have a lot of customers, so you’ve succeeded in explaining to people what the whole issue is in dealing with this kind of challenge. If the IT people understand that they will save themselves so many headaches down the road they won’t need to worry about being overburdened with stuff because a lot of this stuff is already going to be handled right.
Mark: Over on the business side they tend to look the other way. IT today when it comes to shadow IT: they will bring down rules that say if you want this stuff actually in production it has to come through our ETL shop. You let us know what you need to submit and what you end up with is the business that has unsanctioned unofficial data marts and they’re making real business decisions, don’t kid yourself, based on those instantiations – so you’re exactly right.
If through a technology like WhereScape you can automate the process of building out that database environment, and the DBAs and whoever on the IT side say oh, let me get this straight. You’re doing standard code? In SQL Server it’s standard T-SQL, for example, or PL/SQL in Oracle or Teradata SQL. It’s all the same kind of code that we would write by ourselves. It’s documented perhaps better than we could do ourselves because we don’t have that same priority and commitment to documentation. There’s nothing scary and spooky and proprietary that’s going on. There’s no WhereScape language or WhereScape thingy that we have to worry about here. Okay, well we can relax. We’re all happy, so you get what you need which is an automated, fail fast, prototyping, iterating, close to the business environment that you want and we IT guys can relax because you’re just doing something standard.
In the absence of that, Eric, you just have these peculiar behaviors where the business, they’re survivalists. They have to be, right? The data has to come together in the way that the executive teams and the other constituencies that are going to eat the data and consume the data require, and IT, are you with us or are you not with us? Think about Tableau’s point of view: they have made a business for themselves by enabling businesspeople to visualize data more readily, and where do they get that data? Any way they can, right? Sometimes it comes from the data warehouse. Usually it doesn’t, but that’s an example of how raging this problem is on the business side and how hungry they are for a solution.
Eric: That’s a really good point and the other nice thing too about when a solution like this goes in place is you really are fostering collaboration among these parties. We’ve been talking about this business/IT divide for a long, long time and I do feel like the story is changing a bit. Cloud is part of that. Theoretically big data is part of it just because it has drawn a lot of interest and people are talking about stuff, but the ideal scenario that you want in any organization is where the business and IT people in fact talk to each other on a fairly regular basis. I think you get there if you can solve a lot of these problems we’re talking about: the documentation side, the speed to delivery and the automation of a lot of the really painful tasks. That’s the bottom line. What you’re automating is not fun stuff to do manually.
Anytime you can automate some tedious task and do it effectively, there’s going to be less fear on the IT side. There’s going to be less what’s the word, compunction, I suppose, on the business side and guess what then people can start talking to each other and understanding each other. That’s when the IT person can shine because they can say, “hey, wait a minute. I realize something. There’s a reason why we’re not able to get to the numbers. It’s in the model. It’s in the data model and I see it now because I’m reading this documentation,” and the business person is like “oh, my goodness I think I just finally figured out what a data model is.” I think it’s a good conversation starter, right?
Mark: You raise an interesting point. It’s almost like there’s a chicken and an egg situation between the business and IT. When it comes right down to it, when you ask IT people why haven’t you been able to deliver to the satisfaction of the business, the answer that comes to the top of the page is because the business people can’t tell us exactly what they want. How do they expect us to deliver if they don’t tell us what they want and when they try to tell us what they want and we do deliver then they tell us it’s the wrong thing? How do you win with these people sort of an argument? Well the truth of the matter is in data and all the projects that go around data, this is common. In fact you would say this is expected.
This should be normal behavior because the business consumer doesn’t know by definition exactly how they’re going to use the data until they see it, right? So you have to start this collaboration process between IT and the business to get something down. We can go off in the back room and model until we’re blue in the face. It doesn’t matter. Until we get something that’s based on real data in front of that business user that they can see; it’s not until then that the real conversation starts. “Ah, now that I see it; add this, change this, delete that. Really what I need is this. This slowly changing dimension should probably be a direct override dimension. We don’t need all that data to be tracked.” You have all this conversation that comes out and so what does IT do now with an automation tool like WhereScape?
Frankly, you keep the attention span of the business user in check. Instead of saying I’ll get back to you in two months you say I’ll get back to you Monday and you get back to him Monday with the next iteration of the solution. It takes three, four, six, seven, some number of iterations until the business guy says, “you know what not only have you delivered but you’ve delivered something I’m going to use because I’ve been part of the process to define the solution. I see it. I taste it. I’m going to use it” and the IT guy goes “wow, that wasn’t so painful after all. We were able to iterate seven times. If I didn’t have an automation technology like WhereScape this would have taken a year. We were able to bang it out in three four weeks; now let’s talk about testing and moving it into production over the next six seven weeks and there you go.” You got a 90 day start to finish in production subject area that’s going to be meaningful to the business.
Eric: You raised a really good point about iteration and cycle time because the bottom line is that attention span is a finite thing. You cannot maintain attention on a particular subject forever. Of course, you’re not going to live forever, but it is a very finite and fragile thing and you will lose context. You’ll lose energy. You’ll lose inspiration if things take too long. That’s actually another funny quote I heard awhile ago refers to the culture of failure in the world of data management and it’s because people have been burned so many times in so many ways that they’re head shy about dealing with stuff.
I get really excited with products like the WhereScape offering because you are providing a conduit to solve a lot of these problems which once they’re automated they’re not really problems at all. We can talk about other stuff but that rapid cycle time between different versions and that whole fail fast concept that stuff is just absolutely critical to getting, as you suggest, a solution that someone is actually going to use, and then they’re engaged in this process and they see. They can kind of see the light at the end of the tunnel so that’s going to motivate them to have another meeting to say, “Alright, well we’re almost there: change this, change this, change that” and now you’ve got this actual dialogue going between the two parties and that’s called collaboration, right?
Mark: Yeah, you’re spot on and it turns into what I would consider to be stunning results. We have customers like Union Bank, for example, where they’ve been able to do things in eight or nine weeks that would have taken eight or nine months before. That’s now turned heads. What started as a shadow IT project has now been embraced in central, main IT as a going forward solution. That never would have happened if the shadow IT department, which by definition has urgency and time to value on the brain, didn’t invest in an automation technology like WhereScape. IT would have said we’re dumb, fat and happy the way it is.
Another example is F5 Networks up in Seattle. They had a case where their IT and their business users had stress levels of significant proportions. Over the course of a year, as more and more was delivered – next thing you know you got a company meeting where the CEO is up on the stage handing out awards to the IT group for a job well done. He had never seen anything like it in his tenure – to have this kind of collaboration and teamwork between the business and IT. These are just stories I suppose at one level. Take them for what they’re worth but they’re real. They’re authentic. Customers have these kind of experiences and the culture of collaboration goes a long way not just for business value but for sheer teamwork.
Eric: Let’s close with something that you mentioned yesterday in a webcast. The so-called elephant in the room, which is pretty funny considering the whole Hadoop elephant thing, but we were talking about big data and you mentioned how the analyst firms and the media and a lot of businesspeople get all excited talking about big data, big data. “Yay, hooray. We’re finally going to solve everything.” And in reality no, we’re not. All the old problems don’t go away with big data.
In fact as Dr. Robin Bloor mentioned in the briefing, really big data complicates the issue for the data warehouse from a design perspective so you have to go back in and solve those old problems. I think it’s because of various factors. One is that culture of failure we’ve talked about is turning a blind eye to stuff, not wanting to look down into the dungeon where all the problems are, for example. And the other thing is just a new shiny thing, right. It’s not fun to talk about all the problems that you’re having. It’s fun to talk about the possibilities of the future and that’s all great but you’re not going to have a bright future unless you get down into the weeds and go start solving some of those problems.
Mark: There’s no question about that. I don’t ever want to be accused as the guy that rains on the parade of big data. We will be talking about big data because of the idea that we can go get at new sources that didn’t exist.
When I say new sources I mean blogs and emails and tweets and all the rest of it, where we can capture all this unwieldy, unstructured data in a very inexpensive well-organized form called Hadoop. Hey, that’s a good thing, right, and so all of us should be excited about that. Corporations should be investing in seeing how that could have an effect on their business over the long term not just as a short-term fad. It’s not like all the problems that we’ve had up until this date have somehow just magically vanished and let’s get on to tomorrow’s problems as if yesterday’s problems are all resolved. They’re not. They persist. Why do they persist? I don’t know. That is a tough question to answer but they do.
The idea is that it is hard to manage data. It is hard to build the structures around data warehouses and regardless of the form that they’re in that is a problem that persists, and we bury that in what I call human capital, whether it be an employee or a consultant that comes in. We just throw people at it. Ultimately, those people are doing the best they can. It’s not like we have underqualified dumb people on the project. We have some pretty good ones last time I checked but, in spite of that, the tools and the methodologies that we’ve stuck them with in terms of ETL and all that’s been over the last 15-20 years. That is the problem that I would profess is as big if not bigger than “let’s go explore how new data sources can help us understand our customer sentiment in some of these big data themes.” They’re every bit as big in terms of wasted money and ultimate business value if we could resolve them.
Eric: You can call it the hidden cost, because people have to do stuff and you wind up burning up your whole day, dealing with manual processes that are a result of not having done things properly six months ago or nine months ago or a year ago. So that’s the chasm you have to get over in selling people on why it’s important to automate this part of the process.
Well, this has been great stuff folks. We’ve been talking to Mark Budzinski of WhereScape. You can find them online at www.wherescape.com and the product is called RED. Mark, thank you so much for your time.
Mark: Pleasure, thanks a lot.
Eric: Okay, folks. We’ll catch up to you next time. This has been another episode of Inside Analysis Taking the Inside Track to Insight. Take care folks. Bye.