Bloor Group CEO Eric Kavanagh sat down to chat with Heine Krog Iversen, CEO of TimeXtender, on September 9, 2015. The conversation delved into the many aspects of data warehouse automation, cloud data sources and modernizing systems for today’s information landscape.
Eric Kavanagh: Ladies and gentlemen, hello, and welcome back once again to Inside Analysis. My name is Eric Kavanagh. I’ll be your moderator for today’s conversation. I’m very pleased to have Heine Krog Iversen, CEO of TimeXtender, on the line. We’re going to learn about data warehouse automation. Heine, welcome to Inside Analysis.
Heine Krog Iversen: Thank you very much. Nice to be here.
Eric: You bet. I’m very interested in this whole space of data warehouse automation. You and I spoke before we hit the record button here, and we were talking about how the data warehousing industry has, of course, been very strong for the last twenty years. It really has fueled the entire industry of business intelligence, and it’s very valuable. Companies get a lot of insights from their analytics, if you will, from business intelligence, using data warehouses as the foundation, but it’s been such a time-intensive process. At TimeXtender you have built some data warehouse automation tools. Can you talk about what you mean when you talk about data warehouse automation? What does it actually do and how does it work?
Heine: Absolutely. When we talk about data warehouse automation, it’s way more than what a lot of people think about, which is the ETL part. The ETL part is important – we need to move data from the data source to the data warehouse – but data warehouse automation, to us, is a platform you apply on top of your Microsoft SQL Server that basically will automate most of the work, not only in building your data warehouse, but also in changing and maintaining it, as well as the daily operations and things like scheduling. It’s basically giving you one combined platform where you do your exploration of your data sources to figure out and find the right data. This is where you do your data modeling. The whole idea is that you do everything on a metadata layer. Basically, you don’t have to write a single line of code.
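To make the "everything on a metadata layer" idea concrete, here is a minimal sketch in Python of how a table description could be turned into SQL without anyone writing the SQL by hand. It is an illustration only, not TimeXtender's implementation: the Column and Table classes, the function names, and the dw.DimCustomer and staging.Customer tables are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    # Hypothetical metadata for one column of a warehouse table.
    name: str
    sql_type: str
    nullable: bool = True


@dataclass(frozen=True)
class Table:
    # Hypothetical metadata for one warehouse table.
    schema: str
    name: str
    columns: tuple[Column, ...]


def create_table_sql(table: Table) -> str:
    """Render a CREATE TABLE statement from the metadata description."""
    lines = []
    for col in table.columns:
        null_clause = "NULL" if col.nullable else "NOT NULL"
        lines.append(f"    [{col.name}] {col.sql_type} {null_clause}")
    body = ",\n".join(lines)
    return f"CREATE TABLE [{table.schema}].[{table.name}] (\n{body}\n);"


def load_sql(source: str, target: Table) -> str:
    """Render a simple column-for-column load from a source table."""
    cols = ", ".join(f"[{c.name}]" for c in target.columns)
    return (
        f"INSERT INTO [{target.schema}].[{target.name}] ({cols})\n"
        f"SELECT {cols}\n"
        f"FROM {source};"
    )


# The model is the only thing a person edits; the SQL is derived from it.
customer = Table(
    schema="dw",
    name="DimCustomer",
    columns=(
        Column("CustomerId", "INT", nullable=False),
        Column("CustomerName", "NVARCHAR(100)"),
        Column("Region", "NVARCHAR(50)"),
    ),
)

ddl = create_table_sql(customer)
etl = load_sql("[staging].[Customer]", customer)
print(ddl)
print(etl)
```

Adding a column then means changing one entry in the model and regenerating both statements, which is the kind of change that is tedious and error-prone in handwritten scripts.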
Eric: Yes, that’s good news for the business, and I’m guessing that much of the traction that you get comes from the whole time-to-value argument, right? Where if you automate these processes then, for example, a company can onboard new data sets much faster than in the old days?
Heine: Yes, from a business perspective I think that the core value is the need for speed. The need for speed is rising because of change in data sources, because business is taking over a lot of the decisions about which IT systems to use and because the lifespan of a lot of IT systems is way shorter today than it was ten years ago. We need a tool set and a way to work that is much more agile and flexible to be able to support the questions that the business needs to answer.
Today we see that companies are looking to cut costs, but that’s really not the most important thing about data warehouse automation. It will cut costs. You can save 80% of your development time. That’s something you can calculate as a cost savings, but today the amount of work coming from IT down to the data warehouse staff is overwhelming. It’s basically very difficult for them to keep up and deliver fast enough, and that’s where we see that we need to figure out how to automate the tedious part of the work so that we can use the data warehouse people who have some great skills to add more business value and help out with data analysis, instead of writing the ETL scripts, right?
Eric: Of course. It’s so interesting because to me, when you really get down to brass tacks as I like to say, software is always about automating something. Any time you can automate a process, especially a tedious process, that’s a very good thing. That’s what software is supposed to do, and I’m guessing also that with data warehouse automation software, what you wind up doing is expediting the process. That’s key; that keeps the business happy. People don’t like to wait. It’s not going to be acceptable to wait six months, or nine months, or twelve months, which used to be the standard in the data warehousing world. That’s just not acceptable anymore. In addition to time-to-value, I’m guessing you also reduce errors significantly because you pull out that human element. Once you’ve correctly automated a process, you pull out the human element, which reduces errors, right?
Heine: I completely agree. We used to say that we might have a bug in the software, so yes, we have an error in the ETL code, but we can guarantee that it’s the same error every time, until we fix the bug. We actually do see that from an IT department perspective, one of the reasons that we have a very fast-growing interest from the people who are doing it the old-fashioned way inside the companies is that they start to see people leave. Nobody wants to document, right? It’s undocumented because he doesn’t understand the specific code generator. How do we bring in a new data warehouse developer who is trying to figure out what has already been done, when he is looking at 25 requests for small changes and it takes him months just to figure out where he should make these changes?
For me that’s the more interesting part. It’s not that I believe that data warehouse developers make a lot of errors, because hopefully they have a decent environment with a testing procedure in place. Bringing in people to the team is going to be so much easier when everything is a metadata model that is relatively easy to understand compared to crawling through a lot of handwritten code. That’s where we see the real value.
Eric: That’s a really good point. I’m guessing what also happens is by exposing that metadata layer, you can have a much more meaningful conversation between businesspeople and the IT people who are working with the software, such that the businesspeople can much more quickly understand what is available to them, and the IT people can also better understand what the business function is that they’re trying to improve, right?
Heine: Yes, I completely agree. One of the real benefits of working on a metadata model is that instead of writing with the old classical requirement model, you can actually prototype from the top down. Basically forget about the data flow. Model your data warehouse and the reporting structure you need. Use some safe data in a very limited amount and use whatever tool, even Excel, to communicate what it is you’re looking for.
If we give you all of this data in this structure, this is what you want to have from it. When you have agreed with business that “Okay, this is what I’m looking for,” then you can go back and build your mapping, your foundation, your metadata foundation for your ETL into the solution to the extent that it’s not already there, and that means that you can apply real strong use cases to the development process, which is very hard if you have to stop for three months all the time because now you have to try and develop some code that will give it to you.
We will actually, in a day or two, visualize to the business what they’re looking for. Is this the kind of data that you need? When they sign off on that, they will go back and fix the data and take care of data quality and all the issues, but again, in a very rapid way. It still might only take a week until it’s done.
Eric: That makes a lot of sense. What you’re basically saying is that from a design perspective, the business people can work with the IT people, using first the metadata layers, such that the entire conversation is around the kinds of data, meaning customer transactions or customer information or vendor information or whatever it is. Once the business agrees to that abstraction, or that process, that flow, then the IT department can go back and worry about all the technical stuff that is going to bore the businesspeople anyway, right?
Heine: Yes. We can see that the chance of success in the first hit from IT is much higher because the business has already seen the kind of report they were looking for. They might even have built a report in Qlik or Tableau based on the metadata layer. Now they’re just waiting for real data.
Eric: Yes, that’s a really good point. Let’s also talk about all these new sources of information these days. I think this is another driver for why the time is right for data warehouse automation software. There are so many new sources of data coming out. There are some very big ones. Of course Salesforce is a giant. Salesforce is everywhere. I often talk about Salesforce as a disruptive factor in the whole data management business because they were so successful. They really forced a lot of different companies to play to their tune, and I understand that you have built a pretty good relationship there and you understand the Salesforce model, such that you can onboard Salesforce data very quickly, right?
Heine: Yes, I mean Salesforce is of course big. In general, if you go 10 years back, people might have had the classic ERP system, and if you were really in the forefront of IT you might have had a CRM system. But if you look at today, you suddenly have 20 or 30 different data sources. Salesforce is coming in at the cloud level, which is also going to change the game because you don’t have access to your data. You don’t have control. You don’t even have control over when your data model will change or when they will upgrade this cloud solution, because Salesforce is upgrading and making changes as they see fit. They don’t take into account the concerns of the business that wants to connect to it and have the data. You have the API and that’s it.
You have HubSpot and all these marketing automation tools that are coming out, and all the data that floats around on the Web. This becomes more and more important and something that the business says you need to have. The problem becomes that, of course, you can have BI built into these applications, but we need to combine data from Salesforce, from the ERP, from whatever external data sources, even some big data that you need to bring in such as “follow” numbers from Twitter, etc.
All this needs to go into your data warehouse, so you have this gateway of data for the business to combine and do whatever kind of analysis they want. You need intelligent application adapters to all these cloud solutions and you need to be able to bring this data in, in a usable form. Otherwise, you basically need to sit down and find someone who made a .NET application that will connect and pull data from Salesforce, and then you need to figure out how to combine that with your ETL process. It’s very cumbersome work that really doesn’t add value, and the business has probably lost patience way before you’re done, right?
The nature of getting more and more new applications that generate data, important data, really fuels, as we can see, the need for data warehouse automation. We need a different way of doing this.
Eric: That makes a lot of sense because in the old days, let’s say 15, 20 years ago, oftentimes either the data warehousing vendor would offer some connectors or there were some fairly big integration vendors who would work on something like that, but these days there are so many new data sources that it makes a lot of sense for there to be a broker, if you will, between these cloud sources and the data warehouse itself. You are stepping into that place as an intermediary and you can worry about how to connect to Salesforce or Marketo or any number of other tools that a company might use, and there are so many of them. That way, you can help expedite this process of delivering value to the business from the warehouse, right?
Heine: Yes, and we can even add to that. Because going forward, moving into the future, you also need to be able to deploy your warehouse on-prem, in the cloud, in the private cloud – the landscape in general, from a technical point of view, is more and more complicated. What we are doing and offering around data warehouse automation is to put in that layer so that we take that complication out of the equation. You model your metadata layer. We don’t know and we don’t care where the data sources are and where your warehouse is residing. That’s information we put into the metadata model, and when we hit deploy or ask it to run every hour or every five minutes or every day, we will take care of moving around between on-prem and cloud, because that’s the future.
Eric: You’re right.
Heine: An IT department doesn’t have that control anymore, because business is buying software without asking IT.
Eric: That’s exactly right. From my perspective, we talked about this before we hit the record button today and I’ve talked about this on many shows like DM Radio and Inside Analysis. The business has gotten tired of waiting for IT, and so they go to the cloud and they purchase Salesforce and Marketo, and they purchase all these different tools and now the bottom line is that it’s up to the IT department to stay on top of that, so it makes complete sense that you would have this intermediary to help monitor that.
I’d like to dig into this concept you talk about: the intelligent application adapters, because as many people in the data field know, there are different ways you can access data of course. JDBC and ODBC are the old standards, but those mechanisms or those conduits will only get certain types of data. Whereas what you’re talking about with these intelligent adapters is that you’re digging deeper into the application, whether it’s the cloud application or an on-premises application, and you’re pulling out additional bits of data to help flesh out the big picture, right?
Heine: Let’s take SAP as an example. If you go with ODBC against the database, yes, you can see 82,000 tables. They’re all in German three-letter acronyms, so good luck. It’s not easy, right? You have no idea how they were made or anything. When you use the adapter, you go in through the application. It still looks to you like you get exposed to the data model but suddenly through the application, we can put in translations. So instead of these three-letter acronyms, we can actually give them meaningful names because there is this text layout within the SAP application that actually translates what this is all about.
There’s information about calculated fields. That’s information you want to have in your data warehouse but for some reason, it’s not stored in the database. It’s based on a calculated field that is calculated based on criteria when you open up your screen, right?
That kind of information is hard to know: how do I replicate these calculations, how do I get the translations? If you look at the Dynamics products, one of the things that is very painful there is that at the database level, you have a lot of tables with just numerals: one, two, three, four, five and so on. Because you have to translate those into text, that’s not very useful for building a warehouse. How would the warehouse developer understand that number two is your North American customer? They don’t know.
We need the intelligent adapter to be able to translate all this into meaningful information, so the developer immediately knows: this is what I’m looking for, and this is the data I need, and I don’t have to maintain this manually. It simply flows into the data warehouse automatically. The same goes for Salesforce, though that’s a little different story because you don’t have access to your database. You could call Salesforce and ask, “Could we get an ODBC connection directly to your data center?” I don’t think you would get it, right?
We have an API. The business decides what to expose. We need to be able to hook into that. We need to be able to understand everything that you can get from that API to expose, all the options you have. This is everything Salesforce has decided you can get. Now, you decide what part you want. Then we take care of moving the data out of Salesforce.
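The translation step described above for SAP and Dynamics, where cryptic table codes and bare numerals are replaced with readable names, can be sketched as a pair of lookups. This is a rough Python illustration under stated assumptions: the table codes, the field names and the label dictionaries are invented for the example and are not taken from SAP, Dynamics or TimeXtender.

```python
# Hypothetical labels an adapter might read from the application's text layer.
TABLE_LABELS = {
    "CUS": "Customer",
    "ORD": "SalesOrder",
}

# Hypothetical option-set values that the database stores only as numerals.
OPTION_LABELS = {
    "CustomerGroup": {1: "Europe", 2: "North America", 3: "Asia Pacific"},
}


def translate_row(table_code, row):
    """Return a readable table name and a copy of the row with codes translated.

    Unknown table codes and unknown option values are passed through unchanged,
    so nothing is silently dropped.
    """
    readable = {}
    for field, value in row.items():
        options = OPTION_LABELS.get(field)
        readable[field] = options.get(value, value) if options else value
    return TABLE_LABELS.get(table_code, table_code), readable


raw_row = {"CustomerNo": "C-1001", "CustomerGroup": 2}
table_name, readable_row = translate_row("CUS", raw_row)
print(table_name, readable_row)
```

The point of doing this in the adapter rather than by hand is that the labels come from the application itself, so the warehouse developer does not have to maintain a private copy of them.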
Eric: Right. That’s exactly right.
Heine: The problem with a lot of the cloud solutions, this is a small side note, is that they’re really (not even on the API side) not designed for data warehousing or BI. They’re not designed to pump out all the data in one sweep. The API and the concept for most systems is designed so that you can integrate one application into Salesforce that can talk to your ERP so when you do an invoice you get it over to finance or the accounting system, right? There’s a lot of technical stuff that someone needs to overcome to actually deploy Salesforce for data.
That’s part of what we do and the value we can bring with data warehouse automation: you don’t have to think about it. That’s our problem; we figure it out. You just tell us what you want and how often and, through incremental updating, etc., we will make sure you can do that in a very, very efficient way.
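Incremental updating, as mentioned above, usually means asking the source only for records changed since the last successful load rather than pulling everything in one sweep. The Python sketch below shows one common way to do that with a "last modified" watermark. It is an assumption-laden illustration: SOURCE_RECORDS stands in for a cloud API, and fetch_modified_since and incremental_load are hypothetical names, not part of any Salesforce or TimeXtender interface.

```python
from datetime import datetime, timezone

# Stand-in for the records a cloud application would expose through its API.
SOURCE_RECORDS = [
    {"id": 1, "name": "Acme", "modified": datetime(2015, 9, 1, tzinfo=timezone.utc)},
    {"id": 2, "name": "Globex", "modified": datetime(2015, 9, 5, tzinfo=timezone.utc)},
    {"id": 3, "name": "Initech", "modified": datetime(2015, 9, 8, tzinfo=timezone.utc)},
]


def fetch_modified_since(watermark):
    """Hypothetical API call: return only records changed after the watermark."""
    return [r for r in SOURCE_RECORDS if r["modified"] > watermark]


def incremental_load(warehouse, watermark):
    """Upsert changed records into the warehouse and return the new watermark."""
    changed = fetch_modified_since(watermark)
    for record in changed:
        warehouse[record["id"]] = record  # insert new rows, overwrite changed ones
    if changed:
        watermark = max(r["modified"] for r in changed)
    return watermark


warehouse = {}
watermark = datetime(2015, 9, 2, tzinfo=timezone.utc)

# First run picks up only the two records modified after 2 September.
watermark = incremental_load(warehouse, watermark)
print(sorted(warehouse), watermark.isoformat())

# A second run with nothing new leaves the warehouse and the watermark unchanged.
watermark = incremental_load(warehouse, watermark)
```

Keeping the watermark alongside the warehouse means a scheduled run every hour or every five minutes transfers only the changes, which matters when the API was never designed to hand over all its data at once.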
Eric: Yes, and it’s interesting because it reminds me how the whole data warehousing movement began many years ago. Namely, that companies wanted to do analysis on their data in an ERP for example, but the bottom line is that those ERP systems were not designed to facilitate analysis. They were designed to transact business. If you look at Salesforce or Marketo or Constant Contact or iContact or any of these sources like LinkedIn and Twitter, they’re all purpose-built to do the business functions that they were designed to deliver.
Heine: They did do that.
Eric: TimeXtender is stepping in and facilitating what used to be an ETL process, but because of how these tools are designed, old-fashioned ETL is not going to be sufficient to pull that data out. That’s why you have to have these intelligent adapters that sit in between the cloud solution and the warehouse, right?
This is real interesting stuff. Let’s talk about a couple of use cases. I see you have a bunch of different verticals, so your use cases are all over the map. I’m a marketer at heart so I think a lot of marketing use cases like cost per lead or cost per sale are some pretty big drivers for your customers, right?
Heine: Absolutely. In the beginning we saw it was finance starting to use data warehouses for reporting. They still do that, but we see that the need for access to data for analysis is moving into the Sales and Marketing departments. I think it has to do with that, if you go ten years back you were outsourcing. Everybody was using outsourcing. If you started in Europe, you outsourced to Eastern Europe, and then we ended up in China, and now we need to bring it back because everybody is doing it.
We are constantly looking for that competitive advantage. Speed, understanding your customer, understanding your market, understanding how you generate more leads and how you get the most efficiency out of your marketing investment: all of that actually calls for an enormous amount of data analysis. Then companies combine it with Twitter and LinkedIn and Facebook data. Part of the data might go into your data warehouse and part of it is still for line-of-business analysis so that you can monitor who’s talking about your company on Twitter. What are they saying? You have people actually doing that.
You also have statistical systems that claim that they can actually find patterns in these kinds of text streams. More and more, we see that the value of data warehousing and the need to combine a lot of data sources is growing fast in the sales and marketing department, for good reason.
Eric: Right. I know that I’m on that edge myself. I see it happening every day and it’s a fast-paced world for the marketer. There’s always a lot of pressure on the marketing department to deliver those leads, to nurture the leads. You’re seeing much greater attention placed on understanding who the prospects are, who the customers are. What stage of nurturing are the prospects in, how do you cater different kinds of messaging to them and all that stuff relies on intelligence and insight to run, right?
Heine: Yes, from multiple systems. It would be so easy if I could just go online and buy one system that would do everything for me, right? That would be a beautiful thing, but you tend to have like five, ten or 15 different systems. You need to pull out the data and bring it all together to get this bigger picture of how things are actually correlating in the market.
Eric: Right. Back to that time to value. Let’s talk about some of these different industry solutions. I noticed you called out a number of industries, manufacturing is one and retail is another. It’s back to that whole issue around understanding the customer. This is the kind of thing that businesses really want to hear. They want to know that you’ve thought about their specific business processes, their world, how they do business. I’m guessing that’s why you’ve got these industry solutions, right?
Heine: Yes, I think we have them for two reasons. Of course, I completely agree that there is buying behavior in the market where people would like to see that any vendor has experience in their industry. What we also focus on is pre-built stuff. If you are a classic manufacturing company, we basically know your industry from a data warehouse perspective, based on 30 years in the industry. We know 80% of the data that you would use. We already have a template with a warehouse model that gives you that 80%, so we can cut that work out.
We still need to map it into your specific data sources, which is very easy because we do it on a metadata layer. Then we can start from there to add any specifics and maybe combine your different data sources. For us, it’s about being able to shortcut the start of building your data warehouse. I spoke to some analysts the other day and they said that even on the ERP side, regardless of industry, if you take all of the financials, people are asking, “why do I have to start from scratch building up my warehouse from financial information because it’s the same for all companies?” I think this notion comes from, again, cloud.
People are used to having all these small components that they can put together and have a solution. It’s hard for them to – and I agree with them – understand why every time they talk about their data, they have to start from scratch, because it’s the same. It’s credit ledger, it’s the GL account, it’s transactions, it’s dimensions. They’re getting used to having this idea of putting small pieces together instead of developing from scratch. That’s what we try to supply by building out these verticals and applying the best practices. We work with different industries, working with different thought leaders in those industries to apply best practices so that we also add some value – not only in speeding up the process, but also on the business side, stuff that they might not have thought about.
Eric: Yes, that’s exactly right. I guess I’ll throw one more comment over to you for your feedback. You know we hear so much about big data, and there’s a lot of potential for big data obviously. We track that very closely. There’s a lot of innovation occurring around big data. The bottom line is small data is still what runs big businesses. Even though there was a hype cycle around Hadoop and big data analytics changing the game and pushing the warehouse to the side, I think that’s just a bit of hype at this point. Obviously there’s a lot of room to grow in both of these fields. It seems to me that the data warehouse, for the reasons you describe, is going to be around for a long time. The data warehouse is going to continue to deliver value to specific people in the business, whether it’s the financial people or the marketing people or the senior executives or whomever. A lot of that big data stuff, it’s interesting, but it doesn’t really apply to what we’re talking about here today, right?
Heine: Actually, to some extent we can get it to apply. One of the things that we are working on and have on the roadmap is adding Hadoop as a data source. We actually see that a lot of the big data that was put into Hadoop is actually what I call structured data. All these log files and all the stuff that we put in, it might be an enormous amount of log files, it might be enormous amounts – petabytes of data. These are actually structured in a way. If you can pull out more pieces of that information and put that into the warehouse, it would actually be very valuable to the business.
For me, the discussion about big data is more about whether we are talking about relatively structured data or unstructured data. To what extent does big data reside as part of the BI reporting side of the business? To what extent is it about flowing data, such as Facebook data, into your marketing automation tool? It’s going to be very interesting to see how this will evolve in the future. I completely agree there’s a lot of hype. I think what I see in the market is that most companies struggle to understand the business value and to justify the business case to do this. I think it’s very interesting for the IT and the technical folks to find real use cases for the technology. It’s amazing what we can do, yes, but how does it help the business become more competitive? How does it help them make more money? If they can’t answer that question, I don’t think the business will really take it on and do this.
Eric: Yes, that makes a lot of sense. Well, folks, we’ve been talking to Heine Krog Iversen, CEO of TimeXtender. You can find more information about them online at TimeXtender.com. Heine, thank you for your time today.
Heine: Absolutely, my pleasure. Thank you very much.
Eric: Okay, folks, take care. You’ve been listening to Inside Analysis. Bye-bye.