In a recent chat with 1010data, the company gave us some interesting statistics on its customers’ behavior. If you are not familiar with 1010data, it is a SaaS-based big data analytics company. It cut its teeth providing big data analytics to the financial sector, long before the “Big Data” marketing term was invented, and has since acquired customers in telecoms, retail, gaming, the health sector and other areas.
When we mention big data in connection with 1010data, you can think in terms of tables with tens of billions of rows being manipulated and analyzed as if they were in a spreadsheet, with a wide range of statistical, text and other functions applied to them. A significant aspect of the company’s capability is that such operations are carried out very quickly, so the data analyst can work interactively with the data set he or she is interested in.
What Do Companies Do with Analytical Data?
Here’s what interested us in our conversation with Sandy Steier of 1010data:
1. One of the company’s customers stores and works on a table that has about half a trillion rows of data. That is a massive amount of data to analyze in one go. This may set some kind of record or it may not – who’s to say? But it certainly qualifies as genuinely big data. This single statistic suggests that there is probably no size of table that a company would not want to analyze interactively if it could. In our view interactive analysis is the ideal for a data analyst (if only those goddam computers weren’t so impossibly slow).
2. The average number of tables that 1010data’s customers actually hold is somewhere in the region of 2,000 to 3,000. Think about that. Even if some of those tables are discarded tests or no longer active, that’s a really big number. And since 1010data customers pay by usage there’s probably an incentive for customers not to be profligate with the resources they consume. So even if half of those tables could be archived or deleted, that is a large number of tables.
In our view, this speaks to the fact that, among 1010data’s customers at least, the amount of data analysis that is going on is extensive, occupies a good many staff and covers many aspects of the business.
What we are beginning to see emerge can be thought of as “The Evidence-Based Business.” Businesses that operate mainly on the basis of analytical intelligence certainly qualify for that title. Such businesses are not entirely new; insurance companies have operated that way for many years and, more recently, so have trading banks. Google, to some degree, pioneered this mode of operation, and other web-based businesses have imitated it. But now we see this spreading to many other sectors.
A final point that emerged from our briefing with 1010data is that some of its customers are sharing their data commercially by renting access to partners. One of 1010data’s customers actually runs its data warehouse – hosted entirely by 1010data – at a profit, thanks to the revenue it receives from its data-sharing arrangements.
This is an interesting “straw in the wind.” The direct market for data has existed for a long time and has experienced very little innovation. This kind of operation is, as far as we know, new and could usher in a distinctively different way to profit from data. This may be an opportunity that many companies could seize.