Inside Analysis

Magic and the Philosophy of Data

This article is part of The Bloor Group’s research program, Philosophy of Data, which has been underwritten by IRI, The CoSort Company.

Humanity seeks out knowledge without questioning why. Governments invested in education to confer knowledge and skills on our children, and they established universities so that restless men and women of letters could busy themselves with taming the uncertain and the unknown. These pursuits drove us into the information age, and we are marching through it without any understanding of where it might lead. It is enough for us that the end game is knowledge.

So what is data?

Let’s consider data in digital terms. Data is a molecule of information: a record of an event, a signal from some source, a report from some sensor, a measurement, a transaction, a message. On its own, it has no definite meaning, just as a word without the context of a sentence has no definite meaning. A word is a molecule of a language, and a language is a framework for creating data. But when we create data, it is within a context of some kind. Some hardware running some software produces some output. Whether that output stays where it was born or travels to some other digital domicile, it congregates with other data, and in doing so, it achieves the status of information.

When data becomes information, it achieves value, to some degree. Its value is holistic, born of its content and its structure, and it is valuable only to someone or something that can use it. It acquires a meaningful context. Paradoxes seem to arise when you consider this. Which is worth more, one of Shakespeare’s sonnets or the full history of tweets? One is less than a kilobyte, and the other is multiple petabytes. In truth, there is no paradox here. When you’re dying of thirst in the desert, water is worth more than gold. On the streets of any given first-world city, the opposite is true. Information has value by context, and its value will be exercised accordingly.

But the value of information, in digital form or otherwise, can be damaged or destroyed. It needs to be protected against corruption. We know this, we always knew it, and nowadays we are reminded more frequently than ever how important data protection is. If data is born innocent, and it often is, it does not need to be changed. And if it is born corrupted, and it can be, the corruption can be expunged and then it also will never need to change.

In that sense, information is distinctly different from the two fly-by-nights of the digital world: hardware and software. Hardware has the lifespan of a small rodent. It is born obsolescent and dies quickly, made irrelevant by its grandchildren. In comparison, software has longevity. A really useful piece of software might have the lifespan of a horse – a less useful one, that of a dog. Yet information is immortal if properly tended to.

The governance, the tender loving care information obviously needs, has only recently garnered the keen attention of the IT world. Data volumes grew exponentially for years, and one day someone made the quantitative observation: it’s big. And the marketing meme of the decade was born. So big governance was required, though in truth it was always required and, to some degree, provided. But there was the dimension of value to consider. Data was being created and harvested for a reason. If corrupted or stolen, its value could be lost. Taking care of data meant securing and organizing it too.

And this is where we are with information.

Knowledge

Information is somewhat magical. If you know what spells to cast on it, it becomes knowledge. These spells are called algorithms, and there are many such spells. The right incantation executed in the appropriate way can turn mere bits and bytes into pure platinum. We were always capable of creating knowledge, and as a species we have done it for centuries without the assistance of silicon. Then we exceeded ourselves. We enshrined our knowledge in magical software and, almost in defiance of some universal law, we used knowledge to create knowledge.

The digital age has moved us to a point where in almost every intellectual activity the silicon has left us in the dust. It exceeded us in calculation, information capacity and accuracy. It defeated us in intellectual contests: chess and Go. It beat us in quiz games. It tamed semantics and became a better linguist. Its artificial intelligence trumped our real intelligence. Its ability to generate knowledge is now surpassing our own.

This is where we are with knowledge. Our apprentice has become our master – and yet, it lacks arrogance, ambition and an ego, so it remains our servant.

Philosophically, that’s how I see it. There is data, there is information and there is knowledge. We have not yet come close to exhausting the possibilities, even though this march of technology seems to have exhausted ours. So we continue to forge on triumphantly, with genuine enthusiasm and no clear idea as to the destination.

Robin Bloor

About Robin Bloor

Robin is co-founder and Chief Analyst of The Bloor Group. He has more than 30 years of experience in the world of data and information management. He is the creator of the Information-Oriented Architecture, which is to data what the SOA is to services. He is the author of several books, including The Electronic B@zaar, From the Silk Road to the eRoad (a book on e-commerce) and three IT books in the Dummies series, on SOA, Service Management and The Cloud. He is an international speaker on information management topics. As an analyst for Bloor Research and The Bloor Group, Robin has written scores of white papers, research reports and columns on a wide range of topics, from database evaluation to networking options and comparisons to the enterprise in transition.

2 Responses to "Magic and the Philosophy of Data"

  • John O'Gorman
    September 27, 2016 - 11:41 pm

    Robin – Seriously, man, you need to get your head on straight. The first two paragraphs under the heading “So what is data” are tossed salad.

    “Data is a molecule of information.”

    Good so far.

    “On its own, it means nothing, just as a word on its own means nothing”

    A word is a representation of Information, so how can it ‘mean nothing’? It may mean many things, and that might cause confusion, but a word on its own can’t ‘mean nothing’.

    “But when we create data, it is within a context of some kind.”

    Also good, so if data is created within a context of some kind, it is – according to your definition in the first line – Information.

    “… it congregates with other data, and in doing so, it achieves the status of information.”

    Again, according to your first line in the paragraph it is *already* Information.

    “When data becomes information, it achieves value, to some degree. Its value is holistic, born of its content and its structure, and it is valuable only to someone or some thing that can use it.”

    Come on, Robin. Someone who has been in this game can do better than that.

    • Robin Bloor
      September 30, 2016 - 10:02 am

      Hi John

      Thanks for your feedback. I think you’re right that I did not articulate my point well, and I have edited the post accordingly.

      It’s true that a word on its own, without any context, seems to have meaning, but without the context its meaning cannot be known. Similarly, the meaning of a single record without context cannot be known. The example I’ve usually given for this is a sports score, say Tigers 1 Cubs 4. It is a record of something, but you cannot know what. You cannot even know for sure that it is a baseball score; it just seems likely. You do not know whether it is a final score. You do not know the date or time. You do not know where the game was played or why, and so on. I classify this as data (better perhaps to think of it as a datum) but not information. If you combine it with more data to provide context, you have information.

      In discussing or proposing a philosophy of data, one inevitably brushes up against a complex area of philosophy: theories of reference, theories of semantics, theories of meaning and so on, about which numerous books and papers have been written. I was trying to circumvent that by confining myself to data in the context of computing, and offering a definition for data, information and knowledge in that context alone. However, it is not easy to avoid the fact that ultimately the computer user is human, and hence reference, semantics and meaning are relevant.

      Thanks again for the feedback
