CIO Folder: Data, the most precious material
13 October 2015
We had an interesting spelling mistake here in TechPro a few weeks ago when an email exchange talked about our feature on ‘Bog Data’. If you’re Irish, you will appreciate the inadvertent analogy. We are certainly talking about deep, dense material, going from something nearly as black as coal down at the bottom to light brown and even greenish flaky stuff on the top. That is certainly a metaphor for Big Data.
In fact, to push the analogy even further, any interesting bog offers a compelling resource for a range of sciences, from palaeontology to botany, pre-history to forensic anthropology, archaeology and straightforward history. You could add in the even more fascinating/entertaining activities like criminal investigations and treasure hunting. There is also the slight implicit suggestion that modern data mining is more like bog snorkelling than orthodox mining with its foundations in engineering disciplines.
One of our recent interviewees pointed out that analysis has to cope with the V for Variety in dealing with Big Data, meaning that the data is not tidily classified and cleansed in the traditional serried ranks of a data warehouse. That warehouse metaphor is better than cloud, even if any vision of racks and forklifts and even mechanised handling is just as much of an illusion as the water vapour version of solidly earthbound data centres.
In smart modern data analytics, that most ancient 80/20 rule applies. You can do a first or indeed several cuts at the total mass of data and achieve potentially useful insights. Not only is there no definitive need to cleanse or prepare the data; some considerable part of the value actually resides in the sheer comprehensiveness of the data resources, in their totality.
Once you have got some insight or clues, on the other hand, of course a more targeted analysis is the way to go or indeed a series of alternative slice/dice approaches to see if anything more detailed or more nuanced might be discovered. That is after all the scientific method: hypothesis, test, informed by results, re-calibrate and try again. Error is informative.
The objective and the methodology of data mining and business analytics can be summed up in a little mantra exclusive (for the moment) to this column: Find Any, Find Many, Find All — and that is about it. Patterns, deep insights, infographics, modelling of possibilities all then follow.
But the other major challenge of Mass Data (who cares how Big it is) is that some of it has owners — with rights. That can be a little inconvenient because there are all sorts of laws and regulations and Chinese walls (should they not be Japanese since the analogy is with paper?), Best Practice and voluntary codes and ‘the right to be forgotten’ that all have to be provably built in to the analytics engines. That means a complex set of business rules and multiple moving parts as the regulatory framework changes.
Then there is the hugely complicating factor of political jurisdiction: EU or USA or Scandinavia or the Far East or (preserve us) places like Russia or some of the lesser-known ‘stans or Latin America. Many multinationals have been trying for some years, with varying degrees of success, to develop reliable systems to manage their differing obligations in different markets while at the same time trying to be global in their systems and governance.
Just because there is more clarity emerging in the regulatory frameworks does not make that challenge any easier. In fact it probably accentuates the risk because what is now better defined will almost certainly be better enforced. The sheer costs, legal and otherwise, of an e-discovery order in any jurisdiction are a serious business risk — even with the presumption that the organisation will in due course be cleared of wrongdoing and no sanctions or punishments applied.
Data protection officer
Within the EU, organisations employing more than 250 will have to have a designated Data Protection Officer. So will all state authorities and, significantly, any size of organisation that has data and its processing as a core activity. That suggests all forms of business function outsourcing, perhaps employment agencies and the like, legal and accountancy firms, booking agencies. The fine simply for failing to appoint a DPO can be up to 2% of turnover, so this is potentially a very serious area.
The salient point, however, is not about the DP role per se but that data and its protection, the laws and the ethics, the regulations and the best practices, the sheer breadth of the limitless variations that are going to arise, all suggest a new set of career paths. Not unlike analytics, data officers or Datalysts or whatever we come to call them in their various incarnations will necessarily be a cross between ICT and administration.
The job will require an understanding of both, in some depth if not necessarily to a technical expert level. Other components of the job spec, or at least a convincing CV, might include security and something of a legal understanding. ‘I worked in the Office of the Data Protection Commissioner’ would sound good, wouldn’t it? Or some ombudsperson’s office or an authority.
This rising role of data will concern any CIO. In fact, it should already be on the checklist. At the simplest level, IT will have to work with the designated DPO, so assisting in setting the role up appropriately for the organisation must lie between CIO and CEO, HR and the company secretary function. It is open to enterprises to appoint a designated third party as their DPO and share such a service with other entities. That certainly suggests the advent of the DPO Division in the Big Four and other consultancies, legal firms, and potentially independent specialist firms.
Any individual with a query, complaint or request for some action, e.g. deletion of certain data, can contact the relevant DPO in the first instance. That automatically argues for a certain degree of independence, and indeed rank. So a simple line manager type role would be likely to be discredited the first time a serious issue or dispute arose if there were to be anything less than a visibly objective and fair response.
But to get back on track and theme: the advent of the statutory DPO is simply an indicator of what is happening in society and in business. The new frontier is data, not technology. Technology can only get higher, smarter and more advanced, even to the level of artificial intelligence. It can probably go where no man has gone before. Fine.
But the challenge remains strictly earthbound where man has gone throughout history, impelled by curiosity, greed, anger, opportunity, riches and temptation. The balance of business convenience is a tempting slope down which many corporations have already slipped.
Knowledge and power
Knowledge is power, says the old axiom. Scientia potentia est, attributed to Francis Bacon, was originally ipsa scientia potestas est or ‘knowledge itself is power’ according to the combinative scholarship of Wikipedia. Today we might substitute ‘data’ for knowledge. It too can be erroneous, of course, but for the most part data is correct and absolute and it is only in the interpretation that error arises.
All in all, there is a credible argument that all of the data-related roles and disciplines are becoming the leading edge of applied ICT — acquisition and structuring, database and data storage design, security and protection, data science and analytics, data administration and law. The CIO has information in the title. So where sits the Chief Data Officer? Or that rising star, the Chief Information Security Officer? Of that lot, the new Biggest Challenge is likely to be Data Destruction: who has authority and when and how thorough and automated can deletion ever be?
These data questions are far bigger than the simple mass of Big Data and right now the CIO is the only one in the organisation who has to start examining them in the light of impending reality. Responsibility may pass elsewhere in time. But the CIO will never not be involved.