Data doesn’t need discipline — we do

9 March 2015

Before the dawn of recorded history, or at least before many readers of TechPro were born, a visitor to one of our army barracks spotted an open cupboard door and, behind it, bottles of what was unmistakably liquid ink. Since ink had long been supplanted by the ballpoint, except maybe for some hobby calligraphers and stubborn fountain pen stalwarts, the obvious question was why it was still in stock. The friendly stores sergeant explained: “I can purchase up to X quids’ worth with just my lieutenant’s signature but to dispose of anything officially requires two commandants or a colonel, and there is no one of that rank in this unit anyway. The bottles could meet with an accident, of course, and then I’d have to fill in a few forms. But at this stage everyone in the office here feels a bit nostalgic about them so there they stay.”

Ink is still in the data production mix, although only liquid in inkjets and those irredeemable fountain pensters. But it is the data itself that is easily obtained but difficult to get rid of — officially. Surely no need to go on about the Data Explosion? We all know and we probably share some scepticism about the various rates of increase in Gross National Data Product, but we are acutely aware that the rate of data growth does not actually matter a damn. It is the growth.

It is huge, it is a problem now and it is rapidly becoming an acute one. It affects the consumer, the home, every commercial enterprise and every state institution. Now if a punter loses a bunch of data it might be kick-the-furniture time but it is a small personal incident. If a home or family loses its carefully digitised archives, that is more serious, since they may have contained some things more precious and irreplaceable than ephemeral pop music. At that personal level also, the economics of dealing properly with growing volumes of miscellaneous data may simply be unaffordable. So if the kit is out of financial reach you let the stuff go.

But in the world of organisations there is no commercial, legal or practical choice other than to do and spend whatever it takes to cope. The reasons are multiple, starting with clear and binding responsibilities to others — clients and customers, partners, staff, regulatory authorities. Some of the challenges are to do with costs and technical constraints. Data storage media are progressively reducing in cost still, at least for longer term retention. But increasing scale and its costs are still constraints and more likely to become tighter than otherwise.

Fuzzy term
Standing back and looking coolly at it, data is a fuzzy term. It is meaningful in an ICT context because it is all bits and bytes and invisible to us mere humans. But within it are documents from emails to legal agreements to authors’ ‘manuscripts’ and fiction, audio and video.

This kind of captured history is close to the level about which Vint Cerf quite rightly expressed concern recently. For all previous generations, history and culture centred in many ways around libraries and artefacts and historical records — physical records. The Rosetta Stone was studied for years by scholars before the language was even identified. The records of The Honourable East India Company and a dozen other mercantile empires, plus the Royal Navy, have provided more English language historical information than almost any other resource from the regions where they operated.

Digital is different. No one can tell at a glance what any segment means. Then there is the question of how to read both the code and the media. Early word processors at least produced documents from which the formatting code can be stripped to convert the content to plain text or at least ASCII characters. But early graphics and scanning programs used code that is long since abandoned, archaic if that can be applied to stuff that has yet to qualify as vintage, much less antique.

There is, or so tech folklore has it, a small company somewhere in the US Mid-West making a quiet killing because it bought up a bunch of ancient equipment — Burroughs and Hollerith punched card readers, mainframe tape drives, Vax minicomputers, 5.25” floppy disk drives — and serves clients like local government, libraries and educational institutions in retrieving data onto modern media. Today, they probably work with cloud.

Future historians
Our first steps into the digital world will presumably be as significant or interesting to future generations as, say, atomic energy, the moon landing or DNA sequencing are to us. Yet it could all be lost to future historians, Cerf pointed out to the American Association for the Advancement of Science. We face a forgotten generation or even a forgotten century, he asserted, through what he called ‘bit rot’, where old computer files become useless junk.

He calls for the development of ‘digital vellum’ to preserve old software and hardware so that out-of-date files can be recovered no matter how old they are. Perhaps ironically, today’s data/content is natively digital and independent of the storage media so it is quite conceivable that we will be able to develop automated translation/conversion software to keep it all accessible as the generations of reading tools progress.
