Days of Ore


1 April 2005

Once upon a time the simple objective of data management was to record and store information so that it would be secure and available quickly when retrieval was required. Ah, innocent times! Today, we are beginning to assume that almost all data can yield added value if we collect enough of it and analyse it smartly enough.

Data mining is very smart technology that employs the most sophisticated software (‘artificial intelligence techniques’ are mentioned a lot) to look for patterns and relationships in huge information stores. It allows organisations to get at valuable nuggets and patterns of information stored in, and often lost in, vast databases. It was originally focussed on static data that probably stretched back over quite a long time in large organisations like banks, utilities and major retail chains.

This was, and still is, exciting for anyone looking for marketing information that gives a competitive edge because, in essence, data mining can examine data in ways that were not even thought of when the information was first collected.

 


Most business people have probably come across the obstacle of only getting from your system the very specific information it was set up to yield. For example, every customer address may contain Mr or Mrs, but you can’t do a breakdown by sex because that was not set up as a separate ‘field’. Well, data mining techniques can do much more sophisticated analyses.
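The Mr/Mrs example can be made concrete with a few lines of code. What follows is only a minimal sketch: the title-to-sex mapping, the function names and the sample addressees are all invented for illustration, and real address data would need far more careful handling.

```python
from collections import Counter

# Hypothetical mapping from courtesy title to sex (an assumption for this sketch).
TITLE_TO_SEX = {"mr": "male", "mrs": "female", "ms": "female", "miss": "female"}


def infer_sex(addressee: str) -> str:
    """Guess sex from the leading courtesy title of a free-text addressee line."""
    words = addressee.strip().split()
    if not words:
        return "unknown"
    title = words[0].rstrip(".").lower()
    return TITLE_TO_SEX.get(title, "unknown")


def breakdown_by_sex(addressees):
    """Count addressees per inferred sex, even though no 'sex' field exists."""
    return Counter(infer_sex(a) for a in addressees)


# Invented sample data.
sample = ["Mr John Murphy", "Mrs. Mary Byrne", "Dr Pat Kelly", "Ms Aoife Walsh"]
print(breakdown_by_sex(sample))
```

Titles such as Dr fall into an ‘unknown’ bucket rather than being guessed at, which is the safer choice when the result feeds a marketing decision.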

Today, the latest generation of smart search algorithms is equally at home with ‘live’ data. They have to be, since the focus now is on a fast-moving e-commerce world with direct electronic connections. Data that was generated in the last few seconds’ business may need to be analysed in case it could inform how we do the next few seconds’ worth of trade!

Assembling all the clues

There are various estimates that suggest the volume of information stored on the world’s computers is doubling about every ten months or even less. In this context the ‘data mining’ metaphor is still one of the most apt in IT, since looking for the right information for your purpose is so often like mining: digging your way through lots of dross to get at the relatively tiny nuggets of ore.

In essence, data mining is the extraction of knowledge — meaningful relations and patterns — from simple data. One definition is ‘…the extraction of implicit, previously unknown and potentially useful knowledge from data.’ So the knowledge acquired is new, not obvious and capable of being used.

The basic concept behind data mining is that in any organisation each and every ‘transaction’ gives us one more clue about the nature and behaviour of the customer and the business environment.

It could be a purchase in a supermarket or online, a withdrawal from a bank ATM or a quick online account balance check. It could be the collection of a pension from a post office, a visit to a hospital outpatients’ department, a delivery or whatever. Every data record generated may yield another tiny little jigsaw piece to help build up a picture of the whole — for example, the day of the week or the time of day that the activity takes place. 

The software techniques involved in data mining are not themselves new: advanced relational database technology, statistical analysis, machine learning methods, expert systems or artificial intelligence.

A closely related concept to data mining, and an equally simple metaphor, is the data warehouse. Putting it simply, it is a special way of organising data so that it can be easily accessed and used by many people within an organisation in much the same way as items can be logged, tracked and recovered in the metaphorical warehouse.

A data warehouse collects and orders relevant data from a wide range of formats, and possibly from different locations, so that it can all be made available in one unified database. This data is transformed into a consistent and easy-to-use form for business use. It is in fact a decision support system, extracting information from a mass of data.

Nowadays the terms data mining and data warehousing seem to be linked all the time to our other current darling, Customer Relationship Management. So it is worth pointing out that what they actually do — and can do very successfully — is to help the organisation generate information that can be valuable when applied to a CRM system.

The term that is increasingly being used in relation to data mining is ‘predictive’. For instance, with all of the information it has on its customers, a business can begin to predict fairly accurately what an individual or sub-set of customers is likely to do in certain circumstances. It is also, logically enough, possible to predict even more accurately how large groups or categories of customers would behave.

Eircom, with its customer and call data going back decades, undoubtedly has one of the largest data stores in Ireland. It has developed its own in-house data mining expertise since the early 90s. ‘CRM is now a buzz word but we have been developing what we call our ‘CRM machine’ for nearly four years,’ says Eircom’s David Kelly. ‘It is all about a single, unified view of the customer that will empower managers to make better informed decisions — and to make them more quickly. Our “Options” discount scheme, for example, is based on what we know of our customers’ habits and call patterns.’

IBM is one of the pioneers of data mining (Dublin-based Dr Barry Devlin of IBM wrote one of the definitive books on the subject in the early 90s) and has tracked the evolution from the early techniques through data warehousing to the smart predictive applications of today. 

‘There are many smartly managed data stores out there that span the evolution of data mining,’ says Oisin Byrne, head of the Software Group in IBM Ireland. ‘The banks and large financial services were into it early on and of course the most publicly visible are the loyalty schemes of major supermarket groups like Superquinn, which was set up very early, and Tesco and others today.’

He points out also that the marketing and CRM applications are the glamour end, but data mining and its analytical tools are being used very successfully at the other end of that spectrum, in fraud detection and prevention. 

‘There are also other major non-commercial areas like healthcare and hospital management or government where what we are now inclined to call ‘intelligent mining’ techniques are yielding valuable information for decision makers,’ says Byrne.

Database doyen Oracle has espoused data mining to the extent that such a capability is actually built into the latest (Oracle 9i) edition of its core product. ‘CRM is undoubtedly a driver,’ says John Caulfield of Oracle Ireland, ‘in that it is very often data mining that provides the underlying foundation for intelligent CRM systems. In contact-centre systems, for example, it is reaching the point where analysis is taking place almost as the latest data is coming in from customer interactions and transactions.’

Data mining and direct marketing are also good partners, John Caulfield points out. ‘You can target your promotion and your prospects better with mining techniques, so that instead of a million shots with a one or two per cent response you may send 50,000 shots and get a 20 per cent response rate or even better. Same return for less outlay.’
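The arithmetic behind that claim is easy to check. The sketch below simply plugs in the figures quoted above; the function name is an assumption made for illustration.

```python
def expected_responses(shots: int, response_percent: int) -> int:
    """Number of responses for a mailing of `shots` at a given percentage rate."""
    return shots * response_percent // 100


# Untargeted campaign: a million shots at one or two per cent.
mass_low = expected_responses(1_000_000, 1)
mass_high = expected_responses(1_000_000, 2)

# Targeted campaign: 50,000 shots at 20 per cent.
targeted = expected_responses(50_000, 20)

print(mass_low, mass_high, targeted)
```

On these figures the targeted campaign matches the one per cent case with a twentieth of the mailing volume, and would need a 40 per cent response to match the two per cent case.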

But he too is anxious to point out that data mining is now playing a major role in non-marketing areas. ‘Purchasing and the whole area of supply chain management has a lot to gain. Many large organisations are deploying data mining applications to look at costs, deliveries, quality and all of the thousands of variables so as to identify ways in which savings or improvements can be made.’

Client Solutions is an Irish company, now part of the Horizon Group, that has been involved with data mining for many years. Managing director Shemas Eivers is firmly of the view that the essential strategy for such applications today is to get the data store right first. 

‘Classic data mining was a process of automatic discovery and had to do with getting answers to questions we hadn’t thought to ask,’ says Eivers. ‘Now we are inclined to ask questions and expect the data to provide answers. I think any organisation getting into this territory should not get too hung up on what users will want in their reports — which will change in time anyway. The priority is to focus on what sets of data are available and to design the data store accordingly.’

He is also inclined to be a bit dismissive of the data warehouse/data-mart approach: ‘That may be over complicating the task for the scale of enterprise we have in Ireland. If the data is good, the sources are validated and the data store is well designed and managed, then it should be possible to get it to yield the best information available. Data mining technology is one very clever way of doing that but there are others as well.’

Cleaning it up

Shemas Eivers is one of several experts who point out that the quality of data is fundamental to the success of any process that aims to gain ‘business intelligence’ from raw data. ‘Data cleansing’ is a term that anyone venturing near this topic will quickly come across. ‘Data Quality is the weak underbelly of a variety of mission critical applications and infrastructure components in customer relationships, supply-chain management, data warehousing and data mining,’ according to a report last year from the Giga Information Group.

One software company specialising in data cleansing tools, which are an essential complement to analytical systems, is Similarity Systems. Its core product is Athanor, which provides analysis, standardisation and enhancement capabilities for organisations with large data volumes. ‘The driver is that businesses have invested heavily in CRM, data warehousing/data mining and e-business applications,’ says marketing manager Tommy Drummond, ‘but they have not put data quality processes into place. As a result, the applications are not providing sufficient return as they are trying to perform with low quality data.’

He points out that the authoritative PwC Global Data Management Survey 2001 found that ‘75 per cent of respondents reported significant problems as a result of defective data.’ In a similar vein a Gartner Group report on CRM last October stated: ‘Only when a foundation of good data has been built will enterprises find that subsequent investments will generate acceptable payback.’

Duplication of customer information from different databases is a common and well understood problem when consolidating data, but Tommy Drummond points to subtler and potentially even more serious problems. ‘Consistency and conformity of data from different data set sources is always tricky. Completeness is an issue – being the proportion of data fields containing usable data, not empty or just with default values. You can also have issues of integrity with data that is missing important relationship linkages, eg family or subsidiary relationships.’
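Completeness as described here, the proportion of fields holding usable data, is simple to measure. The following is a minimal sketch and not a description of how Athanor or any other product works; the list of default values, the field names and the sample records are all assumptions.

```python
# Hypothetical placeholder values that should not count as usable data.
DEFAULT_VALUES = {"", "N/A", "UNKNOWN", "0000-00-00"}


def completeness(records, fields, defaults=DEFAULT_VALUES):
    """Return the share of (record, field) cells that contain usable data."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    usable = 0
    for record in records:
        for field in fields:
            value = record.get(field)
            # A cell is usable if it is present and not an empty or default value.
            if value is not None and str(value).strip().upper() not in defaults:
                usable += 1
    return usable / total


# Invented sample records.
customers = [
    {"name": "M. Byrne", "phone": "01 555 0100", "county": "Dublin"},
    {"name": "J. Murphy", "phone": "", "county": "N/A"},
]
print(completeness(customers, ["name", "phone", "county"]))
```

Here four of the six cells are usable, so the measure comes out at about two-thirds. Tracking that figure per field, not just overall, shows where cleansing effort would pay off first.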

One of the international leaders in data mining for many years is SAS Institute, now the largest privately owned software house in the world. Irish head Patrick Durkin says that traditional data mining has mutated into analytical applications. ‘The early users were banks and giant retailers, and they still are today, but most organisations do not have the in-house expertise to use esoteric software tools. They want user-friendly stuff to help them see how to cross-sell financial packages to their customers or analyse shopping baskets — on-line or in actual supermarkets — to find ways to get customers to put more into them!’

Pure data mining is an off-line exercise, Patrick Durkin points out, where business intelligence is gleaned through rules-based analytical code. But many e-business applications need to work in real time and the analytic tools from data mining have matured into that new environment. 

‘You start with a model that is based on history,’ he says, ‘then from what you learn you can build in prediction by assigning scoring for different factors. But of course ‘history’ can be a few seconds ago and the analytical system is learning and feeding back all the time.’ This is the kind of smart system that enables, for example, instant on-line decisions about loans and even mortgages (subject always to the small print!)
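A toy version of ‘assigning scoring for different factors’ might look like the sketch below. The factor names, weights, base score and threshold are invented for illustration; in a real system they would be derived from historical data and continually revised, and nothing here reflects any vendor's actual product.

```python
# Hypothetical factor weights, as if learned from past lending history.
WEIGHTS = {
    "years_as_customer": 2,     # each year adds two points
    "missed_payments": -15,     # each missed payment costs fifteen points
    "has_salary_mandate": 20,   # salary paid into the account adds twenty
}


def score(applicant, weights=WEIGHTS, base=50):
    """Add weighted factor values to a base score; missing factors count as 0."""
    return base + sum(w * applicant.get(factor, 0) for factor, w in weights.items())


def decide(applicant, threshold=60):
    """Instant decision: approve at or above the threshold, otherwise refer."""
    return "approve" if score(applicant) >= threshold else "refer"


# Invented applicants.
regular = {"years_as_customer": 5, "missed_payments": 1, "has_salary_mandate": 1}
newcomer = {"years_as_customer": 1, "missed_payments": 2}
print(score(regular), decide(regular))
print(score(newcomer), decide(newcomer))
```

The ‘learning and feeding back’ part would amount to re-estimating the weights as fresh outcomes arrive, which this sketch leaves out.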

The last word goes to Barry McIntyre, CEO of Irish data mining specialist Habaca: ‘The hype is about the ‘predictive’ capabilities of the CRM, data mining/warehousing set of state of the art software. But we must not confuse prediction with forecasting — forecasts are all about specific time periods whereas prediction is not time-dependent and is much more qualitative and subtle. You will always want to answer the ‘What would happen if…?’ type of question. But what the smartest system will give you is the factors that will affect the outcome and the kind of outcome.’

What also happens, of course, is that key performance indicators emerge which can be tested against forecasts and fed back into an ever more accurate loop. But one key fact of business life is unchanged: managers still have to manage, people still have to make decisions and take responsibility for them. Still, those who need to can now blame much smarter systems!

Analysing the Anonymous

It is easy to appreciate how knowledge of customer behaviour and patterns can help a mobile operator such as Eircell to make decisions and plan its strategies. But when you don’t know who your customers are, there is a real challenge. 

Habaca (formerly SPSS Ireland) is a specialist data mining company that was retained by Eircell to help it make sense of the prepaid market and ‘churn’ because ‘Ready to Go’ customers are in some measure always ready to do just that. 

The Habaca team set about segmenting and profiling these anonymous users, working just with dates and times for purchase, top-up and call usage. It then worked out the churn patterns in the user base and finally predicted churn by identifying vulnerable user profiles across the customer base.
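To give a flavour of what can be done with nothing but dates, here is a deliberately simple sketch that flags a prepaid user whose silence since the last top-up is much longer than their usual gap between top-ups. It is an illustrative assumption, not Habaca's method; the function name, the factor of two and the sample dates are invented.

```python
from datetime import date


def churn_risk(topup_dates, today, factor=2.0):
    """Flag a user whose current silence exceeds `factor` times their typical gap."""
    dates = sorted(topup_dates)
    if len(dates) < 2:
        return "unknown"  # not enough history to establish a pattern
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    typical_gap = sum(gaps) / len(gaps)
    silence = (today - dates[-1]).days
    return "at risk" if silence > factor * typical_gap else "active"


# Invented history: a top-up every ten days, then nothing.
history = [date(2001, 1, 1), date(2001, 1, 11), date(2001, 1, 21)]
print(churn_risk(history, today=date(2001, 1, 30)))
print(churn_risk(history, today=date(2001, 2, 20)))
```

A real model would combine many more signals, such as call usage and time of day, but the principle of comparing current behaviour with an established pattern is the same.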

Managing churn allows Eircell to target campaigns and interactions with its customers to ensure they are treated in the most suitable way according to their needs. As Anthony O’Neill, Customer Insights Manager, explains: ‘Data Mining’s most fundamental advantage is that one does not get blinded by a mass of data. It allows you to try out lots of ideas quickly and eliminate irrelevant ones from consideration. Without Data Mining, valuable pieces of customer information can pass unnoticed.’

Eircell uses the results of predictive data mining to make strategic marketing decisions to minimise churn and to implement targeted promotional campaigns. The ability to predict behaviour in its customer base allows Eircell to develop long-term CRM strategies. On-going analysis of the data gives an up to the minute understanding of the different customer segments using predictive churn models.
