Tag Archives: data cleansing

Exactly How Bad is Your Data?

Underestimating the damage caused by an out-of-date, duplicate-riddled database is the most frequently made and most damaging mistake. If I had ten bucks for every time I’ve heard a client admit that their database is lousy, I probably wouldn’t quite be able to retire yet, but I’d sure have a lot more money than I do right now.

When I ask the question “Just how bad is it?” the average answer is that about 10-20% of the records are probably out of date, and the reason it can’t be fixed is that there just isn’t the time, the money or the directive from senior management to do the clean-up. I can only guess that they have decided the data is better than it really is and that it’s not important enough to worry about. That really needs to change.

Databases require constant, relentless maintenance. A database is, in my occasionally humble opinion, the greatest example of entropy that exists in the modern business universe. A database is constantly degrading. It doesn’t take a vacation from falling apart, and it becomes virtually useless long before it reaches the point where it can’t get any worse.

A database is an asset which, when left alone, turns into a liability. Would you tolerate a furnace or air conditioner in your home that only ever delivered half of what you were paying for? Would you put up with a car that only EVER got you halfway home, or an elevator in your office that never made it up to your floor? Then what is the logic of taking something as vital to your business as your customer and prospect universe and ignoring the fact that 20% of the records are no good?

You really have no idea just how bad it is.  That 10-20% out of date is probably a lot worse.  It’s probably more like 30-40% out of date and even that wouldn’t be quite so disastrous if you could tag and isolate which records are no good, but even that hasn’t happened in many cases.

Your sales team is NOT cleaning and updating your database as they make their calls. How can anyone actually expect that their sales team is cleaning up the database? Really. I don’t understand whether this is a case of completely underestimating the true value of sales time, whether it’s just a pathetic excuse for ignoring the problem, or whether it’s something that actually might apply to companies who do not have a real sales team, just a bunch of telephone order takers who are perfectly suited to the role of overpaid, inaccurate data entry staff. But if you have a real sales team, making real sales calls, with real sales quotas you expect to be achieved, don’t believe for a split second that they are expending one minute of precious time updating the database. It’s not happening.

So what is it costing you? The short answer is a larger small fortune than you might think. Your marketing team is working with budgets that will often limit the number of contacts messaged with any given type of campaign. Simplistically, you could say that if 30% of your data is out of date, you’re throwing away 30% of every campaign investment, but it’s not quite that simple. Since contacts are usually pulled based on different criteria, the bad records will be scattered randomly throughout any list. No two lists will be the same, which means that there is no way to put a consistent value on either the built-in wasted money or the response failures due to bad data.

That means you can’t even accurately measure your responses. So you cannot test or at some point improve ANYTHING.  You cannot measure your creative, you can’t evaluate your offers, you can’t accurately benchmark a single metric.

Your marketing team looks incompetent because your response levels are always going to be lower than they should be (or really are) and you can’t improve them through any mechanism beyond dumb luck. You will throw away bad ideas without ever understanding why they’re bad, and you’ll throw away brilliant ideas because you never figured out they’re any good. One of the most important tasks a CMO faces is to deliver a measurable and improving ROI on marketing investment and demonstrate a contribution to revenue and the bottom line. Just how bad do you look as a CMO when everything you’re measured on is based on immeasurable data?

No one using your database will think twice about entering more junky information, which gives your sales reps the perfect excuse not to keep their notes and client information up to date. That will probably mean that not only will your customers not receive any new sales or marketing information, they’ll also miss your administrative updates.

Licensing and maintenance renewals will be lost or late, and your A/R results will be similarly affected. Add up all the ways that your company can lose revenue opportunities and incur unnecessary expenses, all because of databases that are out of date.

There are many companies offering software and services that will allow you to evaluate your data and some fixes are more easily secured than others, like address, telephone and email information.

Other, potentially more important updates, like finding the correct contact names, might require a more individual form of intervention, like Boxpilot’s Data Filler Service.





Clean Data is Everyone’s Responsibility

Too many businesses seem to think that maintaining the quality of the database can be managed by the sales team and the accounting group. While in absolute terms this might be a bigger problem in smaller businesses without a budget to purchase data hygiene and append services on a regular basis, large organizations could probably save a small fortune in data costs if everyone working with the data took some measure of responsibility for its management.

Last week I spoke with a company executive who truly seemed to believe that because the sales team was actively working the prospect database of some 10,000 businesses, they could expect a 90% accuracy rate. How do you tell someone that they’re delusional?

According to Netprospex, the average B2B database decays at a rate of 2% a month, which means that in a year, roughly one quarter of your contact information is useless unless it is regularly maintained. If you believe that your sales people can adequately manage that job in addition to the real reason you have them on the payroll, which is to sell, then I suggest you sit yourself down with a calculator and look realistically at how many different companies they have contact with in a year. You’ll start to get an idea of how ugly your prospecting base might be if you lift up the lid and look in the box. Not to mention that sales teams are not exactly renowned for their meticulous attention to detail.
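If you’d rather let the computer be the calculator, here is a minimal sketch of the decay arithmetic. It assumes the 2% monthly figure applies to whatever records are still good each month (compounding); that is my reading, not something the Netprospex figure spells out, and the function name is made up for illustration.

```python
def fraction_stale(monthly_decay: float, months: int) -> float:
    """Share of records expected to be out of date after `months`.

    Assumption: every month the same fraction of the records that are
    still good goes bad (compounding decay).
    """
    still_good = (1.0 - monthly_decay) ** months
    return 1.0 - still_good


if __name__ == "__main__":
    # How an untouched database looks after various periods of neglect.
    for months in (6, 12, 24, 36):
        print(f"{months:>2} months: {fraction_stale(0.02, months):.1%} stale")
```

Under that assumption, a year of neglect leaves a bit over a fifth of your records stale. Simply multiplying 2% by 12 gives 24%, which is where the “one quarter” rule of thumb comes from. Either way, the number keeps climbing every month you leave the data alone.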

Make your data everyone’s concern. With well-distributed and clear standards for how data should be entered, no one who accesses the database is too big or too small to contribute in small ways, like tagging/flagging duplicates, filling in fields that they might have the information for and correcting simple, obvious errors.

If, like many businesses, your database is key to the success of your marketing and sales programs, everyone benefits when the information is improved.



Avoid a Database Disaster with 5 Simple Steps

There’s no denying that your company’s customer/marketing database(s) is an invaluable asset to your business and at the same time a major pain in the neck. There are just too many ways that it can be damaged, as far as usability is concerned, and unless you’ve been through it all before, chances are you will not anticipate how you can go wrong.


It’s difficult to define the exact information you need to input in the first place.  Don’t just go with the software defaults, unless of course it really is important for you to record the President’s secretary’s birthday.

When you’re buying data, it’s equally easy to be seduced by countless fields of nice-to-know stuff, but if it doesn’t stay up to date, it is pretty worthless in the long run.

When you have different individuals who are inputting data (including the dreaded sales team), consistency can quickly go out the window. While adding partial data seems much better than adding nothing at all, you’ve just kicked the “Duplicate My Records” door wide open and if some of your original source information comes from self-filled on-site forms, you’ll quickly find that much of what people give you is not true.

On top of errors in the design of your database and inconsistent input, it’s horribly true that a database is a fabulous example of entropy, because the people and companies in your database are constantly changing.

Taken together, these (and many other factors) spell Data Disaster, unless you can consistently follow 5 Simple Rules:

  1. Remove your duplicates and establish standards of how the data in your fields is entered to avoid adding more duplicates
  2. Use it or Lose it.  Untouched data does not remain accurate, regardless of how good it was when it was originally entered or how much you paid for it.
  3. Look at a manual or automated append service to bring your records up to a usable standard. It’s much easier to avoid entering duplicates when you append simple address information against which you can match the files.
  4. Verify your key information.  One thing I can’t personally buy into is allowing anything automated to update actual contact names, given the many different ways that job titles can be interpreted. Even using a verification source like LinkedIn can still allow for the insertion of contacts who have already left a company before you even enter them.  Ironically enough, I’m far more comfortable accepting automatic information for C-Level and Board Member contacts in major corporations than the information for their subordinates.  Any data for middle management should be confirmed by a call to the company.  This is where the volume of your contacts will probably be, and the most errors.
  5. Stay on top of your data.  Consistently applying a relatively small amount of time, attention and money to maintenance, will help to keep your database as an asset to your company instead of an albatross.







Nothing About Data Cleansing Is Easy

Too many companies are building their marketing programs based on lousy data. While consumer databases are easily overwhelmed with the staggering volume of available information, B2B databases are inherently more complex and, once they start to deteriorate, downright ugly.

This is actually a post for smaller businesses about setting up your database to run an append, but the more you look into the subject, the more you’ll tighten up the controls on what goes into your database in the first place so that you’re not overwhelmed with what are actually the first simple steps. You might even want to consider setting aside a small portion of the budget you assign to any marketing program that uses your database in order to improve your database information every time you use it.

Appending your database is simply a process by which the companies in your database are matched up with those in another master database and, once matched, some of the empty fields in your data are filled with the information from the other base. There are of course limitations to what can be filled in with any hope of accuracy. While you can use appends to pull up standard industry information, phone numbers, some web data and executive names, I’m highly dubious of the quality of the contact, title, individual direct line and email information for anyone but the most public figures within an organization. Additionally, you have to consider the limitation contained in the phrase “matched with another master database”, because the match rates might actually be very poor, which means you’ll still have a lot of holes when you’re finished. Oh yes, and it’s far from free.
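To make the mechanics concrete, here is a rough sketch of an append in Python. It is not how any particular append vendor works. The field names (`company`, `phone`, `industry`), the list of company suffixes and the idea of matching on a crudely normalized company name are all assumptions made for illustration; real services match on far more than the name.

```python
import re

# Assumed list of suffixes to ignore when matching; purely illustrative.
COMPANY_SUFFIXES = {"inc", "ltd", "llc", "corp", "co"}


def match_key(company_name: str) -> str:
    """Crude matching key: lowercase, strip punctuation and common suffixes."""
    name = re.sub(r"[^a-z0-9 ]", "", company_name.lower())
    words = [w for w in name.split() if w not in COMPANY_SUFFIXES]
    return " ".join(words)


def append_records(own, master):
    """Fill empty fields in `own` from `master`.

    Returns the updated records and the match rate (share of your
    records that found a partner in the master database).
    """
    master_by_key = {match_key(m["company"]): m for m in master}
    matched = 0
    result = []
    for record in own:
        updated = dict(record)
        source = master_by_key.get(match_key(record["company"]))
        if source is not None:
            matched += 1
            for field, value in source.items():
                if not updated.get(field):  # only fill gaps, never overwrite
                    updated[field] = value
        result.append(updated)
    match_rate = matched / len(own) if own else 0.0
    return result, match_rate
```

The returned match rate is the number to watch: whatever share of your records did not match is the share that still has holes after you have paid for the append.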

What that means is that before you can move ahead there are three things that must be done:

Select Your Files.  You need to determine which data in your base is worth spending the money on.  Lead data that is very fresh is one thing, but do you really think it’s a good use of your money to fill in empty data fields for a lead you generated five years ago that never responded to your subsequent efforts to convert?  Once you’ve made that decision, your first task will be to isolate that data. How will you do that?  If it’s by sorting to a code/date/source that was never entered in the first place, you will have just hit the first of what will probably be many snags.

Identify and Remove Your Duplicates.  As with the previous task, this sounds easy but it can be a terrible job. Really, though, you don’t have much of a choice if you ever want to clean up your data. There are a few obvious places to start. For example, if you actually have contacts in your database that are flagged as duplicates or no longer working with the company, and/or you have companies that are flagged as duplicates or out of business, why are they still there?  It might seem as if they are already discounted enough to ignore, but that’s just because you’re not the sales rep who is manually working with the data and might just not notice that little field in the corner that identifies the contact file you just entered your notes and next steps into as the duplicate file.  Sound stupid?  I can assure you it happens a lot, and now you have good information in a bad file.  The other challenge many companies will face with the simple question of duplicates is to identify which is actually the good record and which is the dupe that can be removed.  It’s not at all out of the question that at some point you’re going to have to put a real set of eyeballs on your data to make the decisions you can’t trust the software to make. Tedious, expensive and time-consuming work it is, too.
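As a rough illustration of where software can help and where the eyeballs come in, here is a small sketch that groups likely duplicate contacts and suggests which record to keep. The field names (`name`, `company`, `email`) and the “most filled-in fields wins” rule are assumptions for the example, not a recommendation; the suggestion is meant to be confirmed by a person, not applied automatically.

```python
from collections import defaultdict


def find_duplicate_groups(contacts):
    """Group contacts that share an email, or a name and company if no email."""
    groups = defaultdict(list)
    for contact in contacts:
        email = contact.get("email", "").strip().lower()
        if email:
            key = ("email", email)
        else:
            key = (
                "name",
                contact.get("name", "").strip().lower(),
                contact.get("company", "").strip().lower(),
            )
        groups[key].append(contact)
    # Only groups with more than one record are candidate duplicates.
    return [group for group in groups.values() if len(group) > 1]


def suggest_survivor(group):
    """Suggest the record with the most filled-in fields as the one to keep."""
    return max(group, key=lambda c: sum(1 for value in c.values() if value))
```

Before deleting the other records in a group, someone still has to check whether they hold notes or next steps that belong on the survivor, which is exactly the good-information-in-a-bad-file problem described above.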

Clean and Standardize Your Remaining Records.  When you begin the append, you’ll only be able to get a half-decent match rate if you’re working with data that has been cleansed.  That means that, at the very least, numbers have to be formatted consistently, and address formatting and abbreviations also need to be standardized.  There is software that will help you clean up your data and get it into the right format to maximize your match rates.
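For a sense of what that standardizing involves, here is a minimal sketch. It assumes the “numbers” in question are North American phone numbers and uses a tiny, made-up abbreviation table; commercial data hygiene tools do far more than this.

```python
import re

# Illustrative abbreviation table; a real one would be much longer.
ABBREVIATIONS = {
    "street": "St", "st": "St",
    "avenue": "Ave", "ave": "Ave",
    "road": "Rd", "rd": "Rd",
    "suite": "Ste", "ste": "Ste",
}


def standardize_phone(raw: str) -> str:
    """Format a 10-digit number as NNN-NNN-NNNN; otherwise return the digits."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    if len(digits) == 10:
        return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"
    return digits  # leave unusual numbers for a person to review


def standardize_address(raw: str) -> str:
    """Drop commas and periods and apply one abbreviation per street term."""
    words = raw.replace(",", " ").replace(".", " ").split()
    return " ".join(ABBREVIATIONS.get(word.lower(), word) for word in words)
```

With rules like these applied before the append, “(416) 555-0123” and “+1 416.555.0123” end up as the same string, which is what gives the matching software a fighting chance.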

Right about this point, if not already, you’ve probably at least made a few scratch notes on new database entry policies around data formatting, duplicate checking and key information fields, so that you might not have to go through this again, or at least, not for a while.