When and where to deduplicate your CRM – Part 1

Summary: This article provides guidance on when and where to deduplicate your CRM system, and the processes required to manage those duplicates. This is part one of the series – a new CRM.
Article Type: Information/Guidance
Related Product(s): This article relates to the following products:

  • Paribus 365
  • Microsoft Dynamics CRM/365
  • Infor CRM
  • Saleslogix

Deduplicate your CRM

Duplicate records occur for many reasons during the life cycle of a CRM system. We have built the Paribus 365 products to address this issue head-on. However, having a technology to address the problem is only part of the solution. You also need to consider where and when to deduplicate your CRM.

This series of articles looks at a couple of common scenarios and offers recommendations on how to plan the attack on your duplicate data. Some of the recommendations are written with Paribus in mind, but they may also apply to other deduplication products.

Scenario 1 – A brand new CRM system to deduplicate

In most cases, a new CRM system will have data imported or migrated into it, commonly from a number of data sources. Where data quality is considered at all, the customer will often plan to outsource the data quality process, including deduplication, and bring the data into the CRM “all nice and clean”.

Our experience is that there are many issues with this approach. We have no issue with having the data re-validated (for example, having the addresses corrected and updated) or enriched (adding data such as turnover, number of employees or demographic information).

These processes can be helpful in the duplicate identification step, particularly the address validation. However, we often see the following issues when the deduplication process is run externally:

    • The level of matching is generally not very sophisticated.

Many data bureau companies make broad assumptions about what a duplicate is, based primarily or solely on Company Name and Address being the same across matching records. They also tend not to take into consideration the nuances of your business and your own definition of a duplicate record.

    • Are the duplicate records merged or simply deleted?

In a lot of cases, we have seen the latter. By simply deleting the duplicate record(s), you could be throwing away high-value associated data. For example, when deleting a duplicate account, you may also be deleting one or more associated contacts that are not linked to the retained account. You may also throw away other useful data, such as a phone number or web site address, that is available in the duplicate record(s) but not in the master record.

    • Which record should be the Master?

Equally, what determines the Master record to be retained in the first place? If you were able to view the list of duplicates before choosing the Master, you could make an informed choice rather than a fairly random one.

    • Which data source is the Master?

If you have multiple data sources, does one source have more value than others? Or is it known to be more accurate and up to date?  Should this source be considered the preferred Master against the others that are matched?

When Accounting or ERP system data is being imported, it is quite commonly the preferred Master, as it contains live customer data that should be up to date and accurate. However, this Master may not contain all the appropriate contacts (for marketing or support, for example), which you may have in another source.
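
As a rough illustration of the last three points, the following Python sketch merges a group of duplicates rather than deleting them. It is not how Paribus or any bureau works internally. The record layout, the `SOURCE_PRIORITY` ranking and the function names are all hypothetical, chosen only to show a master being picked by source and then enriched from the records it replaces.

```python
# Hypothetical ranking of data sources: lower number = more trusted.
SOURCE_PRIORITY = {"erp": 0, "marketing": 1, "legacy_crm": 2}


def choose_master(records):
    """Pick the record from the most trusted source as the master."""
    return min(records, key=lambda r: SOURCE_PRIORITY.get(r["source"], 99))


def merge_duplicates(records):
    """Merge a group of duplicates instead of deleting the extras.

    Empty fields on the master are filled from the duplicates, and
    contacts from every record are kept on the merged result.
    """
    master = choose_master(records)
    merged = dict(master)
    merged["contacts"] = list(master.get("contacts", []))
    for rec in records:
        if rec is master:
            continue
        for field, value in rec.items():
            if field in ("source", "contacts"):
                continue
            # Only fill gaps; never overwrite a value the master already has.
            if not merged.get(field) and value:
                merged[field] = value
        for contact in rec.get("contacts", []):
            if contact not in merged["contacts"]:
                merged["contacts"].append(contact)
    return merged


# Made-up example data: the ERP record is trusted but incomplete.
duplicates = [
    {"source": "marketing", "name": "Acme Ltd", "phone": "01234 567890",
     "website": "acme.example", "contacts": ["Pat (Marketing)"]},
    {"source": "erp", "name": "Acme Limited", "phone": "",
     "website": "", "contacts": ["Sam (Accounts)"]},
]

merged = merge_duplicates(duplicates)
print(merged)
```

Here the ERP record is retained as the Master, but it gains the phone number, web site address and marketing contact that a plain delete would have discarded.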


Our Recommendations

  1. Firstly, review all the possible sources of data to confirm that each one has sufficient value to justify the time and money needed to import/migrate it into CRM. Data of a certain age may have very low value and not be worth migrating. For example, when we migrated to Microsoft Dynamics 365, we defined “data of a certain age” as data with no activity logged against it within the last 3 years. This saved us time, effort and money, because we did not re-validate that old data, and it also reduced the time needed for the migration.
  2. Decide which of your data sources is to be the “Preferred Master” source.
  3. When importing/migrating your data, consider tagging each data source with some form of identifier to reference its origin (this can be achieved by storing the identifier against each imported data item).
  4. Once the data has been imported/migrated into your CRM system, apply Paribus to identify duplicates. Start by deduplicating the Preferred Master source data: using the Filter functionality, you can “fence off” this master set of data for review. Once the Preferred Master set has been merged and cleansed, create filters for the other data sources so that you can review the other data sets (in bulk or each in turn) against the Preferred Master data set. Note: The settings enable you to ensure that the Preferred Master record is the primary record when it comes to a merge.
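
The preparation steps above can be sketched in a few lines of Python. This is a minimal illustration only, not a Paribus or Dynamics 365 feature: the field names (`last_activity`, `source`), the 3-year cut-off expressed in days and both function names are assumptions made for the example.

```python
from datetime import date, timedelta

# Assumption: "data of a certain age" means no activity in about 3 years.
CUTOFF_DAYS = 3 * 365


def prepare_for_import(records, source, today):
    """Drop stale records and tag the rest with their origin."""
    cutoff = today - timedelta(days=CUTOFF_DAYS)
    kept = []
    for rec in records:
        last = rec.get("last_activity")
        if last is None or last < cutoff:
            continue  # too old (or never active): not worth migrating
        kept.append({**rec, "source": source})
    return kept


def review_order(records, preferred_master):
    """Put Preferred Master records first; the sort is stable otherwise."""
    return sorted(records, key=lambda r: r["source"] != preferred_master)


# Made-up example data from two hypothetical sources.
today = date(2024, 1, 1)
erp = [
    {"name": "Acme Limited", "last_activity": date(2023, 6, 1)},
    {"name": "Old Co", "last_activity": date(2019, 1, 1)},
]
marketing = [{"name": "Acme Ltd", "last_activity": date(2022, 3, 15)}]

staged = (prepare_for_import(marketing, "marketing", today)
          + prepare_for_import(erp, "erp", today))
ordered = review_order(staged, "erp")
print([(r["source"], r["name"]) for r in ordered])
```

Tagging every record with its source at import time is what later makes it possible to filter on that tag, review the Preferred Master set first, and then compare the remaining sources against it.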

The above approach has many significant advantages over deduplicating outside your CRM system.

    • You are in control

You know your data. When it comes to confirming what is or isn’t a duplicate, and which record should be the Preferred Master, you decide.

    • You retain more valuable data

The merging of records within CRM, as managed by Paribus Discovery, means that the Master record is enriched with data from the duplicate(s). Contacts, Activities, Opportunities, Cases etc. are all moved to the Master. Where the Master record is missing a phone number and a duplicate has one, that is moved too, ensuring valuable data is not lost.

    • You can manage Contacts and Leads

We have not seen many external bureaus that handle Contacts and Leads.

    • You will find more duplicates!

The key strength of Paribus Discovery is identifying duplicates that are unlikely to be found by other means. Read our data diary on “What is Fuzzy Matching?” for more information.
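
To give a feel for why fuzzy matching finds duplicates that an exact comparison misses, here is a small Python sketch using the standard library’s `difflib.SequenceMatcher`. This is a generic string-similarity technique, not the matching used by Paribus Discovery, and the `THRESHOLD` value and function names are arbitrary choices for the example.

```python
from difflib import SequenceMatcher

# Hypothetical cut-off; a real project would tune this against its own data.
THRESHOLD = 0.75


def normalise(name):
    """Lower-case, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower()
    )
    return " ".join(cleaned.split())


def similarity(a, b):
    """Return a 0..1 similarity score between two normalised names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def likely_duplicates(names, threshold=THRESHOLD):
    """Return every pair of names whose similarity meets the threshold."""
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if similarity(first, second) >= threshold:
                pairs.append((first, second))
    return pairs


# Made-up company names: the first two differ in case and abbreviation.
names = ["Acme Ltd.", "ACME Limited", "Globex Corporation"]
found = likely_duplicates(names)
print(found)
```

An exact comparison treats “Acme Ltd.” and “ACME Limited” as different companies, whereas the similarity score flags them as a candidate pair for a person to review.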

Clearly there is some time and effort to be invested. Deduplicating is an iterative process and there is no silver bullet (sorry to burst that bubble!), but invest the time and your users will thank you for it. Actually, hold that: they probably won’t. They will just get on with using the system more effectively. They will, however, complain (or simply not use the system) if you don’t deduplicate your CRM!

In the next article in this series we look at managing duplicates when you have bought marketing data.

Further Information:


See the Paribus Help Center User Guidelines for important considerations of use.