Summary: | This article discusses data matching sequence variation: what it is and why it is important |
Article Type: | Technical Article |
Related Product(s): | This article relates to the following products:
|
Data Matching Sequence Variation: what it is and why it is important
Standard Structured Query Language (SQL) offers a number of options for matching data records within a database. However, to match records more effectively, you sometimes have to go beyond what standard SQL comparisons provide.
Sequence variation is a technique that matches data irrespective of the order of the words within a field. For example:
- Birmingham University | University (of) Birmingham
- John Smith | Smith John
These simple but common data capture variations will not be matched by standard SQL comparisons, which treat the same words in a different order as different values.
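As a rough illustration of the idea, the sketch below compares two values by their sorted word tokens rather than by the raw strings. It is a minimal example under stated assumptions, not the algorithm used by any particular product: the function names `sequence_key` and `same_ignoring_order` and the `STOP_WORDS` set are hypothetical and chosen only for this example.

```python
import re
from typing import Tuple

# Hypothetical list of filler words to ignore; an assumption for this sketch.
STOP_WORDS = {"of", "the"}


def sequence_key(text: str) -> Tuple[str, ...]:
    """Return an order-independent key: lowercase word tokens, sorted."""
    # Split on anything that is not a letter or digit, so "(of)" becomes "of".
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return tuple(sorted(t for t in tokens if t not in STOP_WORDS))


def same_ignoring_order(a: str, b: str) -> bool:
    """True when both values contain the same words, in any order."""
    return sequence_key(a) == sequence_key(b)


print(same_ignoring_order("Birmingham University", "University (of) Birmingham"))
print(same_ignoring_order("John Smith", "Smith John"))
```

Both calls compare the examples listed above. Note that this sketch only handles word order; a misspelling such as "Smyth" would still need a fuzzy comparison on top.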
So if you are trying to match data, and in particular to find duplicate records within a CRM system, sequence variation can be very helpful.
It applies not only to company names and contact names, as in the examples above, but equally, and very usefully, to addresses.
If you are running a duplicate detection scan of your database or CRM, having sequence variation in your kit bag means that you do not have to worry if, say, the ZIP Code has been entered in the City field. Using a composite address string that includes all the address elements means that the search remains effective even with inconsistent data entry.
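The composite address approach can be sketched as follows. This is an illustrative example only: the field names (`street`, `city`, `zip`), the helper `composite_key`, and both sample records are made up for the sketch and do not come from any real system.

```python
import re


def composite_key(*fields):
    """Join address fields into one string and return its sorted tokens."""
    # Skip empty fields, then treat the whole address as a single string.
    combined = " ".join(f for f in fields if f)
    tokens = re.findall(r"[a-z0-9]+", combined.lower())
    return tuple(sorted(tokens))


# Made-up record with the ZIP Code in its own field.
record_a = {"street": "12 High Street", "city": "Springfield", "zip": "12345"}
# Made-up record with the ZIP Code typed into the City field instead.
record_b = {"street": "12 High Street", "city": "Springfield 12345", "zip": ""}

key_a = composite_key(record_a["street"], record_a["city"], record_a["zip"])
key_b = composite_key(record_b["street"], record_b["city"], record_b["zip"])

# The keys are equal because field boundaries no longer matter.
print(key_a == key_b)
```

Because the comparison runs on the combined string, it does not matter which field an address element was typed into, only that it appears somewhere in the address.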
Building this sort of capability into a “fuzzy search” or matching process is not easy, but the benefit is clearly visible in the additional matches it finds.
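To give a feel for how word-order handling and fuzzy matching can be combined, here is a minimal sketch that sorts the tokens of each value and then scores the results with `difflib.SequenceMatcher` from the Python standard library. It is a simplified stand-in, not the matching algorithm of any product mentioned in this article; the names `normalized` and `fuzzy_score` are hypothetical, and any cut-off applied to the score would need tuning against real data.

```python
import re
from difflib import SequenceMatcher


def normalized(text):
    """Lowercase the value, split it into words, and sort the words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(sorted(tokens))


def fuzzy_score(a, b):
    """Similarity between 0.0 and 1.0 that ignores word order."""
    return SequenceMatcher(None, normalized(a), normalized(b)).ratio()


# Same words in a different order score as identical.
print(fuzzy_score("John Smith", "Smith John"))
# A reordered value with a spelling difference still scores highly.
print(fuzzy_score("John Smith", "Smyth John"))
```

Sorting before scoring means the similarity measure only has to cope with spelling differences, not with reordering, which keeps the sketch short at the cost of ignoring cases such as missing or extra words.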
Our data deduplication solutions, Paribus Discovery (Cure) and Paribus Interactive (Prevention), use a fuzzy matching algorithm in their search capabilities. To find out more, go to www.paribuscloud.com.