By Heather Todd, Principal Consultant
Data Integrity: Duplicate matching via BrightVine Data Link Enhanced Fuzzy Matching
Do you have confidence in your data match settings? You’ve configured imports and batches, set your global data entry settings, and specified the thresholds for automatic processing, review, and new record creation, but those pesky duplicate records keep appearing. There’s a problem, but where does it lie?
The usual sources of data entry (and the usual culprits for bad data) are manual input and data import. Setting manual data entry aside, to have confidence in our imports we need to make sure that our data matching capabilities are working the way we need them to. Incorrect matching can lead to duplicate records, either complete (where the record is a copy of another) or partial (where it may start with a limited set of copied details and then go on to accumulate other erroneous data), as well as incorrect records where data was added to the wrong record.
The existence of duplicate records in the system is one of those “sad but true” scenarios which needs to be managed and kept to a minimum; not only does it lead to bad reporting and skewed metrics, but if you have incorrect contact information in your system it can lead to an array of mistakes when it comes to mailing and solicitation. Embarrassing mistakes with your mailing list reflect badly on your business, can spoil your reputation with donors, and do irreparable harm.
Data deduplication (the process of eliminating repeated copies of data and merging them into a single stored instance) is a necessary and crucial part of database maintenance and should be performed regularly. It reduces the amount of space required to store your data, improves data quality, supports better strategic marketing and business decisions, and gives you the value of a single constituent view. But wouldn’t it be nice if we had fewer duplicates in our system in the first place?
In Blackbaud CRM™ an exact match is only made on the following fields, which can prove to be limiting:
Last Name
First Name
Email
Lookup ID
The BrightVine Data Link can enhance your duplicate matching by using our Enhanced Fuzzy Match Dynamic Operation Orchestration.
Fuzzy matching uses a distance algorithm that has been mathematically shown to be effective for string matching. You pick the fields to include, and the algorithm compares the values in those fields between records to determine the distance between them: the smaller the distance, the more similar the values. Matching on additional fields such as Date of Birth, Maiden Name, Nickname, Previous Name, and Title will raise your chances of finding matching records.
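To make the idea of "distance between values" concrete, here is a minimal sketch in Python. It assumes Levenshtein edit distance, normalized by the length of the longer string, as the similarity measure. That choice is an assumption for illustration only: this article does not say which distance algorithm BVDL uses, and the function names below are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Illustrative similarity in [0, 1]: 1.0 means identical strings.
    Assumption: normalize the edit distance by the longer length."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))
```

With this measure, "Marty" and "Martin" are two edits apart (one substitution and one insertion). A different algorithm, or a different normalization, would produce different scores for the same pair of values.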
Here is an example of what results might look like when using this method.
Constituent 1
Martin McFly
1955 Twin Pines Lane
Hill Valley, CA 92805
714 444 4444
mmcfly@backtothefuture.com
Constituent 2
Marty McFly
1955 Twin Pines Lane
Hill Valley, CA 92805
714 444 4444
mmcfly@backtothefuture.com
Criteria 1: Full Name and Email Address
Comparing the two strings:
MartyMcFlymmcfly@backtothefuture.com
MartinMcFlymmcfly@backtothefuture.com
Similarity Score: 94%
Criteria 2: Full Name and Address
Comparing the two strings:
MartyMcFly1955TwinPinesLaneHillValleyCA92805
MartinMcFly1955TwinPinesLaneHillValleyCA92805
Similarity Score: 93%
Criteria 3: Full Name, Birthdate, Primary Education College, Gender Code
Comparing the two strings:
MartyMcFly29091968EngineeringMale
MartinMcFly29091968EngineeringMale
Similarity Score: 93%
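The comparisons above can be sketched in a few lines of Python: concatenate the chosen fields for each constituent into a single string, then score the two strings. This sketch assumes whitespace and commas are stripped when building the string (inferred from the examples above) and uses the standard library's `difflib.SequenceMatcher` as a stand-in scorer. The record layout, field names, and function names are hypothetical, and the scores it produces will not necessarily equal the percentages shown above, since the actual BVDL algorithm is not described here.

```python
import re
from difflib import SequenceMatcher


def build_key(record: dict, fields: list) -> str:
    """Concatenate the selected fields, dropping whitespace and commas
    (an assumed normalization, inferred from the example strings)."""
    joined = "".join(str(record.get(field, "")) for field in fields)
    return re.sub(r"[\s,]+", "", joined)


def score(rec_a: dict, rec_b: dict, fields: list) -> float:
    """Similarity ratio in [0, 1] for the chosen criteria (stand-in scorer)."""
    key_a = build_key(rec_a, fields)
    key_b = build_key(rec_b, fields)
    return SequenceMatcher(None, key_a, key_b).ratio()


# Hypothetical records mirroring the two constituents in the example.
marty = {
    "full_name": "Marty McFly",
    "address": "1955 Twin Pines Lane Hill Valley, CA 92805",
    "email": "mmcfly@backtothefuture.com",
}
martin = {**marty, "full_name": "Martin McFly"}

# Each criteria set is just a list of fields to concatenate and compare.
criteria = {
    "Full Name and Email Address": ["full_name", "email"],
    "Full Name and Address": ["full_name", "address"],
}

for label, fields in criteria.items():
    print(f"{label}: {score(marty, martin, fields):.0%}")
```

Defining each criteria set as a list of field names keeps it easy to try other combinations, such as adding a birthdate or nickname field, without changing the scoring code.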
Enhanced Fuzzy Match is just one example of the BrightVine Data Link's (BVDL) enhanced matching capabilities. There are also Enhanced Match Score and Enhanced Constituent Match Selection. If you'd like to learn more about Enhanced Matching or the BrightVine Data Link, don't hesitate to get in touch with our team.