The Importance of a Match Engine in your MDM Strategy
I came across an article, The What, Why and How of Master Data Management, that gives a good overview of what master data is, why it is important, what master data management is, and what it entails.
There is quite a lot that fits under the umbrella of MDM, but a key part of it is the matching of records from different systems, or new/updated records entering the system. From the article:
Generate and test the master data: This step is where you use the tools you have developed or purchased to merge your source data into your master-data list. This is often an iterative process requiring tinkering with rules and settings to get the matching right. This process also requires a lot of manual inspection to ensure that the results are correct and meet the requirements established for the project. No tool will get the matching done correctly 100 percent of the time, so you will have to weigh the consequences of false matches versus missed matches to determine how to configure the matching tools. False matches can lead to customer dissatisfaction, if bills are inaccurate or the wrong person is arrested. Too many missed matches make the master data less useful, because you are not getting the benefits you invested in MDM to get.
Also:
Implement the maintenance processes: As we stated earlier, any MDM implementation must incorporate tools, processes, and people to maintain the quality of the data. All data must have a data steward who is responsible for ensuring the quality of the master data. The data steward is normally a business person who has knowledge of the data, can recognize incorrect data, and has the knowledge and authority to correct the issues. The MDM infrastructure should include tools that help the data steward recognize issues and simplify corrections. A good data-stewardship tool should point out questionable matches that were made, customers with different names and customer numbers that live at the same address, for example. The steward might also want to review items that were added as new, because the match criteria were close but below the threshold. It is important for the data steward to see the history of changes made to the data by the MDM systems, to isolate the source of errors and undo incorrect changes. Maintenance also includes the processes to pull changes and additions into the MDM system, and to distribute the cleansed data to the required places.
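To make the threshold trade-off in the two quoted passages concrete, here is a minimal sketch of score-based matching with a steward review band. It is not taken from the article or from any particular MDM product. The field names, the `AUTO_MATCH` and `REVIEW` thresholds, and the use of Python's standard `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds; in practice these come out of iterative tuning.
AUTO_MATCH = 0.90   # at or above this score, merge automatically
REVIEW = 0.75       # between REVIEW and AUTO_MATCH, queue for the data steward


def similarity(a: dict, b: dict, fields=("name", "address")) -> float:
    """Average case-insensitive string similarity over the given fields (0.0 to 1.0)."""
    scores = [
        SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores)


def classify(score: float) -> str:
    """Map a similarity score to one of three outcomes."""
    if score >= AUTO_MATCH:
        return "match"    # risk: a false match if the threshold is too low
    if score >= REVIEW:
        return "review"   # close but below the threshold: steward decides
    return "new"          # risk: a missed match if the threshold is too high


incoming = {"name": "Jon Smith", "address": "12 Main Street"}
existing = {"name": "John Smith", "address": "12 Main St."}
print(classify(similarity(incoming, existing)))
```

Raising `AUTO_MATCH` trades false matches for more steward work, and lowering `REVIEW` trades missed matches for a longer review queue. That is the balance the article says must be weighed for each project.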
The article also describes cardinality, or the number of entities, as being an important criterion in deciding when to use an MDM solution. A data set with “thousands of customers” is described as being a large data set and there are many MDM solutions that can handle this volume. What separates the truly scalable solutions is how they deal with problem domains that are much larger. For example, customer databases with tens of millions of records require very efficient algorithms not only for speed, but also to minimize the manual data stewardship required for such data sets.
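To see why efficiency matters at this scale: comparing every record against every other record takes n(n-1)/2 comparisons, which for 10 million records is roughly 5 × 10^13. A common way to cut this down in record matching is blocking, where only records that share a cheap key are compared. The sketch below is an illustration under stated assumptions, not a description of how any specific product works; the `last_name` and `postal_code` fields and the key format are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record: dict) -> str:
    # Hypothetical key: first three letters of the surname plus the postal code.
    return record["last_name"][:3].lower() + "|" + record["postal_code"]


def candidate_pairs(records):
    """Yield index pairs of records that share a blocking key."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[blocking_key(rec)].append(idx)
    # Only records inside the same block are compared with each other.
    for ids in blocks.values():
        yield from combinations(ids, 2)


records = [
    {"last_name": "Smith", "postal_code": "30301"},
    {"last_name": "Jones", "postal_code": "30301"},
    {"last_name": "Smithe", "postal_code": "30301"},
    {"last_name": "Smith", "postal_code": "94105"},
]
print(list(candidate_pairs(records)))
```

The cost of this design is that a typo inside the key itself (for example in the postal code) puts two records for the same customer in different blocks, so they are never compared. Running several passes with different keys is one way to reduce that risk.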
Clearly, the match engine that the selected tool provides is critical to the performance and ROI of the solution.
So what are the key features of a match engine? At a high level they are:
- Configurability – A match engine is useful only if it can be configured and tuned for the application at hand.
- Integration – A match engine must be able to integrate with other systems to receive records and propagate changes/updates back to the systems.
- Data Governance – A user interface and tools must be provided for comparing, merging, editing, and tracking the records.
- Security / Auditing – Securing access to the data with role-based access controls as well as a complete audit trail of access is critical.
- Accuracy – The whole point of a match engine is to automatically match the appropriate records so as to minimize the number of records a human has to match manually.
- Performance / Scalability – Initial load performance as well as ongoing performance is important, especially as the volume of data grows.
- Reporting – Getting reports on match engine performance and data quality metrics can be very valuable.
- Extensibility – The ability to extend backend functionality for specific domains or customize the user interface is often required.
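As a small illustration of the Accuracy and Reporting items above, the sketch below computes precision and recall for a set of engine-proposed matches against a steward-verified sample. The function names and the representation of a match as a pair of record IDs are assumptions made for this example.

```python
def pair(a, b) -> tuple:
    """Normalize a pair of record IDs so that (a, b) and (b, a) compare equal."""
    return tuple(sorted((a, b)))


def match_quality(predicted: set, truth: set) -> dict:
    """Compare engine-proposed match pairs with steward-verified match pairs."""
    true_matches = len(predicted & truth)
    false_matches = len(predicted - truth)    # merged, but should not have been
    missed_matches = len(truth - predicted)   # should have merged, but did not
    precision = true_matches / len(predicted) if predicted else 0.0
    recall = true_matches / len(truth) if truth else 0.0
    return {
        "false_matches": false_matches,
        "missed_matches": missed_matches,
        "precision": precision,
        "recall": recall,
    }


engine = {pair(1, 2), pair(4, 3)}
verified = {pair(2, 1), pair(5, 6)}
print(match_quality(engine, verified))
```

Low precision corresponds to the false matches described in the article, and low recall corresponds to missed matches. Tracking both over time shows whether a configuration change actually improved the engine.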
In future entries on this blog, we’ll explore each of the above in much more detail. Stay tuned!
