Explanation on the entity matches for Identity resolution #1178

Open
KarthikeyanTWL opened this issue Jan 3, 2024 · 1 comment

Comments

@KarthikeyanTWL

Hi folks, I have a few questions about dedupe:

  1. Dedupe provides a confidence score for each match, but can it also explain why two records were matched? E.g. "These two records are matched because the first name and the phone number are the same", or something like that.

  2. When new data comes in, will it be matched against existing identities? E.g. the data arrives from a Kafka stream and identity resolution has to happen in real time for the new records.

  3. Is there an enterprise option available? If yes, what additional features does it provide?

Thanks in advance!

@ArVar
Contributor

ArVar commented Jul 2, 2024

To point 1: Theoretically it should be possible, since the pairing is based on hierarchical clustering and logistic regression. The problem would be the potentially vast number of predicates that are actually learned. As far as I understand, the hierarchical tree is built on the weights learned for the predicates, which would make it hard to derive real explainability. But I might be wrong.
Nevertheless, such a feature, although probably very costly, would be very nice. 👍
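
To illustrate only the logistic regression part: the score of a linear model decomposes into one term per feature, so in principle each field's contribution can be read off. Below is a minimal sketch in plain Python. The field names, weights, and bias are made up for illustration, `explain_match` is a hypothetical helper, and nothing here uses dedupe's API or accounts for blocking predicates and clustering.

```python
import math

# Hypothetical weights, as a logistic regression over per-field
# similarity features might learn them. Not taken from dedupe.
WEIGHTS = {"first_name": 2.5, "phone": 4.0, "city": 0.5}
BIAS = -3.0


def explain_match(similarities):
    """Return (match probability, per-field contributions, largest first)."""
    # Each field contributes weight * similarity to the logit.
    contributions = {field: WEIGHTS[field] * similarities[field] for field in WEIGHTS}
    logit = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return probability, ranked


# Similarities in [0, 1]: same first name and phone number, different city.
prob, ranked = explain_match({"first_name": 1.0, "phone": 1.0, "city": 0.0})
for field, contribution in ranked:
    print(f"{field}: {contribution:+.2f}")
print(f"match probability: {prob:.3f}")
```

In this toy case the phone number and the first name carry the match, which is roughly the kind of statement asked for in the question. Whether the same readout is practical on a trained dedupe model is exactly the open point above.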

To point 2: That is what the (Static)RecordLink and (Static)Gazetteer classes are for. (The hard part will be maintaining the ground truth somewhere ;-) )
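
The general shape of such a streaming setup could look like the sketch below: keep a store of known identities, match each incoming record against it, and add the record as a new identity when nothing scores high enough. This is plain Python with a crude string similarity standing in for a trained model. `ToyIdentityStore`, its threshold, and the sample records are all hypothetical, and the Kafka consumer is replaced by a list. With dedupe itself, the Gazetteer classes would take over the matching and indexing.

```python
from difflib import SequenceMatcher


class ToyIdentityStore:
    """Hypothetical stand-in for a persisted ground truth plus a matcher."""

    def __init__(self, threshold=0.85):
        self.threshold = threshold
        self.records = {}  # identity_id -> canonical record
        self._next_id = 1

    def _score(self, a, b):
        # Average string similarity over shared fields; a trained
        # model would replace this crude measure.
        fields = a.keys() & b.keys()
        if not fields:
            return 0.0
        total = sum(SequenceMatcher(None, a[f], b[f]).ratio() for f in fields)
        return total / len(fields)

    def resolve(self, record):
        """Return (identity_id, is_new) for one incoming record."""
        best_id, best_score = None, 0.0
        for identity_id, known in self.records.items():
            score = self._score(record, known)
            if score > best_score:
                best_id, best_score = identity_id, score
        if best_id is not None and best_score >= self.threshold:
            return best_id, False
        # No good match: register the record as a new identity.
        new_id = self._next_id
        self._next_id += 1
        self.records[new_id] = record
        return new_id, True


store = ToyIdentityStore()
stream = [  # stands in for messages consumed from a stream
    {"name": "Jane Doe", "phone": "555-0100"},
    {"name": "Jane  Doe", "phone": "555-0100"},
    {"name": "Bob Ray", "phone": "555-0199"},
]
results = [store.resolve(record) for record in stream]
print(results)
```

The linear scan is only for readability. The store is the "ground truth" mentioned above, and in a real deployment it would have to live somewhere durable and stay in sync with whatever index the matcher uses.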

See also:
