cleanzr

Popular repositories

  1. record-linkage-tutorial Public

    A tutorial on entity resolution (record linkage or de-duplication)

    TeX · 62 stars · 15 forks

  2. dblink Public

    Distributed Bayesian Entity Resolution in Apache Spark

    Scala · 57 stars · 9 forks

  3. fasthash Public

    Performs unique entity estimation, following Chen, Shrivastava, and Steorts (2018).

    Python · 14 stars · 3 forks

  4. clevr Public

    Clustering and Link Prediction Evaluation in R

    R · 11 stars · 3 forks

  5. representr Public

    Create representative records post-record linkage

    R · 8 stars

  6. exchanger Public

    Bayesian Entity Resolution with Exchangeable Random Partition Priors

    C++ · 6 stars

Repositories

Showing 10 of 21 repositories
  • blink Public

    This is the main code for Steorts (2015), which is also on CRAN. Please cite the paper and code if you find this useful.

    HTML · 5 stars · 4 forks · 0 open issues · 0 open pull requests · Updated Jan 10, 2024
  • exchanger Public

    Bayesian Entity Resolution with Exchangeable Random Partition Priors

    C++ · 6 stars · GPL-3.0 · 0 forks · 0 open issues · 0 open pull requests · Updated Jan 7, 2024
  • clevr Public

    Clustering and Link Prediction Evaluation in R

    R · 11 stars · GPL-2.0 · 3 forks · 1 open issue · 0 open pull requests · Updated Sep 23, 2023
  • representr Public

    Create representative records post-record linkage

    R · 8 stars · 0 forks · 0 open issues · 0 open pull requests · Updated Sep 5, 2023
  • exchanger-experiments Public

    Scripts for reproducing the experiments in our JSSAM article on Bayesian Graphical Entity Resolution

    R · 0 stars · GPL-3.0 · 1 fork · 0 open issues · 0 open pull requests · Updated Jan 24, 2023
  • microclustr Public

    Package for Betancourt, Zanella, and Steorts

    C++ · 2 stars · 1 fork · 0 open issues · 1 open pull request · Updated Aug 22, 2022
  • dblink-experiments Public

    Details for reproducing the experiments in our d-blink paper

    R · 0 stars · MIT · 0 forks · 0 open issues · 0 open pull requests · Updated Jun 10, 2021
  • dblinkR Public

    An R interface for the dblink Spark application

    R · 5 stars · 1 fork · 2 open issues · 0 open pull requests · Updated Jun 10, 2021
  • dblink Public

    Distributed Bayesian Entity Resolution in Apache Spark

    Scala · 57 stars · 9 forks · 4 open issues · 0 open pull requests · Updated Jun 10, 2021
  • italy Public

    Data from a sample survey conducted by the Bank of Italy every two years, containing duplicated records.

    R · 0 stars · 0 forks · 0 open issues · 0 open pull requests · Updated Apr 19, 2021