A Clojure library for querying large data sets by similarity
Updated Feb 17, 2019 - Clojure
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Sketching data structures for Scala, including t-digest
UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting
ExaLogLog: Space-Efficient and Practical Approximate Distinct Counting up to the Exa-Scale
This project aims to expose the Yahoo Theta Sketch API as Spark SQL UDFs
A barebones implementation of the simhash data sketching algorithm.
Type-classes to interface isarn-sketches with Algebird
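For readers unfamiliar with the simhash entry above, here is a minimal sketch of the general technique. It is not taken from any of the listed repositories. The function names `simhash` and `hamming_distance`, the whitespace tokenizer, the fixed 64-bit width, and the use of truncated MD5 as the per-token hash are all assumptions made for illustration.

```python
import hashlib

BITS = 64  # fingerprint width (an assumption; real implementations vary)


def simhash(text: str) -> int:
    """Return a 64-bit SimHash fingerprint of whitespace-separated tokens."""
    counts = [0] * BITS
    for token in text.lower().split():
        # Stable per-token hash: first 8 bytes of MD5, used for determinism
        # only, not for security.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # Each token votes +1 for every set bit and -1 for every clear bit.
        for i in range(BITS):
            counts[i] += 1 if (h >> i) & 1 else -1
    # A fingerprint bit is set wherever the votes are strictly positive.
    fingerprint = 0
    for i, c in enumerate(counts):
        if c > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Documents that share most of their tokens tend to produce fingerprints with a small Hamming distance, so near-duplicates can be found by comparing fingerprints instead of full texts.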