> -Bulk embedding loading and k-nearest neighbor search
@@ -252,29 +252,45 @@ In a json, CogDB treats `_id` property as a unique identifier for each object. I
## Using word embeddings

-CogDB supports word embeddings. Word embeddings are a way to represent words as vectors. Word embeddings are useful for many NLP tasks.
-There are various types of word embeddings, including popular ones like [GloVe](https://nlp.stanford.edu/projects/glove/) and [FastText](https://fasttext.cc/).
+CogDB supports word embeddings with SIMD-optimized similarity search powered by [SimSIMD](https://github.com/ashvardanian/SimSIMD). Word embeddings are useful for semantic search, recommendations, and NLP tasks.
In the above code, the sim method is used to filter vertices based on their cosine similarity with the word embedding for "orange". The operator and threshold arguments determine how the similarity is compared to the threshold value, which can be a single value or a range.
+#### Get embedding stats:
+
+```python
+g.embedding_stats()
+```
+> {'count': 50000, 'dimensions': 100}
+
+The `sim` method filters vertices based on cosine similarity. The `k_nearest` method returns the top-k most similar vertices.
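
For readers comparing the two behaviours described in this change, the following is a minimal pure-Python sketch of the difference between threshold filtering and top-k selection over cosine similarity. It is not CogDB's implementation and does not use SimSIMD; the function names `cosine_similarity`, `filter_by_similarity`, and `top_k_similar`, along with the toy vectors, are hypothetical and exist only for illustration.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_similarity(embeddings, query, threshold):
    """Threshold filter: keep every word whose similarity to `query` is >= threshold."""
    target = embeddings[query]
    return [
        word
        for word, vector in embeddings.items()
        if word != query and cosine_similarity(vector, target) >= threshold
    ]


def top_k_similar(embeddings, query, k):
    """Top-k selection: return the k words most similar to `query`, best first."""
    target = embeddings[query]
    scored = (
        (cosine_similarity(vector, target), word)
        for word, vector in embeddings.items()
        if word != query
    )
    return [word for _, word in heapq.nlargest(k, scored)]


# Toy 3-dimensional vectors, invented for illustration only.
toy = {
    "orange": [1.0, 0.0, 0.0],
    "lemon": [0.9, 0.1, 0.0],
    "apple": [0.6, 0.8, 0.0],
    "car": [0.0, 0.0, 1.0],
}

print(filter_by_similarity(toy, "orange", 0.5))  # ['lemon', 'apple']
print(top_k_similar(toy, "orange", 1))           # ['lemon']
```

The threshold filter can return any number of matches, including none, while top-k selection always returns up to `k` results regardless of how similar they are.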